
fix(pass): Skip R-seq InOut promotion for disjoint variable-offset stores#1232

Open
Crystal-wzy wants to merge 1 commit into hw-native-sys:main from Crystal-wzy:main


Crystal-wzy commented Apr 30, 2026

Summary

  • Add DisjointStoreVisitor to detect when a callee writes to an Out
    parameter exclusively via tile.store with offsets that depend on other
    function parameters (position-dependent, disjoint writes)
  • At the call site, verify the corresponding arguments are loop-variant
    (reference a sequential loop induction variable) before keeping the
    original OutputExisting direction instead of promoting to InOut
  • Track sequential loop variables in CallDirectionMutator and cache
    per-callee analysis results in offset_param_cache_ to avoid redundant
    visitor traversals
  • Add ExprReferencesAnyOf utility for checking transitive Var references
    across BinaryExpr, UnaryExpr, TupleGetItemExpr, Call, and MakeTuple
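The two-sided check described in the bullets above can be sketched roughly as follows. This is a toy Python model, not the pass itself: `CalleeSummary`, `keep_output_existing`, and the set-of-names representation of arguments are hypothetical, and requiring every offset-feeding argument to be loop-variant is a conservative assumption rather than something the description states.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CalleeSummary:
    """Hypothetical result of analysing one Out parameter of a callee."""
    has_store: bool              # at least one tile.store targets the Out parameter
    all_offsets_variable: bool   # every such store's offset references another parameter
    offset_params: frozenset     # indices of the parameters those offsets depend on


def keep_output_existing(summary, arg_vars, seq_loop_vars):
    """Return True if the Out-to-InOut promotion can be skipped for this call.

    arg_vars[i] is the set of variable names referenced by call argument i;
    seq_loop_vars holds the induction variables of enclosing sequential loops.
    """
    if not (summary.has_store and summary.all_offsets_variable):
        return False
    if not summary.offset_params:
        return False
    # Assumption: every offset-feeding argument must mention a loop variable.
    return all(bool(arg_vars[i] & seq_loop_vars) for i in summary.offset_params)


summary = CalleeSummary(has_store=True, all_offsets_variable=True,
                        offset_params=frozenset({1}))
# Argument 1 is the loop variable "i": writes move each iteration.
variant_call = keep_output_existing(summary, [{"x"}, {"i"}, {"out"}], {"i"})
# Argument 1 is a loop-invariant "c": writes land on the same region.
invariant_call = keep_output_existing(summary, [{"x"}, {"c"}, {"out"}], {"i"})
```

In this model the callee-side summary would be computed once per callee and parameter and then reused, which is the role the description gives to `offset_param_cache_`.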

Testing

  • Added test_out_param_variable_offset_store_in_seq_loop_not_promoted
    verifying disjoint stores keep OutputExisting
  • Added test_out_param_invariant_offset_in_seq_loop_promoted
    verifying loop-invariant offsets still promote to InOut


coderabbitai Bot commented Apr 30, 2026

Walkthrough

This PR extends the call-direction derivation pass with expression-variable dependency analysis to detect when a callee writes to an Out parameter through tile.store offsets that depend on other parameters. It adds an exception to the default Out→InOut promotion under sequential ancestors, applied when the call-site arguments reference sequential loop induction variables, and adds tests covering the R-seq behavior.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Call Direction Derivation Logic (src/ir/transforms/derive_call_directions_pass.cpp) | Adds ExprReferencesAnyOf helper, DisjointStoreVisitor IR visitor, and CalleeHasOnlyVariableOffsetStores function to analyze when callee Out parameter writes use parameter-dependent offsets. Updates direction derivation to conditionally skip Out→InOut promotion under sequential ancestors when offsets depend on loop induction variables. Includes caching for callee qualification and offset-parameter sets. |
| Direction Derivation Tests (tests/ut/ir/transforms/test_derive_call_directions.py) | Adds two unit tests for R-seq Out parameter behavior: one verifying OutputExisting preservation when the offset varies per iteration, another confirming InOut promotion when the offset is loop-invariant. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Possibly related PRs

  • PR #1131: Modifies Out→InOut promotion logic in the same pass via prior-writer and sequential-depth analysis.
  • PR #1211: Extends Out→InOut promotion rules to enclosing-parameter roots in the same file.
  • PR #282: Refactors orchestration codegen to consume ParamDirection annotations that this PR's logic produces.

Suggested reviewers

  • lyfne123
  • Hzfengsy

Poem

🐰 With clever offset spies peeking through,
No hasty Out-to-InOut conversions brew!
When tile.store dances on variable bounds,
Our directions stay true on loop-wrapped grounds—
Smart analysis wins the day with cheer! ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 50.00%, which is below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately describes the main change: a fix for skipping R-seq InOut promotion when tile.store writes use variable offsets, which is the core objective of the PR. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description check | ✅ Passed | The pull request description accurately and specifically describes the changeset, detailing the new analysis utilities, the behavior change for R-seq promotion, and the testing additions. |


gemini-code-assist Bot left a comment


Code Review

This pull request introduces an optimization to avoid promoting OutputExisting to InOut in sequential loops when writes are disjoint, using a new CalleeHasOnlyVariableOffsetStores check and a caching mechanism. The review feedback points out that the analysis is currently insufficient as it uses a shallow traversal that misses nested stores and fails to verify that offset arguments are variant within the loop. Additionally, the ExprReferencesAnyOf helper should be extended to support more IR node types, and further testing is recommended to ensure the soundness of the disjointness logic.

Comment on lines +106 to +129
auto stmts = transform_utils::FlattenToStmts(callee->body_);
for (const auto& stmt : stmts) {
  auto assign = As<AssignStmt>(stmt);
  if (!assign) continue;
  auto call = As<Call>(assign->value_);
  if (!call || !call->op_) continue;

  if (call->op_->name_ == "tile.store" && call->args_.size() >= 3) {
    auto target_var = AsVarLike(call->args_[2]);
    if (target_var && aliases.count(target_var.get())) {
      found_store = true;
      aliases.insert(assign->var_.get());
      if (!ExprReferencesAnyOf(call->args_[1], other_params)) {
        all_variable = false;
      }
    }
  }
  if (call->op_->name_ == "tensor.assemble" && !call->args_.empty()) {
    auto target_var = AsVarLike(call->args_[0]);
    if (target_var && aliases.count(target_var.get())) {
      aliases.insert(assign->var_.get());
    }
  }
}

high

This analysis uses FlattenToStmts, which only inspects top-level statements. This is insufficient for resolving call-site parameter directions as nested stores or assignments will be missed. Since effective directions for Spmd and Group types must be computed by inspecting inner kernel calls for accurate dependency tracking, a full IRVisitor should be used instead of a shallow traversal.

References
  1. When resolving call-site parameter directions, compute effective directions for both Spmd and Group function types by inspecting their inner kernel calls. These function types act as wrappers and their formal parameter directions may not reflect the true data flow, which is crucial for dependency tracking.
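The difference between a top-level scan and a full traversal can be illustrated with a toy statement tree. The sketch below is Python rather than the pass's C++, the `Store`, `For`, and `If` classes are hypothetical stand-ins for the real IR nodes, and it assumes, as the comment states, that the flattened list only exposes top-level statements.

```python
from dataclasses import dataclass, field


@dataclass
class Store:
    target: str   # name of the tensor being written
    offset: str   # name of the variable used as offset ("0" for a constant)


@dataclass
class For:
    body: list = field(default_factory=list)


@dataclass
class If:
    then_body: list = field(default_factory=list)
    else_body: list = field(default_factory=list)


def shallow_stores(body):
    """Only looks at the statements directly in `body`."""
    return [s for s in body if isinstance(s, Store)]


def all_stores(body):
    """Recursive walk that also descends into loop and branch bodies."""
    found = []
    for stmt in body:
        if isinstance(stmt, Store):
            found.append(stmt)
        elif isinstance(stmt, For):
            found.extend(all_stores(stmt.body))
        elif isinstance(stmt, If):
            found.extend(all_stores(stmt.then_body))
            found.extend(all_stores(stmt.else_body))
    return found


# One store with a parameter-dependent offset at top level,
# and one constant-offset store hidden inside a loop.
callee_body = [Store("out", "off"), For(body=[Store("out", "0")])]

shallow_all_variable = all(s.offset == "off" for s in shallow_stores(callee_body))
deep_all_variable = all(s.offset == "off" for s in all_stores(callee_body))
```

Here the shallow scan would report that every store uses a variable offset, while the recursive walk finds the nested constant-offset store and rejects the callee.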

if (cache_it != fn_cache.end()) {
  disjoint = cache_it->second;
} else {
  disjoint = CalleeHasOnlyVariableOffsetStores(callee, i);

high

The CalleeHasOnlyVariableOffsetStores check only ensures that the callee's store offsets depend on its parameters. This is insufficient to guarantee disjoint writes at the call site. If the caller passes a loop-invariant value to the offset parameter, the writes will overlap across iterations. To safely skip InOut promotion, the pass should also verify that the argument passed to the offset parameter is variant with respect to the sequential loop.
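To make the overlap concrete, the following toy simulation compares the regions written across loop iterations for a loop-variant and a loop-invariant offset argument. It is a Python sketch under simplifying assumptions: one-dimensional offsets, a fixed tile size, and hypothetical helper names (`written_ranges`, `ranges_disjoint`) that do not exist in the pass.

```python
def written_ranges(offset_args, tile_size):
    """Half-open [start, end) range written by the call in each iteration."""
    return [(off, off + tile_size) for off in offset_args]


def ranges_disjoint(ranges):
    """True if no two ranges overlap."""
    ordered = sorted(ranges)
    return all(prev_end <= start
               for (_, prev_end), (start, _) in zip(ordered, ordered[1:]))


TILE = 4
ITERATIONS = 3

# Caller passes i * TILE: the offset argument references the induction variable.
variant_offsets = [i * TILE for i in range(ITERATIONS)]
# Caller passes a constant: the callee's offset still "depends on a parameter",
# but every iteration writes the same region.
invariant_offsets = [0 for _ in range(ITERATIONS)]

variant_is_disjoint = ranges_disjoint(written_ranges(variant_offsets, TILE))
invariant_is_disjoint = ranges_disjoint(written_ranges(invariant_offsets, TILE))
```

Only the first case leaves later iterations independent of earlier writes, which is why the callee-side check alone cannot justify keeping OutputExisting.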

Comment on lines +73 to +82
bool ExprReferencesAnyOf(const ExprPtr& expr, const std::unordered_set<const Var*>& vars) {
  if (!expr) return false;
  if (auto var = As<Var>(expr)) return vars.count(var.get()) > 0;
  if (auto tuple = As<MakeTuple>(expr)) {
    for (const auto& e : tuple->elements_) {
      if (ExprReferencesAnyOf(e, vars)) return true;
    }
  }
  return false;
}

medium

The current implementation of ExprReferencesAnyOf only handles Var and MakeTuple nodes. To handle common offset expressions (e.g., offset + 1), it should be extended to handle BinaryExpr, UnaryExpr, and TupleGetItemExpr, ensuring consistent traversal across IR node types as required by repository standards for collection utilities.

bool ExprReferencesAnyOf(const ExprPtr& expr, const std::unordered_set<const Var*>& vars) {
  if (!expr) return false;
  if (auto var = As<Var>(expr)) return vars.count(var.get()) > 0;
  if (auto tuple = As<MakeTuple>(expr)) {
    for (const auto& e : tuple->elements_) {
      if (ExprReferencesAnyOf(e, vars)) return true;
    }
  }
  if (auto bin = As<BinaryExpr>(expr)) {
    return ExprReferencesAnyOf(bin->left_, vars) || ExprReferencesAnyOf(bin->right_, vars);
  }
  if (auto un = As<UnaryExpr>(expr)) {
    return ExprReferencesAnyOf(un->operand_, vars);
  }
  if (auto tgi = As<TupleGetItemExpr>(expr)) {
    return ExprReferencesAnyOf(tgi->tuple_, vars);
  }
  return false;
}
References
  1. When adding support for a new IR node type to a transformation pass, ensure all relevant traversal and collection utilities within that pass are updated to handle the new type consistently.
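The same recursive reference check can be exercised on a toy expression tree. The Python sketch below is only an analogue of the suggestion above: the node classes are hypothetical stand-ins for the IR types, variables are matched by name rather than by pointer identity, and the `Call` case is included because the PR description lists it, not because the suggested C++ handles it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Const:
    value: int


@dataclass(frozen=True)
class Binary:
    left: object
    right: object


@dataclass(frozen=True)
class Unary:
    operand: object


@dataclass(frozen=True)
class MakeTuple:
    elements: tuple


@dataclass(frozen=True)
class TupleGetItem:
    tuple_: object
    index: int


@dataclass(frozen=True)
class Call:
    args: tuple


def references_any_of(expr, names):
    """True if `expr` transitively mentions a variable whose name is in `names`."""
    if isinstance(expr, Var):
        return expr.name in names
    if isinstance(expr, Binary):
        return references_any_of(expr.left, names) or references_any_of(expr.right, names)
    if isinstance(expr, Unary):
        return references_any_of(expr.operand, names)
    if isinstance(expr, MakeTuple):
        return any(references_any_of(e, names) for e in expr.elements)
    if isinstance(expr, TupleGetItem):
        return references_any_of(expr.tuple_, names)
    if isinstance(expr, Call):
        return any(references_any_of(a, names) for a in expr.args)
    return False  # constants and unknown node kinds reference nothing


offset_plus_one = Binary(Var("offset"), Const(1))
nested = Call((TupleGetItem(MakeTuple((Const(0), Unary(Var("i")))), 1),))
```

With only the `Var` and `MakeTuple` cases, `offset_plus_one` would be reported as not referencing `offset`, which is the gap the comment points out.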

out = passes.derive_call_directions()(Prog)
calls = [c for c in _user_calls(out) if c.op.name == "kernel"]
assert len(calls) == 1
assert _dirs(calls[0]) == [ir.ArgDirection.Input, ir.ArgDirection.Scalar, ir.ArgDirection.OutputExisting]

medium

Consider adding a test case where a loop-invariant value is passed to the offset parameter. This would demonstrate the soundness issue regarding disjointness at the call site. Per repository guidelines, when testing a pass with a known bug, the expected IR should match the actual buggy output while the issue is tracked.

References
  1. When testing a pass with a known bug, the 'Expected' IR should match the actual (buggy) output of the pass. The bug itself should be tracked in a separate issue.

Crystal-wzy force-pushed the main branch 2 times, most recently from be65189 to 4f7be22 on April 30, 2026 06:38
…ores

## Summary
- Add `DisjointStoreVisitor` to detect when a callee writes to an Out
  parameter exclusively via `tile.store` with offsets that depend on other
  function parameters (position-dependent, disjoint writes)
- At the call site, verify the corresponding arguments are loop-variant
  (reference a sequential loop induction variable) before keeping the
  original `OutputExisting` direction instead of promoting to `InOut`
- Track sequential loop variables in `CallDirectionMutator` and cache
  per-callee analysis results in `offset_param_cache_` to avoid redundant
  visitor traversals
- Add `ExprReferencesAnyOf` utility for checking transitive Var references
  across BinaryExpr, UnaryExpr, TupleGetItemExpr, Call, and MakeTuple

## Testing
- [x] Added `test_out_param_variable_offset_store_in_seq_loop_not_promoted`
  verifying disjoint stores keep `OutputExisting`
- [x] Added `test_out_param_invariant_offset_in_seq_loop_promoted`
  verifying loop-invariant offsets still promote to `InOut`