test(st): Add Mat slice to Left matmul pattern system test by Crystal-wzy · Pull Request #1225 · hw-native-sys/pypto

Crystal-wzy · 2026-04-29T08:25:46Z

Summary

- Add TestMatSliceToLeft PTOTestCase validating the wide Mat load → slice → Left matmul pattern for issue #1198 ("[Feature] Support target_memory on pl.slice (or fuse slice+move) for Mat→Left/Right path"). A single [M, 2K] BF16 tile is loaded into Mat to satisfy the 512 B GM row-alignment constraint, sliced into two [M, K] subviews, moved to Left, and used in K-split matmul + matmul_acc accumulation.
- Add TestMatSliceToLeftSuite pytest suite parametrized over PLATFORMS with a PyTorch golden reference (A[:,:K]@b0 + A[:,K:]@b1).
- Codegen expectation: pl.slice on a Mat tile → pto.subview (zero-copy); pl.move from the subview → pto.tmov (Mat subview → Left).
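
The golden reference above relies on the K-split identity: multiplying the wide [M, 2K] operand by b0 stacked on b1 gives the same result as multiplying each [M, K] half separately and accumulating. The following is a minimal pure-Python sketch of that identity, not the PR's test code. The helper names (`matmul`, `add`, `slice_cols`) and the tiny integer shapes are assumptions chosen so the comparison is exact.

```python
# Pure-Python sketch of the K-split identity behind the golden reference.
# Shapes are tiny and values are integers so the comparison is exact.

def matmul(a, b):
    """Multiply row-major nested lists: [m, k] @ [k, n] -> [m, n]."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def add(c0, c1):
    """Elementwise sum of two equally shaped matrices."""
    return [[x + y for x, y in zip(r0, r1)] for r0, r1 in zip(c0, c1)]

def slice_cols(a, start, stop):
    """Column slice, the nested-list analogue of a[:, start:stop]."""
    return [row[start:stop] for row in a]

M, K, N = 2, 3, 2  # illustrative sizes only, not the sizes used in the PR

# Wide [M, 2K] operand, standing in for the single tile loaded into Mat.
A = [[r * 2 * K + c + 1 for c in range(2 * K)] for r in range(M)]
b0 = [[(r + c) % 5 - 2 for c in range(N)] for r in range(K)]
b1 = [[(r * c) % 7 - 3 for c in range(N)] for r in range(K)]

# matmul on the first half, then matmul_acc-style accumulation of the second.
acc = matmul(slice_cols(A, 0, K), b0)
acc = add(acc, matmul(slice_cols(A, K, 2 * K), b1))

# Reference: one full-width product against b0 stacked on top of b1.
full = matmul(A, b0 + b1)
```

Here `acc` and `full` hold the same [M, N] result, which is why the test can compare the K-split kernel output against A[:,:K]@b0 + A[:,K:]@b1.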

Testing

- [x] Parametrized test passes on 910B PTO hardware
- [x] PyTorch golden reference validated
- [x] Pre-commit hooks pass (ruff, pyright)

Closes #1198

coderabbitai · 2026-04-29T08:26:01Z

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 930c3bee-d6c9-4887-9050-d89bd866fa0d

📥 Commits

Reviewing files that changed from the base of the PR and between fcc9c07 and 3aa67fd.

📒 Files selected for processing (1)

tests/st/runtime/test_mat_slice_to_left.py

📝 Walkthrough

Adds a new runtime test that validates Mat → slice → move-to-Left → matmul(+matmul_acc) codegen, and updates the call-direction derivation pass to avoid promoting certain Out args to InOut when writes are disjoint-store patterns (with a per-callee/arg cache). Also adds a unit test for the R-seq exception case.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Runtime test `tests/st/runtime/test_mat_slice_to_left.py` | New PyTest runtime test that compiles/runs a program loading a wide BF16 Mat tile, slices it into two Left subviews, loads two Right tiles, performs matmul + matmul_acc, stores FP32 result, and asserts correctness vs. PyTorch reference. |
| Call-direction analysis `src/ir/transforms/derive_call_directions_pass.cpp` | Added transform_utils import, a disjoint-store analysis and memoizing `disjoint_store_cache_`, and updated `CallDirectionMutator` logic to skip R-seq promotion to `InOut` when a callee/arg exhibits disjoint tile.store behavior. |
| Unit test `tests/ut/ir/transforms/test_derive_call_directions.py` | New unit test covering the R-seq "exception" case: verifies an Out param remains `OutputExisting` inside a sequential loop when writes use tile.store offsets dependent on another param, and checks intermediate scalar direction and single kernel call. |
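
The call-direction row above describes a memoized per-callee/arg check that keeps an Out argument from being promoted to `InOut`. The sketch below is a toy Python model of that caching shape only, not the C++ pass. The `Store` record, the `DirectionDeriver` class, and the "offset depends on another parameter" criterion are all hypothetical simplifications based on the summary text.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Store:
    """Toy record of one tile.store inside a callee (hypothetical model)."""
    target_arg: int               # index of the Out parameter being written
    offset_depends_on: frozenset  # parameter indices the store offset depends on

class DirectionDeriver:
    """Toy stand-in for the mutator; caches one verdict per (callee, arg)."""

    def __init__(self, callees):
        self.callees = callees    # callee name -> list of Store records
        self.cache = {}           # plays the role of disjoint_store_cache_
        self.analysis_runs = 0    # counts cache misses, for illustration

    def has_disjoint_stores(self, callee, arg):
        key = (callee, arg)
        if key not in self.cache:
            self.analysis_runs += 1
            stores = [s for s in self.callees[callee] if s.target_arg == arg]
            # Assumed criterion: every store offset depends on some other param.
            self.cache[key] = bool(stores) and all(
                s.offset_depends_on - {arg} for s in stores
            )
        return self.cache[key]

    def direction_in_sequential_loop(self, callee, arg):
        """Skip promotion to InOut when the stores look disjoint."""
        if self.has_disjoint_stores(callee, arg):
            return "OutputExisting"
        return "InOut"

deriver = DirectionDeriver({
    "kernel": [Store(target_arg=0, offset_depends_on=frozenset({1}))],
    "plain": [Store(target_arg=0, offset_depends_on=frozenset())],
})
first = deriver.direction_in_sequential_loop("kernel", 0)
second = deriver.direction_in_sequential_loop("kernel", 0)  # served from cache
other = deriver.direction_in_sequential_loop("plain", 0)
```

Keying the cache on the (callee, arg) pair means a kernel called many times inside a loop body is analyzed once per argument rather than once per call site.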

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

fix(ir): Avoid over-promoting parallel-loop Out args to InOut (#1086) #1131 — Modifies derive_call_directions_pass.cpp around R-seq promotion and cached analysis checks; closely related.
fix(ir): apply call-direction promotion to enclosing-param roots #1211 — Also touches CallDirectionMutator and R-seq promotion logic; likely overlapping intent and risks.

Suggested reviewers

lyfne123
Hzfengsy

Poem

🐰 I nibble code and hop with glee,
Wide Mat splits tuck snug as can be.
Left and Right now dance in pair,
Accumulating sums with flair,
Rabbit cheers — the tests agree! 🥕✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 12.50% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly and specifically describes the main change: adding a test for the Mat slice to Left matmul pattern from issue `#1198`. |
| Linked Issues check | ✅ Passed | The PR adds a system test validating the Mat slice to Left matmul pattern required by issue `#1198`, implementing the test case that demonstrates the desired slice-to-Left behavior. |
| Out of Scope Changes check | ✅ Passed | All changes are focused on adding a new runtime test module for the Mat slice to Left pattern; no out-of-scope modifications to unrelated code. |
| Description check | ✅ Passed | The pull request description comprehensively relates to the changeset, detailing the new test case for the Mat slice to Left matmul pattern, including the tensor shapes, program structure, expected codegen behavior, and testing validation. |

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

tests/st/runtime/test_mat_slice_to_left.py (1)
131-138: Assert codegen pattern explicitly, not only result.passed.

This test currently proves correctness/compilability, but it doesn’t explicitly lock the regression target (pl.slice(Mat) -> pto.subview, then pl.move(subview, Left) -> pto.tmov). Adding codegen-text assertions here would make the test much more robust against future lowering drift.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/st/runtime/test_mat_slice_to_left.py` around lines 131 - 138, The test
should not only assert result.passed but also verify the generated IR/text
contains the expected lowering patterns: check that the lowering of
TestMatSliceToLeft includes a pl.slice(Mat) -> pto.subview sequence and a
pl.move(subview, Left) -> pto.tmov sequence; after calling
test_runner.run(TestMatSliceToLeft) use the runner's returned artifact that
exposes generated code (e.g., result.codegen or result.ir) and add assertions
that those exact substrings/patterns appear (or use regexes) to lock the
regression target in the TestMatSliceToLeft test.
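
One way to express the suggested codegen check is an ordered substring assertion over the generated text. The sketch below is an assumption-laden illustration: `assert_ops_in_order` is a hypothetical helper, and `SAMPLE_IR` is made-up placeholder text rather than real PTO output. How the generated code is actually obtained from the test runner is not shown in this thread, so that part is left out.

```python
import re

def assert_ops_in_order(ir_text, ops):
    """Assert that each op name occurs in ir_text, in the given order."""
    pos = 0
    for op in ops:
        pattern = re.compile(r"\b" + re.escape(op) + r"\b")
        match = pattern.search(ir_text, pos)
        assert match is not None, f"expected {op!r} after offset {pos}"
        pos = match.end()

# Placeholder text only: the operand syntax here is invented for illustration.
SAMPLE_IR = """
%lo = pto.subview %mat
%hi = pto.subview %mat
%left0 = pto.tmov %lo
%left1 = pto.tmov %hi
"""

# Passes: a subview appears before a tmov in the sample text.
assert_ops_in_order(SAMPLE_IR, ["pto.subview", "pto.tmov"])
```

Checking order as well as presence catches a lowering that still emits both ops but no longer moves from the subview.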

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/st/runtime/test_mat_slice_to_left.py`:
- Line 18: Docstring text contains the Unicode multiplication sign "×" which
triggers Ruff RUF002; update the docstring line containing the phrase "K-chunks
into one wider GM load (row = 2 × 128 × 2 = 512 B) and splitting" in
tests/st/runtime/test_mat_slice_to_left.py to use the ASCII letter "x" (e.g.,
"row = 2 x 128 x 2 = 512 B") so the comment no longer contains ambiguous Unicode
characters.
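
The arithmetic in that docstring line can be checked directly, along with the ASCII replacement Ruff asks for. This is a small sketch under assumptions: `K = 128` is read off the "2 x 128 x 2" expression, and the constant and helper names are illustrative, not taken from the test file.

```python
BF16_BYTES = 2      # bfloat16 is a 16-bit type
K = 128             # K-chunk width, assumed from the docstring's "2 x 128 x 2"
GM_ROW_ALIGN = 512  # row-alignment constraint quoted in the PR summary

def row_bytes(cols, elem_bytes=BF16_BYTES):
    """Bytes in one row of a tile with the given number of columns."""
    return cols * elem_bytes

def meets_alignment(nbytes, align=GM_ROW_ALIGN):
    """True when a row size is a whole multiple of the alignment."""
    return nbytes % align == 0

narrow = row_bytes(K)    # one row of a single [M, K] tile
wide = row_bytes(2 * K)  # one row of the combined [M, 2K] tile

# The RUF002 fix: swap the Unicode multiplication sign for an ASCII "x".
original = "row = 2 \u00d7 128 \u00d7 2 = 512 B"
fixed = original.replace("\u00d7", "x")
```

Under these assumptions a single [M, K] row is only half the alignment unit, while the [M, 2K] row fills it exactly, which matches the motivation given for loading the wide tile.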


ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 5b292123-43d0-45ab-99fe-5b4291d8f8b9

📥 Commits

Reviewing files that changed from the base of the PR and between 715927a and fcc9c07.

📒 Files selected for processing (1)

tests/st/runtime/test_mat_slice_to_left.py

gemini-code-assist

Code Review

This pull request introduces a new system test, test_mat_slice_to_left.py, which validates a specific memory access pattern: loading a wide BF16 tile into Mat memory to meet alignment requirements, slicing it into subviews, and moving those subviews to Left memory for matmul operations. This test specifically targets scenarios encountered in Qwen3 decode kernels. I have no feedback to provide.

github-project-automation Bot added this to pto project Apr 29, 2026

coderabbitai Bot reviewed Apr 29, 2026

View reviewed changes

Comment thread tests/st/runtime/test_mat_slice_to_left.py Outdated

gemini-code-assist Bot reviewed Apr 29, 2026

View reviewed changes

Crystal-wzy force-pushed the bugfix branch 4 times, most recently from ee59324 to 70dd9e9 Compare April 29, 2026 08:42

Crystal-wzy force-pushed the bugfix branch 2 times, most recently from 17bdc62 to 3aa67fd Compare April 30, 2026 03:06

Crystal-wzy wants to merge 1 commit into hw-native-sys:main from Crystal-wzy:bugfix
