[ADR] InsertSync memory-phi/canonical-writer dedup architecture by TaoTao-real · Pull Request #604 · hw-native-sys/PTOAS

TaoTao-real · 2026-04-29T08:30:44Z

Summary

This PR adds an ADR for architecture-level InsertSync improvement using Memory-Phi / Canonical-Writer modeling to deduplicate semantically equivalent sync chains in loop + if/else scenarios.

Why

Current path-intersection logic merges sync state only, but cannot merge semantically equivalent dependency edges created from branch-local physical writers. This can lead to duplicated loop-carried chains (including extra seed/drain) and inflated sync count.

What is included

New design doc: docs/designs/ptoas-insertsync-memory-phi-canonical-writer-design.md
Architecture decisions, data model, invariants, algorithm phases, validation gates, and risk controls

Scope

Docs-only ADR PR
No implementation changes in this PR

Follow-up implementation

Implementation will be tracked in follow-up PRs, gated by regression suites including #428/#454 and issue-style workloads (#226/#233/FA).

gemini-code-assist

Code Review

This pull request introduces a design document for a new Memory-Phi and Canonical-Writer architecture within the InsertSync layer to optimize synchronization chains in complex control flows. The review feedback highlights the need for better definitions of key terms like 'writer equivalence' and 'loop-carried candidates,' requests clarification on the scopeId field, points out inconsistencies in the SyncDedupKey definition across sections, and notes a typo in the document's date.

gemini-code-assist · 2026-04-29T08:32:51Z

+# PTOAS InsertSync: Memory-Phi / Canonical-Writer 去重合并架构设计（ADR）
+
+- Status: Proposed
+- Date: 2026-04-29


The date 2026-04-29 appears to be in the future. If this is a typo, please correct it to the actual date of proposal to avoid confusion about the document's timeline.

gemini-code-assist · 2026-04-29T08:32:51Z

+
+建议新增或扩展如下类型：
+
+1. `CanonicalMemKey { rootBuffer, groupId, depKind, scopeId }`


The scopeId field in CanonicalMemKey could benefit from more explanation. Could you clarify what this ID represents? For instance, is it a unique identifier for a control flow scope (like a loop or branch) from the region tree mentioned in section 5.1? A more detailed description would help in understanding its role in the memory key.

gemini-code-assist · 2026-04-29T08:32:51Z

+2. `CanonicalWriterId { kind=Def|BranchPhi|LoopPhi, id }`
+3. `MemoryPhiNode { kind, incomingDefs, mergeScopeId }`
+4. `CanonicalDepEdge { producer, consumer, srcPipe, dstPipe, depKind, loopScopeId }`
+5. `SyncDedupKey { producer, consumerAnchor, srcPipe, dstPipe, depKind, loopScopeId, slotClass }`


There appears to be a slight inconsistency in the definition of SyncDedupKey between sections.

Section 4.2 (line 59) describes the key with producerCanonicalWriter and loopScope.

Section 5.2 (line 94) defines it with producer and loopScopeId.

To improve clarity and ensure consistency, please unify these terms. For example, you could clarify if producer is the CanonicalWriterId and use either loopScope or loopScopeId consistently throughout the document.

gemini-code-assist · 2026-04-29T08:32:51Z

+1. 收集 `def/use` 与控制流作用域。
+2. 按 `(rootBuffer, groupId)` 建立 `MemoryDef` 序列。
+3. 在 `if/else` 汇合创建 `BranchPhi`。
+4. 在循环头创建 `LoopPhi`（若存在 loop-carried 候选）。


The document states that a LoopPhi is created "若存在 loop-carried 候选" (if a loop-carried candidate exists). This condition is a bit vague. Could you please elaborate on the criteria for identifying a "loop-carried candidate"? Providing more detail on this would make the algorithm clearer and reduce ambiguity during implementation.

gemini-code-assist · 2026-04-29T08:32:51Z

+
+缓解策略：
+
+1. 合并前置条件必须包含控制域一致性、writer 等价性、alias 可证明性。


In the mitigation strategies, you mention "writer 等价性" (writer equivalence) as a precondition for merging. This term is ambiguous. Does it refer to writers being of the same operation type, writing to the same buffer, or something else? For example, section 1 mentions merging writes from tmatmul and tmatmul.acc. Clarifying what constitutes "equivalence" in this context is crucial for ensuring correctness.

reedhecre · 2026-04-29T08:41:32Z

Codex Review

该评论由 review 机器人自动更新。

PR: [ADR] InsertSync memory-phi/canonical-writer dedup architecture #604 [ADR] InsertSync memory-phi/canonical-writer dedup architecture
Author: TaoTao-real
Base/Head: main / codex/adr-canonical-writer-memory-phi
Head SHA: 749a24a687ff
Trigger: 检测到新的 open PR
Generated At: 2026-04-29T08:41:31Z
Status: completed

Summary

ADR #604 has two correctness-level design gaps in the canonical dependency model and is missing a loop zero-trip regression gate in its validation plan.

Findings

P1 Canonical writer model does not cover WAR and ACC read/read hazards docs/designs/ptoas-insertsync-memory-phi-canonical-writer-design.md:100

Lines 98-103 route every MemoryDependentAnalyzer candidate through CanonicalWriterAnalysis, and the core types only model CanonicalWriterId / CanonicalDepEdge { producer, consumer, ... }. The current pass still needs non-writer-source hazards: IsMemInfoHasDependency inserts syncs for def/use (WAR) and for the special cross-pipe ACC use/use case because those read/read sequences can fail on device. Those edges do not have a source writer, so the proposed schema cannot represent them. If implemented as written, this will either drop required syncs or miscanonicalize them through an unrelated writer.

P1 CanonicalMemKey drops subview/range information and can merge unrelated slices docs/designs/ptoas-insertsync-memory-phi-canonical-writer-design.md:47

The ADR builds one logical def chain per (rootBuffer, groupId) and the key types only keep rootBuffer/groupId plus scope metadata (lines 47, 82, 90, 124). The existing alias model is range-sensitive: BaseMemInfo carries baseAddresses and allocateSize, and overlap checks use those byte ranges before treating two accesses as the same memory. The repo already has real same-root disjoint-slice patterns such as test/lit/pto/subview_tmatmul_acc_slices_a5.pto. Unless groupId is explicitly defined to encode offset/size-equivalence, BranchPhi/LoopPhi can merge unrelated slices of the same root buffer and dedup away syncs that are still required.

P2 Validation plan omits the existing zero-trip loop regression gate docs/designs/ptoas-insertsync-memory-phi-canonical-writer-design.md:195

Section 11.1 only gates issue428 and issue454, but this ADR also rewrites loop-header/back-edge handling (LoopPhi, seed/drain, eventId decisions). The current implementation has dedicated zero-iteration correctness logic, and the repo already carries test/lit/pto/issue533_loop_zero_trip_sync_regression.pto to ensure post-loop waits are preserved when a loop executes zero times. Leaving that regression out of the must-pass list means a first-iteration/zero-trip regression in the new loop canonicalization can slip through CI.

docs: add ADR for InsertSync canonical-writer memory-phi dedup

749a24a

gemini-code-assist Bot reviewed Apr 29, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[ADR] InsertSync memory-phi/canonical-writer dedup architecture#604

[ADR] InsertSync memory-phi/canonical-writer dedup architecture#604
TaoTao-real wants to merge 1 commit intohw-native-sys:mainfrom
TaoTao-real:codex/adr-canonical-writer-memory-phi

TaoTao-real commented Apr 29, 2026 •

edited

Loading

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

gemini-code-assist Bot Apr 29, 2026

Uh oh!

gemini-code-assist Bot Apr 29, 2026

Uh oh!

gemini-code-assist Bot Apr 29, 2026

Uh oh!

gemini-code-assist Bot Apr 29, 2026

Uh oh!

gemini-code-assist Bot Apr 29, 2026

Uh oh!

reedhecre commented Apr 29, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants


		建议新增或扩展如下类型：

		1. `CanonicalMemKey { rootBuffer, groupId, depKind, scopeId }`


		缓解策略：

		1. 合并前置条件必须包含控制域一致性、writer 等价性、alias 可证明性。

Conversation

TaoTao-real commented Apr 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Why

What is included

Scope

Follow-up implementation

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot Apr 29, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Apr 29, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Apr 29, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Apr 29, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Apr 29, 2026

Choose a reason for hiding this comment

Uh oh!

reedhecre commented Apr 29, 2026

Codex Review

Summary

Findings

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

TaoTao-real commented Apr 29, 2026 •

edited

Loading