Skip to content

Fix issue593 by reserving auto reserve_buffer capacity in PlanMemory#601

Open
HecreReed wants to merge 1 commit intohw-native-sys:mainfrom
HecreReed:codex/issue593-planmemory-ub-reuse
Open

Fix issue593 by reserving auto reserve_buffer capacity in PlanMemory#601
HecreReed wants to merge 1 commit intohw-native-sys:mainfrom
HecreReed:codex/issue593-planmemory-ub-reuse

Conversation

@HecreReed
Copy link
Copy Markdown
Collaborator

@HecreReed HecreReed commented Apr 28, 2026

Summary

  • account aligned pto.reserve_buffer auto=true capacity before PlanMemory decides whether local buffers can be placed without reuse
  • use the reduced per-address-space capacity in overflow diagnostics and local-memory planning
  • add an issue593 regression that proves a vec reserve_buffer now forces automatic UB reuse and still passes with --enable-insert-sync

Why this fixes issue593

Issue #593 is not about enlarging UB capacity. The failure happens because PlanMemory currently plans ordinary local buffers against the full local-memory budget, often choosing the no-reuse path, and only afterwards tries to place reserve_buffer auto=true.

In FlashAttention-style cases, that leaves no hole for the reserved FIFO even though reusing dead tiles would have made the program fit.

This change keeps the existing reuse algorithm, but makes auto reserve buffers participate in capacity accounting up front:

  • sum aligned auto reserve sizes per local address space
  • subtract that reserved capacity before deciding whether reuse is needed
  • keep the final auto reserve placement step unchanged

That gives Level1/Level2 code the same automatic UB reuse opportunity that users previously had to spell out manually with Level3 explicit addresses.

Fixes #593.

Validation

  • cmake --build build --target ptoas -j8
  • build/tools/ptoas/ptoas /tmp/issue593_min_repro.pto
  • build/tools/ptoas/ptoas --enable-insert-sync test/lit/pto/issue593_auto_reserve_triggers_reuse.pto
  • python3.9 llvm-lit -sv --filter issue593_auto_reserve_triggers_reuse build/test/lit
  • compare with origin/main, where /tmp/issue593_min_repro.pto fails with error: 'pto.reserve_buffer' op failed to allocate local memory hole for reserve_buffer

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request implements support for fixed-address alloc_tile operations in the PlanMemory pass for levels 1 and 2, allowing constant addresses to be reserved before automatic planning. The changes include new pre-planning logic, validation in ptoas, and updated liveness analysis. Feedback was provided to ensure that memory size reporting correctly calculates the maximum extent across all entries when only fixed-address tiles are present.

Comment thread lib/PTO/Transforms/PTOPlanMemory.cpp Outdated
Comment on lines +1519 to +1523
if (autoEntries.empty()) {
failApplyBufferInfo[rootStorageEntry->bufInfo->bufferScope] =
rootStorageEntry->bitsOffset + rootStorageEntry->alignedConstBits;
return;
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

When autoEntries is empty, all entries in the scope have fixed addresses. The current logic only considers rootStorageEntry when calculating the required memory size for error reporting. It should instead iterate over all scopeEntries to find the maximum extent (offset + size) to provide an accurate report.

  if (autoEntries.empty()) {
    uint64_t maxAllocBits = 0;
    for (StorageEntry *entry : scopeEntries) {
      maxAllocBits = std::max(maxAllocBits, entry->bitsOffset + entry->alignedConstBits);
    }
    failApplyBufferInfo[rootStorageEntry->bufInfo->bufferScope] = maxAllocBits;
    return;
  }

@reedhecre
Copy link
Copy Markdown

reedhecre commented Apr 28, 2026

Codex Review

该评论由 review 机器人自动更新。

Summary

PR #601 引入了一个 reserve_buffer 容量计费回归:对非对齐 size 的 auto reserve 预扣过多容量,可能把本来合法的输入错误判成溢出。

Findings

  1. P2 auto reserve_buffer 的容量预扣按对齐后尺寸计费,和现有 size 语义不一致 lib/PTO/Transforms/PTOPlanMemory.cpp:366

这里把每个 auto reserve_buffer 预扣成 alignUpBytes(size, align),但当前 pass 其他地方并不是按这个 contract 处理的:analyzeReserveBufferPlans() 只校验 baseBytes + sizeBytes <= capacityBytesassignAutoReserveBufferBases() 也只要求 candidateBase + plan.sizeBytes 能放下,说明现有语义是“base 需要对齐,size 按实际字节数计费”。现在改成按 256B 对 size 向上取整后再从 MemPlan 容量里扣减,会把非对齐 size 的 reserve 过度计费;多个小 reserve 时还会把最后一个 reserve 之后本不需要的尾部 padding 也算进去。结果是一些按当前语义本来可以成功分配的 IR,会被错误地逼到 reuse 路径甚至直接报 overflow,属于明确的兼容性/正确性回归。

@HecreReed HecreReed marked this pull request as ready for review April 29, 2026 01:33
@HecreReed HecreReed changed the title Support fixed alloc_tile addr reuse in PlanMemory Fix PlanMemory local-memory capacity units Apr 29, 2026
@HecreReed HecreReed marked this pull request as draft April 29, 2026 05:06
@HecreReed HecreReed force-pushed the codex/issue593-planmemory-ub-reuse branch 2 times, most recently from be17f17 to 8505472 Compare April 29, 2026 06:25
@HecreReed HecreReed changed the title Fix PlanMemory local-memory capacity units Fix issue593 PlanMemory capacity accounting Apr 29, 2026
@HecreReed HecreReed force-pushed the codex/issue593-planmemory-ub-reuse branch from 8505472 to f29a40a Compare April 29, 2026 08:54
@HecreReed HecreReed changed the title Fix issue593 PlanMemory capacity accounting Fix issue593 by reserving auto reserve_buffer capacity in PlanMemory Apr 29, 2026
@HecreReed HecreReed marked this pull request as ready for review April 29, 2026 12:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Allow different tiles reusing/overlapping same physical address range to save buffer usage

2 participants