[asm] Replace RawOp string hacks with typed SALUPhys ops for physical…#1080

Open
harsh-nod wants to merge 1 commit into iree-org:main from harsh-nod:fix_raw

@harsh-nod
Collaborator

… register writes

RawOp was used to emit s_mov_b32/s_mov_b64/s_and_b32/s_or_b32 as raw assembly strings because the Pure trait on SALUUnaryOp/SALUBinaryOp causes DCE to eliminate ops whose SSA results have no consumer — which is exactly the case for SRD setup writes to physical registers.

Introduce SALUPhysUnaryOp and SALUPhysBinaryOp base classes that use the SpecialRegOp trait (non-Pure, prevents DCE/CSE) and take the destination physical register as an input operand rather than an SSA result. Add four concrete ops: S_MOV_B32_PHYS, S_MOV_B64_PHYS, S_AND_B32_PHYS, S_OR_B32_PHYS.
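As a rough illustration of why the Pure trait is the problem here, the following is a toy Python model of dead-code elimination, not the actual MLIR pass or the ops in this PR. The `Op` class, the `dce` function, and the operand order (destination register first) are assumptions made for the sketch.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Op:
    name: str
    operands: List[str]            # values or physical registers the op takes as inputs
    result: Optional[str] = None   # SSA result name, or None if the op has no result
    pure: bool = True              # pure ops may be deleted when their result is unused


def dce(ops: List[Op]) -> List[Op]:
    """Toy DCE: drop pure ops whose result has no consumer, keep everything else."""
    used = set()
    kept = []
    # Walk backwards so every use is seen before the op that defines the value.
    for op in reversed(ops):
        if op.pure and op.result not in used:
            continue
        used.update(op.operands)
        kept.append(op)
    kept.reverse()
    return kept


ops = [
    # Pure form: the write is modelled as an SSA result that nothing consumes.
    Op("s_mov_b32", ["0x1000"], result="%0", pure=True),
    # Phys form: the destination register is an input operand and the op is not pure.
    Op("s_mov_b32_phys", ["s8", "0x1000"], result=None, pure=False),
]
kept = [op.name for op in dce(ops)]
print(kept)
```

In this model the pure `s_mov_b32` disappears because `%0` has no consumer, while the non-pure op with the destination as an operand is left in place.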

Replace 15 RawOp::create calls across emitSRDPrologue(), handleVectorStore() SRD adjustment, and handleFatRawBufferCast() with the new typed ops. Assembly output is identical — same mnemonics, operands, and instruction count.

Three RawOps remain for s_branch, .p2align, and label directives, which have no typed op equivalent.

Add lit test verifying SALUPhys ops survive CSE while Pure variants are still eliminated.
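The property the lit test checks can be sketched with a toy CSE in Python. This is a simplified model under stated assumptions, not the MLIR CSE pass: ops are plain `(name, operands, pure)` tuples, and a real CSE would also rewrite uses of the eliminated result, which is omitted here.

```python
def cse(ops):
    """Toy CSE: a pure op identical to an earlier pure op is dropped.

    ops is a list of (name, operands, pure) tuples. Non-pure ops are never
    merged, even when name and operands match exactly.
    """
    seen = set()
    out = []
    for name, operands, pure in ops:
        key = (name, tuple(operands))
        if pure and key in seen:
            continue  # an identical pure op already exists, so this one is redundant
        if pure:
            seen.add(key)
        out.append((name, operands, pure))
    return out


ops = [
    ("s_mov_b32", ["0x1000"], True),
    ("s_mov_b32", ["0x1000"], True),             # duplicate pure op: eliminated
    ("s_mov_b32_phys", ["s8", "0x1000"], False),
    ("s_mov_b32_phys", ["s8", "0x1000"], False), # duplicate non-pure op: kept
]
survivors = [name for name, _, _ in cse(ops)]
print(survivors)
```

Both `s_mov_b32_phys` entries survive in this model, while the second pure `s_mov_b32` is removed.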

Signed-off-by: Harsh Menon <harsh.menon@amd.com>
@Hardcode84
Contributor

Actually, you can just introduce something like a compose_reg op, which takes multiple individual reg operands and returns the wide one. The op itself can be pure, and you can also get rid of newSrdBase and delegate SGPR selection to regalloc completely.

@Hardcode84
Contributor

So, I looked into this more. We have PackOp, which we can use, but:

  1. It doesn't support SGPRs; we need to update the .td.
  2. Regalloc doesn't respect the PackOp: it can allocate the inputs in non-contiguous registers and then do nothing. We need to do one of the following:
     • Allocate the input regs contiguously from the start
     • Insert movs to move the inputs into contiguous storage
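The second option above can be sketched in Python as follows. This is only an illustration of the idea, not existing regalloc code: register indices are plain integers, `is_contiguous` and `movs_for_pack` are hypothetical helpers, and the destination block starting at `base` is assumed to be free, so the moves cannot clobber a source that is still needed.

```python
def is_contiguous(regs):
    """True if the registers form an ascending run: r, r+1, r+2, ..."""
    return all(b == a + 1 for a, b in zip(regs, regs[1:]))


def movs_for_pack(input_regs, base):
    """Return (dst, src) moves that place input i in register base + i.

    Inputs that already sit in the right register need no move.
    Assumes the block [base, base + len(input_regs)) does not overlap
    any source register that still has to be read.
    """
    return [(base + i, src)
            for i, src in enumerate(input_regs)
            if src != base + i]


allocated = [4, 9, 5, 6]   # what regalloc happened to pick for the pack inputs
moves = movs_for_pack(allocated, base=12)
packed = [12 + i for i in range(len(allocated))]
print(is_contiguous(allocated), moves, is_contiguous(packed))
```

The first option, allocating the inputs contiguously up front, would make `movs_for_pack` return an empty list, at the cost of constraining the allocator earlier.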

2 participants