Added planar types to speed up complex half precision GEMMs #1142
cliffburdick wants to merge 9 commits into main
Greptile Summary
This PR introduces
Confidence Score: 4/5
Safe to merge except when MATX_EN_JIT is enabled: the interleaved JIT path will produce incorrect imaginary values until the plane-offset bug is fixed.
All three previously raised P1 concerns are resolved. One new P1 correctness bug was found in the JIT path of ComplexInterleavedOp; it must be fixed before this PR can be considered fully correct for JIT users.
include/matx/operators/interleaved.h (JIT offset bug at line 93)
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A["matmul(A, B) to C (complex half)"] --> B{Any tensor already planar?}
B -- "A not planar" --> C["planar(A) to a_hp; a_adj.Reset(a_hp)"]
B -- "A already planar" --> D["a_adj unchanged"]
C --> E
D --> E
B -- "B not planar" --> F["planar(B) to b_hp; b_adj.Reset(b_hp)"]
B -- "B already planar" --> G["b_adj unchanged"]
F --> H
G --> H
B -- "C not planar" --> I["Allocate c_hp; c_adj.Reset(c_hp)"]
B -- "C already planar" --> J["c_adj.Reset(c.Data())"]
I --> K
J --> K
E & H & K --> L["cuBLASLt GEMM with PLANE_OFFSET"]
L --> M{c_is_planar?}
M -- No --> N["interleaved(c_adj) to c"]
M -- Yes --> O["No conversion needed"]
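The conversions in the flowchart can be sketched in plain C++ as a rough illustration of the two layouts. This is not the MatX implementation: `to_planar` and `to_interleaved` are hypothetical helper names, `float` stands in for half precision, and the imaginary plane is assumed to start exactly one plane length (`plane_offset` elements) after the real plane.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical helper (not a MatX API): split interleaved storage
// [re0, im0, re1, im1, ...] into planar storage [re0, re1, ..., im0, im1, ...].
// Assumption: the imaginary plane begins plane_offset elements after the real plane.
std::vector<float> to_planar(const std::vector<std::complex<float>>& in) {
    const std::size_t plane_offset = in.size();
    std::vector<float> out(2 * plane_offset);
    for (std::size_t i = 0; i < in.size(); ++i) {
        out[i] = in[i].real();                 // real plane
        out[i + plane_offset] = in[i].imag();  // imaginary plane
    }
    return out;
}

// Hypothetical inverse: rebuild interleaved complex values from planar storage.
// Reading the imaginary part from any index other than i + plane_offset would
// return wrong imaginary values while leaving the real parts intact.
std::vector<std::complex<float>> to_interleaved(const std::vector<float>& in) {
    assert(in.size() % 2 == 0);
    const std::size_t plane_offset = in.size() / 2;
    std::vector<std::complex<float>> out(plane_offset);
    for (std::size_t i = 0; i < plane_offset; ++i) {
        out[i] = std::complex<float>(in[i], in[i + plane_offset]);
    }
    return out;
}
```

In this sketch, a tensor that is already planar skips `to_planar`, and an output that is already planar skips `to_interleaved`, which mirrors the "already planar" branches above.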
/build
No description provided.