Signed-off-by: Gregory Becker <becker33@llnl.gov>
alalazo left a comment:
Can you confirm you tried a build with Spack before I merge this? I don't have access to an m4.

I don't have access either; I'm going to ask @haampie to test. This came up in a conversation he and I were having on Slack.
It seems gcc 15 added support up to

A quick search gave me this. Wondering if adding
    "gcc": [
      {
        "versions": "14.1:",
        "flags": "-march=armv8.6-a+sme2 -mtune=generic"
See the table for another suggestion. For what it's worth, it seems that using the armv8.6 flag as above is needed for GCC older than v14.
Based on the table in the gcc manual pages, I think these two descriptions are identical, except that the one based on 8.6-a makes explicit that sme2 is included even though sve is not.
See:
arch value | Architecture | Includes by default
-- | -- | --
‘armv8-a’ | Armv8-A | ‘+fp’, ‘+simd’
‘armv8.1-a’ | Armv8.1-A | ‘armv8-a’, ‘+crc’, ‘+lse’, ‘+rdma’
‘armv8.2-a’ | Armv8.2-A | ‘armv8.1-a’
‘armv8.3-a’ | Armv8.3-A | ‘armv8.2-a’, ‘+pauth’
‘armv8.4-a’ | Armv8.4-A | ‘armv8.3-a’, ‘+flagm’, ‘+fp16fml’, ‘+dotprod’
‘armv8.5-a’ | Armv8.5-A | ‘armv8.4-a’, ‘+sb’, ‘+ssbs’, ‘+predres’
‘armv8.6-a’ | Armv8.6-A | ‘armv8.5-a’, ‘+bf16’, ‘+i8mm’
‘armv8.7-a’ | Armv8.7-A | ‘armv8.6-a’, ‘+ls64’
‘armv8.8-a’ | Armv8.8-a | ‘armv8.7-a’, ‘+mops’
‘armv8.9-a’ | Armv8.9-a | ‘armv8.8-a’
‘armv9-a’ | Armv9-A | ‘armv8.5-a’, ‘+sve’, ‘+sve2’
‘armv9.1-a’ | Armv9.1-A | ‘armv9-a’, ‘+bf16’, ‘+i8mm’
‘armv9.2-a’ | Armv9.2-A | ‘armv9.1-a’, ‘+ls64’
‘armv9.3-a’ | Armv9.3-A | ‘armv9.2-a’, ‘+mops’
‘armv9.4-a’ | Armv9.4-A | ‘armv9.3-a’
‘armv8-r’ | Armv8-R | ‘armv8-r’
Actually looking more closely, 9.2-a+nosve would be the same as 8.7-a, not 8.6-a. But the 8.7-a and 9.2-a tables include +wfxt+xs which we don't list as flags on the m4 architecture in archspec, so I'm hesitant to make that change.
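
The comparison can be checked mechanically against the table quoted above. This is a minimal sketch, not anything from archspec or gcc: `ARCH_TABLE` and `features` are names made up here, only the feature flags listed in the quoted table are modelled (so `+wfxt` and `+xs` are not included), and it assumes that `+nosve` also removes `+sve2`.

```python
# Rows transcribed from the table quoted above: arch -> (base arch, added features).
# Only the rows needed for the 8.x / 9.x comparison are included.
ARCH_TABLE = {
    "armv8-a": (None, {"fp", "simd"}),
    "armv8.1-a": ("armv8-a", {"crc", "lse", "rdma"}),
    "armv8.2-a": ("armv8.1-a", set()),
    "armv8.3-a": ("armv8.2-a", {"pauth"}),
    "armv8.4-a": ("armv8.3-a", {"flagm", "fp16fml", "dotprod"}),
    "armv8.5-a": ("armv8.4-a", {"sb", "ssbs", "predres"}),
    "armv8.6-a": ("armv8.5-a", {"bf16", "i8mm"}),
    "armv8.7-a": ("armv8.6-a", {"ls64"}),
    "armv9-a": ("armv8.5-a", {"sve", "sve2"}),
    "armv9.1-a": ("armv9-a", {"bf16", "i8mm"}),
    "armv9.2-a": ("armv9.1-a", {"ls64"}),
}


def features(arch):
    """Collect the default features of `arch` by walking up its base chain."""
    result = set()
    while arch is not None:
        base, added = ARCH_TABLE[arch]
        result |= added
        arch = base
    return result


# Assumption: disabling sve also disables sve2.
no_sve = features("armv9.2-a") - {"sve", "sve2"}

print(no_sve == features("armv8.7-a"))  # True
print(sorted(features("armv8.7-a") - features("armv8.6-a")))  # ['ls64']
```

Within the quoted table, the only listed difference between 8.7-a and 8.6-a is `ls64`, which is why 9.2-a without SVE lines up with 8.7-a rather than 8.6-a.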
Ok after a bunch more reading:
- There is no way to use gcc to target sme instructions while only allowing sve in streaming mode (the only mode that apple-m4 supports).
- Apple only promises that their hardware supports the `armv8.6-a` ISA
- Based on info from `sysctl hw.optional` it looks like the m4 does support the flags that distinguish `armv8.7-a` from `armv8.6-a`.
- Apple mostly supports the `armv9.2-a` ISA, with the exception of SVE non-streaming instructions
- You can mostly disable SVE non-streaming instructions with `-fno-tree-vectorize -fno-tree-slp-vectorize`, but that's not a guarantee.
With all that said, I think the best option for us would be to use -march=armv9.2-a+sme2 -fno-tree-vectorize -fno-tree-slp-vectorize. This will do no autovectorization but would allow sme intrinsics to work.
If we want to be 100% safe, we would need to stay with -march=armv8.6-a+wfxt.
I'll update the PR with my recommendation.
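
For anyone who wants to repeat the `sysctl hw.optional` check mentioned above, here is a rough sketch of how the output could be turned into a feature map. The helper names are hypothetical, and the keys in `SAMPLE` are placeholders for illustration, not output captured from a real m4; the only assumption about the format is that each line looks like `key: value`.

```python
import subprocess


def parse_sysctl(text):
    """Parse 'key: value' lines, keeping only 0/1 values as booleans."""
    flags = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        value = value.strip()
        if sep and value in ("0", "1"):
            flags[key.strip()] = value == "1"
    return flags


def read_hw_optional():
    """Run `sysctl hw.optional` (macOS only) and parse its output."""
    result = subprocess.run(
        ["sysctl", "hw.optional"], capture_output=True, text=True, check=True
    )
    return parse_sysctl(result.stdout)


# Placeholder input: these key names are invented for the example.
SAMPLE = """\
hw.optional.arm.FEAT_EXAMPLE_A: 1
hw.optional.arm.FEAT_EXAMPLE_B: 0
"""

for name, supported in sorted(parse_sysctl(SAMPLE).items()):
    print(name, "yes" if supported else "no")
```

On an actual machine, `read_hw_optional()` would replace the sample text, and the keys of interest would be whichever ones correspond to the features that separate `armv8.7-a` from `armv8.6-a`.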
@becker33 JSON LGTM. Feel free to merge if you don't think we need to test this further. I plan to release a new version of archspec soon-ish.
…tions Signed-off-by: Gregory Becker <becker33@llnl.gov>
For gcc
m1 --> armv8.4-a (available from 8.0)
m2 --> armv8.6-a (available from 10.1)
m3 == m2
m4 --> armv8.6-a+sme2
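
The mapping above could be expressed as a small lookup. This is only an illustrative sketch: `GCC_MARCH` and `march_for` are hypothetical names, not archspec's API, and the minimum gcc version for m4 is taken from the `"versions": "14.1:"` entry in the diff quoted earlier rather than from this summary.

```python
# chip -> (minimum gcc version, -march value), following the summary above.
GCC_MARCH = {
    "m1": ((8, 0), "armv8.4-a"),
    "m2": ((10, 1), "armv8.6-a"),
    "m3": ((10, 1), "armv8.6-a"),  # m3 == m2
    "m4": ((14, 1), "armv8.6-a+sme2"),  # 14.1 assumed from the quoted diff
}


def march_for(chip, gcc_version):
    """Return the -march flag for `chip`, or None if gcc is too old for it."""
    min_version, march = GCC_MARCH[chip]
    if gcc_version < min_version:
        return None
    return f"-march={march}"


print(march_for("m4", (14, 2)))
print(march_for("m2", (9, 5)))
```

Returning `None` for an older compiler is a simplification; what should happen in that case (for example, falling back to a less specific target) is left out of the sketch.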