Implement __sync builtins for thumbv6-none-eabi#1050
tgross35 merged 7 commits into rust-lang:main from
This looks OK to me, but it's a bit beyond my Arm experience.
tgross35
left a comment
Looks pretty good to me, thanks for the patch. This should wait until one of the compiler leads approves the target PR, though.
Nit: could you name this arm_thumb_sync_builtins or similar?
Actually, I think we may as well group this by functionality like we have for other areas, something like:
src/sync/
arm_linux.rs
thumbv6k.rs
arm_thumb_shared.rs // this file
I'll move the aarch64 builtins there too.
concat!("ldrex", $suffix, " {out}, [{dst}]"), // atomic { out = *dst; EXCLUSIVE = dst }
"cmp {out}, {old}", // if out == old { Z = 1 } else { Z = 0 }
"bne 3f", // if Z == 0 { jump 'cmp-fail }
cp15_barrier!(), // fence
"2:", // 'retry:
concat!("strex", $suffix, " {r}, {new}, [{dst}]"), // atomic { if EXCLUSIVE == dst { *dst = new; r = 0 } else { r = 1 }; EXCLUSIVE = None }
"cmp {r}, #0", // if r == 0 { Z = 1 } else { Z = 0 }
"beq 3f", // if Z == 1 { jump 'success }
concat!("ldrex", $suffix, " {out}, [{dst}]"), // atomic { out = *dst; EXCLUSIVE = dst }
"cmp {out}, {old}", // if out == old { Z = 1 } else { Z = 0 }
"beq 2b", // if Z == 1 { jump 'retry }
For my own understanding: why is the barrier needed between the first ldrex/strex, but not between subsequent l/s pairs?
This (omitting the preceding fence when no write is performed) is what LLVM does (https://godbolt.org/z/G7jGEGz6q), but I can rewrite it to always emit two fences.
I don't know enough about the target to know what is better here, so it's your call. Either way, mind adding a comment?
If you get the chance at some point, I would be rather interested in having some of your atomic test suite here. Especially the part you mention since we don't have any no-std tests set up.
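To make the fence question above easier to follow, here is a rough portable-Rust model of the control flow in the quoted excerpt. It is only a sketch, not the PR's code: `cas_model` is a hypothetical name, `compare_exchange_weak` stands in for one `ldrex`/`strex` attempt, and the trailing fence after the loop is an assumption, since the excerpt does not show what follows label `3:`.

```rust
use std::sync::atomic::{fence, AtomicU32, Ordering};

/// Hypothetical model of the quoted sequence; returns the value observed at `dst`.
fn cas_model(dst: &AtomicU32, old: u32, new: u32) -> u32 {
    // First `ldrex` + `cmp` + `bne 3f`: if the comparison fails up front,
    // nothing will be written, so the leading fence is skipped.
    let mut out = dst.load(Ordering::Relaxed);
    if out != old {
        return out;
    }

    // `cp15_barrier!()`: executed once, before the first store attempt.
    fence(Ordering::SeqCst);

    // `2:` retry loop. A weak CAS may fail spuriously, much like `strex`
    // can fail when the exclusive monitor is lost.
    loop {
        match dst.compare_exchange_weak(old, new, Ordering::Relaxed, Ordering::Relaxed) {
            Ok(_) => break, // `strex` succeeded
            Err(current) => {
                out = current;
                if out != old {
                    break; // value changed under us: give up without storing
                }
                // Spurious failure: retry without repeating the leading fence.
            }
        }
    }

    // Assumption: a trailing fence after the loop (not visible in the excerpt).
    fence(Ordering::SeqCst);
    out
}
```

In this model the leading fence runs at most once per call because retries jump back to the store attempt rather than to the top, which is one way to read why subsequent load/store pairs have no barrier between them.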
Assuming this would allow us to set
@davidtwco said “LGTM” on the target but the target is blocked waiting for this PR. But it seems this PR is blocked waiting for the target? @taiki-e - should we still do this one first?
This PR is fine without the target. (See the CI change using a custom target.) (However, the target or patched
tgross35
left a comment
Requested one more comment in #1050 (comment) but otherwise this looks great to me, thank you for the detailed work.
I'd still mildly prefer if rust-lang/rust#150138 merges first so this PR can test against the target as-added without the back-and-forth. But if a few days go by without any movement there, I'll go ahead and apply this.
I think it is easier to understand/use if the target consistently provides atomics regardless of the compiler version from the user's point of view. (In cases where atomic operations are missing, it may be necessary to enable the specific optional feature to enable atomics.) (Also, to run tests, it's preferable that atomics are enabled for the target (manually calling the sync built-in is annoying), so I guess adding no-std tests will need multiple PRs on the compiler-builtins side anyway.)
Fair enough, applied now. I probably won't do a sync until after #1061 (unless that takes a long time) since that may have some UB in atomic fixes. (Bit of a weird issue, any idea if it's the sort of thing your testsuite would have caught?)
Will cause LLVM to emit calls to the __sync routines that were added to compiler-builtins in rust-lang/compiler-builtins#1050.
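For readers unfamiliar with these routines: the `__sync_*` family are the legacy GCC-style atomic entry points (for example `__sync_fetch_and_add_4` and `__sync_val_compare_and_swap_4`, where the suffix is the operand size in bytes) that the compiler can call when it cannot inline an atomic operation. The sketch below only models their semantics on a host machine. The `model_` names are hypothetical, and the bodies use `AtomicU32` rather than the inline assembly this PR adds.

```rust
use core::sync::atomic::{AtomicU32, Ordering};

/// Models `__sync_fetch_and_add_4`: atomically adds `val`, returns the previous value.
///
/// # Safety
/// `ptr` must be valid, 4-byte aligned, and only accessed atomically while shared.
pub unsafe fn model_sync_fetch_and_add_4(ptr: *mut u32, val: u32) -> u32 {
    // SAFETY: guaranteed by the caller (see the function-level contract).
    let atomic = unsafe { AtomicU32::from_ptr(ptr) };
    atomic.fetch_add(val, Ordering::SeqCst)
}

/// Models `__sync_val_compare_and_swap_4`: stores `new` if `*ptr == old`,
/// and returns the value that was observed either way.
///
/// # Safety
/// Same requirements as `model_sync_fetch_and_add_4`.
pub unsafe fn model_sync_val_compare_and_swap_4(ptr: *mut u32, old: u32, new: u32) -> u32 {
    // SAFETY: guaranteed by the caller (see the function-level contract).
    let atomic = unsafe { AtomicU32::from_ptr(ptr) };
    match atomic.compare_exchange(old, new, Ordering::SeqCst, Ordering::SeqCst) {
        Ok(previous) | Err(previous) => previous,
    }
}
```

The `__sync` builtins are specified as full barriers, which is why the model uses `SeqCst` throughout.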
This is a PR for thumbv6-none-eabi (bare-metal Armv6k in Thumb mode), which is proposed to be added by rust-lang/rust#150138.
Armv6k supports atomic instructions, but they are unavailable in Thumb mode unless Thumb-2 instructions are available (v6t2).
Thumb interworking (available via #[instruction_set]) lets us use these instructions even from Thumb mode without Thumb-2 instructions, but LLVM does not implement this (as of LLVM 21), so this PR implements it in compiler-builtins.
The code around the __sync builtins is basically copied from arm_linux.rs, which uses kernel_user_helpers for its atomic implementation.
The atomic implementation is a port of my atomic-maybe-uninit inline assembly code.
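As a small illustration of the `#[instruction_set]` attribute mentioned above (a sketch only, not code from this PR): on Arm targets that support both instruction sets, the attribute asks the compiler to emit one function as A32 code even when the rest of the crate is Thumb. The function name here is hypothetical, and the `cfg_attr` wrapper is added just so the snippet also builds on non-Arm hosts.

```rust
/// Hypothetical helper: compiled as Arm (A32) code on Arm targets,
/// and as ordinary code everywhere else.
#[cfg_attr(target_arch = "arm", instruction_set(arm::a32))]
pub fn wrapping_sum_a32(values: &[u32]) -> u32 {
    // On a Thumb target, callers reach this function through an
    // interworking call; the body itself may use A32-only instructions.
    values.iter().fold(0u32, |acc, v| acc.wrapping_add(*v))
}
```

Inline assembly inside such a function is assembled as A32, which is what makes instructions like `ldrex`/`strex` usable from an otherwise Thumb-1 build.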
This PR has been tested on QEMU 10.2.0 using a patched compiler-builtins and core with the changes from this PR and rust-lang/rust#150138 applied, together with the portable-atomic no-std test suite (can be run with ./tools/no-std.sh thumbv6-none-eabi on that repo), which tests wrappers around core::sync::atomic. (Note that the target spec used in the test sets max-atomic-width to 32 and atomic_cas to true, unlike the current rust-lang/rust#150138.) The original atomic-maybe-uninit implementation has been tested on real Arm hardware.
(Note that Armv6k also supports 64-bit atomic instructions, but they are skipped here. This is because there is no corresponding code in arm_linux.rs (since the kernel requirements increased in 1.64, it may be possible to implement 64-bit atomics there as well; see also taiki-e/portable-atomic#82), and the code becomes more complex than for 32-bit and smaller atomics.)
cc @thejpster (target maintainer)
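From the user's side, the target-spec settings mentioned above (max-atomic-width and atomic_cas) surface as the `target_has_atomic` cfg. The following sketch shows how downstream code might gate on it. `next_id` is a hypothetical function, and the fallback branch assumes a target that still offers 32-bit atomic load/store but no compare-and-swap, used from a single thread of execution.

```rust
use core::sync::atomic::{AtomicU32, Ordering};

/// Targets with full 32-bit atomics (CAS included): one read-modify-write.
#[cfg(target_has_atomic = "32")]
pub fn next_id(counter: &AtomicU32) -> u32 {
    counter.fetch_add(1, Ordering::Relaxed)
}

/// Assumed fallback for load/store-only targets. This is NOT atomic as a
/// whole, so it is only sound without concurrent writers (e.g. interrupts
/// masked by the caller).
#[cfg(not(target_has_atomic = "32"))]
pub fn next_id(counter: &AtomicU32) -> u32 {
    let id = counter.load(Ordering::Relaxed);
    counter.store(id.wrapping_add(1), Ordering::Relaxed);
    id
}
```

With a target spec that enables atomic_cas and a 32-bit max atomic width, only the first definition is compiled, and on this target the read-modify-write would be lowered to a call into the __sync routines.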
I'll undraft this PR once the target maintainer has approved this approach.