
Implement __sync builtins for thumbv6-none-eabi#1050

Merged
tgross35 merged 7 commits into rust-lang:main from taiki-e:thumbv6k
Jan 22, 2026

Conversation

@taiki-e
Member

@taiki-e taiki-e commented Dec 27, 2025

This is a PR for thumbv6-none-eabi (bare-metal Armv6k in Thumb mode), which is proposed to be added by rust-lang/rust#150138.

Armv6k supports atomic instructions, but they are unavailable in Thumb mode unless Thumb-2 instructions are available (v6t2).

Using Thumb interworking (available via #[instruction_set]) allows us to use these instructions even from Thumb mode without Thumb-2 instructions, but LLVM does not implement this (as of LLVM 21), so this PR implements it in compiler-builtins.

The code around the __sync builtins is basically copied from arm_linux.rs, which uses kernel_user_helpers for its atomic implementation.
The atomic implementation is a port of my atomic-maybe-uninit inline assembly code.
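
To illustrate how a `__sync`-style read-modify-write builtin can be layered on a single compare-and-swap primitive, here is a minimal host-side sketch. It uses `std::sync::atomic::AtomicU32` as a stand-in for the target-specific compare-and-swap; the names `cas_u32` and `sync_fetch_and_add_4` are hypothetical and are not taken from this PR.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Stand-in for the target-specific compare-and-swap primitive.
/// Assumption: on the real target this role is played by an ldrex/strex
/// sequence; here the host's native atomic is used instead.
/// Returns the value that was in memory before the operation.
fn cas_u32(dst: &AtomicU32, old: u32, new: u32) -> u32 {
    match dst.compare_exchange(old, new, Ordering::SeqCst, Ordering::SeqCst) {
        Ok(prev) | Err(prev) => prev,
    }
}

/// Hypothetical fetch-and-add built only from `cas_u32`.
fn sync_fetch_and_add_4(dst: &AtomicU32, val: u32) -> u32 {
    let mut cur = dst.load(Ordering::Relaxed);
    loop {
        let new = cur.wrapping_add(val);
        let seen = cas_u32(dst, cur, new);
        if seen == cur {
            return cur; // the swap happened; return the previous value
        }
        cur = seen; // another writer got in first; retry with the fresh value
    }
}

fn main() {
    let x = AtomicU32::new(40);
    let prev = sync_fetch_and_add_4(&x, 2);
    println!("previous = {prev}, now = {}", x.load(Ordering::SeqCst));
}
```

The same loop shape works for the other read-modify-write operations (and, or, xor, and so on) by changing how `new` is computed.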

This PR has been tested on QEMU 10.2.0 using a patched compiler-builtins and core with the changes from this PR and rust-lang/rust#150138 applied, together with the portable-atomic no-std test suite, which tests wrappers around core::sync::atomic (it can be run with ./tools/no-std.sh thumbv6-none-eabi in that repo). (Note that the target spec used in the test sets max-atomic-width to 32 and atomic_cas to true, unlike the current rust-lang/rust#150138.) The original atomic-maybe-uninit implementation has been tested on real Arm hardware.

(Note that Armv6k also supports 64-bit atomic instructions, but they are skipped here. This is because there is no corresponding code in arm_linux.rs and the code becomes more complex than for 32-bit and smaller atomics. Since the kernel requirements increased in 1.64, it may be possible to implement 64-bit atomics there as well; see also taiki-e/portable-atomic#82.)

cc @thejpster (target maintainer)

I'll undraft this PR once the target maintainer has approved this approach.

taiki-e added a commit to taiki-e/portable-atomic that referenced this pull request Dec 27, 2025
@thejpster

This looks OK to me, but it's a bit beyond my Arm experience.

@taiki-e taiki-e marked this pull request as ready for review December 28, 2025 13:00
Contributor

@tgross35 tgross35 left a comment

Looks pretty good to me, thanks for the patch. This should wait until one of the compiler leads approves the target PR, though.

Contributor

Nit: could you name this arm_thumb_sync_builtins or similar?

Contributor

Actually, I think we may as well group this by functionality like we have for other areas, something like:

src/sync/
    arm_linux.rs
    thumbv6k.rs
    arm_thumb_shared.rs // this file

I'll move the aarch64 builtins there too.

Member Author

@taiki-e taiki-e Jan 21, 2026

Done in 5ccc285.

Comment thread compiler-builtins/src/thumbv6k.rs Outdated
Comment on lines +69 to +79
concat!("ldrex", $suffix, " {out}, [{dst}]"), // atomic { out = *dst; EXCLUSIVE = dst }
"cmp {out}, {old}", // if out == old { Z = 1 } else { Z = 0 }
"bne 3f", // if Z == 0 { jump 'cmp-fail }
cp15_barrier!(), // fence
"2:", // 'retry:
concat!("strex", $suffix, " {r}, {new}, [{dst}]"), // atomic { if EXCLUSIVE == dst { *dst = new; r = 0 } else { r = 1 }; EXCLUSIVE = None }
"cmp {r}, #0", // if r == 0 { Z = 1 } else { Z = 0 }
"beq 3f", // if Z == 1 { jump 'success }
concat!("ldrex", $suffix, " {out}, [{dst}]"), // atomic { out = *dst; EXCLUSIVE = dst }
"cmp {out}, {old}", // if out == old { Z = 1 } else { Z = 0 }
"beq 2b", // if Z == 1 { jump 'retry }
Contributor

For my own understanding: why is the barrier needed between the first ldrex/strex, but not between subsequent l/s pairs?

Member Author

This (omitting the preceding fence if no write operation is performed) is what LLVM does (https://godbolt.org/z/G7jGEGz6q), but I can rewrite it to always emit two fences.
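
For readers following the thread, the control flow of the quoted excerpt can be modelled in plain Rust. This is only a toy, single-observer sketch: `Word`, `forced_failures`, and `leading_fences` are hypothetical names made up for illustration, the exclusive monitor is reduced to a boolean, and whatever follows label `3:` in the real file is outside the excerpt and is not modelled.

```rust
/// Toy model of one memory word plus its exclusive monitor.
struct Word {
    value: u32,
    exclusive: bool,      // EXCLUSIVE marker: set by ldrex, cleared by strex
    forced_failures: u32, // how many upcoming strex calls should fail
    leading_fences: u32,  // how often the fence before the store loop ran
}

impl Word {
    fn new(value: u32, forced_failures: u32) -> Self {
        Word { value, exclusive: false, forced_failures, leading_fences: 0 }
    }

    /// ldrex: read the word and mark it for exclusive access.
    fn ldrex(&mut self) -> u32 {
        self.exclusive = true;
        self.value
    }

    /// strex: store only if the marker is still set; 0 = success, 1 = failure.
    /// The marker is cleared either way.
    fn strex(&mut self, new: u32) -> u32 {
        let mut ok = self.exclusive;
        self.exclusive = false;
        if self.forced_failures > 0 {
            // Simulate another observer invalidating the reservation.
            self.forced_failures -= 1;
            ok = false;
        }
        if ok {
            self.value = new;
            0
        } else {
            1
        }
    }
}

/// Mirrors the branches of the quoted excerpt and returns the observed value.
fn cmpxchg(w: &mut Word, old: u32, new: u32) -> u32 {
    let mut out = w.ldrex();
    if out != old {
        return out; // `bne 3f`: compare failed, nothing stored, fence skipped
    }
    w.leading_fences += 1; // `cp15_barrier!()`: runs once, before the loop
    loop {
        // `2:` retry label
        if w.strex(new) == 0 {
            return out; // `beq 3f`: store succeeded
        }
        out = w.ldrex();
        if out != old {
            return out; // falls through past `beq 2b`: value changed meanwhile
        }
    }
}

fn main() {
    let mut hit = Word::new(5, 2); // the first two stores fail, the third succeeds
    let seen = cmpxchg(&mut hit, 5, 9);
    println!("hit: saw {seen}, now {}, leading fences {}", hit.value, hit.leading_fences);

    let mut miss = Word::new(5, 0);
    let seen = cmpxchg(&mut miss, 7, 9);
    println!("miss: saw {seen}, now {}, leading fences {}", miss.value, miss.leading_fences);
}
```

In this model the fence count stays at one regardless of how many times the store is retried, and stays at zero when the initial comparison fails, which is the behaviour being asked about.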

Contributor

I don't know enough about the target to know what is better here, so it's your call. Either way, mind adding a comment?

Member Author

Added a comment in 2a6a68c.

@tgross35
Contributor

tgross35 commented Jan 4, 2026

> This PR has been tested on QEMU 10.2.0 using a patched compiler-builtins and core with the changes from this PR and rust-lang/rust#150138 applied, together with the portable-atomic no-std test suite, which tests wrappers around core::sync::atomic (it can be run with ./tools/no-std.sh thumbv6-none-eabi in that repo).

If you get the chance at some point, I would be rather interested in having some of your atomic test suite here. Especially the part you mention since we don't have any no-std tests set up.

> (Note that Armv6k also supports 64-bit atomic instructions, but they are skipped here. This is because there is no corresponding code in arm_linux.rs and the code becomes more complex than for 32-bit and smaller atomics. Since the kernel requirements increased in 1.64, it may be possible to implement 64-bit atomics there as well; see also taiki-e/portable-atomic#82.)

Assuming this would allow us to set target_has_atomic = "64", I think that would be worth it. Just not pressing, of course.
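
As a rough illustration of what `target_has_atomic = "64"` means for downstream code, the sketch below picks a counter type based on that cfg. The `Counter` alias and `counter_bits` function are hypothetical names for this example only; the 32-bit fallback assumes the target has at least 32-bit atomics.

```rust
use std::sync::atomic::Ordering;

// Use a 64-bit atomic counter where the target provides one...
#[cfg(target_has_atomic = "64")]
type Counter = std::sync::atomic::AtomicU64;

// ...and fall back to 32 bits otherwise (assumes 32-bit atomics exist).
#[cfg(not(target_has_atomic = "64"))]
type Counter = std::sync::atomic::AtomicU32;

/// Width in bits of the counter type selected at compile time.
fn counter_bits() -> u32 {
    (std::mem::size_of::<Counter>() * 8) as u32
}

fn main() {
    let c = Counter::new(0);
    c.fetch_add(1, Ordering::SeqCst);
    println!("counter is {} bits wide, value {}", counter_bits(), c.load(Ordering::SeqCst));
}
```

Code written this way compiles on both kinds of target, but only gets the wider counter once the target advertises 64-bit atomics.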

@thejpster

@davidtwco said “LGTM” on the target but the target is blocked waiting for this PR. But it seems this PR is blocked waiting for the target?

@taiki-e - should we still do this one first?

@taiki-e
Member Author

taiki-e commented Jan 21, 2026

@thejpster

> But it seems this PR is blocked waiting for the target?

This PR is fine without the target. (See the CI change using a custom target.)

(However, the target or patched core is required to run the test suite not included in this PR.)

@taiki-e taiki-e requested a review from tgross35 January 21, 2026 16:31
Contributor

@tgross35 tgross35 left a comment

Requested one more comment in #1050 (comment) but otherwise this looks great to me, thank you for the detailed work.

I'd still mildly prefer if rust-lang/rust#150138 merges first so this PR can test against the target as-added without the back-and-forth. But if a few days go by without any movement there, I'll go ahead and apply this.

@taiki-e
Member Author

taiki-e commented Jan 22, 2026

> I'd still mildly prefer if rust-lang/rust#150138 merges first so this PR can test against the target as-added without the back-and-forth. But if a few days go by without any movement there, I'll go ahead and apply this.

From the user's point of view, I think the target is easier to understand and use if it consistently provides atomics regardless of the compiler version. (Where atomic operations are missing, it may be necessary to enable a specific optional feature to enable atomics.)

(Also, to run tests, it's preferable that atomics are enabled for the target, since manually calling the sync built-ins is annoying, so I guess adding no-std tests will need multiple PRs on the compiler-builtins side anyway.)

@tgross35 tgross35 enabled auto-merge (squash) January 22, 2026 09:51
@tgross35 tgross35 merged commit 723335f into rust-lang:main Jan 22, 2026
40 checks passed
@tgross35
Contributor

Fair enough, applied now. I probably won't do a sync until after #1061 (unless that takes a long time) since that may include some fixes for UB in atomics. (Bit of a weird issue, any idea if it's the sort of thing your test suite would have caught?)

@taiki-e taiki-e deleted the thumbv6k branch January 22, 2026 11:49
thejpster added a commit to thejpster/rust that referenced this pull request Jan 22, 2026
Will cause LLVM to emit calls to the __sync routines that were added
to compiler-builtins in rust-lang/compiler-builtins#1050.