Rollup of 15 pull requests#155941
Conversation
…m-portable-simd-2026-01-28
Following rust-lang#89117, rustc has defaulted to the v0 mangling scheme by default (since Nov 20th 2025). This surfaced two bugs: - rust-lang#138261 was a small ICE (found via fuzzing) where an implementation-internal namespace was missing for global assembly - this occurs with names instantiated within global assembly (that can happen inside constants) - rust-lang#134479 only occurs with unstable `generic_const_exprs` Since there have been three-to-four months for users to find bugs with this mangling scheme on nightly, that the scheme has been waiting many years to be stabilised, and has been used successfully internally at Microsoft, Meta and Google for many years, this patch proposes stabilising the v0 mangling scheme on stable. This patch does not propose removing the legacy mangling, it will remain usable on nightly as an escape-hatch if there are remaining bugs (though admittedly it would require switching to nightly for those on stable) - it is anticipated that this would be unlikely given current testing undergone by v0. Legacy mangling can be removed in another follow-up.
Add `bounds::FloatPrimitive` Exhaustive float pattern match Fix GCC use span bugs
This will become the default soon.
…tgross35,RalfJung Merge `fabsf16/32/64/128` into `fabs::<F>` Following [a small conversation on Zulip](https://rust-lang.zulipchat.com/#narrow/channel/131828-t-compiler/topic/Float.20intrinsics/with/521501401) (and because I'd be interested in starting to contribute on Rust), I thought I'd give a try at merging the float intrinsics :) This PR just merges `fabsf16`, `fabsf32`, `fabsf64`, `fabsf128`, as it felt like an easy first target. Notes: - I'm opening the PR for one intrinsic as it's probably easier if the shift is done one intrinsic at a time, but let me know if you'd rather I do several at a time to reduce the number of PRs. - Currently this PR increases LOCs, despite being an attempt at simplifying the intrinsics/compilers. I believe this increase is a one time thing as I had to define new functions and move some things around, and hopefully future PRs/commits will reduce overall LoCs - `fabsf32` and `fabsf64` are `#[rustc_intrinsic_const_stable_indirect]`, while `fabsf16` and `fabsf128` aren't; because `f32`/`f64` expect the function to be const, the generic version must be made indirectly stable too. We'd need to check with T-lang this change is ok; the only other intrinsics where there is such a mismatch is `minnum`, `maxnum` and `copysign`. - I haven't touched libm because I'm not familiar with how it works; any guidance would be welcome!
…alebzulawski,antoyo simd_fmin/fmax: make semantics and name consistent with scalar intrinsics This is the SIMD version of rust-lang#153343: change the documented semantics of the SIMD float min/max intrinsics to that of the scalar intrinsics, and also make the name consistent. The overall semantic change this amounts to is that we restrict the non-determinism: the old semantics effectively mean "when one input is an SNaN, the result non-deterministically is a NaN or the other input"; the new semantics say that in this case the other input must be returned. For all other cases, old and new semantics are equivalent. This means all users of these intrinsics that were correct with the old semantics are still correct: the overall set of possible behaviors has become smaller, no new possible behaviors are being added. In terms of providers of this API: - Miri, GCC, and cranelift already implement the new semantics, so no changes are needed. - LLVM is adjusted to use `minimumnum nsz` instead of `minnum`, thus giving us the new semantics. In terms of consumers of this API: - Portable SIMD almost certainly wants to match the scalar behavior, so this is strictly a bugfix here. - Stdarch mostly stopped using the intrinsic, except on nvptx, where arguably the new semantics are closer to what we actually want than the old semantics (rust-lang/stdarch#2056). Q: Should there be an `f` in the intrinsic name to indicate that it is for floats? E.g., `simd_fminimum_number_nsz`? Also see rust-lang#153395.
The `HashStable` and `ToStableHashKey` traits both have a type parameter that is sometimes called `CTX` and sometimes called `HCX`. (In practice this type parameter is always instantiated as `StableHashingContext`.) Similarly, variables with these types are sometimes called `ctx` and sometimes called `hcx`. This inconsistency has bugged me for some time. The `HCX`/`hcx` form is more informative (the `H`/`h` indicates what type of context it is) and it matches other cases like `tcx`, `dcx`, `icx`. Also, RFC 430 says that type parameters should have names that are "concise UpperCamelCase, usually single uppercase letter: T". In this case `H` feels insufficient, and `Hcx` feels better. Therefore, this commit changes the code to use `Hcx`/`hcx` everywhere.
This file has several methods that take a `FnOnce() -> R` closure: - `DepGraph::with_ignore` - `DepGraph::with_query_deserialization` - `DepGraph::with_anon_task` - `DepGraphData::with_anon_task_inner` It also has two methods that take a faux closure via an `A` argument and a `fn(TyCtxt<'tcx>, A) -> R` argument: - DepGraph::with_task - DepGraphData::with_task The rationale is that the faux closure exercises tight control over what state they have access to. This seems silly when (a) they are passed a `TyCtxt`, and (b) when similar nearby functions take real closures. And they are more awkward to use, e.g. requiring multiple arguments to be gathered into a tuple. This commit changes the faux closures to real closures.
Because the things in this module aren't MIR and don't use anything from `rustc_middle::mir`. Also, modules that use `mono` often don't use anything else from `rustc_middle::mir`.
This allows custom JSON targets to compile, as the assembler was previously failing with an error that `-Zunstable-options` had not been set.
Fix compilation for custom JSON targets
…m-portable-simd-2026-04-11
and provide it to LLVM for better optimization
…=petrochenkov Use closures more consistently in `dep_graph.rs`. This file has several methods that take a `FnOnce() -> R` closure: - `DepGraph::with_ignore` - `DepGraph::with_query_deserialization` - `DepGraph::with_anon_task` - `DepGraphData::with_anon_task_inner` It also has two methods that take a faux closure via an `A` argument and a `fn(TyCtxt<'tcx>, A) -> R` argument: - DepGraph::with_task - DepGraphData::with_task The rationale is that the faux closure exercises tight control over what state they have access to. This seems silly when (a) they are passed a `TyCtxt`, and (b) when similar nearby functions take real closures. And they are more awkward to use, e.g. requiring multiple arguments to be gathered into a tuple. This commit changes the faux closures to real closures. r? @Zalathar
…iser Start using pattern types in libcore cc rust-lang#135996 Replaces the innards of `NonNull` with `*const T is !null`. This does affect LLVM's optimizations, as now reading the field preserves the metadata that the field is not null, and transmuting to another type (e.g. just a raw pointer), will also preserve that information for optimizations. This can cause LLVM opts to do more work, but it's not guaranteed to produce better machine code. Once we also remove all uses of rustc_layout_scalar_range_start from rustc itself, we can remove the support for that attribute entirely and handle all such needs via pattern types
… r=nnethercote preserve SIMD element type information Preserve the SIMD element type and provide it to LLVM for better optimization. This is relevant for AArch64 types like `int16x4x2_t`, see also llvm/llvm-project#181514. Such types are defined like so: ```rust #[repr(simd)] struct int16x4_t([i16; 4]); #[repr(C)] struct int16x4x2_t(pub int16x4_t, pub int16x4_t); ``` Previously this would be translated to the opaque `[2 x <8 x i8>]`, with this PR it is instead `[2 x <4 x i16>]`. That change is not relevant for the ABI, but using the correct type prevents bitcasts that can (indeed, do) confuse the LLVM pattern matcher. This change will make it possible to implement the deinterleaving loads on AArch64 in a portable way (without neon-specific intrinsics), which means that e.g. Miri or the cranelift backend can run them without additional support. discussion at [#t-compiler > loss of vector element type information](https://rust-lang.zulipchat.com/#narrow/channel/131828-t-compiler/topic/loss.20of.20vector.20element.20type.20information/with/584483611)
|
@bors r+ rollup=never p=1000 |
This comment has been minimized.
This comment has been minimized.
…uwer Rollup of 15 pull requests Successful merges: - #155923 (Subtree sync for rustc_codegen_cranelift) - #155930 (Sync from portable simd 2026 04 28) - #155850 (Only exclude the #155473 change for 1-byte bool-likes) - #151994 (switch to v0 mangling by default on stable) - #154325 (Tweak irrefutable let else warning output) - #155273 (Lock stable_crate_ids once in create_crate_num) - #155361 (Document that CFI diverges from Rust wrt. ABI-compatibility rules) - #155692 (disable naked-dead-code-elimination test if no RET mnemonic is available) - #155747 (Update documentation for `wasm32-wali-linux-musl` after integrating n…) - #155768 (compiletest: Overhaul the code for running an incremental test revision) - #155907 (Handle hkl const closures) - #155910 (misc stuff from reading borrowck again :)) - #155913 (Delete the 12 year old fixme) - #155920 (remove review queue triagebot mentions) - #155936 (Rename `SharedContext::emit_dyn_lint*` into `emit_lint*`)
|
💔 Test for 5db1570 failed: CI. Failed job:
|
|
@bors retry |
This comment has been minimized.
This comment has been minimized.
|
Seems stable enough again |
|
Tree is now open for merging. |
What is this?This is an experimental post-merge analysis report that shows differences in test outcomes between the merged PR and its parent PR.Comparing 03c609a (parent) -> 37d85e5 (this PR) Test differencesShow 304 test diffsStage 0
Stage 1
Stage 2
Additionally, 296 doctest diffs were found. These are ignored, as they are noisy. Job group index
Test dashboardRun cargo run --manifest-path src/ci/citool/Cargo.toml -- \
test-dashboard 37d85e592f9ae5f20f7d9a9f99785246fa7298da --output-dir test-dashboardAnd then open Job duration changes
How to interpret the job duration changes?Job durations can vary a lot, based on the actual runner instance |
|
📌 Perf builds for each rolled up PR:
previous master: 03c609abb6 In the case of a perf regression, run the following command for each PR you suspect might be the cause: |
|
Finished benchmarking commit (37d85e5): comparison URL. Overall result: ✅ improvements - no action needed@rustbot label: -perf-regression Instruction countOur most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.
Max RSS (memory usage)Results (primary 3.4%, secondary -3.8%)A less reliable metric. May be of interest, but not used to determine the overall result above.
CyclesResults (secondary 2.2%)A less reliable metric. May be of interest, but not used to determine the overall result above.
Binary sizeResults (primary 0.0%, secondary -2.1%)A less reliable metric. May be of interest, but not used to determine the overall result above.
Bootstrap: 487.089s -> 486.76s (-0.07%) |
Successful merges:
-1forNone#155473 change for 1-byte bool-likes)wasm32-wali-linux-muslafter integrating n… #155747 (Update documentation forwasm32-wali-linux-muslafter integrating n…)SharedContext::emit_dyn_lint*intoemit_lint*#155936 (RenameSharedContext::emit_dyn_lint*intoemit_lint*)r? @ghost
Create a similar rollup