trunc: Use an assembly implementation on i586 #1152
tgross35 wants to merge 1 commit into rust-lang:main
Based on #1142 to avoid conflicts.
I don't think we should have this just for the sake of consistency. It has strictly worse performance.
What makes this slow? Is `frndint` that much slower than soft ops? Looking at https://rust.godbolt.org/z/s6GjGsf6E I have no idea whether the latency of 100 is correct or just a worst-case estimate, since I can't find it cited anywhere. Any idea why LLVM inserts the
The `trunc` implementation uses integer operations so currently works fine on i586. However, we already have the other three easy operations based on `frndint`, so add `trunc` and complete the set.
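
For context on what "uses integer operations" means here, the following is a minimal sketch of a bit-manipulation `trunc` for `f64`. It is not the code from this PR or from the crate; the name `trunc_soft` and its constants are hypothetical and chosen only for illustration. The idea is to read the exponent and clear the mantissa bits that lie below the binary point, which needs no floating-point instructions at all.

```rust
/// Hypothetical illustration (not the crate's implementation): round toward
/// zero using only integer operations on the IEEE 754 bit pattern.
fn trunc_soft(x: f64) -> f64 {
    const SIG_BITS: u32 = 52; // explicit mantissa bits in an f64
    const EXP_BIAS: i32 = 1023; // exponent bias of an f64

    let bits = x.to_bits();
    // Unbiased exponent; subnormals and zero come out as -1023.
    let e = ((bits >> SIG_BITS) & 0x7ff) as i32 - EXP_BIAS;

    // |x| >= 2^52, infinity, or NaN: there are no fractional bits to clear.
    if e >= SIG_BITS as i32 {
        return x;
    }

    // |x| < 1: the result is zero carrying the sign of x.
    if e < 0 {
        return f64::from_bits(bits & (1u64 << 63));
    }

    // The low (52 - e) mantissa bits hold the fraction; mask them off.
    let frac_mask = (1u64 << (SIG_BITS as i32 - e)) - 1;
    f64::from_bits(bits & !frac_mask)
}
```

Under these assumptions, `trunc_soft(2.7)` yields `2.0` and `trunc_soft(-0.5)` yields `-0.0`, matching `f64::trunc` from the standard library. An approach of this shape works on any target, which is why the existing integer version already functions on i586 without x87 help.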
This PR was rebased onto a different main commit. Here's a range-diff highlighting what actually changed. Rebasing is a normal part of keeping PRs up to date, so no action is needed; this note is just to help reviewers.
I'll admit I only tested by comparing against the x86-64 implementation. The 32-bit code does look more complex. Measuring
Looks like Agner Fog does provide measurements for x87 instructions too, see For Nehalem the listed latency for
ci: skip-extensive