[arrow-ord]: add REE slice offset regression test #5
Draft
alamb wants to merge 3 commits into polarsignals:asubiotto/reecmp from
This commit implements native comparisons on REE-encoded arrays which are treated similarly to dictionary indirection.

This commit implements REE to scalar comparisons by operating on the physical values only then bulk expanding the boolean result. REE-to-REE comparisons are also optimized by computing aligned physical value runs to minimize comparisons. Mixed cases (REE vs flat) materialize a logical index mapping similar to dictionaries. This commit also supports REE<Dict>.

For comparison, here are the benchmark results with flat arrays as a reference on my local machine:

```
eq Int32                                            time: [14.955 µs 15.162 µs 15.396 µs]
eq scalar Int32                                     time: [11.379 µs 11.418 µs 11.459 µs]
ree_comparison/eq_ree_scalar(phys=64,log=65536)     time: [453.31 ns 454.88 ns 456.43 ns]
ree_comparison/eq_ree_scalar(phys=1024,log=65536)   time: [4.1224 µs 4.1298 µs 4.1368 µs]
ree_comparison/eq_ree_scalar(phys=32768,log=65536)  time: [93.506 µs 94.085 µs 94.993 µs]
ree_comparison/eq_ree_ree(phys=64,log=65536)        time: [413.96 ns 414.82 ns 415.87 ns]
ree_comparison/eq_ree_ree(phys=1024,log=65536)      time: [4.1597 µs 4.1660 µs 4.1749 µs]
ree_comparison/eq_ree_ree(phys=32768,log=65536)     time: [128.74 µs 144.40 µs 161.53 µs]
```

As is expected, the more we take advantage of REE encoding, the faster the comparisons are.

Signed-off-by: Alfonso Subiotto Marques <alfonso.subiotto@polarsignals.com>
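To make the REE-to-scalar strategy from the commit message concrete, here is a minimal sketch using plain `Vec`s rather than the arrow-rs types. The `Ree` struct and `eq_scalar` function are hypothetical names for illustration only and are not the code in this PR: each run's physical value is compared once, and the boolean result is then expanded over the run's logical length.

```rust
/// Simplified stand-in for a run-end encoded array (not an arrow-rs type):
/// `run_ends[i]` is the exclusive logical end of run `i`, and `values[i]`
/// is the physical value repeated across that run.
struct Ree {
    run_ends: Vec<usize>,
    values: Vec<i32>,
}

/// Compare every logical element with `scalar` by doing one comparison per
/// physical value, then bulk-expanding the result to the logical length.
fn eq_scalar(ree: &Ree, scalar: i32) -> Vec<bool> {
    let logical_len = ree.run_ends.last().copied().unwrap_or(0);
    let mut out = Vec::with_capacity(logical_len);
    for (&end, &value) in ree.run_ends.iter().zip(&ree.values) {
        let hit = value == scalar; // one comparison per run
        out.resize(end, hit); // fill the output up to this run's end
    }
    out
}
```

With 64 physical values backing 65536 logical ones, this does 64 comparisons instead of 65536, which is consistent with the trend in the benchmark numbers above.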
alamb commented Apr 14, 2026
    }

    #[test]
    fn test_ree_sliced_different_offsets() {
This test should pass, but it currently fails with:
cargo test -p arrow-ord test_ree_sliced_different_offsets -- --nocapture
Compiling arrow-ord v58.1.0 (/private/tmp/arrow-pr9621-review/arrow-ord)
Finished `test` profile [unoptimized + debuginfo] target(s) in 0.75s
Running unittests src/lib.rs (target/debug/deps/arrow_ord-144a95337e3b3020)
running 1 test
thread 'cmp::tests::test_ree_sliced_different_offsets' (36129756) panicked at arrow-ord/src/cmp.rs:1396:9:
assertion `left == right` failed
left: BooleanArray
[
true,
false,
true,
true,
]
right: BooleanArray
[
true,
true,
true,
true,
]
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
test cmp::tests::test_ree_sliced_different_offsets ... FAILED
failures:
failures:
cmp::tests::test_ree_sliced_different_offsets
test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 308 filtered out; finished in 0.00s
error: test failed, to rerun pass `-p arrow-ord --lib`
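For context, the property this test exercises is that two REE slices with different offsets must be compared by logical position, with each slice's own offset applied before the run lookup. The sketch below models that with plain slices. `ReeSlice`, `get`, and `eq_slices` are hypothetical names, this is not the arrow-ord implementation, and it is not a diagnosis of where the failure above comes from.

```rust
/// Simplified model of a sliced run-end encoded array (not an arrow-rs type).
/// `run_ends[i]` is the exclusive logical end of run `i` in the unsliced
/// array; the slice covers logical positions `offset..offset + len`.
struct ReeSlice<'a> {
    run_ends: &'a [usize],
    values: &'a [i32],
    offset: usize,
    len: usize,
}

impl ReeSlice<'_> {
    /// Resolve slice-relative index `i` to its value: shift by the slice
    /// offset, then binary-search for the run that contains that position.
    fn get(&self, i: usize) -> i32 {
        assert!(i < self.len);
        let logical = self.offset + i;
        // Index of the first run whose exclusive end is greater than `logical`.
        let run = self.run_ends.partition_point(|&end| end <= logical);
        self.values[run]
    }
}

/// Element-wise equality of two slices of equal logical length.
fn eq_slices(a: &ReeSlice<'_>, b: &ReeSlice<'_>) -> Vec<bool> {
    assert_eq!(a.len, b.len);
    (0..a.len).map(|i| a.get(i) == b.get(i)).collect()
}
```

Two slices that start partway through differently shaped runs but cover the same logical values should compare equal at every position, for example:

```rust
let a = ReeSlice { run_ends: &[2, 5, 8], values: &[1, 2, 3], offset: 1, len: 4 };
let b = ReeSlice { run_ends: &[3, 6], values: &[1, 2], offset: 2, len: 4 };
assert_eq!(eq_slices(&a, &b), vec![true, true, true, true]);
```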
cfc2a0a to a0a7521