perf: Optimize string_to_array for scalar args#21131
Conversation
| Ok(()) | ||
| } | ||
|
|
||
| /// String_to_array SQL function |
There was a problem hiding this comment.
I had to move some code around to group the string_to_array functions together in the file, sorry for the noisy diff.
| list_builder.append(true); | ||
| } | ||
| (Some(string), None) => { | ||
| string.chars().map(|c| c.to_string()).for_each(|c| { |
There was a problem hiding this comment.
This was the inefficient NULL delimiter handling mentioned in the PR summary.
| fn get_str_value(array: &ArrayRef, i: usize) -> Option<&str> { | ||
| if array.is_null(i) { | ||
| return None; | ||
| } | ||
| match array.data_type() { | ||
| Utf8 => Some(array.as_string::<i32>().value(i)), | ||
| LargeUtf8 => Some(array.as_string::<i64>().value(i)), | ||
| Utf8View => Some(array.as_string_view().value(i)), | ||
| other => { | ||
| debug_assert!(false, "unexpected type in get_str_value: {other:?}"); | ||
| None | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
| fn get_str_value(array: &ArrayRef, i: usize) -> Option<&str> { | |
| if array.is_null(i) { | |
| return None; | |
| } | |
| match array.data_type() { | |
| Utf8 => Some(array.as_string::<i32>().value(i)), | |
| LargeUtf8 => Some(array.as_string::<i64>().value(i)), | |
| Utf8View => Some(array.as_string_view().value(i)), | |
| other => { | |
| debug_assert!(false, "unexpected type in get_str_value: {other:?}"); | |
| None | |
| } | |
| } | |
| } | |
| #[inline] | |
| fn get_str_value<O: OffsetSizeTrait>( | |
| array: &GenericStringArray<O>, | |
| i: usize, | |
| ) -> Option<&str> { | |
| if array.is_null(i) { | |
| None | |
| } else { | |
| Some(array.value(i)) | |
| } | |
| } |
There was a problem hiding this comment.
Thanks for the suggestion! However, GenericStringArray doesn't handle StringViewArray; the current approach was overall the cleanest way I could think of, but let me know if you know of a better way to do this.
comphead
left a comment
There was a problem hiding this comment.
Thanks @neilconway I left some comments but i didn't check its performace
|
@comphead Thanks for the review! |
comphead
left a comment
There was a problem hiding this comment.
Thanks @neilconway sorry, I lost this ticket
## Which issue does this PR close? - Closes apache#21129. ## Rationale for this change When the delimiter (and null string, if supplied) are scalars, we can implement `string_to_array` more efficiently. In particular, we can construct a `memmem::Finder` and use it to search for delimiters more efficiently. This PR implements this optimization; it also fixes a place where we were allocating an intermediate `String` for every character when the delimiter is `NULL`. (This isn't a common case but worth fixing.) Benchmarks (M4 Max): ``` single_char_delim/5: 34.8 µs (was 61.1 µs) -43% single_char_delim/20: 145.1 µs (was 220.7 µs) -34% single_char_delim/100: 679.4 µs (was 1.04 ms) -35% multi_char_delim/5: 41.7 µs (was 56.7 µs) -27% multi_char_delim/20: 158.9 µs (was 185.1 µs) -14% multi_char_delim/100: 731.4 µs (was 858.3 µs) -15% with_null_str/5: 43.1 µs (was 68.7 µs) -37% with_null_str/20: 179.3 µs (was 244.3 µs) -27% with_null_str/100: 895.8 µs (was 1.16 ms) -23% null_delim/5: 17.4 µs (was 64.1 µs) -73% null_delim/20: 63.0 µs (was 233.4 µs) -73% null_delim/100: 280.2 µs (was 1.12 ms) -75% columnar_delim/5: 65.2 µs (was 60.2 µs) +8% columnar_delim/20: 217.2 µs (was 224.1 µs) -3% columnar_delim/100: 1.02 ms (was 1.05 ms) -3% ``` ## What changes are included in this PR? * Add benchmark for `string_to_array` * Implement optimizations described above * Refactor columnar (fallback) path to get rid of a lot of type dispatch boilerplate * Improve SLT test coverage for the "columnar string, scalar other-args" case ## Are these changes tested? Yes. ## Are there any user-facing changes? No.
Which issue does this PR close?
string_to_arrayfor scalar args #21129.Rationale for this change
When the delimiter (and null string, if supplied) are scalars, we can implement
string_to_arraymore efficiently. In particular, we can construct amemmem::Finderand use it to search for delimiters more efficiently.This PR implements this optimization; it also fixes a place where we were allocating an intermediate
Stringfor every character when the delimiter isNULL. (This isn't a common case but worth fixing.)Benchmarks (M4 Max):
What changes are included in this PR?
string_to_arrayAre these changes tested?
Yes.
Are there any user-facing changes?
No.