2285 commits
ace9cd4
perf: Optimize trim UDFs for single-character trims (#20328)
neilconway Feb 20, 2026
1ee782f
Migrate Python usage to uv workspace (#20414)
adriangb Feb 20, 2026
0f7a405
feat: support Spark-compatible `json_tuple` function (#20412)
CuteChuanChuan Feb 20, 2026
a936d0d
test: Extend Spark Array functions: `array_repeat `, `shuffle` and `s…
erenavsarogullari Feb 20, 2026
7f99947
chore: Cleanup "!is_valid(i)" -> "is_null(i)" (#20453)
neilconway Feb 21, 2026
fc98d5c
feat: Implement Spark `bitmap_bucket_number` function (#20288)
kazantsev-maksim Feb 21, 2026
1736fd2
refactor: Extract sort-merge join filter logic into separate module (…
viirya Feb 21, 2026
0d63ced
Implement FFI table provider factory (#20326)
davisp Feb 21, 2026
42dd427
bench: Add criterion benchmark for sort merge join (#20464)
andygrove Feb 21, 2026
d036015
chore: group minor dependencies into single PR (#20457)
comphead Feb 21, 2026
d2c5666
chore(deps): bump taiki-e/install-action from 2.68.0 to 2.68.6 (#20467)
dependabot[bot] Feb 21, 2026
626bc01
chore(deps): bump astral-sh/setup-uv from 6.1.0 to 7.3.0 (#20468)
dependabot[bot] Feb 21, 2026
043f908
chore(deps): bump the all-other-cargo-deps group with 6 updates (#20470)
dependabot[bot] Feb 21, 2026
cfdd7c1
chore(deps): bump testcontainers-modules from 0.14.0 to 0.15.0 (#20471)
dependabot[bot] Feb 21, 2026
f488a90
perf: Optimize scalar fast path for `regexp_like` and rejects g insid…
kumarUjjawal Feb 22, 2026
c1ad863
[Minor] Use buffer_unordered (#20462)
Dandandan Feb 22, 2026
bfc012e
bench: Add IN list benchmarks for non-constant list expressions (#20444)
zhangxffff Feb 22, 2026
9660c98
perf: Use zero-copy slice instead of take kernel in sort merge join (…
andygrove Feb 22, 2026
7815732
feat(memory-tracking): implement arrow_buffer::MemoryPool for MemoryP…
notfilippo Feb 23, 2026
60457d0
Runs-on for more actions (#20274)
blaginin Feb 23, 2026
89a8576
docs: Document that adding new optimizer rules are expensive (#20348)
alamb Feb 23, 2026
ed0323a
feat: support `arrays_zip` function (#20440)
comphead Feb 23, 2026
df8f818
chore: Avoid build fails on MinIO rate limits (#20472)
comphead Feb 23, 2026
d303f58
chore: Add end-to-end benchmark for array_agg, code cleanup (#20496)
neilconway Feb 23, 2026
b9328b9
Upgrade to sqlparser 0.61.0 (#20177)
alamb Feb 23, 2026
7602913
Switch to the latest Mac OS (#20510)
blaginin Feb 23, 2026
b6d46a6
perf: Optimize `initcap()` (#20352)
neilconway Feb 24, 2026
d59cdfe
Fix name tracker (#19856)
xanderbailey Feb 24, 2026
11ef486
Runs-on for extended CI checks (#20511)
blaginin Feb 24, 2026
3aa34b3
chore(deps): bump strum from 0.27.2 to 0.28.0 (#20520)
dependabot[bot] Feb 24, 2026
4c0a653
chore(deps): bump taiki-e/install-action from 2.68.6 to 2.68.8 (#20518)
dependabot[bot] Feb 24, 2026
6c79369
chore(deps): bump the all-other-cargo-deps group with 2 updates (#20519)
dependabot[bot] Feb 24, 2026
0dfa542
fix: HashJoin panic with dictionary-encoded columns in multi-key join…
Tim-53 Feb 24, 2026
4a41587
Make `custom_file_casts` example schema nullable to allow null `id` v…
kosiew Feb 24, 2026
a9c0901
Add support for FFI config extensions (#19469)
timsaucer Feb 24, 2026
9c85ac6
perf: Fix quadratic behavior of `to_array_of_size` (#20459)
neilconway Feb 24, 2026
17d770d
fix: handle out of range errors in DATE_BIN instead of panicking (#20…
mishop-15 Feb 24, 2026
670dbf4
fix: prevent duplicate alias collision with user-provided __datafusio…
adriangb Feb 24, 2026
e71e7a3
chore: Cleanup code to use `repeat_n` in a few places (#20527)
neilconway Feb 24, 2026
932418b
chore(deps): bump strum_macros from 0.27.2 to 0.28.0 (#20521)
dependabot[bot] Feb 24, 2026
db5197b
chore: Replace `matches!` on fieldless enums with `==` (#20525)
neilconway Feb 24, 2026
b16ad9b
fix: SortMergeJoin don't wait for all input before emitting (#20482)
rluvaton Feb 24, 2026
fdd36d0
Update comments on OptimizerRule about function name matching (#20346)
alamb Feb 24, 2026
e80694e
Remove recursive const check in `simplify_const_expr` (#20234)
AdamGS Feb 24, 2026
b8cebdd
Fix incorrect regex pattern in regex_replace_posix_groups (#19827)
GaneshPatil7517 Feb 24, 2026
34dad2c
Cache `PlanProperties`, add fast-path for `with_new_children` (#19792)
askalt Feb 24, 2026
585bbf3
perf: Optimize `array_has_any()` with scalar arg (#20385)
neilconway Feb 24, 2026
387e20c
Improve `HashJoinExecBuilder` to save state from previous fields (#20…
askalt Feb 24, 2026
2347306
[Minor] Fix error messages for `shrink` and `try_shrink` (#20422)
hareshkh Feb 25, 2026
d75fcb8
Fix physical expr adapter to resolve physical fields by name, not col…
kosiew Feb 25, 2026
e937cad
[fix] Add type coercion from NULL to Interval to make date_bin more p…
LiaCastaneda Feb 25, 2026
d7d6461
feat: Implement Spark `bin` function (#20479)
kazantsev-maksim Feb 25, 2026
e684994
fix: `cardinality()` of an empty array should be zero (#20533)
neilconway Feb 25, 2026
e894a03
perf: Use Hashbrown for array_distinct (#20538)
neilconway Feb 25, 2026
3a970c5
Clamp early aggregation emit to the sort boundary when using partial …
jackkleeman Feb 25, 2026
33b86fe
perf: Cache num_output_rows in sort merge join to avoid O(n) recount …
andygrove Feb 25, 2026
bcd42b0
fix: Unaccounted spill sort in row_hash (#20314)
EmilyMatt Feb 26, 2026
a026e7d
perf: Optimize heap handling in TopK operator (#20556)
AdamGS Feb 26, 2026
d6fb360
perf: Optimize `array_position` for scalar needle (#20532)
neilconway Feb 26, 2026
e76f0ee
fix: IS NULL panic with invalid function without input arguments (#20…
Acfboy Feb 26, 2026
3ab1301
fix: handle empty delimiter in split_part (closes #20503) (#20542)
gferrate Feb 26, 2026
a257c29
add redirect for old upgrading.html URL to fix broken changelog links…
mishop-15 Feb 26, 2026
a79e6e6
fix(substrait): Correctly parse field references in subqueries (#20439)
neilconway Feb 27, 2026
bc600b3
Split `push_down_filter.slt` into standalone sqllogictest files to re…
kosiew Feb 27, 2026
e583fe9
Add deterministic per-file timing summary to sqllogictest runner (#20…
kosiew Feb 27, 2026
c63ca33
fix: increase ROUND decimal precision to prevent overflow truncation …
kumarUjjawal Feb 27, 2026
7ef62b9
chore: Enable workspace lint for all workspace members (#20577)
neilconway Feb 27, 2026
451c79f
fix: Fix `array_to_string` with columnar third arg (#20536)
neilconway Feb 27, 2026
e567cb9
Fix serde of window lead/lag defaults (#20608)
avantgardnerio Feb 27, 2026
5d8249f
fix: Fix and Refactor Spark `shuffle` function (#20484)
erenavsarogullari Feb 28, 2026
acec058
perf: Use Arrow vectorized eq kernel for IN list with column referenc…
zhangxffff Feb 28, 2026
73fbd48
Upgrade DataFusion to arrow-rs/parquet 58.0.0 / `object_store` 0.13.0…
alamb Feb 28, 2026
3a23bb2
perf: Optimize `array_agg()` using `GroupsAccumulator` (#20504)
neilconway Feb 28, 2026
8df75c3
Document guidance on how to evaluate breaking API changes (#20584)
alamb Feb 28, 2026
8482e2e
feat: support extension planner for `TableScan` (#20548)
linhr Mar 1, 2026
6713439
perf: Optimize `array_to_string()`, support more types (#20553)
neilconway Mar 1, 2026
95de1bf
Add metrics for parquet sink (#20307)
xudong963 Mar 2, 2026
1f37a33
Update DataFusion meetups page on docs (#20629)
alamb Mar 2, 2026
93d177d
Extend dynamic filter to joins that preserve probe side ON (#20447)
helgikrs Mar 2, 2026
0af9ff5
Improve sqllogicteset speed by creating only a single large file rath…
Tim-53 Mar 2, 2026
43584ca
cli: Fix datafusion-cli hint edge cases (#20609)
comphead Mar 2, 2026
02dae77
Speedup sqllogictests by running long running tests first (#20576)
alamb Mar 2, 2026
a5f490e
Add `ExecutionPlan::apply_expressions()` (#20337)
LiaCastaneda Mar 2, 2026
12314c5
perf: Optimize `array_to_string` to avoid a copy (#20639)
neilconway Mar 2, 2026
0bf3767
fix: make the `sql` feature truly optional (#20625)
linhr Mar 3, 2026
657887d
Fix custom metric display (#20643)
gabotechs Mar 3, 2026
476c200
refactor: Set expected runtime config in error message when the used …
erenavsarogullari Mar 3, 2026
3238a7e
more families for the CI (#20663)
blaginin Mar 3, 2026
0ca9d65
CI: Add CodeQL workflow for GitHub Actions security scanning (#20636)
kevinjqliu Mar 3, 2026
d1a3058
perf: Apply logical regexp optimizations to Utf8View and LargeUtf8 in…
petern48 Mar 3, 2026
88fa0df
Add `Field` to `Expr::Cast` -- allow logical expressions to express a…
paleolimbot Mar 3, 2026
d2df7a5
perf: Optimize `array_concat` using `MutableArrayData` (#20620)
neilconway Mar 3, 2026
7f42c8c
chore(deps): bump astral-sh/setup-uv from 7.3.0 to 7.3.1 (#20660)
dependabot[bot] Mar 3, 2026
bcc52cd
chore(deps): bump taiki-e/install-action from 2.68.8 to 2.68.16 (#20661)
dependabot[bot] Mar 3, 2026
2cbee47
fix: use try_shrink instead of shrink in try_resize (#20424)
ariel-miculas Mar 4, 2026
79b5b24
fix: Provide more generic API for the capacity limit parsing (#20372)
erenavsarogullari Mar 4, 2026
3b028fe
Improve formatting of datatypes (#20605)
emilk Mar 4, 2026
a114840
docs: Update `datafusion-cli` doc for `top-memory-consumers` config (…
erenavsarogullari Mar 4, 2026
92d0a5c
Add explain plans for ClickBench queries (#20666)
alamb Mar 4, 2026
10b5f22
[main] Update version to 52.2.0 (#20573)
alamb Mar 4, 2026
909608a
fix: Fix bug in `array_has` scalar path with sliced arrays (#20677)
neilconway Mar 4, 2026
028e351
Add files_processed and files_scanned metrics to FileStreamMetrics (#…
adriangb Mar 4, 2026
d025869
fix: `HashJoin` panic with String dictionary keys (don't flatten keys…
alamb Mar 4, 2026
b092bd4
Update releases links with releases in 2025-2026 (#20630)
alamb Mar 4, 2026
d412ba5
Speedup push_down_filter_regression.slt by using uncompressed parquet…
alamb Mar 4, 2026
b0349ff
feat: support nanosecond date_part (#20674)
mhilton Mar 4, 2026
0f093f4
Implement cardinality_effect for window execs and UnionExec (#20321)
getChan Mar 4, 2026
5d27860
feat: parse `JsonAccess` as a binary operator, add `Operator::Colon` …
Samyak2 Mar 4, 2026
cbe5cb3
ci: Harden labeler workflow, remove unnecessary checkout from pull_re…
kevinjqliu Mar 4, 2026
ddfc282
Add tests for sqllogictest prioritization (#20656)
alamb Mar 4, 2026
340ef60
perf: Optimize `to_char` to allocate less, fix NULL handling (#20635)
neilconway Mar 4, 2026
27c9cda
correct parquet leaf index mapping when schema contains struct cols (…
friendlymatthew Mar 5, 2026
1f0232c
Reattach parquet metadata cache after deserializing in datafusion-pro…
nathanb9 Mar 5, 2026
848cd63
Eliminate deterministic group by keys with deterministic transformati…
Dandandan Mar 5, 2026
46ac990
Wire up with_new_state with DataSource (#20718)
gabotechs Mar 5, 2026
29e8495
chore: Enable `assigning_clones` clippy lint (#20670)
neilconway Mar 5, 2026
c919054
perf: short-circuit and collect_bool for IN list with column referenc…
zhangxffff Mar 5, 2026
953bdf4
feat: Support Spark `array_contains` builtin function (#20685)
comphead Mar 5, 2026
00e36e8
fix: Return `probe_side.len()` for RightMark/Anti count(*) queries (#…
jonathanc-n Mar 5, 2026
13cebf8
FFI_TableOptions are using default values only (#20721)
timsaucer Mar 5, 2026
631c918
perf: sort replace free()->try_grow() pattern with try_resize() to re…
mbutrovich Mar 5, 2026
03d17e8
Improve documentation for `AggregateUdfImpl::simplify` and `WindowUDF…
alamb Mar 5, 2026
a95da70
doc: Add more context to `Precision` (#20713)
jonathanc-n Mar 5, 2026
dd988f6
Fix test that's broken on Windows due to naive path handling (#20692)
Rafferty97 Mar 5, 2026
b3976d6
Fix DELETE/UPDATE filter extraction when predicates are pushed down i…
kosiew Mar 6, 2026
678d1ad
Minor: Add comment explaining rationale to avoid dependencies on func…
alamb Mar 6, 2026
33c922f
use linker optimization for extended sqllogictests (#20740)
blaginin Mar 6, 2026
d72b0b8
fix: preserve None projection semantics across FFI boundary in Foreig…
Kontinuation Mar 6, 2026
02ce571
Push even local limits past windows (#20752)
avantgardnerio Mar 6, 2026
0ac434d
Add case-heavy LEFT JOIN benchmark and debug timing/logging for PushD…
kosiew Mar 7, 2026
5211a8b
Fix repartition from dropping data when spilling (#20672)
xanderbailey Mar 7, 2026
bfa0ea8
Hash join buffering on probe side (#19761)
gabotechs Mar 7, 2026
8fe926d
test: Add `datafusion-cli` `fair` and `unbounded` memory-pool test co…
erenavsarogullari Mar 7, 2026
92078d9
Copy limits before repartitions (#20736)
avantgardnerio Mar 7, 2026
4dbb449
ser/de fetch in FilterExec (#20738)
haohuaijin Mar 8, 2026
1eb5206
feat: Integrate CastColumnExpr into PhysicalExprAdapter (#20269)
kumarUjjawal Mar 9, 2026
37b9a46
feat: `partition_statistics()` for HashJoinExec (#20711)
jonathanc-n Mar 9, 2026
15bc6bd
feat: make DefaultLogicalExtensionCodec support serialisation of buil…
Acfboy Mar 9, 2026
bb421db
Add tests for simplifying multiple aggregate expressions (#20723)
alamb Mar 9, 2026
33b9afa
Allow SQL `TypePlanner` to plan SQL types as extension types (#20676)
paleolimbot Mar 9, 2026
b51edff
Update reverse UDF to emit utf8view when input is utf8view (#20604)
Omega359 Mar 9, 2026
9b3d6a4
Make lower and upper emit Utf8View for Utf8View input (#20616)
kumarUjjawal Mar 9, 2026
097f04c
fix(spark): handle divide-by-zero in Spark `mod`/`pmod` with ANSI mod…
davidlghellin Mar 9, 2026
aca8c14
Fix FilterExec converting Absent column stats to Exact(NULL) (#20391)
fwojciec Mar 9, 2026
44dfa7b
Clean up date_part preimage implementation (#20350)
sdf-jkl Mar 9, 2026
84a22ea
Wrap Arc to Statistics for `partition_statistics` API (#20570)
xudong963 Mar 9, 2026
fd97799
Make Physical CastExpr Field-aware and unify cast semantics across ph…
kosiew Mar 10, 2026
75c7da5
Pass ConfigOptions to scalar UDFs via FFI (#20454)
timsaucer Mar 10, 2026
39226c3
[datafusion-cli] Replace mutex with AtomicU64 for stream duration tra…
buraksenn Mar 10, 2026
af79d14
Make translate emit Utf8View for Utf8View input (#20624)
shivaaang Mar 10, 2026
23b88fb
Allow filters on struct fields to be pushed down into Parquet scan (#…
friendlymatthew Mar 10, 2026
6f86c8d
Used constant with mapping instead of write! to display scalar value …
buraksenn Mar 10, 2026
1f87930
fix: sqllogictest cannot convert <subquery> to Substrait (#19739)
kumarUjjawal Mar 10, 2026
daa8f52
fix: interval analysis error when have two filterexec that inner filt…
haohuaijin Mar 10, 2026
fc514c2
perf: Optimize set operations to avoid RowConverter deserialization o…
neilconway Mar 10, 2026
31a4037
chore(deps): bump taiki-e/install-action from 2.68.16 to 2.68.25 (#20…
dependabot[bot] Mar 10, 2026
64b5228
chore(deps): bump github/codeql-action from 4.32.5 to 4.32.6 (#20843)
dependabot[bot] Mar 10, 2026
8e02b8e
chore: Ignore RUSTSEC-2024-0421 (#20850)
comphead Mar 10, 2026
5af7361
fix: SanityCheckPlan error with window functions and NVL filter (#20231)
EeshanBembi Mar 10, 2026
9b7cdda
chore(deps): bump quinn-proto from 0.11.13 to 0.11.14 (#20859)
dependabot[bot] Mar 11, 2026
48199b9
Use `ParquetPushDecoder` in `ParquetOpener` (#20839)
Dandandan Mar 11, 2026
86cb815
[Minor] Remove redundant ProjectionExec nodes in sort-based plans (#2…
Dandandan Mar 11, 2026
2589fa8
doc: Add documentation for pushing limit into plan (#20271)
2010YOUY01 Mar 11, 2026
4bac1cf
impl ser/de for preserve_order in RepartitionExec (#20798)
haohuaijin Mar 11, 2026
da05287
Fix FileStream scanning_total to include sync next-file open time (#2…
RatulDawar Mar 11, 2026
95a3dfd
chore: Ignore RUSTSEC-2024-0014 (#20862)
comphead Mar 11, 2026
ed793f0
chore: clean up dependencies (#20861)
comphead Mar 11, 2026
1efcbf5
Add benchmark for struct field filter pushdown in Parquet (#20829)
friendlymatthew Mar 11, 2026
21cf60a
Add Null Type Coercions for Placeholders (#20543)
cetra3 Mar 11, 2026
d68b800
Minor: Deprecate unused `PartitionedFileStream` (#20869)
alamb Mar 11, 2026
f8fb5bd
fix: Avoid unnecessary type casts in `concat_ws` (#20436)
neilconway Mar 11, 2026
981b5c3
chore(deps): bump substrait from 0.62 to 0.63.0 (#20876)
benbellick Mar 11, 2026
129c58f
fix: Remove `!=0` check from `supports_collect_by_thresholds` (#20730)
jonathanc-n Mar 11, 2026
8d9b080
[Minor] propagate distinct_count as inexact through unions (#20846)
buraksenn Mar 12, 2026
4b022c0
fix: do not recompute hash join exec properties if not required (#20900)
askalt Mar 12, 2026
6b71523
[main] Bump to 52.3.0 and changelog (#20790) (#20849)
alamb Mar 12, 2026
385d9db
try to remove redundant alias in expression rewriter and select (#20867)
buraksenn Mar 12, 2026
8b412de
fix: Optimize `!~ '.*'` case to `col IS NULL AND Boolean(NULL)` inste…
petern48 Mar 12, 2026
57b275a
feat: correct struct column names for `arrays_zip` return type (#20886)
comphead Mar 12, 2026
fcb1c93
Fix duplicate group keys after hash aggregation spill (#20724) (#20858)
gboucher90 Mar 12, 2026
422b545
fix: Track metrics in hash joins with empty build sides (#20810)
nuno-faria Mar 13, 2026
3c56e5d
perf: Use batched row conversion for `array_has_any`, `array_has_all`…
neilconway Mar 13, 2026
b7e4213
Include .proto files in datafusion-proto-common distribution (#20921)
haohuaijin Mar 13, 2026
d2278a9
Check sqllogictests for any dangling config settings (#17914) (#20838)
cj-zhukov Mar 13, 2026
10d8bcb
Add support for ListView in unnest (#20760)
brancz Mar 13, 2026
d09ff92
feat: Reduce allocations for aggregating `Statistics` (#20768)
jonathanc-n Mar 13, 2026
2c871b2
Project only accessed struct leaves in Parquet row filter pushdown (#…
friendlymatthew Mar 13, 2026
9c3c01a
refactor: Improve `SessionContext::parse_duration` API (#20816)
erenavsarogullari Mar 14, 2026
c74976f
minor: Move PreparedAccessPlan to same module as ParquetAccessPlan (#…
alamb Mar 14, 2026
5db04b8
chore(deps): bump pyjwt from 2.11.0 to 2.12.0 (#20938)
dependabot[bot] Mar 14, 2026
6d3a846
Rewrite `SUM(expr + scalar)` --> `SUM(expr) + scalar*COUNT(expr)` (#…
alamb Mar 14, 2026
9b7d092
Add AGENTS.md / CLAUDE.md (#20939)
Dandandan Mar 14, 2026
538a201
perf: Optimize array set ops on sliced arrays (#20693)
neilconway Mar 15, 2026
1f59d32
fix: dfbench respects DATAFUSION_RUNTIME_MEMORY_LIMIT env var (#20631)
adriangb Mar 15, 2026
ab28234
Support `columns_sorted` in row_filters (#20497)
sdf-jkl Mar 15, 2026
8609288
Add --simulate-latency / SIMULATE_LATENCY option to dfbench / ./benc…
Dandandan Mar 16, 2026
b61aee7
Minor: make signatures of `SessionContext::register_*` methods consis…
alexandreyc Mar 16, 2026
3ece9ec
test: add reproducer for Dictionary InList pushdown type mismatch (#2…
erratic-pattern Mar 16, 2026
c6f7145
Extract shared `ParquetReadPlan` for leaf column resolution (#20913)
friendlymatthew Mar 16, 2026
5d37bab
chore: Remove usage of `paste` crate (#20946)
coderfender Mar 16, 2026
4166a6d
perf: Optimize comparison on nested types (#20716)
neilconway Mar 16, 2026
bd071be
feat: add `custom_string_literal_override` to unparser Dialect trait …
goldmedal Mar 16, 2026
26251bb
Use exact distinct_count from statistics if exists for `COUNT(DISTINC…
buraksenn Mar 16, 2026
a7e0941
fix(spark): return input string for PATH/FILE on schemeless URLs in `…
davidlghellin Mar 16, 2026
972b890
thin-ci (#20972)
blaginin Mar 16, 2026
fa6706a
chore(deps): bump lz4_flex from 0.12.0 to 0.12.1 (#20973)
dependabot[bot] Mar 17, 2026
0dfcd97
Replace ahash with foldhash for faster hashing in datafusion-common (…
Dandandan Mar 17, 2026
4c96125
Fix decimal log precision for non-power values (#20433)
kumarUjjawal Mar 17, 2026
9756146
chore(deps): bump Swatinem/rust-cache from 2.8.2 to 2.9.1 (#20979)
dependabot[bot] Mar 17, 2026
1e99ed5
chore(deps): bump taiki-e/install-action from 2.68.25 to 2.68.34 (#20…
dependabot[bot] Mar 17, 2026
13a39d7
chore(deps): bump github/codeql-action from 4.32.6 to 4.33.0 (#20982)
dependabot[bot] Mar 17, 2026
8a95c4c
chore(deps): bump astral-sh/setup-uv from 7.3.1 to 7.6.0 (#20981)
dependabot[bot] Mar 17, 2026
11b9693
chore(deps): bump runs-on/action from 2.0.3 to 2.1.0 (#20980)
dependabot[bot] Mar 17, 2026
84a79e1
fix: InList Dictionary filter pushdown type mismatch (#20962)
erratic-pattern Mar 17, 2026
50b6bf8
fix: Run release verification with `--profile=ci` (#20987)
alamb Mar 17, 2026
fd145c4
[Minor] Update Cargo.lock, Fix Tokio minor breaking change (#20978)
Dandandan Mar 17, 2026
8142308
chore(deps): Revert "chore(deps): bump runs-on/action from 2.0.3 to 2…
mbutrovich Mar 17, 2026
e74e58f
fix: move overflow guard before dense ratio in hash join to prevent o…
buraksenn Mar 17, 2026
6ab16cc
bug: fix `array_remove_*` with NULLS (#21013)
comphead Mar 17, 2026
cf0a182
Simplify logic for memory pressure partial emit from ordered group by…
alamb Mar 18, 2026
b7a3f53
docs: in release email, be specific about changelog location (#20975)
kevinjqliu Mar 18, 2026
a6a4df9
Fix memory reservation starvation in sort-merge (#20642)
xudong963 Mar 18, 2026
b6b542e
perf: Optimize `array_positions()` for scalar needle (#20770)
neilconway Mar 18, 2026
7e4818d
fix: improve GroupOrdering docs (#20994)
alamb Mar 18, 2026
317052e
perf: Optimize `approx_distinct()` for string, binary inputs (#21037)
neilconway Mar 18, 2026
d138c36
infra: automatically delete branch on pr merge (#21033)
kevinjqliu Mar 18, 2026
7014a45
feat: Extract NDV (distinct_count) statistics from Parquet metadata (…
asolimando Mar 19, 2026
4ae19eb
fix: update clickbench expected plan for NDV-aware optimization (#21050)
asolimando Mar 19, 2026
4010a55
Add support for nested lists in substrait consumer (#20953)
alexanderbianchi Mar 19, 2026
c792700
build: update Rust toolchain version to 1.94.0 (#21045)
dariocurr Mar 19, 2026
897b5c1
feat: support repartitioning of FFI execution plans (#20449)
timsaucer Mar 19, 2026
6ef4cef
chore: Cleanup fully-qualified ScalarFunctionArgs (#20804)
neilconway Mar 19, 2026
9885f4b
fix: `arrays_zip/list_zip` allow single array argument (#21047)
hsiang-c Mar 20, 2026
556ea9c
Sketch out a Morselize API
alamb Mar 9, 2026
e676d0d
Add shared state
alamb Mar 20, 2026
1dbd393
Start adding in IO scheduling
alamb Mar 20, 2026
7703298
builder
alamb Mar 20, 2026
6472084
Extract queues
alamb Mar 20, 2026
025923c
Fix API
alamb Mar 20, 2026
de393d7
workstealing
alamb Mar 20, 2026
2f3b919
Update tests
alamb Mar 21, 2026
89d6819
fix hot loop
alamb Mar 21, 2026
b831f48
Add temp tracing
alamb Mar 21, 2026
f671daa
Fix shared state
alamb Mar 21, 2026
1b2e4b9
hard code morsel depth
alamb Mar 24, 2026
674b8d1
Use crossbeam channel and RWLocks
alamb Mar 24, 2026
15450bf
fmt
alamb Mar 24, 2026
fd976c1
Do not worksteal when preserving partitions
alamb Mar 24, 2026
37b6648
clarify comments
alamb Mar 24, 2026
5f4f844
Spawn I/O futures for morsel planners to enable true parallel prefetch
Dandandan Mar 24, 2026
19 changes: 19 additions & 0 deletions .asf.yaml
@@ -41,6 +41,7 @@ github:
- sql
enabled_merge_buttons:
squash: true
squash_commit_message: PR_TITLE_AND_DESC
merge: false
rebase: false
features:
@@ -50,11 +51,29 @@ github:
main:
required_pull_request_reviews:
required_approving_review_count: 1
# needs to be updated as part of the release process
# .asf.yaml doesn't support wildcard branch protection rules, only exact branch names
# https://github.com/apache/infrastructure-asfyaml?tab=readme-ov-file#branch-protection
# these branches protection blocks autogenerated during release process which is described in
# https://github.com/apache/datafusion/tree/main/dev/release#2-add-a-protection-to-release-candidate-branch
branch-50:
required_pull_request_reviews:
required_approving_review_count: 1
branch-51:
required_pull_request_reviews:
required_approving_review_count: 1
branch-52:
required_pull_request_reviews:
required_approving_review_count: 1
pull_requests:
# enable updating head branches of pull requests
allow_update_branch: true
allow_auto_merge: true
# auto-delete head branches after being merged
del_branch_on_merge: true

# publishes the content of the `asf-site` branch to
# https://datafusion.apache.org/
publish:
whoami: asf-site

14 changes: 8 additions & 6 deletions .devcontainer/Dockerfile
@@ -4,10 +4,12 @@ RUN apt-get update && export DEBIAN_FRONTEND=noninteractive \
# Remove imagemagick due to https://security-tracker.debian.org/tracker/CVE-2019-10131
&& apt-get purge -y imagemagick imagemagick-6-common

# Add protoc
# https://datafusion.apache.org/contributor-guide/getting_started.html#protoc-installation
RUN curl -LO https://github.com/protocolbuffers/protobuf/releases/download/v25.1/protoc-25.1-linux-x86_64.zip \
&& unzip protoc-25.1-linux-x86_64.zip -d $HOME/.local \
&& rm protoc-25.1-linux-x86_64.zip
# setup the containers WORKDIR so npm install works
# https://stackoverflow.com/questions/57534295/npm-err-tracker-idealtree-already-exists-while-creating-the-docker-image-for
WORKDIR /root

ENV PATH="$PATH:$HOME/.local/bin"
# Add protoc, npm, prettier
# https://datafusion.apache.org/contributor-guide/development_environment.html#protoc-installation
RUN apt-get update \
&& apt-get install -y --no-install-recommends protobuf-compiler libprotobuf-dev npm nodejs\
&& rm -rf /var/lib/apt/lists/*
1 change: 1 addition & 0 deletions .github/ISSUE_TEMPLATE/bug_report.yml
@@ -1,5 +1,6 @@
name: Bug report
description: Create a report to help us improve
type: Bug
labels: bug
body:
- type: textarea
1 change: 1 addition & 0 deletions .github/ISSUE_TEMPLATE/feature_request.yml
@@ -1,5 +1,6 @@
name: Feature request
description: Suggest an idea for this project
type: Feature
labels: enhancement
body:
- type: textarea
14 changes: 14 additions & 0 deletions .github/actions/setup-builder/action.yaml
@@ -46,3 +46,17 @@ runs:
# https://github.com/actions/checkout/issues/766
shell: bash
run: git config --global --add safe.directory "$GITHUB_WORKSPACE"
- name: Remove unnecessary preinstalled software
shell: bash
run: |
echo "Disk space before cleanup:"
df -h
apt-get clean
# remove tool cache: about 8.5GB (github has host /opt/hostedtoolcache mounted as /__t)
rm -rf /__t/* || true
# remove Haskell runtime: about 6.3GB (host /usr/local/.ghcup)
rm -rf /host/usr/local/.ghcup || true
# remove Android library: about 7.8GB (host /usr/local/lib/android)
rm -rf /host/usr/local/lib/android || true
echo "Disk space after cleanup:"
df -h
4 changes: 3 additions & 1 deletion .github/actions/setup-macos-aarch64-builder/action.yaml
@@ -44,6 +44,8 @@ runs:
rustup default stable
rustup component add rustfmt
- name: Setup rust cache
uses: Swatinem/rust-cache@v2
uses: Swatinem/rust-cache@f13886b937689c021905a6b90929199931d60db1 # v2.8.1
with:
save-if: ${{ github.ref_name == 'main' }}
- name: Configure rust runtime env
uses: ./.github/actions/setup-rust-runtime
47 changes: 0 additions & 47 deletions .github/actions/setup-macos-builder/action.yaml

This file was deleted.

9 changes: 0 additions & 9 deletions .github/actions/setup-rust-runtime/action.yaml
@@ -20,10 +20,6 @@ description: 'Setup Rust Runtime Environment'
runs:
using: "composite"
steps:
# https://github.com/apache/datafusion/issues/15535
# disabled because neither version nor git hash works with apache github policy
#- name: Run sccache-cache
# uses: mozilla-actions/sccache-action@65101d47ea8028ed0c98a1cdea8dd9182e9b5133 # v0.0.8
- name: Configure runtime env
shell: bash
# do not produce debug symbols to keep memory usage down
@@ -32,11 +28,6 @@ runs:
#
# Set debuginfo=line-tables-only as debuginfo=0 causes immensely slow build
# See for more details: https://github.com/rust-lang/rust/issues/119560
#
# readd the following to the run below once sccache-cache is re-enabled
# echo "RUSTC_WRAPPER=sccache" >> $GITHUB_ENV
# echo "SCCACHE_GHA_ENABLED=true" >> $GITHUB_ENV
run: |
echo "RUST_BACKTRACE=1" >> $GITHUB_ENV
echo "RUSTFLAGS=-C debuginfo=line-tables-only -C incremental=false" >> $GITHUB_ENV

27 changes: 25 additions & 2 deletions .github/dependabot.yml
@@ -20,9 +20,10 @@ updates:
- package-ecosystem: cargo
directory: "/"
schedule:
interval: daily
interval: weekly
target-branch: main
labels: [auto-dependencies]
open-pull-requests-limit: 15
ignore:
# major version bumps of arrow* and parquet are handled manually
- dependency-name: "arrow*"
@@ -44,9 +45,31 @@ updates:
patterns:
- "prost*"
- "pbjson*"

# Catch-all: group only minor/patch into a single PR,
# excluding deps we want always separate (and excluding arrow/parquet which have their own group)
all-other-cargo-deps:
applies-to: version-updates
patterns:
- "*"
exclude-patterns:
- "arrow*"
- "parquet"
- "object_store"
- "sqlparser"
- "prost*"
- "pbjson*"
update-types:
- "minor"
- "patch"
- package-ecosystem: "github-actions"
directory: "/"
schedule:
interval: "daily"
interval: "weekly"
open-pull-requests-limit: 10
labels: [auto-dependencies]
- package-ecosystem: "pip"
directory: "/docs"
schedule:
interval: "weekly"
labels: [auto-dependencies]
18 changes: 11 additions & 7 deletions .github/workflows/audit.yml
@@ -23,25 +23,29 @@ concurrency:

on:
push:
branches:
- main
paths:
- "**/Cargo.toml"
- "**/Cargo.lock"
branches:
- main

pull_request:
paths:
- "**/Cargo.toml"
- "**/Cargo.lock"

merge_group:

jobs:
security_audit:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
- name: Install cargo-audit
run: cargo install cargo-audit
uses: taiki-e/install-action@de6bbd1333b8f331563d54a051e542c7dfef81c3 # v2.68.34
with:
tool: cargo-audit
- name: Run audit check
# Ignored until https://github.com/apache/datafusion/issues/15571
# ignored py03 warning until arrow 55 upgrade
run: cargo audit --ignore RUSTSEC-2024-0370 --ignore RUSTSEC-2025-0020
# Note: you can ignore specific RUSTSEC issues using the `--ignore` flag ,for example:
# run: cargo audit --ignore RUSTSEC-2026-0001
run: cargo audit --ignore RUSTSEC-2024-0014
55 changes: 55 additions & 0 deletions .github/workflows/codeql.yml
@@ -0,0 +1,55 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
#

name: "CodeQL"

on:
push:
branches: [ "main" ]
pull_request:
branches: [ "main" ]
schedule:
- cron: '16 4 * * 1'

permissions:
contents: read

jobs:
analyze:
name: Analyze Actions
runs-on: ubuntu-latest
permissions:
contents: read
security-events: write
packages: read

steps:
- name: Checkout repository
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
with:
persist-credentials: false

- name: Initialize CodeQL
uses: github/codeql-action/init@b1bff81932f5cdfc8695c7752dcee935dcd061c8 # v4
with:
languages: actions

- name: Perform CodeQL Analysis
uses: github/codeql-action/analyze@b1bff81932f5cdfc8695c7752dcee935dcd061c8 # v4
with:
category: "/language:actions"
16 changes: 15 additions & 1 deletion .github/workflows/dependencies.yml
@@ -23,13 +23,16 @@ concurrency:

on:
push:
branches-ignore:
- 'gh-readonly-queue/**'
paths:
- "**/Cargo.toml"
- "**/Cargo.lock"
pull_request:
paths:
- "**/Cargo.toml"
- "**/Cargo.lock"
merge_group:
# manual trigger
# https://docs.github.com/en/actions/managing-workflow-runs/manually-running-a-workflow
workflow_dispatch:
@@ -41,7 +44,7 @@ jobs:
container:
image: amd64/rust
steps:
- uses: actions/checkout@v4
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
submodules: true
fetch-depth: 1
@@ -53,3 +56,14 @@ jobs:
run: |
cd dev/depcheck
cargo run

detect-unused-dependencies:
runs-on: ubuntu-latest
container:
image: amd64/rust
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
- name: Install cargo-machete
run: cargo install cargo-machete --version ^0.9 --locked
- name: Detect unused dependencies
run: cargo machete --with-metadata