Skip to content

Unparser drops ORDER BY alias when flattening Projection through SubqueryAlias#21491

Merged
Jefffrey merged 12 commits intoapache:mainfrom
yonatan-sevenai:bug/sort_push
Apr 23, 2026
Merged

Unparser drops ORDER BY alias when flattening Projection through SubqueryAlias#21491
Jefffrey merged 12 commits intoapache:mainfrom
yonatan-sevenai:bug/sort_push

Conversation

@yonatan-sevenai
Copy link
Copy Markdown
Contributor

Which issue does this PR close?

Rationale for this change

When plan_to_sql encounters a Projection → Sort → Projection → SubqueryAlias plan where the outer Projection excludes a Sort
column defined as an alias in the inner Projection (e.g. Z AS c), the function rewrite_plan_for_sort_on_non_projected_fields
flattens the two Projections into one but only keeps the outer Projection's columns. This drops the alias definition, leaving
ORDER BY c referencing a column that no longer exists in the generated SQL.

What changes are included in this PR?

In rewrite_plan_for_sort_on_non_projected_fields (datafusion/sql/src/unparser/rewrite.rs), when the inner Projection is
trimmed to only the outer Projection's expressions, sort expressions that reference dropped aliases are now inlined to the
underlying physical expression. For example, ORDER BY c becomes ORDER BY t."Z" when c was defined as Z AS c in the
dropped inner Projection.

Are these changes tested?

Yes. A new regression test test_sort_on_aliased_column_dropped_by_outer_projection in
datafusion/sql/tests/cases/plan_to_sql.rs constructs the exact plan shape that triggers the bug and asserts the correct SQL
output:

SELECT t."X" AS a, t."Y" AS b FROM phys_table AS t ORDER BY t."Z" DESC NULLS FIRST LIMIT 1

All existing plan_to_sql tests (118 tests) continue to pass.

Are there any user-facing changes?

No API changes. Previously invalid SQL is now generated correctly.

yonatan-sevenai and others added 9 commits March 22, 2026 00:06
…gregate

When the SQL unparser encountered a SubqueryAlias node whose direct
child was an Aggregate (or other clause-building plan like Window, Sort,
Limit, Union), it would flatten the subquery into a simple table alias,
losing the aggregate entirely.

For example, a plan representing:
  SELECT j1.col FROM j1 JOIN (SELECT max(id) AS m FROM j2) AS b ON j1.id = b.m

would unparse to:
  SELECT j1.col FROM j1 INNER JOIN j2 AS b ON j1.id = b.m

dropping the MAX aggregate and the subquery.

Root cause: the SubqueryAlias handler in select_to_sql_recursively would
call subquery_alias_inner_query_and_columns (which only unwraps
Projection children) and unparse_table_scan_pushdown (which only handles
TableScan/SubqueryAlias/Projection). When both returned nothing useful
for an Aggregate child, the code recursed directly into the Aggregate,
merging its GROUP BY into the outer SELECT instead of wrapping it in a
derived subquery.

The fix adds an early check: if the SubqueryAlias's direct child is a
plan type that builds its own SELECT clauses (Aggregate, Window, Sort,
Limit, Union), emit it as a derived subquery via self.derive() with the
alias always attached, rather than falling through to the recursive
path that would flatten it.
…gregate

When the SQL unparser encountered a SubqueryAlias node whose direct
child was an Aggregate (or other clause-building plan like Window, Sort,
Limit, Union), it would flatten the subquery into a simple table alias,
losing the aggregate entirely.

For example, a plan representing:
  SELECT j1.col FROM j1 JOIN (SELECT max(id) AS m FROM j2) AS b ON j1.id = b.m

would unparse to:
  SELECT j1.col FROM j1 INNER JOIN j2 AS b ON j1.id = b.m

dropping the MAX aggregate and the subquery.

Root cause: the SubqueryAlias handler in select_to_sql_recursively would
call subquery_alias_inner_query_and_columns (which only unwraps
Projection children) and unparse_table_scan_pushdown (which only handles
TableScan/SubqueryAlias/Projection). When both returned nothing useful
for an Aggregate child, the code recursed directly into the Aggregate,
merging its GROUP BY into the outer SELECT instead of wrapping it in a
derived subquery.

The fix adds an early check: if the SubqueryAlias's direct child is a
plan type that builds its own SELECT clauses (Aggregate, Window, Sort,
Limit, Union), emit it as a derived subquery via self.derive() with the
alias always attached, rather than falling through to the recursive
path that would flatten it.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…rojection

In `rewrite_plan_for_sort_on_non_projected_fields`, when the outer
Projection excludes a Sort column that was defined as an alias in the
inner Projection (e.g. `Z AS c`), the rewrite replaced the inner
Projection's expressions with only the outer Projection's mapped
expressions, dropping the alias definition. This left ORDER BY
referencing a non-existent column.

The fix inlines the underlying physical expression into the Sort when
an alias is dropped (e.g. `ORDER BY c` → `ORDER BY t."Z"`).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Clarify what the algorithm does and why, remove example-specific
column names from the implementation comments.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@github-actions github-actions Bot added the sql SQL Planner label Apr 9, 2026
Copy link
Copy Markdown
Contributor

@Jefffrey Jefffrey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think once CI is green this should be good to go

@yonatan-sevenai
Copy link
Copy Markdown
Contributor Author

I think once CI is green this should be good to go

Thanks @Jefffrey .
CI is green. Merged from main and ensured no conflicts and all tests pass.

Would appreciate a merge.

@Jefffrey Jefffrey added this pull request to the merge queue Apr 23, 2026
@Jefffrey
Copy link
Copy Markdown
Contributor

Thanks @yonatan-sevenai

Merged via the queue into apache:main with commit 95157ef Apr 23, 2026
35 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

sql SQL Planner

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Bug: Unparser drops column alias needed by ORDER BY when flattening Projection through SubqueryAlias

2 participants