Skip to content

[ty] Avoid expression_type calls for syntax error targets in unpacking assignment#24663

Open
charliermarsh wants to merge 3 commits intomainfrom
charlie/unknown
Open

[ty] Avoid expression_type calls for syntax error targets in unpacking assignment#24663
charliermarsh wants to merge 3 commits intomainfrom
charlie/unknown

Conversation

@charliermarsh
Copy link
Copy Markdown
Member

Summary

We parse something, not = (1, 2) as (on the LHS) a name target (something) and a unary target (not) followed by an empty name. As a result, we never visit the not or its name, which means we never infer or record a type for that malformed subtree in UnpackResult.

Later, an expression_type(...) lookup for any subexpression can miss and panic.

Closes astral-sh/ty#3283.

@charliermarsh charliermarsh added bug Something isn't working ty Multi-file analysis & type inference labels Apr 15, 2026
@astral-sh-bot
Copy link
Copy Markdown

astral-sh-bot Bot commented Apr 15, 2026

Typing conformance results

No changes detected ✅

Current numbers
The percentage of diagnostics emitted that were expected errors held steady at 87.94%. The percentage of expected errors that received a diagnostic held steady at 83.36%. The number of fully passing files held steady at 79/133.

@astral-sh-bot
Copy link
Copy Markdown

astral-sh-bot Bot commented Apr 15, 2026

Memory usage report

Memory usage unchanged ✅

@astral-sh-bot
Copy link
Copy Markdown

astral-sh-bot Bot commented Apr 15, 2026

ecosystem-analyzer results

No diagnostic changes detected ✅

Full report with detailed diff (timing results)

@AlexWaygood
Copy link
Copy Markdown
Member

Hmm, where's the dodgy expression_type(...) call? Should we maybe just be using a non-panicking variant like try_expression_type(...).unwrap_or_else(Type::unknown) there instead?

@charliermarsh
Copy link
Copy Markdown
Member Author

Is that preferred? I guess I would've thought recording an expression was preferred :)

@charliermarsh
Copy link
Copy Markdown
Member Author

Regardless I will track it down now.

@AlexWaygood
Copy link
Copy Markdown
Member

We tried for a while to maintain the invariant that every node in the AST should have a type stored for it. But we gave up on it a while back because it proved nearly impossible to maintain it.

@AlexWaygood
Copy link
Copy Markdown
Member

And yeah, preferring non-panicking APIs to fix server crashes is the approach we took in #18455, for example

@charliermarsh
Copy link
Copy Markdown
Member Author

So should nothing be using expression_type?

@charliermarsh charliermarsh changed the title [ty] Visit syntax error targets in unpacking assignment [ty] Avoid expression_type calls for syntax error targets in unpacking assignment Apr 16, 2026
@carljm
Copy link
Copy Markdown
Contributor

carljm commented Apr 16, 2026

Not sure without looking at all uses, but it seems unlikely we have many cases where we can be sure that an expression already has a type.

Glancing at this PR, it seems like it might be worth having a .expression_type_or_unknown() method.

@carljm carljm removed their request for review April 16, 2026 04:48
@charliermarsh
Copy link
Copy Markdown
Member Author

Added.

@AlexWaygood
Copy link
Copy Markdown
Member

I still feel like I'm missing a full understanding of the problem here. Why does this cause ty to crash in the server context but not on the CLI? What server API is it that's leading to us calling infer_expression(...) here even though that call isn't required when type-checking the file?

@charliermarsh
Copy link
Copy Markdown
Member Author

(It's related to completions -- I'll add a more complete explanation to the summary of the exact call stack.)

@charliermarsh charliermarsh marked this pull request as draft April 17, 2026 00:06
@charliermarsh
Copy link
Copy Markdown
Member Author

charliermarsh commented Apr 17, 2026

Okay, so...

Recall that for something, not = (1, 2), the left-hand side has one normal Name ("something"), and one UnaryOp with an empty Name ("") inside.

Semantic indexing visits all the sub-expressions on the left-hand side of that tuple, and creates an AssignmentDefinitionKind for the empty inner name with TargetKind::Sequence(unpack).

Then, when aggregating completions in the LSP, we call all_reachable_members, which calls binding_type (and then infer_definition_types) on every live binding, including that empty name. Specifically, unpacked.expression_type(target) ends up as unpacked.expression_type(Name("")).

But... the UnpackResult never visited the empty name. The Unpacker walks the tuple in unpack_inner, and for UnaryOp(Not, Name("")), hits the _ => {} arm, and never records a Type for the empty name.

So it's ultimately a discrepancy between the symbols populated during semantic indexing (which includes the empty name) and the inferences created when we unpack the tuple (which doesn't).

This doesn't affect type-checking because we never call infer_definition on the Name("") node.

@charliermarsh charliermarsh marked this pull request as ready for review April 17, 2026 01:22
@AlexWaygood
Copy link
Copy Markdown
Member

What do you think of #24697 as an alternative fix, that tackles the problem via improve our parser error recovery for invalid syntax cases? (All Claude generated -- basically just created by me repeatedly prompting it to look at the fundamental cause of why the semantic index and the type inferencer were walking the AST in different ways.)

@AlexWaygood
Copy link
Copy Markdown
Member

Here's a comparative review of the two PRs FWIW, also by Claude (but with a new context)

Details

Comparative Review: PR #24663 vs PR #24697

Both PRs fix the same LSP panic from astral-sh/ty#3283: typing a, not = (1, 2) (or a, not x = ...) crashes ty because the parser's error-recovery AST leaves a phantom Store-context Name inside a UnaryOp, which semantic indexing turns into a binding, but the unpacker never records a type for it — so later binding_type(...) → expression_type(...) panics.

The shared root cause

ruff_python_parser::helpers::set_expr_ctx recursively propagates the target ctx (Store/Del) down through every container it understands. Its Expr::UnaryOp arm propagates into the operand — but UnaryOp is never a valid assignment/deletion target in Python, so that propagation is a lie. Semantic indexing trusts the lie and records a binding; the unpacker ignores UnaryOp in its _ => arm and records no type. Asymmetry → panic.

What each PR actually changes (relative to its merge base)

#24663 (Charlie) — downstream, defensive fix

  • ty_python_semantic/src/types/unpacker.rs: removes UnpackResult::expression_type (the panicking variant); adds expression_type_or_unknown that returns Type::unknown() on miss. try_expression_type already existed.
  • ty_python_semantic/src/types/infer/builder.rs + class/static_literal.rs: updates ~8 call sites to use the new method.
  • cycle_normalized: keeps an .expect("both UnpackResults derive from the same Unpack") for the internal cycle-recovery invariant.
  • 2 LSP e2e tests (diagnostics only). Snapshots show the same parser-produced diagnostics as before.

#24697 (Alex) — root-cause, parser fix

  • ruff_python_parser/src/parser/helpers.rs: set_expr_ctx now marks UnaryOp's operand as ExprContext::Invalid instead of propagating new_ctx. 5-line change.
  • Updates 3 existing invalid_syntax@* parser snapshots (operand ctx flips from Store to Invalid).
  • 3 LSP e2e tests: the same two diagnostic tests, plus a completion request that forces all_reachable_members → binding_type, i.e., the exact path the original panic took.

How the approaches compare

Correctness. #24697 removes the lie — ExprContext::Invalid already has explicit arms in builder.rs (l.8087, l.8430), subscript.rs, type_expression.rs, and in the linter's checkers/ast/mod.rs:1729, all returning Type::unknown() / no-op. The fix slots cleanly into a pattern that already exists. #24663 leaves the ctx-lie in place and teaches one caller (the unpacker) to tolerate missing type entries, but the underlying semantic-index phantom binding is still created — x in a, not x = (1, 2) will still be recorded as a live Unknown-typed local, which is visible to completions and other consumers that don't go through expression_type.

Test strength. Both share two diagnostic tests. #24697's third test (completions_do_not_panic_on_invalid_not_target) directly exercises binding_type, which is the call path that actually panicked in the reported LSP issue. #24663's tests only cover publish_diagnostics and in principle could pass while a completion-triggered panic still occurred. (In practice its fix likely covers both, but the test surface is weaker.)

Blast radius. #24697 touches a shared parser helper. The three updated snapshots are all invalid_syntax@* fixtures, which is reassuring — no valid-syntax behavior changes. The linter's ast/mod.rs already handles ExprContext::Invalid. Still, anyone else calling set_expr_ctx and expecting Store on a UnaryOp's operand would see a behavior change; worth a once-over in the formatter. #24663 is strictly confined to ty's semantic layer.

API hygiene. #24663 removes the #[track_caller] expression_type(...) panic method, which is a nice side win. But it does replace "panic on missing" with "silently return Unknown," and the API now forces every caller to make that choice. #24697 lets the invariant (ctx matches what the AST really means) hold, so consumers don't have to defensively special-case.

Residual user-visible effect. With #24663, after typing a, not x = (1, 2) the name x appears as a binding of type Unknown in completions and hover — because semantic indexing still thinks x is stored. With #24697, no binding is created, so x doesn't pollute scope at all. Not a correctness bug either way, but #24697 matches user intent better.

Commit message / framing. #24697's commit message is explicit about the root cause and the alignment with the existing semantic-indexing (Invalid, _) arm — useful for future readers. #24663's description is brief but accurate.

Minor observations

Recommendation

#24697 is the better fix. It's smaller, addresses the root cause, aligns two walkers that had drifted apart, fits the existing ExprContext::Invalid pattern used elsewhere in ty, and has a more targeted regression test. #24663 is a reasonable defensive cleanup in isolation — and the expression_type_or_unknown rename is a genuine improvement — but as a fix it papers over a parser bug without removing the underlying inconsistency.

A reasonable combined path: land #24697 as the fix, and optionally keep #24663's API rename (drop the call-site changes and the "malformed targets" comment, since they'd no longer be the motivation). But if picking one, take #24697.

@charliermarsh
Copy link
Copy Markdown
Member Author

That feels closer to my first fix (make sure we infer types for the names we collect), but from the other direction (make sure we don't collect names that we won't infer types for).

With that change, given x, not y = ..., I think we wouldn't offer y as a completion suggestion downstream (unlike in this PR), which seems a bit worse IMO?

@charliermarsh
Copy link
Copy Markdown
Member Author

I guess my feeling on reflection is that my initial fix feels the most correct to me :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working ty Multi-file analysis & type inference

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Lsp error/crash when doing single-line multiple-variable assignment to a variable starting with not

4 participants