fix(debuginfo): correct language detection for LTO-compiled binaries by nsavoire · Pull Request #7 · DataDog/symbolic

nsavoire · 2026-02-24T12:17:23Z

Problem

LTO can produce compilation units (CUs) whose DW_AT_language does not reflect the true source language of the functions they contain. Two cases arise:

- Artificial LTO CUs: for example, a CU tagged as C++ that contains C functions. A top-level subprogram in such a CU carries a cross-unit DW_AT_abstract_origin pointing to the real (partial) CU.
- Cross-language LTO inlinees: for example, a C function inlined into a Rust caller. The inlinee's DW_AT_abstract_origin references the C CU directly.

Implementation

`UnitRef::resolve_entry_language(entry, depth)`

New helper on UnitRef that follows DW_AT_abstract_origin chains recursively to find the authoritative CU language:

- Recurses into the referenced entry first to handle multi-level chains.
- If no deeper reference yields a language, falls back to the referenced CU's own DW_AT_language for cross-unit references.
- Limits recursion to MAX_ABSTRACT_ORIGIN_DEPTH = 16 levels (matching the limit used by elfutils dwarf_attr_integrate) to guard against cycles or malformed DWARF.
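
The three rules above can be sketched on a toy in-memory model. This is not the code from this PR: `ToyDwarf`, `EntryId`, and the local `Language` enum are hypothetical stand-ins for the real DWARF reader types, and only the recursion order, the cross-unit fallback, and the depth limit are taken from the description.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the crate's language enum.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Language {
    C,
    Cpp,
    Rust,
}

/// A DIE identified by (unit index, entry index); illustrative only.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct EntryId {
    pub unit: usize,
    pub entry: usize,
}

/// Toy model of the debug info: one optional DW_AT_language per unit,
/// plus the DW_AT_abstract_origin edges between entries.
pub struct ToyDwarf {
    pub unit_language: Vec<Option<Language>>,
    pub abstract_origin: HashMap<EntryId, EntryId>,
}

/// Same limit as quoted in the description.
pub const MAX_ABSTRACT_ORIGIN_DEPTH: usize = 16;

impl ToyDwarf {
    /// Follows abstract-origin edges starting at `entry` and returns the
    /// language of the deepest cross-unit reference, if there is one.
    pub fn resolve_entry_language(&self, entry: EntryId, depth: usize) -> Option<Language> {
        // Guard against cycles and malformed chains.
        if depth >= MAX_ABSTRACT_ORIGIN_DEPTH {
            return None;
        }
        // No abstract origin on this entry: nothing to resolve.
        let origin = *self.abstract_origin.get(&entry)?;
        // Recurse first so that multi-level chains win over the nearest unit.
        if let Some(language) = self.resolve_entry_language(origin, depth + 1) {
            return Some(language);
        }
        // Only a cross-unit reference may contribute the referenced unit's language.
        if origin.unit != entry.unit {
            return self.unit_language.get(origin.unit).copied().flatten();
        }
        None
    }
}
```

In this model, an entry in a C++-tagged unit whose origin lives in a C-tagged unit resolves to C, an entry without an abstract origin resolves to nothing, and a same-unit cycle terminates at the depth limit with no language.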

`DwarfUnit::resolve_function_language(entry, fallback_language)`

Thin wrapper around resolve_entry_language on the unit's UnitRef, falling back to fallback_language when no cross-unit language is found (e.g. when DW_AT_abstract_origin points to a partial unit with no DW_AT_language).
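
A minimal sketch of the wrapper's fallback behaviour, assuming the resolver returns an `Option`. The `ResolveEntryLanguage` trait, the `FixedResolver` type, and the `usize` entry handle are hypothetical and exist only to keep the example self-contained; the real method lives on `DwarfUnit` and takes a DWARF entry.

```rust
/// Hypothetical stand-in for the crate's language enum.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Language {
    C,
    Cpp,
    Rust,
}

/// Hypothetical abstraction over "something that can follow abstract origins".
pub trait ResolveEntryLanguage {
    fn resolve_entry_language(&self, entry: usize, depth: usize) -> Option<Language>;
}

/// Toy resolver that always gives the same answer, standing in for a unit reference.
pub struct FixedResolver(pub Option<Language>);

impl ResolveEntryLanguage for FixedResolver {
    fn resolve_entry_language(&self, _entry: usize, _depth: usize) -> Option<Language> {
        self.0
    }
}

/// Starts the resolution at depth 0 and falls back to `fallback_language`
/// when no cross-unit language was found.
pub fn resolve_function_language<R: ResolveEntryLanguage>(
    unit_ref: &R,
    entry: usize,
    fallback_language: Language,
) -> Language {
    unit_ref
        .resolve_entry_language(entry, 0)
        .unwrap_or(fallback_language)
}
```

With a resolver that finds C, the fallback is ignored; with a resolver that finds nothing (the partial-unit case mentioned above), the caller's fallback language is returned unchanged.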

`parse_function` / `parse_inlinee`

parse_function calls resolve_function_language to resolve the authoritative language for the top-level subprogram and passes it down through parse_function_children and parse_inlinee. parse_inlinee calls resolve_function_language again on its own entry so that cross-language inlinees (e.g. C inlined into Rust) override the enclosing function's language correctly.
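
The propagation described above can be illustrated with a toy function tree. Everything here is a sketch under assumptions: `Node` and its `origin_language` field are hypothetical and model "what following DW_AT_abstract_origin would yield", and the choice to hand a nested inlinee the language of its enclosing inlinee is an assumption, not something the description states.

```rust
/// Hypothetical stand-in for the crate's language enum.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Language {
    C,
    Cpp,
    Rust,
}

/// Toy subprogram or inlinee. `origin_language` models the language found by
/// following the entry's abstract origin across a unit boundary, if any.
pub struct Node {
    pub name: &'static str,
    pub origin_language: Option<Language>,
    pub inlinees: Vec<Node>,
}

/// Collected (function name, detected language) pairs.
pub type Resolved = Vec<(&'static str, Language)>;

/// Per-entry resolution with a fallback, as in the wrapper described earlier.
pub fn resolve_function_language(node: &Node, fallback_language: Language) -> Language {
    node.origin_language.unwrap_or(fallback_language)
}

/// Resolves the top-level function against the unit language, then passes
/// the result down to its inlinees.
pub fn parse_function(node: &Node, unit_language: Language, out: &mut Resolved) {
    let language = resolve_function_language(node, unit_language);
    out.push((node.name, language));
    parse_function_children(node, language, out);
}

/// Visits every inlinee with the enclosing function's language as fallback.
pub fn parse_function_children(node: &Node, language: Language, out: &mut Resolved) {
    for child in &node.inlinees {
        parse_inlinee(child, language, out);
    }
}

/// Resolves again on the inlinee's own entry so that a cross-language
/// inlinee overrides the language inherited from its caller.
pub fn parse_inlinee(node: &Node, enclosing_language: Language, out: &mut Resolved) {
    let language = resolve_function_language(node, enclosing_language);
    out.push((node.name, language));
    // Assumption: nested inlinees inherit from the inlinee they are nested in.
    parse_function_children(node, language, out);
}
```

For a Rust function with two inlinees, one whose origin resolves to C and one with no cross-unit origin, this yields Rust for the caller, C for the first inlinee, and Rust for the second.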

Tests

Two regression tests added with real binary fixtures:

- test_lto_language_detection (libjemalloc.so.debug): verifies that je_tcache_arena_associate and malloc_mutex_trylock_final (both C functions in a library compiled with LTO) are detected as Language::C, not Language::Cpp.
- test_cross_language_lto_inlinee_language (cross_language_lto.debug): verifies that my_add (a C function inlined into a Rust binary via cross-language LTO) is detected as Language::C, not Language::Rust.

nsavoire · 2026-02-26T08:33:20Z

@codex review

chatgpt-codex-connector · 2026-02-26T08:39:47Z

Codex Review: Didn't find any major issues. Bravo.

buranmert

i need to do quite a bit of research in order to properly review this, i'll go with approving-blindly 😅

do you expect more changes in symbolic in the future?
do you think we should automate symbolic upgrades in our related services? or manual upgrades are still okay?

nsavoire · 2026-03-03T10:58:42Z

> i need to do quite a bit of research in order to properly review this, i'll go with approving-blindly 😅
>
> do you expect more changes in symbolic in the future? do you think we should automate symbolic upgrades in our related services? or manual upgrades are still okay?

This is not an easy PR to review 😄
I don't expect more changes, so I think we can stick with manual updates for now.
Currently dd_master is in bad shape and CI does not pass. I plan to first update dd_master to the latest symbolic release (12.17.2) and then merge my PR on top of it.

LTO can produce compilation units whose DW_AT_language does not reflect the true source language of the functions they contain. Two cases arise:

1. Artificial LTO CUs (e.g. artificial CU with C++ language tag that contains C functions): a top-level subprogram in such a CU carries a cross-unit DW_AT_abstract_origin pointing to the real CU. We now follow that reference in resolve_function_language to pick up the origin CU's language, which is then used for the symbol-table name, DWARF name, and fallback name of the function.
2. Cross-language LTO inlinees (e.g. a C function inlined into Rust): the inlinee's DW_AT_abstract_origin references the C CU directly. resolve_function_name now reads the referenced CU's language via UnitRef::language() whenever it follows an abstract_origin across a unit boundary, overriding the language supplied by the caller.

To propagate the correctly-resolved language to all inlinees of a top-level subprogram, parse_function passes it down through parse_function_children and parse_inlinee.

Same-unit abstract_origin references (LTO partial units without a further cross-unit link) keep the enclosing function's language as a fallback, which is correct for the common case where all code in an LTO CU shares the same language.

nsavoire force-pushed the nsavoire/lto_language branch 4 times, most recently from d46e13c to 4769ee7 Compare February 25, 2026 10:17

nsavoire changed the base branch from upstream_master to dd_master February 25, 2026 10:18

nsavoire force-pushed the nsavoire/lto_language branch 3 times, most recently from bbab72f to d443c7d Compare February 25, 2026 12:34

nsavoire changed the title ~~Attempt to get language from DW_AT_abstract_origin~~ fix(debuginfo): correct language detection for LTO-compiled binaries Feb 25, 2026

nsavoire marked this pull request as ready for review February 25, 2026 12:45

nsavoire requested review from a team and buranmert February 25, 2026 12:45

nsavoire force-pushed the nsavoire/lto_language branch from d443c7d to 52185c9 Compare February 25, 2026 12:48

Gandem approved these changes Feb 25, 2026

Comment thread symbolic-debuginfo/src/dwarf.rs Outdated

Comment thread symbolic-debuginfo/src/dwarf.rs

Comment thread symbolic-debuginfo/src/dwarf.rs Outdated

Comment thread symbolic-debuginfo/src/dwarf.rs

Gandem reviewed Feb 25, 2026

Comment thread symbolic-debuginfo/src/dwarf.rs

Gandem approved these changes Feb 26, 2026

Comment thread symbolic-debuginfo/src/dwarf.rs Outdated

buranmert approved these changes Mar 2, 2026

nsavoire force-pushed the nsavoire/lto_language branch from cc480ea to 173a4d2 Compare March 3, 2026 14:43

theomagellan approved these changes Mar 4, 2026

Comment thread symbolic-debuginfo/src/dwarf.rs Outdated

nsavoire added 2 commits March 18, 2026 18:27

Improve readability

3aabe8e

Improve test consistency

ba7c464

nsavoire merged commit 3cf2422 into dd_master Mar 18, 2026
10 of 11 checks passed

nsavoire deleted the nsavoire/lto_language branch March 18, 2026 17:53

nsavoire merged 3 commits into dd_master from nsavoire/lto_language

4 participants
