fix scalefactor delta overflows to comply with AAC spec (ISO 14496-3) #99
Open
nschimme wants to merge 2 commits into knik0:master
This commit addresses two bugs in BlocQuant()'s scale factor range pass that allowed scalefactor differences to exceed the strict ±60 limit required by the AAC specification:

1. Fix PNS delta predictor: PNS scale factors share the HCB_PNS codebook and require a separate delta predictor. Without it, the first PNS band's delta was computed relative to the regular global_gain, producing out-of-bounds deltas. A dedicated `lastpns` predictor is now initialized 90 steps below `global_gain` so the first PNS entry fits comfortably within the ±60 constraint.
2. Enforce limits on all active bands: The quantizer previously clamped deltas only for HCB_ESC bands. The condition is now `(book != HCB_ZERO) && (book != HCB_NONE)`, so the ±60 limit is enforced for every active scalefactor band, regardless of the Huffman codebook.

Additionally, this refactors the clamping logic into a centralized `clamp_sf_diff()` inline function and replaces hardcoded scalefactor magic numbers with named constants in `huff2.h`.
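For readers who want to see the shape of the two fixes, here is a minimal sketch rather than the code from this PR. The signature of `clamp_sf_diff()`, the constant names `SF_DELTA_MAX` and `PNS_SF_OFFSET`, the codebook enum values, and the helper `limit_sf_deltas()` are all assumptions made for illustration; only the ±60 limit, the 90-step PNS offset, and the `(book != HCB_ZERO) && (book != HCB_NONE)` condition come from the description above.

```c
#include <assert.h>

/* Hypothetical names: the PR adds named constants to huff2.h,
 * but their actual names are not shown in this conversation. */
#define SF_DELTA_MAX   60   /* largest |delta| allowed between scalefactors */
#define PNS_SF_OFFSET  90   /* PNS predictor starts this far below global_gain */

/* Illustrative codebook ids; the numeric values are placeholders. */
enum { HCB_ZERO = 0, HCB_ESC = 11, HCB_PNS = 13, HCB_NONE = 16 };

/* Assumed signature: return sf moved just far enough that
 * (sf - prev) lies in [-SF_DELTA_MAX, +SF_DELTA_MAX]. */
static inline int clamp_sf_diff(int sf, int prev)
{
    int diff = sf - prev;
    if (diff > SF_DELTA_MAX)
        return prev + SF_DELTA_MAX;
    if (diff < -SF_DELTA_MAX)
        return prev - SF_DELTA_MAX;
    return sf;
}

/* Walk the bands with two independent predictors: one for regular
 * scalefactors and one for PNS noise energies. */
static void limit_sf_deltas(int *sf, const int *book, int nbands, int global_gain)
{
    int lastsf = global_gain;
    int lastpns = global_gain - PNS_SF_OFFSET;
    int i;

    assert(nbands >= 0);
    for (i = 0; i < nbands; i++) {
        /* Inactive bands carry no scalefactor, so they are skipped. */
        if (book[i] == HCB_ZERO || book[i] == HCB_NONE)
            continue;
        if (book[i] == HCB_PNS) {
            sf[i] = clamp_sf_diff(sf[i], lastpns);
            lastpns = sf[i];
        } else {
            sf[i] = clamp_sf_diff(sf[i], lastsf);
            lastsf = sf[i];
        }
    }
}
```

Keeping the clamp in one helper means both predictor chains are held to the same limit, and each chain only ever compares a band against the previous band of its own kind.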
Context on why:

The AAC Scalefactor Delta Bug

The Bug: "The Broken Bridge"

AAC uses Huffman Book 12 to encode the difference (delta) between band volumes.
The Scenario
The Fix: "Two Tracks & Guardrails"
Why Speech MOS Dropped

In speech, "noise" (consonants like S or F) is often louder than background hiss.
Okay, I discovered that this is partially responsible for #40. I'm working on the remaining fix.
Okay, this fixes #40 and possibly other cases too. No performance drop: https://github.com/nschimme/faac/actions/runs/24285523096
Hopefully the above two fixes will also let me finally get Pseudo SBR to work. I kept getting blocked 😅
High-energy transients were producing quantized indices > 8191, exceeding the AAC escape sequence limit and corrupting the bitstream.

- Peak Detection: Tracks `bandmaxe` (max spectral magnitude) per band.
- Gain Limiting: Proactively caps `sfacfix` in `qlevel` if the peak coefficient would exceed the representable range.
- Sync-Lock: Floors the scalefactor and re-derives the final gain to ensure encoder-decoder reconstruction alignment.
This causes a slight regression on speech, where the old behavior happened to do better by accident. In all other cases we see a slight MOS improvement, at the cost of 1% CPU throughput that we need to pay anyway for correctness: https://github.com/nschimme/faac/actions/runs/24264165540