Fix VAD timing log to show per-call and cumulative time #3661

Open
danielbodart wants to merge 2 commits into ggml-org:master from danielbodart:fix-vad-timing-log
@danielbodart
Summary

  • Fixes misleading VAD timing log that showed cumulative time labeled as just "vad time"
  • Now shows both per-call time and cumulative total: vad time = 1.23 ms (cumulative 4.56 ms) processing 16000 samples

Motivation

When whisper_vad_detect_speech is called multiple times (e.g. in a streaming context), the logged time was the running total, making it look like each call was getting progressively slower. This change makes it clear which value is which.
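The difference between the two values can be sketched as follows. This is a minimal illustration, not the actual patch: the `vad_timing` struct, the `vad_log_line` helper, and the microsecond field are hypothetical names chosen for the example, and only the log format is taken from the summary above.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical stand-in for whatever state holds the running VAD time total.
struct vad_timing {
    int64_t t_total_us = 0; // cumulative time across all calls, in microseconds
};

// Adds one call's duration to the running total and returns a log line that
// reports both values, using the format described in the summary.
std::string vad_log_line(vad_timing & timing, int64_t t_call_us, int n_samples) {
    assert(t_call_us >= 0); // a single call cannot take negative time

    timing.t_total_us += t_call_us;

    char buf[160];
    std::snprintf(buf, sizeof(buf),
                  "vad time = %.2f ms (cumulative %.2f ms) processing %d samples",
                  1e-3 * t_call_us,          // this call only
                  1e-3 * timing.t_total_us,  // all calls so far
                  n_samples);
    return std::string(buf);
}
```

With two calls of 1230 us and 3330 us on the same `vad_timing`, the per-call field reads 1.23 ms and then 3.33 ms, while the cumulative field grows from 1.23 ms to 4.56 ms. Logging only the cumulative field under the label "vad time" is what made later calls look slower than they were.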

danielbodart and others added 2 commits February 13, 2026 06:35
Add whisper_decode_with_state_and_aheads() which saves alignment head
cross-attention data during decode, and whisper_state_get_aheads_cross_qks()
to read the resulting tensor from state.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

The log previously showed cumulative time labeled as just "vad time",
which was misleading when called multiple times. Now shows both the
per-call time and the cumulative total.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>