Fix VAD timing log to show per-call and cumulative time by danielbodart · Pull Request #3661 · ggml-org/whisper.cpp

danielbodart · 2026-02-13T06:43:55Z

Summary

Fixes a misleading VAD timing log that printed the cumulative time under the label "vad time".
The log now shows both the per-call time and the cumulative total, for example: vad time = 1.23 ms (cumulative 4.56 ms) processing 16000 samples

Motivation

When whisper_vad_detect_speech is called multiple times (e.g. in a streaming context), the logged time was the running total, making it look like each call was getting progressively slower. This change makes it clear which value is which.
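The difference between the two values can be illustrated with a minimal, self-contained sketch. It is not the code from this PR: `vad_timing`, `now_us`, `format_vad_log`, and `timed_detect` are hypothetical names, and the loop is a placeholder for the real detection work. It only shows one way to measure a per-call duration, add it to a running total, and print both in the format quoted in the summary.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical accumulator: stands in for whatever state keeps the running VAD total.
struct vad_timing {
    int64_t t_total_us = 0; // sum of all per-call durations, in microseconds
    int     n_calls    = 0; // number of timed calls so far
};

// Monotonic clock reading in microseconds.
static int64_t now_us() {
    using namespace std::chrono;
    return duration_cast<microseconds>(steady_clock::now().time_since_epoch()).count();
}

// Builds the log line with both the per-call and the cumulative value.
static std::string format_vad_log(int64_t t_call_us, int64_t t_total_us, int n_samples) {
    char buf[160];
    std::snprintf(buf, sizeof(buf),
                  "vad time = %.2f ms (cumulative %.2f ms) processing %d samples",
                  t_call_us / 1000.0, t_total_us / 1000.0, n_samples);
    return std::string(buf);
}

// Times one (placeholder) detection call, updates the running total, logs both values.
static int64_t timed_detect(vad_timing & timing, int n_samples) {
    assert(n_samples >= 0);
    const int64_t t_start = now_us();

    // Placeholder workload; the real call would run speech detection here.
    volatile double acc = 0.0;
    for (int i = 0; i < n_samples; ++i) {
        acc = acc + i * 1e-6;
    }

    const int64_t t_call_us = now_us() - t_start; // this call only
    timing.t_total_us += t_call_us;               // running total across calls
    timing.n_calls    += 1;

    std::printf("%s\n", format_vad_log(t_call_us, timing.t_total_us, n_samples).c_str());
    return t_call_us;
}
```

Calling `timed_detect` repeatedly on the same `vad_timing` object in a streaming loop keeps the first number roughly constant while the cumulative number grows, which is the distinction the new log line is meant to expose.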

Add whisper_decode_with_state_and_aheads(), which saves alignment head cross-attention data during decode, and whisper_state_get_aheads_cross_qks() to read the resulting tensor from state.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

The log previously showed the cumulative time labeled as just "vad time", which was misleading when the function was called multiple times. It now shows both the per-call time and the cumulative total.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

danielbodart and others added 2 commits February 13, 2026 06:35

danielbodart wants to merge 2 commits into ggml-org:master from danielbodart:fix-vad-timing-log
