Audio Signal Analysis in the Frequency Domain

A Streamlit web application for frequency-domain analysis of audio signals, built from scratch using NumPy, SciPy, Matplotlib, and Plotly. The project covers the full pipeline from signal framing through spectral feature extraction, spectrogram generation, formant analysis, and fundamental frequency estimation via cepstrum.

Core Concepts

Signal Framing

The audio signal is divided into short overlapping frames (typically 20–40 ms), within which the signal can be treated as approximately stationary. Each frame is then processed independently.
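
A minimal sketch of the framing step, assuming NumPy. The function name `frame_signal` and the 25 ms frame / 10 ms hop defaults are illustrative choices and are not taken from the project's code.

```python
import numpy as np

def frame_signal(x, sample_rate, frame_ms=25.0, hop_ms=10.0):
    """Split a 1-D signal into overlapping frames (one frame per row)."""
    frame_len = int(round(sample_rate * frame_ms / 1000.0))
    hop_len = int(round(sample_rate * hop_ms / 1000.0))
    if len(x) < frame_len:
        # Zero-pad so that at least one full frame exists.
        x = np.pad(x, (0, frame_len - len(x)))
    n_frames = 1 + (len(x) - frame_len) // hop_len
    # Row i holds samples [i * hop_len, i * hop_len + frame_len).
    idx = np.arange(frame_len)[None, :] + hop_len * np.arange(n_frames)[:, None]
    return x[idx]
```

Trailing samples that do not fill a whole frame are dropped in this sketch; padding the tail instead would be an equally reasonable choice.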

Window Functions

Before the FFT is applied, each frame is multiplied by a window function to reduce spectral leakage, the artifact that arises because the FFT implicitly treats a finite-length frame as one period of a periodic signal. Five windows are implemented: rectangular, triangular, Hamming, Hann, and Blackman.
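
The five windows can be sketched with NumPy's built-in generators. The `WINDOWS` table and `apply_window` helper are hypothetical names, and mapping "triangular" to `np.bartlett` is an assumption (the project may use a triangular variant with non-zero endpoints).

```python
import numpy as np

# Window name -> function that returns a window of the requested length.
WINDOWS = {
    "rectangular": np.ones,
    "triangular": np.bartlett,  # assumption: Bartlett form, zero at both ends
    "hamming": np.hamming,
    "hann": np.hanning,
    "blackman": np.blackman,
}

def apply_window(frame, name="hamming"):
    """Multiply one frame by the selected window, sample by sample."""
    window = WINDOWS[name](len(frame))
    return frame * window
```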

FFT and Frequency Spectrum

The frequency spectrum of each frame is computed via the Fast Fourier Transform. The magnitude spectrum can be displayed on a linear or logarithmic (dB) scale.
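
As a rough illustration, the magnitude spectrum of one real-valued frame can be obtained with `np.fft.rfft`. The helper name `magnitude_spectrum` and the small `eps` added before the logarithm are assumptions made for this sketch.

```python
import numpy as np

def magnitude_spectrum(frame, sample_rate, to_db=False, eps=1e-12):
    """Return (frequencies in Hz, magnitude spectrum) for one real frame."""
    spectrum = np.abs(np.fft.rfft(frame))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
    if to_db:
        # eps avoids log10(0) for bins that are exactly zero.
        spectrum = 20.0 * np.log10(spectrum + eps)
    return freqs, spectrum
```

Because the input is real, only the non-negative frequencies are returned; the negative half of the spectrum is its mirror image.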

Spectral Features

Five frame-level features are extracted from the magnitude spectrum:

Frequency Centroid — the "center of gravity" of the spectrum, related to perceived brightness:

$$FC(n) = \frac{\int_0^\infty \omega \, S_n(\omega) \, d\omega}{\int_0^\infty S_n(\omega) \, d\omega}$$

Effective Bandwidth — the weighted standard deviation around the centroid:

$$BW(n) = \sqrt{\frac{\int_0^\infty (\omega - FC(n))^2 \, S_n^2(\omega) \, d\omega}{\int_0^\infty S_n^2(\omega) \, d\omega}}$$
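
A discrete sketch of the centroid and bandwidth formulas, with the integrals replaced by sums over FFT bins. It assumes `freqs` holds bin frequencies in Hz (so the results are in Hz rather than rad/s) and `mag` holds the magnitude spectrum of one frame; the function name and `eps` guard are illustrative.

```python
import numpy as np

def centroid_and_bandwidth(freqs, mag, eps=1e-12):
    """Frequency centroid and effective bandwidth of one magnitude spectrum."""
    # Centroid: magnitude-weighted mean frequency.
    fc = np.sum(freqs * mag) / (np.sum(mag) + eps)
    # Bandwidth: power-weighted standard deviation around the centroid.
    power = mag ** 2
    bw = np.sqrt(np.sum((freqs - fc) ** 2 * power) / (np.sum(power) + eps))
    return fc, bw
```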

Spectral Flatness Measure (SFM) — ratio of geometric mean to arithmetic mean of the power spectrum. Values close to 1 indicate noise-like signals; values close to 0 indicate tonal signals:

$$SFM = \frac{\left(\prod_{k} |X(k)|^2\right)^{1/N}}{\frac{1}{N} \sum_{k} |X(k)|^2}$$

Spectral Crest Factor (SCF) — ratio of the peak to the mean of the power spectrum, measuring how "spiky" the spectrum is:

$$SCF = \frac{\max_k |X(k)|^2}{\frac{1}{N} \sum_{k} |X(k)|^2}$$
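
Both flatness and crest factor follow directly from the power spectrum. In this sketch the geometric mean is computed in the log domain to avoid underflow from a long product; the helper name and the `eps` floor are assumptions.

```python
import numpy as np

def flatness_and_crest(mag, eps=1e-12):
    """Spectral flatness (SFM) and crest factor (SCF) of one magnitude spectrum."""
    power = mag ** 2 + eps  # eps keeps log() finite for empty bins
    geometric_mean = np.exp(np.mean(np.log(power)))
    arithmetic_mean = np.mean(power)
    sfm = geometric_mean / arithmetic_mean
    scf = np.max(power) / arithmetic_mean
    return sfm, scf
```

A perfectly flat spectrum gives SFM and SCF both equal to 1, while a single spectral line drives SFM toward 0 and SCF toward the number of bins.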

Energy Ratio in Subbands (ERSB) — the fraction of total spectral energy contained in each of four frequency bands (0–630 Hz, 630–1720 Hz, 1720–4400 Hz, 4400+ Hz):

$$ERSB_{[f_0, f_1]}(t) = \frac{\int_{f_0}^{f_1} S_t^2(f) \, df}{\int S_t^2(f) \, df}$$
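
The band energies can be sketched by masking FFT bins against the four band edges listed above. Treating each band as half-open (lower edge included, upper edge excluded) is an assumption of this sketch, as are the names `BAND_EDGES_HZ` and `energy_ratio_subbands`.

```python
import numpy as np

# Band edges in Hz: 0-630, 630-1720, 1720-4400, 4400 and above.
BAND_EDGES_HZ = (0.0, 630.0, 1720.0, 4400.0, np.inf)

def energy_ratio_subbands(freqs, mag, edges=BAND_EDGES_HZ, eps=1e-12):
    """Fraction of total spectral energy that falls into each subband."""
    power = mag ** 2
    total = np.sum(power) + eps
    ratios = []
    for low, high in zip(edges[:-1], edges[1:]):
        in_band = (freqs >= low) & (freqs < high)
        ratios.append(np.sum(power[in_band]) / total)
    return np.array(ratios)
```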

Spectrogram

The spectrogram is computed by applying the FFT to each overlapping frame and stacking the resulting magnitude spectra into a 2D time-frequency representation. It is rendered interactively using Plotly, allowing zoom and hover inspection of exact frequency values.
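
The stacking step might look like the following NumPy sketch. The Hann window, the 512-sample frame, and the 256-sample hop are illustrative defaults, not values confirmed by the project.

```python
import numpy as np

def spectrogram(x, sample_rate, frame_len=512, hop_len=256):
    """Return (times, freqs, S) where S[k, i] is the magnitude of bin k in frame i."""
    if len(x) < frame_len:
        raise ValueError("signal is shorter than one frame")
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop_len
    frames = np.stack([
        x[i * hop_len : i * hop_len + frame_len] * window
        for i in range(n_frames)
    ])
    mags = np.abs(np.fft.rfft(frames, axis=1))
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sample_rate)
    # Time stamp of each frame: the centre of the frame, in seconds.
    times = (np.arange(n_frames) * hop_len + frame_len / 2) / sample_rate
    return times, freqs, mags.T
```

The returned matrix has frequency on the rows and time on the columns, which is the layout a heatmap trace such as Plotly's `go.Heatmap(z=S, x=times, y=freqs)` expects.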

Formant Analysis

Formants are the resonant frequencies of the vocal tract, visible as peaks in the smoothed magnitude spectrum of voiced speech. The application detects them by smoothing the spectrum with a Savitzky-Golay filter and finding peaks below 5000 Hz.
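
A possible sketch of this detection step using `scipy.signal.savgol_filter` and `scipy.signal.find_peaks`. The filter length, polynomial order, and the limit of four returned candidates are assumptions; the project's actual parameters are not stated here.

```python
import numpy as np
from scipy.signal import find_peaks, savgol_filter

def find_formant_candidates(freqs, mag_db, max_freq=5000.0,
                            window_length=31, polyorder=3, n_formants=4):
    """Smooth a dB spectrum and return the lowest peak frequencies below max_freq."""
    # window_length must be odd and no longer than the spectrum.
    smoothed = savgol_filter(mag_db, window_length=window_length,
                             polyorder=polyorder)
    peaks, _ = find_peaks(smoothed)
    peaks = peaks[freqs[peaks] < max_freq]
    return freqs[peaks][:n_formants], smoothed
```

On real speech, a `prominence` or `distance` argument to `find_peaks` would likely be needed to reject small ripples that survive the smoothing.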

Cepstrum and Fundamental Frequency Estimation

The real cepstrum is defined as the inverse FFT of the log magnitude spectrum:

$$C(\tau) = \mathcal{F}^{-1}\left(\log |X(\omega)|\right)$$

The fundamental frequency F0 is estimated by locating the peak of the cepstrum within the quefrency range corresponding to a plausible F0 range (50–400 Hz):

$$f_0 = \frac{1}{\tau_{\max}}, \quad \tau_{\max} = \arg\max_{\tau \in [1/400,\, 1/50]\ \text{s}} C(\tau)$$
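
A minimal sketch of the cepstral estimate for a single frame. Quefrency is measured in samples here, so the search range 50 to 400 Hz maps to lags between `sample_rate / 400` and `sample_rate / 50`. The Hamming window, the `eps` term, and the function name are assumptions of this sketch.

```python
import numpy as np

def cepstral_f0(frame, sample_rate, f_min=50.0, f_max=400.0, eps=1e-12):
    """Estimate F0 (Hz) of one frame from the peak of its real cepstrum."""
    windowed = frame * np.hamming(len(frame))
    log_mag = np.log(np.abs(np.fft.fft(windowed)) + eps)
    cepstrum = np.real(np.fft.ifft(log_mag))
    # Convert the plausible F0 range into a quefrency range (in samples).
    q_min = int(sample_rate / f_max)
    q_max = min(int(sample_rate / f_min), len(frame) // 2)
    peak = q_min + int(np.argmax(cepstrum[q_min:q_max + 1]))
    return sample_rate / peak
```

The frame must span at least two pitch periods for the peak to be visible, so low-pitched voices need longer frames than the 20 ms lower bound used for spectral features.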

The application also compares this estimate against autocorrelation and AMDF-based methods.
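
For comparison, the two time-domain methods can be sketched as follows. Both search the same 50 to 400 Hz lag range. The AMDF version picks the smallest lag whose value is within a tolerance of the global minimum, a simple guard against octave errors; that rule and the 5% tolerance are choices made for this sketch, not the project's confirmed behaviour.

```python
import numpy as np

def f0_autocorr(frame, sample_rate, f_min=50.0, f_max=400.0):
    """F0 from the lag that maximises the autocorrelation."""
    x = frame - np.mean(frame)
    lag_min = int(sample_rate / f_max)
    lag_max = int(sample_rate / f_min)
    # Keep non-negative lags only: acf[k] = sum_n x[n] * x[n + k].
    acf = np.correlate(x, x, mode="full")[len(x) - 1:]
    lag = lag_min + int(np.argmax(acf[lag_min:lag_max + 1]))
    return sample_rate / lag

def f0_amdf(frame, sample_rate, f_min=50.0, f_max=400.0, tolerance=0.05):
    """F0 from the first deep minimum of the average magnitude difference."""
    x = frame - np.mean(frame)
    lags = np.arange(int(sample_rate / f_max), int(sample_rate / f_min) + 1)
    amdf = np.array([np.mean(np.abs(x[lag:] - x[:-lag])) for lag in lags])
    threshold = amdf.min() + tolerance * (amdf.max() - amdf.min())
    first = np.flatnonzero(amdf <= threshold)[0]
    return sample_rate / lags[first]
```

Both functions assume the frame is longer than the largest lag searched (`sample_rate / f_min` samples).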

Tech Stack

NumPy — vectorized signal processing
SciPy — WAV file loading, peak detection, filtering
Matplotlib — static visualizations
Plotly — interactive spectrogram
Streamlit — web interface

Running the App

pip install -r requirements.txt
streamlit run app.py

Upload a WAV file and explore the analysis tabs: Frame Analysis, Frequency Analysis, Spectrogram, Cepstrum Analysis, and Statistics & Export.
