Michael-Pytel/Frequency-Domain-Audio-Analysis


Audio Signal Analysis in the Frequency Domain

A Streamlit web application for frequency-domain analysis of audio signals, built from scratch using NumPy, SciPy, Matplotlib, and Plotly. The project covers the full pipeline from signal framing through spectral feature extraction, spectrogram generation, formant analysis, and fundamental frequency estimation via cepstrum.

Core Concepts

Signal Framing

The audio signal is divided into short overlapping frames (typically 20–40 ms), within which the signal can be treated as approximately stationary. Each frame is then processed independently.
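The framing step can be sketched with NumPy index arithmetic. This is a minimal illustration, not the application's own code: the function name `frame_signal`, the 25 ms frame length, and the 50% overlap are assumptions chosen to fall inside the 20–40 ms range mentioned above.

```python
import numpy as np

def frame_signal(x, sample_rate, frame_ms=25.0, overlap=0.5):
    """Split a 1-D signal into overlapping frames (one frame per row).

    frame_ms and overlap are illustrative defaults, not values from the app.
    """
    x = np.asarray(x, dtype=float)
    frame_len = int(round(sample_rate * frame_ms / 1000.0))
    hop = max(1, int(round(frame_len * (1.0 - overlap))))
    if len(x) < frame_len:
        return np.empty((0, frame_len), dtype=float)
    n_frames = 1 + (len(x) - frame_len) // hop
    # Row i holds samples [i * hop, i * hop + frame_len).
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

# One second of a ramp at 16 kHz: 400-sample frames with a 200-sample hop.
frames = frame_signal(np.arange(16000), sample_rate=16000)
```

Trailing samples that do not fill a whole frame are dropped in this sketch; zero-padding the last frame would be an equally reasonable choice.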

Window Functions

Before applying FFT, a window function is applied to each frame to reduce spectral leakage — the artifact caused by treating a finite-length signal as periodic. Five windows are implemented: rectangular, triangular, Hamming, Hann, and Blackman.
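As a rough illustration of the effect, the sketch below applies each of the five window types using NumPy's built-in generators and compares how much spectral magnitude leaks far away from a tone that does not fall on an FFT bin. The names `WINDOWS`, `apply_window`, and `leakage_fraction` are hypothetical, and mapping "triangular" to `np.bartlett` is an assumption; the application may define its windows differently.

```python
import numpy as np

# Name-to-generator mapping; this lookup table is the sketch's own convention.
WINDOWS = {
    "rectangular": np.ones,
    "triangular": np.bartlett,  # triangular window with zero end points
    "hamming": np.hamming,
    "hann": np.hanning,
    "blackman": np.blackman,
}

def apply_window(frame, name="hamming"):
    """Multiply a frame sample-by-sample with the chosen window."""
    frame = np.asarray(frame, dtype=float)
    return frame * WINDOWS[name](len(frame))

def leakage_fraction(frame, name, far_bin=100):
    """Share of total spectral magnitude found at or above far_bin."""
    spectrum = np.abs(np.fft.rfft(apply_window(frame, name)))
    return spectrum[far_bin:].sum() / spectrum.sum()

# A tone with 10.5 cycles per frame sits between bins, so it leaks.
n = np.arange(512)
tone = np.sin(2 * np.pi * 10.5 * n / 512)
leak_rect = leakage_fraction(tone, "rectangular")
leak_hann = leakage_fraction(tone, "hann")
```

For this off-bin tone, the Hann window leaves a much smaller share of magnitude far from the peak than the rectangular window does, at the cost of a wider main lobe.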

FFT and Frequency Spectrum

The frequency spectrum of each frame is computed via the Fast Fourier Transform. The magnitude spectrum can be displayed on a linear or logarithmic (dB) scale.
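A minimal sketch of the per-frame spectrum, assuming a real-valued frame so that the one-sided `np.fft.rfft` is enough. The function name `magnitude_spectrum` and the small `eps` added before the logarithm are illustrative choices, not taken from the application.

```python
import numpy as np

def magnitude_spectrum(frame, sample_rate, in_db=False, eps=1e-12):
    """Return (frequencies in Hz, magnitude) for one real-valued frame."""
    frame = np.asarray(frame, dtype=float)
    mag = np.abs(np.fft.rfft(frame))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
    if in_db:
        # eps avoids log10(0) on empty bins.
        mag = 20.0 * np.log10(mag + eps)
    return freqs, mag

# 1 kHz tone, 400 samples at 8 kHz: bins are 20 Hz apart, so the tone is on a bin.
sr = 8000
tone = np.sin(2 * np.pi * 1000 * np.arange(400) / sr)
freqs, mag = magnitude_spectrum(tone, sr)
peak_hz = freqs[np.argmax(mag)]
_, mag_db = magnitude_spectrum(tone, sr, in_db=True)
```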

Spectral Features

Five frame-level features are extracted from the magnitude spectrum:

Frequency Centroid — the "center of gravity" of the spectrum, related to perceived brightness:

$$FC(n) = \frac{\int_0^\infty \omega \, S_n(\omega) \, d\omega}{\int_0^\infty S_n(\omega) \, d\omega}$$

Effective Bandwidth — the weighted standard deviation around the centroid:

$$BW(n) = \sqrt{\frac{\int_0^\infty (\omega - FC(n))^2 \, S_n^2(\omega) \, d\omega}{\int_0^\infty S_n^2(\omega) \, d\omega}}$$
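On a sampled spectrum, the integrals in the centroid and bandwidth definitions become sums over FFT bins. The sketch below is one possible discretisation; it works in Hz rather than angular frequency, and the function name `centroid_and_bandwidth` and the `eps` guard against silent frames are assumptions.

```python
import numpy as np

def centroid_and_bandwidth(freqs, mag, eps=1e-12):
    """Discrete frequency centroid and effective bandwidth of one frame.

    freqs: bin frequencies in Hz, mag: magnitude spectrum S_n at those bins.
    """
    freqs = np.asarray(freqs, dtype=float)
    mag = np.asarray(mag, dtype=float)
    # Centroid: magnitude-weighted mean frequency.
    fc = np.sum(freqs * mag) / (np.sum(mag) + eps)
    # Bandwidth: power-weighted spread around the centroid.
    power = mag ** 2
    bw = np.sqrt(np.sum((freqs - fc) ** 2 * power) / (np.sum(power) + eps))
    return fc, bw

# Two equal spectral lines at 100 Hz and 300 Hz.
fc, bw = centroid_and_bandwidth([0, 100, 200, 300, 400], [0, 1, 0, 1, 0])
```

For the two-line spectrum the centroid falls halfway between the lines at about 200 Hz, and the bandwidth is about 100 Hz, the distance from the centroid to each line.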

Spectral Flatness Measure (SFM) — ratio of geometric mean to arithmetic mean of the power spectrum. Values close to 1 indicate noise-like signals; values close to 0 indicate tonal signals:

$$SFM = \frac{\left(\prod_{k} |X(k)|^2\right)^{1/N}}{\frac{1}{N} \sum_{k} |X(k)|^2}$$

Spectral Crest Factor (SCF) — ratio of the peak to the mean of the power spectrum, measuring how "spiky" the spectrum is:

$$SCF = \frac{\max_k |X(k)|^2}{\frac{1}{N} \sum_{k} |X(k)|^2}$$
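Both SFM and SCF can be computed from the same power spectrum. In the sketch below the geometric mean is evaluated through the mean of logarithms to avoid underflow in the product; the function name `flatness_and_crest` and the `eps` floor on the power values are illustrative assumptions.

```python
import numpy as np

def flatness_and_crest(mag, eps=1e-12):
    """Spectral flatness (SFM) and crest factor (SCF) of one frame."""
    power = np.asarray(mag, dtype=float) ** 2 + eps  # floor avoids log(0)
    geometric_mean = np.exp(np.mean(np.log(power)))
    arithmetic_mean = np.mean(power)
    sfm = geometric_mean / arithmetic_mean
    scf = np.max(power) / arithmetic_mean
    return sfm, scf

# A flat (noise-like) spectrum and a single-line (tonal) spectrum.
sfm_flat, scf_flat = flatness_and_crest(np.ones(8))
sfm_tone, scf_tone = flatness_and_crest([0, 0, 0, 1, 0, 0, 0, 0])
```

The flat spectrum gives SFM and SCF both equal to 1, while the single-line spectrum gives an SFM near 0 and an SCF near the number of bins.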

Energy Ratio in Subbands (ERSB) — the fraction of total spectral energy contained in each of four frequency bands (0–630 Hz, 630–1720 Hz, 1720–4400 Hz, 4400+ Hz):

$$ERSB_{[f_0, f_1]}(t) = \frac{\int_{f_0}^{f_1} S_t^2(f) \, df}{\int S_t^2(f) \, df}$$
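A discrete version of ERSB sums the squared magnitudes of the bins that fall into each band. The band edges below are the four bands listed above; the names `BAND_EDGES_HZ` and `energy_ratio_subbands`, and the half-open interval convention (lower edge included, upper edge excluded), are assumptions of this sketch.

```python
import numpy as np

# Band edges in Hz for the four bands: 0-630, 630-1720, 1720-4400, 4400+.
BAND_EDGES_HZ = [0.0, 630.0, 1720.0, 4400.0, np.inf]

def energy_ratio_subbands(freqs, mag, edges=BAND_EDGES_HZ, eps=1e-12):
    """Fraction of total spectral energy in each band [lo, hi)."""
    freqs = np.asarray(freqs, dtype=float)
    power = np.asarray(mag, dtype=float) ** 2
    total = np.sum(power) + eps  # eps guards against an all-zero frame
    return np.array([
        np.sum(power[(freqs >= lo) & (freqs < hi)]) / total
        for lo, hi in zip(edges[:-1], edges[1:])
    ])

# One spectral line per band, all of equal magnitude.
ersb = energy_ratio_subbands([100, 1000, 2000, 5000], [1, 1, 1, 1])
```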

Spectrogram

The spectrogram is computed by applying the FFT to each overlapping frame and stacking the resulting magnitude spectra into a 2D time-frequency representation. It is rendered interactively using Plotly, allowing zoom and hover inspection of exact frequency values.
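The computation behind the plot can be sketched as follows. This covers only the numeric part (framing, windowing, FFT, stacking); the Plotly rendering is left out. The function name `spectrogram`, the Hann window, and the frame and hop sizes are assumptions, not the application's settings.

```python
import numpy as np

def spectrogram(x, sample_rate, frame_len=512, hop=256, eps=1e-12):
    """Return (freqs in Hz, frame centre times in s, dB magnitudes).

    The dB array has shape (n_freq_bins, n_frames).
    """
    x = np.asarray(x, dtype=float)
    if len(x) < frame_len:
        raise ValueError("signal is shorter than one frame")
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = x[idx] * np.hanning(frame_len)       # window every frame
    mag = np.abs(np.fft.rfft(frames, axis=1))     # one spectrum per row
    spec_db = 20.0 * np.log10(mag + eps).T        # rows become frequency bins
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sample_rate)
    times = (np.arange(n_frames) * hop + frame_len / 2) / sample_rate
    return freqs, times, spec_db

# One second of a 1 kHz tone sampled at 8 kHz.
sr = 8000
tone = np.sin(2 * np.pi * 1000 * np.arange(sr) / sr)
freqs, times, spec_db = spectrogram(tone, sr)
```

The returned arrays map directly onto the axes and values of a heatmap, which is presumably how an interactive plot would consume them.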

Formant Analysis

Formants are the resonant frequencies of the vocal tract, visible as peaks in the smoothed magnitude spectrum of voiced speech. The application detects them by smoothing the spectrum with a Savitzky-Golay filter and finding peaks below 5000 Hz.
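The smoothing and peak-picking steps can be sketched with `scipy.signal.savgol_filter` and `scipy.signal.find_peaks`. The function name `find_formant_candidates`, the filter window length, polynomial order, and the 3 dB prominence threshold are assumptions made for illustration; the application's actual parameters are not stated. The input here is a synthetic spectral envelope, not real speech.

```python
import numpy as np
from scipy.signal import find_peaks, savgol_filter

def find_formant_candidates(freqs, mag_db, max_freq=5000.0,
                            window_length=31, polyorder=3,
                            prominence_db=3.0):
    """Smooth a dB spectrum and return peak frequencies below max_freq."""
    freqs = np.asarray(freqs, dtype=float)
    smoothed = savgol_filter(np.asarray(mag_db, dtype=float),
                             window_length=window_length,
                             polyorder=polyorder)
    band = freqs < max_freq
    # Prominence filtering discards small ripples left after smoothing.
    peaks, _ = find_peaks(smoothed[band], prominence=prominence_db)
    return freqs[band][peaks], smoothed

# Synthetic envelope with bumps at 700 Hz, 1800 Hz, and 6000 Hz (10 Hz grid).
freqs = np.linspace(0.0, 8000.0, 801)
envelope_db = (30.0 * np.exp(-((freqs - 700.0) / 150.0) ** 2)
               + 25.0 * np.exp(-((freqs - 1800.0) / 200.0) ** 2)
               + 20.0 * np.exp(-((freqs - 6000.0) / 200.0) ** 2))
formants, smoothed = find_formant_candidates(freqs, envelope_db)
```

On real voiced speech the spectrum also contains harmonics of F0, so the smoothing window has to be wide enough to average over them; the values above are only a starting point.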

Cepstrum and Fundamental Frequency Estimation

The real cepstrum is defined as the inverse FFT of the log magnitude spectrum:

$$C(\tau) = \mathcal{F}^{-1}\left(\log |X(\omega)|\right)$$

The fundamental frequency F0 is estimated by locating the peak of the cepstrum within the quefrency range corresponding to a plausible F0 range (50–400 Hz):

$$f_0 = \frac{1}{\tau_{\max}}, \quad \tau_{\max} = \arg\max_{\tau \in [1/400,\, 1/50]\ \mathrm{s}} C(\tau)$$
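The cepstral estimate can be sketched in a few lines of NumPy. The function name `cepstrum_f0`, the Hamming window, and the `eps` added before the logarithm are assumptions of this sketch; the 50–400 Hz search range follows the text. The test signal is a synthetic harmonic tone, not recorded speech.

```python
import numpy as np

def cepstrum_f0(frame, sample_rate, f_min=50.0, f_max=400.0, eps=1e-12):
    """Estimate F0 from the peak of the real cepstrum of one frame."""
    frame = np.asarray(frame, dtype=float)
    windowed = frame * np.hamming(len(frame))
    log_mag = np.log(np.abs(np.fft.fft(windowed)) + eps)
    cepstrum = np.real(np.fft.ifft(log_mag))
    # Quefrency index q corresponds to a period of q / sample_rate seconds.
    q_lo = int(np.ceil(sample_rate / f_max))
    q_hi = min(int(np.floor(sample_rate / f_min)), len(frame) // 2)
    peak = q_lo + int(np.argmax(cepstrum[q_lo:q_hi + 1]))
    return sample_rate / peak, cepstrum

# Synthetic "voiced" frame: harmonics of 200 Hz with 1/k amplitudes.
sr = 16000
t = np.arange(1024) / sr
voiced = sum(np.sin(2 * np.pi * 200 * k * t) / k for k in range(1, 40))
f0, cepstrum = cepstrum_f0(voiced, sr)
```

The frame has to span several pitch periods for the cepstral peak to stand out, which is why a 64 ms frame is used here rather than a 20 ms one.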

The application also compares this estimate against autocorrelation and AMDF-based methods.
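For comparison, a basic autocorrelation estimator might look like the sketch below. It is not the application's implementation: the function name `autocorr_f0` is hypothetical, and it uses the plain (biased) autocorrelation with no voicing decision or peak interpolation. An AMDF variant would search for the minimum of the mean absolute difference over the same lag range and is not shown.

```python
import numpy as np

def autocorr_f0(frame, sample_rate, f_min=50.0, f_max=400.0):
    """Estimate F0 from the strongest autocorrelation peak in the lag range."""
    x = np.asarray(frame, dtype=float)
    x = x - np.mean(x)
    # Keep non-negative lags only: r[k] = sum_n x[n] * x[n + k].
    r = np.correlate(x, x, mode="full")[len(x) - 1:]
    lag_lo = int(np.ceil(sample_rate / f_max))
    lag_hi = min(int(np.floor(sample_rate / f_min)), len(x) - 1)
    lag = lag_lo + int(np.argmax(r[lag_lo:lag_hi + 1]))
    return sample_rate / lag

# Same kind of synthetic frame: harmonics of 200 Hz sampled at 16 kHz.
sr = 16000
t = np.arange(1024) / sr
voiced = sum(np.sin(2 * np.pi * 200 * k * t) / k for k in range(1, 40))
f0_acf = autocorr_f0(voiced, sr)
```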

Tech Stack

  • NumPy — vectorized signal processing
  • SciPy — WAV file loading, peak detection, filtering
  • Matplotlib — static visualizations
  • Plotly — interactive spectrogram
  • Streamlit — web interface

Running the App

pip install -r requirements.txt
streamlit run app.py

Upload a WAV file and explore the analysis tabs: Frame Analysis, Frequency Analysis, Spectrogram, Cepstrum Analysis, and Statistics & Export.
