aus.analysis

Audio analysis tools based on Eyben, “Real-Time Speech and Music Classification.” These tools expect audio as a 1D array of samples. The analyzer function runs all of the spectral analysis tools in one call, which is more convenient than running each tool individually.

analyzer(audio: np.ndarray, sample_rate: int)

Runs a suite of analysis tools on the provided NumPy array of audio samples.

Parameters:
  • audio (np.ndarray) – A 1D NumPy array of audio samples

  • sample_rate (int) – The sample rate of the audio

Returns:

A dictionary with the analysis results

Return type:

dict
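Most of the spectral functions below take precomputed inputs (a magnitude or power spectrum, the bin frequencies, their sums, or a PMF) rather than raw audio. The sketch below shows one way such inputs might be derived with NumPy. The helper name spectral_inputs, the dictionary keys, and the choice to build the PMF by normalizing the power spectrum are assumptions for illustration, not a description of what analyzer does internally.

```python
import numpy as np


def spectral_inputs(audio: np.ndarray, sample_rate: int) -> dict:
    """Hypothetical helper: derive arrays shaped like the inputs documented below."""
    magnitude_spectrum = np.abs(np.fft.rfft(audio))
    power_spectrum = magnitude_spectrum ** 2
    # Center frequency (Hz) of each real-FFT bin
    magnitude_freqs = np.fft.rfftfreq(audio.shape[-1], d=1.0 / sample_rate)
    power_spectrum_sum = power_spectrum.sum()
    return {
        "magnitude_spectrum": magnitude_spectrum,
        "magnitude_freqs": magnitude_freqs,
        "magnitude_spectrum_sum": magnitude_spectrum.sum(),
        "power_spectrum": power_spectrum,
        "power_spectrum_sum": power_spectrum_sum,
        # Assumption: the PMF is the power spectrum normalized to sum to 1
        "spectrum_pmf": power_spectrum / power_spectrum_sum,
    }


# One second of a 440 Hz sine at 8 kHz, so each bin is 1 Hz wide
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
audio = np.sin(2 * np.pi * 440 * t)
inputs = spectral_inputs(audio, sample_rate)
```

With this test tone, the largest magnitude bin sits at 440 Hz and the PMF sums to 1.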

energy(audio: np.ndarray)

Extracts the RMS energy of the signal. Reference: Eyben, pp. 21-22.

Parameters:

audio (np.ndarray) – A NumPy array of audio samples

Returns:

The RMS energy of the signal
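As a rough illustration of the quantity involved, RMS energy is the square root of the mean of the squared samples. The function name rms_energy_sketch is hypothetical, and this is a minimal sketch rather than the library's own code.

```python
import numpy as np


def rms_energy_sketch(audio) -> float:
    """Hypothetical sketch: root-mean-square of a 1D signal."""
    audio = np.asarray(audio, dtype=np.float64)
    return float(np.sqrt(np.mean(np.square(audio))))


# A full-scale sine over a whole number of periods has RMS 1/sqrt(2)
t = np.arange(48000) / 48000
sine = np.sin(2 * np.pi * 100 * t)
rms = rms_energy_sketch(sine)
```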

spectral_centroid(magnitude_spectrum: np.ndarray, magnitude_freqs: np.ndarray, magnitude_spectrum_sum)

Calculates the spectral centroid from the provided magnitude spectrum. Reference: Eyben, pp. 39-40.

Parameters:
  • magnitude_spectrum (np.ndarray) – The magnitude spectrum

  • magnitude_freqs (np.ndarray) – The magnitude frequencies

  • magnitude_spectrum_sum – The sum of the magnitude spectrum

Returns:

The spectral centroid

Return type:

float
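The centroid is commonly defined as the magnitude-weighted mean of the bin frequencies. The sketch below follows that definition with the same three inputs. The name centroid_sketch is hypothetical, and the library may differ in details.

```python
import numpy as np


def centroid_sketch(magnitude_spectrum, magnitude_freqs, magnitude_spectrum_sum) -> float:
    """Hypothetical sketch: magnitude-weighted mean frequency."""
    weighted = np.sum(magnitude_freqs * magnitude_spectrum)
    return float(weighted / magnitude_spectrum_sum)


# Two equal peaks at 100 Hz and 200 Hz balance at the midpoint
magnitudes = np.array([0.0, 1.0, 1.0, 0.0])
freqs = np.array([0.0, 100.0, 200.0, 300.0])
centroid = centroid_sketch(magnitudes, freqs, magnitudes.sum())
```

Passing the precomputed sum avoids recomputing it when several descriptors share the same spectrum.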

spectral_entropy(spectrum_pmf: np.ndarray)

Calculates the spectral entropy from the provided spectrum probability mass function (PMF). Reference: Eyben, pp. 23, 40, 41.

Parameters:

spectrum_pmf (np.ndarray) – The spectrum expressed as a probability mass function (PMF)

Returns:

The spectral entropy

Return type:

float
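Spectral entropy treats the normalized spectrum as a probability distribution and applies the Shannon entropy formula. The sketch below assumes a base-2 logarithm (entropy in bits) and skips zero-valued bins. Both choices, and the name entropy_sketch, are assumptions rather than confirmed library behavior.

```python
import numpy as np


def entropy_sketch(spectrum_pmf) -> float:
    """Hypothetical sketch: Shannon entropy of a spectral PMF, in bits."""
    p = np.asarray(spectrum_pmf, dtype=np.float64)
    p = p[p > 0]  # 0 * log(0) is taken as 0, so empty bins are dropped
    return float(-np.sum(p * np.log2(p)))


uniform_pmf = np.full(8, 1.0 / 8.0)       # energy spread evenly over 8 bins
single_bin_pmf = np.array([0.0, 1.0, 0.0])  # all energy in one bin
uniform_entropy = entropy_sketch(uniform_pmf)
single_bin_entropy = entropy_sketch(single_bin_pmf)
```

A flat spectrum maximizes the entropy, while a single spectral line drives it to zero.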

spectral_flatness(magnitude_spectrum: np.ndarray, magnitude_spectrum_sum)

Calculates the spectral flatness from the provided magnitude spectrum. References: Eyben, p. 39, https://en.wikipedia.org/wiki/Spectral_flatness.

Parameters:
  • magnitude_spectrum (np.ndarray) – The magnitude spectrum

  • magnitude_spectrum_sum – The sum of the magnitude spectrum

Returns:

The spectral flatness, in dBFS

Return type:

float
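Spectral flatness is usually described as the ratio of the geometric mean to the arithmetic mean of the spectrum. The sketch below returns that linear ratio only. It does not attempt the decibel conversion, because the exact convention used by the library is not stated here. The name flatness_sketch is hypothetical, and the sketch assumes every bin is strictly positive.

```python
import numpy as np


def flatness_sketch(magnitude_spectrum, magnitude_spectrum_sum) -> float:
    """Hypothetical sketch: geometric mean / arithmetic mean (linear ratio)."""
    x = np.asarray(magnitude_spectrum, dtype=np.float64)
    # Geometric mean computed in the log domain to avoid overflow in the product
    geometric_mean = np.exp(np.mean(np.log(x)))
    arithmetic_mean = magnitude_spectrum_sum / x.size
    return float(geometric_mean / arithmetic_mean)


flat = np.full(16, 2.0)                  # noise-like: all bins equal
peaky = np.array([1e-3] * 15 + [10.0])   # tone-like: one dominant bin
flat_ratio = flatness_sketch(flat, flat.sum())
peaky_ratio = flatness_sketch(peaky, peaky.sum())
```

The ratio approaches 1 for a flat spectrum and approaches 0 as the energy concentrates in a few bins.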

spectral_kurtosis(spectrum_pmf: np.ndarray, magnitude_freqs: np.ndarray, spectral_centroid: float, spectral_variance: float)

Calculates the spectral kurtosis. Reference: Eyben, pp. 23, 39-40.

Parameters:
  • spectrum_pmf (np.ndarray) – The spectrum expressed as a probability mass function (PMF)

  • magnitude_freqs (np.ndarray) – The magnitude frequencies

  • spectral_centroid (float) – The spectral centroid

  • spectral_variance (float) – The spectral variance

Returns:

The spectral kurtosis

Return type:

float
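Kurtosis is the fourth central moment of the spectral distribution, normalized by the squared variance. The sketch below assumes the non-excess form (no subtraction of 3). The name kurtosis_sketch is hypothetical.

```python
import numpy as np


def kurtosis_sketch(spectrum_pmf, magnitude_freqs, spectral_centroid, spectral_variance) -> float:
    """Hypothetical sketch: fourth standardized moment of the spectral PMF."""
    fourth_moment = np.sum((magnitude_freqs - spectral_centroid) ** 4 * spectrum_pmf)
    return float(fourth_moment / spectral_variance ** 2)


# Two equal spectral lines at 100 Hz and 300 Hz
pmf = np.array([0.5, 0.5])
freqs = np.array([100.0, 300.0])
centroid = float(np.sum(freqs * pmf))                    # 200 Hz
variance = float(np.sum((freqs - centroid) ** 2 * pmf))  # 10000 Hz^2
kurtosis = kurtosis_sketch(pmf, freqs, centroid, variance)
```

The centroid and variance are passed in, so they have to be computed first, which matches the parameter list above.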

spectral_roll_off_point(power_spectrum: np.ndarray, magnitude_freqs: np.ndarray, n: float, power_spectrum_sum)

Calculates the spectral roll-off frequency from the provided power spectrum. Reference: Eyben, p. 41.

Parameters:
  • power_spectrum (np.ndarray) – The power spectrum

  • magnitude_freqs (np.ndarray) – The magnitude frequencies

  • n (float) – The roll-off, as a fraction (0 ≤ n ≤ 1)

  • power_spectrum_sum – The sum of the power spectrum

Returns:

The roll-off frequency

Return type:

float
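The roll-off point is commonly taken as the lowest frequency below which a fraction n of the total spectral power lies. The sketch below finds it with a cumulative sum. The name roll_off_sketch and the choice of the first bin that reaches the threshold are assumptions.

```python
import numpy as np


def roll_off_sketch(power_spectrum, magnitude_freqs, n, power_spectrum_sum) -> float:
    """Hypothetical sketch: first bin frequency where cumulative power >= n * total."""
    cumulative = np.cumsum(power_spectrum)
    index = np.searchsorted(cumulative, n * power_spectrum_sum, side="left")
    # Guard against rounding pushing the index past the last bin
    index = min(int(index), len(magnitude_freqs) - 1)
    return float(magnitude_freqs[index])


power = np.array([1.0, 1.0, 1.0, 1.0])
freqs = np.array([0.0, 100.0, 200.0, 300.0])
half_point = roll_off_sketch(power, freqs, 0.5, power.sum())
high_point = roll_off_sketch(power, freqs, 0.85, power.sum())
```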

spectral_skewness(spectrum_pmf: np.ndarray, magnitude_freqs: np.ndarray, spectral_centroid: float, spectral_variance: float)

Calculates the spectral skewness. Reference: Eyben, pp. 23, 39-40.

Parameters:
  • spectrum_pmf (np.ndarray) – The spectrum expressed as a probability mass function (PMF)

  • magnitude_freqs (np.ndarray) – The magnitude frequencies

  • spectral_centroid (float) – The spectral centroid

  • spectral_variance (float) – The spectral variance

Returns:

The spectral skewness

Return type:

float
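Skewness is the third central moment of the spectral distribution, normalized by the variance raised to the power 3/2. A minimal sketch under that definition follows. The name skewness_sketch is hypothetical.

```python
import numpy as np


def skewness_sketch(spectrum_pmf, magnitude_freqs, spectral_centroid, spectral_variance) -> float:
    """Hypothetical sketch: third standardized moment of the spectral PMF."""
    third_moment = np.sum((magnitude_freqs - spectral_centroid) ** 3 * spectrum_pmf)
    return float(third_moment / spectral_variance ** 1.5)


# Most energy at the low bin with a small amount far above it: a tail toward high frequencies
pmf = np.array([0.75, 0.25])
freqs = np.array([0.0, 4.0])
centroid = float(np.sum(freqs * pmf))
variance = float(np.sum((freqs - centroid) ** 2 * pmf))
skewness = skewness_sketch(pmf, freqs, centroid, variance)
```

A positive value indicates a tail toward higher frequencies, and a symmetric distribution gives zero.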

spectral_slope(power_spectrum: np.ndarray)

Calculates the spectral slope from the provided power spectrum. Reference: Eyben, pp. 35-38.

Parameters:

power_spectrum (np.ndarray) – The power spectrum

Returns:

The slope

Return type:

float
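One common way to obtain a spectral slope is a least-squares line fit through the spectrum. Since this function receives only the power spectrum, the sketch below assumes the x-axis is the bin index, which makes the result a slope per bin. That assumption and the name slope_sketch are illustrative only.

```python
import numpy as np


def slope_sketch(power_spectrum) -> float:
    """Hypothetical sketch: least-squares slope of power against bin index."""
    y = np.asarray(power_spectrum, dtype=np.float64)
    x = np.arange(y.size, dtype=np.float64)
    slope, _intercept = np.polyfit(x, y, 1)
    return float(slope)


# A spectrum that loses 2 units of power per bin
falling = 10.0 - 2.0 * np.arange(5)
slope = slope_sketch(falling)
```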

spectral_slope_region(power_spectrum: np.ndarray, rfftfreqs: np.ndarray, f_lower: float, f_upper: float, sample_rate: int)

Calculates the spectral slope from the provided power spectrum, between the specified frequencies. These frequencies do not have to correspond to exact bin indices. Reference: Eyben, pp. 35-38.

Parameters:
  • power_spectrum (np.ndarray) – The power spectrum

  • rfftfreqs (np.ndarray) – The FFT freqs for the power spectrum bins

  • f_lower (float) – The lower frequency

  • f_upper (float) – The upper frequency

  • sample_rate (int) – The sample rate of the audio

Returns:

The slope

Return type:

float
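When the band edges fall between bins, one option is to interpolate the spectrum at f_lower and f_upper and fit a line through those two points plus the bins in between. The sketch below does that against frequency in Hz. The interpolation strategy, the per-Hz units, and the name slope_region_sketch are all assumptions, and the sketch omits the sample_rate parameter because it does not need it.

```python
import numpy as np


def slope_region_sketch(power_spectrum, rfftfreqs, f_lower, f_upper) -> float:
    """Hypothetical sketch: least-squares slope of power against frequency in a band."""
    power_spectrum = np.asarray(power_spectrum, dtype=np.float64)
    rfftfreqs = np.asarray(rfftfreqs, dtype=np.float64)
    inside = (rfftfreqs > f_lower) & (rfftfreqs < f_upper)
    # Linearly interpolate the spectrum at the (possibly off-bin) band edges
    edge_values = np.interp([f_lower, f_upper], rfftfreqs, power_spectrum)
    x = np.concatenate(([f_lower], rfftfreqs[inside], [f_upper]))
    y = np.concatenate(([edge_values[0]], power_spectrum[inside], [edge_values[1]]))
    slope, _intercept = np.polyfit(x, y, 1)
    return float(slope)


# Bins every 100 Hz, power falling by 0.01 per Hz; band edges lie between bins
freqs = np.arange(0.0, 4100.0, 100.0)
power = 100.0 - 0.01 * freqs
band_slope = slope_region_sketch(power, freqs, 250.0, 1725.0)
```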

spectral_variance(spectrum_pmf: np.ndarray, magnitude_freqs: np.ndarray, spectral_centroid: float)

Calculates the spectral variance. Reference: Eyben, pp. 23, 39-40.

Parameters:
  • spectrum_pmf (np.ndarray) – The spectrum expressed as a probability mass function (PMF)

  • magnitude_freqs (np.ndarray) – The magnitude frequencies

  • spectral_centroid (float) – The spectral centroid

Returns:

The spectral variance

Return type:

float
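Spectral variance is the second central moment of the spectral distribution: the PMF-weighted mean squared distance of each bin frequency from the centroid. A minimal sketch follows. The name variance_sketch is hypothetical.

```python
import numpy as np


def variance_sketch(spectrum_pmf, magnitude_freqs, spectral_centroid) -> float:
    """Hypothetical sketch: PMF-weighted squared deviation from the centroid."""
    deviations = magnitude_freqs - spectral_centroid
    return float(np.sum(deviations ** 2 * spectrum_pmf))


pmf = np.array([0.25, 0.5, 0.25])
freqs = np.array([100.0, 200.0, 300.0])
centroid = float(np.sum(freqs * pmf))
variance = variance_sketch(pmf, freqs, centroid)
```

The result is in squared frequency units, so its square root gives a spread in Hz.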

zero_crossing_rate(audio: np.ndarray, sample_rate: int)

Extracts the zero-crossing rate. Reference: Eyben, p. 20.

Parameters:
  • audio (np.ndarray) – A NumPy array of audio samples

  • sample_rate (int) – The sample rate of the audio

Returns:

The zero-crossing rate

Return type:

float
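A zero-crossing rate can be estimated by counting sign changes between consecutive samples and dividing by the signal duration. Because the function takes a sample rate, the sketch below assumes the result is expressed in crossings per second. That unit, the treatment of exact zeros as positive, and the name zcr_sketch are assumptions.

```python
import numpy as np


def zcr_sketch(audio, sample_rate) -> float:
    """Hypothetical sketch: sign changes per second."""
    audio = np.asarray(audio, dtype=np.float64)
    negative = np.signbit(audio)
    # A crossing occurs wherever adjacent samples differ in sign
    crossings = np.count_nonzero(negative[1:] != negative[:-1])
    duration_seconds = audio.size / sample_rate
    return float(crossings / duration_seconds)


# One second of a 5 Hz sine; the small phase offset keeps samples off exact zero
sample_rate = 1000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 5 * t + 0.1)
rate = zcr_sketch(tone, sample_rate)
```

A pure tone crosses zero twice per period, so the rate is roughly twice its frequency.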