# Preprocessing

## Pre-emphasis

`speechpy.processing.preemphasis(signal, shift=1, cof=0.98)`

Apply pre-emphasis to the signal.

Parameters:

- signal (array) – The input signal.
- shift (int) – The shift step.
- cof (float) – The pre-emphasis coefficient. A value of 0 means no filtering.

Returns: The pre-emphasized signal.
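As an illustration of what the parameters control, here is a minimal NumPy sketch of the standard pre-emphasis filter `y[n] = x[n] - cof * x[n - shift]`. The helper name `preemphasis_sketch` is hypothetical, and the circular handling of the first `shift` samples via `np.roll` is an assumption of this sketch rather than a statement about the library's internals.

```python
import numpy as np

def preemphasis_sketch(signal, shift=1, cof=0.98):
    """Hypothetical helper: y[n] = x[n] - cof * x[n - shift]."""
    signal = np.asarray(signal, dtype=float)
    # np.roll wraps around, so the first `shift` outputs are computed from
    # samples at the end of the signal (an assumption of this sketch).
    delayed = np.roll(signal, shift)
    return signal - cof * delayed

x = np.array([1.0, 2.0, 3.0, 4.0])
y = preemphasis_sketch(x, shift=1, cof=0.5)
unfiltered = preemphasis_sketch(x, cof=0.0)  # cof=0 leaves the signal unchanged
```

With `cof=0` the subtracted term vanishes, which matches the note that 0 means no filtering.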

## Stacking

`speechpy.processing.stack_frames(sig, sampling_frequency, frame_length=0.02, frame_stride=0.02, filter=<function <lambda>>, zero_padding=True)`

Frame a signal into overlapping frames.

Parameters:

- sig (array) – The audio signal to frame, of size (N,).
- sampling_frequency (int) – The sampling frequency of the signal.
- frame_length (float) – The length of each frame in seconds.
- frame_stride (float) – The stride between frames.
- filter (array) – The time-domain filter applied to each frame. By default it is all ones, so the frames are left unchanged.
- zero_padding (bool) – If the number of samples is not a multiple of frame_length (expressed in samples per frame), the signal is zero-padded to generate the last frame.

Returns: The stacked frames, an array of size (number_of_frames x frame_len).

Return type: array
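The framing step can be sketched in plain NumPy as follows. This is an illustration under stated assumptions, not the library's implementation: the helper `stack_frames_sketch` and its `window_fn` argument are hypothetical names, the stride is assumed to be given in seconds like the frame length, and the exact frame count the library produces at the signal boundary may differ from this sketch.

```python
import numpy as np

def stack_frames_sketch(sig, sampling_frequency, frame_length=0.02,
                        frame_stride=0.02, window_fn=np.ones,
                        zero_padding=True):
    """Hypothetical helper: split a 1-D signal into (num_frames, frame_len)."""
    sig = np.asarray(sig, dtype=float)
    n = sig.shape[0]
    # Convert durations in seconds to sample counts (assumed units).
    frame_len = int(round(sampling_frequency * frame_length))
    stride = int(round(sampling_frequency * frame_stride))

    if zero_padding:
        # Round up so a partial last frame is kept, then pad it with zeros.
        num_frames = 1 + max(0, int(np.ceil((n - frame_len) / stride)))
        needed = (num_frames - 1) * stride + frame_len
        sig = np.concatenate([sig, np.zeros(max(0, needed - n))])
    else:
        # Round down so samples that do not fill a whole frame are dropped.
        num_frames = max(0, 1 + (n - frame_len) // stride)

    # Row i holds samples [i * stride, i * stride + frame_len).
    idx = stride * np.arange(num_frames)[:, None] + np.arange(frame_len)[None, :]
    # window_fn(frame_len) returns the per-sample weights; np.ones changes nothing.
    return sig[idx] * window_fn(frame_len)

sig = np.arange(9, dtype=float)
frames = stack_frames_sketch(sig, 10, frame_length=0.4, frame_stride=0.2)
trimmed = stack_frames_sketch(sig, 10, frame_length=0.4, frame_stride=0.2,
                              zero_padding=False)
```

Passing `window_fn=np.hamming` would taper each frame instead of leaving it unchanged.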

## FFT Spectrum

`speechpy.processing.fft_spectrum(frames, fft_points=512)`

Compute the one-dimensional n-point discrete Fourier transform (DFT) of a real-valued array using the fast Fourier transform (FFT). See https://docs.scipy.org/doc/numpy/reference/generated/numpy.fft.rfft.html for further details.

Parameters:

- frames (array) – The frame array, in which each row is a frame.
- fft_points (int) – The length of the FFT. If fft_points is greater than frame_len, the frames will be zero-padded.

Returns: The FFT spectrum. If frames is a num_frames x sample_per_frame matrix, the output will be num_frames x FFT_LENGTH.

Return type: array
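The referenced `numpy.fft.rfft` call can be tried directly on a small frame matrix. This sketch only demonstrates NumPy's behaviour: taking the magnitude with `np.abs` is an assumption about what a "spectrum" means here. Note that `numpy.fft.rfft` itself returns the `fft_points // 2 + 1` non-redundant bins of a real-input transform.

```python
import numpy as np

frames = np.array([[1.0, 0.0, 0.0, 0.0],   # unit impulse
                   [1.0, 1.0, 1.0, 1.0]])  # constant (DC only)
fft_points = 8  # longer than the 4-sample frames, so each row is zero-padded

# Transform along the last axis, i.e. one FFT per frame (row).
spectrum = np.fft.rfft(frames, n=fft_points, axis=-1)
magnitude = np.abs(spectrum)
```

The impulse row has a flat magnitude of 1 in every bin, while the constant row concentrates its largest value, 4, in the DC bin.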

## Power Spectrum

`speechpy.processing.power_spectrum(frames, fft_points=512)`

Compute the power spectrum of each frame.

Parameters:

- frames (array) – The frame array, in which each row is a frame.
- fft_points (int) – The length of the FFT. If fft_points is greater than frame_len, the frames will be zero-padded.

Returns: The power spectrum. If frames is a num_frames x sample_per_frame matrix, the output will be num_frames x fft_length.

Return type: array
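One common definition of the per-frame power spectrum is the periodogram, `|X[k]|^2 / fft_points`. The sketch below uses that definition as an assumption. The helper name `power_spectrum_sketch` is hypothetical and the library's scaling may differ.

```python
import numpy as np

def power_spectrum_sketch(frames, fft_points=512):
    """Hypothetical helper: periodogram estimate, |X[k]|^2 / fft_points."""
    # One real-input FFT per row; rows shorter than fft_points are zero-padded.
    magnitude = np.abs(np.fft.rfft(frames, n=fft_points, axis=-1))
    return magnitude ** 2 / fft_points

frames = np.array([[1.0, 0.0, 0.0, 0.0]])  # unit impulse
power = power_spectrum_sketch(frames, fft_points=4)
```

For the unit impulse every bin has magnitude 1, so each power value is `1 / fft_points`.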

## Power Spectrum Log

`speechpy.processing.log_power_spectrum(frames, fft_points=512, normalize=True)`

Compute the log power spectrum of each frame in frames.

Parameters:

- frames (array) – The frame array, in which each row is a frame.
- fft_points (int) – The length of the FFT. If fft_points is greater than frame_len, the frames will be zero-padded.
- normalize (bool) – If normalize=True, the log power spectrum will be normalized.

Returns: The log power spectrum. If frames is a num_frames x sample_per_frame matrix, the output will be num_frames x fft_length.

Return type: array
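The text above does not say how the normalization is done, so the following is only a sketch under explicit assumptions: the log is taken in decibels (`10 * log10`), very small powers are floored to avoid `log(0)`, and normalizing means subtracting the global maximum so the peak sits at 0 dB. The helper name and the `floor` parameter are hypothetical.

```python
import numpy as np

def log_power_spectrum_sketch(power_spec, normalize=True, floor=1e-20):
    """Hypothetical helper: dB-scaled power, optionally peak-normalized."""
    # Clamp tiny or zero powers so the logarithm stays finite (assumed floor).
    power_spec = np.maximum(np.asarray(power_spec, dtype=float), floor)
    log_spec = 10.0 * np.log10(power_spec)
    if normalize:
        # Assumed normalization: shift so the largest value becomes 0 dB.
        log_spec = log_spec - np.max(log_spec)
    return log_spec

power = np.array([[1.0, 10.0, 100.0],
                  [0.0, 1.0, 0.1]])
log_spec = log_power_spectrum_sketch(power)
raw_log_spec = log_power_spectrum_sketch(power, normalize=False)
```

Under this assumption the normalized output is never positive, and the unnormalized output differs from it only by a constant offset.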

## Derivative Extraction

`speechpy.processing.derivative_extraction(feat, DeltaWindows)`

This function extracts the derivative features.

Parameters:

- feat (array) – The main feature vector. To obtain the second-order derivative, pass the first-order derivative.
- DeltaWindows (int) – The value of DeltaWindows is set using the configuration parameter DELTAWINDOW.

Returns: The derivative feature vector, a NUMFRAMES x NUMFEATURES numpy array of derivative features.

Return type: array
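The DELTAWINDOW name comes from HTK, whose delta coefficients use the regression formula `d[t] = sum_k k * (c[t+k] - c[t-k]) / (2 * sum_k k^2)` for `k = 1..window`. The sketch below implements that formula as an assumption about what "derivative features" means here. The helper `delta_sketch` is hypothetical, and repeating the edge frames for padding is a choice of this sketch.

```python
import numpy as np

def delta_sketch(feat, delta_window=2):
    """Hypothetical helper: HTK-style regression deltas along the frame axis."""
    feat = np.asarray(feat, dtype=float)
    num_frames = feat.shape[0]
    # Repeat the first and last frames so every frame has full context.
    padded = np.pad(feat, ((delta_window, delta_window), (0, 0)), mode="edge")
    denom = 2.0 * sum(k ** 2 for k in range(1, delta_window + 1))
    delta = np.zeros_like(feat)
    for k in range(1, delta_window + 1):
        # Frames k steps ahead of and behind each original frame.
        ahead = padded[delta_window + k: delta_window + k + num_frames]
        behind = padded[delta_window - k: delta_window - k + num_frames]
        delta += k * (ahead - behind)
    return delta / denom

t = np.arange(6, dtype=float)
feat = np.stack([t, 2.0 * t], axis=1)   # two features with slopes 1 and 2
delta = delta_sketch(feat, delta_window=2)
delta_delta = delta_sketch(delta, delta_window=2)  # second-order derivative
```

Away from the edges, the delta of a linear ramp equals its slope. Feeding the first-order output back in gives the second-order derivative, as the description of `feat` notes.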