Preprocess Audio

The audio preprocessing module allows you to categorize audio and analyze its quality before you use it in tasks. This module has options to perform clipping detection, Signal-to-Noise Ratio (SNR) calculation, and Dual-Tone Multi-Frequency (DTMF) dial tone identification.
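As a rough illustration of what the three quality checks measure, the following Python sketch computes a clipping ratio, an SNR figure, and a DTMF key guess on raw samples. It is not the HPE IDOL Speech Server implementation, which is DNN-based and works on normalized feature vectors. The function names, the tolerance value, and the assumption that separate signal and noise segments are already available are all hypothetical choices made for this example; DTMF detection here uses the Goertzel algorithm with the standard DTMF row and column frequencies.

```python
import math

def clipping_ratio(samples, full_scale=1.0, tolerance=1e-3):
    """Return the fraction of samples at or very near full scale.

    Assumes samples are floats normalized to [-full_scale, full_scale].
    """
    if not samples:
        return 0.0
    limit = full_scale - tolerance
    clipped = sum(1 for s in samples if abs(s) >= limit)
    return clipped / len(samples)

def snr_db(signal, noise):
    """Return the SNR in dB as the ratio of mean powers.

    Assumes the caller has already separated a signal segment from a
    noise-only segment; both must be non-empty and the noise non-zero.
    """
    p_signal = sum(s * s for s in signal) / len(signal)
    p_noise = sum(n * n for n in noise) / len(noise)
    return 10.0 * math.log10(p_signal / p_noise)

def goertzel_power(samples, sample_rate, freq):
    """Return the signal power at a single frequency (Goertzel algorithm)."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq / sample_rate)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev * s_prev + s_prev2 * s_prev2 - coeff * s_prev * s_prev2

# Standard DTMF keypad layout: one row tone plus one column tone per key.
DTMF_ROWS = (697, 770, 852, 941)
DTMF_COLS = (1209, 1336, 1477, 1633)
DTMF_KEYS = ("123A", "456B", "789C", "*0#D")

def detect_dtmf_key(samples, sample_rate):
    """Return the key whose row and column tones carry the most power.

    This sketch always returns a key; it does not decide whether a DTMF
    tone is present at all.
    """
    row = max(range(4), key=lambda i: goertzel_power(samples, sample_rate, DTMF_ROWS[i]))
    col = max(range(4), key=lambda i: goertzel_power(samples, sample_rate, DTMF_COLS[i]))
    return DTMF_KEYS[row][col]
```

For example, calling detect_dtmf_key on a short buffer that mixes a 770 Hz tone and a 1336 Hz tone sampled at 8000 Hz selects the key in the second row and second column of the keypad. A production detector would also need energy thresholds to reject speech and silence, which this sketch leaves out.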

Audio Preprocessor Modes

Audio Categorization

Audio Quality Assessment

Clipping Detection

Signal-to-Noise Ratio Calculation

DTMF Identification

Configure Audio Preprocessing Tasks

Run an Audio Preprocessing Task

Example Results

Audio Categorization Mode

Clipping Detection Mode

SNR Calculation Mode

DTMF Mode

Run Audio Analysis

Note: The 11.2 release of HPE IDOL Speech Server uses an implementation of audio preprocessing based on deep neural network (DNN) technology, which means that you do not need to tailor thresholds to specific audio types. The new implementation takes normalized feature vectors as input rather than audio samples, which requires updates to the task schemas.

For tasks that combine audio preprocessing with speech-to-text, you must include separate frontend and normalizer calls for audio preprocessing and for speech-to-text, because the form of the frontend feature vectors needed for the two tasks might be different. The standard HPE IDOL Speech Server tasks configuration file (speechserver-tasks.cfg) includes several examples.

For more information on working with the new algorithm, see the HPE IDOL Speech Server Reference. All tasks in the speechserver-tasks.cfg file use the new algorithm. The old algorithm is still supported for backwards compatibility, and you can use it in exactly the same way as before.

