The following schema describes how to use the audiopreproc
module to identify regions of silence, speech, and non-speech in an audio file.
[AudioNet]
0 = a, ts ← audio(MONO, input)
1 = f1 ← frontend(_, a)
2 = nf1 ← normalizer(_, f1)
3 = w ← audiopreproc(A, a, nf1)
4 = output ← wout(_, w, ts)
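To make the dataflow concrete, here is a minimal runnable Python sketch of the same graph. The function names mirror the schema's module names, but their bodies are placeholder stubs invented for illustration; none of this is the actual audiopreproc implementation or API.

```python
from typing import List, Tuple

Samples = List[float]
Frames = List[List[float]]

def audio(mode: str, path: str) -> Tuple[Samples, List[float]]:
    # 0: decode the input into mono samples plus timestamps (stub data).
    samples = [0.0, 0.01, 0.4, 0.3, 0.02, 0.0]
    ts = [i / 16000.0 for i in range(len(samples))]
    return samples, ts

def frontend(_, a: Samples) -> Frames:
    # 1: slice the samples into fixed-size front-end frames (stub: 2 per frame).
    return [a[i:i + 2] for i in range(0, len(a), 2)]

def normalizer(_, f1: Frames) -> Frames:
    # 2: peak-normalize each frame (stub).
    return [[x / (max(map(abs, fr)) or 1.0) for x in fr] for fr in f1]

def audiopreproc(mode: str, a: Samples, nf1: Frames) -> List[str]:
    # 3: label one region per frame; a toy energy threshold on the raw
    # samples stands in for the real audio classification mode "A".
    raw = [a[i:i + 2] for i in range(0, len(a), 2)]
    return ["silence" if max(map(abs, fr)) < 0.05 else "speech" for fr in raw]

def wout(_, w: List[str], ts: List[float]) -> None:
    # 4: write each frame's label with its start timestamp (stub: stdout).
    for label, t in zip(w, ts[::2]):
        print(f"{t:.4f}\t{label}")

a, ts = audio("MONO", "input.wav")
f1 = frontend(None, a)
nf1 = normalizer(None, f1)
w = audiopreproc("A", a, nf1)
wout(None, w, ts)
```

The numbered step in each comment matches the corresponding line of the schema, and each function's output feeds the next exactly as the schema's dataflow arrows indicate.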
Step | Description
0 | The audio module processes the mono audio.
1 | The frontend module converts the audio data from step 0 into front-end frame data.
2 | The normalizer module normalizes the frame data from step 1.
3 | The audiopreproc module, in audio classification mode (A), processes the audio (a) and the normalized frame data (nf1).
4 | The wout module writes the audio classification information (w) to the output file.
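Downstream consumers usually want contiguous regions of silence, speech, and non-speech rather than per-frame labels. The sketch below collapses a sequence of frame labels into (start, end, label) regions; the frames_to_regions helper and the 10 ms frame step are assumptions for illustration, not part of the audiopreproc output format.

```python
from itertools import groupby
from typing import List, Tuple

def frames_to_regions(labels: List[str],
                      frame_dur: float = 0.010) -> List[Tuple[float, float, str]]:
    # Merge runs of identical per-frame labels into (start, end, label) regions.
    # frame_dur is an assumed 10 ms frame step, not a documented value.
    regions, t = [], 0.0
    for label, run in groupby(labels):
        end = t + sum(1 for _ in run) * frame_dur
        regions.append((t, end, label))
        t = end
    return regions

print(frames_to_regions(
    ["silence", "silence", "speech", "speech", "non-speech"]))
# -> [(0.0, 0.02, 'silence'), (0.02, 0.04, 'speech'), (0.04, 0.05, 'non-speech')]
```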