Speech-to-Text with Word Filtering

The following schema describes a speech-to-text operation followed by a postprocessing step that replaces any inappropriate words in the results with a specified term (by default, “<BLEEP>”).

[WavToTextFilter]
0 = a ← wav(MONO, input)
1 = f ← frontend(_, a)
2 = nf ← normalizer(_, f)
3 = w1 ← stt(_, nf)
4 = w2 ← postproc(B, w1)
5 = output ← wout(_, w2)
DefaultResults = out
0 The wav module processes the mono audio.
1 The frontend module converts the audio data into speech front-end frame data.
2 The normalizer module normalizes the frame data.
3 The stt module converts the normalized frame data into text.
4 The postproc module replaces barred words in the text with a specified term.
5 The wout module writes the filtered words from step 4 to a file.
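
To make the filtering in step 4 concrete, the following Python sketch shows one way a barred-word replacement could behave. It is an illustration only, not the postproc module's actual implementation: the function name filter_words, the barred-word set, and the case-insensitive matching are all assumptions made for this example.

```python
def filter_words(words, barred, replacement="<BLEEP>"):
    """Return a copy of `words` with every barred word replaced.

    Hypothetical helper for illustration; matching is case-insensitive
    and applies to whole words only (an assumption, not documented behavior).
    """
    barred_lower = {b.lower() for b in barred}
    return [replacement if w.lower() in barred_lower else w for w in words]


# Example: a recognized word sequence and a made-up barred-word list.
transcript = ["well", "darn", "that", "was", "Darn", "loud"]
filtered = filter_words(transcript, barred={"darn"})
print(" ".join(filtered))
```

Passing a different value for replacement mirrors the idea of a configurable replacement term; how the real module is configured is not covered by this sketch.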
