The postproc module filters and modifies results produced by audio processing tasks such as speech-to-text and language identification.
The postproc module has the following modes of operation. Some modes can also be combined to perform multiple processes in a single operation.
Mode | Input | Output | Description |
---|---|---|---|
B | w | w | Accepts a word data stream, and replaces all barred words either with a fixed term (such as <BLEEP>), or a term specific to an individual word. The barred word list is provided as a text file containing one word or a pair of words (the barred word, and the replacement term for that word) on each line. |
R | w | w | Accepts a data stream, and recombines any word fragments to form complete words. Writes the resulting word sequence to an output word stream. |
P | w1 | w2 | Include simple sentence-forming punctuation (for example, full stops and initial capital letters) in the output. HPE recommends that you do not use the output as the input for other modules; this option is designed purely for display purposes, and to produce more human-readable output. This mode uses periods of silence in a sequence of words to break it up into sentences with added punctuation, but you must remove any periods of silence manually from the punctuated string. Note: You can use this mode in conjunction with word barring, but for use in conjunction with any other mode, you must call … Caution: This mode is designed for use with Latin languages, and therefore is not recommended for use with non-Latin languages. |
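The idea behind P mode can be illustrated with a minimal Python sketch. This is not the postproc implementation: the function name `punctuate` and the `<SIL>` silence token are hypothetical, chosen only for illustration, and the real format of silence entries in a word stream might differ. Unlike the module as described above, this sketch also drops the silence tokens itself rather than leaving them for manual removal.

```python
def punctuate(tokens, silence="<SIL>"):
    """Split a token sequence into sentences at silence markers.

    Assumption: silence is represented by a single marker token
    (hypothetical; not taken from the postproc documentation).
    """
    sentences = []
    current = []
    for tok in tokens:
        if tok == silence:
            # A period of silence closes the current sentence, if any.
            if current:
                sentences.append(current)
                current = []
        else:
            current.append(tok)
    if current:
        sentences.append(current)

    formatted = []
    for words in sentences:
        # Capitalize only the first letter of the first word.
        first = words[0][:1].upper() + words[0][1:]
        formatted.append(" ".join([first] + words[1:]) + ".")
    return " ".join(formatted)


text = punctuate(["hello", "there", "<SIL>", "how", "are", "you"])
print(text)  # Hello there. How are you.
```

The sketch treats consecutive silences as a single break, so it never emits empty sentences.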
Examples:
w2 ← postproc (B,w1)
w2 ← postproc (R,w1)
w2 ← postproc (BR,w1)
w2 ← postproc (P,w1)
Note: In B mode, you must set at least one of the BarredList and BarredTerm parameters.
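As a rough illustration of how a barred word list and a fixed term could interact, the following Python sketch parses list lines in the format described for B mode (one word, or a barred word followed by its replacement) and applies them to a word sequence. The function names `load_barred_list` and `bar_words` are hypothetical, and the whitespace separator and case-insensitive matching are assumptions for this example, not documented postproc behavior.

```python
def load_barred_list(lines):
    """Parse barred-list lines into {barred_word: replacement or None}.

    Assumption: the barred word and its optional replacement are
    separated by whitespace (not confirmed by the documentation).
    """
    barred = {}
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        replacement = parts[1] if len(parts) > 1 else None
        barred[parts[0].lower()] = replacement
    return barred


def bar_words(words, barred, fixed_term="<BLEEP>"):
    """Replace each barred word with its own replacement term,
    or with the fixed term when the list gives no replacement."""
    out = []
    for word in words:
        key = word.lower()  # assumption: matching ignores case
        if key in barred:
            out.append(barred[key] or fixed_term)
        else:
            out.append(word)
    return out


barred = load_barred_list(["darn", "heck gosh"])
result = bar_words(["oh", "heck", "that", "darn", "cat"], barred)
print(" ".join(result))  # oh gosh that <BLEEP> cat
```

Here `fixed_term` plays the role of a single replacement for every barred word, while the second column of the list supplies word-specific terms.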
Parameters:
BarredList
BarredTerm
NonSentFinalWords
RcmpAllowSuffix
RcmpValidList