The StreamToTextMusicFilter task takes an audio stream and returns a file containing a text version of the audio. Sections of the audio that are determined to be music are filtered out and are not included in the transcript.
Parameter | Description | Required |
---|---|---|
Type | The task type. Set to StreamToTextMusicFilter. | Yes |
AppDnnBase | The location of the appResources directory, which contains the DNN and .ian files to use. | |
ClassWordFile | The path to a list of new words and weightings to add to the language model at load time. | |
Conf | Whether to generate word confidence scores. | |
CustomLm | The custom language model to use. | |
Diag | Whether to generate diagnostic information. | |
DiagFile | The alignment diagnostics file to generate. | |
DnnFile | The DNN file to use. | |
DnnScale | The DNN output acoustic score scaling factor. | |
ForceRecompoundOff | Whether to prevent recompounding. | |
ForceRecompoundOn | Whether to force recompounding. | |
FrameDupl | An integer value which allows for greater time efficiency with only a minimal loss of recognition accuracy. | |
FrameDupl | The balance between performance and speed for audio preprocessing DNN classification. | |
Lang | The language pack to use. | |
LatFile | The name of the lattice file that contains word hypotheses. | |
LatScale | The depth of the lattice. | |
LatWinSize | The size (in seconds) of the lattice output window. | |
LatWordFile | A list of words to find. | |
Out | The file to write the transcript to. | |
PronFile | A file to use to either replace or add alternative pronunciations of words at language load time. | |
Punctuation | Whether to add punctuation to the word data. | |
SilThresh | The threshold between what the task identifies as silence and non-silence. | |
SpeechBias | Whether to bias towards speech (rather than music, noise, or silence) in the identification of audio segments. | |
SpeedBiasLevel | The balance between speed and accuracy in the decoder. | |
SpeechThresh | The threshold between speech and non-speech (music or noise). | |
WordBar | Switches on word barring. | |
WordBarList | The location of a list of words to be barred. | |
http://localhost:13000/action=AddTask&Type=StreamToTextMusicFilter&SpeechThresh=-30&Out=Transcript1.ctm
This action uses port 13000 to instruct HPE IDOL Speech Server, which is located on the local machine, to transcribe the audio stream (using the specified threshold value to determine sections that contain music), and to write the results to the Transcript1.ctm file, with any sections determined to be music filtered out.
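When the action is sent from a script rather than typed by hand, the URL can be assembled from the task parameters. The following is a minimal sketch in Python, assuming the server accepts the same URL form as the example action; `build_add_task_url` is a hypothetical helper written for illustration and is not part of any IDOL client library.

```python
from urllib.parse import urlencode


def build_add_task_url(host, port, **params):
    """Build an AddTask action URL in the same form as the example action.

    Hypothetical helper for illustration only; the parameter names passed
    in are taken from the parameter table for this task.
    """
    # Keyword arguments keep their call order, so the query string lists
    # the parameters in the order they were supplied.
    query = urlencode({"action": "AddTask", **params})
    return f"http://{host}:{port}/{query}"


# Reproduce the example action: local server, port 13000.
url = build_add_task_url(
    "localhost",
    13000,
    Type="StreamToTextMusicFilter",
    SpeechThresh=-30,
    Out="Transcript1.ctm",
)
print(url)
```

The sketch only builds the URL and does not contact the server. Note that `urlencode` percent-encodes reserved characters in parameter values, such as path separators in a file name.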