StreamToTextMusicFilter

DEPRECATED:

The StreamToTextMusicFilter task is deprecated in IDOL Speech Server version 11.5 and later. Use the SpeechToTextFilter task instead.

This task is still available for existing implementations, but it might be incompatible with new functionality. The task might be removed in a future version.

The StreamToTextMusicFilter task takes an audio stream and returns a file containing a transcript of the audio. Sections of the audio that are identified as music are filtered out and excluded from the transcript.

Parameters

Parameter | Description | Required
Type | The task type. Set to StreamToTextMusicFilter. | Yes
AppDnnBase | The location of the appResources directory, which contains the DNN and .ian files to use. |
AppFrameDupl | The balance between performance and speed for audio preprocessing DNN classification. |
ClassWordFile | The path to a list of new words and weightings to add to the language model at load time. |
Conf | Whether to generate word confidence scores. |
CustomLm | The custom language model to use. |
Diag | Whether to generate diagnostic information. |
DiagFile | The alignment diagnostics file to generate. |
DnnFile | The DNN file to use. |
DnnScale | The DNN output acoustic score scaling factor. |
ForceRecompoundOff | Whether to prevent recompounding. |
ForceRecompoundOn | Whether to force recompounding. |
FrameDupl | An integer value that allows for greater time efficiency with only a minimal loss of recognition accuracy. |
Lang | The language pack to use. |
LatFile | The name of the lattice file that contains word hypotheses. |
LatScale | The depth of the lattice. |
LatWinSize | The size (in seconds) of the lattice output window. |
LatWordFile | A list of words to find. |
Out | The file to write the transcription results to. |
PronFile | A file to use to either replace or add alternative pronunciations of words at language load time. |
Punctuation | Whether to add punctuation to the word data. |
SilThresh | The threshold between what the task identifies as silence and non-silence. |
SpeechBias | Whether to bias towards speech (rather than music, noise, or silence) in the identification of audio segments. |
SpeedBiasLevel | The balance between speed and accuracy in the decoder. |
SpeechThresh | The threshold between speech and non-speech (music or noise). |
WordBar | Switches on word barring. |
WordBarList | The location of a list of words to be barred. |

Example

http://localhost:15000/action=AddTask&Type=StreamToTextMusicFilter&SpeechThresh=-30&Out=Transcript1.ctm

This action transcribes the audio stream, using the SpeechThresh value of -30 as the threshold between speech and non-speech (music or noise), and writes the results to the Transcript1.ctm file, with any sections determined to be music filtered out.
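
The following is a minimal Python sketch of how such an action URL could be assembled from a set of task parameters. The helper name build_add_task_url is hypothetical and is not part of IDOL Speech Server; the host, port, and parameter values are taken from the example above. The sketch only builds the URL string and does not contact a server.

```python
from urllib.parse import urlencode


def build_add_task_url(host, port, task_type, **params):
    """Return an AddTask action URL in the same shape as the example above.

    Hypothetical helper for illustration only; it is not an IDOL API.
    """
    # Type comes first, followed by the other parameters in the order given.
    query = urlencode({"Type": task_type, **params})
    return f"http://{host}:{port}/action=AddTask&{query}"


# Values copied from the documented example.
url = build_add_task_url(
    "localhost",
    15000,
    "StreamToTextMusicFilter",
    SpeechThresh=-30,
    Out="Transcript1.ctm",
)
print(url)
```

With the values shown, the sketch reproduces the example URL. Building the query with urlencode also escapes any parameter values (for example, file paths containing spaces) that would otherwise be invalid in a URL.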

