LangIdBndWav

The LangIdBndWav task reads in data from an audio file, converts it into language identification features, and determines boundaries where the language changes. The task returns the language identification results between boundaries.

Parameters

Parameter | Description | Required
Type | The task name. Set to LangIdBndWav. | Yes
AppDnnBase | The location of the appResources directory, which contains the DNN and .ian files to use. |
Beam | The beam width of the search process. |
ClassList | A list of language classifiers to use. |
ClassPath | The path to the directory containing the language classifiers. |
DiagFile | The file to write the diagnostic information to. |
DiagLevel | The level of detail to include in the diagnostic information. |
DnnFile | The Deep Neural Network acoustic modeling file to use. |
File | The audio file to process. | Yes
FrameDupl | The balance between performance and speed for audio preprocessing DNN classification. |
Lang | The name of the language pack to use. | Yes
LangList | A subset of languages to use from the classifier list file. |
MinPhoneRate | The minimum phone rate (phones per second). |
NBest | The maximum number of language candidates to include in the output file. |
Out | The file to write language identification results to. | Yes
OutB | The file to write boundary point information to. | Yes
SegSize | The maximum results segment size. |
SegSmoothWin | The size of the smoothing window. |
SegStep | The step size in phones of the analysis window. |
SilThresh | The threshold between what the task identifies as silence and non-silence. |
SpeechThresh | The threshold between speech and non-speech (music or noise). |
SugdInputChannels | The channel layout of the input media file. |
SugdInputFrequency | The sampling rate of the input media file. |

NOTE:

The ClassList parameter is required only if you want to change the audio sample rate, or if you want to use your own custom classifiers. You might also need to specify the ClassPath parameter, depending on the location of the classifier files.

Example

http://localhost:13000/action=AddTask&Type=LangIdBndWav&File=C:\Data\Speech.wav&ClassList=ListManager/OptClassSet&ClassPath=C:\LangID\&Out=SpeechLang1.ctm&OutB=SpeechBnd.ctm

This action uses port 13000 to instruct HPE IDOL Speech Server, which is located on the local machine, to identify languages and language boundaries in the Speech.wav file using the language classifiers specified in the OptClassSet list. The action instructs HPE IDOL Speech Server to write the identification results to the SpeechLang1.ctm file and the boundary information to the SpeechBnd.ctm file.
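
The action URL above can also be assembled programmatically, which avoids mistakes when file paths contain backslashes or other special characters. The following is a minimal Python sketch, not an official client: the helper name build_add_task_url is hypothetical, and it assumes that the server accepts standard percent-encoded parameter values. It only builds the URL string and does not send the request.

```python
from urllib.parse import urlencode


def build_add_task_url(host, port, params):
    """Build an AddTask action URL for the LangIdBndWav task.

    Hypothetical helper: it mirrors the shape of the documented example
    (http://host:port/action=AddTask&Type=...&File=...).
    """
    # Type comes first, as in the documented example; dicts keep insertion order.
    query = urlencode({"Type": "LangIdBndWav", **params})
    return f"http://{host}:{port}/action=AddTask&{query}"


# Same parameter values as the documented example.
url = build_add_task_url("localhost", 13000, {
    "File": r"C:\Data\Speech.wav",
    "ClassList": "ListManager/OptClassSet",
    "ClassPath": "C:\\LangID\\",
    "Out": "SpeechLang1.ctm",
    "OutB": "SpeechBnd.ctm",
})
print(url)
```

The resulting string differs from the example only in that characters such as the colon, backslash, and forward slash in parameter values are percent-encoded.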

