The LangId task receives audio data from a file or binary stream and converts it into language identification features to identify the languages.
You can run this task in three modes: segmented, cumulative, and boundary.
Parameter | Description | Required |
---|---|---|
Type | The task name. Set to LangId. | Yes |
AppFrameDupl | The balance between performance and speed for audio preprocessing DNN classification. | |
Beam | The beam width of the search process. | |
ClassList | A list of language classifiers to use. | |
ClassPath | The path to the directory containing the language classifiers. | |
ClosedSet | Whether to use closed set or open set language identification. | |
DiagFile | The file to write the diagnostic information to. | |
DiagLevel | The level of detail to include in the diagnostic information. | |
DiscardShort | The minimum segment length to allow (when LidMode is Boundary). | |
EndTime | The end of an audio section to process. | |
File | The audio file to process. | Yes, if InputType is File. |
FrameDupl | An integer value which allows for greater time efficiency with only a minimal loss of recognition accuracy. | |
InputType | The type of audio to process (file, binary data, or stream). | |
Lang | The name of the language pack to use. | |
LangList | A subset of languages to use from the classifier list file. | |
LidMode | The mode to use for language identification. | |
MaxPhoneLen | The maximum phoneme length to allow (in seconds). | |
MinPhoneRate | The minimum phoneme rate (phonemes per second). | |
MinPhones | The minimum number of phonemes per segment. | |
NBest | The maximum number of language candidates to include in the output file. | |
Out | The file to write language identification results to. | |
ScoreMode | The scoring method to use for speaker identification. | |
SegSize | The maximum results segment size. | |
StartTime | The beginning of an audio section to process. | |
SugdInputChannels | The channel layout of the input media file. This parameter does not apply when InputType is Stream. | |
SugdInputFrequency | The sampling rate of the input media file. This parameter does not apply when InputType is Stream. | |

The ClassList parameter is required only if you want to change the audio sample rate, or if you want to use your own custom classifiers. You might also need to specify the ClassPath parameter, depending on the location of the classifier files.
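The conditional rules stated in the parameter table (File is required when InputType is File, DiscardShort applies only in boundary mode, and the SugdInput parameters do not apply to streams) can be checked before you send an action. The following is a minimal sketch only: the function name `check_langid_params` is hypothetical and is not part of IDOL Speech Server, and the case-insensitive matching of names and values is an assumption made for convenience on the client side.

```python
def check_langid_params(params):
    """Return a list of problems found in a dict of LangId task parameters."""
    problems = []
    # Assumption: compare names and values case-insensitively on the client side.
    lowered = {name.lower(): str(value) for name, value in params.items()}
    input_type = lowered.get("inputtype", "").lower()
    lid_mode = lowered.get("lidmode", "").lower()

    # Type is the only unconditionally required parameter in the table.
    if lowered.get("type", "").lower() != "langid":
        problems.append("Type must be set to LangId")
    # File is required when InputType is File.
    if input_type == "file" and not lowered.get("file"):
        problems.append("File is required when InputType is File")
    # DiscardShort is documented only for boundary mode.
    if "discardshort" in lowered and lid_mode != "boundary":
        problems.append("DiscardShort applies only when LidMode is Boundary")
    # The SugdInput parameters do not apply to stream input.
    if input_type == "stream":
        for name in ("SugdInputChannels", "SugdInputFrequency"):
            if name.lower() in lowered:
                problems.append(f"{name} does not apply when InputType is Stream")
    return problems


print(check_langid_params({"Type": "LangId", "InputType": "File"}))
# prints ['File is required when InputType is File']
```

The sketch checks only the rules that the table states explicitly. It does not validate parameter values or defaults, which the table does not describe.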
Segmented Mode
http://localhost:13000/action=AddTask&Type=LangId&LidMode=segmented&InputType=Stream&Out=SpeechLang6.ctm
This action uses port 13000 to instruct IDOL Speech Server, which is located on the local machine, to identify languages in the audio stream using the default language classifiers, and to write the identification results to the SpeechLang6.ctm file.
http://localhost:13000/action=AddTask&Type=LangId&LidMode=segmented&File=C:\Data\Speech.wav
This action uses port 13000 to instruct IDOL Speech Server, which is located on the local machine, to identify the language in the Speech.wav file using the default language classifiers.
Cumulative Mode
http://localhost:13000/action=AddTask&Type=LangId&LidMode=cumulative&InputType=Stream&Out=SpeechLang3.ctm
This action uses port 13000 to instruct IDOL Speech Server, which is located on the local machine, to identify the language in the audio stream using the default language classifiers. The action instructs IDOL Speech Server to write the identification results to the SpeechLang3.ctm file.
http://localhost:13000/action=AddTask&Type=LangId&LidMode=cumulative&File=C:\Data\Speech.wav&Out=SpeechLang4.ctm
This action uses port 13000 to instruct IDOL Speech Server, which is located on the local machine, to identify the language in the Speech.wav file using the default language classifiers. The action instructs IDOL Speech Server to write the identification results to the SpeechLang4.ctm file.
Boundary Mode
http://localhost:13000/action=AddTask&Type=LangId&LidMode=boundary&InputType=Stream&ClassList=ListManager/OptClassSet&ClassPath=C:\LangID\&Out=SpeechLang1.ctm
This action uses port 13000 to instruct IDOL Speech Server, which is located on the local machine, to identify languages and language boundaries in the audio stream using the language classifiers specified in the OptClassSet list. The action instructs IDOL Speech Server to write the identification results to the SpeechLang1.ctm file.
http://localhost:13000/action=AddTask&Type=LangId&LidMode=boundary&File=C:\Data\Speech.wav
This action uses port 13000 to instruct IDOL Speech Server, which is located on the local machine, to identify languages and language boundaries in the Speech.wav file using the default language classifiers.
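The example actions above all follow the same pattern: a host and port, then action=AddTask, then a list of Name=Value parameters joined with ampersands. The following sketch shows one way to assemble such an action string from a dictionary. The helper name `build_addtask_url` is hypothetical. The sketch percent-encodes parameter values so that characters such as the backslash and colon in a Windows path are transmitted safely; this is standard URL practice rather than something the examples above show, so treat it as an assumption. The code only builds the string and does not contact a server.

```python
from urllib.parse import quote


def build_addtask_url(params, host="localhost", port=13000):
    """Build an AddTask action URL in the /action=AddTask&Name=Value form."""
    pairs = [("action", "AddTask")] + list(params.items())
    # Percent-encode each value; safe="" also encodes "/" inside values.
    query = "&".join(
        f"{name}={quote(str(value), safe='')}" for name, value in pairs
    )
    return f"http://{host}:{port}/{query}"


# Rebuild the boundary mode file example shown above.
url = build_addtask_url({
    "Type": "LangId",
    "LidMode": "boundary",
    "File": r"C:\Data\Speech.wav",
})
print(url)
# prints http://localhost:13000/action=AddTask&Type=LangId&LidMode=boundary&File=C%3A%5CData%5CSpeech.wav
```

Keeping the parameters in a dictionary makes it easy to switch between the three LidMode values or to add an Out file without editing the URL by hand.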