Follow the steps to convert audio files into language identification feature files.
To convert an audio file into a language identification feature file
Send an AddTask action to IDOL Speech Server, and set the following parameters:
Type | The task name. Set to LangIdFeature.
File | The audio file to process.
Out | The name of the feature file to create.
For example:
http://localhost:15000/action=AddTask&Type=LangIdFeature&File=C:\Data\FrenchSpeech.wav&Out=frenchSpeech.lif
This action creates the frenchSpeech.lif file from the FrenchSpeech.wav file.
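As a minimal sketch, the same action URL can be assembled in a script. The helper name build_add_task_url is hypothetical, and the host and port are copied from the example above. The sketch percent-encodes the parameter values, on the assumption that the server accepts encoded values in the same way as the literal path shown in the example.

```python
from urllib.parse import urlencode


def build_add_task_url(host, port, params):
    """Return an AddTask action URL (hypothetical helper, not part of the product)."""
    # urlencode percent-encodes characters such as ':' and '\' in Windows paths.
    query = urlencode(params)
    return f"http://{host}:{port}/action=AddTask&{query}"


url = build_add_task_url(
    "localhost",
    15000,
    {
        "Type": "LangIdFeature",             # the task name
        "File": r"C:\Data\FrenchSpeech.wav",  # the audio file to process
        "Out": "frenchSpeech.lif",           # the feature file to create
    },
)
print(url)
```

The sketch only builds the URL. Sending it, and handling the response, is left to whichever HTTP client you already use.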
This action returns a token. You can use the token to:
In the 10.11 release of IDOL Speech Server, language identification feature generation is based on a DNN decode, similar to that used in speech-to-text. This feature provides greater accuracy than decodes that use the previous acoustic model technology, but at the expense of slower processing speed.
Depending on your hardware specification, you might want to switch to using acoustic models instead of DNN decodes.
This process also applies if you are running language identification from a wav file or an audio stream, which generates a language ID feature stream as part of the process.
To run language identification without DNNs
Set DNNFile=None in the action to disable DNN usage.
Specify the language classifiers. If you are using the base language classifiers in the 10.11 release, you must specify the non-DNN classifier list. To do this, change entries in the configuration file such as:
ClList = $params.ClassList = Classifiers-3.0-16k/classifiers.txt
to:
ClList = $params.ClassList = Classifiers-3.0-16k/classifiers_noDNN.txt
If you disable the DNN when you process the test data, but you use the standard classifiers (or those trained based on DNN-generated features), the features do not match and the accuracy suffers. The same applies in the reverse situation (that is, if you use the non-DNN classifiers with DNN decoding enabled). Because of this, you must be consistent when you generate language identification features for the training, development, and evaluation stages.
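One way to keep the two settings consistent is to derive both from a single switch. The following is a sketch only: the function name lang_id_settings and the use_dnn flag are hypothetical, and the classifier list paths are the ones shown in the configuration example above.

```python
# Classifier list paths taken from the configuration example above.
DNN_CLASSIFIERS = "Classifiers-3.0-16k/classifiers.txt"
NON_DNN_CLASSIFIERS = "Classifiers-3.0-16k/classifiers_noDNN.txt"


def lang_id_settings(use_dnn):
    """Return (extra action parameters, classifier list) that belong together.

    Hypothetical helper: it pairs DNNFile=None with the non-DNN classifier
    list so that the two settings cannot drift apart.
    """
    if use_dnn:
        # Default behavior: DNN decode with the standard classifiers.
        return {}, DNN_CLASSIFIERS
    # DNN disabled: the action needs DNNFile=None and the non-DNN classifiers.
    return {"DNNFile": "None"}, NON_DNN_CLASSIFIERS


extra_params, class_list = lang_id_settings(use_dnn=False)
print(extra_params, class_list)
```

Using the same switch for the training, development, and evaluation stages helps ensure that the features and the classifiers always match.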