Overview

HPE IDOL Speech Server can identify the languages that are being spoken in a section of audio.

You can use the language ID base classifier packs (available for both 8 kHz and 16 kHz audio) to perform language identification tasks straight out of the box. You can also create your own language classifiers, either to identify languages that are not covered by the pack, or to produce classifiers that are more closely matched to the properties of your target audio than the default classifiers.

You can train language classifiers for any spoken language, regardless of whether that language is supported for speech-to-text. You can train classifiers with audio data only; transcriptions are not required.

Note: To use the installed language classifier set for telephony, you must change the language ID base pack to SYLS-tel. You must also set the class list to the classifier list file that is stored in the 8k classifiers directory, for example: Classifiers-3.1-8k/classifiers.txt.

Note: If you know the languages that are likely to occur in your audio files, HPE recommends that you restrict the language classifier set to include only the classifiers for those languages. To do this, set the LangList parameter when you submit a language identification action.
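
As an illustration of restricting the classifier set, the following minimal Python sketch builds a request URL that carries a LangList value. Only the LangList parameter name comes from this topic. The host, port, action name (AddTask), task type (LangIdCumLif), File parameter, comma separator, and language names are assumptions made for the example; check the reference for your Speech Server version before you rely on any of them.

```python
from urllib.parse import urlencode


def build_langid_request(host, port, audio_file, languages,
                         task_type="LangIdCumLif"):
    """Return a request URL that restricts language ID to `languages`.

    Assumptions (not confirmed by this topic): the action name, the
    Type and File parameters, and the comma-separated LangList format.
    """
    if not languages:
        raise ValueError("Provide at least one language to restrict to.")

    params = {
        "action": "AddTask",                 # assumed action name
        "Type": task_type,                   # assumed task type
        "File": audio_file,                  # assumed parameter name
        "LangList": ",".join(languages),     # assumed separator
    }
    # urlencode percent-encodes reserved characters such as commas.
    return f"http://{host}:{port}/?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical host, port, file, and language names.
    print(build_langid_request("localhost", 13000, "sample.wav",
                               ["English", "French", "German"]))
```

Building the list from a Python sequence keeps the set of expected languages in one place, so it is straightforward to narrow or widen the set between runs.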

For best performance, HPE recommends that you train a set of classifiers based on example data that is representative of the data that you want to analyze. If you use the base classifier pack, HPE recommends that you optimize the classifiers on your own data. For instructions, see Optimize the Language Identification Set.

