Open Set Language Identification

By default, IDOL Speech Server language identification uses closed set language classification.

In closed set language identification, IDOL Speech Server expects all the speech to belong to one of the trained languages. The server returns a result for all valid speech segments. If the audio contains a language outside the trained set, IDOL Speech Server identifies it as the trained language that returns the highest language score.
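
The closed set decision amounts to choosing the top-scoring trained language for each segment. The following Python sketch illustrates that logic only; it is not IDOL Speech Server code. The function name `classify_closed_set`, the language labels, and the score values are all assumptions made for the example.

```python
def classify_closed_set(scores):
    """Return the trained language with the highest score.

    `scores` maps each trained language label to its score for one
    segment of audio. The labels and values are hypothetical.
    """
    if not scores:
        raise ValueError("at least one trained language score is required")
    # Closed set: the best-scoring trained language always wins,
    # however low its score is.
    return max(scores, key=scores.get)


# Hypothetical scores for a segment spoken in an untrained language.
# All scores are low, but the segment still receives a trained label.
segment_scores = {"English": 0.21, "French": 0.34, "German": 0.18}
print(classify_closed_set(segment_scores))
```

In this sketch, a segment always receives one of the trained labels, which is why speech in an untrained language is reported as a known language.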

You can also use open set language identification. In this case, the audio can contain speech from languages other than the trained set. Each trained language has an associated score threshold, which is set during the optimization stage. If the score of the highest ranked classifier for a given segment of audio is lower than that classifier's threshold, IDOL Speech Server labels the segment as unknown (UNK).
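
The threshold rule described above can be sketched in Python as follows. This is an illustration under stated assumptions, not IDOL Speech Server code: the function name `classify_open_set`, the language labels, the scores, and the threshold values are all hypothetical, and the real thresholds come from the optimization stage.

```python
UNKNOWN = "UNK"


def classify_open_set(scores, thresholds):
    """Label one segment using per-language score thresholds.

    `scores` maps each trained language to its score for the segment.
    `thresholds` maps each trained language to its score threshold.
    All labels and values here are hypothetical.
    """
    if not scores:
        raise ValueError("at least one trained language score is required")
    # Find the highest-ranked trained language for this segment.
    best = max(scores, key=scores.get)
    # Compare its score with the threshold for that same language.
    if scores[best] < thresholds[best]:
        return UNKNOWN
    return best


# Hypothetical thresholds, one per trained language.
thresholds = {"English": 0.50, "French": 0.50, "German": 0.45}

# Top score (French, 0.34) is below the French threshold.
print(classify_open_set({"English": 0.21, "French": 0.34, "German": 0.18}, thresholds))

# Top score (English, 0.82) meets the English threshold.
print(classify_open_set({"English": 0.82, "French": 0.10, "German": 0.05}, thresholds))
```

Only the threshold of the highest-ranked language is consulted, so a genuine known-language segment with an unusually low score is also labelled `UNK`.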

Because it relies on thresholds, open set language identification might label genuine sections of known language speech as unknown. Micro Focus recommends that you use open set identification only where unknown language data is likely to be an issue. You might also want to use this method to check whether the speech belongs to one particular language, when you do not need to identify the other possible languages.

