ClosedSet

Set ClosedSet to False to use open set language identification.

By default, IDOL Speech Server uses closed set language identification. In this mode, it labels all speech in the audio as one of the configured languages, even if the confidence that the speech matches that language is very low. Closed set identification is most useful when you know that the speech must belong to a particular set of languages, or when you want to maximize the amount of speech that you classify. However, it produces false positives if the audio contains a language that is not among the configured language classifiers.

In open set language identification, IDOL Speech Server identifies the best match from the configured language classifiers, and then calculates a language score value. If the score is below a configured threshold, IDOL Speech Server labels the language as unknown. Open set identification might be useful if you are not sure what languages to expect in your data. You might also want to use it if you only need to determine whether the speech belongs to one particular language, and do not need to identify any other languages.
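
The difference between the two modes can be illustrated with a minimal Python sketch. This is not IDOL Speech Server code: the function name identify_language, the score dictionary, and the threshold value of 0.5 are all hypothetical, and the real score scale and threshold setting might differ. The sketch only shows the decision described above: closed set always returns the best match, and open set returns unknown when the best score is below the threshold.

```python
from typing import Dict


def identify_language(
    scores: Dict[str, float],
    closed_set: bool = True,
    threshold: float = 0.5,  # hypothetical value; the real threshold is configured separately
) -> str:
    """Pick a language label from per-language scores (illustrative only)."""
    if not scores:
        raise ValueError("at least one language classifier score is required")

    # Both modes start by finding the best-matching configured language.
    best = max(scores, key=scores.get)

    if closed_set:
        # Closed set: always return the best match, however low its score.
        return best

    # Open set: reject the best match if its score is below the threshold.
    return best if scores[best] >= threshold else "unknown"


if __name__ == "__main__":
    low_confidence = {"english": 0.2, "french": 0.1}
    print(identify_language(low_confidence, closed_set=True))
    print(identify_language(low_confidence, closed_set=False))
```

With the low-confidence scores in the usage example, the closed set call returns english and the open set call returns unknown, which mirrors the false positive risk described for closed set identification.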

Type: Boolean
Default: True
Required: No
Configuration Section: langid module
Example: ClosedSet=False
See Also: ClosedSet (action parameter)
