You must combine the individual language classifiers into a single language identification set. To identify the language being spoken, IDOL Speech Server compares audio against this set.
To combine the language classifiers into a set
Create a list of classifiers. Use the following format for each list entry:
LanguageName;BaseLanguage;ClassifierName;LanguageWeight;Threshold
where:
LanguageName: the name that is reported in the results.
BaseLanguage: the base language that this language classifier belongs to. You can use this option if you have multiple classifiers for dialects of the same language (for example, ENUK and ENUS). This value is used only during optimization. This column is optional. If you do not want to use base languages, add the following line to the file, above the first language row: $baselangs;NO
ClassifierName: the name of the trained classifier file for this language.
NOTE: It is not necessary to specify the full path to the classifier in this list, because you can use an action parameter to provide this information when you use the classifier set.
LanguageWeight: the weight to apply to the language. IDOL Speech Server uses weights to scale the scores for each language before it compares them. You can leave this field empty before you run the LangIdOptimize task, which generates optimized weights (see Optimize the Language Identification Set).
Threshold: the language score threshold that IDOL Speech Server uses for this language. In open set language identification, IDOL Speech Server uses this threshold to determine whether the audio matches the language. In this case, if the language score is below the specified threshold, IDOL Speech Server returns the language as unknown. You can leave this field empty before you run the LangIdOptimize task, which generates optimized thresholds (see Optimize the Language Identification Set).
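As a rough illustration of how LanguageWeight and Threshold could interact, the following Python sketch scales hypothetical per-language scores by their weights and then applies an open set threshold. The multiplicative scaling, the choice to compare the weighted score with the threshold, and the names `pick_language` and `UNKNOWN` are all assumptions made for illustration; the actual scoring inside IDOL Speech Server is not described here.

```python
# Illustrative only: the real scoring in IDOL Speech Server may differ.
UNKNOWN = "unknown"  # hypothetical label for an open set rejection


def pick_language(scores, weights, thresholds):
    """Return the best-scoring language, or UNKNOWN if it misses its threshold.

    scores, weights, thresholds: dicts keyed by LanguageName.
    Assumption: a weight scales its score by multiplication, and the
    threshold is compared with the weighted score.
    """
    weighted = {
        lang: score * weights.get(lang, 1.0) for lang, score in scores.items()
    }
    best = max(weighted, key=weighted.get)
    threshold = thresholds.get(best)
    if threshold is not None and weighted[best] < threshold:
        return UNKNOWN
    return best
```

With no threshold for the winning language, the sketch behaves like closed set identification and always returns the top-scoring language.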
Separate the fields with semicolons (;).
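One way to produce rows in this format is to join the fields with semicolons and drop trailing empty fields, which matches the three-field and four-field rows in the examples on this page. The helper name `format_entry` is hypothetical, and keeping an empty field in the middle of a row (a threshold without a weight) is an assumption rather than documented behavior. The sketch always writes the BaseLanguage column, so it does not cover lists that start with $baselangs;NO.

```python
def format_entry(name, base, classifier, weight=None, threshold=None):
    """Build one classifier list row.

    Field order: LanguageName;BaseLanguage;ClassifierName;LanguageWeight;Threshold
    """
    fields = [
        name,
        base,
        classifier,
        "" if weight is None else str(weight),
        "" if threshold is None else str(threshold),
    ]
    # Drop trailing empty fields: weights and thresholds can be left empty
    # until the LangIdOptimize task has generated them.
    while fields and fields[-1] == "":
        fields.pop()
    return ";".join(fields)


print(format_entry("ENUK", "EN", "Classifiers-3.3-16k/ENUK.lcf"))
print(format_entry("ESES", "ES", "ESES.lcf", weight=0.985))
```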
For more information about IDOL Speech Server's list manager, see Create and Manage Lists.
The following example list contains four language classifiers: ENUK, ENUS, ESES, and FRFR.
$baselangs;YES
ENUK;EN;Classifiers-3.3-16k/ENUK.lcf
ENUS;EN;Classifiers-3.3-16k/ENUS.lcf
ESES;ES;Classifiers-3.3-16k/ESES.lcf
FRFR;FR;Classifiers-3.3-16k/FRFR.lcf
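A list like this one can be read back with a few lines of Python, for example to check it before you add it to the list manager. This sketch is not part of IDOL Speech Server: `parse_classifier_list` is a hypothetical helper, and it assumes that each row sits on its own line, that rows starting with `$` are settings, that trailing fields can be omitted, and that $baselangs;NO means the BaseLanguage column is absent from every row.

```python
def parse_classifier_list(text):
    """Split a classifier list into ($ settings, language rows)."""
    directives = {}
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        fields = line.split(";")
        if fields[0].startswith("$"):
            # Settings row such as $baselangs;YES or $sfreq;16000
            directives[fields[0]] = fields[1] if len(fields) > 1 else ""
            continue
        # Assumption: with $baselangs;NO the BaseLanguage column is omitted.
        use_base = directives.get("$baselangs", "YES").upper() != "NO"
        names = ["LanguageName", "ClassifierName", "LanguageWeight", "Threshold"]
        if use_base:
            names.insert(1, "BaseLanguage")
        # Pad omitted trailing fields with empty strings.
        fields += [""] * (len(names) - len(fields))
        entries.append(dict(zip(names, fields)))
    return directives, entries


example = """\
$baselangs;YES
ENUK;EN;Classifiers-3.3-16k/ENUK.lcf
ENUS;EN;Classifiers-3.3-16k/ENUS.lcf
ESES;ES;Classifiers-3.3-16k/ESES.lcf
FRFR;FR;Classifiers-3.3-16k/FRFR.lcf
"""

directives, entries = parse_classifier_list(example)
for entry in entries:
    print(entry["LanguageName"], entry["BaseLanguage"], entry["ClassifierName"])
```

Empty LanguageWeight and Threshold values come back as empty strings, which makes it easy to spot rows that have not yet been through optimization.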
You can use the optional $sfreq parameter at the start of the classifier list to identify the sample frequency for the classifier set.
For example:
$sfreq;16000
ENUK;EN;ENUK.lcf;1.003
ESES;ES;ESES.lcf;0.985
This example sets the sample frequency for the set to 16 kHz, and then lists the two classifiers in the set. When you set $sfreq, you can use the set only with audio of the specified sample frequency, which prevents accidental mismatches.
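Before you send audio to a set that declares $sfreq, you might want to check the sample rate of the file on the client side. The following sketch uses Python's standard `wave` module to compare the rate of a WAV file with the $sfreq value. The function names are hypothetical, and the check only mirrors the sample frequency restriction; it does not reproduce the server's own validation.

```python
import io
import wave


def declared_sample_rate(list_text):
    """Return the $sfreq value from a classifier list as an int, or None."""
    for line in list_text.splitlines():
        fields = line.strip().split(";")
        if fields[0] == "$sfreq" and len(fields) > 1:
            return int(fields[1])
    return None


def matches_sample_rate(list_text, wav_file):
    """True if the WAV audio matches $sfreq, or if no $sfreq is declared."""
    expected = declared_sample_rate(list_text)
    if expected is None:
        return True
    with wave.open(wav_file, "rb") as wav:
        return wav.getframerate() == expected


# Build one second of 16 kHz mono silence in memory to exercise the check.
buffer = io.BytesIO()
with wave.open(buffer, "wb") as out:
    out.setnchannels(1)
    out.setsampwidth(2)
    out.setframerate(16000)
    out.writeframes(b"\x00\x00" * 16000)
buffer.seek(0)

classifier_list = "$sfreq;16000\nENUK;EN;ENUK.lcf;1.003\nESES;ES;ESES.lcf;0.985\n"
print(matches_sample_rate(classifier_list, buffer))
```

`matches_sample_rate` also accepts a file path in place of the in-memory buffer, because `wave.open` takes either.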