Review the Optimization Log File

When you run the classifier set optimization task, IDOL Speech Server writes information about the optimization performance to a log file. To specify the log file, use the OutputLog parameter in the [lidoptimizer] section of the tasks configuration file (speechserver-tasks.cfg). To retrieve this log file, use the GetResults action, and set the Token action parameter to the token for the optimization task.
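
As an illustration only, the following Python sketch builds a GetResults request for a task token. The action name and the Token parameter come from the text above. The host, port, token value, and the `http://host:port/action=...` request form are assumptions, and `build_getresults_url` is a hypothetical helper, so check them against your own deployment.

```python
from urllib.parse import urlencode


def build_getresults_url(host: str, port: int, token: str) -> str:
    """Build a GetResults request URL for an optimization task token."""
    # Assumption: actions are sent as http://host:port/action=Name&Param=Value.
    query = urlencode({"action": "GetResults", "Token": token})
    return f"http://{host}:{port}/{query}"


# Placeholder host, port, and token; substitute the values for your server.
print(build_getresults_url("localhost", 13000, "EXAMPLE-TOKEN"))
```

The sketch only constructs the URL; it does not send the request.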

The log file shows how the language classifiers perform on each iteration of the optimization. Each <lidoptimize_iteration_record> item summarizes the system performance across all languages for a specific iteration. For example:

<lidoptimize_iteration_record>
   <iteration>1</iteration>
   <learning_rate>1.5</learning_rate>
   <avg_score>0.968</avg_score>
   <min_score>0.848</min_score>
   <std_dev>0.00917</std_dev>
   <wt_nincreased>12</wt_nincreased>
   <wt_increased>ARMSA,DEDE,ELGR,ENUK,ESES,ESLA,FRFR,HBIL,ITIT,RORO,RURU,SKSK</wt_increased>
   <wt_ndecreased>10</wt_ndecreased>
   <wt_decreased>DADK,ENUS,FAIR,JAJP,KOKR,NLNL,PLPL,PTBR,SVSE,ZHCN</wt_decreased>
   <wt_avgchange>-0.0131</wt_avgchange>
   <wt_avgchange_abs>0.0327</wt_avgchange_abs>
   <result_delta>0</result_delta>
   <best_so_far>Yes</best_so_far>
</lidoptimize_iteration_record>

The record contains the following elements.

iteration The current iteration of the optimization process.
learning_rate The current learning rate of the algorithm. The learning rate decreases throughout the process.
avg_score The average correct score across all languages, with the current weighting.
min_score The score of the worst performing language classifier.
std_dev The standard deviation across the classifier scores.
wt_nincreased The number of languages whose weights were increased.
wt_increased The language pack codes for the languages whose weights were increased.
wt_ndecreased The number of languages whose weights were decreased.
wt_decreased The language pack codes for the languages whose weights were decreased.
wt_avgchange The average signed weight change, in which increases and decreases offset each other.
wt_avgchange_abs The average magnitude (absolute value) of the weight change.
result_delta The delta in the result when compared to the previous iteration.
best_so_far Whether this iteration is the best iteration so far.

The aim of the optimization process is to optimize the values for avg_score, min_score, and std_dev.
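
To compare iterations, you might extract these three values from each iteration record. The following Python sketch is one way to do this, under stated assumptions: it wraps the log text in a synthetic `<log>` root element in case the records do not share a single root, and it assumes the log has no XML declaration. The first sample record is abbreviated from the example above; the second is invented for illustration. `iteration_summaries` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# The first record is abbreviated from the documented example.
# The second record is invented purely for illustration.
LOG_TEXT = """
<lidoptimize_iteration_record>
   <iteration>1</iteration>
   <avg_score>0.968</avg_score>
   <min_score>0.848</min_score>
   <std_dev>0.00917</std_dev>
   <best_so_far>Yes</best_so_far>
</lidoptimize_iteration_record>
<lidoptimize_iteration_record>
   <iteration>2</iteration>
   <avg_score>0.961</avg_score>
   <min_score>0.852</min_score>
   <std_dev>0.00934</std_dev>
   <best_so_far>No</best_so_far>
</lidoptimize_iteration_record>
"""


def iteration_summaries(log_text):
    """Return one dict per iteration record with the key summary values."""
    # Assumption: wrap the records so the text parses as a single document.
    root = ET.fromstring(f"<log>{log_text}</log>")
    rows = []
    for rec in root.iter("lidoptimize_iteration_record"):
        rows.append({
            "iteration": int(rec.findtext("iteration")),
            "avg_score": float(rec.findtext("avg_score")),
            "min_score": float(rec.findtext("min_score")),
            "std_dev": float(rec.findtext("std_dev")),
            "best_so_far": rec.findtext("best_so_far") == "Yes",
        })
    return rows


for row in iteration_summaries(LOG_TEXT):
    print(row)
```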

The log file also contains <lidoptimize_lang_record> items. Each item shows the performance of a specific language at a specific optimization iteration. For example:

<lidoptimize_lang_record>
   <iteration>3</iteration>
   <lang>ENUK</lang>
   <weight>0.991</weight>
   <score>0.933</score>
   <threshold>-1.472</threshold>
   <threshold_missRate>0.289</threshold_missRate>
   <threshold_falseRate>0.020</threshold_falseRate>
   <threshold_recall>0.711</threshold_recall>
   <threshold_precision>0.590</threshold_precision>
   <threshold_fmeasure>0.645</threshold_fmeasure>
   <data_avg_false>0.648</data_avg_false>
   <data_max_false>0.940</data_max_false>
   <class_avg_false>0.576</class_avg_false>
   <class_max_false>0.936</class_max_false>
   <closest_imposter>ENUS (58.3%)</closest_imposter>
   <rel_change>0.019</rel_change>
   <rel_change_str>INCREASED</rel_change_str>
   <next_weight>1.009</next_weight>
   <conf_analysis>
      <true_conf>
         <n_confs>168</n_confs>
         <min_conf>1.008</min_conf>
         <vlow_conf>1.067</vlow_conf>
         <low_conf>1.128</low_conf>
         <med_conf>1.217</med_conf>
         <high_conf>1.335</high_conf>
         <vhigh_conf>1.394</vhigh_conf>
         <max_conf>1.542</max_conf>
      </true_conf>
      <false_conf>
         <n_confs>22</n_confs>
         <min_conf>1.001</min_conf>
         <vlow_conf>1.010</vlow_conf>
         <low_conf>1.031</low_conf>
         <med_conf>1.056</med_conf>
         <high_conf>1.096</high_conf>
         <vhigh_conf>1.154</vhigh_conf>
         <max_conf>1.198</max_conf>
      </false_conf>
   </conf_analysis>
   <score_analysis>
      <true_score>
         <n_scores>180</n_scores>
         <min_score>-2.485</min_score>
         <vlow_score>-1.589</vlow_score>
         <low_score>-1.500</low_score>
         <med_score>-1.417</med_score>
         <high_score>-1.337</high_score>
         <vhigh_score>-1.277</vhigh_score>
         <max_score>-1.200</max_score>
      </true_score>
      <false_score>
         <n_scores>4487</n_scores>
         <min_score>-4.741</min_score>
         <vlow_score>-2.211</vlow_score>
         <low_score>-2.067</low_score>
         <med_score>-1.928</med_score>
         <high_score>-1.816</high_score>
         <vhigh_score>-1.726</vhigh_score>
         <max_score>-1.215</max_score>
      </false_score>
   </score_analysis>
</lidoptimize_lang_record>

The record item shows the existing weight and the new weight for the language, as well as some information on intermediate values used to calculate it. The record also gives information on how the language performed on the last iteration, information about confusability and the closest imposter for the language, and details about the spread of confidence (from the best result to the worst).

iteration The current iteration of the optimization process.
lang The language that this record represents.
weight The weight for this language classifier at this iteration.
score The correct rate for this classifier at this iteration (that is, the proportion of audio segments for this language that were correctly identified).
threshold The score threshold for this classifier at this iteration (that is, the language score that represents a good probability that the audio matches the language).
threshold_missRate, threshold_falseRate, threshold_recall, threshold_precision, threshold_fmeasure
The statistical information for the threshold calculation:
  • missRate. The proportion of true language segments that were not identified.
  • falseRate. The proportion of false language segments that were incorrectly identified as the specified language.
  • recall. The fraction of true positives identified out of all instances of the true language.
  • precision. The fraction of true positives identified out of all the instances identified as the language.
  • fmeasure. A statistical measure based on the precision and recall.
data_avg_false
The average score for an incorrect language classifier for the data for the current language, relative to the current classifier score. The maximum value is 1.0, which suggests that the incorrect language scores best on the data. IDOL Speech Server uses this value during weight estimation.
data_max_false
The highest score for an incorrect language classifier for the data for the current language, relative to the current classifier score. The maximum value is 1.0, which suggests that the incorrect language scores best on the data. IDOL Speech Server uses this value during weight estimation.
class_avg_false
The average score for the current language classifier computed across all other language data examples. The score is relative to the score of the true classifier for each data example. IDOL Speech Server uses this value during weight estimation.
class_max_false
The highest score for the current language classifier computed across all other language data examples. The score is relative to the score of the true classifier for each data example. IDOL Speech Server uses this value during weight estimation.
closest_imposter
The incorrect language that most closely matched the language.
rel_change The amount that the weight changed by.
rel_change_str A text label that describes the change in weight (for example, INCREASED).
next_weight The new weight for this classifier, for the next iteration.
conf_analysis The conf_analysis section gives details of the confidence values seen for different hits for this language. It contains a section for true hits (true_conf) and a section for false positives (false_conf).
  • n_confs. The number of confidence values recorded in this section.
  • min_conf. The minimum confidence seen for a hit in this section.
  • vlow_conf. A very low confidence value seen for a hit in this section.
  • low_conf. A low confidence value seen for a hit in this section.
  • med_conf. A medium confidence value seen for a hit in this section.
  • high_conf. A high confidence value seen for a hit in this section.
  • vhigh_conf. A very high confidence value seen for a hit in this section.
  • max_conf. The maximum confidence value seen for a hit in this section.
score_analysis The score_analysis section gives details of the language score values seen for different hits for this language. It contains a section for true hits (true_score) and a section for false positives (false_score).
  • n_scores. The number of language score values recorded in this section.
  • min_score. The minimum language score seen for a hit in this section.
  • vlow_score. A very low language score value seen for a hit in this section.
  • low_score. A low language score value seen for a hit in this section.
  • med_score. A medium language score value seen for a hit in this section.
  • high_score. A high language score value seen for a hit in this section.
  • vhigh_score. A very high language score value seen for a hit in this section.
  • max_score. The maximum language score value seen for a hit in this section.
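
The threshold statistics in a language record are related to each other, which gives a quick way to sanity-check a record. The following Python sketch reads the values from the ENUK example above and checks two relationships. The first, that the miss rate equals one minus the recall, follows from the definitions in the list. The second is an assumption: the documentation does not state which F-measure is used, so the sketch assumes the balanced F1 (the harmonic mean of precision and recall), which happens to agree with the example values to three decimal places. `threshold_check` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Threshold values copied from the ENUK example record.
RECORD = """
<lidoptimize_lang_record>
   <lang>ENUK</lang>
   <threshold_missRate>0.289</threshold_missRate>
   <threshold_recall>0.711</threshold_recall>
   <threshold_precision>0.590</threshold_precision>
   <threshold_fmeasure>0.645</threshold_fmeasure>
</lidoptimize_lang_record>
"""


def threshold_check(record_xml):
    """Recompute the miss rate and F-measure from recall and precision."""
    rec = ET.fromstring(record_xml)
    recall = float(rec.findtext("threshold_recall"))
    precision = float(rec.findtext("threshold_precision"))
    # Miss rate: the share of true segments that were not identified.
    miss_rate = 1.0 - recall
    # Assumption: fmeasure is the balanced F1 (harmonic mean).
    f1 = 2.0 * precision * recall / (precision + recall)
    return {
        "lang": rec.findtext("lang"),
        "miss_rate": round(miss_rate, 3),
        "f1": round(f1, 3),
        "logged_miss_rate": float(rec.findtext("threshold_missRate")),
        "logged_fmeasure": float(rec.findtext("threshold_fmeasure")),
    }


print(threshold_check(RECORD))
```

If the recomputed and logged values differ by more than rounding error, the F1 assumption probably does not hold for your version of the log.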

After all optimization iterations are complete, the <lidoptimize_result> log item reports the best iteration. For example:

<lidoptimize_result>
   <best_iteration>4</best_iteration>
</lidoptimize_result>

IDOL Speech Server writes the language classifier weights for the best iteration to the classifier list file. When you use this file to run language identification tasks, IDOL Speech Server balances the classifier scores based on the new, trained weights.
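
If you want to inspect the selected weights in the log yourself, a sketch along the following lines might help. It reads best_iteration from the result item and collects the weight element of each language record for that iteration. Treat it as a reading aid under assumptions: the records are wrapped in a synthetic `<log>` root, the sample records are abbreviated and partly invented, and the weights that IDOL Speech Server actually writes to the classifier list file may not be exactly the weight values shown in these records. `best_iteration_weights` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Abbreviated sample log. The iteration 4 record is invented for illustration.
LOG_TEXT = """
<lidoptimize_lang_record>
   <iteration>3</iteration>
   <lang>ENUK</lang>
   <weight>0.991</weight>
</lidoptimize_lang_record>
<lidoptimize_lang_record>
   <iteration>4</iteration>
   <lang>ENUK</lang>
   <weight>1.009</weight>
</lidoptimize_lang_record>
<lidoptimize_result>
   <best_iteration>4</best_iteration>
</lidoptimize_result>
"""


def best_iteration_weights(log_text):
    """Return the best iteration number and its per-language weights."""
    # Assumption: wrap the records so the text parses as a single document.
    root = ET.fromstring(f"<log>{log_text}</log>")
    best = int(root.findtext("lidoptimize_result/best_iteration"))
    weights = {}
    for rec in root.iter("lidoptimize_lang_record"):
        if int(rec.findtext("iteration")) == best:
            weights[rec.findtext("lang")] = float(rec.findtext("weight"))
    return best, weights


print(best_iteration_weights(LOG_TEXT))
```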
