Measure Speech-to-Text Success Rates

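To measure the success rate of speech-to-text for typical application data, you need a set of audio files that represent that data and, for each audio file, a text file that contains its truth transcript.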

NOTE: To ensure the accuracy of the resulting success rates, the text file must be a completely verbatim record of the audio file.

IDOL Speech Server provides a scoring tool that reports the general word precision and recall rates measured across the test files. The procedures for measuring success rates for speech-to-text are well established and standardized. For more information, see http://www.itl.nist.gov/iad/mig/tools/.

The general steps for measuring success rates are:

  1. Run the speech-to-text function on the test data.
  2. Normalize the text in the truth files. Normalization is important because it avoids representation mismatch between recognized words and truth text, such as ‘one’ and ‘1’.
  3. Align the recognition output text against the normalized truth text and calculate precision and recall rates based on aligned entities. For details, see Score Speech Recognition.
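Steps 2 and 3 can be illustrated with a minimal Python sketch. It is not the IDOL Speech Server scoring tool and does not reproduce its alignment or normalization rules. The function names normalize and score and the small NUMBER_WORDS table are hypothetical, and the alignment uses the longest-matching-block approach from Python's standard difflib module as a stand-in for a proper scoring alignment. Precision is taken here as correct words divided by recognized words, and recall as correct words divided by truth words.

```python
import re
from difflib import SequenceMatcher

# Hypothetical, deliberately tiny mapping for illustration only;
# a real normalizer would handle far more cases (dates, currency, and so on).
NUMBER_WORDS = {
    "0": "zero", "1": "one", "2": "two", "3": "three", "4": "four",
    "5": "five", "6": "six", "7": "seven", "8": "eight", "9": "nine",
}


def normalize(text):
    """Lowercase the text, drop punctuation, and spell out single digits."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return [NUMBER_WORDS.get(token, token) for token in tokens]


def score(truth_text, recognized_text):
    """Return (precision, recall) for one truth/recognition pair."""
    truth = normalize(truth_text)
    recognized = normalize(recognized_text)

    # Align the two word sequences; matching blocks are runs of words
    # that appear in the same order in both sequences.
    matcher = SequenceMatcher(a=truth, b=recognized, autojunk=False)
    correct = sum(block.size for block in matcher.get_matching_blocks())

    precision = correct / len(recognized) if recognized else 0.0
    recall = correct / len(truth) if truth else 0.0
    return precision, recall


if __name__ == "__main__":
    precision, recall = score("I have 1 red apple.",
                              "i have one green apple today")
    print(f"precision={precision:.3f} recall={recall:.3f}")
```

In this example, normalization makes ‘1’ in the truth text match ‘one’ in the recognized text, so four words align. Without the normalization step that word would count as an error even though the recognition was correct.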
