Use Live Mode for Streaming

When you perform speech-to-text conversion on a live audio stream, you can specify a mode that defines the rate at which the analysis is performed. IDOL Speech Server versions 10.8 and later, and language packs at version 6.0 and later, use DNN acoustic models to improve speech-to-text accuracy. Each language pack contains at least two DNN acoustic models of different sizes. By default, fixed mode uses the largest, most accurate model.

To override the default option, specify a different DNN file as the value of the DNNFile parameter in the task configuration file or at the command line.

CAUTION:

You can use DNN acoustic modelling in live or relative mode only if your DNN files are smaller than a certain size. In addition, you must use Intel (or compatible) processors that support the SSSE3 and SSE4.1 SIMD extensions. If you cannot meet these requirements, set the DNNFile parameter to none to run non-DNN speech-to-text, which has no such hardware limitations.

To use live mode in live stream speech-to-text tasks, add the Mode configuration parameter to the configuration section for the stt module and the StreamMode configuration parameter to the configuration section for the audio module, if they are not already present. For example:

[stt]
Mode=$params.Mode
[audio]
StreamMode=$params.Mode

This configuration creates a Mode action parameter. To use live mode, set the Mode action parameter to live in a task action that uses the stt and audio modules, such as SpeechToText. For example:

http://localhost:15000/action=AddTask&Type=SpeechToText&Lang=ENUK&Out=Transcript1.ctm&Mode=Live&InputType=Stream
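
If you send the action from a script, it can help to assemble the URL from named parameters so that the mode is easy to change. The following Python sketch only builds the action string shown above and does not send it. The helper name build_add_task_url is hypothetical and is not part of Speech Server; the host, port, and parameter values are taken from the example and might differ in your installation.

```python
from urllib.parse import urlencode


def build_add_task_url(host="localhost", port=15000, **params):
    """Build an AddTask action URL in the same shape as the example above.

    Hypothetical helper for illustration only. Parameter names and values
    are passed through unchanged (apart from URL encoding) in the order
    given by the caller.
    """
    query = urlencode(params)
    return f"http://{host}:{port}/action=AddTask&{query}"


# Reproduce the live mode example from this topic.
url = build_add_task_url(
    Type="SpeechToText",
    Lang="ENUK",
    Out="Transcript1.ctm",
    Mode="Live",
    InputType="Stream",
)
print(url)
```

To request a different mode, change only the Mode argument. The value is passed to the Mode action parameter that the configuration above creates.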
