IDOL Speech Server extracts information from audio, returning the results as XML documents, which you can index into IDOL Server.
Speech Server can process audio and video files that are compressed in various formats. You can also stream uncompressed audio directly into Speech Server.
Speech Server can transcribe spoken words into text.
Speech Server can search audio for words or phrases and return their location in the audio.
If trained on sample data from speakers, Speech Server can identify the speakers in audio.
Speech Server can identify the language that is being spoken in the audio.
Speech Server can add time locations to an audio transcript.
Speech Server can detect particular audio sections (for example, melodies, adverts, or jingles) and return their location in the audio.
Speech Server can detect security-related sounds (including alarms, screams, gunshots, and breaking glass) and return their location in the audio. In addition, Speech Server can identify the type of alarm, if trained on relevant alarms.