Audio Analysis

IDOL Speech Server extracts information from audio, returning the results as XML documents, which you can index into IDOL Server.

Speech Server can process audio and video files that are compressed in various formats. You can also stream uncompressed audio directly into Speech Server.
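Because results are returned as XML, a client can read them with any standard XML parser. The following is a minimal sketch only: the element and attribute names (results, word, start, end, confidence) are hypothetical placeholders for illustration and do not represent the actual Speech Server output schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical result document. The element and attribute names here are
# illustrative assumptions, not the real Speech Server schema.
SAMPLE_RESULT = """<results>
  <word start="0.50" end="0.82" confidence="0.91">hello</word>
  <word start="0.90" end="1.35" confidence="0.87">world</word>
</results>"""


def parse_words(xml_text):
    """Return one dict per <word> element, with times converted to floats."""
    root = ET.fromstring(xml_text)
    return [
        {
            "text": (el.text or "").strip(),
            "start": float(el.get("start")),
            "end": float(el.get("end")),
            "confidence": float(el.get("confidence")),
        }
        for el in root.iter("word")
    ]


words = parse_words(SAMPLE_RESULT)
# Join the recognized words in document order to form plain text.
transcript = " ".join(w["text"] for w in words)
```

The parsed records could then be converted into whatever document format you use for indexing.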

Speech-to-Text

Speech Server can transcribe spoken words into text.

Phonetic Phrase Search

Speech Server can search audio for words or phrases and return their location in the audio.

Speaker Identification

When trained on sample audio from known speakers, Speech Server can identify those speakers in other audio.

Spoken Language Identification

Speech Server can identify the language that is being spoken.

Transcript Alignment

Speech Server can add time locations to an audio transcript.
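To illustrate what an aligned transcript makes possible, the sketch below stores each word with a start and end time and looks up the word being spoken at a given moment. The data, the tuple layout, and the word_at helper are all hypothetical; they show the idea of time locations, not the format that Speech Server produces.

```python
from bisect import bisect_right

# Hypothetical aligned transcript: (word, start_seconds, end_seconds),
# sorted by start time. The values are made up for illustration.
ALIGNED = [
    ("good", 0.00, 0.30),
    ("morning", 0.30, 0.85),
    ("everyone", 1.10, 1.70),
]
STARTS = [start for _, start, _ in ALIGNED]


def word_at(t):
    """Return the word spoken at time t (seconds), or None during silence."""
    # Find the last word whose start time is <= t.
    i = bisect_right(STARTS, t) - 1
    if i >= 0 and t < ALIGNED[i][2]:
        return ALIGNED[i][0]
    return None
```

With time locations attached, an application can, for example, highlight the current word during playback or jump to the point in the audio where a phrase is spoken.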

Audio Fingerprint Identification

Speech Server can detect particular audio sections (for example, melodies, adverts, or jingles) and return their location in the audio.

Audio Security

Speech Server can detect security-related sounds (including alarms, screams, gunshots, and breaking glass) and return their location in the audio. In addition, Speech Server can identify the type of alarm, if trained on relevant alarms.

