The tasks in the following table are deprecated. These tasks are still available in the current Speech Server configuration file, but might be incompatible with new functionality. They are likely to be deleted in the future.
Task | Description |
---|---|
AfpAddTrackStream | Adds a new audio track to an Audio Fingerprint database, receiving the audio data as a stream, and converting it into AFP features before indexing. |
AfpAddTrackWav | Adds a new audio track to an Audio Fingerprint database, reading the data from an audio file, and converting it into AFP features before indexing. |
AfptAddTrackStream | Performs the same task as AfpAddTrackStream, but uses a template database (fptdb), which improves robustness to audio mismatches at the cost of scalability. |
AfptAddTrackWav | Performs the same task as AfpAddTrackWav, but uses a template database (fptdb), which improves robustness to audio mismatches at the cost of scalability. |
AfpMatchStream | Receives audio data as a binary stream and searches it for any sections that match audio indexed in an AFP database. |
AfpMatchWav | Reads in data from an audio file, and searches it for any sections that match audio indexed in an AFP database. |
AfptDatabaseInfo | Performs the same task as AfpDatabaseInfo, but uses a template database (fptdb), which improves robustness to audio mismatches at the cost of scalability. |
AfptMatchStream | Performs the same task as AfpMatchStream, but uses template-based matching as opposed to landmarks, which improves robustness to audio mismatches at the cost of scalability. |
AfptMatchWav | Performs the same task as AfpMatchWav, but uses template-based matching as opposed to landmarks, which improves robustness to audio mismatches at the cost of scalability. |
AfptRemoveTrack | Performs the same task as AfpRemoveTrack, but uses a template database (fptdb), which improves robustness to audio mismatches at the cost of scalability. |
AmTrain | Presents training audio and transcription data to the acoustic model training process, creating accumulator files that are used by the AmTrainFinal task to produce a final adapted acoustic model. |
AmTrainFinal | Produces the adapted acoustic model, given a set of accumulator files created by the AmTrain task. |
DataObfuscation | Prepares training data with any sensitive or classified information concealed. |
IvSpkIdDevelStream | Takes a single audio stream, along with the name of the speaker the stream is associated with, and generates scores for tuning iVector thresholds. |
IvSpkIdDevelWav | Processes a single audio file to generate scores for tuning iVector thresholds. |
IvSpkIdEvalStream | Runs iVector-based identification of any sections of an audio stream where the trained speakers are present. |
IvSpkIdEvalWav | Performs iVector-based speaker identification on a single audio file. |
IvSpkIdSetEditThresh | Modifies the threshold of a single template stored in an iVector template set file. |
IvSpkIdSetInfo | Produces a log file that lists the contents of the specified iVector template set file. |
IvSpkIdTmpEditThresh | Modifies the threshold of a single iVector template file. |
IvSpkIdTmpInfo | Produces a log file that lists the contents of the specified iVector template file. |
IvSpkIdTrainStream | Takes a single audio stream containing speech data from the speaker to be trained, and creates a new iVector speaker template file. |
IvSpkIdTrainWav | Takes a single audio file containing speech data from the speaker to be trained, and creates a new iVector speaker template file. |
LangIdBndLif | Reads in language identification features from file and determines boundaries in the feature sequence where the language changes. Returns the language identification results between boundaries. |
LangIdBndStream | Receives audio data as a binary stream, converts it into language identification features, and determines boundaries where the language changes. Returns the language identification results between boundaries. |
LangIdBndWav | Reads in data from an audio file, converts it into language identification features, and determines boundaries where the language changes. Returns the language identification results between boundaries. |
LangIdCumLif | Reads in language identification features from file. Returns the running language identification score at periodic intervals. This is the score for all the input data from the start to the current point. |
LangIdCumStream | Receives audio data as a binary stream and converts it into language identification features. Returns the running language identification score at periodic intervals. This is the score for all the input data from the start to the current point. |
LangIdCumWav | Reads in data from an audio file and converts it into language identification features. Returns the running language identification score at periodic intervals. This is the score for all the input data from the start to the current point. |
LangIdSegLif | Reads in language identification features from file, processes the data in fixed-sized chunks, and returns the language identification results for each chunk. |
LangIdSegStream | Receives audio data as a binary stream and converts it into language identification features. IDOL Speech Server processes the data in fixed-sized chunks, and returns the language identification results for each chunk. |
LangIdSegWav | Reads in data from an audio file and converts it into language identification features. IDOL Speech Server processes the data in fixed-sized chunks, and returns the language identification results for each chunk. |
SegmentWav | Attempts to segment audio into sections by speaker even if no trained speakers exist in the system. |
SpkIdDevel | Processes speaker ID feature files to generate scores for tuning model thresholds. |
SpkIdDevelFinal | Estimates the thresholds for a set of speaker templates. |
SpkIdDevelStream | Creates or updates a development (.atd) file for an audio stream. |
SpkIdDevelWav | Creates or updates a development (.atd) file for an audio file. |
SpkIdEvalStream | Analyzes an audio stream to identify any sections where the trained speakers are present. |
SpkIdEvalWav | Analyzes an audio file to identify any sections where the trained speakers are present. |
SpkIdFeature | Creates a speaker ID feature file. |
SpkIdSetAdd | Takes one or more audio template files, and adds them to an audio template set file. |
SpkIdSetDelete | Removes a template from an audio template set file. |
SpkIdSetEditThresh | Modifies the threshold of a single template in an audio template set file. |
SpkIdSetInfo | Retrieves information on an audio template set file. |
SpkIdTmpEditThresh | Modifies the threshold of a single template. |
SpkIdTmpInfo | Retrieves information on an audio template file. |
SpkIdTrain | Uses one or more feature files to train a speaker template. |
SpkIdTrainStream | Takes an audio stream containing speech data from the speaker to be trained, and creates a new speaker template file. |
SpkIdTrainWav | Takes a single audio file containing speech data from the speaker to be trained, and creates a new speaker template file. |
StreamToText | Converts live audio into a text transcript. |
StreamToTextMusicFilter | Converts live audio into a text transcript and categorizes the audio so that you can remove any sections consisting of music or noise. |
TelWavToText | Transcribes a telephony audio file, including dial tones and DTMF dial tones. |
WavPhraseSearch | Searches for a specified phrase or phrases in an audio file. |
WavToFMD | Creates a phoneme time track file from a single audio file. |
WavToPlh | Reads data from an audio file and produces an audio feature file, which is used in tasks such as AmTrain (which adapts acoustic models). |
WavToText | Converts an audio file into a text transcript. |
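
When auditing an existing configuration for the deprecated tasks listed above, a plain text scan can serve as a first pass. The following is a minimal sketch and not part of Speech Server: the function name `find_deprecated_tasks` and the sample configuration text are hypothetical, and the scan assumes that task names appear as case-sensitive whole words. The set below holds only a subset of the names from the table; extend it with the remaining names as needed.

```python
import re

# Subset of the deprecated task names from the table above.
# Add the remaining names from the table to cover every deprecated task.
DEPRECATED_TASKS = {
    "AfpAddTrackStream", "AfpAddTrackWav", "AfpMatchStream", "AfpMatchWav",
    "AmTrain", "AmTrainFinal", "StreamToText", "WavToPlh", "WavToText",
}


def find_deprecated_tasks(text):
    """Return (line_number, task_name) pairs for each deprecated name found.

    Assumption: names are matched case-sensitively and only as whole words,
    so a longer identifier that merely starts with a task name is ignored.
    """
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for word in re.findall(r"[A-Za-z0-9_]+", line):
            if word in DEPRECATED_TASKS:
                hits.append((number, word))
    return hits


# Hypothetical configuration text, used only to illustrate the scan.
sample = "[MyTranscription]\nType=WavToText\nNote=WavToTextExtra\n"
for number, name in find_deprecated_tasks(sample):
    print(f"line {number}: {name} is deprecated")
```

Because the scan works on raw text, it also reports names that appear in comments, so each hit should be checked by hand before the configuration is changed.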