IvSpkId

The IvSpkId task performs iVector-based speaker identification on an audio file or audio stream, identifying any sections where the trained speakers are present.

Parameters

Parameter Description Required
Type The task name. Set to IvSpkId. Yes
AllowEmpty Whether to produce gender labels as output if no speakers are specified.  
AudioUpsampling Whether to allow audio upsampling if the input audio has a sample rate too low for the task.  
DiagFile The file to write the diagnostic information to.  
DiagLevel The level of detail to include in the diagnostic information.  
DiscardShort Exclude segments shorter than a specific duration from further analysis.  
EndTime The end of an audio section to process.  
File The audio file to process. Yes, if InputType is File.
FrameDupl The balance between performance and speed for audio preprocessing DNN classification.  
InputType The type of audio to process (file, binary data, or stream).  
MaxNonSpeech The maximum length of non-speech segments.  
MaxSpeech The maximum length of speech segments.  
MinNonSpeech The minimum length of non-speech segments.  
MinSpeech The minimum length of speech segments.  
Out The file to write the results to.  
ScoreMode The scoring method to use for speaker identification.  
Sfreq The sample frequency of the audio stream to process.  
StartTime The beginning of an audio section to process.  
SugdInputChannels The channel layout of the input media file. This parameter does not apply when InputType is Stream.  
SugdInputFrequency The sampling rate of the input media file. This parameter does not apply when InputType is Stream.  
TemplateExt The file extension to use for template files.  
TemplateList A list file that lists multiple speaker template files to use.  
TemplatePath The path to the directory containing the speaker templates.  
TemplateSet An audio template set file.  
ThreshScale The rate at which to scale the thresholds.  

Examples

http://localhost:15000/action=AddTask&Type=IvSpkId&InputType=File&File=C:\Data\Speech.wav&TemplateSet=speakers.ivs&ClosedSet=False&Out=results.ctm

This action searches the Speech.wav file for speakers based on the template set file speakers.ivs, and writes the identification results to the results.ctm file.

http://localhost:15000/action=AddTask&Type=IvSpkId&InputType=Stream&TemplateSet=speakers.ivs&Out=results.ctm

This action searches the audio stream for speakers using the iVector template set file speakers.ivs, and writes the identification results to the results.ctm file.
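
The example actions above can also be assembled programmatically. The following is a minimal Python sketch, not part of the product: build_add_task_url is a hypothetical helper that only builds the URL string and does not contact a server. It assumes the server accepts percent-encoded parameter values, and it checks the one conditional rule stated in the parameter table (File is required if InputType is File).

```python
from urllib.parse import quote, urlencode


def build_add_task_url(base_url, params):
    """Build an AddTask action URL in the same shape as the examples above.

    Hypothetical helper: it only assembles the URL string and does not
    send any request.
    """
    # Per the parameter table, File is required when InputType is File.
    if params.get("InputType") == "File" and not params.get("File"):
        raise ValueError("File is required when InputType is File")
    # Percent-encode values so characters such as ':' and '\' in Windows
    # paths are transmitted safely (assumption: the server decodes them).
    query = urlencode(params, quote_via=quote)
    return f"{base_url}/action=AddTask&{query}"


url = build_add_task_url(
    "http://localhost:15000",
    {
        "Type": "IvSpkId",
        "InputType": "File",
        "File": r"C:\Data\Speech.wav",
        "TemplateSet": "speakers.ivs",
        "Out": "results.ctm",
    },
)
print(url)
```

For the stream example, pass InputType set to Stream and omit File; the helper then skips the File check.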

