The IvSpkIdTrainAudio task takes an audio file or stream containing speech data from the speaker to be trained, and creates a new iVector speaker template file.

Parameter | Description | Required |
---|---|---|
Type | The task name. Set to IvSpkIdTrainAudio. | Yes |
DiagFile | The file to write the diagnostic information to. | |
DiagLevel | The level of detail to include in the diagnostic information. | |
EndTime | The end of an audio section to process. | |
File | The audio file that contains sample speech from one person. | Yes, if InputType is File. |
FrameDupl | An integer value that allows for greater time efficiency without significant change in recognition accuracy. | |
InputType | The type of audio to process (file, binary data, or stream). | |
LabFile | A single label file to use. | |
LabType | The type of labels to use. | |
Out | The name of the speaker template file to create. You must include the audio template file extension (.iv). | Yes |
Sfreq | The sample frequency of the audio file to process. | |
StartTime | The beginning of an audio section to process. | |
SugdInputChannels | The channel layout of the input media file. This parameter does not apply when InputType is Stream. | |
SugdInputFrequency | The sampling rate of the input media file. This parameter does not apply when InputType is Stream. | |

http://localhost:15000/action=AddTask&Type=IvSpkIdTrainAudio&InputType=File&File=C:/Data/BrownSpeech.wav&Out=Brown.iv
This action uses port 15000 to instruct IDOL Speech Server, which is located on the local machine, to create the Brown.iv template file by using the BrownSpeech.wav file.

http://localhost:15000/action=AddTask&Type=IvSpkIdTrainAudio&InputType=Stream&Out=Brown.iv
This action uses port 15000 to instruct IDOL Speech Server, which is located on the local machine, to create the Brown.iv template file from the audio stream.
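
The example actions on this page can also be assembled programmatically. The following Python sketch only builds the request URL and checks the two requirements stated in the parameter table (Out must end in .iv, and File is required when InputType is File). The function name build_train_audio_url and its argument names are hypothetical helpers for illustration, not part of IDOL Speech Server, and the sketch assumes the action is sent as a plain HTTP GET to the host and port used in the examples.

```python
from urllib.parse import urlencode


def build_train_audio_url(out, input_type, file=None,
                          host="localhost", port=15000, **optional):
    """Build an AddTask URL for the IvSpkIdTrainAudio task.

    Hypothetical helper: only the parameter names (Type, InputType,
    File, Out, and so on) come from the parameter table.
    """
    # Out is required and must carry the .iv template extension.
    if not out.endswith(".iv"):
        raise ValueError("Out must include the .iv extension")
    # File is required only when InputType is File.
    if input_type == "File" and not file:
        raise ValueError("File is required when InputType is File")

    params = {"Type": "IvSpkIdTrainAudio", "InputType": input_type}
    if file:
        params["File"] = file
    params["Out"] = out
    # Any other table parameters (for example Sfreq) pass through as-is.
    params.update(optional)

    # Keep "/" and ":" unescaped so paths read as in the examples.
    query = urlencode(params, safe="/:")
    return f"http://{host}:{port}/action=AddTask&{query}"


file_url = build_train_audio_url(
    "Brown.iv", "File", file="C:/Data/BrownSpeech.wav")
stream_url = build_train_audio_url("Brown.iv", "Stream")
print(file_url)
print(stream_url)
```

The two printed URLs correspond to the file and stream examples. Validating Out and File before sending the request surfaces a missing required parameter on the client side rather than as a failed task.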