StreamSidOptimize

DEPRECATED:

The StreamSidOptimize task is deprecated for HPE IDOL Server version 11.3. Use the SpkIdDevelStream and SpkIdDevelFinal tasks instead.

This task is still available for existing implementations, but it might be incompatible with new functionality. The task might be deleted in future.

The StreamSidOptimize task generates statistics that are used for determining speaker template match thresholds. It is a version of the WavSidOptimize task that reads in audio from a binary stream.

The statistics are based on analyzing the speaker match scores observed for each template against both matching speaker data (leading to true positives), and non-matching speaker data (leading to false positives).
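To illustrate the idea, the following minimal Python sketch shows how scores from matching and non-matching audio can inform a threshold for one template. It is not the algorithm that HPE IDOL Speech Server uses; the score values and the rates_at_threshold helper are hypothetical and exist only for this illustration.

```python
def rates_at_threshold(matching_scores, non_matching_scores, threshold):
    """Return (true_positive_rate, false_positive_rate) for one template.

    A score at or above the threshold counts as an accepted match.
    """
    true_positives = sum(1 for s in matching_scores if s >= threshold)
    false_positives = sum(1 for s in non_matching_scores if s >= threshold)
    return (true_positives / len(matching_scores),
            false_positives / len(non_matching_scores))


# Hypothetical scores for a single speaker template.
matching = [0.9, 0.8, 0.7, 0.4]       # audio labeled as this speaker
non_matching = [0.5, 0.3, 0.2, 0.1]   # audio labeled as another speaker

for threshold in (0.25, 0.45, 0.75):
    tpr, fpr = rates_at_threshold(matching, non_matching, threshold)
    print(f"threshold={threshold:.2f} TPR={tpr:.2f} FPR={fpr:.2f}")
```

Raising the threshold lowers the false positive rate at the cost of rejecting more genuine matches, which is why statistics from both kinds of audio are needed.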

You build up the statistics by presenting HPE IDOL Speech Server with audio labeled as being from one of the known speakers or an unknown speaker. The StreamSidOptimize task generates these statistics and stores them in a Speaker ID Optimization (.spo) file.

You must run the StreamSidOptimize task once for each audio stream. You can choose to append the scores for each audio stream to a single .spo file (the default method), or to create a separate .spo file for each audio stream and combine these at the packaging stage.
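The two approaches can be sketched as follows. This Python example only plans the parameters for one task per audio stream; it does not contact the server or supply audio. The plan_tasks helper, the per-stream file names, and the "False" value format for SpoAppend are assumptions made for illustration, not documented behavior.

```python
def plan_tasks(stream_labels, append=True, spo_name="speakers.spo"):
    """Return one AddTask parameter dict per labeled audio stream."""
    tasks = []
    for index, speaker_name in enumerate(stream_labels, start=1):
        params = {
            "Type": "StreamSidOptimize",
            "SpkList": "ListManager/speakers",
            "SpeakerName": speaker_name,
        }
        if append:
            # Default method: every stream adds its scores to one shared file.
            params["Spo"] = spo_name
        else:
            # Alternative: one file per stream, combined at the packaging stage.
            params["Spo"] = f"stream{index}.spo"   # hypothetical naming scheme
            params["SpoAppend"] = "False"          # assumed value format
        tasks.append(params)
    return tasks


# One stream from a known speaker and one from an unknown speaker.
for task in plan_tasks(["/ENUK/Bob", "Unknown_"], append=False):
    print(task)
```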

Parameters

Parameter     Description                                                                             Required
Type          The task name. Set to StreamSidOptimize.                                                Yes
Ast           The speaker classifier file.                                                            See Comments.
CompSelect    The components to use for scoring.
Diag          Whether to generate diagnostic information.
DiagFile      The file to write the diagnostic information to.
DiscardShort  Exclude segments shorter than a specific duration from further analysis.
MinNonSpeech  The minimum size in seconds of non-speech segments.
MinSpeech     The minimum size in seconds of speech segments.
Sfreq         The sample frequency of the audio file to process.
SidBase       The sid base pack resource to use to determine the base files to use.
Sig           The .sig file to use for speaker identification.
SpeakerName   The speaker label for the speaker in the audio. For unknown speakers, set to Unknown_.  Yes
SpkList       A list of speaker templates.                                                            Yes
SpkPath       The path to the directory containing the speaker templates.
SpkSegCoef    Applies a weight to bias the decision about where speaker boundaries occur.
Spo           The .spo file to create or update.                                                      Yes
SpoAppend     Whether to append match data scores to a common .spo file.
USM           The USM to use.
USMEnabled    Whether to use the USM for optimization.

Example

http://localhost:13000/action=AddTask&Type=StreamSidOptimize&SpkList=ListManager/speakers&SpkPath=C:\training&Spo=speakers.spo&SpeakerName=/ENUK/Bob

This action uses port 13000 to instruct HPE IDOL Speech Server, which is located on the local machine, to generate match statistics for the speaker /ENUK/Bob. The server checks the sample speech in the audio stream against the speaker templates specified in the speakers list, and writes the results to the speakers.spo file.
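If you assemble the action in a script, a sketch along the following lines builds the same request from a set of parameters. The build_add_task_url helper is hypothetical, and the example assumes that the server accepts percent-encoded parameter values (here the colon and backslash in the SpkPath value). It only constructs the URL and does not send the request or the audio stream.

```python
from urllib.parse import quote, urlencode


def build_add_task_url(host, port, params):
    """Build an AddTask action URL from a dict of task parameters."""
    # Percent-encode the values, but leave "/" readable as in the example.
    query = urlencode(params, quote_via=quote, safe="/")
    return f"http://{host}:{port}/action=AddTask&{query}"


params = {
    "Type": "StreamSidOptimize",
    "SpkList": "ListManager/speakers",
    "SpkPath": "C:\\training",
    "Spo": "speakers.spo",
    "SpeakerName": "/ENUK/Bob",
}

url = build_add_task_url("localhost", 13000, params)
print(url)
```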

Comments

If you do not specify the Ast parameter, the action uses the base ast file, determined by the SidBase resource. This base file does not contain any speaker information, and cannot identify speakers, but it performs gender detection and speaker segmentation.

