Speaker Identification

The following diagram shows the modules in IDOL Speech Server that enable speaker identification in a single step.

The audio module reads the audio file and prepares windowed data.

a is the audio window series, and ts is the time stamp series.

The speakerid module analyzes each window of samples and identifies those that match speakers in the speaker database.

w is the output time-marked word series.

The wout module prepares the output phrase labels and time positions for storage and result reporting.

The schema that implements this feature is:

[MySpeakerId]
a, ts ← audio (MONO, input)
w ← speakerid (_, a)
output ← wout (_, w, ts)
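
To make the data flow in this schema easier to follow, here is a minimal Python sketch of the same three-stage pipeline. It is an illustration only, not IDOL Speech Server code: the functions audio, speakerid, and wout below are hypothetical stand-ins that borrow the module names, and the matching rule (closest mean energy per window) is an assumption chosen for simplicity, not the method the server uses.

```python
# Toy illustration only: hypothetical stand-ins, not IDOL Speech Server APIs.
from typing import Dict, List, Tuple


def audio(samples: List[float], rate: int, window: int) -> Tuple[List[List[float]], List[float]]:
    """Split mono samples into fixed-size windows (a) and their start times in seconds (ts)."""
    starts = range(0, len(samples) - window + 1, window)
    a = [samples[i:i + window] for i in starts]
    ts = [i / rate for i in starts]
    return a, ts


def speakerid(speakers: Dict[str, float], a: List[List[float]]) -> List[str]:
    """Label each window with the speaker whose stored mean energy is closest (assumed rule)."""
    w = []
    for win in a:
        energy = sum(x * x for x in win) / len(win)
        w.append(min(speakers, key=lambda name: abs(speakers[name] - energy)))
    return w


def wout(w: List[str], ts: List[float]) -> List[Tuple[float, str]]:
    """Pair each label with its time position for reporting."""
    return list(zip(ts, w))


# Mirrors the schema: a, ts <- audio; w <- speakerid; output <- wout
speaker_db = {"alice": 0.01, "bob": 1.0}  # hypothetical speaker database
a, ts = audio([0.1] * 4 + [1.0] * 4, rate=4, window=4)
w = speakerid(speaker_db, a)
output = wout(w, ts)
print(output)
```

As in the schema, the time stamps produced by the first stage bypass the identification stage and are rejoined with the labels only at the output stage.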
