Perform Speaker Identification Using Templates

The following schema describes how to run speaker identification given a set of templates.

[ivSpkId]
0 = a,ts <- audio(MONO, input)
1 = w1 <- speakerid(GENDER_DETECT, a)
2 = f1 <- frontend1(_,a)
3 = nf1 <- normalizer1(_, f1)
4 = w2 <- audiopreproc(A, a, nf1)
5 = f2 <- frontend2(_,a)
6 = nf2 <- normalizer2(SEGMENT_BLOCK, f2, w1)
7 = nf3 <- filter(FEAT_INCLUSIVE, nf2, w2)
8 = sid <- ivScore(SEGMENT, nf3, w1)
9 = output <- sidout(_, sid, ts)
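Each schema line follows the pattern step = outputs <- module(MODE, inputs): a step number, the variables the step produces, the module to run, the module mode, and the module inputs. As a rough illustration of how this notation reads, the following Python sketch (hypothetical, not part of the product) parses schema lines into a simple step table:

import re

# Matches schema lines such as: 3 = nf1 <- normalizer1(_, f1)
SCHEMA_LINE = re.compile(
    r"^\s*(?P<step>\d+)\s*=\s*(?P<outputs>[\w,\s]+?)\s*<-\s*"
    r"(?P<module>\w+)\((?P<args>[^)]*)\)\s*$"
)

def parse_schema(text):
    """Parse schema lines into (step, outputs, module, mode, inputs) tuples."""
    steps = []
    for line in text.splitlines():
        m = SCHEMA_LINE.match(line)
        if not m:
            continue  # skip section headers such as [ivSpkId] and blank lines
        outputs = [o.strip() for o in m.group("outputs").split(",")]
        args = [a.strip() for a in m.group("args").split(",")]
        # The first argument is the module mode; the rest are data inputs
        steps.append((int(m.group("step")), outputs, m.group("module"), args[0], args[1:]))
    return steps

for row in parse_schema("0 = a,ts <- audio(MONO, input)\n8 = sid <- ivScore(SEGMENT, nf3, w1)"):
    print(row)  # e.g. (0, ['a', 'ts'], 'audio', 'MONO', ['input'])

The numbered notes below describe what each step does.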
0 The audio module processes the mono audio data, producing the audio stream (a) and timing information (ts).
1 The speakerid module, running in gender detection mode, takes the audio data (a) and outputs speaker turn segments (w1).
2 The frontend1 module converts the audio data from 0 into front-end frame data (f1).
3 The normalizer1 module normalizes the frame data from 2.
4 The audiopreproc module, in audio classification mode, processes the audio (a) and the normalized frame data (nf1), producing audio classification data (w2).
5 The frontend2 module converts the audio data from 0 into front-end frame data.
6 The normalizer2 module normalizes the frame data from 5, using the speaker turn segments from 1 (w1).
7 The filter module filters the normalized frame data from 6 (nf2), using the audio classification data from 4 (w2), to keep only the frames that occur in segments containing speech.
8 The ivScore module takes the audio features from 7 (nf3) and the speaker segment information (w1), and produces a set of iVector speaker scores for each segment (a minimal scoring sketch follows this list).
9 The sidout module takes the speaker ID score information (sid) and the timing information (ts), and writes this information to a results file.
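The schema above is the configuration itself; the scoring it drives follows a standard iVector approach, in which each segment's i-vector is compared against each enrolled speaker template. The following sketch (plain NumPy, illustrative only; it does not reproduce the internals of the ivScore module) shows cosine scoring of segment i-vectors against a template set:

import numpy as np

def cosine_scores(segment_ivectors, templates):
    """Cosine-similarity scores between segment i-vectors and templates.

    segment_ivectors: (num_segments, dim) array, one i-vector per speech segment
    templates:        (num_speakers, dim) array, one enrolled template per speaker
    Returns a (num_segments, num_speakers) score matrix.
    """
    # Length-normalize so that a plain dot product equals cosine similarity
    seg = segment_ivectors / np.linalg.norm(segment_ivectors, axis=1, keepdims=True)
    tpl = templates / np.linalg.norm(templates, axis=1, keepdims=True)
    return seg @ tpl.T

# Example: 3 segments scored against 2 enrolled speakers (random stand-in data)
rng = np.random.default_rng(0)
scores = cosine_scores(rng.normal(size=(3, 100)), rng.normal(size=(2, 100)))
print(scores.argmax(axis=1))  # index of the best-matching template per segment

A per-segment score matrix of this kind roughly corresponds to the sid output that sidout writes to the results file.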
