The following schema describes language identification in CUMULATIVE
mode.
[langId]
0 = a,ts <- audio(MONO, input)
1 = f1 <- frontend1(_, a)
2 = nf1 <- normalizer1(_, f1)
3 = w <- audiopreproc(A, a, nf1)
4 = f2 <- frontend2(_, a)
5 = nf2 <- normalizer2(_, f2, w)
6 = lf1 <- lidfeature(_, nf2)
7 = lf2 <- filter(LF_INCLUSIVE, lf1, w)
8 = lid <- langid(_, lf2)
9 = output <- lidout(_, lid, ts)
DefaultResults=out
| Step | Description |
|------|-------------|
| 0 | The audio module processes the mono audio data. |
| 1 | The frontend1 module converts the audio data from 0 (a) into speech front-end frame data. |
| 2 | The normalizer1 module normalizes the frame data from 1 (f1). |
| 3 | The audiopreproc module in audio classification mode processes the audio (a) and the normalized frame data (nf1). |
| 4 | The frontend2 module converts the audio data from 0 (a) into speech front-end frame data. |
| 5 | The normalizer2 module normalizes the frame data from 4 (f2), adapted according to the audio classification from 3 (w). |
| 6 | The lidfeature module converts the normalized frame data from 5 (nf2) into language identification feature data. |
| 7 | The filter module filters the output from 6 (lf1) to include only language features that occur in segments that contain speech. |
| 8 | The langid module processes the language identification feature data from 7 (lf2) to identify the language. |
| 9 | The language identification information (lid) is written to the output file. |
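The data flow in the steps above can be checked mechanically. The following Python sketch is not part of the product. It is a hypothetical helper that parses the numbered schema lines and verifies that every stream a step reads was produced by an earlier step. It assumes that the first argument of each module call is the mode (`_` for the default) and that `input` is the only externally supplied name. The `[langId]` header and the `DefaultResults` line are left out of the parsed text.

```python
import re

# The numbered steps of the schema, copied from the listing above.
SCHEMA = """\
0 = a,ts <- audio(MONO, input)
1 = f1 <- frontend1(_, a)
2 = nf1 <- normalizer1(_, f1)
3 = w <- audiopreproc(A, a, nf1)
4 = f2 <- frontend2(_, a)
5 = nf2 <- normalizer2(_, f2, w)
6 = lf1 <- lidfeature(_, nf2)
7 = lf2 <- filter(LF_INCLUSIVE, lf1, w)
8 = lid <- langid(_, lf2)
9 = output <- lidout(_, lid, ts)
"""

# Matches "N = out1,out2 <- module(mode, in1, in2)".
STEP_RE = re.compile(r"^(\d+)\s*=\s*([\w,\s]+?)\s*<-\s*(\w+)\((.*)\)\s*$")


def parse_schema(text):
    """Parse schema lines into a list of step dictionaries (hypothetical helper)."""
    steps = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        match = STEP_RE.match(line)
        if match is None:
            raise ValueError(f"unrecognized schema line: {line!r}")
        index, outputs, module, args = match.groups()
        # Assumption: the first argument is the mode, the rest are input streams.
        mode, *inputs = [arg.strip() for arg in args.split(",")]
        steps.append({
            "index": int(index),
            "outputs": [name.strip() for name in outputs.split(",")],
            "module": module,
            "mode": mode,
            "inputs": inputs,
        })
    return steps


def check_dataflow(steps, external=("input",)):
    """Raise ValueError if a step reads a stream that no earlier step produced."""
    known = set(external)
    for step in steps:
        missing = [name for name in step["inputs"] if name not in known]
        if missing:
            raise ValueError(
                f"step {step['index']} uses undefined streams: {missing}"
            )
        known.update(step["outputs"])
    return known


if __name__ == "__main__":
    steps = parse_schema(SCHEMA)
    check_dataflow(steps)
    for step in steps:
        print(step["index"], step["module"], step["inputs"], "->", step["outputs"])
```

A check of this kind catches mislabeled streams, for example a step that reads `f` where the schema only defines `f1` and `f2`.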