Language Identification Feature Extraction

The following schema describes how to create language identification feature files, which you can use to build and optimize language identification classifiers.

[langIdFeature]
0 = a      <- audio(MONO, input)
1 = f1     <- frontend1(_, a)
2 = nf1    <- normalizer1(_, f1)
3 = w      <- audiopreproc(A, a, nf1)
4 = f2     <- frontend2(_, a)
5 = nf2    <- normalizer2(_, f2, w)
6 = lf1    <- lidfeature(_, nf2)
7 = lf2    <- filter(LF_INCLUSIVE, lf1, w)
8 = output <- lfout(_, lf2)
DefaultResults=out
0 The audio module processes the mono audio data.
1 The frontend1 module converts the audio data from 0 into front-end frame data.
2 The normalizer1 module normalizes the frame data from 1.
3 The audiopreproc module runs audio classification on the audio (a) and the normalized frame data from 2 (nf1), producing the audio classification data (w).
4 The frontend2 module converts the audio data from 0 into front-end frame data.
5 The normalizer2 module normalizes the frame data from 4 (f2), using the audio classification data from 3 (w).
6 The lidfeature module converts the normalized frame data from 5 (nf2) into language identification feature data.
7 The filter module filters the output from 6 (lf1), using the audio classification data (w), to include only language features that occur in segments that contain speech.
8 The lfout module writes the language identification feature data from 7 (lf2) to the output file.
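
To make the dataflow easier to follow, the following is a minimal conceptual sketch in Python of the same pipeline shape: frame the audio, normalize the frames, classify speech activity, extract per-frame features, and keep only the features that fall in speech segments. All function names, parameters, and feature computations in the sketch are illustrative assumptions, not the behavior of the real modules.

# A minimal, hypothetical sketch (plain Python, no product API) of the dataflow
# that the schema describes. Every function name, parameter, and feature
# computation below is an illustrative assumption, not the real module behavior.

def frontend(audio, frame_size=160):
    """Split mono samples into fixed-size frames (stand-in for frontend1/frontend2)."""
    return [audio[i:i + frame_size]
            for i in range(0, len(audio) - frame_size + 1, frame_size)]

def normalize(frames):
    """Remove the mean from each frame (stand-in for normalizer1/normalizer2)."""
    return [[x - sum(frame) / len(frame) for x in frame] for frame in frames]

def classify_speech(frames, energy_threshold=0.01):
    """Flag each frame as speech or non-speech by energy (stand-in for audiopreproc)."""
    return [sum(x * x for x in frame) / len(frame) > energy_threshold for frame in frames]

def lid_features(frames):
    """Reduce each frame to a toy feature vector (stand-in for lidfeature)."""
    return [(min(frame), max(frame), sum(frame) / len(frame)) for frame in frames]

def keep_speech_only(features, speech_flags):
    """Keep features only for frames flagged as speech (LF_INCLUSIVE-style filter)."""
    return [feat for feat, is_speech in zip(features, speech_flags) if is_speech]

# Toy end-to-end run mirroring steps 0-8: silence followed by a "speech" burst.
audio = [0.0] * 800 + [0.5, -0.5] * 400
speech_flags = classify_speech(normalize(frontend(audio)))     # steps 1-3
features = lid_features(normalize(frontend(audio)))            # steps 4-6
kept = keep_speech_only(features, speech_flags)                # step 7
print(f"kept {len(kept)} of {len(features)} feature frames")   # step 8: write/report output

Unlike this simplified sketch, the schema itself runs two separate front-end/normalizer chains: steps 1-3 feed the audio classification, while steps 4-6, where normalizer2 also takes the classification data (w), feed the language identification feature extraction.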
