Cluster Speech

The following schema describes how to segment an audio file into speaker clusters labeled Cluster_0, Cluster_1, and so on.

[ClusterSpeech]
0 = a <- audio(MONO, input)
1 = f1 <- frontend1(_, a)
2 = nf1 <- normalizer1(_, f1)
3 = w1 <- audiopreproc(A, a, nf1)
4 = f2 <- frontend2(_, a)
5 = nf2 <- normalizer2(_, f2, w1)
6 = w2 <- segment(_, nf2)
7 = w3 <- splitspeech(_, ws:w2, wc:w1, nf2)
8 = output <- wout(_, w3)
0 The audio module reads the mono input audio data (a).
1 The frontend1 module converts audio data (a) into speech front-end frame data.
2 The normalizer1 module normalizes frame data from 1 (f1).
3 The audiopreproc module classifies the audio (a), using the normalized frame data (nf1), as Music, Speech, or Silence.
4 The frontend2 module converts the audio data from 0 (a) into speech front-end frame data.
5 The normalizer2 module normalizes frame data from 4 (f2).
6 The segment module finds short homogeneous acoustic segments.
7 The splitspeech module forms the acoustic segments into speaker clusters.
8 The wout module writes the audio speaker clusters to a file.
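The wiring above can be sketched as ordinary function composition. The following is a minimal, hypothetical Python sketch: every module body is a placeholder (the real front-ends, normalizers, and clustering are toolkit-specific), and only the dataflow between steps mirrors the schema.

```python
# Hypothetical sketch of the ClusterSpeech dataflow.
# Module bodies are placeholders; only the wiring follows the schema.

def audio(mode, path):            # step 0: read mono input audio
    return {"mode": mode, "path": path}

def frontend1(a):                 # step 1: audio -> front-end frames
    return {"frames": "f1"}

def normalizer1(f1):              # step 2: normalize frames from step 1
    return {"norm": f1}

def audiopreproc(flag, a, nf1):   # step 3: label Music/Speech/Silence
    return {"labels": ["Speech"]}

def frontend2(a):                 # step 4: second front-end pass over audio
    return {"frames": "f2"}

def normalizer2(f2, w1):          # step 5: normalize frames from step 4,
    return {"norm": f2}           #         informed by the step-3 labels

def segment(nf2):                 # step 6: short homogeneous segments
    return {"segments": "s"}

def splitspeech(ws, wc, nf2):     # step 7: segments -> speaker clusters
    return {"clusters": ["Cluster_0", "Cluster_1"]}

def wout(w3):                     # step 8: write clusters to output
    return w3["clusters"]

def cluster_speech(path):
    a = audio("MONO", path)
    f1 = frontend1(a)
    nf1 = normalizer1(f1)
    w1 = audiopreproc("A", a, nf1)
    f2 = frontend2(a)
    nf2 = normalizer2(f2, w1)
    w2 = segment(nf2)
    w3 = splitspeech(ws=w2, wc=w1, nf2=nf2)
    return wout(w3)
```

Calling `cluster_speech("input.wav")` runs steps 0 through 8 in order and returns the cluster labels; in the real pipeline, step 8 would write the clustered audio to a file instead.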
