Cluster Speech

The following schema describes how to segment an audio file into speaker clusters labelled Cluster_0, Cluster_1, and so on.

[clusterSpeech]
0 = a <- wav3(MONO, input)
1 = w1 <- audiopreproc3(A, a)
2 = f <- frontend3(_, a)
3 = nf <- normalizer3(_, f, w1)
4 = w2 <- segment3(_, nf)
5 = w3 <- splitspeech3(_, ws:w2, wc:w1, nf)
6 = output <- wout3(_, w3)
0 The wav module processes wav audio data.
1 The audiopreproc module classifies the audio (a) as Music, Speech, or Silence.
2 The frontend module converts audio data (a) into speech front-end frame data.
3 The normalizer module normalizes the frame data from step 2 (f).
4 The segment module finds short homogeneous acoustic segments.
5 The splitspeech module forms these segments into speaker clusters.
6 The wout module writes the audio speaker clusters.
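
To make the data flow between the steps easier to follow, the schema lines can be read as a small dependency graph. The Python sketch below is only an illustration and not part of the product: parse_schema is a hypothetical helper, the first argument of each module call is assumed to be a mode or option field, and a prefix such as ws: or wc: is assumed to be a label on the input stream that follows it.

```python
import re

# The step lines of the [clusterSpeech] schema, copied from the listing above.
SCHEMA = """\
0 = a <- wav3(MONO, input)
1 = w1 <- audiopreproc3(A, a)
2 = f <- frontend3(_, a)
3 = nf <- normalizer3(_, f, w1)
4 = w2 <- segment3(_, nf)
5 = w3 <- splitspeech3(_, ws:w2, wc:w1, nf)
6 = output <- wout3(_, w3)
"""

# Matches lines of the form: index = output <- module(arg, arg, ...)
STEP = re.compile(r"^(\d+)\s*=\s*(\w+)\s*<-\s*(\w+)\((.*)\)$")


def parse_schema(text):
    """Hypothetical helper: turn schema lines into a list of step records."""
    steps = []
    for line in text.splitlines():
        match = STEP.match(line.strip())
        if not match:
            continue
        index, output, module, args = match.groups()
        # Assumption: the first argument is a mode/option, the rest are inputs.
        mode, *inputs = [arg.strip() for arg in args.split(",")]
        # Assumption: "label:stream" names an input; keep only the stream name.
        streams = [item.split(":")[-1] for item in inputs]
        steps.append({
            "index": int(index),
            "output": output,
            "module": module,
            "mode": mode,
            "inputs": streams,
        })
    return steps


steps = parse_schema(SCHEMA)
for step in steps:
    sources = ", ".join(step["inputs"])
    print(f"{step['index']}: {step['module']} reads {sources} -> {step['output']}")
```

Read this way, every stream a step consumes (other than input) is produced by an earlier step. For example, splitspeech3 in step 5 consumes the segments from step 4 (w2), the audio classification from step 1 (w1), and the normalized frames from step 3 (nf).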
