Align the Transcript

The transcript aligner compares the speech-to-text transcript with the original transcript text to produce an aligned transcript. The aligner either treats words as whole units or breaks them down into phonemes or letters, so you can select one of three alignment modes: words, phonemes, or letters.
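To illustrate what aligning at different granularities means, the following minimal sketch lines up a reference text with a recognized text using Python's standard difflib module. It is not the algorithm that IDOL Speech Server uses. The function name align_units, its mode argument, and the sample recognized text are all hypothetical, and phoneme alignment is left out because it would need a pronunciation dictionary.

```python
from difflib import SequenceMatcher


def align_units(reference, hypothesis, mode="words"):
    """Align two texts as sequences of words or letters (illustrative only).

    Returns a list of (tag, reference_units, hypothesis_units) tuples, where
    tag is one of 'equal', 'replace', 'delete', or 'insert'.
    """
    if mode == "words":
        ref, hyp = reference.split(), hypothesis.split()
    elif mode == "letters":
        # Drop spaces so that only the letters themselves are compared.
        ref = list(reference.replace(" ", ""))
        hyp = list(hypothesis.replace(" ", ""))
    else:
        raise ValueError("this sketch supports only 'words' and 'letters'")

    matcher = SequenceMatcher(a=ref, b=hyp, autojunk=False)
    return [
        (tag, ref[i1:i2], hyp[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
    ]


if __name__ == "__main__":
    original = "behind it all Teaism was Taoism in disguise"
    recognized = "behind it all tea is was Taoism in disguise"  # invented sample
    for step in align_units(original, recognized, mode="words"):
        print(step)
```

In words mode a misrecognized word shows up as a single replaced unit, whereas letters mode can still match the parts of the word that were recognized correctly.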

The alignment algorithm can also work in one of two polarity modes:

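To run the transcript alignment task, send an AddTask action with the Type parameter set to TranscriptAlign, and specify the original transcript (TxtFile), the speech-to-text transcript (CtmFile), the output file (Out), and the alignment mode (MatchType).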

For example:

http://localhost:13000/action=AddTask&Type=TranscriptAlign&TxtFile=C:\data\transcript.txt&CtmFile=C:\misc\speechtext.ctm&Out=AlignedTranscript.ctm&MatchType=words

This action uses port 13000 to instruct IDOL Speech Server, which is located on the local machine, to compare the original transcript transcript.txt with the speech-to-text transcript speechtext.ctm to produce an aligned transcript, AlignedTranscript.ctm. The action instructs IDOL Speech Server to use the words alignment mode.
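The same request can be assembled programmatically. The following is a minimal sketch, assuming that the parameters shown in the example are sufficient for the task. The helper name build_align_action is hypothetical, and colons, slashes, and backslashes are left unescaped only so that the result matches the example URL.

```python
from urllib.parse import quote

# Characters left unescaped so that Windows paths look like the documented example.
SAFE_CHARS = ":\\/"


def build_align_action(host, port, txt_file, ctm_file, out, match_type="words"):
    """Build an AddTask URL for the TranscriptAlign task (illustrative helper)."""
    params = [
        ("Type", "TranscriptAlign"),
        ("TxtFile", txt_file),     # original transcript text
        ("CtmFile", ctm_file),     # speech-to-text transcript
        ("Out", out),              # aligned transcript to write
        ("MatchType", match_type), # alignment mode
    ]
    query = "&".join(f"{name}={quote(value, safe=SAFE_CHARS)}" for name, value in params)
    return f"http://{host}:{port}/action=AddTask&{query}"


if __name__ == "__main__":
    print(build_align_action(
        "localhost", 13000,
        r"C:\data\transcript.txt",
        r"C:\misc\speechtext.ctm",
        "AlignedTranscript.ctm",
    ))
```

The sketch only builds the URL string. Sending it and handling the response are left to whichever HTTP client you already use.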

The output file is in the following format:

1 A 0.000 0.420 behind 1.000
1 A 0.420 7.790 it 1.000
1 A 8.210 2.870 all 1.000
1 A 11.080 0.000 <s> 1.000
1 A 11.080 0.000 Teaism 1.000
1 A 11.080 0.000 was 1.000
1 A 11.080 0.000 Taoism 1.000
1 A 11.080 0.000 in 1.000
1 A 11.080 0.000 disguise 1.000
1 A 11.080 0.000 <s> 1.000
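The sample output above can be read with a short script. The following is a minimal sketch, assuming the conventional CTM column order: an identifier, a channel label, the start time in seconds, the duration in seconds, the word, and a confidence score. The timings in the sample are consistent with this reading, because each start time equals the previous start time plus the previous duration. The names CtmEntry and parse_ctm are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CtmEntry:
    source_id: str     # first column (assumed to be an identifier)
    channel: str       # second column (assumed to be a channel label)
    start: float       # start time in seconds
    duration: float    # duration in seconds
    word: str          # aligned word or marker such as <s>
    confidence: float  # last column (assumed to be a confidence score)

    @property
    def end(self):
        """End time in seconds, derived from start and duration."""
        return self.start + self.duration


def parse_ctm(text):
    """Parse whitespace-separated six-column lines, skipping any other lines."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 6:
            continue
        source_id, channel, start, duration, word, confidence = fields
        entries.append(CtmEntry(source_id, channel, float(start),
                                float(duration), word, float(confidence)))
    return entries


SAMPLE = """\
1 A 0.000 0.420 behind 1.000
1 A 0.420 7.790 it 1.000
1 A 8.210 2.870 all 1.000
1 A 11.080 0.000 <s> 1.000
"""

if __name__ == "__main__":
    for entry in parse_ctm(SAMPLE):
        print(entry.word, entry.start, entry.end)
```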

From left to right, the columns in the output data file contain:

This action returns a token. You can use the token to:

