Japanese, Korean, Mandarin, and Taiwanese Mandarin do not separate words with whitespace. Text in these languages must be segmented into words before IDOL Speech Server can process it. You can segment text by using the SegmentText task.
Parameter | Description | Required |
---|---|---|
Type | The task name. Set to SegmentText. | Yes |
IgnoreHashLines | Whether to exempt sections bounded by hash symbols from segmentation. | No |
Lang | The language pack to use. | Yes |
Pgf | The pronunciation information file to use. | Yes |
TxtFileIn | The text file to segment. | Yes |
TxtFileOut | The text file to write the segmented text to. | Yes |
http://localhost:15000/action=AddTask&Type=SegmentText&Lang=JAJP&TxtFileIn=C:/Data/Japanese.txt&TxtFileOut=JA_seg.txt&Pgf=T:\LP\ENUK\ver-ENUK-5.0.pgf
This action segments the text in the Japanese.txt file and writes the results to the JA_seg.txt file in the Temp directory.
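When the action is sent from a script rather than typed into a browser, file paths that contain backslashes or spaces need to be percent-encoded. The following is a minimal sketch of one way to assemble the same AddTask URL in Python. The helper name `build_segment_text_url` is hypothetical, the host and port are taken from the example above, and the sketch only builds the URL string; it does not contact the server.

```python
from urllib.parse import quote, urlencode


def build_segment_text_url(host, port, lang, txt_file_in, txt_file_out, pgf):
    """Build an AddTask URL for the SegmentText task (hypothetical helper).

    Only the required parameters from the table above are included.
    """
    params = {
        "Type": "SegmentText",
        "Lang": lang,
        "TxtFileIn": txt_file_in,
        "TxtFileOut": txt_file_out,
        "Pgf": pgf,
    }
    # Percent-encode values; keep "/" and ":" readable, encode "\" and spaces.
    query = urlencode(params, quote_via=quote, safe="/:")
    return f"http://{host}:{port}/action=AddTask&{query}"


# Same values as the example action above.
url = build_segment_text_url(
    host="localhost",
    port=15000,
    lang="JAJP",
    txt_file_in="C:/Data/Japanese.txt",
    txt_file_out="JA_seg.txt",
    pgf=r"T:\LP\ENUK\ver-ENUK-5.0.pgf",
)
print(url)
```

The resulting string can be passed to any HTTP client. Optional parameters such as IgnoreHashLines are left out here because the accepted value format is not shown in this section.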