Japanese, Korean, Mandarin, and Taiwanese Mandarin do not separate words with whitespace. Text in these languages must be segmented into words before HPE IDOL Speech Server can process it. You can segment text by using the SegmentText task.
Parameter | Description | Required |
---|---|---|
Type | The task name. Set to SegmentText. | Yes |
IgnoreHashLines | Whether to exempt sections bounded by hash symbols from segmentation. | |
Lang | The language pack to use. | Yes |
Pgf | The pronunciation information file to use. | Yes |
TxtFileIn | The text file to segment. | Yes |
TxtFileOut | The text file to write the segmented text to. | Yes |

For example:

http://localhost:13000/action=AddTask&Type=SegmentText&Lang=JAJP&TxtFileIn=C:/Data/Japanese.txt&TxtFileOut=JA_seg.txt&Pgf=T:\LP\ENUK\ver-ENUK-5.0.pgf

This action uses port 13000 to instruct HPE IDOL Speech Server, which is located on the local machine, to segment the text in the Japanese.txt file and write the results to the JA_seg.txt file in the Temp directory.
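
If you generate the action from a script, the request URL can be assembled from the required parameters in the table. The following Python sketch is illustrative only and is not part of the product: the helper name `build_segment_text_url` is hypothetical, and percent-encoding the backslashes in the Pgf path is an assumption about how a client would normally send such a value. It only builds the URL string and does not contact the server.

```python
from urllib.parse import quote, urlencode


def build_segment_text_url(host, port, lang, txt_file_in, txt_file_out, pgf):
    """Build an AddTask URL for the SegmentText task (hypothetical helper).

    The parameter names come from the table above; the function itself
    is an illustration, not an HPE IDOL Speech Server API.
    """
    params = {
        "Type": "SegmentText",       # the task name
        "Lang": lang,                # language pack, for example JAJP
        "TxtFileIn": txt_file_in,    # text file to segment
        "TxtFileOut": txt_file_out,  # file that receives the segmented text
        "Pgf": pgf,                  # pronunciation information file
    }
    # Keep "/" and ":" readable in paths; other reserved characters,
    # such as backslashes, are percent-encoded (assumed to be acceptable).
    query = urlencode(params, quote_via=quote, safe="/:")
    return f"http://{host}:{port}/action=AddTask&{query}"


url = build_segment_text_url(
    "localhost",
    13000,
    "JAJP",
    "C:/Data/Japanese.txt",
    "JA_seg.txt",
    r"T:\LP\ENUK\ver-ENUK-5.0.pgf",
)
print(url)
```

You can then send the resulting string with any HTTP client, for example `urllib.request.urlopen(url)`, once the server is running.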