SegmentText

Japanese, Korean, Mandarin, and Taiwanese Mandarin do not separate words with whitespace. Text in these languages must be segmented into words before IDOL Speech Server can process it. To segment text, use the SegmentText task.

Parameters

Parameter        Description                                                            Required
Type             The task name. Set to SegmentText.                                     Yes
IgnoreHashLines  Whether to exempt sections bounded by hash symbols from segmentation.
Lang             The language pack to use.                                              Yes
Pgf              The pronunciation information file to use.                             Yes
TxtFileIn        The text file to segment.                                              Yes
TxtFileOut       The text file to write the segmented text to.                          Yes

Example

http://localhost:13000/action=AddTask&Type=SegmentText&Lang=JAJP&TxtFileIn=C:/Data/Japanese.txt&TxtFileOut=JA_seg.txt&Pgf=T:\LP\ENUK\ver-ENUK-5.0.pgf

This action uses port 13000 to instruct IDOL Speech Server, which is located on the local machine, to segment the text in the Japanese.txt file, using the JAJP language pack and the specified pronunciation information file, and to write the results to the JA_seg.txt file in the Temp directory.
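
When the action is sent from a script, file paths that contain colons, slashes, or backslashes usually need to be percent-encoded. The following Python sketch shows one way to assemble the same AddTask request from the parameters listed above. The helper name build_segment_text_url is hypothetical, and the True/False spelling used for IgnoreHashLines is an assumption rather than something this page confirms. The sketch only builds the URL and does not contact the server.

```python
from urllib.parse import urlencode, quote


def build_segment_text_url(host, port, lang, pgf, txt_file_in, txt_file_out,
                           ignore_hash_lines=None):
    """Build an AddTask URL for the SegmentText task (hypothetical helper)."""
    # Required parameters, as listed in the Parameters table.
    params = {
        "Type": "SegmentText",
        "Lang": lang,
        "Pgf": pgf,
        "TxtFileIn": txt_file_in,
        "TxtFileOut": txt_file_out,
    }
    # Optional parameter. The "True"/"False" value format is an assumption.
    if ignore_hash_lines is not None:
        params["IgnoreHashLines"] = "True" if ignore_hash_lines else "False"

    # Percent-encode each value so that ':', '/', and '\' in file paths
    # cannot be misread as URL syntax.
    query = urlencode(params, quote_via=quote)
    return f"http://{host}:{port}/action=AddTask&{query}"


# Same values as the example action above.
url = build_segment_text_url(
    host="localhost",
    port=13000,
    lang="JAJP",
    pgf=r"T:\LP\ENUK\ver-ENUK-5.0.pgf",
    txt_file_in="C:/Data/Japanese.txt",
    txt_file_out="JA_seg.txt",
)
print(url)
```

The resulting string can be pasted into a browser or passed to an HTTP client such as urllib.request.urlopen.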

