SegmentText

Japanese, Korean, Mandarin, and Taiwanese Mandarin do not separate words with whitespace. Text in these languages must be segmented into words before IDOL Speech Server can process it. To segment text, use the SegmentText task.

Parameters

Parameter        Description                                                             Required
Type             The task name. Set to SegmentText.                                      Yes
IgnoreHashLines  Whether to exempt sections bounded by hash symbols from segmentation.   No
Lang             The language pack to use.                                               Yes
Pgf              The pronunciation information file to use.                              Yes
TxtFileIn        The text file to segment.                                               Yes
TxtFileOut       The text file to write the segmented text to.                           Yes

Example

http://localhost:15000/action=AddTask&Type=SegmentText&Lang=JAJP&TxtFileIn=C:/Data/Japanese.txt&TxtFileOut=JA_seg.txt&Pgf=T:\LP\ENUK\ver-ENUK-5.0.pgf

This action segments text in the Japanese.txt file and writes the results to the JA_seg.txt file in the Temp directory.
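
The action URL above can also be assembled programmatically. The following Python sketch is illustrative only: the helper name build_segment_text_url is hypothetical, the percent-encoding of backslashes in file paths is an assumption (the example above sends them unencoded), and the True/False value format for IgnoreHashLines is likewise assumed rather than taken from this page. The sketch only builds the URL string; it does not contact the server.

```python
from urllib.parse import quote, urlencode


def build_segment_text_url(lang, txt_file_in, txt_file_out, pgf,
                           host="localhost", port=15000,
                           ignore_hash_lines=None):
    """Build an AddTask action URL for the SegmentText task.

    Hypothetical helper for illustration; it is not part of any IDOL SDK.
    """
    params = [
        ("action", "AddTask"),
        ("Type", "SegmentText"),
        ("Lang", lang),
        ("TxtFileIn", txt_file_in),
        ("TxtFileOut", txt_file_out),
        ("Pgf", pgf),
    ]
    # IgnoreHashLines is only sent when explicitly requested.
    # Assumption: the server accepts True/False as the value.
    if ignore_hash_lines is not None:
        params.append(("IgnoreHashLines", "True" if ignore_hash_lines else "False"))

    # Keep "/" and ":" readable in paths; other reserved characters
    # (including backslashes) are percent-encoded.
    query = urlencode(params, quote_via=quote, safe="/:")

    # The action string follows the port directly, as in the example above.
    return f"http://{host}:{port}/{query}"


if __name__ == "__main__":
    print(build_segment_text_url(
        lang="JAJP",
        txt_file_in="C:/Data/Japanese.txt",
        txt_file_out="JA_seg.txt",
        pgf="T:\\LP\\ENUK\\ver-ENUK-5.0.pgf",
    ))
```

The resulting string can be passed to any HTTP client. Building the parameters as a list keeps them in the same order as the example action, which makes the generated URL easy to compare against it.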

