Per-Language Sentence-Breaking Files

For languages in which words are not delimited by spaces (Japanese, Chinese, Thai, and Korean), the IDOL Content component uses sentence-breaking libraries. In a default IDOL Content component installation, these files are stored in the IDOL/langfiles directory.

If you run Content on a UNIX platform, set the LD_LIBRARY_PATH environment variable so that Content can find the sentence-breaking files that it requires.
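
As a minimal sketch of the UNIX step above, the following Python helper builds an environment in which the langfiles directory is prepended to LD_LIBRARY_PATH. The function name with_library_path and the path /opt/IDOL/langfiles are assumptions made for illustration; substitute the langfiles directory of your own installation.

```python
import os


def with_library_path(env, langfiles_dir):
    """Return a copy of env with langfiles_dir prepended to LD_LIBRARY_PATH.

    The original mapping is not modified. Existing entries are kept, and
    langfiles_dir is not added a second time if it is already present.
    """
    new_env = dict(env)
    current = new_env.get("LD_LIBRARY_PATH", "")
    # LD_LIBRARY_PATH is a colon-separated list of directories on UNIX.
    parts = [p for p in current.split(":") if p]
    if langfiles_dir not in parts:
        parts.insert(0, langfiles_dir)
    new_env["LD_LIBRARY_PATH"] = ":".join(parts)
    return new_env


if __name__ == "__main__":
    # Hypothetical install location; adjust to your own system.
    env = with_library_path(os.environ, "/opt/IDOL/langfiles")
    print(env["LD_LIBRARY_PATH"])
```

The returned dictionary can be passed as the env argument of subprocess.run when a script starts Content, so the setting applies to that process only and the parent shell stays unchanged.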

The following tables list the files that each language requires on each platform.

  • Japanese

    NT                      UNIX
    japanesebreaking.dll    japanesebreaking.so
    \jpn-cha\cforms.cha     /jpn-cha/cforms.cha
    \jpn-cha\chadic.da      /jpn-cha/chadic.da
    \jpn-cha\chadic.lex     /jpn-cha/chadic.lex
    \jpn-cha\chasenrc       /jpn-cha/chasenrc
    \jpn-cha\connect.cha    /jpn-cha/connect.cha
    \jpn-cha\ctypes.cha     /jpn-cha/ctypes.cha
    \jpn-cha\grammar.cha    /jpn-cha/grammar.cha
    \jpn-cha\matrix.cha     /jpn-cha/matrix.cha
    \jpn-cha\table.cha      /jpn-cha/table.cha
    libchasen.dll

  • Traditional Chinese

    NT                      UNIX
    chinesebreaking.dll     chinesebreaking.so
    big5togb.txt            big5togb.txt
    wordlist.txt            wordlist.txt
    chineseconvlist.txt     chineseconvlist.txt

  • Simplified Chinese

    NT                      UNIX
    chinesebreaking.dll     chinesebreaking.so
    big5togb.txt            big5togb.txt
    wordlist.txt            wordlist.txt
    chineseconvlist.txt     chineseconvlist.txt

  • Thai

    NT                      UNIX
    thaibreaking.dll        thaibreaking.so
    thaidict.txt            thaidict.txt
    thaiconvlist.txt        thaiconvlist.txt

  • Korean

    NT                      UNIX
    koreanbreaking.dll      koreanbreaking.so
    main.dat                main.dat
    prob.dat                prob.dat
    main.fst                main.fst
    prob.fst                prob.fst
    pos.nam                 pos.nam
    tag.nam                 tag.nam
    tagout.nam              tagout.nam
    connection.txt          connection.txt
    StopPosNam.txt          StopPosNam.txt
    TagName.txt             TagName.txt
    koreanconvlist.txt      koreanconvlist.txt
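
The file lists above lend themselves to a simple installation check. The following Python sketch reports which UNIX files are missing for a given language. It assumes that the files sit directly in the langfiles directory and that jpn-cha is a subdirectory of it; the names REQUIRED_UNIX and missing_files are hypothetical and are not part of IDOL.

```python
from pathlib import Path

# UNIX file names taken from the tables above. Traditional and Simplified
# Chinese share one list, so both are covered by the "chinese" key.
REQUIRED_UNIX = {
    "japanese": ["japanesebreaking.so"] + [
        "jpn-cha/" + name
        for name in (
            "cforms.cha", "chadic.da", "chadic.lex", "chasenrc",
            "connect.cha", "ctypes.cha", "grammar.cha", "matrix.cha",
            "table.cha",
        )
    ],
    "chinese": [
        "chinesebreaking.so", "big5togb.txt", "wordlist.txt",
        "chineseconvlist.txt",
    ],
    "thai": ["thaibreaking.so", "thaidict.txt", "thaiconvlist.txt"],
    "korean": [
        "koreanbreaking.so", "main.dat", "prob.dat", "main.fst", "prob.fst",
        "pos.nam", "tag.nam", "tagout.nam", "connection.txt",
        "StopPosNam.txt", "TagName.txt", "koreanconvlist.txt",
    ],
}


def missing_files(langfiles_dir, language):
    """Return the required files for language that are absent from langfiles_dir."""
    base = Path(langfiles_dir)
    return [
        name for name in REQUIRED_UNIX[language]
        if not (base / name).is_file()
    ]


if __name__ == "__main__":
    # Hypothetical install location; adjust to your own system.
    for language in REQUIRED_UNIX:
        print(language, missing_files("/opt/IDOL/langfiles", language))
```

An empty list for a language means that every file listed for it was found. The check looks only at file presence and does not confirm that Content can load the libraries.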