Whether to tokenize all strings into N-grams, or only multi-byte strings. Use this parameter in combination with the NGram parameter, which determines the size of character N-grams.
For example, if you set NGramMultiByteOnly to True and a document contains both English and Asian text, HPE Category Component tokenizes the Asian text into N-grams according to the NGram setting. It does not tokenize the English text.
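The behavior described above can be illustrated with a short Python sketch. This is not the HPE Category Component implementation. The function names (`tokenize`, `char_ngrams`, `is_multibyte`) are hypothetical, and the sketch assumes that "multi-byte" means any character that takes more than one byte in UTF-8 and that text is first split on whitespace.

```python
import re

def is_multibyte(ch: str) -> bool:
    # Assumption: a character is "multi-byte" if its UTF-8 encoding
    # is longer than one byte (that is, it is not plain ASCII).
    return len(ch.encode("utf-8")) > 1

def char_ngrams(s: str, n: int) -> list[str]:
    # Overlapping character N-grams; strings no longer than n are kept whole.
    if len(s) <= n:
        return [s]
    return [s[i:i + n] for i in range(len(s) - n + 1)]

def tokenize(text: str, ngram: int, multibyte_only: bool) -> list[str]:
    tokens: list[str] = []
    for word in text.split():
        if not multibyte_only:
            # Mirrors NGramMultiByteOnly=False: every string becomes N-grams.
            tokens.extend(char_ngrams(word, ngram))
            continue
        # Mirrors NGramMultiByteOnly=True: split the word into runs of
        # non-ASCII and ASCII characters, and N-gram only the non-ASCII runs.
        for run in re.findall(r"[^\x00-\x7f]+|[\x00-\x7f]+", word):
            if is_multibyte(run[0]):
                tokens.extend(char_ngrams(run, ngram))
            else:
                tokens.append(run)
    return tokens
```

With `ngram=2`, calling `tokenize("search 東京都", 2, True)` leaves `search` intact and splits only the Japanese string into two-character N-grams, whereas passing `False` splits both strings.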
Type: | Boolean |
Default: | False |
Required: | No |
Configuration Section: | LanguageTypes or MyLanguage |
Example: | NGram=2
NGramMultiByteOnly=True |
See Also: | NGram |