LanguageModelBuild

The LanguageModelBuild task builds a new language model from a set of text files.

Parameters

| Parameter | Description | Required |
|---|---|---|
| Type | The task name. Set to LanguageModelBuild. | Yes |
| BaseDictionary | The base dictionary for the language model. | |
| BuildLabel | The build label to use for the language model. | |
| ContentDatabase | The IDOL Content component database to use to retrieve training text. | |
| ContentHost | The host name or IP address of the IDOL Content component to retrieve training text from. | |
| ContentPort | The ACI port of the IDOL Content component to retrieve training text from. | |
| ContentTextTag | The IDOL fields to retrieve text data from. | |
| DataList | The list of training text files. | Yes |
| DataPath | The path to the directory containing the training text files listed in DataList. | Yes |
| DiagFile | The file to write the diagnostic information to. | |
| DiagLevel | The level of detail to include in the diagnostic information. | |
| DoDctGen | Whether to generate a dictionary. | |
| DoNorm | Whether to perform text normalization. | |
| DoSmoothing | Whether to enable smoothing. | |
| DoSegment | Whether to segment text. | |
| DropList | A list of words to exclude from the vocabulary of the custom language model. | |
| KeepList | A list of words that must appear in the vocabulary of the custom language model. | |
| KeepTemp | Whether to keep the temporary text files for diagnostics. | |
| Lang | The language pack to use as a foundation. | Yes |
| Log | The name of the log file to write. | |
| NewDictionary | The dictionary to generate. | Yes |
| NewLanguageModel | The custom language model to generate. | Yes |
| NewLMInfoFile | The Language Model Information file to generate. | |
| VocabSize | The maximum size of the vocabulary to include in the custom language model. | |
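
The Required column can be checked on the client side before a task is submitted. The following Python sketch is illustrative only: the helper name missing_required and the dictionary layout are assumptions made for this example and are not part of HPE IDOL Speech Server. It compares parameter names exactly as they are written in the table.

```python
# Parameters marked "Yes" in the Required column of the parameter table.
REQUIRED_PARAMETERS = (
    "Type",
    "DataList",
    "DataPath",
    "Lang",
    "NewDictionary",
    "NewLanguageModel",
)


def missing_required(params):
    """Return the required parameter names that are absent or empty in params.

    Hypothetical helper for illustration; names are matched exactly as
    written in the table (an assumption of this sketch).
    """
    return [name for name in REQUIRED_PARAMETERS if not params.get(name)]


# An incomplete parameter set: DataPath and NewDictionary are not supplied.
example = {
    "Type": "LanguageModelBuild",
    "DataList": "ListManager/Langmodel",
    "Lang": "ENUK-tel",
    "NewLanguageModel": "mymodel",
}

print(missing_required(example))  # ['DataPath', 'NewDictionary']
```

A check like this only catches omissions; it does not validate the values themselves.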

Example

http://localhost:13000/action=AddTask&Type=LanguageModelBuild&DataList=ListManager/Langmodel&DataPath=C:\LanguageModelFiles&Lang=ENUK-tel&NewLanguageModel=mymodel&NewDictionary=mymodel&DoSmoothing=False

This action uses port 13000 to instruct HPE IDOL Speech Server, which is located on the local machine, to build a new language model and dictionary file, both named mymodel. The build uses the training text specified in the Langmodel list, with the text files located in C:\LanguageModelFiles, and the ENUK-tel language pack. Smoothing is disabled (DoSmoothing=False). The action also calculates a recommended interpolation weight at the end of the language model building process.

NOTE:

The interpolation weight is only a suggestion; you can choose to set a different weight.

The new language models are placed in the custom language models folder, as specified in the HPE IDOL Speech Server configuration file.
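
The example action can also be assembled programmatically. The following Python sketch builds the same AddTask URL from a dictionary of parameters. The function name build_add_task_url is hypothetical, and the sketch assumes that the server accepts percent-encoded parameter values, which this page does not confirm. It only constructs the URL string and does not send the request.

```python
from urllib.parse import unquote, urlencode


def build_add_task_url(host, port, params):
    """Build an AddTask action URL in the form used by the example.

    Hypothetical helper for illustration. Values are percent-encoded, so
    characters such as ':', '\\' and '/' are escaped in the result.
    """
    query = urlencode(params)
    return f"http://{host}:{port}/action=AddTask&{query}"


# Parameter values taken from the example action.
task_params = {
    "Type": "LanguageModelBuild",
    "DataList": "ListManager/Langmodel",
    "DataPath": "C:\\LanguageModelFiles",
    "Lang": "ENUK-tel",
    "NewLanguageModel": "mymodel",
    "NewDictionary": "mymodel",
    "DoSmoothing": "False",
}

url = build_add_task_url("localhost", 13000, task_params)
print(url)           # percent-encoded form
print(unquote(url))  # decoded form, for comparison with the example action
```

Decoding the generated URL gives the same string as the example action, which is a quick way to confirm that no parameter was dropped or misspelled.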

