The LanguageModelBuild task builds a new language model from a set of text files.
Parameter | Description | Required |
---|---|---|
Type | The task name. Set to LanguageModelBuild. | Yes |
BaseDictionary | The base dictionary for the language model. | |
BuildLabel | The build label to use for the language model. | |
ContentDatabase | The IDOL Content component database to use to retrieve training text. | |
ContentHost | The host name or IP address of the IDOL Content component to retrieve training text from. | |
ContentPort | The ACI port of the IDOL Content component to retrieve training text from. | |
ContentTextTag | The IDOL fields to retrieve text data from. | |
DataList | The list of training text files. | Yes |
DataPath | The path to the directory containing the training text files listed in DataList. | Yes |
DiagFile | The file to write the diagnostic information to. | |
DiagLevel | The level of detail to include in the diagnostic information. | |
DoDctGen | Whether to generate a dictionary. | |
DoNorm | Whether to perform text normalization. | |
DoSmoothing | Whether to enable smoothing. | |
DoSegment | Whether to segment text. | |
DropList | A list of words to exclude from the vocabulary of the custom language model. | |
KeepList | A list of words that must appear in the vocabulary of the custom language model. | |
KeepTemp | Whether to keep the temporary text files for diagnostics. | |
Lang | The language pack to use as a foundation. | Yes |
Log | The name of the log file to write. | |
NewDictionary | The dictionary to generate. | Yes |
NewLanguageModel | The custom language model to generate. | Yes |
NewLMInfoFile | The Language Model Information file to generate. | |
VocabSize | The maximum size of the vocabulary to include in the custom language model. | |
http://localhost:15000/action=AddTask&Type=LanguageModelBuild&DataList=ListManager/Langmodel&DataPath=C:\LanguageModelFiles&Lang=ENUK-tel&NewLanguageModel=mymodel&NewDictionary=mymodel&DoSmoothing=False
This action uses the training text specified in the Langmodel list and the ENUK-tel language pack to build a new language model and dictionary file, both named mymodel. This action also calculates a recommended interpolation weight at the end of the language model building process.
The interpolation weight is only a suggestion; you can choose to set other weights.
The new language models are placed in the custom language models folder, as specified in the IDOL Speech Server configuration file.
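The example action can also be assembled programmatically. The following is a minimal Python sketch, not part of the product: the helper name `build_language_model_url`, its `host` and `port` defaults, and the `REQUIRED` tuple are hypothetical names introduced for illustration. It checks that the parameters marked as required in the table are present, then builds the request URL. It assumes that `/`, `:`, and `\` can be left unescaped, as they are in the example URL; percent-encoding them may be the safer choice in practice.

```python
from urllib.parse import urlencode

# Parameters marked "Yes" in the Required column of the table
# (Type is also required, but the helper sets it itself).
REQUIRED = ("DataList", "DataPath", "Lang", "NewDictionary", "NewLanguageModel")


def build_language_model_url(params, host="localhost", port=15000):
    """Return an AddTask URL for a LanguageModelBuild task (hypothetical helper)."""
    missing = [name for name in REQUIRED if name not in params]
    if missing:
        raise ValueError("missing required parameters: " + ", ".join(missing))
    # Type is always LanguageModelBuild for this task, so it is added here.
    query = {"Type": "LanguageModelBuild", **params}
    # Assumption: '/', ':' and '\' stay unescaped to match the example URL.
    return f"http://{host}:{port}/action=AddTask&" + urlencode(query, safe="/:\\")


url = build_language_model_url({
    "DataList": "ListManager/Langmodel",
    "DataPath": "C:\\LanguageModelFiles",
    "Lang": "ENUK-tel",
    "NewLanguageModel": "mymodel",
    "NewDictionary": "mymodel",
    "DoSmoothing": "False",
})
print(url)
```

Checking the required parameters before sending the request surfaces a missing `DataList` or `Lang` as a local error rather than as a failed task on the server.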