The [Taxonomy] section of the configuration file offers a number of parameters to control the generation of taxonomies.
Parameter | Description |
---|---|
MaxQNum | The maximum number of documents used to generate a taxonomy. This overrides the value set (if any) in the DREQuery parameter of the TaxonomyGenerate command. The default is 5000. |
MaxConcepts | The maximum number of terms that the trained category used for taxonomy generation can have. The default is 100. |
ConceptThreshold | The minimum weight that a term must have to appear in the trained category, and therefore in the taxonomy. The default is 0. |
MinConceptOccs | The minimum number of times a term must appear in the training documents to be included in the trained category. The default is 10. |
OnlyMatchSubset | Whether to restrict the statistics calculation used to determine term distribution to the defined subset of documents. Set to 1 to use only the documents in the defined subset, or 0 to use the entire index. The default is 1. |
MinChildren | The minimum number of children a term must have for it to appear in the taxonomy. This value translates to the minimum number of child categories a category must have. Any categories that do not meet this constraint are redistributed across the taxonomy and added as child categories to other categories, if suitable parents exist. The default is 0. |
CompoundRelevance | The minimum percentage relevance two terms must have for them to be combined into one. The default is 40. |
SiblingStrength | The minimum percentage relevance that the parent terms of two terms must have for the two terms to be combined into one. The default is 20. |
RelevanceThreshold | The minimum percentage relevance term X must have to term Y for X to be the parent of Y. The default is 20. |
DistributionThreshold | The minimum percentage relevance term Y must have to term X for X to be the parent of Y. It must have a smaller value than RelevanceThreshold. The default is 5. |
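
As an illustration, a [Taxonomy] section that sets every parameter explicitly might look like the following. This is only a sketch: it assumes the usual Name=Value layout of the configuration file, and the values shown are simply the defaults listed in the table, not tuning recommendations. Note that DistributionThreshold (5) is kept below RelevanceThreshold (20), as the table requires.

```ini
[Taxonomy]
MaxQNum=5000
MaxConcepts=100
ConceptThreshold=0
MinConceptOccs=10
OnlyMatchSubset=1
MinChildren=0
CompoundRelevance=40
SiblingStrength=20
RelevanceThreshold=20
DistributionThreshold=5
```

Because these are the defaults, omitting any of the lines should leave behavior unchanged; in practice you would list only the parameters you want to override.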