TrainCategory
The TrainCategory processor sends requests to a Category component to train one or more categories. It trains categories on the text content of incoming documents and ignores document metadata.
The TrainCategory processor has an advanced configuration interface that you can use to:
- view the categories that already exist.
  TIP: To get this information, the processor uses action=CategoryGetHierDetails.
- create a new category (click the corresponding icon).
- view and manually edit the training of existing categories (select a category and click the corresponding icon).
  NOTE: In the advanced configuration interface, you can train a category with text or by specifying a Boolean expression. If a category is trained with a Boolean expression, attempting to add further training to it from incoming documents results in an error, and the documents are routed to the "failure" relationship.
- delete a category (select a category and click the corresponding icon).
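The tip in the list above says that the processor retrieves existing categories with action=CategoryGetHierDetails. As a rough sketch of what such an ACI request might look like, the following Python builds the request URL. The `build_aci_url` helper, the host, and the port are assumptions for illustration only; substitute the values that you set in the Category Host and Category Port properties.

```python
from urllib.parse import urlencode


def build_aci_url(host: str, port: int, action: str, **params: str) -> str:
    """Build an ACI request URL of the form http://host:port/action=Name&key=value.

    Hypothetical helper for illustration; it is not part of the processor.
    """
    query = urlencode({"action": action, **params})
    return f"http://{host}:{port}/{query}"


# Placeholder host and port: use your Category Host and Category Port values.
url = build_aci_url("localhost", 9020, "CategoryGetHierDetails")
print(url)  # http://localhost:9020/action=CategoryGetHierDetails
```

Sending this URL with any HTTP client should return the category hierarchy, assuming the Category component is reachable at that address.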
On the "TESTER" tab you can enter text and see which categories it matches, if any.
After training categories, you can categorize documents by using the CategorizeDocument processor.
For more information about categorization, refer to the Knowledge Discovery Administration Guide.
Properties
| Name | Default Value | Description |
|---|---|---|
| IDOL License Service | | An IdolLicenseServiceImpl that provides a way to communicate with a Knowledge Discovery License Server. |
| Category Host | | The host name or IP address of the Category component. |
| Category Port | | The Category component ACI port. |
| Request Timeout | 60 | The maximum amount of time to wait, in seconds, for a response from the Category component. |
| SSL Config Service | | An optional IdolSSLConfigServiceImpl that specifies the settings to use to communicate with the Category component over SSL/TLS. Set this property if your Category component has been configured to accept connections over SSL. |
| Category Name | | The name of the category that you want to train. You can specify either: The processor automatically creates any categories that do not exist. |
| Batch Size | 100 | The maximum number of documents to batch into a single training request. |
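To illustrate what the Batch Size property implies, here is a minimal Python sketch that groups documents into batches of at most 100. It only models the "maximum number of documents per training request" rule stated in the table; the `batches` function and `DEFAULT_BATCH_SIZE` name are hypothetical, and the processor's actual batching logic is not documented here.

```python
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")

DEFAULT_BATCH_SIZE = 100  # default value of the Batch Size property


def batches(items: Iterable[T], batch_size: int = DEFAULT_BATCH_SIZE) -> Iterator[List[T]]:
    """Yield consecutive lists of at most batch_size items (hypothetical helper)."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: List[T] = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        # Emit the final, partially filled batch.
        yield batch


# 250 documents with the default batch size give three training requests.
sizes = [len(b) for b in batches(range(250))]
print(sizes)  # [100, 100, 50]
```

Under this model, a larger batch size means fewer requests to the Category component, but each request carries more documents and may take longer to complete within the Request Timeout.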
Relationships
| Name | Description |
|---|---|
| success | FlowFiles that were processed successfully. |
| failure | FlowFiles that were not processed successfully. |
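As a toy illustration of how the two relationships relate to the Boolean-training restriction described earlier, the sketch below models a single failure cause: adding document training to a category that was trained with a Boolean expression. The `Category` class and `relationship_for` function are hypothetical and are not the processor's implementation; other errors, such as an unreachable Category component, are not modeled.

```python
from dataclasses import dataclass


@dataclass
class Category:
    """Toy model of a category; not the Category component's data model."""
    name: str
    boolean_trained: bool = False  # True if trained with a Boolean expression


def relationship_for(category: Category) -> str:
    """Return the relationship that documents sent to this category would reach."""
    # Document training cannot be added to a category that was trained
    # with a Boolean expression, so those documents fail.
    return "failure" if category.boolean_trained else "success"


print(relationship_for(Category("science")))                     # success
print(relationship_for(Category("spam", boolean_trained=True)))  # failure
```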