TrainCategory

The TrainCategory processor sends requests to a Category component to train one or more categories. It trains categories from the text content of incoming documents; document metadata is ignored.

The TrainCategory processor has an advanced configuration interface that you can use to:

  • view the categories that already exist

    TIP: To get this information, the processor uses action=CategoryGetHierDetails.

  • create a new category (click the corresponding icon)
  • view and manually edit the training of existing categories (select a category and click the corresponding icon).

    NOTE: Using the advanced configuration interface, you can train a category with text or by specifying a Boolean expression. If a category has been trained with a Boolean expression, attempting to add further training to it from incoming documents results in an error, and the documents are routed to the "failure" relationship.

  • delete a category (select a category and click the corresponding icon).
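
The TIP above names the action that the processor uses to list existing categories. As a rough illustration of what such a request might look like, the following Python sketch builds an ACI request URL for that action. The helper name build_aci_url, the host localhost, and the port 9020 are placeholders chosen for this example, not values taken from this page; substitute the Category Host and Category Port configured on the processor.

```python
from urllib.parse import urlencode


def build_aci_url(host: str, port: int, action: str, **params: str) -> str:
    """Build an ACI-style request URL: http://host:port/action=Name&key=value.

    Hypothetical helper for illustration; it is not part of the processor.
    """
    # The action is sent as the first query-style parameter after the slash.
    query = urlencode({"action": action, **params})
    return f"http://{host}:{port}/{query}"


# Placeholder host and port; use your own Category Host and Category Port.
url = build_aci_url("localhost", 9020, "CategoryGetHierDetails")
print(url)  # http://localhost:9020/action=CategoryGetHierDetails
```

Sending the same URL from a browser or an HTTP client is a quick way to check that the Category component is reachable before you configure the processor.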

On the "TESTER" tab, you can enter text and see which categories it matches, if any.

After training categories, you can categorize documents by using the CategorizeDocument processor.

For more information about categorization, refer to the Knowledge Discovery Administration Guide.

Properties

Name | Default Value | Description
IDOL License Service | | An IdolLicenseServiceImpl that provides a way to communicate with a Knowledge Discovery License Server.
Category Host | | The host name or IP address of the Category component.
Category Port | | The Category component ACI port.
Request Timeout | 60 | The maximum amount of time to wait, in seconds, for a response from the Category component.
SSL Config Service | | An optional IdolSSLConfigServiceImpl that specifies the settings to use to communicate with the Category component over SSL/TLS. Set this property if your Category component has been configured to accept connections over SSL.
Category Name | | The name of the category that you want to train. You can specify either a single, fixed category name, or a dynamic value, by using NiFi expression language to produce a value based on the incoming FlowFile. For example, you could specify ${"idol.category"} to select the value of the idol.category FlowFile attribute. The processor automatically creates any categories that do not exist.
Batch Size | 100 | The maximum number of documents to batch into a single training request.
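
To make the interaction between a dynamic Category Name and Batch Size more concrete, here is a minimal Python sketch. It is not the processor's implementation, which this page does not describe. It assumes, purely for illustration, that documents are grouped by the resolved category name and that each group is then split into requests of at most Batch Size documents. The function batch_by_category and the dictionary layout used to stand in for a FlowFile are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def batch_by_category(
    flowfiles: Iterable[dict],
    batch_size: int = 100,
    attribute: str = "idol.category",
) -> Iterator[Tuple[str, List[str]]]:
    """Yield (category, texts) pairs with at most batch_size texts each.

    Each FlowFile is modeled as {"attributes": {...}, "text": "..."}; this
    layout is an assumption made for the example only.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")

    # Group document text by the category named in the FlowFile attribute.
    grouped: Dict[str, List[str]] = defaultdict(list)
    for flowfile in flowfiles:
        category = flowfile["attributes"][attribute]
        grouped[category].append(flowfile["text"])

    # Split each group into slices no larger than batch_size.
    for category, texts in grouped.items():
        for start in range(0, len(texts), batch_size):
            yield category, texts[start:start + batch_size]


docs = [{"attributes": {"idol.category": "sports"}, "text": f"doc {i}"} for i in range(250)]
docs.append({"attributes": {"idol.category": "finance"}, "text": "interest rates"})

for category, batch in batch_by_category(docs, batch_size=100):
    print(category, len(batch))
```

With the default of 100, the 250 "sports" documents in this example are split into batches of 100, 100, and 50, and the single "finance" document forms its own batch.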

Relationships

Name | Description
success | FlowFiles that were processed successfully.
failure | FlowFiles that were not processed successfully.