Eduction

The Eduction processor uses IDOL Eduction to extract entities from text. An entity is a word, phrase, or block of information. For example, you can use Eduction to extract names, addresses, telephone numbers, and dates from document content or metadata.

For more information about Eduction and how to configure it, refer to the Eduction User Guide.

Properties

Name Default Value Description
IDOL License Service   An IdolLicenseService that provides a way to communicate with an IDOL License Server.
Entity  

A comma-separated list of entities to extract.

To specify several entities, you can use wildcard expressions. For example: place/city1/*,place/city2/*. The * wildcard matches any number of characters, and the ? wildcard matches a single character.

You must also set the Resource Files property to the location of the resource files that contain your chosen entities.

Entity Field  

A comma-separated list of document fields in which to write the matches from Eduction. This list must contain the same number of values as the Entity property.

Resource Files   A comma-separated list of compiled ECR files containing Eduction grammar entries. At least one resource file is required. You can match multiple resource files with wildcard expressions. You can use the * wildcard to match any number of characters, or the ? wildcard to match a single character.
Search Fields DRECONTENT A comma-separated list of document fields to search for entities, for example DRECONTENT or DRETITLE.
Simple Output False A Boolean value that specifies whether to add only the matched text to the document fields specified by the Entity Field property. To add only the matched text, set this property to true. With the default value, false, the fields have subfields that contain the matched text, the offset, and the score.
eduction_configuration_parameter_name  

You can add additional properties that match the name of an Eduction configuration parameter, and set an appropriate value. For more information about the configuration parameters that you can use to configure Eduction, refer to the Eduction User Guide.

Some properties, for example PostProcessingTaskN, accept the name of a configuration section (so that you can configure more than one post-processing task). When you configure the properties that belong to a section, prefix each property name with the section name in square brackets. For example, if you set a property named PostProcessingTask0 to MyTask, specify the associated script by setting a property named [MyTask]Script.
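
The property rules above can be checked before you deploy a flow. The following Python sketch is illustrative only and is not part of IDOL or NiFi. The function names (split_list, expand_entities, check_properties), the entity names, and the file paths are hypothetical. It assumes that the * and ? wildcards behave like the shell-style patterns in Python's fnmatch module, which also treats square brackets as special; Eduction's own matching might differ. It also assumes that the value counts are compared only when Entity Field is set, and that a section counts as referenced when any unprefixed property has the section name as its value.

```python
from fnmatch import fnmatchcase


def split_list(value):
    """Split a comma-separated property value into trimmed, non-empty items."""
    return [item.strip() for item in value.split(",") if item.strip()]


def expand_entities(patterns, available):
    """Return the names in `available` matched by at least one pattern.

    Assumption: fnmatch-style matching stands in for Eduction's wildcards
    (* matches any number of characters, ? matches a single character).
    """
    return [name for name in available
            if any(fnmatchcase(name, pattern) for pattern in patterns)]


def check_properties(props):
    """Return a list of problems found in a dict of property name -> value."""
    problems = []
    entities = split_list(props.get("Entity", ""))
    fields = split_list(props.get("Entity Field", ""))

    # Assumption: the counts are compared only when Entity Field is set.
    if fields and len(fields) != len(entities):
        problems.append(
            f"Entity Field has {len(fields)} value(s) but Entity has {len(entities)}"
        )

    if not split_list(props.get("Resource Files", "")):
        problems.append("Resource Files must list at least one resource file")

    # Assumption: a section is referenced when any unprefixed property
    # (for example PostProcessingTask0) has the section name as its value.
    referenced = {value for key, value in props.items() if not key.startswith("[")}
    for key in props:
        if key.startswith("[") and "]" in key:
            section = key[1:key.index("]")]
            if section not in referenced:
                problems.append(f"{key}: no property refers to section {section}")
    return problems


if __name__ == "__main__":
    # Hypothetical property values, for illustration only.
    example = {
        "Entity": "place/city1/*,place/city2/*",
        "Entity Field": "CITY1,CITY2",
        "Resource Files": "grammars/place_*.ecr",
        "PostProcessingTask0": "MyTask",
        "[MyTask]Script": "scripts/mytask.lua",
    }
    for problem in check_properties(example):
        print(problem)
```

A check like this catches mismatched Entity and Entity Field lists, and section-prefixed properties whose section is never named, before the processor runs. It does not verify that the resource files exist or that they contain the chosen entities.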

Relationships

Name Description
success Successfully processed FlowFiles are routed to this relationship.
failure FlowFiles that have an invalid or unknown format are routed to this relationship.
