Introduction

Connectors monitor your organization's data repositories so that your IDOL index is kept up to date. When new items are added to a repository, the connector sends documents for ingestion. When items are deleted, the connector ingests a document that represents the deletion, so that the corresponding document is removed from the IDOL index.

If you view the attributes of a NiFi FlowFile, you can see an attribute named idol.reference.action. This attribute specifies what an indexer, such as the PutIDOL processor, must do with the document. For example, if the FlowFile represents a new item that was created in the repository, the connector sets this attribute to Add so that the document is added to the index.

When a connector synchronizes with a repository, it might find that an existing item has changed. If the change affects only the document metadata, the connector ingests a metadata-only document with the updated field values. In this case, the connector sets idol.reference.action to Update.
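As a rough illustration of what a metadata-only update means, the following Python sketch merges updated field values into an existing entry of a toy in-memory index and leaves the content untouched. The dictionary-based index, the apply_metadata_update function, and the "reference" attribute name are assumptions made for this example. They are not part of NiFi Ingest or the PutIDOL processor.

```python
# Toy in-memory stand-in for an index: reference -> document.
# All names here are hypothetical; this is not how PutIDOL is implemented.

def apply_metadata_update(index, reference, updated_fields):
    """Merge updated field values into an existing document, leaving its content untouched."""
    document = index.get(reference)
    if document is None:
        # Nothing to update; a real indexer may handle this case differently.
        return False
    document["fields"].update(updated_fields)
    return True


index = {
    "doc-1": {
        "content": "Quarterly report",
        "fields": {"author": "A. Smith", "title": "Q1"},
    },
}

# Attributes of a hypothetical metadata-only FlowFile.
flowfile_attributes = {"idol.reference.action": "Update", "reference": "doc-1"}

if flowfile_attributes["idol.reference.action"] == "Update":
    apply_metadata_update(index, flowfile_attributes["reference"], {"title": "Q1 (revised)"})

print(index["doc-1"]["fields"]["title"])  # Q1 (revised)
```

Only the changed field is rewritten. The content and the other fields of the document stay as they were.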

If an item in a repository is modified and the change affects the document content, the connector ingests two documents. The first document has an idol.reference.action of Delete, to delete the existing document from the index, and the second has the action Add, to add a document containing the new content. These documents must be indexed in the correct order. The delete must be indexed first, to remove the existing document. If the documents are indexed the other way around, the new content is added and then immediately deleted, so the item disappears from the index.
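The effect of the ordering can be seen in a small Python sketch that applies the same Delete and Add pair to a toy index in both orders. The apply_actions function, the dictionary-based index, and the "reference" and "content" keys are assumptions made for this example and do not reflect the actual indexer.

```python
def apply_actions(index, flowfiles):
    """Apply Add and Delete actions to a toy index, in the order given."""
    for flowfile in flowfiles:
        action = flowfile["idol.reference.action"]
        reference = flowfile["reference"]  # hypothetical attribute name
        if action == "Delete":
            index.pop(reference, None)
        elif action == "Add":
            index[reference] = flowfile["content"]
    return index


delete_doc = {"idol.reference.action": "Delete", "reference": "doc-1"}
add_doc = {"idol.reference.action": "Add", "reference": "doc-1", "content": "new content"}

# Correct order: the old document is removed, then the new content is added.
correct = apply_actions({"doc-1": "old content"}, [delete_doc, add_doc])

# Wrong order: the new content is added, then immediately deleted.
wrong = apply_actions({"doc-1": "old content"}, [add_doc, delete_doc])

print(correct)  # {'doc-1': 'new content'}
print(wrong)    # {}
```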

NiFi Ingest provides a way to ensure that your documents are indexed in the correct order, but this is something that you must configure.

To ensure that documents are indexed in the correct order:

IMPORTANT: Do not delete FlowFiles that have been registered with a DocumentRegistryService. If you stop processing a FlowFile that has been registered by a connector or through the RegisterDocument processor, any subsequent processor that uses the DocumentRegistryServiceImpl might wait for that document indefinitely. The only way to remove dependencies on a deleted FlowFile is to stop processing and delete the document registry database.

If you want to stop processing a FlowFile (for example, because you do not want to index it), Micro Focus recommends that you route the FlowFile to an UnregisterDocument processor.
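The following Python sketch is a toy model of why a dropped FlowFile can block later documents, and why unregistering it clears the dependency. It assumes, purely for illustration, that a registry releases documents for indexing in registration order. The ToyRegistry class and its methods are hypothetical and do not describe how the DocumentRegistryService is actually implemented.

```python
class ToyRegistry:
    """Hypothetical model: documents are released for indexing in registration order."""

    def __init__(self):
        self._pending = []  # registered document IDs, oldest first

    def register(self, doc_id):
        self._pending.append(doc_id)

    def unregister(self, doc_id):
        # Tell the registry that this document will never be indexed.
        self._pending.remove(doc_id)

    def can_index(self, doc_id):
        # A document is ready only when nothing registered before it is outstanding.
        return bool(self._pending) and self._pending[0] == doc_id


registry = ToyRegistry()
registry.register("delete-doc-1")
registry.register("add-doc-1")

# The delete FlowFile is dropped from the flow without telling the registry,
# so the add is still waiting for it.
blocked = not registry.can_index("add-doc-1")

# Unregistering the dropped document removes the dependency.
registry.unregister("delete-doc-1")
released = registry.can_index("add-doc-1")

print(blocked, released)  # True True
```

In this model, the second document can never be indexed while the first remains registered, which mirrors the indefinite wait described above.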

