Start Processing

After you have configured the ingestion pipeline and a connector, you can send some documents to NiFi.

To start processing

  1. In the NiFi canvas, right-click your input port and click Start. This allows the input port to receive data.
  2. Send the synchronize action to your connector. For example:

    http://connector:7002/action=fetch&fetchaction=synchronize
  3. The connector performs the synchronize fetch action and starts sending documents to NiFi. Because the input port has been started, the documents are received. The KeyViewExtractFiles processor has not been started, so the documents wait in the preceding queue.

  4. Start the KeyViewExtractFiles processor (right-click the processor and click Start).

    The processor begins processing documents from the queue. If there is a problem, a bulletin icon appears on the processor. Hover the cursor over this icon to view the error message. For example, the processor cannot locate the KeyView libraries if the KeyView controller service uses the KEYVIEW_DIRECTORY environment variable but that variable has not been set.

    If you need to fix a problem like this, remember to stop the processor and disable the KeyView controller service before you modify their configurations. After you have made your changes, enable the KeyView controller service and start the processor.

    When the processor extracts the documents successfully, they move into the next queue, ready for processing by the next processor. You might notice that the KeyViewExtractFiles processor outputs more FlowFiles than it received. This is because it extracts subfiles from their containers, and each subfile is represented by a new FlowFile.

  5. Start the KeyViewFilterDocument processor.

    The FlowFiles are processed and move to the next queue.

  6. The next processor is the RemoveDocumentPart processor. Before starting that processor, you can look at the FlowFiles in its queue:

    1. Right-click the queue and click List Queue.

      A dialog box opens that lists the FlowFiles in the queue.

    2. For one of the FlowFiles, click the view details icon.

      The FlowFile dialog box opens and displays information about the FlowFile.

    3. Click View.

      The FlowFile is displayed.

    4. In the View as box, click formatted.

      The FlowFile should have a section named ContentFile or ContentFilename. ContentFile is present when the FlowFile has associated binary data, and ContentFilename is present when the FlowFile contains the path to an associated file. When you configure your connectors, you can choose whether to send the binary data or just the file path. If the FlowFile represents a subfile that was extracted by the KeyViewExtractFiles processor, it has a ContentFilename section that references the extracted file in the KeyView temporary directory.

      If you look at FlowFiles after processing by the RemoveDocumentPart processor, you should notice that the ContentFile and ContentFilename parts have been removed.

    5. Close the FlowFile window and the FlowFile dialog box.
  7. Start the remaining processors.

    NiFi processes the FlowFiles. You can then query IDOL and find the documents that have been indexed.
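
If you prefer to send the synchronize action in step 2 from a script rather than a browser, the following is a minimal sketch in Python. It assumes the placeholder host (connector) and port (7002) from the example in step 2. The function names build_fetch_url and send_synchronize are hypothetical helpers introduced here for illustration; they are not part of any product API, and the sketch does not interpret the response that the connector returns.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def build_fetch_url(host: str, port: int, fetch_action: str = "synchronize") -> str:
    """Build a fetch action URL in the same form as the example in step 2."""
    # The parameters follow the slash directly, with no "?" separator,
    # matching the URL shown in step 2.
    params = urlencode({"action": "fetch", "fetchaction": fetch_action})
    return f"http://{host}:{port}/{params}"


def send_synchronize(host: str, port: int, timeout: float = 10.0) -> str:
    """Send the synchronize action and return the raw response text."""
    # Hypothetical helper: it returns whatever the connector sends back
    # without attempting to parse it.
    with urlopen(build_fetch_url(host, port), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    # "connector" and 7002 are the placeholder values from step 2;
    # replace them with the host and port of your own connector.
    print(build_fetch_url("connector", 7002))
```

Building the URL in a separate function lets you check the request before sending it. Call send_synchronize only after you have started the input port, so that NiFi can receive the documents.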

