Ingest XML

Many systems export data in XML format. This section describes how to ingest XML into IDOL using NiFi Ingest.

The steps in this section assume that:

To ingest XML

  1. Add a GetFileSystem processor to your data flow to retrieve the XML file(s).

  2. Add an ExecuteDocumentLua processor to the data flow.
  3. Connect the "success" relationship of the GetFileSystem processor to the ExecuteDocumentLua processor.
  4. Configure the ExecuteDocumentLua processor.

    1. Right-click the processor and click Configure.

      The Configure Processor dialog box opens.

    2. Click the Properties tab.
    3. Set the Lua script function arguments property to LuaFlowFileDocument, LuaProcessorSession.
    4. Click ADVANCED.

      The advanced configuration page opens.

    5. In the Lua Samples area, click Reading and writing a FlowFile document > Parse XML from the content file(name), and return new documents.
    6. Copy the example script into the Lua code area.

      The script uses the parse_document_xml function to parse the input file. If the incoming FlowFile contains a filename, the filename is passed directly to the function. If the incoming FlowFile contains an embedded file, the script reads the data and passes it to the parse_document_xml function as a string.

    7. At the beginning of the Lua script, modify the values in the xmlParams table so that they are suitable for your XML. For example, the document_root_paths option is a list of paths to elements that represent the root of a document in the input XML. For more information about these options, refer to the documentation for the parse_document_xml function.
    8. Click SAVE and then close the advanced configuration page.
  5. Connect the "returned" relationship of the ExecuteDocumentLua processor to your ingestion pipeline. The resulting documents are output to the "returned" relationship because they are explicitly returned from the handler function in the Lua script.

    The original FlowFiles that were routed to the ExecuteDocumentLua processor are routed to the "success" relationship. You can auto-terminate this relationship to avoid indexing a document containing all of the original XML.

  6. Start the GetFileSystem and ExecuteDocumentLua processors.
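
The effect of a document root path can be illustrated outside NiFi. The following is a minimal conceptual sketch in Python, not the Lua sample script, and it does not call parse_document_xml. The function name split_xml_into_documents, the is_file flag, and the sample XML are all hypothetical, chosen only to show how each element matched by a root path becomes a separate document, and how the input can be either a filename or a string of XML.

```python
import xml.etree.ElementTree as ET


def split_xml_into_documents(xml_input, is_file, document_root_paths):
    """Return one dict of field name -> text per element matched by a root path.

    Hypothetical helper for illustration; it only mimics the idea of
    document root paths and is not the IDOL implementation.
    """
    # Two input cases, as in the procedure above: a filename, or XML held in a string.
    if is_file:
        root = ET.parse(xml_input).getroot()
    else:
        root = ET.fromstring(xml_input)

    documents = []
    for path in document_root_paths:
        # Each matching element is treated as the root of one document.
        for element in root.findall(path):
            fields = {child.tag: (child.text or "").strip() for child in element}
            documents.append(fields)
    return documents


# Hypothetical export containing two records.
SAMPLE_XML = """
<export>
  <record><title>First</title><body>Alpha</body></record>
  <record><title>Second</title><body>Beta</body></record>
</export>
"""

docs = split_xml_into_documents(SAMPLE_XML, False, ["record"])
for doc in docs:
    print(doc)
```

In this sketch, the single input yields two documents, one per record element. This mirrors why the ExecuteDocumentLua processor sends the new documents to one relationship and the original FlowFile, which still contains all of the XML, to another.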

