Import Tasks

Import tasks are processing steps that CFS performs on documents before they are indexed into IDOL Server. Import tasks enable you to manipulate and enrich the documents that CFS creates.

CFS includes import tasks that meet common processing requirements. For example, there are import tasks that filter advertisements out of HTML files and import tasks that divide document content into shorter sections.

Write documents to disk

You can use the IdxWriter and XmlWriter tasks to write documents to disk in IDOL IDX or XML format. This allows you to view the information that is being indexed into IDOL Server, so that you can check that it is being indexed as you expect. If necessary, you can then use other import tasks or custom Lua scripts to manipulate and enrich the information.

The CsvWriter and JsonWriter tasks write documents to disk in CSV or JSON format. You can also use the SqlWriter task to write document metadata and content to disk in the form of SQL "insert" statements, so that you can insert the information from the documents into a database.

Manipulate and enrich documents

You can use import tasks to manipulate and enrich documents without writing custom scripts.

Validate and reject documents

You can use import tasks to reject documents that you do not want to index into IDOL Server. For example, the BadFilesFilter task rejects documents that do not contain valid content. A rejected document is not processed further and is not indexed into IDOL Server. However, you can index it into an IDOL Server that has been configured to handle failed documents.

Run a Lua script

The Lua task runs a Lua script. Lua is an embedded scripting language that you can use to manipulate documents and define custom processing rules. CFS includes Lua functions for manipulating documents and running other tasks. For example, you can add, modify, or remove fields and their values.
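
As an illustration, the following is a minimal sketch of a script that the Lua task could run. It assumes that CFS calls a function named handler once for each document, and that the document object provides hasField, addField, and deleteField methods. The field names REVIEWED and TEMP_NOTES are hypothetical. Check the method names against the Lua reference for your CFS version.

```lua
-- Sketch only: assumes CFS calls handler(document) once per document.
-- REVIEWED and TEMP_NOTES are hypothetical field names.
function handler(document)
    -- Add a field only if the document does not already have it.
    if not document:hasField("REVIEWED") then
        document:addField("REVIEWED", "false")
    end

    -- Remove a field that should not reach the index.
    if document:hasField("TEMP_NOTES") then
        document:deleteField("TEMP_NOTES")
    end

    -- Returning true is assumed to let the document continue to be processed.
    return true
end
```

The script only adds and removes fields, so it can run at either stage of processing.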

Configure Import Tasks

Import tasks are configured in the [ImportTasks] section of the CFS configuration file.

You can run import tasks before or after documents are processed by KeyView. Pre-import tasks run before KeyView processing. Post-import tasks run after KeyView processing.
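
As a rough illustration of the two stages, an [ImportTasks] section might look like the following sketch. It assumes that tasks are listed under numbered PreN and PostN keys in the form TaskName:argument, and that lines beginning with // are comments. The script and output paths are hypothetical placeholders. Confirm the exact key names and syntax in the configuration reference for your CFS version.

```ini
[ImportTasks]
// Assumed syntax: numbered keys, TaskName:argument. Paths are placeholders.
// Runs before KeyView processing.
Pre0=Lua:scripts/pre_check.lua
// Run after KeyView processing, in numeric order.
Post0=Lua:scripts/enrich.lua
Post1=IdxWriter:output/documents.idx
```

Placing a writer task last, as in this sketch, means the file on disk reflects the changes made by the earlier tasks.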
