Process external metadata

When processing items in file system datasets, you can choose to have Fusion capture external metadata that is not normally indexed. You can process external metadata for items regardless of whether you store the item's content.

Fusion uses a PowerShell script to tell the agent how to process external metadata. The script, called the external metadata file processor, is defined in the primary capture rules of the file system dataset. You can capture the external metadata dynamically by scanning every item in the dataset, or use extracted metadata files stored alongside the original items in your file system.

When creating a file system dataset, note the following:

  • The external metadata file processor must be a PowerShell script (.ps1 file extension) located in the path defined by the ScriptBasePath in the Agent Admin UI under Advanced Settings > Run Script. By default, this is the \Agent\Scripts directory of the Fusion installation path (for example, C:\Program Files\Fusion\Agent\Scripts). When creating the dataset, you will specify the file name of the metadata file processor.

    A sample file, ExternalMetadata.ps1, is provided with Fusion and is located in the \Agent\Scripts directory of the Fusion installation path (default is C:\Program Files\Fusion\Agent\Scripts).

  • Select Dynamic as the external metadata processing type to scan each item in the dataset and capture metadata that is read from properties of the item (such as last access time or author), inferred from its context (such as the folder it is in), or retrieved from a third-party system. This option does not require extraction ahead of time, but may require additional processing time.

  • Select From file as the processing type and define the extension of the files that contain the extracted metadata. The external metadata files are not processed as part of the dataset primary capture rules.

    For example, suppose you use an application to extract content from images, scanned documents, audio files, and video files. This content is saved in metadata files that are directly associated with the original items, such as image001.tiff (original item) and image001.tiff.idx (extracted metadata file). When you view a document with processed external metadata in Analyze or Manage, the metadata is displayed in the document view panel in a section labeled "Metadata Text Content".

    • The external metadata files that contain the metadata information must be located in the same directory as the original document from which the metadata was extracted. When creating the dataset, you will define the file extensions for the files that contain the external metadata.

    • The external metadata files should provide well-formatted, human-readable text. The external metadata text is used as is for identifying and extracting grammar values, tagging, keyword searching, and document preview. Well-formatted text, such as key-value pairs, produces better processing and search results.
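
The From file convention described above can be sketched in PowerShell as follows. This is only an illustration of the naming and location rules (same directory, original file name plus the configured extension) and of a key-value metadata file. The function name Get-ExternalMetadataText, its parameters, and the sample values are hypothetical, and the way the Fusion agent actually calls the processor script is not shown here; refer to the shipped ExternalMetadata.ps1 for the real contract.

```powershell
# Hypothetical helper, not part of the shipped ExternalMetadata.ps1.
# Returns the text of the metadata file stored next to an item, or $null.
function Get-ExternalMetadataText {
    param(
        [Parameter(Mandatory)] [string] $ItemPath,
        [string] $Extension = '.idx'   # assumed extension, as in image001.tiff.idx
    )
    # The metadata file sits in the same directory and is named
    # <original file name><extension>.
    $metadataPath = $ItemPath + $Extension
    if (-not (Test-Path -LiteralPath $metadataPath -PathType Leaf)) {
        return $null
    }
    return Get-Content -LiteralPath $metadataPath -Raw
}

# Demo in a temporary directory with made-up values.
$dir = Join-Path ([System.IO.Path]::GetTempPath()) ([System.Guid]::NewGuid().ToString())
New-Item -ItemType Directory -Path $dir | Out-Null
$item = Join-Path $dir 'image001.tiff'
Set-Content -LiteralPath $item -Value 'placeholder for the original item'

# Key-value pairs keep the text readable and easy to search.
$metadata = @"
Title: Site inspection photo
Author: J. Smith
Captured: 2021-03-14
"@
Set-Content -LiteralPath ($item + '.idx') -Value $metadata

$text = Get-ExternalMetadataText -ItemPath $item
Write-Output $text
```

Returning $null when no metadata file exists lets a caller skip items that have no extracted metadata instead of failing.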

Sample external metadata file processor

Fusion includes a sample external metadata file processor, ExternalMetadata.ps1. The sample file includes configurations to read the metadata information from the external metadata files and provides information on how to capture external metadata dynamically. Modify the sample file processor script to suit your needs or create a new processor file using the sample as a base. The script adds the metadata to the index for the associated original documents.

When "External metadata capture" is enabled for a file system dataset, the Fusion processing agent looks for and reads the external metadata file processor script and takes action accordingly.
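
As a rough illustration of dynamic capture, the following PowerShell sketch builds key-value text from an item's own properties and from its context (the folder it is in). The function name Get-DynamicMetadataText and the key names are assumptions made for this example, not Fusion identifiers, and the sketch does not show how the agent passes the item to the script or receives the result.

```powershell
# Hypothetical helper, not part of the shipped ExternalMetadata.ps1.
# Builds key-value metadata text from file system properties of an item.
function Get-DynamicMetadataText {
    param(
        [Parameter(Mandatory)] [string] $ItemPath
    )
    $file = Get-Item -LiteralPath $ItemPath
    $lines = @(
        # Property of the item itself (ISO 8601, UTC).
        "LastAccessTime: $($file.LastAccessTimeUtc.ToString('o'))"
        # Value inferred from context: the folder that contains the item.
        "ParentFolder: $($file.Directory.Name)"
    )
    return ($lines -join [Environment]::NewLine)
}

# Demo against a temporary file.
$tmp = [System.IO.Path]::GetTempFileName()
$dynamicText = Get-DynamicMetadataText -ItemPath $tmp
Write-Output $dynamicText
Remove-Item -LiteralPath $tmp
```

Because this work runs for every item in the dataset, lookups against a third-party system would add to the processing time noted for the Dynamic option.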