Manage repositories

In File Analysis Suite, a repository defines the subpath on a data source, the rules and schedule for processing, and the entities to identify within the data during processing.

Once you have created at least one source, you can create as many repositories as necessary for that source. For example, suppose you want to process data from only specific directories on a given file system. You would create a source for the file system and then create a repository for each directory that contains data you want to process. This focuses processing on the desired data and omits data known to be irrelevant.

On the Manage Repositories page, you can filter the list of repositories by repository TYPE and choose whether to VIEW the list sorted by sources. The analysis type is the processing conducted for the repository. The document count for each configured repository reflects the number of parent documents processed (extracted attachments are not included in the count). The document size for each repository represents the size on disk.

From the Manage Repositories page, you can view additional information about each repository.

  • Hover over or click in the row for a repository to display action icons to edit (edit icon), update (update icon), activate/deactivate (activate icon/deactivate icon), and delete (delete icon) the repository. For repositories with documents, you can go to the data volume chart, focused on the selected repository. For file system and SharePoint repositories containing data, you can go to the sensitive data heat map in Manage, focused on the repository.

  • Click anywhere in the row for the desired repository and then click the open detail panel icon to display repository details. The detail panel includes the options defined for the repository as well as key information about the documents in the repository.

    From the detail panel, you can edit, update, activate/deactivate, and delete the repository. For repositories with documents, you can go to the data volume chart, focused on the selected repository. For file system and SharePoint repositories containing data, you can go to the sensitive data heat map in Manage, focused on the repository. Click the Change link next to the schedule information to open the Edit repository dialog at the Schedule information.

    • On the METRICS tab, view the number of documents that have had only metadata processed, that have been analyzed, that have been collected, and that are on hold.

    • On the GRAMMARS tab, view the grammars and entities defined for the repository.

    • On the ACTIVITY tab, view the details of the last 10 activities performed. If more than 10 activities have been performed, click the MORE link to see the full list for the repository on the Agent Activity page.

If you need to process a repository outside of the scheduled run time, you can update the repository. If you request an update while the repository is currently processing, the request is not acted upon. You also cannot update a repository while it is initializing following repository creation.

Once created, a repository can be deactivated and then activated as needed. A deactivated repository cannot be processed. If the repository was already processing data, no additional data is processed once the repository is deactivated. Deactivated repositories cannot be edited either. Deactivated repositories display a gray icon next to the repository name.

You can remove the connection to a repository ("delete" the repository) if there are no active data sources associated with the repository (through workspaces) and no documents associated with the repository are on hold. If the repository has associated documents, you can deactivate the repository but you cannot delete it.
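
The update, deactivate, and delete rules above can be summarized as a small decision model. The following Python sketch is illustrative only and is not part of File Analysis Suite or any product API. The names `Repository`, `State`, `can_update`, `can_edit`, and `can_delete` are hypothetical. It assumes a literal reading of the delete rule, in which a repository with any associated documents cannot be deleted.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    """Hypothetical processing states for a repository."""
    INITIALIZING = "initializing"   # just created, not yet ready
    IDLE = "idle"                   # waiting for a scheduled run or an update
    PROCESSING = "processing"       # a run is in progress


@dataclass
class Repository:
    """Hypothetical model of the repository attributes the rules depend on."""
    name: str
    state: State = State.IDLE
    active: bool = True              # False once the repository is deactivated
    document_count: int = 0          # documents associated with the repository
    documents_on_hold: int = 0       # associated documents currently on hold
    active_workspace_links: int = 0  # active associations through workspaces


def can_update(repo: Repository) -> bool:
    """An update is acted upon only for an active repository that is
    neither initializing nor already processing."""
    return repo.active and repo.state is State.IDLE


def can_edit(repo: Repository) -> bool:
    """Deactivated repositories cannot be edited."""
    return repo.active


def can_delete(repo: Repository) -> bool:
    """Assumption: deletion requires no active workspace associations,
    no documents on hold, and no associated documents at all."""
    return (
        repo.active_workspace_links == 0
        and repo.documents_on_hold == 0
        and repo.document_count == 0
    )


# Example: an update requested during processing is not acted upon.
busy = Repository(name="finance-share", state=State.PROCESSING)
print(can_update(busy))  # False
```

In this model, a repository that holds documents can still be deactivated (set `active` to `False`), which matches the guidance to deactivate rather than delete such repositories.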