Manage sources

In File Analysis Suite, a source defines the initial connection to a specific data platform through a selected agent cluster. A repository defines the subpath on the source, the rules and schedule for processing, and the entities to identify within the data during processing.

You must create at least one agent cluster before you can create sources and associated repositories. Because a repository defines a subpath on a source, you must create a source before you can create an associated repository.
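
The creation order described above (agent cluster, then source, then repository) can be pictured with a small model. This is only an illustrative sketch: the class and function names (`Source`, `Repository`, `create_source`, `create_repository`) are hypothetical and are not part of any File Analysis Suite API; the product enforces these rules through its user interface.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Repository:
    # Hypothetical model: a repository is a subpath on its parent source.
    name: str
    subpath: str


@dataclass
class Source:
    # Hypothetical model: a source connects through one agent cluster.
    name: str
    source_type: str
    agent_cluster: str
    repositories: List[Repository] = field(default_factory=list)


def create_source(name: str, source_type: str,
                  agent_cluster: Optional[str]) -> Source:
    """Refuse to create a source unless an agent cluster is supplied."""
    if not agent_cluster:
        raise ValueError("Create an agent cluster before creating a source.")
    return Source(name, source_type, agent_cluster)


def create_repository(source: Optional[Source], name: str,
                      subpath: str) -> Repository:
    """Refuse to create a repository unless its source already exists."""
    if source is None:
        raise ValueError("Create a source before creating a repository.")
    repo = Repository(name, subpath)
    source.repositories.append(repo)
    return repo
```

For example, `create_repository(create_source("hr-share", "File System", "cluster-1"), "hr-docs", "/hr/docs")` succeeds, while passing `None` for the agent cluster or the source raises an error.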

On the Manage Sources page, you can filter the list of sources by source TYPE and AGENT CLUSTER, or search for a source by name.

From the Manage Sources page, you can view additional information about each source.

  • Hover over or click in the row for a source to display action icons to edit (edit icon) and delete (delete icon) the source. For sources with documents, you can go to the data volume chart (data volume chart inline icon), focused on the source. For file system and SharePoint sources containing data, you can go to the sensitive data heat map (sensitive data heat map inline icon) in Manage, focused on the source.

  • Click anywhere in the row for the desired source and then click the open detail panel icon to display the source details.

    From the detail panel, you can edit and delete the source. For sources with documents, you can go to the data volume chart (data volume chart detail panel icon), focused on the source. For file system and SharePoint sources containing data, you can also go to the sensitive data heat map (sensitive data heat map detail panel icon) in Manage, focused on the source.

The following types of source are supported.

Source type        Version or platform supported
File System        CIFS/SMB2.0 shares
Exchange           2016, 2019, Office 365
SharePoint         2016, Office 365
Content Manager    9.4
                   IMPORTANT: Only Microsoft SQL Server RDB datasets are supported at this time.
Google Drive       Not applicable
                   TIP: A source is associated with a Google Workspace for the domain that includes the desired users' drives. Each repository on the source is associated with a single user account's Google Drive.

Review the following tasks and considerations for each source type:

  • For all source types, at least one agent cluster must exist before you create a source. You must select an agent cluster when you create a source.

  • For all source types, keep in mind that processing of data does not occur at the source level, only at the repository level.

  • For all source types, you have the option to limit access in File Analysis Suite by granting only specific users or groups access to the source.

    CAUTION: If you limit access to a source and an underlying repository, users without access cannot view workspaces with a data source that includes the repository, and cannot view individual items that originated in the repository.

  • For Exchange sources:

    • See Exchange connection to complete additional tasks required for processing data from Exchange.

    • The Exchange source uses an agent to connect to the mail server to process new items, as well as items that already exist on the mail server. This method processes items based on user mailboxes; it therefore includes folder information and is subject to user actions (such as delete).

    • Exchange processing is based on Active Directory (AD) groups or data subject association with a workspace. Before creating Exchange sources, review your current AD groups. You may need to create broader groups composed of existing groups in order to apply a default action (at the repository level), such as indexing only metadata or sampling a percentage of data, across a larger portion of your employees.

      Avoid having users in multiple groups defined in repositories.

  • For File System sources:

    • Ensure that your CIFS shares can be accessed by the "System" account.

    • If your file system allows long paths, the machine hosting the processing agent must also be enabled for long paths.

  • For SharePoint sources, only the latest revision of a document is processed. See SharePoint connection to complete the tasks necessary for connection.

  • For Content Manager sources, see Content Manager integration to complete the tasks necessary for connection.

  • For Google Drive sources:

    • A repository connects to a single user's Google Drive. To process items for multiple users, create a Google Drive repository for each desired user. See Google Drive connection to complete the tasks necessary for connection.

      NOTE: File Analysis Suite supports processing of data from Google Workspace's Drive; data from personal Google Drives is not supported.

    • Shortcut files that exist on a Google Drive are not processed.
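
For the Exchange consideration above about avoiding users in multiple groups, a quick review of group membership before creating repositories can help. The following is a minimal sketch that assumes you have already exported group membership into a plain mapping of group name to user names; the function name `users_in_multiple_groups` and the sample data are hypothetical, and the sketch does not query Active Directory or File Analysis Suite.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def users_in_multiple_groups(
    groups: Dict[str, Iterable[str]],
) -> Dict[str, List[str]]:
    """Return each user that belongs to more than one group,
    mapped to the sorted list of groups that contain the user."""
    membership = defaultdict(set)
    for group, members in groups.items():
        for user in members:
            membership[user].add(group)
    return {
        user: sorted(found)
        for user, found in membership.items()
        if len(found) > 1
    }


# Hypothetical export of AD group membership.
sample_groups = {
    "Sales": ["ann", "bob"],
    "EMEA": ["bob", "cy"],
}
overlaps = users_in_multiple_groups(sample_groups)
```

With the sample data, `overlaps` contains only `bob`, who appears in both groups and would be a candidate for consolidating into a single group.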

You can remove the connection to a source ("delete" the source) if there are no repositories associated with the source.
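
The deletion rule can be sketched as a simple guard. This is an illustrative model only: the `delete_source` function and the dictionary of source names to repository names are hypothetical and do not represent a File Analysis Suite API.

```python
from typing import Dict, List


def delete_source(sources: Dict[str, List[str]], name: str) -> None:
    """Remove a source only when no repositories are associated with it.

    `sources` maps a source name to the names of its repositories
    (a hypothetical in-memory stand-in for the Manage Sources list).
    """
    repositories = sources[name]
    if repositories:
        raise RuntimeError(
            f"Source '{name}' still has repositories: {', '.join(repositories)}"
        )
    del sources[name]


# Hypothetical inventory: one source with a repository, one without.
inventory = {"hr-share": ["hr-docs"], "old-share": []}
delete_source(inventory, "old-share")  # allowed: no repositories remain
```

In this model, calling `delete_source(inventory, "hr-share")` raises an error until the `hr-docs` repository is removed from the source.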