SharePoint connection

If you will be creating repositories that process SharePoint data, you must complete additional tasks to enable processing by the processing agent.

Create SharePoint user

File Analysis Suite performs processes in SharePoint through a SharePoint user. The SharePoint user is used by the processing agent to connect to and perform actions in your SharePoint environment related to processing data. A SharePoint user is also necessary to send data managed by File Analysis Suite to a defined location in SharePoint. The exact level of permissions required depends on the actions you will be performing through File Analysis Suite.

  • To process and collect items from a SharePoint source, the SharePoint user must have a minimum permission of Reading.

  • To delete items from a SharePoint source, the SharePoint user must have a minimum permission of Collaboration.

  • To process, collect, and delete items from OneDrive, the SharePoint user must be added to the Site Collection Administrators for the OneDrive site to be processed.

  • To send items to a SharePoint target, the SharePoint user must have a minimum permission of Collaboration.
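The requirements above can be summarized as a lookup table. The following is a minimal sketch for planning purposes only, not part of File Analysis Suite: the location and action keys and the function name `required_access` are hypothetical, and only the permission names are taken from the list above.

```python
# Hypothetical planning aid; the keys are illustrative names, and the
# values are the minimum access levels stated in the list above.
SITE_COLLECTION_ADMIN = "Site Collection Administrators"

REQUIRED_ACCESS = {
    ("sharepoint_source", "process"): "Reading",
    ("sharepoint_source", "collect"): "Reading",
    ("sharepoint_source", "delete"): "Collaboration",
    ("onedrive", "process"): SITE_COLLECTION_ADMIN,
    ("onedrive", "collect"): SITE_COLLECTION_ADMIN,
    ("onedrive", "delete"): SITE_COLLECTION_ADMIN,
    ("sharepoint_target", "send"): "Collaboration",
}


def required_access(location: str, action: str) -> str:
    """Return the minimum access the SharePoint user needs."""
    try:
        return REQUIRED_ACCESS[(location, action)]
    except KeyError:
        # Combinations not covered by the list above are rejected
        # rather than guessed.
        raise ValueError(f"no documented requirement for {location}/{action}")
```

For example, `required_access("sharepoint_source", "delete")` returns `"Collaboration"`.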

When creating a source, you will define the credentials for the SharePoint user that will be used to process data (read, collect, delete). When creating a target, you will define the credentials for the SharePoint user that will be used to send data to SharePoint. For auditing in SharePoint, you can choose to create two separate users: one for processing and one for sending to targets.

For the supported SharePoint versions, see the supported data resources under "Agent installation and configuration".

Update the logon account for FAS Services

On each processing agent host machine in a cluster assigned to process SharePoint documents, the logon account for the FAS services must be a member of the host's Administrators group and must be able to log on as a service.

Complete this task after the agent is installed on the host machine and the necessary SharePoint requirements are in place.

SharePoint processing

When processing Microsoft Office items from SharePoint, File Analysis Suite uses the last modified date in the de-duplication calculation. Because of the way modified dates are handled in Office items and in SharePoint (including OneDrive), File Analysis Suite does not identify otherwise identical documents with different modified dates as duplicates.

  • When an Office item is uploaded through the web interface of a SharePoint or OneDrive site, the item's modified date is changed.

  • When an Office item is added to a local system and is then synchronized to the SharePoint site, the item's modified date is not changed.
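The effect of including the modified date can be illustrated with a small sketch. This is not the actual File Analysis Suite calculation, which is not described here. It assumes a de-duplication key built from a SHA-256 content hash plus the last modified date, and `OfficeItem`, `dedup_key`, and `are_duplicates` are hypothetical names.

```python
import hashlib
from datetime import datetime
from typing import NamedTuple, Tuple


class OfficeItem(NamedTuple):
    name: str
    content: bytes
    modified: datetime


def dedup_key(item: OfficeItem) -> Tuple[str, str]:
    # Assumption: the key combines a content hash with the modified date.
    digest = hashlib.sha256(item.content).hexdigest()
    return (digest, item.modified.isoformat())


def are_duplicates(a: OfficeItem, b: OfficeItem) -> bool:
    return dedup_key(a) == dedup_key(b)


original = OfficeItem("report.docx", b"same bytes", datetime(2023, 1, 5, 9, 0))
# Synchronized from a local system: the modified date is preserved.
synced = OfficeItem("report.docx", b"same bytes", datetime(2023, 1, 5, 9, 0))
# Uploaded through the web interface: the modified date is changed.
uploaded = OfficeItem("report.docx", b"same bytes", datetime(2023, 2, 1, 14, 30))

print(are_duplicates(original, synced))    # True
print(are_duplicates(original, uploaded))  # False
```

Under this assumption, the uploaded copy is treated as a distinct document even though its content is identical.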

SharePoint Lists consist of form records, called items in SharePoint, that contain various text fields and can have attachments. When deleting SharePoint files, File Analysis Suite does not delete attachments to items from SharePoint Lists.

SharePoint item counts

The document and item counts in File Analysis Suite may differ from the "item" count as seen in the SharePoint site interface. The difference has the following causes.

  • In File Analysis Suite, a document is an original file processed by File Analysis Suite and an item is an attachment to an original file. In SharePoint, an item is a row in a table or a record in a database, and a document is a type of item.

  • In SharePoint, item counts are derived from the total number of folders, documents, and items (each entry in a SharePoint Item List). In File Analysis Suite, document counts are derived from the total number of documents from SharePoint Document Libraries and attachments in a SharePoint Item List. File Analysis Suite does not process the fields, or entries, in an Item List, only the attachments from the Item List.

    For example, if a SharePoint Item has zero attachments, SharePoint records this as one item. If a SharePoint Item has 10 attachments, SharePoint also records this as one item. File Analysis Suite counts zero documents in the first case and 10 documents in the second.

  • The item count listed on the SharePoint Site Contents page for libraries includes all items in the library, including folders. Folders that contain files in SharePoint are not included in File Analysis Suite counts.

  • When processing SharePoint content, File Analysis Suite does not process SharePoint library items that include the UIVersion field. These SharePoint items are SharePoint UI elements and are skipped. For example, the Form Templates and List Template Gallery library items are UI elements and therefore not processed by File Analysis Suite. However, when viewing the SharePoint Site Contents page, these SharePoint items are included in the item count.
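The counting differences above can be sketched with a simplified model. This is an illustration only, not how either product computes its counts: the `Entry` class, its `kind` values, and both function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Entry:
    kind: str             # "folder", "document", "list_item", or "ui_element"
    attachments: int = 0  # only meaningful for "list_item"


def sharepoint_item_count(entries: List[Entry]) -> int:
    # SharePoint counts each entry once, whatever its attachments.
    return len(entries)


def fas_document_count(entries: List[Entry]) -> int:
    total = 0
    for entry in entries:
        if entry.kind == "document":
            total += 1
        elif entry.kind == "list_item":
            # Only the attachments are processed, not the entry itself.
            total += entry.attachments
        # Folders and UI elements are not counted.
    return total


entries = [
    Entry("folder"),
    Entry("document"),
    Entry("document"),
    Entry("list_item", attachments=0),
    Entry("list_item", attachments=10),
    Entry("ui_element"),
]

print(sharepoint_item_count(entries))  # 6
print(fas_document_count(entries))     # 12
```

In this model the same site content yields 6 items in SharePoint and 12 documents in File Analysis Suite.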

SharePoint deletion tracking

File Analysis Suite tracks the deletion of managed SharePoint items using the SharePoint change logs. Each time processing runs on a repository, whether on a schedule or on demand, File Analysis Suite checks the SharePoint change logs for deleted items. For each managed item that is deleted in SharePoint, File Analysis Suite deletes that item from the File Analysis Suite index. If an item within a container file (such as a ZIP file) is deleted in SharePoint, the item is removed from the index when the container file is updated during the File Analysis Suite job run.
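As a rough illustration of this reconciliation, the sketch below removes managed items from an index when their IDs appear among the deletions read from a change log. It is a simplified model, not the product's implementation: the index is assumed to be a dictionary keyed by item ID, and `apply_deletions` is a hypothetical name.

```python
from typing import Dict, Iterable, Set


def apply_deletions(index: Dict[str, dict], deleted_ids: Iterable[str]) -> Set[str]:
    """Remove deleted items from the index and return the IDs removed."""
    removed = set()
    for item_id in deleted_ids:
        # Only managed items (those present in the index) are affected.
        if item_id in index:
            del index[item_id]
            removed.add(item_id)
    return removed


index = {"doc-1": {"name": "a.docx"}, "doc-2": {"name": "b.xlsx"}}
removed = apply_deletions(index, ["doc-2", "doc-9"])
print(sorted(removed))  # ['doc-2']
print(sorted(index))    # ['doc-1']
```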

For accurate tracking of items deleted from SharePoint, update the SharePoint repositories in File Analysis Suite more often than the maximum number of days that the SharePoint change logs are kept. For example, if your SharePoint logs are configured to be stored for 60 days, verify that your SharePoint repositories are updated at least every 59 days.
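The rule can be expressed as a simple check. This is a minimal sketch with hypothetical function and parameter names. Both values are supplied by you; nothing here reads them from SharePoint or File Analysis Suite.

```python
def update_interval_is_safe(update_interval_days: int, log_retention_days: int) -> bool:
    """True if repositories are updated more often than logs are retained."""
    return update_interval_days < log_retention_days


def longest_safe_interval(log_retention_days: int) -> int:
    """Longest whole-day update interval that stays inside the retention window."""
    return log_retention_days - 1


print(update_interval_is_safe(59, 60))  # True
print(update_interval_is_safe(60, 60))  # False
print(longest_safe_interval(60))        # 59
```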