SharePoint connection

If you will be creating sources and datasets that process SharePoint or SharePoint Online (O365) data, you must complete additional tasks to enable processing by the processing agent.

Complete the tasks for SharePoint or SharePoint Online as appropriate for the data to be managed by OpenText Core Data Discovery & Risk Insights. If you will process data from both SharePoint and SharePoint Online, complete the tasks for both implementations.

For the supported SharePoint versions, see the supported data resources under Agent installation and configuration.

SharePoint connection tasks

Complete the following to process data from SharePoint.

  1. Update the logon account for OpenText Core Data Discovery & Risk Insights services

SharePoint Online connection tasks

OpenText Core Data Discovery & Risk Insights requires non-user-based access to SharePoint Online using either the Microsoft Entra ID access method (recommended) or the SharePoint app-only access method.

CAUTION: Microsoft is ending support for the SharePoint app-only authentication method. This authentication method will stop working for new SharePoint tenants as of November 1st, 2024; it will stop working for existing tenants and be fully retired as of April 2nd, 2026.

For more information, see Azure ACS retirement in Microsoft 365.

Configure web proxy settings (optional)

The SharePoint processor service controlled by the processing agent requires connectivity to the OpenText Core Data Discovery & Risk Insights cloud components, often located away from the local network where the agent host servers are located. Although direct connectivity is ideal, use of a web proxy may be required in some environments for the agent systems to reach the OpenText Core Data Discovery & Risk Insights cloud.

NOTE: Authenticated proxies are supported for SharePoint Online (O365) only.

SharePoint processing

When processing Microsoft Office items from SharePoint, OpenText Core Data Discovery & Risk Insights uses the last modified date in the de-duplication calculation. Because of the way modified dates are handled in Office items and in SharePoint (including OneDrive), OpenText Core Data Discovery & Risk Insights does not identify otherwise identical documents with different modified dates as duplicates.

  • When an Office item is uploaded to a SharePoint or OneDrive site web interface, the item's modified date is changed.

  • When an Office item is added to a local system and is then synchronized to the SharePoint site, the item's modified date is not changed.
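The effect of the modified date on duplicate detection can be illustrated with a minimal sketch. It assumes, purely for illustration, that a duplicate key is built from a content hash plus the last modified timestamp; the function name dedup_key and the use of SHA-256 are hypothetical and do not describe the product's actual calculation.

```python
import hashlib
from datetime import datetime, timezone
from typing import Tuple


def dedup_key(content: bytes, modified: datetime) -> Tuple[str, str]:
    """Hypothetical duplicate key: content hash plus last modified timestamp."""
    digest = hashlib.sha256(content).hexdigest()
    return (digest, modified.astimezone(timezone.utc).isoformat())


content = b"quarterly report"
original_date = datetime(2024, 3, 1, 9, 0, tzinfo=timezone.utc)
upload_date = datetime(2024, 3, 5, 14, 30, tzinfo=timezone.utc)

# Local file and its synchronized copy: the modified date is preserved.
local_copy = dedup_key(content, original_date)
synced_copy = dedup_key(content, original_date)

# Same bytes uploaded through the web interface: the modified date changes.
web_copy = dedup_key(content, upload_date)

print(synced_copy == local_copy)  # True: treated as duplicates
print(web_copy == local_copy)     # False: not treated as duplicates
```

Under this assumption, two byte-identical Office items match only when their modified dates also match, which is why an item uploaded through the web interface is not paired with its original.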

SharePoint Lists consist of form records, called items in SharePoint, that contain various text fields and can have attachments. When deleting SharePoint files, OpenText Core Data Discovery & Risk Insights does not delete attachments to items from SharePoint Lists.

SharePoint item counts

The document and item counts in OpenText Core Data Discovery & Risk Insights may differ from the "item" count as seen in the SharePoint site interface. This difference relates to the following.

  • In OpenText Core Data Discovery & Risk Insights, a document is an original file processed by OpenText Core Data Discovery & Risk Insights and an item is an attachment to an original file. In SharePoint, an item is a row in a table, or a record in a database and a document is a type of item.

  • In SharePoint, item counts are derived from the total number of folders, documents, and items (each entry in a SharePoint Item List). In OpenText Core Data Discovery & Risk Insights, document counts are derived from the total number of documents from SharePoint Document Libraries and attachments in a SharePoint Item List. OpenText Core Data Discovery & Risk Insights does not process the field, or entry, in an Item List, only the attachments from the Item List.

    For example, if a SharePoint Item has zero attachments, SharePoint records this as one item. If a SharePoint Item has 10 attachments, SharePoint also records this as one item.

  • The item count listed on the SharePoint Site Contents page for libraries includes all items in the library, including folders. Folders in which files exist in SharePoint are not included in counts in OpenText Core Data Discovery & Risk Insights.

  • When processing SharePoint content, OpenText Core Data Discovery & Risk Insights does not process SharePoint library items that include the UIVersion field. These SharePoint items are SharePoint UI elements and are skipped. For example, the Form Templates and List Template Gallery library items are UI elements and therefore not processed by OpenText Core Data Discovery & Risk Insights. However, when viewing the SharePoint Site Contents page, these SharePoint items are included in the item count.
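The counting differences described above can be sketched with a small model. The Site and ListItem classes and both counting functions are hypothetical names introduced only for illustration; the sketch ignores skipped UI elements and any other factors, so it is a simplification rather than the product's actual logic.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ListItem:
    """One entry in a SharePoint Item List (hypothetical model)."""
    attachments: int = 0


@dataclass
class Site:
    """Simplified view of a site's content (hypothetical model)."""
    folders: int = 0
    library_documents: int = 0
    list_items: List[ListItem] = field(default_factory=list)


def sharepoint_item_count(site: Site) -> int:
    # SharePoint counts folders, documents, and each list entry as one item,
    # regardless of how many attachments the entry has.
    return site.folders + site.library_documents + len(site.list_items)


def product_document_count(site: Site) -> int:
    # The application counts library documents plus list attachments;
    # folders and the list entries themselves are not counted.
    attachments = sum(item.attachments for item in site.list_items)
    return site.library_documents + attachments


site = Site(
    folders=2,
    library_documents=5,
    list_items=[ListItem(attachments=0), ListItem(attachments=10)],
)

print(sharepoint_item_count(site))   # 9
print(product_document_count(site))  # 15
```

In this example the two list entries contribute two items to the SharePoint count but ten documents to the application count, and the two folders appear only in the SharePoint count.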

SharePoint deletion tracking

OpenText Core Data Discovery & Risk Insights tracks the deletion of managed SharePoint items made at the original source location using the SharePoint change logs. Each time processing is run on a dataset—on a schedule, or on demand—OpenText Core Data Discovery & Risk Insights checks the SharePoint change logs for deleted items. For each managed item that is deleted in SharePoint, that item is deleted from OpenText Core Data Discovery & Risk Insights. If an item within a container file (such as ZIP) is deleted in SharePoint, the item is removed from the application as part of updating the container file when the job run occurs.

To track items deleted from SharePoint accurately, update the SharePoint datasets in OpenText Core Data Discovery & Risk Insights more often than the maximum number of days SharePoint change logs are kept. OpenText Core Data Discovery & Risk Insights uses information from the SharePoint change logs to identify deleted SharePoint items to be removed from the application. Without this information, items deleted from your SharePoint environment cannot be removed from the application. SharePoint items that have been added or modified are still updated appropriately in OpenText Core Data Discovery & Risk Insights.

For example, if your SharePoint change logs are configured to be stored for 60 days, verify that your SharePoint datasets are updated at least every 59 days.

CAUTION: Failure to rescan SharePoint datasets before SharePoint change logs are purged will result in items being tracked incorrectly in OpenText Core Data Discovery & Risk Insights. Using the same example, if your SharePoint change logs are configured to be stored for 60 days and your SharePoint datasets are updated every 90 days, you will lose 30 days of important information about deleted items: items deleted during this 30-day period will not be removed from the application.

The loss of information cannot be reconciled in OpenText Core Data Discovery & Risk Insights; you would have to create a new dataset and start over.
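The relationship between the change log retention period and the dataset update interval can be checked with simple arithmetic. The following is a minimal sketch; the function names are hypothetical, and the retention and interval values must come from your own SharePoint and dataset schedule settings.

```python
def is_rescan_interval_safe(retention_days: int, rescan_interval_days: int) -> bool:
    """True when datasets are updated more often than change logs are kept."""
    return rescan_interval_days < retention_days


def untracked_deletion_days(retention_days: int, rescan_interval_days: int) -> int:
    """Days of deletion history purged before the next dataset update."""
    return max(0, rescan_interval_days - retention_days)


# Change logs kept for 60 days, datasets updated every 59 days.
print(is_rescan_interval_safe(60, 59))   # True
print(untracked_deletion_days(60, 59))   # 0

# Change logs kept for 60 days, datasets updated every 90 days.
print(is_rescan_interval_safe(60, 90))   # False
print(untracked_deletion_days(60, 90))   # 30
```

An interval equal to the retention period is reported as unsafe here, matching the guidance that datasets be updated more often than the logs are kept.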