Workbooks
Workbooks are the basic tools for organizing data items for review. Workbooks contain data items that are either manually added or are added based on a set of criteria.
You can create the following types of workbooks.
-
A static workbook represents a set of items that you have manually gathered. You can create an empty workbook and then add items to it, or create the workbook and add items to it at the same time from a document list. You can continue to add items to this workbook as needed.
-
A query workbook represents a set of items based on defined search criteria, at a point in time. The search criteria you used to gather the items are saved as part of the workbook but cannot be edited. The search query is performed once, at the time the workbook is created. Similar to a static workbook, you can continue to manually add items to this workbook as needed.
TIP: Since the query is performed at the time the workbook is created, ensure that at least one dataset has been selected for the workspace. If there are no datasets, the query cannot return items and the workbook will be empty.
-
A dynamic query workbook represents a set of items based on defined search criteria gathered on an ongoing basis. The search is performed when the workbook is created and again as new items are processed or existing items are reprocessed, and matching items are added to the workbook. You can edit the search criteria as desired, so the items within a dynamic query workbook can fluctuate.
-
(Unstructured data only) A task workbook represents a set of items gathered as a result of a specific action, such as de-duplication or random sampling. The defined action is taken on items associated with selected datasets and performed once at the time the workbook is created. View the task history on the Activity tab of the workbook detail panel. Similar to a static workbook, you can continue to manually add items to a task workbook as needed. For more information, see Task workbook.
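The main difference between the static, query, and dynamic query workbook types is when the search criteria are evaluated. The following Python sketch models only that timing. It is not Fusion code: the `Workbook` class, `create_workbook`, `on_item_processed`, and the item dictionaries are hypothetical names chosen for illustration, and task workbooks are left out.

```python
class Workbook:
    """Hypothetical model of a workbook as a named set of item IDs."""

    def __init__(self, name, kind, criteria=None):
        self.name = name
        self.kind = kind          # "static", "query", or "dynamic"
        self.criteria = criteria  # predicate on an item; None for static
        self.item_ids = set()


def create_workbook(name, kind, items, criteria=None):
    """Create a workbook; query and dynamic workbooks search at creation."""
    workbook = Workbook(name, kind, criteria)
    if kind in ("query", "dynamic"):
        workbook.item_ids = {i["id"] for i in items if criteria(i)}
    return workbook


def on_item_processed(workbooks, item):
    """Called when an item is processed or reprocessed.

    Only dynamic query workbooks pick up new matches; static and query
    workbooks keep whatever they contained before.
    """
    for workbook in workbooks:
        if workbook.kind == "dynamic" and workbook.criteria(item):
            workbook.item_ids.add(item["id"])


# Usage: two items exist at creation time, a third arrives later.
existing = [{"id": "a", "type": "TXT"}, {"id": "b", "type": "PDF"}]
is_text = lambda item: item["type"] == "TXT"

static_wb = create_workbook("manual", "static", existing)
query_wb = create_workbook("snapshot", "query", existing, is_text)
dynamic_wb = create_workbook("ongoing", "dynamic", existing, is_text)

on_item_processed([static_wb, query_wb, dynamic_wb], {"id": "c", "type": "TXT"})
```

After the third item arrives, only the dynamic query workbook contains it. The query workbook still reflects the point in time at which it was created.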
Workbook considerations:
-
You can create a workbook based on a workbook template. When you do so, the Category, Type, and Criteria Summary are pre-populated based on the template. You can make changes to these default selections when you create a workbook based on the template. See Workbook templates.
-
A workbook cannot be associated with multiple categories at the same time. It may either have no category associated with it, or a single category. See Categories.
-
When you delete a workbook in Fusion, the documents associated with the workbook are not actually deleted. The specific grouping of the documents within the workbook is deleted. The documents can be reassembled using a search query, or by pulling them from other workbooks within the workspace.
-
You cannot delete a workbook while actions or processes are in progress.
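The deletion rules above can be pictured with a small model in which a workbook is only a grouping of document IDs. This is a sketch under that assumption, not a description of how Fusion stores data. The `Workspace` class and its attribute names are hypothetical.

```python
class Workspace:
    """Hypothetical model: documents live in the workspace, workbooks group them."""

    def __init__(self):
        self.documents = {}  # document ID -> document metadata
        self.workbooks = {}  # workbook name -> set of document IDs
        self.busy = set()    # names of workbooks with an action in progress

    def delete_workbook(self, name):
        """Remove the grouping only; the documents themselves are kept."""
        if name in self.busy:
            raise RuntimeError(
                f"'{name}' has an action in progress; cancel it or wait"
            )
        del self.workbooks[name]

    def find(self, predicate):
        """Reassemble a set of documents with a search."""
        return {
            doc_id for doc_id, doc in self.documents.items() if predicate(doc)
        }
```

Because `delete_workbook` never touches `documents`, the same documents can be gathered again later with `find` or from another workbook.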
Once a workbook is created, you can open the workbook detail panel to view additional information and to initiate actions on the workbook.
NOTE: From time to time, the format of a workbook may change. If the workbook format has changed, a warning icon displays next to the workbook name in the detail panel. Hover over the warning icon to learn more and, if necessary, how to update the workbook.
-
On the GENERAL tab, view the assigned category, the workbook type, when and by whom the workbook was created and last modified, and the total size of documents in the workbook. You can also view the total number of metadata-only documents, analyzed documents, collected documents, items with grammar values extracted, documents on hold, items exported, documents protected, documents sent to a target, shared content, and documents deleted. Hover over the document count bar charts to view the total file size. Item counts do not include the hover-over file size information because a data set may include both parent and child items, which would skew the actual file size.
Click the bar chart to go to the Content tab to view the items or documents related to the selected data set. The shared content metric identifies the number of documents in the workbook (number on the right) and the total number of shared items in the workbook that exist in other workbooks within the workspace.
Click the buttons to view the workbook contents, edit the workbook, export the documents to a defined export location, or delete the workbook.
-
(Unstructured data only) On the ACTIVITY tab, initiate file analysis tasks such as store content, OCR, and extract grammar values, as well as file actions such as data collection, holds, send data to a target, data protection, and deletion of documents. For more information about the actions you can initiate on workbooks, see Manage workbook activity.
TIP: You have access to features and functions based on your assigned permissions. You will only see features and functions that have been enabled for the workspace you are viewing and that you have permission to see.
You can also view the history and status of any actions that have been initiated for the workbook.
-
On the DATA PRIVACY tab, view the weighted risk score and the tags applied to documents within the workbook.
On a scale of 0 to 100, the risk score represents the level of risk. It is calculated from the sensitive tags associated with the documents in the workbook, relative to their respective weights (associated weighted labels). The higher the risk score, the higher the risk. A risk score of 0 does not mean there is no risk; it means that no tags associated with weighted labels were identified.
To create a workbook
-
On the workspace Overview page, click the WORKBOOKS tab.
-
Click NEW WORKBOOK.
TIP: If no workbooks exist in this workspace, click Create a New Workbook.
The New Workbook dialog opens.
-
Complete the general options for the new workbook.
Name: Type a unique, meaningful name for the workbook.
NOTE: Duplicate workbook names are not allowed within the same workspace.
Limits: Maximum 50 characters.

Template: Select whether to base this workbook on a defined template.
-
To define the workbook from scratch, select None.
-
To define the workbook from a template, select the desired template from the list. The values defined for options in the template are pre-populated in this dialog. You can edit these values if desired.
NOTE: This option defaults to None. Make sure the desired option is selected.

Category: Select the desired category to apply to this workbook.
Default: None, or the category defined as part of the selected workbook template.

Type: Select whether this is a Static, Query, or Dynamic Query workbook.

Description: Type a meaningful description for the workbook.
Limits: Maximum 250 characters.
Click NEXT.
-
(Query Workbook, Dynamic Query Workbook) Complete the criteria for the new workbook.
-
Click Select to select the criteria for the workbook. The criteria dialog opens.
-
Click the add icon to add the desired criteria and complete the details as applicable.
Query and dynamic query workbook criteria syntax
Text Keywords: Type the desired keywords.
TIP: Click the add icon to add an additional keyword field.
Matches all indexed text fields and content.
The keyword search supports Boolean syntax, wildcards, phrases and proximity searches.
NOTE: The text keyword search is ANDed with any additional options selected in the search builder.
Search example: ("company declared bankruptcy" AND "Smith and Jones") OR "ric?ard smith*"
ANY of the following: Add one or more criteria and define the specific values to be matched.
Each entry is combined with the OR operator.
To add grammar rules:
-
Click the GRAMMARS tab and expand the desired grammar class or type.
-
Click the desired grammar rules.
-
Continue to expand grammar classes and rules and then click grammar rules as needed.
-
Click ADD.
The selected grammar rules are added to the Search Builder.
To add additional grammar rules for a grammar class or type already selected:
-
Click the list icon for the selected grammar class or type.
-
Click the additional desired grammar rules or click Select All to add all grammar rules for the grammar class or type.
-
Click ADD.
The selected grammar rules are added to the Search Builder.
-
Create Date > By date range From <date> to <date> OR File types TXT, XLS.
-
("company declared bankruptcy" AND "Smith and Jones") AND Create Date > By date range From <date> to <date>
ALL of the following: Add one or more criteria and define the specific values to be matched.
Each entry is combined with the AND operator.
To add grammar rules:
-
Click the GRAMMARS tab and expand the desired grammar class or type.
-
Click the desired grammar rules.
-
Continue to expand grammar classes and rules and then click grammar rules as needed.
-
Click ADD.
The selected grammar rules are added to the Search Builder.
To add additional grammar rules for a grammar class or type already selected:
-
Click the list icon for the selected grammar class or type.
-
Click the additional desired grammar rules or click Select All to add all grammar rules for the grammar class or type.
-
Click ADD.
The selected grammar rules are added to the Search Builder.
-
Create Date > By date range From <date> to <date> AND File types TXT, XLS.
-
("company declared bankruptcy" AND "Smith and Jones") AND Create Date > By date range From <date> to <date> AND File types TXT, XLS.
For multiple-value fields, items of each metadata type in the list are combined with the OR operator, then combined with any other metadata fields with the AND operator.
-
(Create Date > By date range From <date> to <date>) AND (File types TXT OR XLS) AND (Holds "myHold1" OR "Hold2")
-
(Create Date > By date range From <date> to <date>) AND (Tags "tag01" OR "tag02")
In this example, "tag01" and "tag02" are defined in a single Tags field.
-
(Create Date > By date range From <date> to <date>) AND (Tags "tag01") AND (Tags "tag02")
In this example, "tag01" and "tag02" are defined in separate Tags fields.
NONE of the following: Add one or more criteria and define the specific values that, if matched, are excluded from the results.
Each entry is combined with the OR operator.
To add grammar rules:
-
Click the GRAMMARS tab and expand the desired grammar class or type.
-
Click the desired grammar rules.
-
Continue to expand grammar classes and rules and then click grammar rules as needed.
-
Click ADD.
The selected grammar rules are added to the Search Builder.
To add additional grammar rules for a grammar class or type already selected:
-
Click the list icon for the selected grammar class or type.
-
Click the additional desired grammar rules or click Select All to add all grammar rules for the grammar class or type.
-
Click ADD.
The selected grammar rules are added to the Search Builder.
Selecting NONE of the following excludes items that match the criteria from the search results.
NOTE: When Group Attachments is selected in the filter panel, the results list may include parent items normally excluded by this selection. If an attachment meets the criteria for this selection and all other selected criteria, these parent items may be included in the results.
Search example: ("company declared bankruptcy" AND "Smith and Jones") NOT File types TXT, XLS.
-
Click SAVE. The criteria dialog closes and the selected criteria displays in the Summary field.
Click NEXT.
-
Review the summary of the new workbook options.
Click FINISH.
The new workbook is created.
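The way the ANY, ALL, and NONE groups combine in the criteria table can be sketched in Python. This is an illustration of the Boolean logic described above, not the product's search engine. The `matches` and `field_matches` functions and the item field names are hypothetical, and keyword and date-range matching are omitted.

```python
def field_matches(item, field, allowed_values):
    """Values listed in one multiple-value field are combined with OR."""
    item_values = item.get(field, [])
    if not isinstance(item_values, (list, set, tuple)):
        item_values = [item_values]
    return any(value in allowed_values for value in item_values)


def matches(item, any_of=(), all_of=(), none_of=()):
    """Each group is a list of (field, allowed_values) entries.

    any_of:  at least one entry must match (entries combined with OR)
    all_of:  every entry must match (entries combined with AND)
    none_of: an item that matches any entry is excluded
    """
    if any_of and not any(field_matches(item, f, v) for f, v in any_of):
        return False
    if not all(field_matches(item, f, v) for f, v in all_of):
        return False
    if any(field_matches(item, f, v) for f, v in none_of):
        return False
    return True


item = {"file_type": "TXT", "tags": ["tag01"]}

# "tag01" and "tag02" in a single Tags field: either tag is enough.
single_field = matches(item, all_of=[("tags", {"tag01", "tag02"})])

# "tag01" and "tag02" in separate Tags fields: both tags are required.
separate_fields = matches(
    item, all_of=[("tags", {"tag01"}), ("tags", {"tag02"})]
)

# NONE of the following: File types TXT, XLS.
excluded = not matches(item, none_of=[("file_type", {"TXT", "XLS"})])
```

The item carries only "tag01", so it matches when both tags share one Tags field but not when they are defined in separate Tags fields.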
To save a search as a dynamic workbook
-
From the Content tab of a workspace or a dynamic workbook, conduct a search and review the search results document list.
-
At the top of the data list, click
> Save As.
The Save As a dynamic workbook dialog opens.
-
Complete the details of the new dynamic workbook.
Name: Type a meaningful, unique name for the new dynamic workbook.
Duplicate workbook names are not allowed within the same workspace.
Limits: Maximum 50 characters.

Description: Type a meaningful description for the new dynamic workbook.
Limits: Maximum 250 characters.

Category: Select a category to assign to the dynamic workbook or select No Category.

Criteria: Review the summary of the search criteria. This option is based on the search conducted and is not editable.
-
Click SAVE.
The dynamic workbook is created and displays in the list of workbooks for the workspace.
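The name and description limits stated in the workbook dialogs can be checked before you submit a workbook. The following Python sketch is one way to express those documented limits. The `validate_workbook` function is hypothetical, and treating names as duplicates only on an exact, case-sensitive match is an assumption.

```python
MAX_NAME_LENGTH = 50          # documented limit for workbook names
MAX_DESCRIPTION_LENGTH = 250  # documented limit for descriptions


def validate_workbook(name, description, existing_names):
    """Return a list of problems; an empty list means the input passes."""
    problems = []
    if len(name) > MAX_NAME_LENGTH:
        problems.append(f"name exceeds {MAX_NAME_LENGTH} characters")
    # Assumption: duplicates are detected by exact, case-sensitive match.
    if name in existing_names:
        problems.append("name already exists in this workspace")
    if len(description) > MAX_DESCRIPTION_LENGTH:
        problems.append(
            f"description exceeds {MAX_DESCRIPTION_LENGTH} characters"
        )
    return problems
```

For example, `validate_workbook("Q3 review", "", {"Q3 review"})` reports only the duplicate name.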
To edit a workbook
-
On the Workbooks page within a workspace, click or hover over the row for the desired workbook and then click the edit icon.
TIP: You can also click in the row for the desired workbook, open the detail panel, and then click EDIT.
The Edit Workbook dialog opens.
-
Edit the workbook information as desired.
You cannot change the workbook type. If you are editing a dynamic query workbook, you can edit the criteria.
-
Click OK.
The workbook is updated.
To delete a workbook
-
On the Workbooks page within a workspace, click or hover over the row for the desired workbook and then click the delete icon.
TIP: You can also click in the row for the desired workbook, open the detail panel, and then click DELETE. If Delete is dimmed, another action is in progress. You can cancel the in-progress action or wait until the action has completed and then try again.
-
In the confirmation dialog, click YES to confirm the action.
TIP: If another action is in progress, you will receive a message that the workbook cannot be deleted. Click NO to close the dialog. You can cancel the in-progress action or wait until the action has completed and then try again.
The workbook is deleted. Any documents that were in the workbook remain in the workspace and in any other workbooks to which they were associated.
To view workbook contents
Do one of the following from the list of workbooks on the Workbooks page within a workspace.
-
To view a list of all data associated with the workbook, click the workbook name. The Content tab opens and displays the list of data associated with the selected workbook.
TIP: You can also click in the row for the desired workbook, open the detail panel, and then click VIEW CONTENTS.
Use the View by selection in the action menu on the Content tab to view by items or by conversations.
-
To view a list of data items within the workbook that have been analyzed, collected, had grammar values extracted, placed on hold, exported, protected, sent to a target, deleted, or that are shared, click in the row for the desired workbook and then open the detail panel. On the GENERAL tab, click the bar chart for the document or item set you want to view. The Content tab opens and displays the list of data items within the workbook related to the selected data set.
Task workbook
NOTE: Applies to unstructured data workspaces only.
A task workbook lets you either identify duplicate data within or across datasets or randomly sample a defined percentage of data from one or more datasets.
Once the task workbook is created, the action is taken for the selected datasets and performed once at the time the task workbook is created. You can continue to manually add items to a task workbook as needed. To view the history for the task, open the workbook detail panel and review the information on the Activity tab. You can edit the name/prefix and description for a task workbook, but not the criteria.
TIP: Both deduplication and random sampling require that at least one dataset exists within the workspace and that the dataset includes documents. If no datasets exist for the workspace, create at least one dataset before you create a task workbook. You may need to allow time for the dataset to populate with documents.
Deduplication
By identifying duplicate data, you can then determine where you have possible redundant data and take any additional actions as appropriate, such as deleting duplicates.
To identify duplicate items, you define a dataset to represent official records to compare against, or you define rules to identify master items to compare against. The identified duplicate items and all family members of those items (such as attachments or parent items) are added to the workbook.
When an item is processed by Fusion, a hash value is created. The hash acts as a fingerprint that can identify a file, excluding the file name. An item is identified as a duplicate if it has the same hash as an item identified as an official record or master item.
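Hash-based duplicate detection can be illustrated with a short Python sketch. The documentation does not say which hash algorithm Fusion uses, so SHA-256 from the standard `hashlib` module is an assumption here, and the `fingerprint` and `find_duplicates` functions are hypothetical.

```python
import hashlib


def fingerprint(content: bytes) -> str:
    """Hash the content only; the file name plays no part (SHA-256 assumed)."""
    return hashlib.sha256(content).hexdigest()


def find_duplicates(official_records, candidates):
    """Return the names of candidates whose hash matches an official record.

    Both arguments map a file name to that file's bytes.
    """
    official_hashes = {fingerprint(c) for c in official_records.values()}
    return sorted(
        name
        for name, content in candidates.items()
        if fingerprint(content) in official_hashes
    )


official = {"policy.docx": b"same bytes"}
others = {"copy of policy.docx": b"same bytes", "notes.txt": b"different"}
duplicates = find_duplicates(official, others)
```

The renamed copy is reported as a duplicate because its content, and therefore its hash, is identical to the official record.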
-
On the workspace Overview page, click the WORKBOOKS tab.
-
Click NEW TASK.
TIP: If no workbooks exist in this workspace, click Create a New Task.
The New Task dialog opens.
-
Select Deduplication as the task type.
-
Select one or both of the deduplication methods. Click the toggle to select or deselect the option.
-
Select Compare against official records to identify items already present in a set of official records. You will define one or more datasets and items within those datasets will be considered official records. Items in remaining datasets that are identical to the official record items are then considered copies.
-
Select Deduplicate within and across datasets to identify master items and duplicate items across and within a set of datasets based on a set of master rules. Items that are identical to items in other datasets are considered duplicates based on dataset priority, create date, or modify date.
-
Select Only compare root documents to compare only root documents for duplicates. Click the toggle to select or deselect the option.
-
When enabled (selected), only root documents are compared. Email attachments and files that are inside ZIPs will not be compared.
-
When not enabled (deselected), all items, including email attachments and files inside ZIPs, are compared.
-
Select how the content is compared.
-
Select Full binary data to compare based on complete binary fingerprint including content and any embedded metadata.
-
Select Essential data to compare based on essential content (such as text, images and formatting). Excludes any embedded metadata not intrinsic to the file's content, such as SharePoint properties.
NOTE: This option pertains only to Office documents.
TIP: To fully deduplicate a set of documents that comprises Office and non-Office documents, create two task workbooks with the same comparison options except that one compares full binary data and one compares essential data.
-
Click NEXT.
-
Select datasets to compare.
NOTE: This page does not display if you did not select to compare against official records.
Select datasets to be considered as the official records. Identical items in non-selected datasets will be considered duplicates.
Click NEXT.
-
Set the master rules.
NOTE: This page does not display if you did not select to deduplicate across and within datasets.
Select how the master item is set, based on one of the following:
-
Select Dataset to determine the master item based on dataset priority.
Set the order in which datasets are compared. If an identical item is encountered in multiple datasets, the item in the lower-ranked dataset is considered a duplicate of the item in the highest-ranked dataset.
If you also selected to compare against official records, datasets selected as the official records are not shown. These selected datasets are considered the highest priority.
To set the order of the datasets
Do one or more of the following:
-
Click the up arrow on the move button to move the dataset up one level.
-
Click the down arrow on the move button to move the dataset down one level.
-
Click and hold the center portion of the move button and then drag the dataset card up or down to the desired location.
-
Select Oldest Create Date to identify the master item as the item with the oldest create date. Identical items across datasets with newer creation dates are considered duplicates.
-
Select Newest Create Date to identify the master item as the item with the newest creation date. Identical items across datasets with older creation dates are considered duplicates.
-
Select Oldest Modify Date to identify the master item as the item with the oldest modify date. Identical items across datasets with newer modify dates are considered duplicates.
-
Select Newest Modify Date to identify the master item as the item with the newest modify date. Identical items across datasets with older modify dates are considered duplicates.
Click NEXT.
-
Complete the task workbook information.
Name: Type a unique, meaningful name for the workbook.
NOTE: Duplicate workbook names are not allowed within the same workspace.
Limits: Maximum 50 characters.

Description: Type a meaningful description for the workbook.
Limits: Maximum 250 characters.

Category: Select the Redundant category to apply to this workbook.
Default: None, or the category defined as part of the selected workbook template.
Click NEXT.
-
Review the summary information and then click START.
The new task workbook is created and the deduplication process starts.
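The master rules in the procedure above can be sketched as choosing one master per group of identical items. This Python example illustrates the selection logic only. The function name, rule names, and item fields are hypothetical, ties are resolved by input order as an assumption, and family members are not handled.

```python
from collections import defaultdict
from datetime import date


def split_masters_and_duplicates(items, rule, dataset_order=()):
    """Pick one master per group of items with the same hash.

    items: dicts with "id", "hash", "dataset", "created", "modified".
    rule: "dataset", "oldest_create", "newest_create",
          "oldest_modify", or "newest_modify".
    dataset_order: dataset names, highest priority first ("dataset" rule).
    Returns (master_ids, duplicate_ids).
    """
    rank = {name: position for position, name in enumerate(dataset_order)}
    rules = {
        "dataset": (lambda i: rank[i["dataset"]], min),
        "oldest_create": (lambda i: i["created"], min),
        "newest_create": (lambda i: i["created"], max),
        "oldest_modify": (lambda i: i["modified"], min),
        "newest_modify": (lambda i: i["modified"], max),
    }
    key, pick = rules[rule]

    groups = defaultdict(list)
    for item in items:
        groups[item["hash"]].append(item)

    masters, duplicates = [], []
    for group in groups.values():
        master = pick(group, key=key)
        masters.append(master["id"])
        duplicates.extend(i["id"] for i in group if i is not master)
    return masters, duplicates


items = [
    {"id": "a", "hash": "h1", "dataset": "Archive",
     "created": date(2020, 1, 1), "modified": date(2021, 1, 1)},
    {"id": "b", "hash": "h1", "dataset": "Shares",
     "created": date(2019, 1, 1), "modified": date(2022, 1, 1)},
    {"id": "c", "hash": "h2", "dataset": "Shares",
     "created": date(2018, 1, 1), "modified": date(2018, 1, 1)},
]
by_dataset = split_masters_and_duplicates(items, "dataset", ["Archive", "Shares"])
by_oldest = split_masters_and_duplicates(items, "oldest_create")
```

With dataset priority, item "a" in the higher-ranked Archive dataset is the master and "b" is its duplicate. With Oldest Create Date, the roles are reversed. To model official records, place those datasets first in `dataset_order`, since the documentation treats them as the highest priority.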
Random Sampling
With random sampling, you can gather a percentage of your data into a workbook for review and then, based on that review, determine where you may have sensitive data.
Random sampling example
You sample 5% of each of your SharePoint and File System datasets in North America. Based on a review of the data gathered from that sampling, you see that you have a significant amount of personal information in your SharePoint datasets, but nothing in your File System datasets. You can now focus your time and resources on the SharePoint datasets to ensure this personal information is properly secured.
-
On the workspace Overview page, click the WORKBOOKS tab.
-
Click NEW TASK.
TIP: If no workbooks exist in this workspace, click Create a New Task.
The New Task dialog opens.
-
Select Random Sampling.
Click NEXT.
-
Set the sampling rate for each dataset within the workspace. Each dataset defaults to 10%.
TIP: For datasets that contain few documents, the number of documents sampled may not be the exact percentage. The more documents in the dataset, the more accurately the number of sampled documents represents the percentage defined.
-
To set the sampling percentage rate, click and drag the slider to the desired percentage.
As you move the slider, the estimated number of documents, out of the total number of documents in the dataset, displays.
-
If you have more than one dataset and do not want to sample all of them, set the sampling rate percentage to 0% (all the way to the left) for datasets you do not want to sample.
Click NEXT.
-
Complete the task workbook information.
Workbook Prefix: Type a unique, meaningful prefix for the task workbook name. The defined prefix is followed by the dataset name to create the workbook name, <prefix><dataSourceName>.
Limits: Maximum 10 characters.
NOTE: Duplicate workbook names are not allowed within the same workspace.
If the prefix plus the original dataset name exceeds 50 characters, the dataset name portion of the workbook name is truncated to bring the workbook name down to 50 characters.

Description: Type a meaningful description for the task workbook.
Limits: Maximum 250 characters.

Category: Select the desired category to apply to this workbook.
Default: None, or the category defined as part of the selected workbook template.
Click NEXT.
-
Review the summary information and then click START.
A new workbook is created for each dataset with a sampling rate set at 1% or greater.
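The sampling and naming rules above can be sketched in Python. This is an approximation for illustration: the documentation does not state how Fusion rounds the sample size or selects documents, so `round` and `random.sample` are assumptions, and all function names are hypothetical.

```python
import random

MAX_NAME_LENGTH = 50  # documented limit for workbook names
DEFAULT_RATE = 10     # each dataset defaults to 10%


def sample_size(total_documents, rate_percent):
    """Number of documents to sample (rounding is an assumption)."""
    return round(total_documents * rate_percent / 100)


def workbook_name(prefix, dataset_name):
    """Prefix plus dataset name, truncating the dataset part to fit the limit."""
    room = max(0, MAX_NAME_LENGTH - len(prefix))
    return prefix + dataset_name[:room]


def plan_sampling(prefix, datasets, rates):
    """Build one sampled workbook per dataset with a rate of 1% or greater.

    datasets: dataset name -> list of document IDs.
    rates: dataset name -> sampling rate in percent.
    """
    workbooks = {}
    for name, document_ids in datasets.items():
        rate = rates.get(name, DEFAULT_RATE)
        if rate < 1:
            continue  # datasets set to 0% are not sampled
        count = sample_size(len(document_ids), rate)
        workbooks[workbook_name(prefix, name)] = random.sample(
            list(document_ids), count
        )
    return workbooks


plan = plan_sampling(
    "S1_",
    {"SharePoint NA": [f"d{i}" for i in range(100)], "File System NA": ["f1", "f2"]},
    {"SharePoint NA": 5, "File System NA": 0},
)
```

For a small dataset, the sampled share can differ noticeably from the requested rate. In this sketch, 10% of 7 documents rounds to 1 document, which is about 14% of the dataset.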