3.1 Windows File System

3.1.1 Table Relationships

Windows File System Metadata

The collected scan data is generally broken down into three major areas: Identity System info, File System data, and Permissions data.

For general file system metadata collection, only file system data is collected, along with minimal identity system data pertaining to file and folder owners.

Windows File System Permissions

NTFS Permissions data is limited to folder structure as well as assigned and inherited NTFS access control entries (ACEs).

It should be noted that permissions scans do not include metadata specific information such as directory quota, nor do they include any file-entry data that is not a folder. Only permissions for folder, share, and DFS entries are currently collected.

3.1.2 Scoping and Filtering

Scoping is the process by which selected data is limited to areas of interest. Areas of interest may include all file system data related to a specific identity system, or only data within one or more subdirectories. Additionally, data could be scoped as it relates to a given owner or trustee.

Scope by Identity System

Scoping by identity system is as simple as limiting a query to a specific srs.identity_system.id value, or using one of the supported srs.current_* views, a specific identity system name.

The following example selects file system data from a given identity system, limited to 100 entries.

Example (SQL Server)

Example (PostgreSQL)

Scope by Server

Scoping by server is as simple as filtering by the server column in the srs.scan_targets table or in one of the supported srs.current_* views.

Also note that the server name may be case sensitive depending on the database collation.

The following example selects all file system data from a specific server, limited to 100 entries.

Example (SQL Server)

Example (PostgreSQL)

Scope by Scan Target

Scoping by scan target is useful where a specific CIFS share name or DFS target is known.

Note that the scan target name may be case sensitive depending on the database collation.

Example: Select file system data from a particular scan target (share or volume) limited to 100 entries.

Example (SQL Server)

Example (PostgreSQL)

Scope by Directory

Scoping by a particular directory or folder requires the use of the hierarchical markers in the srs.scan_data table.

These markers assist with determining parent and child folders as well as all subordinate file system entries for a given directory or set of directories.

Field

Description

Notes

idx

Entry index.

Unique per scan.

parent_idx

Index of parent directory, share or DFS name space entry.

All sibling file system entries will have the same parent index.

path_depth

Current path depth relative to root path.

The root path is always depth zero (0).

Other paths such as shares may have the same depth as the root path, but can be distinguished by path_type.Entries occurring above the root path (such as DFS name spaces) will have a negative value.

ns_left ,

ns_right

Nested set indexes for current entry.

Nested set markers provide a quick way to determine all subordinates for a given directory. See examples below for detail.

The following example selects all NTFS file system entries subordinate to and including the specified target path.

Example: Scope by Directory

In this example, we are using two SELECT statements: one to get the information for the desired root path, and one to pull all subordinate entries along with the root path. Notice how the JOIN filter in the second SELECT statement uses not only the scan_id to limit the particular scan(s) of interest, but also uses the ns_left and ns_right fields to keep the data set limited to file entries in the folder hierarchy.

In the following diagram, an example of the nested set model calculations are shown with an example structure under \\Server\Share. In this example, exactly 1,000 file system entries exist, including files, folders, and the share itself.

Figure 3-1 Nested Set Calculations Example

For each node in the scanned file structure, a left (ns_left) and right (ns_right) value are assigned. The values are assigned by traversing the imaginary path from the root down the left side of the structure, incrementing the ns_left values by one. Once a leaf node is encountered, the incrementing value continues, but is now assigned to ns_right.

This process continues until the entire graph of the file structure has been traversed, and the root path is finally assigned the last number for its ns_right value.

The nested set model has the following characteristics, some of which are vital to hierarchical processing, such as determining subordinate objects:

  • The root path will always have an ns_left value of 1 and an ns_right value of 2n, where n = the total number of entries.

  • For any given container object (folder, share, etc.), all subordinate entries can be found by searching for all objects in the scan having an ns_left value greater than the container path’s ns_left value, and an ns_right value less than the container path’s ns_right value.

  • Nested set is generally the fastest method available in relational data models for retrieving all subordinate objects when representing hierarchical data.

For more information on the nested set model, see http://en.wikipedia.org/wiki/Nested_set_model.

3.1.3 File System Target Paths

Starting with File Reporter 4.1, users may now define and manage a Custom Query report's selected target paths via the report definition itself, separate from any associated SQL queries.

This process is accomplished via a temporary table that is injected into the SQL query session at runtime when using any of the File Reporter tools such as Report Designer or the SQL query editor in the File Reporter web application for Custom Query reports.

Newer report templates available on the File Query Cookbook site (https://filequerycookbook.com) make use of this construct which provides a more hands-off approach for users not as comfortable with modifying SQL queries directly but who need the flexibility to define and change a report's file system target paths.

Example Query

To understand this process, the following example illustrates a custom query that reports on NTFS file system permissions for one or more target paths selected with the File System Target Paths dialog in Report Designer.

IMPORTANT:SQL Server requires a hash '#' prefix when referencing temporary tables.

When using SQL Server as the backend database, be sure that any references to tmp_cq_fs_paths in your SQL queries are changed to #tmp_cq_fs_paths instead.

Conversely, PostgreSQL cannot use hash marks '#' as part of the table name, so be sure that this prefix does not exist in your SQL queries when using PostgreSQL as the backend database.

  1. Launch the File Reporter Report Designer application and create a new empty report.

    See Creating a Report in the File Reporter 4.1 Client Tools Guide for details.

  2. Depending on the database in use, enter one of the following SQL queries into the SQL query editor dialog.

    Example (SQL Server)

    Example (PostgreSQL)

  3. Click Save to save the SQL query.

  4. Click File System Paths to open the File System Target Paths dialog.

  5. Select one or more paths to report on, then save the selection.

    Be sure to select paths that are marked as having Permissions scan data available as seen in the File System Target Paths dialog.

  6. Click Execute Query to run the SQL query and see the results.

Using Alternate SQL Query Editors

When developing a SQL query for a Custom Query report, you may wish to develop the query itself in a SQL query editor of your choice, such as SQL Server Management Studio (SSMS) or PgAdmin for PostgreSQL.

In these development environments, the injected temporary table is not available by default. To stage the temporary table, use the following approach.

IMPORTANT:Although any existing report definition may be used as a reference, we strongly advise creating a new Report Definition and using its associated ID.

This process allows flexibility for changing the selected target paths during the query design phase without impacting other report definitions.

  1. Create a new Custom Query Report.

    See Creating a Report in the File Reporter 4.1 Client Tools Guide.

  2. Assign one or more File System target paths to the report definition.

    See File System Paths Selector in the File Reporter 4.1 Client Tools Guide.

  3. Find the report ID for the newly created report.

    1. In the Main form of the Report Designer, find the name of the newly created report definition.

    2. The column at the far right of the grid indicates the ID for each report. Make note of the new report definition's ID number.

  4. Insert the following SQL code at the start of the query.

    Example (SQL Server)

    Example (PostgreSQL)

  5. Be sure to change the example's report ID of "17" to the report ID identified from the previous step.

  6. Add SQL statements as needed to complete the query.

  7. When the SQL query development is complete, copy all of the SQL statements into the Custom Query report definition except for the initial lines used to stage the temporary table.

    Using the example query from earlier, a complete query using a staged temporary table with an alternate SQL query editor looks as follows:

    Example (SQL Server)

    Example (PostgreSQL)