Uses for Clusters from Snapshots

This section describes some of the common uses of clusters created using snapshots and the IDOL Cluster actions.

What’s New

The recommended workflow produces the clusters that are present in the data set at a particular point in time. If you want to see new trends and clusters that have recently appeared, you can add the WhatsNew parameter to the ClusterCluster action. In this case, IDOL identifies clusters that are present in one snapshot (the most recent snapshot with a particular name), but not in another (the oldest snapshot with that name).

You can also specify a time frame in the ClusterCluster action. IDOL then compares the newest and oldest snapshots with the specified name that fall within that time frame.

To create What’s New data, you must have at least two snapshots with the same name. To make best use of the What’s New function, Micro Focus recommends that you take regular snapshots of your index.
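
As an illustration, the following Python sketch builds the request URL for a ClusterCluster action that has What's New enabled. Only the ClusterCluster action and the WhatsNew parameter come from this section. The host, port, job names, and the SourceJobName, TargetJobName, StartDate, and EndDate parameter names are assumptions made for the example, so check them against the action reference for your IDOL version before you use them.

```python
from urllib.parse import urlencode


def build_whats_new_url(host, port, source_job, target_job,
                        start_date=None, end_date=None):
    """Build a ClusterCluster request URL with What's New enabled.

    Assumption: the snapshot name is passed as SourceJobName and the
    output job as TargetJobName. Verify these names in your reference.
    """
    params = {
        "action": "ClusterCluster",
        "SourceJobName": source_job,   # name shared by the snapshots (assumed parameter)
        "TargetJobName": target_job,   # name for the resulting clusters (assumed parameter)
        "WhatsNew": "true",
    }
    # Optional time frame (assumed parameter names; the date format
    # depends on your IDOL configuration).
    if start_date is not None:
        params["StartDate"] = start_date
    if end_date is not None:
        params["EndDate"] = end_date
    return f"http://{host}:{port}/" + urlencode(params)


# Hypothetical host, port, and job names.
url = build_whats_new_url("localhost", 9020, "DailySnapshot", "DailyWhatsNew")
print(url)
```

Sending this request only makes sense after at least two snapshots named DailySnapshot exist, as described above.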

Spectrographs

You can use snapshots to create spectrographs. A spectrograph is a visual representation of the clusters present in your documents. It displays how clusters change and grow, by joining clusters together from adjacent time periods, using colored lines of varying widths. The width of a line indicates the number of documents in the cluster, and the brightness of the color of the line shows how important the cluster is.

You can overlay cluster data on the spectrograph to create a user-friendly representation. It shows at a glance which clusters are present, which of them are or were most important, and how the clusters developed over a particular time frame.

To create a spectrograph you must have at least two snapshots, because a spectrograph compares cluster result sets to show new clusters as well as clusters that have changed or disappeared. These snapshots must contain identifiable clusters, according to the configured BindLevel.

For the best spectrograph results, Micro Focus recommends that you use a large collection of snapshots. For example, you can set up a schedule to create a snapshot every day as your data is updated.
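
To make the idea of joining clusters from adjacent time periods concrete, the following Python sketch links clusters from two snapshots by the overlap of their document sets. This section does not describe how IDOL matches clusters, so the Jaccard overlap measure, the 0.3 threshold, and the sample cluster names and document IDs are all assumptions chosen for illustration, not IDOL's method.

```python
def link_clusters(earlier, later, min_overlap=0.3):
    """Link clusters in two adjacent snapshots by Jaccard overlap.

    earlier, later: dict mapping cluster title -> set of document IDs.
    Returns (links, new, disappeared), where links is a list of
    (earlier_title, later_title, overlap_score) tuples.
    """
    links = []
    for old_title, old_docs in earlier.items():
        for new_title, new_docs in later.items():
            union = old_docs | new_docs
            score = len(old_docs & new_docs) / len(union) if union else 0.0
            if score >= min_overlap:
                links.append((old_title, new_title, score))
    linked_old = {link[0] for link in links}
    linked_new = {link[1] for link in links}
    # Clusters with no link are either newly appeared or have disappeared.
    new = sorted(set(later) - linked_new)
    disappeared = sorted(set(earlier) - linked_old)
    return links, new, disappeared


# Hypothetical clusters from two daily snapshots.
monday = {"elections": {1, 2, 3, 4}, "floods": {5, 6}}
tuesday = {"election results": {2, 3, 4, 7, 8}, "transfer window": {9, 10, 11}}

links, new_clusters, gone_clusters = link_clusters(monday, tuesday)
print(links)          # [('elections', 'election results', 0.5)]
print(new_clusters)   # ['transfer window']
print(gone_clusters)  # ['floods']
```

In a spectrograph, each link would be drawn as a line between the two time periods, with the width taken from the number of documents in the cluster.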

Maps

You can use clusters to generate maps. A map is another visual representation of clusters, but it displays only a single point of time. On a map, a cluster is represented by a block of color. The distance between clusters on the map represents the similarity of those clusters; a smaller distance implies greater similarity.

To create a map, there must be at least one cluster in a result set. If there are no clusters, IDOL produces a blank map. In this case, apply the troubleshooting steps to produce more clusters.

You can generate maps on demand as part of a query, for example by using XML data from a query result. You can create both two-dimensional and three-dimensional maps.

Automatic Category Discovery

When you find a particularly interesting trend, you can import a representative cluster as a category, to enable future monitoring. IDOL uses the documents in the cluster as the category training, and it uses the cluster title as the category name. The CategoryImportFromCluster action allows you to import the specified cluster.

In this way, users can automatically discover and create categories, without needing to review the documents in the index. The cluster documents have a high level of conceptual similarity, so the category training is usually of good quality, particularly if you have set a high BindLevel.
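
As a minimal sketch, the following Python code builds a request URL for the CategoryImportFromCluster action. Only the action name comes from this section. The host, port, job name, cluster number, and the SourceJobName and Cluster parameter names are assumptions made for the example, so confirm them in the action reference for your IDOL version.

```python
from urllib.parse import urlencode


def build_category_import_url(host, port, source_job, cluster_number):
    """Build a CategoryImportFromCluster request URL.

    Assumption: the cluster job is identified by SourceJobName and the
    cluster to import by Cluster. Verify both names in your reference.
    """
    params = {
        "action": "CategoryImportFromCluster",
        "SourceJobName": source_job,  # job that holds the clusters (assumed parameter)
        "Cluster": cluster_number,    # cluster to import (assumed parameter)
    }
    return f"http://{host}:{port}/" + urlencode(params)


# Hypothetical host, port, job name, and cluster number.
import_url = build_category_import_url("localhost", 9020, "DailyClusters", 3)
print(import_url)
```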

