Dynamic Clustering

Dynamic clustering is a type of Automatic Query Guidance (AQG) in which IDOL Server clusters the best terms and phrases in the query results. It returns the IDs of the documents in which the best terms occur, so that you can treat those documents as clusters. This method does not address all terms and phrases in the result documents.

The following section lists some considerations for using dynamic clustering. For more information about how to configure and run dynamic clustering, see Automatic Query Guidance.

Algorithmic Outline

In dynamic clustering, IDOL Server finds the best phrases and terms in the result documents that the query returns. The best of these form the cluster titles. IDOL Server then clusters the phrases according to their conceptual similarity and the number of documents that contain some or all of the clustered phrases.
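The outline above can be illustrated with a small sketch. This is not IDOL Server's algorithm: it assumes you already have a mapping from each best phrase to the IDs of the documents that contain it, and it uses document overlap (Jaccard similarity) as a simple stand-in for conceptual similarity. The names group_phrases, jaccard, and threshold, and the sample data, are hypothetical.

```python
def jaccard(a, b):
    """Return the overlap between two sets of document IDs (0.0 to 1.0)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def group_phrases(phrase_docs, threshold=0.5):
    """Greedily group phrases whose document sets overlap.

    phrase_docs maps each phrase to the set of document IDs it occurs in.
    Document overlap is an assumed stand-in for conceptual similarity.
    """
    # Visit the phrases with the most documents first, so that the
    # strongest phrase in each group becomes the cluster title.
    ordered = sorted(phrase_docs, key=lambda p: (-len(phrase_docs[p]), p))
    clusters = []
    for phrase in ordered:
        docs = phrase_docs[phrase]
        for cluster in clusters:
            if jaccard(docs, cluster["docs"]) >= threshold:
                cluster["phrases"].append(phrase)
                cluster["docs"] |= docs
                break
        else:
            # No existing cluster is similar enough: start a new one.
            clusters.append(
                {"title": phrase, "phrases": [phrase], "docs": set(docs)}
            )
    # Largest clusters first; document count is the only significance signal.
    clusters.sort(key=lambda c: (-len(c["docs"]), c["title"]))
    return clusters


# Hypothetical best phrases and the document IDs they occur in.
example = {
    "solar power": {1, 2, 3, 4},
    "photovoltaic cells": {2, 3, 4},
    "wind turbines": {5, 6},
    "offshore wind": {5, 6, 7},
}
clusters = group_phrases(example)
for cluster in clusters:
    print(cluster["title"], sorted(cluster["docs"]), cluster["phrases"])
```

In this sketch, each cluster carries its title, the phrases grouped under it, and the document IDs, which mirrors the idea that the returned document IDs let you treat the documents as clusters.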

Things to Consider

Consider the following points when you are deciding whether to use dynamic clustering:

Results are returned as soon as the query completes. Some other approaches require a separate action to retrieve the cluster results.

Cluster results reflect the state of the index when the query runs. The results reflect any recent content. Other approaches might produce results that are already out of date because of newly indexed content, because they run clustering operations on the state of the index at a fixed point in the past. Dynamic clustering is ideal if your index is constantly being updated.

This method is ideal for providing guidance to users who search with free-text terms. Other approaches take longer to provide results for this use.

A few configuration parameters allow you to adjust the results, if required.

This method returns only the top few main clusters. It indicates some subclusters through the terms and phrases that it returns. If you want to use more main clusters, you might need to use a different approach.

This process slows down as you increase the MaxResults parameter. However, the quality of the clusters increases as the number of results increases. You must experiment with the results to find a compromise between quality and performance. HPE recommends that you start with a value of 100 for MaxResults, and increase it as required.

This functionality uses an action thread, which might reduce server performance for other operations.

The clustering operation considers only the documents in the query result set. If you want to cluster larger volumes of data in your index, you must use a different approach.

This method returns only result data. If you want to use a data visualization, you must create one yourself. Other approaches can create visualizations.

The only indication of the significance of a cluster is the number of documents that it contains. If you require other information, you might want to use a different approach.

The results of dynamic clustering are not saved to disk. You must generate them again if you need to re-use them.

The clusters returned relate only to the result set, not the whole index. Consider using a different approach if you want to find index-wide trends.
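One way to run the MaxResults experiment described above is a small timing harness. The sketch below is an assumption-laden illustration, not an IDOL client: run_query is a hypothetical callable that you supply, which sends your query with the given MaxResults value and returns the clusters. The fake_run_query function is only a stand-in so that the example runs; its output is not representative of real cluster counts.

```python
import time


def sweep_max_results(run_query, values):
    """Time a query at several MaxResults settings.

    run_query is a hypothetical callable: it takes a MaxResults value,
    runs the query, and returns a list of clusters.
    """
    rows = []
    for value in values:
        start = time.perf_counter()
        clusters = run_query(value)
        elapsed = time.perf_counter() - start
        rows.append(
            {"max_results": value, "seconds": elapsed, "clusters": len(clusters)}
        )
    return rows


def fake_run_query(max_results):
    # Stand-in for a real request to the server. Replace this with your
    # own query code; the returned values here are made up.
    return [f"cluster-{i}" for i in range(min(max_results // 50, 5))]


# Start at the recommended value of 100 and increase from there.
rows = sweep_max_results(fake_run_query, [100, 200, 400])
for row in rows:
    print(row["max_results"], row["clusters"], f"{row['seconds']:.6f}s")
```

Comparing the timings with your own judgment of cluster quality at each setting gives a basis for choosing a MaxResults value.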
