Improve Automatic Query Guidance

This section describes some ways that you can improve the results from Automatic Query Guidance.

QuerySummaryAdvanced and QuerySummaryPlus

The QuerySummaryAdvanced parameter turns on the advanced algorithm required for Dynamic Clustering and Automatic Query Guidance. You can alternatively set the QuerySummaryPlus parameter to True. In this mode, IDOL Server uses an improved phrase selection algorithm, which is marginally slower.

HPE recommends that you turn on QuerySummaryPlus, unless the very small performance cost is critical.

Number of Results

The key to good AQG results is that they are generated from good results sets. For example, running QuerySummary on a set of only 10 documents rarely gives strong results. The more documents in the results set (that is, the higher your value for MaxResults), the better the AQG results are. The only limit is your performance requirements.

For this reason, you usually perform AQG with two IDOL queries. The first gets the results that you actually want to display on the screen (for example, the top ten list in a user interface pane). The second query is the QuerySummary query, which has a much higher MaxResults value and sets Print to NoResults to avoid returning the unnecessary document matches.

This method improves the response time by not returning results that are not useful.
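The two-query pattern described above can be sketched as follows. This is a minimal illustration, not a definitive client: the host, port, URL layout, and the MaxResults value of 500 are assumptions, and only the parameter names (QuerySummary, MaxResults, Print) come from this section.

```python
from urllib.parse import urlencode

# Assumed host and ACI port; replace with your own IDOL Server details.
BASE_URL = "http://localhost:9000/"


def build_display_query(text, max_results=10):
    """Build the query whose documents are shown to the user."""
    params = {"action": "Query", "Text": text, "MaxResults": max_results}
    return BASE_URL + urlencode(params)


def build_summary_query(text, max_results=500):
    """Build the QuerySummary query.

    It uses a much larger results set, and Print=NoResults so that the
    server does not return the document matches themselves.
    """
    params = {
        "action": "Query",
        "Text": text,
        "QuerySummary": "True",
        "MaxResults": max_results,  # illustrative value, tune for your system
        "Print": "NoResults",
    }
    return BASE_URL + urlencode(params)
```

You would send both requests for the same query text, render the documents from the first response, and read the AQG elements from the second.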

Number of Elements

You can control the number of elements that the QuerySummary returns by using the QuerySummaryLength configuration parameter. You should set this parameter to a value that returns sufficient elements for your purposes. You rarely need a value of more than 100, because it is unlikely that there are more than 100 strong elements in a particular results set.

The number of useful elements varies with the query. To determine whether to use the element, you should look at information such as the occurrence counts or cluster ID (see Query Summary Response Format).

For example, you might use only elements that appear in more than five documents in the results set, or those that have a positive cluster ID.
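A minimal sketch of this kind of filtering is shown below. The Element class and its field names are hypothetical stand-ins for values that you would parse from the query summary response. Combining the two tests with "or" is one possible choice; you might equally apply only one of them.

```python
from dataclasses import dataclass


@dataclass
class Element:
    # Hypothetical field names; map them from the actual response format.
    text: str
    doc_occurrences: int  # number of result documents containing the element
    cluster_id: int


def select_useful(elements, min_docs=5):
    """Keep elements that occur in more than min_docs documents,
    or that have a positive cluster ID."""
    return [
        e for e in elements
        if e.doc_occurrences > min_docs or e.cluster_id > 0
    ]
```

The threshold of five documents follows the example in the text. A suitable value depends on how large your results set is.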

Number of Terms

IDOL Server generates elements by analyzing the most important terms in the results set. By default, it uses the top 50 terms in the set, which is not enough for QuerySummaryAdvanced.

You can increase the value of the QuerySummaryTerms configuration parameter to increase the quality of the elements. In general, use a value of 1000 or lower. Higher values do not improve the quality any further, and might eventually reduce performance. HPE recommends that you test with different values to determine the value that gives you the best balance of performance and quality for your environment.

Content to Use

By default, IDOL uses the SourceType fields to generate the AQG results, as well as the TitleType fields. If you have not configured any SourceType fields, then it uses all Index fields, which might include fields that are not suitable for this type of analysis. HPE recommends that you set SourceType fields to only those containing clean natural language text.

To prevent extremely long documents causing slow-performing queries, AQG uses only the first 6000 characters from each document. You can modify this value by using the QuerySummaryMaxDocLength configuration parameter. However, HPE recommends that you change this setting only in advanced cases.

By default, AQG generation does not use numeric terms. You can change this behavior by using the QuerySummaryNumbers configuration parameter.

Filter Unwanted Elements

You might get unwanted AQG elements if the documents that you use to generate AQG results contain boiler-plate text, such as a disclaimer or e-mail signature. To prevent these elements, you can configure IDOL to detect very common phrases in the index automatically when it starts. You can edit the resulting list of phrases manually if required. To turn on this detection, use the QuerySummaryStopPhraseMode configuration parameter.

Similarly, you can provide a list of phrases to favor when choosing AQG elements, by using the QuerySummaryWhiteListMode configuration parameter.

For example, when creating AQG results from financial documents, you might provide a list of financial terms and phrases that you want to give higher weighting.

You can modify the stop phrase list and the white list by using the DREQUERYSUMMARYMANAGEMENT index action. You can also view the phrases on these lists by using the QuerySummaryManagement action. For more information, refer to the IDOL Server Reference.

Other Parameters

Other parameters can affect the quality of your elements, because they change the results set from which IDOL Server generates the elements.

Setting Combine to Simple might improve quality if many sections of the same few documents dominate your results set.

Furthermore, if you expand a term (for example, Apollo), the top 100 documents of a query might all be about the moon landings, with the Greek Mythology results too far down the list to be reflected in the elements. In this case, setting Sort to Random might improve results, by giving a more representative sample.

NOTE:

If you set Sort to Random for a multiple term query, you might need to also set the MinLinks, MinScore, or MatchAllTerms parameters to prevent poor matches from dominating the results. For more information, refer to the IDOL Server Reference.
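As an illustration of the note above, the following sketch builds the parameter string for a QuerySummary query that samples randomly and guards against weak matches. The parameter names come from this section. The function name, the default MaxResults value, and the MinScore value in the usage note are assumptions that you should tune for your own index.

```python
from urllib.parse import urlencode


def build_random_sample_params(text, max_results=500, min_score=None,
                               match_all_terms=False):
    """Build parameters for a QuerySummary query with Sort=Random.

    min_score and match_all_terms are optional guards against poor
    matches dominating a randomly sorted results set.
    """
    params = {
        "action": "Query",
        "Text": text,
        "QuerySummary": "True",
        "MaxResults": max_results,
        "Print": "NoResults",
        "Sort": "Random",
    }
    if min_score is not None:
        params["MinScore"] = min_score
    if match_all_terms:
        params["MatchAllTerms"] = "True"
    return urlencode(params)
```

For a multiple term query you might call build_random_sample_params("apollo mission", min_score=40), where 40 is only a placeholder threshold.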

