Query Performance Considerations

After you identify the steps that take most of the query time, you can tune your configuration and queries to reduce response times.

The following sections suggest some of the options to consider to optimize your query and action performance.

Network Latency

When you measure query performance, look in the request log for the times. If the time taken to receive a response is much greater than the times shown in the IDOL logs, then it might be network latency that is slowing down the query.

To view the request log, you can either use the GRL action, or go to the Logs page in the Monitor section of IDOL Admin.

Verify network latency by running a sample list of about 100 queries on both the local IDOL machine and your testing machine. If the timings on the testing machine are much higher than the timings on the IDOL machine, network latency is the likely cause. You can also use the LogRequestTiming configuration parameter or the GRL action to help diagnose query latency.
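The comparison above can be sketched in Python. This is a minimal illustration, not an IDOL tool: the send callable is a hypothetical stand-in for whatever client code sends one query and waits for the response, and the function names are invented for this example. You would run time_queries once on the IDOL machine and once on the testing machine, then compare the two sets of timings.

```python
import statistics
import time
from typing import Callable, Iterable, List


def time_queries(send: Callable[[str], object], queries: Iterable[str]) -> List[float]:
    """Return the elapsed wall-clock time, in seconds, for each query."""
    timings = []
    for query in queries:
        start = time.perf_counter()
        send(query)  # hypothetical: sends one query and blocks until the response arrives
        timings.append(time.perf_counter() - start)
    return timings


def latency_gap(local_timings: List[float], remote_timings: List[float]) -> float:
    """Median time on the testing machine minus median time on the IDOL machine."""
    return statistics.median(remote_timings) - statistics.median(local_timings)
```

The median is used rather than the mean so that one or two unusually slow queries do not dominate the comparison. A large positive gap suggests that the extra time is spent on the network rather than in IDOL.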

You can reduce the amount of data that comes back by using the PrintFields parameter to return only the fields that you need. You can also use the TCPReceiveWindowSize configuration parameter to tune network performance. However, typically, your system administrator must address the network latency issue.

FieldText Restrictions

If you have unoptimized FieldText restrictions that run against a high percentage of your index, you might be able to alter the query to improve performance.

TIP:

You can use the IDOL Admin user interface to query the server and analyze query speed.

If you have a query of the type Text=*&FieldText=MATCH{MyTerm}:FIELD1, you might benefit from adding Text=MyTerm:FIELD1.

For this kind of field restriction, FIELD1 must be an Index field. If you choose to use this method, you must also consider the other processing that applies to Index fields, such as stop words and the IndexNumbers configuration.

When you use this type of query a lot, it is usually more efficient to use an optimized field type.

TIP:

You can regenerate and optimize some field types even after indexing.

See Also: Worked Example: Query for Numeric Values.

See Also: FieldText Optimizations

Check Committed Documents

When you run a GetStatus action, check the number of documents and committed_documents. If the number of committed_documents is much higher than documents, it might mean that you have deleted a lot of documents without freeing up the index slots.

TIP:

You can also use the Index Summary tab on the Status page in IDOL Admin to view the number of documents and committed documents.

You can run a DRECOMPACT index action to remove the unnecessary committed documents, and this can greatly improve your query performance, depending on the ratio of committed_documents to documents. The query speed improvements depend on a number of factors, so there is no one ideal ratio at which to perform a compaction. A general guideline is to compact when the ratio of committed documents to documents is 1.2-1.5, but in some systems, higher ratios might be acceptable.

If the ratio is higher than 2, you might find it faster to export your data, run a DREINITIAL, and then reindex your content, than to run a DRECOMPACT. However, during a DRECOMPACT, the documents are still available for querying, whereas if you initialize and reindex, the data is unavailable until the reindex is complete (unless you reindex to a new server and keep the original one available).
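The guideline above can be expressed as a small check. This is a sketch only: it assumes you have already read the documents and committed_documents values from a GetStatus response, the function name is hypothetical, and the thresholds (1.2 and 2) are taken from the general guideline in the text rather than from any fixed rule.

```python
def compaction_advice(documents: int, committed_documents: int) -> str:
    """Suggest a maintenance step from the committed_documents to documents ratio."""
    if documents <= 0:
        return "no live documents; ratio undefined"
    ratio = committed_documents / documents
    if ratio > 2.0:
        # Above 2, exporting and reindexing might be faster than compacting.
        return "consider export, DREINITIAL, and reindex"
    if ratio >= 1.2:
        # Lower end of the 1.2-1.5 guideline; some systems tolerate higher ratios.
        return "consider DRECOMPACT"
    return "no action suggested"
```

For example, 1000 documents with 1300 committed documents gives a ratio of 1.3, which falls in the range where a compaction is worth considering.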

Performance correlates closely with the number of documents in IDOL Server, so freeing up index slots is generally quite helpful. Deleted documents have the largest effect on the L and A stages of the query (that is, the stages where IDOL is looking at the term information).

TotalResults and Prediction

When the TotalResults action parameter is set to True, IDOL Server gives a count of all documents for a given query, which adds query overhead. If you do not need the exact count, leave this parameter out, or set Predict to True to approximate the total number of results, rather than giving an exact count.

Additionally, consider the TotalResultsPredictionThreshold configuration parameter. Using a low value for this parameter improves query performance when you have both TotalResults and Predict set to True.
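As an illustration, the following Python sketch builds a query parameter string that requests a total only when the caller asks for one. The function name is hypothetical, and the sketch assumes the action is named Query and that the server accepts standard percent-encoded parameters. It sends Predict only alongside TotalResults, following the description above.

```python
from urllib.parse import quote, urlencode


def build_query_string(text: str, total_results: bool = False, predict: bool = False) -> str:
    """Build the parameter string for a query, adding count options only when asked."""
    params = {"action": "Query", "Text": text}
    if total_results:
        params["TotalResults"] = "True"
        if predict:
            # Request an approximate total instead of an exact count.
            params["Predict"] = "True"
    # quote_via=quote encodes spaces as %20 rather than +
    return urlencode(params, quote_via=quote)
```

Leaving both options at their defaults produces a string with no TotalResults parameter at all, which avoids the counting overhead for callers that do not need a total.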

Similarly, for the GetQueryTagValues action, use the DocumentCount parameter sparingly, because this option can increase the running time of the action.

Find Terms that Do Not Add Value

If you have a large loop count, you might find that you have some terms that occur in a very large number of documents, but which do not add much value. You can use the TermGetAll action to find terms that are very common. Consider adding these terms to the relevant stop word list to reduce the index size and improve query performance. Also consider using the default english_large.dat stop word list for English content.

NOTE:

To get the best performance, add your very common terms to the stop list before indexing. If you add the stop word list after indexing, you can still improve query performance, but it does not reduce index size for existing documents. You can also use the StopWordIndex parameter to configure whether and how you want to index stop words.

TIP:

To see your most common terms, use the IDOL Admin performance monitoring Terms tab, or run the following action:
action=TermGetAll&MaxTerms=100&Type=DocOccs

Numeric and Alphanumeric Searches

In some cases you might not need to search for numeric or alphanumeric values in indexed fields. In this case, you can optimize the index by using the IndexNumbers configuration parameters. You can still use FieldText for numeric and alphanumeric values if you do not index them.

Alternatively, if you need numeric and alphanumeric search, but you do not need wildcard values on these terms, consider using the SplitNumbers configuration parameter, or the IndexNumbersType field property and IndexNumbersMaxValue configuration parameter. These terms often make up a large part of a search index.

You can check whether you have a lot of numeric and alphanumeric values by using the TermGetAll action with TermAnalysis set to True. Alternatively, you can use the Terms tab in the Performance page of IDOL Admin to monitor your terms.

DatabaseMatch

You can return fewer and more appropriate documents by separating your indexed data logically in databases, according to how queries use it (you can use the Databases page in the Control section of IDOL Admin to administer your databases). You can then use the DatabaseMatch action parameter for queries. IDOL applies this restriction as one of the first steps in query evaluation, which helps it to narrow down the result list very quickly.

MinDate and MaxDate

If your queries look for data that is date specific, use the MinDate and MaxDate action parameters to return fewer, more relevant documents. These parameters use the values in DateType fields, which IDOL Server stores for fast retrieval.

MaxResults

Choose the value of the MaxResults parameter carefully. Performance is better for a smaller result set, and unless you are performing legal or e-discovery searches, you do not normally need a large initial result set.

PrintFields

Performance is better for a smaller number of fields in the PrintFields parameter. The performance is worst when you set the Print parameter to all, because IDOL Server must load a lot more data from disk. Micro Focus recommends that you print only the required fields, and avoid printing internal fields that are generally returned by default, such as the reference, databases, languages, and so on.

Tag Clouds, AQG or Dynamic Clusters

The QuerySummary action parameter provides the dynamic thesaurus, Tag Cloud, clustering, and AQG functionality. See Automatic Query Guidance.

You can configure the number of concepts that IDOL Server returns for the thesaurus by using the QuerySummaryLength parameter. You can improve performance by using the lowest number that provides adequate results for your purposes. Similarly, use the lowest appropriate value for the QuerySummaryTerms configuration parameter.

Summaries

Generating dynamic summaries during queries can add query overhead, and it relies on SourceType fields. If you have multiple SourceType fields defined, you can improve the summary performance by lowering the value of the MaxSourceCharacters configuration parameter.

You might also want to consider using a lower value for the Sentences and Characters action parameters. You can easily test the results of changing these values by removing or changing the parameter and rerunning slow queries.

Wildcards

Wildcard expansion can be performance intensive, because IDOL Server must look up a lot of terms, and load a large amount of data. You might want to introduce a requirement in your front-end application for a minimum number of prefix characters before a wildcard value (for example, to allow sta* but disallow st*). Leading wildcard values (for example *ing) are especially slow, so use these values sparingly.
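A front-end check of this kind might look like the following Python sketch. The function names and the default minimum of three prefix characters are assumptions chosen to match the sta* and st* example above. The check only considers the * wildcard and splits the query text on whitespace, so it does not account for any other query syntax.

```python
from typing import List


def wildcard_allowed(term: str, min_prefix: int = 3) -> bool:
    """Return True if the term has no * or has at least min_prefix characters before it."""
    star = term.find("*")
    if star == -1:
        return True  # no wildcard, nothing to restrict
    return star >= min_prefix


def rejected_terms(query_text: str, min_prefix: int = 3) -> List[str]:
    """List the whitespace-separated terms that the front end should refuse."""
    return [t for t in query_text.split() if not wildcard_allowed(t, min_prefix)]
```

With the default setting, sta* is accepted, while st* and the leading wildcard *ing are both rejected because fewer than three literal characters precede the wildcard.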

You can improve performance for wildcard searches by increasing the UnstemmedMinDocOccs configuration parameter to the default value or higher. You can also disable wildcard searches by setting the DisallowWildcards configuration parameter to True.

If your queries generally target wildcards against only a known set of document fields, consider using the UnstemmedTrackFields configuration parameter. This parameter allows you to optimize a small set of fields for wildcard queries, and works best if the specified fields contain a limited pool of data (for example, people names).

In a system where you have a large amount of memory, you might want to consider increasing the UnstemmedMemoryMaxSize configuration parameter. This parameter specifies the size of the unstemmed index search structure that IDOL Server can hold in memory. Unstemmed searches are faster when there is a higher memory limit. However, this option does increase the memory requirement of IDOL.

If most of the overly expanded wildcards return unintended results, you can also consider using the WildcardMaxTerms configuration parameter.

Combine

The Combine options help to reduce the number of results returned, especially if you have duplicate documents identified by a ReferenceType field. The Simple option is the fastest, and Micro Focus recommends that you use this option if you have used document sectioning during the index process.

FieldCheck

The FieldCheck option is a very fast way to restrict the initial result set, much like DatabaseMatch. Use this filter against a field whose value you match in its entirety. This method works best where the FieldCheck field values have a uniform distribution in the document set.

FieldCheckType fields improve the performance for MATCH queries, but you can only define one field per document as FieldCheckType in IDOL Server. You can also use the FieldCheck value in the Combine operation.

