Improve Query Performance

See Also: Tune IDOL Server

This section describes how to tune your IDOL Server queries and optimize performance.

This page describes how to find and analyze information about your query performance, and find out the stages that slow your queries down. You can then see Query Performance Considerations to find out what changes you can make to improve performance.

Query IDOL

IDOL Server has many query action and configuration parameters that affect response times. In general, the server is optimized to quickly narrow the list of possible hits when it evaluates a query, but there are several areas to consider when optimizing response times for a particular situation. IDOL Server internally processes a query in various steps. You must consider each of these steps when you attempt to diagnose slow performance.

When a user sends a query, IDOL Server performs the following actions:

  1. It loads the required term information from the dyntermindex, including information to satisfy any wildcard terms.

  2. It finds the documents containing the minimal set of required terms.

    For example, if the query is A AND B, and B occurs in fewer documents than A, the loop initially needs to consider only the documents that contain B.

  3. It checks the remaining terms for each candidate document to verify that the document satisfies the Boolean expressions in the query.

  4. It performs additional metadata filtering, for databases, dates, and languages.

  5. It applies any FieldText and security restrictions.

  6. It returns the results to the user, applying any Sort and Combine rules from the query.

  7. It performs any additional result enrichment processing (highlight, summary, AQG).
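
Steps 2 and 3 can be illustrated with a small sketch. This is not IDOL Server's implementation; it is a toy model that assumes an in-memory index mapping each term to the set of documents that contain it. The names index and match_all, and the sample data, are hypothetical.

```python
# Toy inverted index: term -> set of document IDs (illustrative data only).
index = {
    "A": {1, 2, 3, 4, 5, 6},
    "B": {2, 5},
}

def match_all(terms, index):
    """Return the IDs of documents that contain every term in `terms`."""
    # Step 2: start from the term that occurs in the fewest documents.
    ordered = sorted(terms, key=lambda t: len(index.get(t, ())))
    candidates = set(index.get(ordered[0], ()))
    # Step 3: verify the remaining terms only for the surviving candidates.
    for term in ordered[1:]:
        postings = index.get(term, set())
        candidates = {doc for doc in candidates if doc in postings}
    return candidates

print(sorted(match_all(["A", "B"], index)))  # [2, 5]
```

Because B occurs in only two documents, the loop checks two candidates against A instead of checking all six documents that contain A.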

When there are a large number of terms in the query, or when the query terms are very common, the query has a larger footprint because it matches more documents. This results in slower queries.

To analyze query performance in more detail, you can use the Content query log with LogLevel set to Full. The following section describes the log output. You can also use query performance analysis in IDOL Admin to view timing information for queries, or set PerformanceAnalysis to True in your Query action to return the details in the ACI response. See PerformanceAnalysis Information for more information.
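
As a minimal sketch, the following Python helper builds a Query action URL with PerformanceAnalysis set to True. The helper name build_query_url, the host, and the port are assumptions for illustration; substitute the ACI host and port of your own Content component. The helper only builds the URL string and does not send the request.

```python
from urllib.parse import urlencode, quote

def build_query_url(host, port, text, **params):
    """Build a Query action URL that requests performance analysis details."""
    query = {"action": "Query", "Text": text, "PerformanceAnalysis": "True"}
    query.update(params)  # extra action parameters, for example MaxResults
    # quote_via=quote encodes spaces as %20 rather than '+'.
    return f"http://{host}:{port}/{urlencode(query, quote_via=quote)}"

# Host and port below are placeholders, not values from this guide.
url = build_query_url("localhost", 9100, "web administration", MaxResults=10)
print(url)
```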

IDOL Query Log Information

The following example shows a query log with LogLevel set to Full.

01/07/2007 17:18:54 [7] /action=query&outputencoding=UTF8&xmlmeta=True&text=web administration&spellcheck=True&fieldtext=MATCH{.doc,.xls,.ppt,.mpp,.pdf,.html,.ndx}:FILETYPE&minscore=70&databasematch=InternalWiki&securityinfo=<string>&combine=SIMPLE&synonym=True&printfields=FileType&maxresults=10&totalresults=True&summary=CONTEXT&characters=300&highlight=SUMMARYTERMS&starttag=<myStartTag>
01/07/2007 17:18:55 [7] L 51523; A 3612; F 1345; S 322; DL 0; SL 0; DT 1340738
01/07/2007 17:18:58 [7] Returning 10 matches
01/07/2007 17:18:58 [7] Query complete
L

The total number of results that this query can include. In this part of the query, IDOL Server identifies possible matches based on the terms in the Text parameter.

A smaller number here generally leads to faster query times, because there are fewer documents to evaluate further. Overly expensive wildcard values (such as text=*), or very common terms increase this value. The fastest possible case is a single query term that appears in only one document.

This is the main way that IDOL quickly searches through millions of documents. In the earlier log example, 51,523 documents matched the query terms.

A

The number of documents that still match after matching Boolean expressions in the Text parameter, and using weighting to determine whether the document has a high enough relevance to return as a result.

In the earlier log example, 3,612 documents passed the basic tests such as the MinScore relevance value of 70%. From this example, you can see that using a relevance threshold reduces the potential result set to about 7% of the initial set (3,612 of 51,523 documents).

F

The number of documents that still match the query after metadata checks. This step includes database matching, language matching, FieldCheck, and so on. Using these options in a query can help IDOL Server to quickly narrow the set of results that it has to filter further or return.

From the example, 1,345 documents match here. These documents must all go through the slowest part of the query loop, which includes AgentBoolean restrictions, security, and FieldText. FieldText restrictions can be slow if they include non-optimized fields, so you must consider why the server must evaluate a lot of results here.

NOTE:

A high value here does not necessarily indicate a problem, as long as your query runs fast enough. Even optimized queries increment this counter.

S

The number of documents added to the final result structure, after the slow checks. If a document is counted here, it has passed all checks.

In the log example, 322 documents passed through FieldText operations for final filtering. The actual number of results that IDOL Server returns might be smaller, after taking into account the Combine and MaxResults parameters.

DL

The number of times that IDOL Server accessed a DLL, for example, when using mapped security.

In the log example, 0 means that IDOL Server did not call any external libraries (like security) to help evaluate the query.

NOTE:

Not all security types require calls to the DLL for security evaluation.

SL

The number of times IDOL Server had to load a field from disk to sort a particular document properly, even though it was sorting on a SortType field.

To improve this number, consider how many fields you want to sort on, and whether you can use an optimized field type.

In the log example, 0 means that IDOL Server did not have to go to disk to satisfy any sorting requirements for the query.

DT

The number of kilobytes of dynterm data loaded. This number has an impact on performance if the server has to look up and load lots of data to return the result set. Consider the number of fields you print, the number of results, and the underlying disk I/O and bandwidth.

The DT value is an important metric. In the log example, 1.28 GB is read from disk into memory. At an HDD read rate of 100-200 MB/s, this read alone might take roughly 6 to 13 seconds, regardless of later steps.
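
The arithmetic behind these figures can be reproduced with a short calculation. This sketch assumes 1 MB = 1024 KB and a sustained sequential read at the stated rate; real disks and caches behave less predictably. The function name read_seconds is hypothetical.

```python
dt_kb = 1_340_738  # DT value from the log example, in kilobytes

def read_seconds(kilobytes, rate_mb_per_s):
    """Estimate the time to read `kilobytes` at a sustained rate in MB/s."""
    return (kilobytes / 1024) / rate_mb_per_s

dt_gb = dt_kb / 1024 ** 2
print(f"{dt_gb:.2f} GB")  # 1.28 GB

for rate in (100, 200):
    print(f"{rate} MB/s: about {read_seconds(dt_kb, rate):.1f} s")
# 100 MB/s: about 13.1 s
# 200 MB/s: about 6.5 s
```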

NOTE:

IDOL Server needs this amount of free memory in addition to its idle memory usage. If the memory is not available, the system might start swapping, which further reduces performance.
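
To compare these counters across many queries, you can extract them from the log with a script. The following sketch assumes that the counter line always has the layout shown in the log example; the regular expression and the function name parse_counters are illustrative, not part of IDOL Server.

```python
import re

# Matches "<counter name> <integer>" pairs such as "L 51523" or "DT 1340738".
COUNTER_RE = re.compile(r"\b(L|A|F|S|DL|SL|DT) (\d+)")

def parse_counters(line):
    """Extract the query log counters from one log line as a dict of ints."""
    return {name: int(value) for name, value in COUNTER_RE.findall(line)}

line = "01/07/2007 17:18:55 [7] L 51523; A 3612; F 1345; S 322; DL 0; SL 0; DT 1340738"
counters = parse_counters(line)
print(counters)

# Share of the initial candidates that survive the Boolean and relevance checks.
print(f"A/L = {counters['A'] / counters['L']:.1%}")  # A/L = 7.0%
```

Tracking the ratios between successive counters (L to A, A to F, F to S) shows at which stage most documents are eliminated, and therefore which stage does the most work.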

PerformanceAnalysis Information

When you send a Query action with the PerformanceAnalysis parameter set to True, the ACI response includes the same count information that the query log includes.

Counter Name in PerformanceAnalysis Output    Corresponding Count in Query Log
Examined                                      L
AddMatch                                      A
SlowCheck                                     F
AddToStructure                                S
DllLoads                                      DL
SortLoads                                     SL
Term KB                                       DT
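
If you collect counts from both sources, a lookup table makes them comparable. The following sketch encodes the mapping above as a Python dictionary and formats PerformanceAnalysis counts in the query log style. The function name to_log_line is hypothetical, and the sketch assumes that you have already extracted the counts from the ACI response into a dictionary.

```python
# PerformanceAnalysis counter name -> query log abbreviation (from the table).
LOG_NAME = {
    "Examined": "L",
    "AddMatch": "A",
    "SlowCheck": "F",
    "AddToStructure": "S",
    "DllLoads": "DL",
    "SortLoads": "SL",
    "Term KB": "DT",
}

def to_log_line(counts):
    """Format PerformanceAnalysis counts in the query log counter style."""
    return "; ".join(f"{LOG_NAME[name]} {counts[name]}" for name in LOG_NAME)

# Counts taken from the earlier log example.
counts = {"Examined": 51523, "AddMatch": 3612, "SlowCheck": 1345,
          "AddToStructure": 322, "DllLoads": 0, "SortLoads": 0,
          "Term KB": 1340738}
print(to_log_line(counts))
# L 51523; A 3612; F 1345; S 322; DL 0; SL 0; DT 1340738
```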

The output for the PerformanceAnalysis parameter also includes timers for each of the sub-stages of the Query action. In each case, the Count attribute shows the number of times IDOL Server accessed data in that step. The main value in the <autn:timer> tag shows the amount of time that the stage took.

Timer Name Description
Nodetable

Time spent loading nodetable data.

Ideally, the Count value should be at most the number of documents that the query returns, and then only if you print fields. If your Count value is higher than that, it typically suggests that the query includes an unoptimized operation that requires IDOL Server to load the document from disk.

Reference Nodetable

Time spent loading the reference fields.

This value depends on MatchReference, StoredState, printing of reference fields, and Combine.

Wildcard

Time spent evaluating wildcards.

This stage includes a mixture of time spent in the unstemmed structure and the diskindex.

Numeric

Time spent evaluating numeric fields. The count shows the number of numeric fields evaluated in the query.

Spellcheck Term

Time spent on spelling check evaluation. There is one Spellcheck Term timer tag for each incorrect term.

Diskindex

Time spent loading term information.

Security

Time spent evaluating document security.

Refindex

Time spent processing MatchReference, and StoredState matching.

Sort

Time spent evaluating SortType fields.
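
Once you have the timer values, a short script can point to the stage to investigate first. The following sketch assumes that you have already copied the timer name, Count, and time from the response into Python objects; the Timer type, the function names, and the sample values are all hypothetical and do not come from a real response. It finds the slowest stage and applies the Nodetable guideline described above.

```python
from typing import NamedTuple

class Timer(NamedTuple):
    name: str       # timer name, for example "Nodetable"
    count: int      # value of the Count attribute
    seconds: float  # time that the stage took

def slowest(timers):
    """Return the timer for the stage that took the most time."""
    return max(timers, key=lambda t: t.seconds)

def nodetable_loads_excessive(timers, returned):
    """True if Nodetable was loaded more often than documents were returned."""
    return any(t.name == "Nodetable" and t.count > returned for t in timers)

# Made-up values for illustration only.
timers = [
    Timer("Diskindex", 12, 0.80),
    Timer("Nodetable", 250, 2.40),
    Timer("Sort", 10, 0.05),
]
print(slowest(timers).name)                    # Nodetable
print(nodetable_loads_excessive(timers, 10))   # True
```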

 

