After you identify the steps that take most of the query time, you can tune your configuration and queries to reduce response times.
The following sections suggest options to consider when you optimize query and action performance.
When you measure query performance, look in the request log for the times. If the time taken to receive a response is much greater than the times shown in the IDOL logs, then it might be network latency that is slowing down the query.
To view the request log, you can either use the GRL action, or go to the Logs page in the Monitor section of IDOL Admin.
Verify network latency by running a sample list of about 100 queries on both the local IDOL machine and your testing machine. If the timings differ greatly between the testing machine and the IDOL machine, network latency is the likely cause. You can also consider using the LogRequestTiming configuration parameter to help diagnose query latency, or use the GRL action.
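The comparison described above can be sketched in Python. This is a minimal illustration, not part of IDOL: `run_query` is a hypothetical callable that you supply to send one query to the server, and the helper names are assumptions. Run the same script with the same query list on the IDOL machine and on the testing machine, then compare the summaries.

```python
import statistics
import time
from typing import Callable, Iterable, List


def time_queries(run_query: Callable[[str], object], queries: Iterable[str]) -> List[float]:
    """Return the wall-clock duration, in seconds, of each query."""
    durations = []
    for query in queries:
        start = time.perf_counter()
        run_query(query)  # hypothetical: sends one query and waits for the response
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations: List[float]) -> dict:
    """Summarize timings so that two machines can be compared side by side."""
    return {
        "count": len(durations),
        "median_s": statistics.median(durations),
        "max_s": max(durations),
    }
```

A large gap between the two medians, with similar times in the IDOL request log, points to the network rather than to query processing.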
You can reduce the amount of data that comes back by using the PrintFields parameter to return only the fields that you need. You can also use the TCPReceiveWindowSize configuration parameter to tune network performance. However, typically, your system administrator must address the network latency issue.
If you have unoptimized FieldText restrictions that run against a high percentage of your index, you might be able to alter the query to improve performance.
You can use the IDOL Admin user interface to query the server and analyze query speed.
If you have a query of the type Text=*&FieldText=MATCH{MyTerm}:FIELD1, you might benefit from adding Text=MyTerm:FIELD1.
For this kind of field restriction, FIELD1 must be an Index field. If you choose to use this method, you must also consider the other processing in index fields, such as stop words and the IndexNumbers configuration.
When you use this type of query a lot, it is usually more efficient to use an optimized field type.
You can regenerate and optimize some field types even after indexing.
See Also: Worked Example: Query for Numeric Values.
See Also: FieldText Optimizations
When you run a GetStatus action, check the number of documents and committed_documents. If the number of committed_documents is much higher than documents, it might mean that you have deleted a lot of documents without freeing up the index slots.
You can also use the Index Summary tab on the Status page in IDOL Admin to view the number of documents and committed documents.
You can run a DRECOMPACT index action to remove the unnecessary committed documents, and this can greatly improve your query performance, depending on the ratio of committed_documents to documents. The query speed improvements depend on a number of factors, so there is no one ideal ratio at which to perform a compaction. A general guideline is to compact when the ratio of committed documents to documents is 1.2-1.5, but in some systems, higher ratios might be acceptable.
If the ratio is higher than 2, you might find it faster to export your data, run a DREINITIAL, and then reindex your content, than to run a DRECOMPACT. However, during a DRECOMPACT, the documents are still available for querying, whereas if you initialize and reindex, the data is unavailable until the reindex is complete (unless you reindex to a new server and keep the original one available).
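As a rough illustration of the guideline above, the following Python sketch turns the two GetStatus counts into a suggestion. The function name and the exact cut-off values (1.2 and 2) are assumptions taken from the general guideline, not fixed rules, and the function does not read the GetStatus response itself.

```python
def compaction_advice(documents: int, committed_documents: int) -> str:
    """Suggest an action from the ratio of committed documents to documents."""
    if documents <= 0:
        return "no live documents; ratio is undefined"
    ratio = committed_documents / documents
    if ratio > 2.0:
        # Assumed cut-off: above 2, a full reindex might be faster than compaction.
        return "consider export, DREINITIAL, and reindex"
    if ratio >= 1.2:
        # Assumed cut-off: lower end of the 1.2-1.5 guideline range.
        return "consider DRECOMPACT"
    return "no action suggested"
```

For example, 1000 documents and 1300 committed documents give a ratio of 1.3, which falls in the range where compaction is worth considering. Adjust the thresholds if your system tolerates higher ratios.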
Performance correlates closely with the number of documents in IDOL Server, so freeing up index slots is generally quite helpful. Deleted documents have the largest effect on the L and A stages of the query (that is, the stages where IDOL is looking at the term information).
When the TotalResults action parameter is set to True, IDOL Server gives a count of all documents for a given query, which adds query overhead. If you do not need the exact count, leave this parameter out, or set Predict to True to approximate the total number of results, rather than giving an exact count.
Additionally, consider the TotalResultsPredictionThreshold configuration parameter. Using a low value for this parameter increases the query performance when you have TotalResults set to True, and Predict set to True.
Similarly, for the GetQueryTagValues action, use the DocumentCount parameter sparingly, because this option can increase the running time of the action.
If you have a large loop count, you might find that you have some terms that occur in a very large number of documents, but which do not add much value. You can use the TermGetAll action to find terms that are very common. Consider adding these terms to the relevant stop word list to reduce the index size and improve query performance. Also consider using the default english_large.dat stop word list for English content.
To get the best performance, add your very common terms to the stop list before indexing. If you add the stop word list after indexing, you can still improve query performance, but it does not reduce index size for existing documents. You can also use the StopWordIndex parameter to configure whether and how you want to index stop words.
To see your most common terms, use the IDOL Admin performance monitoring Terms tab, or run the following action:
action=TermGetAll&MaxTerms=100&Type=DocOccs
In some cases you might not need to search for numeric or alphanumeric values in indexed fields. In this case, you can optimize the index by using the IndexNumbers configuration parameter. You can still use FieldText for numeric and alphanumeric values if you do not index them.
Alternatively, if you need numeric and alphanumeric search, but you do not need wildcard values on these terms, consider using the SplitNumbers configuration parameter, or the IndexNumbersType field property and IndexNumbersMaxValue configuration parameter. These terms often make up a large part of a search index.
You can check whether you have a lot of numeric and alphanumeric values by using the TermGetAll action with TermAnalysis set to True. Alternatively, you can use the Terms tab in the Performance page of IDOL Admin to monitor your terms.
You can return fewer and more appropriate documents by separating your indexed data logically in databases, according to how queries use it (you can use the Databases page in the Control section of IDOL Admin to administer your databases). You can then use the DatabaseMatch action parameter for queries, which is one of the first steps in query evaluation, and helps IDOL to narrow down the result list very quickly.
If your queries look for data that is date specific, use the MinDate and MaxDate action parameters to return fewer, more relevant documents. These parameters use the values in DateType fields, which IDOL Server stores for fast retrieval.
Choose the value of the MaxResults parameter carefully. Performance is better for a smaller result set, and unless you are performing legal or e-discovery searches, you do not normally need a large initial result set.
Performance is better for a smaller number of fields in the PrintFields parameter. The performance is worst when you set the Print parameter to all, because IDOL Server must load a lot more data from disk. Micro Focus recommends that you print only the required fields, and avoid printing internal fields that generally return by default, such as the reference, databases, languages and so on.
The QuerySummary action parameter provides the dynamic thesaurus, Tag Cloud, clustering, and AQG functionality. See Automatic Query Guidance.
You can configure the number of concepts that IDOL Server returns for the thesaurus by using the QuerySummaryLength parameter. You can improve performance by using the lowest number that provides adequate results for your purposes. Similarly, use the lowest appropriate value for the QuerySummaryTerms configuration parameter.
Generating dynamic summaries during queries can add query overhead, and it relies on SourceType fields. If you have multiple SourceType fields defined, you can improve the summary performance by lowering the value of the MaxSourceCharacters configuration parameter.
You might also want to consider using a lower value for the Sentences and Characters action parameters. You can easily test the results of changing these values by removing or changing the parameter and rerunning slow queries.
Wildcard expansion can be performance intensive, because IDOL Server must look up a lot of terms, and load a large amount of data. You might want to introduce a requirement in your front-end application for a minimum number of prefix characters before a wildcard value (for example, to allow sta* but disallow st*). Leading wildcard values (for example *ing) are especially slow, so use these values sparingly.
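A front-end check of this kind could look like the following Python sketch. The function names and the minimum of three prefix characters are hypothetical policy choices, not IDOL settings, and the check considers only the * wildcard. Note that this policy rejects leading wildcards outright, which is stricter than using them sparingly.

```python
from typing import List


def wildcard_term_allowed(term: str, min_prefix: int = 3) -> bool:
    """Allow a term only if enough literal characters precede the first '*'."""
    position = term.find("*")
    if position == -1:
        return True  # no wildcard, nothing to restrict
    return position >= min_prefix


def rejected_terms(query_text: str, min_prefix: int = 3) -> List[str]:
    """Return the whitespace-separated terms that break the prefix policy."""
    return [term for term in query_text.split()
            if not wildcard_term_allowed(term, min_prefix)]


print(rejected_terms("sta* st* *ing report"))  # prints ['st*', '*ing']
```

The application can then ask the user to refine the rejected terms before it sends the query to IDOL Server.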
You can improve performance for wildcard searches by increasing the UnstemmedMinDocOccs configuration parameter to the default value or higher. You can also disable wildcard searches by setting the DisallowWildcards configuration parameter to True.
If your queries generally target wildcards against only a known set of document fields, consider using the UnstemmedTrackFields configuration parameter. This parameter allows you to optimize a small set of fields for wildcard queries, and works best if the specified fields contain a limited pool of data (for example, people names).
In a system where you have a large amount of memory, you might want to consider increasing the UnstemmedMemoryMaxSize configuration parameter. This parameter specifies the size of the unstemmed index search structure that IDOL Server can hold in memory. Unstemmed searches are faster when there is a higher memory limit. However, this option does increase the memory requirement of IDOL.
If most of the overly expanded wildcards return unintended results, you can also consider using the WildcardMaxTerms configuration parameter.
The Combine options help to reduce the number of results that return, especially if you have duplicate documents identified by a ReferenceType field. The Simple option is the fastest, and Micro Focus recommends that you use this option if you have used document sectioning during the index process.
The FieldCheck option is a very fast way to restrict the initial result set, much like DatabaseMatch. Use this filter against a field whose value you match in its entirety. This method works best where the FieldCheck field values have a uniform distribution in the document set.
FieldCheckType fields improve the performance for MATCH queries, but you can only define one field per document as FieldCheckType in IDOL Server. You can also use the FieldCheck value in the Combine operation.