In many situations, the memory usage of an IDOL instance can become the limiting factor in its sizing. In this case, reducing the memory usage by editing the configuration often allows you to index more documents per instance.
The MemoryReport action returns the usage of each memory index, and is the key to determining how to alter your configuration. However, reducing the largest index does not always give the best return, because some changes affect performance more than others. This section lists the primary indexes to help you to determine a strategy for reducing your memory usage.
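As a starting point, the request for the memory report can be sketched as below. This is a minimal illustration, not a definitive client: the host name, the port 9000, the `/?` URL layout, and the helper name build_action_url are assumptions made for the example, and the code only builds the URL rather than sending it.

```python
from urllib.parse import urlencode


def build_action_url(host: str, port: int, action: str, **params: str) -> str:
    """Build an action URL for a server (host, port, and layout are assumed)."""
    # The action name goes first, followed by any extra parameters.
    query = urlencode({"action": action, **params})
    return f"http://{host}:{port}/?{query}"


# Hypothetical host and port; replace them with the values for your server.
memory_report_url = build_action_url("localhost", 9000, "MemoryReport")
print(memory_report_url)
```

You can paste the resulting URL into a browser or pass it to an HTTP client to retrieve the report.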
You can view information about system and process memory usage on the Memory and Memory Map tabs on the Performance page in the Monitor section of IDOL Admin. You can also view overall real-time performance data on a single graph on the Real Time tab.
In general, limiting the memory usage for a particular index forces the server to swap the information in and out of memory when needed, and the resulting disk reads will affect your performance. This is not the case for the index and term caches.
The index cache holds information on terms in index fields, to minimize the time it takes to flush the cache to disk, which in turn improves the indexing performance.
For many IDOL Servers, you index data in two phases. The first phase indexes a large number of historical documents, usually before the server reaches production. The second phase slowly grows the index by adding new documents over time. In such a case, Micro Focus recommends that you set the index cache to a high value (perhaps 1 GB or more) during the first phase, and then lower it (to perhaps 100 MB) for the second phase to free up memory.
You can dynamically reduce the size of the index cache by using the DRERESIZEINDEXCACHE index action. Alternatively, you can resize the cache on the Caches tab of the Status page in IDOL Admin.
The DRERESIZEINDEXCACHE index action does not update the value in your configuration file. If you restart IDOL Server, it reverts to using the value in the configuration file.
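The resize request for the second phase might be built as in the following sketch. Several details here are assumptions rather than facts taken from this section: the parameter name IndexCacheMaxSize, the use of kilobytes as its unit, the index port 9001, and the helper name resize_index_cache_url. Check them against the reference for your version before use. The code only constructs the URL.

```python
from urllib.parse import urlencode

KB_PER_MB = 1024


def resize_index_cache_url(host: str, index_port: int, size_mb: int) -> str:
    """Build a DRERESIZEINDEXCACHE index action URL for a cache size in MB."""
    # Assumption: the size parameter is named IndexCacheMaxSize and is in KB.
    query = urlencode({"IndexCacheMaxSize": size_mb * KB_PER_MB})
    return f"http://{host}:{index_port}/DRERESIZEINDEXCACHE?{query}"


# Lower the cache to about 100 MB once the bulk indexing phase is complete.
steady_phase_url = resize_index_cache_url("localhost", 9001, 100)
print(steady_phase_url)
```

Because the action does not change the configuration file, the same value also needs to be set there if it should survive a restart.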
The nodetable index stores information to optimize the loading of document fields. Its size is proportional to the number of documents in the server. Micro Focus recommends that you limit this index only if you rarely load fields during queries, or if query performance is not your primary concern.
The numeric, match, sort, and parametric structures store information about fields that you configure as NumericType, MatchType, SortType, or ParametricType. The sizes of these structures are proportional to the number of documents in the server, and the average number of values of each type in a document.
The first consideration in limiting any of these structures is to ensure that you keep the number of fields of each type to a minimum. To view the fields of a given type, use the GetTagNames action (or use the Field Types page in the Monitor section of IDOL Admin). You can use the output to remove any unnecessary fields from the configuration.
The following action returns a list of all the NumericType fields:
action=GetTagNames&FieldType=numeric
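To review all four field types in one pass, the same action can be generated for each type, as in this sketch. Only the numeric value is taken from the example above. The match, sort, and parametric values are assumed to follow the same naming pattern, and the helper name get_tag_names_query is hypothetical.

```python
from urllib.parse import urlencode

# "numeric" comes from the example above; the other values are assumptions.
FIELD_TYPES = ("numeric", "match", "sort", "parametric")


def get_tag_names_query(field_type: str) -> str:
    """Return the query string for a GetTagNames action on one field type."""
    return urlencode({"action": "GetTagNames", "FieldType": field_type})


queries = [get_tag_names_query(field_type) for field_type in FIELD_TYPES]
for query in queries:
    print(query)
```

Comparing the four lists against the fields that your queries actually use gives a candidate list of fields to remove from the configuration.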
You can limit the usage of each structure further by using the NumericMemoryMaxSize, SortFieldMemoryMaxSize, and ParametricMemoryMaxSize configuration parameters. It is often possible to make these changes without greatly affecting query performance. However, Micro Focus recommends that you test performance for a number of values to determine the impact.
Setting NumericMemoryMaxSize also reduces the memory usage of MatchType fields. You can also reduce the memory for different numeric fields by different amounts by using the NumericNormalMaxMem configuration parameter.
For parametric fields, Micro Focus recommends that you always set the ParametricMaxPairsPerDocument configuration parameter to zero to reduce memory. This parameter affects performance only for the rarely used GetTagValues action.
The unstemmed index is used for wildcard expansion and spelling correction, and its size is proportional to the number of terms stored in it. The recommended method for reducing its size is to limit the number of unstemmed terms that IDOL Server stores.
You can remove extremely rare terms from the unstemmed index by increasing the UnstemmedMinDocOccs configuration parameter.
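The effect of a minimum document occurrence threshold can be illustrated with a small simulation. This is a conceptual sketch only and not how IDOL Server implements the unstemmed index. It assumes that the threshold is inclusive and counts the number of documents that contain a term, and the function names and sample documents are made up for the example.

```python
from collections import Counter


def document_occurrences(documents: list[list[str]]) -> Counter:
    """Count, for each term, the number of documents that contain it."""
    counts: Counter = Counter()
    for terms in documents:
        # A term counts once per document, however often it repeats.
        counts.update(set(terms))
    return counts


def terms_kept(documents: list[list[str]], min_doc_occs: int) -> set[str]:
    """Return the terms that occur in at least min_doc_occs documents."""
    counts = document_occurrences(documents)
    return {term for term, n in counts.items() if n >= min_doc_occs}


docs = [
    ["memory", "index", "cache"],
    ["memory", "index", "a1b2c3"],
    ["memory", "query", "cache"],
]
kept = terms_kept(docs, 2)
print(sorted(kept))
```

With a threshold of 2, the terms that appear in only one document (here "a1b2c3" and "query") are dropped, which is why raising the threshold shrinks the set of stored terms.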
Often, the majority of terms in the unstemmed index are numeric or mixed alphanumeric terms. You can control which terms to match with wildcard terms by using the UnstemmedIndexNumbers configuration parameter, or the IndexNumbersMaxValue and IndexNumbersNTruncateLength configuration parameters. These parameters prevent long numeric terms from taking up space.
The UnstemmedMemoryMaxSize parameter configures the amount of memory used to memory map portions of the dbub.dat file (part of the unstemmed index). By default there is no memory limit and the entire file can be mapped into memory over time.
The dbun.dat file is always wholly memory mapped and cannot be memory limited.
The refindex controls the mapping of a reference field to its internal docid, and its size is proportional to the number of documents in the server and the number of reference fields in each document. Its size is usually not large compared to other indexes, and so Micro Focus recommends that you limit it only if you rarely use reference fields in queries.
The term cache stores information on terms in index fields to optimize a query containing that term.
The only situation in which you might want to use the term cache is when you have a large amount of memory available, and your queries often use frequently occurring terms. In this case, you can set the amount of memory to use with the TermCachePersistentKB configuration parameter.