See Also: IDOL Internal Storage and Indexes
To see how much disk space IDOL uses, send the DiskReport action. Alternatively, you can view information about the disk usage of directories and files in the working directory for each component on the Disk Report tab of the Performance page in the Monitor section of IDOL Admin.
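As a minimal sketch, the following snippet builds the request URL for the DiskReport action. It assumes that the component accepts actions over HTTP in the form http://host:port/action=ActionName. The host name, port number, and helper function name are placeholders for illustration and do not come from this section.

```python
def build_action_url(host: str, port: int, action: str) -> str:
    """Return the URL for sending an action to a component.

    Assumes the URL form http://host:port/action=ActionName.
    The host and port are placeholders for your own deployment.
    """
    return f"http://{host}:{port}/action={action}"


# Hypothetical host and port; replace them with the values for your component.
url = build_action_url("localhost", 9000, "DiskReport")
print(url)
```

You can open the resulting URL in a browser or send it with any HTTP client.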
The following sections describe how to reduce disk usage for particular indexes.
You can dramatically reduce the amount of disk space required by the nodetable if there are any fields in the document that you do not actually need to store for retrieval.
If you retrieve the documents from a different system by using the reference, you can configure IDOL to index the body of the document (usually extracted as the DRECONTENT field) but not store it. In this case, operations that require the original document body (for example, highlighting and the GetContent action) require additional processing by the application that uses IDOL. For example, rather than sending a GetContent action to IDOL Server, the application must request the document from the original repository, or through a Connector.
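The application-side fallback described above might look like the following sketch. The function names and the in-memory repository are hypothetical stand-ins. A real application would call its own repository or connector client instead.

```python
from typing import Callable, Optional


def get_document_body(
    reference: str,
    stored_body: Optional[str],
    fetch_from_repository: Callable[[str], str],
) -> str:
    """Return the document body, falling back to the source repository.

    stored_body is None when the body was indexed but not stored.
    fetch_from_repository is a hypothetical callable that retrieves the
    document from the original repository (or through a connector) by
    using the document reference.
    """
    if stored_body is not None:
        return stored_body
    return fetch_from_repository(reference)


# Stand-in repository for illustration only.
repository = {"doc-001": "Full text held in the source system."}
body = get_document_body("doc-001", None, repository.__getitem__)
print(body)
```

The application keeps only the reference from the query result and resolves the body on demand, which is the trade-off for the smaller nodetable.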
The KeyView extraction process might pull out fields or information that your particular IDOL system does not require. In this case you can also discard these fields and not store them.
There are two mechanisms for discarding fields:
- Configure CantHaveFieldCSVs and MustHaveFieldCSVs. This option strips the data out of the document before IDOL Server indexes the document.
- Set NodetableStoreContent to False, and then use the StoredType property to specify the fields that you want to store. This option results in IDOL Server indexing the data, but then discarding those fields rather than writing them to disk.
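The difference between the two mechanisms can be illustrated with a toy model. This is a conceptual sketch only and does not reproduce how IDOL Server is implemented or configured. The function names are hypothetical, and the DRETITLE and AUTHOR_TOOL field names are used purely as sample data.

```python
def strip_before_indexing(document: dict, unwanted: set) -> dict:
    """Mechanism 1: remove unwanted fields before indexing.

    The stripped fields are neither searchable nor stored.
    """
    return {name: value for name, value in document.items() if name not in unwanted}


def index_then_store_selected(document: dict, stored_fields: set):
    """Mechanism 2: index every field, but store only the selected fields.

    Returns a tuple of (indexed_terms, stored_document).
    """
    indexed_terms = {
        word.lower() for value in document.values() for word in value.split()
    }
    stored_document = {
        name: value for name, value in document.items() if name in stored_fields
    }
    return indexed_terms, stored_document


# Sample document; the field names are illustrative.
doc = {
    "DRETITLE": "Quarterly report",
    "DRECONTENT": "Revenue grew",
    "AUTHOR_TOOL": "Scanner",
}

stripped = strip_before_indexing(doc, {"AUTHOR_TOOL"})
terms, stored = index_then_store_selected(doc, {"DRETITLE"})
print(sorted(stripped))
print(sorted(terms))
print(stored)
```

In the first case the unwanted field is gone entirely. In the second case the body text remains searchable through its terms, but only the selected field can be retrieved.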
Some functionality is unavailable for non-stored content, but in some cases there are alternative methods that you can use. For more information, see Storage of Document Content.
Storing fields allows IDOL to retrieve the data without needing to go back to the document source, and can improve overall query performance. You can also regenerate stored fields if index validation fails, or if you need to change the field configuration after indexing.
You can use the NodeTableCompression configuration parameter to compress the documents in the nodetable on disk. In this case, IDOL Server compresses data in the nodetable directory before storing it, reducing the IDOL Server disk footprint.
The main methods for reducing disk usage of these indexes attempt to reduce the amount of indexed data. These methods include:
- Setting the IndexNumbers, IndexNumbersTruncateLength, and IndexNumbersMaxValue configuration parameters
- Reducing the number of index fields
Reducing the TermSize also reduces the disk usage for the diskindex and index cache, but normally this gain is minimal. Reducing TermSize by 1 for an index with 1 million unique terms reduces disk space usage by only 1 MB.
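The arithmetic behind this figure can be sketched as follows. The sketch assumes that each unit of TermSize accounts for one byte per unique term, which is inferred from the example above and is not a documented formula. The function name is hypothetical.

```python
def term_size_saving_bytes(unique_terms: int, term_size_reduction: int) -> int:
    """Estimate the bytes saved by reducing TermSize.

    Assumes one byte per unique term for each unit of TermSize,
    as implied by the 1 million terms / 1 MB example.
    """
    return unique_terms * term_size_reduction


saving = term_size_saving_bytes(1_000_000, 1)
print(f"{saving / 1_000_000} MB")
```

Under the same assumption, the saving grows only linearly with the number of unique terms, which is why the gain is usually small compared with reducing the amount of indexed data.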
You can further reduce the disk space that the unstemmed index uses by setting the UnstemmedIndexNumbers and UnstemmedMinDocOccs configuration options.
The AdvancedPlus option can contribute to the disk usage of the diskindex. To minimize the diskindex size, HPE recommends that you turn this option off unless you particularly need SENTENCE and PARAGRAPH matching.
For the Match, Numeric, and Parametric indexes, you can reduce disk usage only by limiting the amount of data indexed for those types.
For the Sort index, you can reduce disk usage by limiting the SortFieldStorageLength parameter.