Date-Based Indexing
You can use date-based indexing to distribute the indexing load between IDOL servers. Date-based indexing routes each index action according to the date of the document that you index, or of the document in which you replace fields.
When you enable date-based indexing, DIH indexes each document in a DREADD or DREADDDATA action by its #DREDATE field, or by another DateType field configured in the [FieldProcessing] section. It indexes each replace in a DREREPLACE action based on its #DREDATE line, if it exists. Otherwise it sends the action to all child servers. It sends all other actions to all child servers.
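The routing rules above can be sketched as a small decision function. This is only an illustration of the documented behavior, not DIH code: the names targets and engines_for_date are hypothetical, and the handling of an add with no usable date is left as an error because it depends on configuration.

```python
from datetime import date
from typing import Callable, List, Optional

# Index actions whose documents are routed by date, per the text above.
DATE_ROUTED_ADDS = {"DREADD", "DREADDDATA"}


def targets(
    action: str,
    dre_date: Optional[date],
    child_count: int,
    engines_for_date: Callable[[date], List[int]],
) -> List[int]:
    """Return the child server numbers that should receive an action.

    engines_for_date is a hypothetical lookup that maps a date to the
    child servers whose configured date range contains it.
    """
    everyone = list(range(child_count))
    name = action.upper()
    if name in DATE_ROUTED_ADDS:
        if dre_date is None:
            # Assumption: not modeled here; DIH decides this case from its
            # configuration (see UnknownFieldValueAction).
            raise ValueError("document has no usable date")
        return engines_for_date(dre_date)
    if name == "DREREPLACE":
        # A replace is routed by its #DREDATE line only if one exists.
        return engines_for_date(dre_date) if dre_date is not None else everyone
    # All other actions go to every child server.
    return everyone
```

For example, with two child servers, a DREREPLACE without a #DREDATE line yields both servers, while a DREADD yields only the servers whose range contains the document date.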
When you use date-based indexing, you cannot alter the number of child servers.
To enable date-based indexing, set DistributeByDate to True in the [Server] section of the DIH configuration file.
Configure the date ranges for child servers by using a [DateRangeN] subsection in the [DistributionIDOLServers] section of the DIH configuration file.
For both DIH stand-alone and unified configuration, you must configure DateFormatCSVs in the [Server] section for date-based indexing to work. For example:
[Server]
Port=9070
DIHPort=9071
MirrorMode=False
DistributeByDate=True
DateFormatCSVs=DD/MM/YYYY,YYYY/MM/DD,YYYY-MM-DD

[DistributionIDOLServers]
Number=2

[IDOLServer0]
Host=localhost
Port=9100

[IDOLServer1]
Host=localhost
Port=9500

[DateRange0]
FromDate=1980/01/01
UpToDate=1990/01/01
Engines=0

[DateRange1]
FromRelative=-3
UpToRelative=5
Engines=1
In this example, server 0 indexes documents dated from 1 January 1980 to 31 December 1989. If today is Tuesday (relative 0), server 1 indexes documents dated from the previous Saturday (relative -3) to the following Saturday (relative 4). The upper limit is exclusive in both ranges, so documents dated 1 January 1990 or the following Sunday (relative 5) do not match.
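The range arithmetic in this example can be checked with a minimal sketch. It assumes that relative offsets count whole days from the current date, as the weekday example implies, and it hard-codes the two ranges from the example configuration. The function name engines_for is hypothetical and this is not how DIH itself is implemented.

```python
from datetime import date, timedelta
from typing import List


def engines_for(doc_date: date, today: date) -> List[int]:
    """Return the example child servers whose date range contains doc_date."""
    matches = []

    # [DateRange0]: FromDate=1980/01/01, UpToDate=1990/01/01, Engines=0
    # The lower bound is inclusive and the upper bound is exclusive.
    if date(1980, 1, 1) <= doc_date < date(1990, 1, 1):
        matches.append(0)

    # [DateRange1]: FromRelative=-3, UpToRelative=5, Engines=1
    # Assumption: relative values are day offsets from today.
    lower = today + timedelta(days=-3)
    upper = today + timedelta(days=5)
    if lower <= doc_date < upper:
        matches.append(1)

    return matches
```

With today set to Tuesday 2 January 2024, a document dated Saturday 6 January 2024 (relative 4) matches server 1, while one dated Sunday 7 January 2024 (relative 5) matches no range.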
NOTE: You can use DistributeByDate only when MirrorMode is set to False. DIH will not start if DistributeByDate and MirrorMode are both set to True.
In DistributeByDate mode, you can also use the UnknownFieldValueAction configuration parameter to determine how to treat documents where the date field is missing, contains a value that DIH cannot parse as a date, or contains a date value that does not match the configured date ranges. For more information, refer to the DIH Reference.