You can use field-based indexing to distribute the indexing load between IDOL servers. This mode is similar to reference-based indexing, except that you configure the document fields that determine which child server to send the document to. DIH uses the value of the specified fields in the documents that you index, or in which you replace fields, to choose the child server.
When you enable field-based indexing, it applies to the DREADD, DREADDDATA, and DREREPLACE index actions. DIH sends DREREPLACE index actions to all child servers, because the DREREPLACE action does not contain the information required to determine which child server contains the original document.
When you use field-based indexing, you cannot alter the number of child servers.
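The following Python sketch illustrates why the number of child servers must stay fixed. It assumes a simple hash-modulo scheme; DIH determines the distribution internally, so `pick_child` and its SHA-256 hash are hypothetical stand-ins, not the DIH algorithm.

```python
import hashlib


def pick_child(field_value: str, num_children: int) -> int:
    """Map a distribution-field value to a child server index.

    Hypothetical scheme (hash modulo child count), used for illustration only.
    """
    digest = hashlib.sha256(field_value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_children


# The same field value always maps to the same child server,
# so documents that share the value stay together.
assert pick_child("abc123", 3) == pick_child("abc123", 3)

# Changing the child count remaps many values, so documents indexed
# under the old layout would no longer be where the new layout expects them.
values = [f"hash-{i}" for i in range(1000)]
moved = sum(pick_child(v, 3) != pick_child(v, 4) for v in values)
print(f"{moved} of {len(values)} values map to a different child server")
```

Under any scheme of this kind, adding or removing a child server changes where a given field value is sent, which is consistent with the restriction above.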
Field-based indexing might prevent deduplication of documents with different field values. You can use field-based indexing only with a KillDuplicates=NONE setting in the [Server] section of the IDOL Server configuration file.
In the [Server] section, set the DistributeByFields parameter to True.
In the [Server] section, set DistributeByFieldsCSVs to a comma-separated list of fields that DIH uses to distribute index data between child servers. For example:
DistributeByFieldsCSVs=*/DeDupeHash,*/SecondDistributeField
Save and close the configuration file.
Restart the DIH for your changes to take effect.
For example:
[Server]
Port=9070
DIHPort=9071
MirrorMode=False
DistributeByFields=True
DistributeByFieldsCSVs=*/DeDupeHash,*/SecondDistributeField
You can use DistributeByFields only when MirrorMode is set to False. DIH will not start if DistributeByFields and MirrorMode are both set to True.
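As a pre-flight check before restarting DIH, you could verify this constraint with a short script. The sketch below is not part of DIH: `check_dih_config` is a hypothetical helper, and it assumes that the configuration file can be read as INI-style text by the Python standard library `configparser` module. It makes no assumption about the default value of MirrorMode and only reports when the parameter is not set explicitly.

```python
import configparser


def check_dih_config(text: str) -> list[str]:
    """Return a list of problems found in the [Server] section (hypothetical helper)."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(text)

    if not parser.has_section("Server"):
        return ["no [Server] section found"]

    # Option names are matched case-insensitively by configparser.
    distribute = parser.getboolean("Server", "DistributeByFields", fallback=False)
    mirror = parser.getboolean("Server", "MirrorMode", fallback=None)

    problems = []
    if distribute and mirror:
        problems.append("DistributeByFields and MirrorMode are both True")
    elif distribute and mirror is None:
        problems.append("MirrorMode is not set explicitly; set it to False")
    return problems


good = """
[Server]
Port=9070
DIHPort=9071
MirrorMode=False
DistributeByFields=True
DistributeByFieldsCSVs=*/DeDupeHash,*/SecondDistributeField
"""
bad = good.replace("MirrorMode=False", "MirrorMode=True")

print(check_dih_config(good))
print(check_dih_config(bad))
```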
You can also set the BalanceDistributeByFields configuration parameter to balance the distribution of documents that do not contain the specified distribution fields. With this option, DIH sends documents that lack the specified distribution fields to a random child server.
By default, DIH internally determines how to distribute documents between child servers. This process ensures that DIH always sends duplicate documents to the same child server.
Instead, you can configure DIH to distribute documents to specific child servers when the field contains a specific value.
In the [Server] section, set the DistributeByFields parameter to True.
In the [Server] section, set the DistributeByFieldsCSVs parameter to a comma-separated list of fields that DIH uses to distribute index data between child servers.
In the [IDOLServerN] or [DIHEngineN] section for each group of child servers, set DistributeByFieldsValues to a comma-separated list of field values. If a document contains a field listed in the DistributeByFieldsCSVs parameter with one of these values, DIH indexes it to this child server. For example:
[DIHEngine2]
DistributeByFieldsValues=backup,France
You can configure each field value in the list for only one child server. If a field value occurs in the list for multiple child servers, DIH indexes matching documents into the child server with the lowest ID.
In the [Server] section, set the UnknownFieldValueAction parameter to the action that DIH takes if the fields listed in the DistributeByFieldsCSVs parameter contain unknown values. The following actions are available:
| Distribute | DIH uses a hash of the field values to distribute the document, as with conventional field-based indexing. |
| Ignore | DIH ignores the document and logs a warning. |
| Default | DIH sends the document to the server specified by the UnknownFieldValueDefaultEngine configuration parameter. |
For example:
UnknownFieldValueAction=Distribute
In the [Server] section, set UnknownFieldValueDefaultEngine to the number of the child server that acts as the default server. Set this parameter only if you have set UnknownFieldValueAction to Default.
In the [Server] section, set DistributeOnMultipleFieldValues to True if you want to index documents into each server group that matches the particular field values. Set this parameter to False if you want to index the document only into the matching server with the lowest number.
For example:
[Server]
DistributeByFields=True
DistributeByFieldsCSVs=*/database,*/country
UnknownFieldValueAction=Default
UnknownFieldValueDefaultEngine=0
DistributeOnMultipleFieldValues=True

[DIHEngines]
Number=3

[DIHEngine0]
DistributeByFieldsValues=main

[DIHEngine1]
DistributeByFieldsValues=uk

[DIHEngine2]
DistributeByFieldsValues=backup,france
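The routing rules described in this section can be sketched in Python as follows. This is an illustrative model only, not DIH code: `route` is a hypothetical function, `ENGINE_VALUES` mirrors the DistributeByFieldsValues settings in the example configuration, values are matched exactly (whether DIH matches case-insensitively is not stated here), and the Distribute branch uses an assumed hash-modulo scheme in place of the internal DIH hash.

```python
import hashlib

# Field values configured for each child server, as in the example configuration.
ENGINE_VALUES = {0: ["main"], 1: ["uk"], 2: ["backup", "france"]}


def route(doc_values, engine_values, unknown_action="Default",
          default_engine=0, multiple=True):
    """Return the child server numbers that receive a document.

    doc_values holds the document's values for the distribution fields.
    An empty result means that the document is ignored.
    """
    wanted = set(doc_values)
    matches = sorted(e for e, vals in engine_values.items() if wanted & set(vals))
    if matches:
        # DistributeOnMultipleFieldValues=True: every matching server;
        # False: only the matching server with the lowest number.
        return matches if multiple else matches[:1]

    # No configured value matched: apply UnknownFieldValueAction.
    if unknown_action == "Default":
        return [default_engine]
    if unknown_action == "Ignore":
        return []  # DIH also logs a warning in this case
    if unknown_action == "Distribute":
        key = ",".join(sorted(wanted))
        digest = hashlib.sha256(key.encode("utf-8")).digest()  # assumed hash
        engines = sorted(engine_values)
        return [engines[int.from_bytes(digest[:8], "big") % len(engines)]]
    raise ValueError(f"unknown action: {unknown_action}")


print(route(["uk", "backup"], ENGINE_VALUES))
print(route(["uk", "backup"], ENGINE_VALUES, multiple=False))
print(route(["germany"], ENGINE_VALUES))
```

With the example settings, a document whose fields contain both uk and backup goes to child servers 1 and 2, and a document with an unconfigured value such as germany goes to the default child server 0.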