Field Names and Field Identifiers

The field name is the name of a field in your index. The field name might represent the name of an IDX field (#DREFIELD), or an XML tag or attribute. A field identifier is the full name and path of the field in your index, for example the full path to the XML tag or attribute. You can use the GetTagNames action to retrieve the identifiers for existing fields (for example, for use in Query FieldText); alternatively, you can monitor field types in the data index on the Field Types page in the Monitor section of IDOL Admin.

IDOL query syntax typically allows you to use either the field name or the field identifier. You might want to use the identifier to remove any ambiguity.

NOTE:

Field names are often referred to with the prefix */. This prefix automatically matches the parent nodes of the document, which are typically not of interest.
For example, */DRECONTENT matches all fields named DRECONTENT that are not at the root level. In the field processing section of the IDOL Server configuration file, a field listed as FieldName is processed as */FieldName.
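As a rough illustration of how the */ prefix behaves, the following Python sketch uses shell-style wildcard matching to compare a pattern with a field identifier. The matches_field helper is hypothetical and only approximates IDOL Server's own matching; it assumes the default case-insensitive comparison of field names.

```python
from fnmatch import fnmatchcase


def matches_field(pattern: str, identifier: str) -> bool:
    """Return True if a wildcard pattern such as */DRECONTENT matches an identifier.

    Hypothetical helper: an approximation of the behavior, not IDOL's implementation.
    """
    # Compare in upper case to mimic case-insensitive field name matching.
    return fnmatchcase(identifier.upper(), pattern.upper())


# A field below a parent node matches the */ pattern.
print(matches_field("*/DRECONTENT", "DOCUMENT/DRECONTENT"))  # True
# A bare name with no parent path does not match, because the pattern requires a "/".
print(matches_field("*/DRECONTENT", "DRECONTENT"))  # False
```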

Unlike structured information management systems, you do not need to know all the IDOL fields during the configuration of your system. If IDOL Server encounters a previously unknown field during the index process, it creates a field with that name. For example, suppose that you index the following document:

#DREREFERENCE testdoc
#DREFIELD MyNewField="value"
#DRECONTENT
Hello, world
#DREENDDOC

In this example, IDOL Server creates a field called MyNewField. You can immediately use the MyNewField field in a FieldText query, such as action=Query&FieldText=MATCH{value}:MyNewField.
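If you build such queries in code, the FieldText value needs to be URL-encoded, because it contains braces and a colon. The following Python sketch shows one way to do this with the standard library. The build_query_params helper is hypothetical, and the sketch does not handle special characters inside the field value itself. The host and port that you prepend to the parameter string depend on your deployment.

```python
from urllib.parse import urlencode


def build_query_params(field: str, value: str) -> str:
    """Return an encoded parameter string for a Query action with a MATCH restriction.

    Hypothetical helper for illustration only.
    """
    # Doubled braces in the f-string produce literal { and } characters.
    field_text = f"MATCH{{{value}}}:{field}"
    return urlencode({"action": "Query", "FieldText": field_text})


print(build_query_params("MyNewField", "value"))
# action=Query&FieldText=MATCH%7Bvalue%7D%3AMyNewField
```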

Depending on the configuration of the Content server that indexes the document, this field might have properties associated with it that enable enhanced search functionality or optimization.

Valid Characters in Field Identifiers

Field names can contain alphanumeric characters (a-z, 0-9), period (.), underscore (_) and hyphen (-). IDOL Server replaces all other characters with an underscore during indexing, and processes the new field name this creates as normal.

IDOL indexes and queries the following two documents in exactly the same way, because it replaces the # character in the second document with an underscore (_):

#DREREFERENCE testdoc
#DREFIELD my_field="value"
#DREENDDOC

#DREREFERENCE testdoc
#DREFIELD my#field="value"
#DREENDDOC

It is also best practice to use field names that conform to XML specifications.
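The character replacement described in this section can be sketched in Python as follows. The normalize_field_name helper is hypothetical; it assumes that uppercase letters are also valid (as in names such as DRECONTENT), and it is useful only for predicting the field name that results from indexing, not as a description of how IDOL Server implements the rule.

```python
import re

# Any character other than letters, digits, period, underscore, or hyphen.
_INVALID_CHARS = re.compile(r"[^A-Za-z0-9._-]")


def normalize_field_name(name: str) -> str:
    """Replace each invalid character with an underscore (hypothetical helper)."""
    return _INVALID_CHARS.sub("_", name)


print(normalize_field_name("my#field"))  # my_field
print(normalize_field_name("my_field"))  # my_field (already valid, unchanged)
```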

Connector Field Standardization

When you index documents into IDOL Server by using connectors, many connectors standardize field names to a common string. This process ensures that common metadata values have the same field name in your IDOL index, regardless of the field name in the original repository.

Your connector installation includes a dictionary.xml file, which lists the standardized names for various fields in different connectors.

Assign Properties to Field Identifiers

Main Topic: Field Properties

By default, a field has no special properties associated with it, which can result in sub-optimal query performance, depending on how you use the field in queries. You can improve query performance by applying properties to certain fields before you index them.

To assign properties to a field, you configure rules in the Field Processing section of the IDOL Server configuration file. These rules list the field names that you want to assign properties to. You can use wildcards to match several fields.

The following example configuration defines certain fields as index fields when IDOL creates them in the index. In this case, any field called DRECONTENT or DRETITLE, and any field whose name starts with PAGE, is an index field.

[FieldProcessing]
0=SetIndexFields

[SetIndexFields]
Property=IndexFields
PropertyFieldCSVs=*/DRECONTENT,*/DRETITLE,*/PAGE*
TrimSpaces=False

[IndexFields]
Index=True

Retrieve Field Identifiers or Field Names and Associated Properties

The GetTagNames action returns the list of currently known field names. You can set the TypeDetails parameter to True to return the associated properties for each field name as well. For example:

<autn:name code="4" types="index,highlight,title,sourcefield,textparseindex">DOCUMENT/DRETITLE</autn:name>
<autn:name code="5" types="numeric">DOCUMENT/PRICE</autn:name>
<autn:name code="6" types="numericdate">DOCUMENT/MYDATE</autn:name>
<autn:name code="7" types="index,highlight,sourcefield,textparseindex">DOCUMENT/DRECONTENT</autn:name>

You can use the output of the GetTagNames action to help you to interpret the response of the MemoryReport action, which uses the numeric field codes. For example, the DOCUMENT/PRICE field in the GetTagNames response above corresponds to Numeric Index 5 in the MemoryReport response below.

<name>Numeric Indexes</name>
    <memoryusage>3333393</memoryusage>
    <noncomponentusage>0</noncomponentusage>
    <approx>false</approx>
    <components>5</components>
    <memory0>
        <name>Numeric Index 5</name>
        <memoryusage>627357</memoryusage>
        <noncomponentusage>96</noncomponentusage>
        <approx>false</approx>
        <components>3</components>
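The mapping between the two responses can also be automated. The following Python sketch builds a lookup table from field codes to field identifiers, and then resolves a MemoryReport label such as Numeric Index 5. The wrapper element, the namespace URI bound to the autn prefix, and both helper functions are assumptions made for illustration; adjust them to match the actual response from your server.

```python
import re
import xml.etree.ElementTree as ET

AUTN_NS = "http://schemas.autonomy.com/aci/"  # assumed namespace URI for the autn prefix

# The <names> wrapper is hypothetical; it only makes the fragment well-formed.
SAMPLE = f"""<names xmlns:autn="{AUTN_NS}">
<autn:name code="4" types="index,highlight,title,sourcefield,textparseindex">DOCUMENT/DRETITLE</autn:name>
<autn:name code="5" types="numeric">DOCUMENT/PRICE</autn:name>
<autn:name code="6" types="numericdate">DOCUMENT/MYDATE</autn:name>
<autn:name code="7" types="index,highlight,sourcefield,textparseindex">DOCUMENT/DRECONTENT</autn:name>
</names>"""


def fields_by_code(xml_text: str) -> dict:
    """Map each numeric field code to an (identifier, types) pair."""
    root = ET.fromstring(xml_text)
    fields = {}
    for node in root.iter(f"{{{AUTN_NS}}}name"):
        types = node.get("types", "").split(",")
        fields[int(node.get("code"))] = (node.text, types)
    return fields


def field_for_index_label(label: str, fields: dict):
    """Resolve a label such as 'Numeric Index 5' to a field identifier, or None."""
    match = re.search(r"(\d+)\s*$", label)
    if not match:
        return None
    entry = fields.get(int(match.group(1)))
    return entry[0] if entry else None


fields = fields_by_code(SAMPLE)
print(field_for_index_label("Numeric Index 5", fields))  # DOCUMENT/PRICE
```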

XML Field Names

See Also: Case-Sensitive Search

By default, IDOL matches field names in a case-insensitive manner. If you index XML directly, you might need case-sensitive matching. You can configure this behavior by using the CaseSensitiveFieldNames configuration parameter.

When you index XML directly, you might also need to configure existing XML namespaces in your data in the AdditionalNameSpaces configuration parameter.

IDOL Server treats XML attributes as fields. When it creates the field name, it uses the format _ATTR_AttributeName, and it uses the name of the parent tag as part of the field identifier.

For example, for the tag <PRODUCT PRICE=value>, the PRICE attribute becomes PRODUCT/_ATTR_PRICE in the IDOL Server index.
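To preview the identifiers that attributes in an XML document would produce, you can apply the naming rule yourself. The following Python sketch walks an XML fragment and lists an identifier for each attribute. The attribute_identifiers helper is hypothetical, and it assumes that the identifier includes the full path of parent tags; it does not account for namespaces or case handling.

```python
import xml.etree.ElementTree as ET


def attribute_identifiers(xml_text: str) -> list[str]:
    """List the field identifiers that XML attributes would map to (hypothetical helper)."""
    found = []

    def walk(node, path):
        # Build the path to this tag from the path of its parents.
        current = f"{path}/{node.tag}" if path else node.tag
        for attr in node.attrib:
            found.append(f"{current}/_ATTR_{attr}")
        for child in node:
            walk(child, current)

    walk(ET.fromstring(xml_text), "")
    return found


print(attribute_identifiers('<PRODUCT PRICE="9.99"/>'))
# ['PRODUCT/_ATTR_PRICE']
```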

