Field Properties

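Before you index data into IDOL, you configure IDOL to store different fields with different properties. There are many available field properties. In general, they do one of the following: control which fields IDOL Server stores, identify fields that contain document metadata, determine how fields are processed during indexing, optimize fields for search, or control how fields are used during retrieval.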

The fields that you use for each property depend on the content in your documents, and the property that you apply depends on how you want to use the fields.

NOTE:

Many field properties must be set before you index content, to allow IDOL to process them correctly. If you change a field property after indexing data, you must reindex the content or regenerate the index.

TIP:

You can find out the names of all fields with a particular type or field code by using the GetTagNames action (for more information, refer to the IDOL Server Reference), or by using the Field Types page in the Monitor section of IDOL Admin.

Store Fields

By default, IDOL Server stores all fields in a document. However, you can prevent it from processing and storing some fields if you do not want to use them for search or retrieval. For this, you use the CantHaveFieldCSVs configuration parameter.

When you Retrieve Content with connectors, the connectors add many standard metadata fields to your documents, which include details of the import results. These fields might be useful in the IDX file that the connector creates, and help you to diagnose problems when you are setting up your indexing pipeline. However, you might not want to store them in your final IDOL Server index.
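As a minimal sketch, assuming that CantHaveFieldCSVs is set in the [Server] section of the IDOL Server configuration file (check the IDOL Server Reference for your version), the following fragment excludes two connector metadata fields from the index. The field names are hypothetical placeholders, not fields that any particular connector is known to add.

```ini
[Server]
// Hypothetical connector metadata fields that are not needed in the final index.
CantHaveFieldCSVs=*/IMPORT_STATUS,*/IMPORT_DIAGNOSTICS
```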

If you do not need any of the features that require stored content (such as AQG), you can choose not to store the document content. You can set NodeTableStoreContent to False, which reduces the storage requirements of IDOL Server by saving the disk space that the nodetable part of the index normally uses. When you are not storing content, basic query functionality works as normal, but you cannot print results. This might be suitable in a setup where you can easily view the original document instead.

NOTE:

You can regenerate an index only if IDOL Server stores the fields for the index that you want to regenerate.

If you do not need to store all the content, but you want to store particular fields, you can use the StoredType property for those fields.

You can use the StoredType property for a few fields that you want to be able to print in results, or for a particular type of field that you want to be able to regenerate.
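The following fragment is one possible way to combine these options. It assumes that NodeTableStoreContent is set in the [Server] section and that field properties are assigned through the usual [FieldProcessing] layout (a numbered list of processes, each naming a property section). The section names and the SUMMARY field are illustrative only. If you already have a [FieldProcessing] list, renumber the entry to fit it.

```ini
[Server]
// Assumption: NodeTableStoreContent belongs in the [Server] section.
NodeTableStoreContent=False

[FieldProcessing]
0=SetStoredFields

[SetStoredFields]
Property=StoredFields
// Hypothetical choice of fields that must remain printable in results.
PropertyFieldCSVs=*/DRETITLE,*/SUMMARY

[StoredFields]
StoredType=True
```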

Metadata Fields

During indexing, IDOL Server uses the following types of fields to retrieve important information about the documents.

Field Property | Field Contains | Uses for this Property
ACLType | The security ACL for the document. | This option is essential for mapped security. Apply it to the field that your connectors use to store the ACL.
DatabaseType | The database. | IDOL Server indexes documents to the database specified in this field. If you do not have a database field, IDOL Server uses the database specified in the index action. If neither option specifies a database, IDOL Server does not index the document.
DateType | The document date. | This field defines the age of the document. It is used as the autn:date metadata field in query responses, and for the MinDate and MaxDate restrictions for queries. Each document can have only one DateType field. Note: Compare with NumericDateType, which optimizes FieldText date searches.
LanguageType | The language type of the document. | This field defines the language and encoding of a document, which affects how IDOL Server performs various language processing steps. If a document does not have a LanguageType field, IDOL Server either attempts to detect the language (if Automatic Language Detection is turned on), or uses the DefaultLanguageType.
ReferenceType | The document reference. | All documents must have a reference value. This field is often also used for deduplication.
SecurityType | The security type to use for the document. | If you have multiple security types configured, this field specifies which one a particular document uses. A document can have only one SecurityType field.
TitleType | The document title. | This value is used in the autn:title metadata field in query responses. A document can have only one TitleType field.
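As an illustration of how some of these metadata properties might be assigned, the following sketch uses the usual [FieldProcessing] layout. The process and property section names are arbitrary labels chosen for this example. The fields shown are the standard IDX fields, but your documents might store this metadata in different fields.

```ini
[FieldProcessing]
0=SetReferenceFields
1=SetTitleFields
2=SetDateFields
3=SetDatabaseFields

[SetReferenceFields]
Property=ReferenceFields
PropertyFieldCSVs=*/DREREFERENCE

[SetTitleFields]
Property=TitleFields
PropertyFieldCSVs=*/DRETITLE

[SetDateFields]
Property=DateFields
PropertyFieldCSVs=*/DREDATE

[SetDatabaseFields]
Property=DatabaseFields
PropertyFieldCSVs=*/DREDBNAME

// Property sections: each one applies a single metadata property.
[ReferenceFields]
ReferenceType=True

[TitleFields]
TitleType=True

[DateFields]
DateType=True

[DatabaseFields]
DatabaseType=True
```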

Fields for Processing

The following types of fields determine how IDOL Server performs processing during indexing.

Field Property | Field Contains | Uses for this Property
Index | General document content.

IDOL Server processes content in these fields linguistically (with stemming and so on). This enables conceptual search for these fields. This field type is generally useful for the bulk of a document, such as the body of an e-mail, or the text of a Web page.

IDOL Server stores the information about the terms and positions from Index fields in the dynterm and unstemmed indexes. You can improve performance by minimizing the amount of data in these indexes. Configure a field as an Index field only if you want to search it conceptually. For a highly structured field, such as a document date, it is typically better to use an optimized field type for search, and use a FieldText or metadata search parameter.

IDOL Server stores the content of documents in the nodetable index. It mostly uses this content for printing the document in results.

Making all fields Index fields can result in a large index, and correspondingly slow performance. Equally, do not use optimized field types for all fields if you do not intend to use them for search.

LangDetectType | Content that IDOL Server uses to detect the document language. | If you are using Automatic Language Detection, IDOL Server uses the contents of these fields to detect the language. Choose fields that contain a large amount of content, to maximize the chance of successfully identifying the language. Often these fields are the same as the Index fields.
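Because the language detection fields are often the same as the Index fields, one property section can apply both properties. The following is a minimal sketch under that assumption. The section names are illustrative, and DRECONTENT stands in for whichever fields hold the bulk of your document text.

```ini
[FieldProcessing]
0=SetContentFields

[SetContentFields]
Property=ContentFields
PropertyFieldCSVs=*/DRECONTENT

[ContentFields]
// Process the field linguistically for conceptual search.
Index=True
// Also use the same field for Automatic Language Detection.
LangDetectType=True
```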

Fields Optimized for Search

For natural language search, IDOL Server uses Index fields. These fields contain document content, which is processed linguistically.

If a field contains data that you frequently want to use to filter values in field searches, you can optimize the searches by applying the appropriate field property to these fields.

For more information about the performance implications of these properties, see FieldText Optimizations.

Field Property | Field Contains | Uses for this Property
MatchType | Text or alphanumeric content that you want to match in its entirety.

This property optimizes the field for MATCH, STRING, and WILD FieldText operators. IDOL Server stores the values for these fields in the match index, which allows fast lookup for searches using these values.

Use this property for values that you want to search in their entirety, such as a name or a country. When using this property and the MATCH operator, you can greatly improve the performance of queries for those values.

Do not use the MatchType property for large, unstructured fields that you want to search with a basic (conceptual or keyword) search. The match index size is proportional to the size of the unique values that it contains. If you use it for large, unstructured values, the match index will be large, reducing the performance benefit. Similarly, do not use this property for fields that you do not want to use in FieldText searches, for example if you only use the field as metadata or when printing the document content.

NumericDateType | Date content that you want to use to search for date ranges.

This property optimizes the field for the GTNOW, LTNOW, and RANGE FieldText operators. Use it for dates that you want to search for in these ranges. IDOL Server converts the value of the date field to autndate format and stores the values in the numeric index, which allows fast lookup for searches using these values.

NOTE:

The MinDate and MaxDate query parameters use the DateType field. The DateType field is also used for the document date metadata.

NumericType | Numeric content that you want to match in its entirety, or as part of a range.

This property optimizes the field for a number of FieldText operators, including EQUAL, GREATER, and LESS. IDOL Server stores values for this field in the numeric index, which allows fast lookup for searches using these values.

Use this property for values that you want to search using numeric FieldText operations. Do not use the NumericType property for fields that you use only when printing document content, because it increases the size of the numeric index, which can reduce the performance for other queries.
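The following sketch shows one way the three optimized properties might be assigned, again using the usual [FieldProcessing] layout. The COUNTRY, PRICE, and PUBLISHED field names are hypothetical, and the FieldText expressions in the comments are examples of the general OPERATOR{value}:FIELD form rather than queries taken from this guide.

```ini
[FieldProcessing]
0=SetMatchFields
1=SetNumericFields
2=SetNumericDateFields

[SetMatchFields]
Property=MatchFields
// Short values searched in their entirety, for example MATCH{France}:COUNTRY
PropertyFieldCSVs=*/COUNTRY

[SetNumericFields]
Property=NumericFields
// Numeric comparisons, for example GREATER{100}:PRICE
PropertyFieldCSVs=*/PRICE

[SetNumericDateFields]
Property=NumericDateFields
// Date range searches with GTNOW, LTNOW, or RANGE
PropertyFieldCSVs=*/PUBLISHED

[MatchFields]
MatchType=True

[NumericFields]
NumericType=True

[NumericDateFields]
NumericDateType=True
```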

In addition, you can use the following field properties to enable Parametric Search.

Field Property | Field Contains | Uses for this Property
ParametricType | Content that you want to use for Parametric Search.

Parametric search (also known as faceted search) allows you to find documents that have a particular value or range of values in a particular field. This type of search is useful when you want to be able to find a set of documents with particular properties. You must use ParametricType fields to turn on parametric search actions, such as GetQueryTagValues.

Parametric field processing increases the indexing time, because IDOL Server must store the field information required for parametric search. It stores the information in the parametric index, which maps each value in a field to the IDs of the documents in which that value occurs.

Use the ParametricType property only for fields that you want to use in parametric search filters. Using it for other values unnecessarily increases the index time and index size.

ParametricRangeType | Numeric content that you want to use to form ranges of data in Parametric Search.

For text parametric fields, the parametric search returns every possible value of the field. For numeric values in ParametricRangeType fields, IDOL Server can instead return a set of value ranges.

Use this option for fields that you want to use in parametric searches that contain a wide range of numeric values.

For example, a price field might contain values from $1 to $1000, in very small increments. Rather than displaying 1,000 possible values, the parametric range would display a smaller number of ranges, such as $1-200, $201-400, and so on.

If the field contains only a very small number of values, you might not want to use the parametric range.

TIP:

For ParametricRangeType fields, you can also set the Ranges property to the number of ranges that you want to use. See Numeric Ranges.
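A possible configuration for the price example is sketched below. The CATEGORY and PRICE field names and the section names are hypothetical, and the Ranges value is only an illustration of setting a number of ranges. Check the IDOL Server Reference for the exact values that Ranges accepts and for whether your version requires other properties alongside ParametricRangeType.

```ini
[FieldProcessing]
0=SetParametricFields
1=SetParametricRangeFields

[SetParametricFields]
Property=ParametricFields
// Text values that you want to offer as parametric (faceted) filters.
PropertyFieldCSVs=*/CATEGORY

[SetParametricRangeFields]
Property=ParametricRangeFields
// Numeric values that cover a wide range and are better grouped into ranges.
PropertyFieldCSVs=*/PRICE

[ParametricFields]
ParametricType=True

[ParametricRangeFields]
ParametricRangeType=True
// Illustrative value: the number of ranges to return.
Ranges=5
```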

Fields for Retrieval

The following fields are used in the query process.

Field Property | Description | Uses for this Property
HiddenType | Fields that you do not want to display in results. | The fields with this property are stored in the IDOL Server index, but they are not displayed in printed results. Apply this property to metadata values that you do not want the end user to see, such as the ACLType field, or the field that stores virtual node details in DIH consistent hashing mode.
HighlightType | Fields in which IDOL Server must highlight terms when highlighting is enabled.

When you use highlighting during a query, or send a Highlight action, IDOL Server only highlights the link terms that occur in the HighlightType fields. You can make any field HighlightType.

IDOL Server does not perform any additional processing on HighlightType fields at index time. You can change these fields at any time (although you need to restart the server for the changes to take effect).

If you set a field as HighlightType, IDOL Server checks the field for values to highlight even if you do not print the field in the results. To avoid unnecessary work, you might want to apply HighlightType only to fields that you commonly display in results.

PrintType | Fields that IDOL Server returns by default in results.

The fields with this property are printed by default in query results, if you do not use the Print parameter to choose alternative print options. You can choose any fields, according to your use case. You might want to present a sparse set of results, with just a document reference and title, or you might want to print the contents of a summary field to provide more information about the result document.

If you have a large number of print fields, or very large print fields, it can reduce the query performance, and increase the amount of data that must be sent.

SortType | Fields that IDOL Server uses to sort results.

These fields are optimized for sorting query results. IDOL Server stores the SortType field information in the sort index.

The SortType field property is most useful when:

  • Most of your documents have a value in the field.

  • Most of the values have a common prefix that you can remove by using the SortFieldPrefixCSVs parameter.

  • The values (after removing any common prefix) all fit in the configured SortFieldStorageLength.

In other cases, you might want to use NumericType, NumericDateType, and MatchType fields, which are also optimized for sorting.

SourceType | Fields that IDOL Server uses to generate summaries and suggest similar documents. | These fields contain document content. Often these fields are the same as the Index fields. By default, if you do not configure SourceType fields, IDOL Server uses the contents of the Index fields for the functions that use SourceType fields.
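As a rough sketch of a sparse result set, the following fragment prints only the reference and title by default, highlights terms in the title and content, and hides a security field. The section names and the ACL field name are hypothetical. Use the field that your connectors actually populate.

```ini
[FieldProcessing]
0=SetPrintFields
1=SetHighlightFields
2=SetHiddenFields

[SetPrintFields]
Property=PrintFields
// Returned by default when a query does not use the Print parameter.
PropertyFieldCSVs=*/DREREFERENCE,*/DRETITLE

[SetHighlightFields]
Property=HighlightFields
PropertyFieldCSVs=*/DRETITLE,*/DRECONTENT

[SetHiddenFields]
Property=HiddenFields
// Hypothetical field that stores the document ACL.
PropertyFieldCSVs=*/ACL

[PrintFields]
PrintType=True

[HighlightFields]
HighlightType=True

[HiddenFields]
HiddenType=True
```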

 

 

