Before you index data into IDOL, you configure IDOL to store different fields with different properties. There are many available field properties. In general, they do one of the following:
- Specify that the field contains a particular piece of metadata.
- Optimize the field for retrieval in a particular type of search, or enable a particular type of search for that field.
- Specify how to treat the field during retrieval.
The fields that you use for each property depend on the content in your documents, and the properties that you apply depend on how you want to use the fields.
Many field properties must be set before you index content, to allow IDOL to process them correctly. If you change a field property after indexing data, you must reindex the content or regenerate the index.
You can find out the names of all fields with a particular type or field code by using the GetTagNames action (for more information, refer to the IDOL Server Reference), or by using the Field Types page in the Monitor section of IDOL Admin.
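As a rough illustration of the GetTagNames request, the following Python sketch builds the action URL. The host, the port (9000), and the FieldType parameter name are assumptions made for this example; check the IDOL Server Reference for the parameters that your version supports.

```python
from urllib.parse import urlencode


def get_tag_names_url(host: str, port: int, field_type: str) -> str:
    """Build a URL for a GetTagNames action.

    The FieldType parameter name is an assumption for illustration;
    confirm it against the IDOL Server Reference.
    """
    query = urlencode({"action": "GetTagNames", "FieldType": field_type})
    return f"http://{host}:{port}/?{query}"


# Hypothetical host and port; replace them with your own IDOL Server details.
print(get_tag_names_url("localhost", 9000, "Index"))
```

You could send the resulting URL with any HTTP client and read the field names from the response.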
By default, IDOL Server stores all fields in a document. However, you can prevent it from processing and storing some fields if you do not want to use them for search or retrieval. For this, you use the CantHaveFieldCSVs configuration parameter.
When you retrieve content with connectors, the connectors add many standard metadata fields to your documents, which include details of the import results. These fields might be useful in the IDX file that the connector creates, and can help you to diagnose problems when you are setting up your indexing pipeline. However, you might not want to store them in your final IDOL Server index.
If you do not need any of the features that require stored content (such as AQG), you can choose not to store the document content. Set NodeTableStoreContent to False to reduce the storage requirements of IDOL Server by saving the disk space that the nodetable part of the index normally uses. When you do not store content, basic query functionality works as normal, but you cannot print results. This configuration might be suitable in a setup where you can easily view the original document instead.
You can regenerate an index only if IDOL Server stores the fields for the index that you want to regenerate.
If you do not need to store all the content, but you want to store particular fields, you can use the StoredType property for those fields.
You can use the StoredType property for a few fields that you want to be able to print in results, or for a particular type of field that you want to be able to regenerate.
During indexing, IDOL Server uses the following types of fields to retrieve important information about the documents.
Field Property | Field Contains | Uses for this Property |
---|---|---|
ACLType | The security ACL for the document. | This option is essential for mapped security. Apply it to the field that your connectors use to store the ACL. |
DatabaseType | The database. | IDOL Server indexes documents to the database specified in this field. If you do not have a database field, IDOL Server uses the database specified in the index action. If neither option specifies a database, IDOL Server does not index the document. |
DateType | The document date. | This field defines the age of the document. It is used as the [...] NOTE: Compare with NumericDateType, which optimizes FieldText date searches. |
LanguageType | The language type of the document. | This field defines the language and encoding of a document, which affects how IDOL Server performs various language processing steps. If a document does not have a LanguageType field, IDOL Server either attempts to detect the language (if Automatic Language Detection is turned on), or uses the DefaultLanguageType. |
ReferenceType | The document reference. | All documents must have a reference value. This field is often also used for deduplication. |
SecurityType | The security type to use for the document. | If you have multiple security types configured, this field specifies which one a particular document uses. A document can have only one SecurityType field. |
TitleType | The document title. | The title of the document. This value is used in the autn:title metadata field in query responses. A document can have only one TitleType field. |
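The database fallback described for DatabaseType can be sketched as a small decision function. This is only an illustration of the rule stated in the table, not IDOL Server code; the function and parameter names are hypothetical.

```python
from typing import Optional


def resolve_database(field_value: Optional[str],
                     action_value: Optional[str]) -> Optional[str]:
    """Return the target database for a document, or None if it cannot be indexed.

    field_value:  value of the document's DatabaseType field, if present.
    action_value: database named in the index action, if any.
    (Both names are hypothetical and exist only for this sketch.)
    """
    if field_value:
        # The database field in the document takes precedence.
        return field_value
    if action_value:
        # Otherwise fall back to the database given in the index action.
        return action_value
    # Neither source names a database, so the document is not indexed.
    return None


print(resolve_database("News", "Archive"))
print(resolve_database(None, "Archive"))
print(resolve_database(None, None))
```

The sketch does not model what happens when the named database does not exist on the server.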
The following types of fields determine how IDOL Server performs processing during indexing.
Field Property | Field Contains | Uses for this Property |
---|---|---|
Index | General document content. | IDOL Server processes content in these fields linguistically (with stemming and so on). This enables conceptual search for these fields. This field type is generally useful for the bulk of a document, such as the body of an e-mail, or the text of a Web page. IDOL Server stores the information about the terms and positions from [...] IDOL Server stores the content of documents in the nodetable index. It mostly uses this content for printing the document in results. Making all fields [...] |
LangDetectType | Content that IDOL Server uses to detect the document language. | If you are using Automatic Language Detection, IDOL Server uses the contents of these fields to detect the language. Choose fields that contain a large amount of content, to maximize the chances of successfully identifying the language. Often these fields are the same as the Index fields. |
For natural language search, IDOL Server uses Index fields. These fields contain document content, which is processed linguistically.
If a field contains data that you frequently want to use to filter values in field searches, you can optimize the searches by applying the appropriate field property to these fields.
For more information about the performance implications of these properties, see FieldText Optimizations.
Field Property | Field Contains | Uses for this Property |
---|---|---|
MatchType | Text or alphanumeric content that you want to match in its entirety. | This property optimizes the field for [...] Use this property for values that you want to search in their entirety, such as a name or a country. When using this property and the [...] Do not use the [...] |
NumericDateType | Date content that you want to use to search for date ranges. | This property optimizes the field for the [...] NOTE: The [...] |
NumericType | Numeric content that you want to match in its entirety, or as part of a range. | This property optimizes the field for a number of FieldText operators, including [...] Use this property for values that you want to search using numeric FieldText operations. Do not use the [...] |
In addition, you can use the following field properties to enable Parametric Search.
Field Property | Field Contains | Uses for this Property |
---|---|---|
ParametricType | Content that you want to use for Parametric Search. | Parametric search (also known as faceted search) allows you to find documents that have a particular value or range of values in a particular field. This type of search is useful when you want to be able to find a set of documents with particular properties. You must use [...] Parametric field processing increases the indexing time, while IDOL Server stores the field information required for parametric search. It stores the information in the parametric index, which maps all the values in a field to the IDs of the documents in which they occur. Use the [...] |
ParametricRangeType | Numeric content that you want to use to form ranges of data in Parametric Search. | For text parametric fields, the parametric search returns every possible value of the field. For numeric values in [...] Use this option for fields that you want to use in parametric searches that contain a wide range of numeric values. For example, a price field might contain values from $1 to $1000, in very small increments. Rather than displaying 1,000 possible values, the parametric range would display a smaller number of ranges, such as $1-200, $201-400, and so on. If the field contains only a very small number of values, you might not want to use the parametric range. |
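To make the idea of a parametric index concrete, the following Python sketch maps each value of a field to the IDs of the documents that contain it. It is a conceptual model only and says nothing about how IDOL Server stores the index internally; the field names and documents are invented for the example.

```python
from collections import defaultdict


def build_parametric_index(documents):
    """Map field name -> field value -> set of document IDs."""
    index = defaultdict(lambda: defaultdict(set))
    for doc_id, fields in documents.items():
        for field, value in fields.items():
            index[field][value].add(doc_id)
    return index


# Hypothetical documents with two parametric fields.
docs = {
    1: {"COUNTRY": "France", "COLOUR": "red"},
    2: {"COUNTRY": "France", "COLOUR": "blue"},
    3: {"COUNTRY": "Spain", "COLOUR": "red"},
}

index = build_parametric_index(docs)

# All values of a field, as a faceted navigation might list them.
print(sorted(index["COUNTRY"]))

# Documents that have a particular value in a particular field.
print(sorted(index["COUNTRY"]["France"]))
```

Building a structure like this for every parametric field is extra work at index time, which is consistent with the note that parametric field processing increases the indexing time.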
For ParametricRangeType fields, you can also set the Ranges property to the number of ranges that you want to use. See Numeric Ranges.
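The price example can be sketched in Python as follows. The sketch splits a numeric interval into a chosen number of contiguous, near-equal ranges and counts the values in each. It assumes equal-width ranges purely for illustration; the way that IDOL Server actually chooses range boundaries is not described here, and the function names are hypothetical.

```python
def equal_width_ranges(low: int, high: int, count: int):
    """Split the inclusive integer interval [low, high] into up to `count` ranges."""
    if count < 1 or high < low:
        raise ValueError("need count >= 1 and high >= low")
    base, extra = divmod(high - low + 1, count)
    ranges, start = [], low
    for i in range(count):
        size = base + (1 if i < extra else 0)
        if size == 0:
            break  # more ranges requested than distinct values
        ranges.append((start, start + size - 1))
        start += size
    return ranges


def count_per_range(values, ranges):
    """Count how many values fall into each inclusive (start, end) range."""
    return {r: sum(1 for v in values if r[0] <= v <= r[1]) for r in ranges}


price_ranges = equal_width_ranges(1, 1000, 5)
print(price_ranges)
print(count_per_range([5, 150, 201, 399, 950], price_ranges))
```

With five ranges over $1 to $1000, the first two ranges are $1-200 and $201-400, which matches the example in the table.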
The following fields are used in the query process.
Field Property | Description | Uses for this Property |
---|---|---|
HiddenType | Fields that you do not want to display in results. | The fields with this property are stored in the IDOL Server index, but they are not displayed in printed results. Apply this property to fields that hold metadata values that you do not want the end user to see, such as the ACLType field, or the field that stores virtual node details in DIH consistent hashing mode. |
HighlightType | Fields in which IDOL Server must highlight terms when highlighting is enabled. | When you use highlighting during a query, or send a [...] IDOL Server does not perform any additional processing on [...] If you set a field as [...] |
PrintType | Fields that IDOL Server returns by default in results. | The fields with this property are printed by default in query results, if you do not use the [...] If you have a large number of print fields, or very large print fields, it can reduce query performance and increase the amount of data that must be sent. |
SortType | Fields that IDOL Server uses to sort results. | These fields are optimized for sorting query results. IDOL Server stores the [...] The [...] In other cases, you might want to use [...] |
SourceType | Fields that IDOL Server uses to generate summaries and suggest similar documents. | These fields contain document content. Often these fields are the same as the Index fields. By default, if you do not configure SourceType fields, IDOL Server uses the contents of the Index fields for the functions that use SourceType fields. |