Types of Field

You might initially want to treat every field as document content, which receives detailed linguistic processing for conceptual search. However, this approach often results in slow indexing and a large index, which in turn slows down queries.

In your review of your content, you probably found a few key types of data. The following sections discuss some of the most common types.

Customizing your configuration can help you get the most value from your content at a lower cost in disk space and speed. For more information about the types of fields available and their configurations, see Field Properties.

References

Every document must have a reference field. The reference can be any field that identifies the document. You can also configure more than one reference field for each document.

During indexing, the reference field is commonly used to remove duplicate documents. In this case, if an incoming document has the same value in the reference field as an existing document, IDOL Server deletes one of them. By default, it deletes the existing document and indexes the new version (updating the database), but you can optionally configure it to keep the existing version.
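
The duplicate handling described above can be illustrated with a minimal sketch. This is a toy in-memory model, not how IDOL Server is implemented or configured: the index_document function, the keep_existing flag, and the CONTENT field name are all hypothetical and exist only for illustration. Only the DREREFERENCE field name comes from the connector behavior described in this section.

```python
from typing import Dict


def index_document(index: Dict[str, dict], doc: dict,
                   reference_field: str = "DREREFERENCE",
                   keep_existing: bool = False) -> bool:
    """Add doc to a toy in-memory index keyed by its reference value.

    Returns True if doc was stored, False if the existing copy was kept.
    (Hypothetical helper; it only models the behavior described in the text.)
    """
    reference = doc[reference_field]
    if reference in index and keep_existing:
        # Duplicate reference, and the caller asked to keep the stored version.
        return False
    # Either the reference is new, or the new version replaces the old one.
    index[reference] = doc
    return True


index: Dict[str, dict] = {}
index_document(index, {"DREREFERENCE": "/docs/a.txt", "CONTENT": "first version"})
index_document(index, {"DREREFERENCE": "/docs/a.txt", "CONTENT": "second version"})

# With the default behavior, the newer document has replaced the older one.
replaced_content = index["/docs/a.txt"]["CONTENT"]
```

Calling index_document with keep_existing=True for the same reference leaves the stored document untouched, which mirrors the optional behavior mentioned above.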

In general, your connector adds a DREREFERENCE field that you can use as a reference. This value might be a file path, URL, or database primary key, depending on your repository. The connectors also all produce a unique identification (UID) field, which you can also use as a reference.

Document Content

Document content is generally the largest field of the document, such as the body of an e-mail or the text in a file. For most uses, you want to be able to search for keywords, sentences, and concepts in this content. In IDOL Server, you generally configure these fields as Index fields.

The Languages section contains more information about the processing for index fields.

You might also want to be able to highlight search terms, or use the content to create document summaries. You can configure IDOL Server to use the same field in multiple ways. For more information, refer to the IDOL Server Administration Guide.

Metadata

Metadata fields contain information about the document. In some cases, you might not need to search for this content at all.

In other cases, these fields might contain information that you can use for filtering, such as a date, or the user who owns the document. In most such cases, you only ever want to use the information in its entirety, so the linguistic processing that applies to document content is unnecessary.

You can instead use optimized field types that are specific to the field content (text, number, date, and so on).
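
To show why whole-value filtering needs no linguistic processing, here is a small sketch under stated assumptions. It is a toy model in plain Python, not IDOL Server's implementation or query syntax. The OWNER and MODIFIED field names, the sample documents, and both helper functions are hypothetical.

```python
from datetime import date

# Hypothetical metadata for three documents.
documents = [
    {"id": 1, "OWNER": "Jane Smith", "MODIFIED": date(2023, 5, 1)},
    {"id": 2, "OWNER": "John Smith", "MODIFIED": date(2024, 2, 14)},
    {"id": 3, "OWNER": "Jane Smith", "MODIFIED": date(2024, 7, 30)},
]


def match_whole_value(docs, field, value):
    """Return ids of documents whose field equals value, ignoring case."""
    wanted = value.casefold()
    return [d["id"] for d in docs if d[field].casefold() == wanted]


def filter_date_range(docs, field, start, end):
    """Return ids of documents whose date field lies between start and end, inclusive."""
    return [d["id"] for d in docs if start <= d[field] <= end]


# The owner is compared as a single value; it is never split into terms.
owned_by_jane = match_whole_value(documents, "OWNER", "jane smith")

# The date is compared as a date, not as text.
modified_in_2024 = filter_date_range(
    documents, "MODIFIED", date(2024, 1, 1), date(2024, 12, 31)
)
```

Because each value is compared in its entirety, a filter for "Jane Smith" does not return documents owned by "John Smith", even though the two values share a word.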

Other Document Details

As well as metadata, your documents might have other details or properties that you want to use in IDOL. For example, a document that describes a product might have a list of product features as well as the main product description.

If you extract these details into their own fields, you can use optimized field types to make this content much easier to search.

Enriched Data

When you extract data from your repositories, you can use various forms of data enrichment to make it easier to retrieve in IDOL Server. In many cases, you add this data to a field (see Document Tagging).

 

