Summarization


Summarization reduces a section of text to a shorter form. It aims either to remove redundant or irrelevant information, or to draw attention immediately to the most relevant part of a large document.

Summaries are useful for display in results lists. You can use them to show just the start of an article, or to show the most relevant extract from a matching document.

You can perform summarization at index time and store the result, or you can create summaries dynamically at any time.

IDOL uses the content of your SourceType fields to generate document summaries (or Index fields if you have not configured source fields). HPE recommends that you configure your SourceType fields to be only those containing clean natural language text.

To prevent extremely long documents from causing slow queries, summarization uses only the first 10,000 characters from each document. You can change this value by using the MaxSourceCharacters configuration parameter.

Automatic Summarization at Index Time

You can create summaries of documents before you index them into IDOL Server, and store this data in a field. When users search for content, you can return the document summary field, which allows a user to quickly see what the document is about.

Creating and storing these summaries in an automated process means that the information is readily available, and can be quickly retrieved. If you want to create a Conceptual summary or Quick summary for all your query results, this method is faster than producing a summary for every query. However, the summary does take up more space in the index.

If you want to create a Contextual summary that displays content that is most relevant to the query, you can create summaries at query time.

You can create summaries for documents at any time. However, producing a summary for every result in every query would be an unnecessary drain on your system resources and result in very slow queries. With automatic summarization, you create a summary once, and use it every time you retrieve the document.
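The "create once, reuse on every retrieval" idea can be sketched in a few lines of Python. This is only an illustration of the pattern, not how IDOL implements it: the field names `content` and `summary`, the helper names, and the naive sentence splitter are all assumptions made for the example.

```python
import re


def quick_summary(text, sentences=2):
    """Stand-in summarizer: return the first few sentences of the text."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return " ".join(parts[:sentences])


def prepare_for_index(doc, source_field="content", summary_field="summary"):
    """Compute the summary once, before indexing, and store it in a field.

    The field names are hypothetical; use whatever your index defines.
    """
    enriched = dict(doc)
    enriched[summary_field] = quick_summary(doc[source_field])
    return enriched


def result_line(doc):
    """At query time, reuse the stored summary instead of recomputing it."""
    return f"{doc['title']}: {doc['summary']}"


doc = {
    "title": "Minimalism",
    "content": "Steve Reich is a composer. He pioneered phasing. "
               "His work is widely performed.",
}
indexed = prepare_for_index(doc)
print(result_line(indexed))
```

The cost of summarizing is paid once per document, at the price of one extra stored field.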

Dynamic Summaries

You can automatically create summaries for every query that you send to IDOL Server. You can add summary options to the query action, which specify the type of summary you want to create. You can also use the Summarize action to summarize a piece of text on request.

Dynamic summaries are useful if you want to produce a summary that displays the content that is most relevant to the query (a Contextual summary). If you want to produce a Conceptual summary or Quick summary to display in query results, you should consider Automatic Summarization at Index Time, which can save time in your queries.

Types of Summary

You can produce different types of summary, according to your purpose:

Conceptual summary: Sentences that are typical of the document content, which can be from different parts of the document. Use this type of summary to give a general idea of what the document is about.

Contextual summary: A conceptual summary, biased to include sentences that are particularly relevant to the query terms. Use this type of summary to show the sections of the document that are most relevant to the query.

Quick summary: The first few sentences of the document. Use this type of summary to give a brief introduction to the document.
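To make the difference between the three types concrete, here is a toy Python sketch. It is not IDOL's algorithm: the word-frequency scoring, the `bias` weight, and all function names are assumptions chosen only to illustrate "first sentences", "typical sentences", and "typical sentences biased toward the query".

```python
import re
from collections import Counter


def split_sentences(text):
    """Naive splitter: break after ., ! or ? followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def words(text):
    """Lowercased alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def quick_summary(text, sentences=1):
    """Quick: the first few sentences of the document."""
    return " ".join(split_sentences(text)[:sentences])


def conceptual_summary(text, sentences=1, query=None, bias=3.0):
    """Conceptual: the sentences whose words are most typical of the document.

    If a query is given, each distinct query term in a sentence adds `bias`
    to its score, which turns this into a contextual summary.
    """
    sents = split_sentences(text)
    freq = Counter(words(text))
    query_terms = set(words(query)) if query else set()

    def score(sentence):
        tokens = words(sentence)
        if not tokens:
            return 0.0
        base = sum(freq[t] for t in tokens) / len(tokens)
        hits = sum(1 for t in set(tokens) if t in query_terms)
        return base + bias * hits

    ranked = sorted(range(len(sents)), key=lambda i: score(sents[i]), reverse=True)
    chosen = sorted(ranked[:sentences])  # keep document order in the output
    return " ".join(sents[i] for i in chosen)


def contextual_summary(text, query, sentences=1):
    """Contextual: a conceptual summary biased toward the query terms."""
    return conceptual_summary(text, sentences, query=query)


text = (
    "Minimalist music grew out of New York in the 1960s. "
    "Minimalist music uses repetition, and the music changes slowly. "
    "Steve Reich wrote several well-known pieces."
)
print("Quick:     ", quick_summary(text))
print("Conceptual:", conceptual_summary(text))
print("Contextual:", contextual_summary(text, "Steve Reich"))
```

On this sample text the three functions each pick a different sentence, which mirrors the different purposes described above.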

For example, you can use these summary types in the following ways:

Create conceptual or quick summaries for all documents during your index processing, and store them in a field. When a user queries IDOL, return the title and summary for each result, to show what the document is about.

Add a contextual summary option to queries. When a user runs a search, IDOL generates a contextual summary for each result. The summary provides a few sentences that show how the document relates to the query.

Ensure Query Terms Return in the Summary

To ensure that the summary always contains terms from the query, use a character context summary.

action=query&text=Steve Reich&summary=context&characters=250
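If you send this query from a script, the query text must be URL-encoded (note the space in "Steve Reich"). The following Python sketch builds the same request with the standard library. The host, port, and the convention of appending the action string directly after the server address are assumptions for illustration; check them against your own deployment.

```python
from urllib.parse import quote, urlencode


def build_query_url(base_url, text, summary="context", characters=250):
    """Build a query URL that requests a character context summary.

    Parameter names mirror the example above; base_url is a placeholder.
    """
    params = {
        "action": "query",
        "text": text,
        "summary": summary,
        "characters": characters,
    }
    # quote_via=quote encodes spaces as %20 rather than "+".
    return f"{base_url}/{urlencode(params, quote_via=quote)}"


# Hypothetical server address, used only for the example.
url = build_query_url("http://localhost:9000", "Steve Reich")
print(url)
```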