By default, search in IDOL Content is not case sensitive. This means that a search for a term matches other occurrences of that term, regardless of their case in the original document, and regardless of the case in which the search term is written. For example, a search for Dog matches documents that originally contained terms such as dog, DOG, Dog, or DoG. To achieve this behavior, IDOL stores all terms in the index in uppercase, and also converts search terms to uppercase during a query to match these terms.
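The uppercase-folding behavior described above can be illustrated with a toy inverted index. This is a minimal sketch, not IDOL's implementation: the `normalize`, `build_index`, and `search` helpers and the sample documents are hypothetical, and real tokenization is far more involved than splitting on whitespace.

```python
from collections import defaultdict

def normalize(term):
    """Fold a term to uppercase, the stored form the text describes."""
    return term.upper()

def build_index(docs):
    """Map each uppercased term to the IDs of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.split():
            index[normalize(term)].add(doc_id)
    return index

def search(index, term):
    """Fold the query term the same way before looking it up."""
    return index.get(normalize(term), set())

# Hypothetical sample documents.
docs = {
    1: "my dog barks",
    2: "BEWARE OF THE DOG",
    3: "Dog shows",
    4: "a DoG typo",
    5: "cats only",
}
index = build_index(docs)
print(sorted(search(index, "Dog")))  # [1, 2, 3, 4]
```

Because both sides are folded to the same form, the case of the query term and the case in the original document play no part in the lookup.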
In some situations, you might want to match documents containing a term with a particular case. For example, you might want to find documents that mention the language Polish, but not match any that talk about polish. You can enable case-sensitive search by turning on either the AdvancedCaseSearch or the AdvancedPlus configuration parameter before you index any documents. In these modes, IDOL stores all terms with additional information about the original case of each occurrence.
In a query, you enable a case-sensitive search for a term by adding a tilde (~) to the start of the term. For example, a query for ~LAX matches only documents that contain the term LAX in uppercase.
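As a rough illustration of the tilde syntax, the following sketch builds a query string for a case-sensitive term. The `case_sensitive_term` and `build_query` helpers are hypothetical, and the use of the Query action with a Text parameter is an assumption made for this example, not something the text above specifies.

```python
from urllib.parse import urlencode, parse_qs

def case_sensitive_term(term):
    """Prefix a term with a tilde to request a case-sensitive match."""
    return term if term.startswith("~") else "~" + term

def build_query(text):
    """Encode a query string; the action and parameter names are assumed."""
    return urlencode({"action": "Query", "Text": text})

query = build_query(case_sensitive_term("LAX"))
print(query)
# Decoding the string recovers the tilde-prefixed term.
print(parse_qs(query)["Text"])  # ['~LAX']
```

Depending on the Python version, the tilde may appear literally or percent-encoded in the printed query string; both decode to the same term.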
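To ensure efficient storage, IDOL stores each occurrence of a term as one of only four case types: all lowercase (for example, dog), all uppercase (for example, DOG), initial capital (for example, Dog), and other (for example, DoG).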
This behavior means that a search for a term that is in the 'other' category matches any of the potential forms in this category. For example, a search for ~dOG might match a document containing doG or DoG. However, in most cases the number of occurrences of terms in this category is extremely small, and when there are occurrences, particular words are usually in the same form, so this ambiguity does not usually affect matching.
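The ambiguity in the 'other' category can be sketched as follows. The example assumes the four case types are all lowercase, all uppercase, initial capital, and other; only the 'other' category is named explicitly in the text, so the remaining three are an assumption, as are the `case_type` and `tilde_match` helpers.

```python
def case_type(term):
    """Classify a term into one of four assumed case categories."""
    if term.islower():
        return "lower"
    if term.isupper():
        return "upper"
    if term[:1].isupper() and term[1:].islower():
        return "initial"
    return "other"

def tilde_match(query_term, stored_term):
    """Match on the folded term plus its case category, not its exact spelling."""
    return (query_term.upper() == stored_term.upper()
            and case_type(query_term) == case_type(stored_term))

print(tilde_match("dOG", "doG"))  # True: both fall in the 'other' category
print(tilde_match("dOG", "DoG"))  # True
print(tilde_match("dOG", "Dog"))  # False: 'Dog' has an initial capital
print(tilde_match("LAX", "lax"))  # False
```

Storing a category rather than the exact original string is what keeps the storage small, at the cost of not distinguishing between mixed-case variants.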
By default, a search term that has an initial capital letter (followed by a lowercase letter) gives a small weight boost to documents that contain that term with an initial capital letter. This feature applies regardless of whether you configure case-sensitive search. For example, when all other factors are equal, a search for Polish matches documents that contain Polish with a slightly higher weight than documents that contain polish or POLISH.
You can disable this behavior by turning off the KeywordMode configuration parameter.
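The effect of the initial-capital boost on ranking can be sketched with a toy scoring function. The boost factor below is an arbitrary illustrative number, since the text says only that the boost is small, and the `score` function and its `boost_enabled` flag are hypothetical stand-ins for the real weighting and configuration.

```python
# Hypothetical value: the actual size of the boost is not documented here.
INITIAL_CAPITAL_BOOST = 1.05

def has_initial_capital(term):
    """True for an uppercase first letter followed by a lowercase letter."""
    return len(term) >= 2 and term[0].isupper() and term[1].islower()

def score(query_term, doc_term, base_weight=1.0, boost_enabled=True):
    """Weight for one term occurrence; 0.0 when the terms do not match."""
    if query_term.upper() != doc_term.upper():
        return 0.0
    if (boost_enabled
            and has_initial_capital(query_term)
            and has_initial_capital(doc_term)):
        return base_weight * INITIAL_CAPITAL_BOOST
    return base_weight

for doc_term in ("Polish", "polish", "POLISH"):
    print(doc_term, score("Polish", doc_term))
```

In this sketch all three documents still match; the boost changes only their relative order.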
Case-insensitive search works in all major scripts. For example, a search for étude matches documents that contain that term in any case, and the same applies to terms in other scripts, such as the Greek αθήνα. AdvancedCaseSearch and AdvancedPlus work equally well in these languages.
Not all languages have the concept of case; Chinese, Japanese, Korean, Arabic, Hindi, and Hebrew, among others, do not have case and so they are not affected by case-sensitive settings.
By default, FieldText matching is also case-insensitive. For example, a FieldText restriction that specifies the value paris for the CITY field matches documents that contain that value in the field, regardless of case. IDOL converts the values given in the query and the field values that it matches to uppercase for comparison. If you require case-sensitive matching, you can set the CaseSensitive parameter to True for the query. For example, with this parameter set, a restriction that specifies the value Paris matches only documents containing exactly Paris in the given field. As for text searches, this functionality works for all major scripts.
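The difference between the default comparison and a comparison with CaseSensitive set to True can be modeled as below. This is a sketch of the comparison rule only: `field_matches` and the sample documents are hypothetical, and the `case_sensitive` argument stands in for the query parameter.

```python
def field_matches(query_value, field_value, case_sensitive=False):
    """Compare a query value with a field value, folding case by default."""
    if case_sensitive:
        return query_value == field_value
    return query_value.upper() == field_value.upper()

# Hypothetical documents, each with a CITY field.
documents = [
    {"id": 1, "CITY": "Paris"},
    {"id": 2, "CITY": "PARIS"},
    {"id": 3, "CITY": "paris"},
    {"id": 4, "CITY": "Lyon"},
]

default_hits = [d["id"] for d in documents if field_matches("paris", d["CITY"])]
exact_hits = [d["id"] for d in documents
              if field_matches("Paris", d["CITY"], case_sensitive=True)]
print(default_hits)  # [1, 2, 3]
print(exact_hits)    # [1]
```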
For improved performance, HPE always recommends that you perform these queries against MatchType fields (for example, for the previous query, HPE recommends that */CITY is configured as MatchType). Case-sensitive matching is performed differently with this setting, so we consider this separately below.
By default, FieldText matching against MatchType fields is not case-sensitive. For example, a FieldText restriction that specifies the value paris for the CITY field matches documents that contain that value in the field, regardless of case. IDOL converts the values in MatchType fields to uppercase at index time, and at query time it converts the field value in the query.
If you always require case-sensitive matching, you can set the CaseSensitiveMatchValues configuration parameter to True. You must either set this parameter before you index data, or use the RegenerateMatchIndex parameter after indexing. With this configuration, IDOL stores the values of MatchType fields in their original case, and it matches the value that you specify in the query exactly.
You cannot use the
CaseSensitive action parameter in combination with
By default, all field names are matched in a case-insensitive fashion. For example, a query that restricts on the value london in a named field matches occurrences of london in that field however its name is capitalized in the document, such as plAce, and so on. IDOL converts field names to uppercase at index time, as well as any field names given in configuration or action parameters.
If you require case-sensitive matching, you can set the CaseSensitiveFieldNames configuration parameter to True before you index content. In this configuration mode, IDOL stores field names in their original case, and it does not alter values given in configuration or action parameters.
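The two field-name modes can be contrasted with a small model. The `index_fields` and `lookup` helpers and the sample fields are hypothetical; the `case_sensitive_field_names` flag stands in for the CaseSensitiveFieldNames setting and is not an actual API.

```python
def index_fields(fields, case_sensitive_field_names=False):
    """Store (name, value) pairs, uppercasing names unless configured otherwise."""
    if case_sensitive_field_names:
        return list(fields)
    return [(name.upper(), value) for name, value in fields]

def lookup(stored, name, case_sensitive_field_names=False):
    """Return the values of every stored field whose name matches."""
    key = name if case_sensitive_field_names else name.upper()
    return [value for stored_name, value in stored if stored_name == key]

# Hypothetical fields from indexed documents.
fields = [("plAce", "london"), ("Place", "paris")]

folded = index_fields(fields)
print(lookup(folded, "place"))  # ['london', 'paris']

exact = index_fields(fields, case_sensitive_field_names=True)
print(lookup(exact, "plAce", case_sensitive_field_names=True))  # ['london']
print(lookup(exact, "place", case_sensitive_field_names=True))  # []
```

The sketch also shows why the setting has to be in place before indexing: once names have been folded to uppercase, the original spelling is no longer available to match against.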
By default, IDOL converts all values for parametric refinement to uppercase at index time. For example:
action=GetQueryTagValues&Text=some query text&FieldName=City
returns a list of values containing city names, for example BERLIN, regardless of the case that the value had in the original document. As a result, the action returns a single merged value for occurrences and counts of documents that contain Berlin, berlin and so on, rather than returning many values that differ only by case.
If you require case-sensitive parametric values, you can set the CaseSensitiveParametricValues configuration parameter to True. You must either set this parameter before you index data, or use the RegenerateParametricIndex parameter after indexing. With this configuration, IDOL stores the values of ParametricType fields in their original case, and it returns the values in their original form.
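The merging of parametric values can be sketched with a simple counter. The `parametric_values` helper and the sample values are hypothetical, and the `case_sensitive` argument stands in for the CaseSensitiveParametricValues setting rather than reproducing how IDOL computes its counts.

```python
from collections import Counter

def parametric_values(values, case_sensitive=False):
    """Count field values, folding them to uppercase unless case matters."""
    return Counter(v if case_sensitive else v.upper() for v in values)

# Hypothetical City values taken from four documents.
cities = ["Berlin", "berlin", "BERLIN", "Paris"]

merged = parametric_values(cities)
print(dict(merged))    # {'BERLIN': 3, 'PARIS': 1}

distinct = parametric_values(cities, case_sensitive=True)
print(dict(distinct))  # {'Berlin': 1, 'berlin': 1, 'BERLIN': 1, 'Paris': 1}
```

With folding, the three spellings of Berlin collapse into a single value with a combined count; without it, each spelling is reported separately.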