This topic describes several ways to query for numeric values, and the performance and configuration trade-offs of each.
The following examples consider a query for a value in the DocumentType field. This field is generated by IDOL KeyView, and contains a numeric value. The following table describes the different types of query you can use to search for a value of 230 in the DocumentType field.
Query Name | Query Example | IDOL Configuration |
---|---|---|
Unoptimized EQUAL | action=Query&FieldText=EQUAL{230}:DocumentType | None |
Optimized EQUAL | action=Query&FieldText=EQUAL{230}:DocumentType | DocumentType field set as NumericType. |
Unoptimized MATCH | action=Query&FieldText=MATCH{230}:DocumentType | None |
Optimized MATCH | action=Query&FieldText=MATCH{230}:DocumentType | DocumentType field set as MatchType. |
Numeric Text Query | action=Query&Text=230:DocumentType | DocumentType field set as Index field type. |
FieldCheck | action=Query&Text=*&FieldCheck=230 | DocumentType field set as FieldCheckType. |
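The query strings in the table can be assembled programmatically. The following Python sketch percent-encodes each example so that it can be sent over HTTP. The host, port, and the helper name build_query are assumptions for illustration only; the action and parameter values are taken from the table.

```python
from urllib.parse import urlencode

# Hypothetical server address; replace with your own host and ACI port.
BASE_URL = "http://localhost:9100/"


def build_query(**params: str) -> str:
    """Return a Query action URL with the given parameters percent-encoded."""
    query = {"action": "Query", **params}
    return BASE_URL + urlencode(query)


# One entry per query style from the table. The optimized and unoptimized
# variants share the same query string; only the server configuration differs.
QUERIES = {
    "EQUAL": build_query(FieldText="EQUAL{230}:DocumentType"),
    "MATCH": build_query(FieldText="MATCH{230}:DocumentType"),
    "Numeric text": build_query(Text="230:DocumentType"),
    "FieldCheck": build_query(Text="*", FieldCheck="230"),
}

if __name__ == "__main__":
    for name, url in QUERIES.items():
        print(f"{name}: {url}")
```

Encoding the braces and colon avoids problems with HTTP clients that reject or rewrite these characters in a URL.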
In an unoptimized EQUAL query, you use the FieldText EQUAL operator on a field that has no special configuration. This option might make sense when you very rarely query against the field, and when saving RAM and disk space is more important than query performance.
EQUAL queries work only against numeric values (such as 12554, 43.27, -3712.39). Do not use these queries for arbitrary strings (such as dog, cat, apples). Additionally, EQUAL matches all numeric variations of a value, so that 1 is the same as 01, 001, and 1.0.
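The numeric equivalence described above can be illustrated with a short Python sketch. This is only a model of the behavior stated in the text, not the server's implementation; the function name numeric_equal is hypothetical.

```python
from decimal import Decimal, InvalidOperation


def numeric_equal(query_value: str, field_value: str) -> bool:
    """Compare two strings as numbers, as the text describes for EQUAL."""
    try:
        return Decimal(query_value) == Decimal(field_value)
    except InvalidOperation:
        # Arbitrary strings such as "dog" cannot be compared numerically.
        return False


if __name__ == "__main__":
    for variant in ["1", "01", "001", "1.0"]:
        # Each variant is numerically equal to 1, although the strings differ.
        print(variant, numeric_equal("1", variant), "1" == variant)
    print("dog", numeric_equal("1", "dog"))
```

A plain string comparison distinguishes 1 from 001, which is why numeric matching and string matching can return different documents for the same field.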
In optimized EQUAL queries, you use the FieldText EQUAL operator on a field that you have configured with the NumericType property. NumericType fields have the additional advantage that they are optimized for the RANGE, GREATER, and LESS operators, which allow you to search for a range of numeric values.
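As a minimal sketch, the following Python helper builds FieldText expressions for the numeric operators named above. It assumes the same operator{values}:field layout as the EQUAL examples in the table, with RANGE taking a lower and an upper bound; the helper name field_text and the bounds 200 and 300 are illustrative only.

```python
def field_text(operator: str, field: str, *values: int) -> str:
    """Render a FieldText expression such as RANGE{200,300}:DocumentType."""
    rendered = ",".join(str(value) for value in values)
    return f"{operator}{{{rendered}}}:{field}"


if __name__ == "__main__":
    print(field_text("EQUAL", "DocumentType", 230))
    print(field_text("RANGE", "DocumentType", 200, 300))
    print(field_text("GREATER", "DocumentType", 200))
    print(field_text("LESS", "DocumentType", 300))
```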
Optimized EQUAL queries are much faster than the unoptimized equivalent for exact-value matches, at the cost of RAM and disk space. Memory usage is proportional to the number of NumericType fields, and the number of instances of those fields. You can limit the memory usage for a field by using the NumericNormalMaxMem configuration parameter. Disk usage is the same as the memory usage.
EQUAL queries work only against numeric values (such as 12554, 43.27, -3712.39). Do not use these queries for arbitrary strings (such as dog, cat, apples). Additionally, EQUAL matches all numeric variations of a value, so that 1 is the same as 01, 001, and 1.0.
In an unoptimized MATCH query, you use the FieldText MATCH operator on a field that has no special configuration. In general, Micro Focus does not recommend using MATCH for numeric value matching, because the EQUAL operator generally uses less memory than MATCH for the same exact value match. However, you might need to use MATCH in the case where you need to match numeric values as strings, for example when you treat 001 as different to 1. In this case, an unoptimized MATCH query might be appropriate when you very rarely query against the field, and when saving RAM and disk space is more important than query performance.
In optimized MATCH queries, you use the FieldText MATCH operator on a field that you have configured with the MatchType property. In general, Micro Focus does not recommend using MATCH for numeric value matching, because the EQUAL operator generally uses less memory than MATCH for the same exact value match. However, you might need to use MATCH in the case where you need to match numeric values as strings, for example when you treat 001 as different to 1. In this case, optimized MATCH queries are much faster than the unoptimized equivalent for exact-value matches, at the cost of RAM and disk space. In particular:
Memory usage is proportional to the number of fields and the number of unique values in each field. In the same way as for NumericType fields, you can limit the memory for a field by using the NumericNormalMaxMem configuration parameter.
Disk usage is proportional to the number of fields and the number of unique values.
In a numeric text query, you use the Text parameter with a field restriction, for example Text=230:DocumentType. For this query, the field must be an Index type field. There is no equivalent to optimized or unoptimized for this type of query.
For numeric queries of this type on an Index field, you must have IndexNumbers set to 1. If you do not want to use this setting in general, you can use the IndexNumbersType field property for individual fields to use a more restrictive value (for example, you might set IndexNumbers to 1 for your system, and then configure some fields to have IndexNumbers set to 2 to index only non-numeric and alphanumeric terms).
When using this type of query, bear in mind that if the numeric values occur in other Index fields, it might affect the term weight and query relevance. You must also take care with character settings such as TangibleCharacters when negative and decimal values are possible.
For this type of query, consider the following notes on performance and system usage:
There is no appreciable memory usage for index terms.
Disk usage is proportional to the total number of terms, and the number of occurrences.
Indexing numeric terms often expands the number of unique terms in the index overall, which affects both query and index performance. Querying for a large number of terms (independent of their individual occurrences) generally has the most negative effect on performance.
In some cases where you only need to use one field name, you can use FieldCheck in a similar way to EQUAL or MATCH. You can only query for one value at a time, which means that you cannot use Boolean expressions to find multiple values. Also, you can only configure one field as FieldCheck in a server, so that if you have multiple applications operating on the same index, they must use the same FieldCheck field. Memory and disk usage are not affected by the FieldCheck field.
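Because FieldCheck accepts a single value, a search for several values has to use FieldText instead. The following Python sketch combines one EQUAL expression per value, assuming that FieldText expressions can be joined with an OR Boolean operator; the helper name any_of and the value 231 are illustrative only.

```python
from typing import Iterable


def any_of(field: str, values: Iterable[int]) -> str:
    """Join one EQUAL expression per value with an assumed OR operator."""
    return " OR ".join(f"EQUAL{{{value}}}:{field}" for value in values)


if __name__ == "__main__":
    # FieldCheck=230 can name only one value; FieldText can combine several.
    print(any_of("DocumentType", [230, 231]))
```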