Case Sensitive Matches

You can configure Eduction to match characters case sensitively or case insensitively. By default, it is case sensitive, which has better performance.

The simplest way to match case insensitively is to set the MatchCase configuration parameter to False in the configuration file. Alternatively, if you create your own custom XML grammar files, you can configure individual grammars, entities, and entries to be case sensitive or case insensitive. A setting at a lower level overrides the settings at higher levels. Additionally, if you reference an entity from another entity, the referenced entity keeps its own case sensitivity setting.
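The override rule can be pictured with a short Python sketch. This is an illustration of the precedence described above, not Eduction code: the function name and its arguments are hypothetical, and it assumes that the MatchCase parameter acts as the top-level default that grammar, entity, and entry settings can override.

```python
from typing import Optional


def effective_match_case(
    match_case: bool = True,          # assumed global default (MatchCase)
    grammar: Optional[bool] = None,   # None means "not set at this level"
    entity: Optional[bool] = None,
    entry: Optional[bool] = None,
) -> bool:
    """Return the case sensitivity that applies to a single entry.

    The most specific level that has an explicit setting wins:
    entry, then entity, then grammar, then the global default.
    """
    for setting in (entry, entity, grammar):
        if setting is not None:
            return setting
    return match_case


# An entity marked case sensitive keeps that setting even when the
# global default is case insensitive.
print(effective_match_case(match_case=False, entity=True))
```

In this sketch, leaving a level as None corresponds to not setting case sensitivity explicitly at that level, so the decision falls through to the next level up.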

Most entities in the standard grammars do not have case sensitivity set explicitly, giving you the flexibility to use case sensitivity as required in your grammars.

NOTE:

If you design an entity for case-insensitive matching, it is important that entries in the entity have a consistent case style to ensure that all matches are extracted correctly. You should use all lower case, all upper case, or all initial capitals, but not a mixture. Eduction uses an optimization technique for case insensitive matching that might not extract every possible match if the entity is not defined consistently.

Case Insensitive Match Performance

Case sensitive matching generally has better performance than case insensitive matching. If you require case insensitive matching, you can use case normalization to give the same performance as case-sensitive matching.

When you use case normalization, Eduction converts the input data to a single case before it performs the (case sensitive) matching. Because your input and your grammars are then all in the same case, the matching is effectively case insensitive, with the performance benefits of case sensitive matching.
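The idea behind case normalization can be sketched in a few lines of Python. This is a conceptual illustration only, not how Eduction is implemented: the entry list and the find_matches function are hypothetical, and the sketch assumes that all entries are written in lower case and that lowercasing does not change the length of the input (true for ASCII text).

```python
import re

# Hypothetical entity entries, all written in one case (lower case).
ENTRIES = ["london", "new york", "paris"]


def find_matches(text, entries, normalize=True):
    """Case sensitive search, optionally over a lowercased copy of the text.

    Returns (offset, original_text) pairs. Offsets found in the normalized
    copy are reused on the original text, which assumes that lowercasing
    preserves string length.
    """
    haystack = text.lower() if normalize else text
    pattern = re.compile("|".join(re.escape(entry) for entry in entries))
    return [
        (m.start(), text[m.start():m.end()])
        for m in pattern.finditer(haystack)
    ]


sample = "Flights from LONDON to New York"
print(find_matches(sample, ENTRIES))                   # normalized input
print(find_matches(sample, ENTRIES, normalize=False))  # raw input
```

The matcher itself never does a case insensitive comparison. The only extra cost is the single pass that lowercases the input, which is why the approach can keep the performance characteristics of case sensitive matching.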

When to Configure Case Sensitivity

HPE recommends that you always create and use Eduction grammars that allow you to do case sensitive matching, because it has better performance. Most of the standard grammars come with entities using common and appropriate case styles. Some also have different entities for different case styles. If your data uses a consistent case, it is unlikely that you need to use case insensitive matching.

