An Eduction grammar defines patterns for matching text in a document. A pattern is a combination of characters and operators. An operator is a sequence of special characters that matches text according to the rules associated with that operator.
Pattern | Description | Matches |
---|---|---|
Smith\|John | Match either Smith or John | Smith, John |
[0-9]{3} | Match a sequence of three characters in the range 0 through 9 | 123, 456 |
In the second example, the square bracket operator [] matches any one of the characters 0 through 9, and the curly brace operator {} repeats the preceding pattern three times.
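The two patterns in the table can be tried out with an ordinary regular expression engine. The sketch below uses Python's `re` module as a stand-in, on the assumption that the alternation and repetition operators behave as they do in conventional regular expressions; it does not call Eduction, and the sample sentence is made up for illustration.

```python
import re

# Assumption: Python's `re` module stands in for the Eduction engine.
# Only the two operators from the table are exercised here.
text = "John Smith called extension 123, then 456."

# Alternation: match either "Smith" or "John".
names = re.findall(r"Smith|John", text)

# Character range plus repetition: exactly three characters from 0 through 9.
digits = re.findall(r"[0-9]{3}", text)

print(names)   # ['John', 'Smith']
print(digits)  # ['123', '456']
```

Matches are reported in the order they occur in the text, which is why John appears before Smith.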
Grammars are described using XML. The template that defines the XML that Eduction understands is contained in the file edk.dtd. When writing grammars for Eduction, Micro Focus recommends that you reference edk.dtd at the start of the XML grammar file using the include statement, and that you use a DTD-compatible XML authoring tool to eliminate syntax errors and save time.
Here is an example of a simple Eduction grammar:

    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE grammars SYSTEM "edk.dtd">
    <grammars>
      <grammar name="mygrammar">
        <entity name="name" type="public">
          <pattern>Smith|John</pattern>
        </entity>
        <entity name="digits" type="public">
          <pattern>[0-9]{3}</pattern>
        </entity>
      </grammar>
    </grammars>

This grammar defines two entities: mygrammar/name and mygrammar/digits.
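The qualified entity names combine the grammar name and the entity name. As a rough illustration of that naming, the following sketch reads the example grammar with Python's standard `xml.etree.ElementTree` parser and lists each entity with its patterns. The helper `list_entities` is hypothetical and not part of Eduction, and the parser only checks that the XML is well formed; it does not validate the file against edk.dtd.

```python
import xml.etree.ElementTree as ET

# The example grammar from this page, held as bytes so that the
# XML declaration's encoding attribute is honored by the parser.
GRAMMAR_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE grammars SYSTEM "edk.dtd">
<grammars>
  <grammar name="mygrammar">
    <entity name="name" type="public">
      <pattern>Smith|John</pattern>
    </entity>
    <entity name="digits" type="public">
      <pattern>[0-9]{3}</pattern>
    </entity>
  </grammar>
</grammars>
"""


def list_entities(xml_bytes):
    """Hypothetical helper: map 'grammar/entity' names to their patterns."""
    root = ET.fromstring(xml_bytes)  # root element is <grammars>
    result = {}
    for grammar in root.findall("grammar"):
        for entity in grammar.findall("entity"):
            qualified = f"{grammar.get('name')}/{entity.get('name')}"
            result[qualified] = [p.text for p in entity.findall("pattern")]
    return result


entities = list_entities(GRAMMAR_XML)
for name, patterns in entities.items():
    print(name, patterns)
```

A quick check like this can catch malformed XML or a misspelled entity name before the grammar is handed to Eduction, but a DTD-aware authoring tool remains the way to catch structural errors.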
For full details of the Eduction grammar XML syntax, and the edk.dtd, see Grammar Format Reference.
For a more extensive set of example Eduction grammar files, see Example Grammar Files.