Structure of Regular Expressions

Silk Performer uses a regular expression implementation equivalent to the one found in 4.x BSD Unix systems.

The table below describes the structure of regular expressions and shows which characters may be used:

Character Description
[1] char Matches itself, unless it is a special character (metacharacter): . \ [ ] * + ^ $
[2] . Matches any character
[3] \ Matches the subsequent character literally, unless that character is a left or right round bracket, a digit 1 to 9, or a left or right angle bracket (see [7], [8] and [9]). It acts as the escape character for all other metacharacters and for itself. When used in a set (see [4]), it is treated as an ordinary character.

[4] [set]

Matches one of the characters in the set. If the first character in the set is "^", it matches any character NOT in the set. The shorthand “S-E” specifies the range of characters from S to E, inclusive. The special characters "]" and "-" have no special meaning if they appear as the first characters in the set.

Examples:

[a-z] any lowercase letter

[^]-] any character except ] and -

[^A-Z] any character except uppercase letters

[a-zA-Z] any letter

[5] * Any regular expression of form [1] to [4], followed by the closure character (*), matches zero or more consecutive occurrences of that form.
[6] + Same as [5], except that it matches one or more occurrences.
[7] \( \) A regular expression of form [1] to [10] enclosed as \(form\) matches whatever the enclosed form matches. The enclosure creates a tag, which is used by [8] and for pattern substitution. Tagged forms are numbered starting from 1.
[8] \1 to \9 A \ character followed by a digit 1 to 9 matches whatever the correspondingly numbered tagged regular expression ([7]) matched. For example, “\5” matches the text matched by the fifth “\(form\)” in the pattern.
[9] \< \> A regular expression starting with \< and/or ending with \> restricts the pattern matching to the beginning and/or the end of a word. A word is a string of the characters A-Z, a-z, 0-9 and _ that is preceded and/or followed by a character outside that group.
[10] A composite regular expression xy, where x and y are of form [1] to [10], matches the longest match of x followed by a match for y.
[11] ^ $ A regular expression starting with a ^ character and/or ending with a $ character restricts the pattern matching to the beginning of the line and/or the end of the line. Elsewhere in the expression, ^ and $ are treated as ordinary characters.
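
The behavior of several constructs in the table can be tried out in any engine with comparable features. The sketch below uses Python's re module, which is an assumption made purely for illustration: it is not the Silk Performer engine, it writes tagged forms as (form) rather than \(form\), and its \b only approximates \< and \>. The helper name matches is hypothetical.

```python
import re

def matches(pattern: str, text: str) -> bool:
    """Return True if the whole of `text` matches `pattern` (hypothetical helper)."""
    return re.fullmatch(pattern, text, re.ASCII) is not None

# [4] Sets, including the negated set from the examples above.
print(matches(r"[a-zA-Z]", "Q"))        # True
print(matches(r"[^]-]", "-"))           # False

# [5] and [6] Closures: * accepts zero repetitions, + needs at least one.
print(matches(r"ab*", "a"))             # True
print(matches(r"ab+", "a"))             # False

# [7] and [8] Tagged form and back-reference; Python writes (form) for \(form\).
print(matches(r"(ab*)-\1", "abb-abb"))  # True
print(matches(r"(ab*)-\1", "abb-ab"))   # False

# [9] Word boundaries; Python's \b stands in for both \< and \>.
print(re.search(r"\bfoo\b", "a foo b", re.ASCII) is not None)  # True
print(re.search(r"\bfoo\b", "food", re.ASCII) is not None)     # False
```

The back-reference case shows why a tagged form differs from simply repeating the sub-pattern: \1 must match the same text that the tagged form matched, not just text of the same shape.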

Example

Pattern            Matches                                    Comment
\\\.\*             \.*                                        backslash dot asterisk
\[\]\+\^\$         []+^$                                      other special characters
foo*.*             fo foo fooo foobar fobar foxx ...
fo[ob]a[rz]        fobar fooar fobaz fooaz
foo\\+             foo\ foo\\ foo\\\ ...
\(foo\)[1-3]\1     foo1foo foo2foo foo3foo                    the same as: foo[1-3]foo
\(fo.*\)-\1        foo-foo fo-fo fob-fob foobar-foobar ...
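
To experiment with the patterns in this table outside Silk Performer, one option is to translate them into the syntax of another engine. The following is a minimal sketch that assumes Python's re module as the target. The function bsd_to_python is a hypothetical helper, not part of Silk Performer, and it only covers the constructs [1] to [11] described above. Python's engine may still differ in edge cases, for example by allowing a closure after a tagged form.

```python
import re

def bsd_to_python(pattern: str) -> str:
    """Translate a pattern in the syntax of the table into Python `re` syntax.

    Hypothetical helper for experimentation only.
    """
    out = []
    i, n = 0, len(pattern)
    while i < n:
        c = pattern[i]
        if c == "\\" and i + 1 < n:
            nxt = pattern[i + 1]
            if nxt in "()":                  # [7] \( and \) delimit a tagged form
                out.append(nxt)
            elif nxt in "123456789":         # [8] back-reference to a tagged form
                out.append("(?:\\" + nxt + ")")
            elif nxt == "<":                 # [9] beginning of a word
                out.append(r"\b(?=\w)")
            elif nxt == ">":                 # [9] end of a word
                out.append(r"\b(?<=\w)")
            else:                            # [3] escaped literal character
                out.append(re.escape(nxt))
            i += 2
        elif c == "[":                       # [4] set
            j = i + 1
            if j < n and pattern[j] == "^":
                j += 1
            if j < n and pattern[j] == "]":  # a leading "]" is a set member
                j += 1
            while j < n and pattern[j] != "]":
                j += 1
            if j >= n:
                raise ValueError("unterminated set")
            body = pattern[i + 1:j]
            negate = body.startswith("^")
            members = body[1:] if negate else body
            # Backslash is ordinary inside a set here, so escape it for Python.
            members = "".join("\\" + m if m in "\\[]" else m for m in members)
            out.append("[" + ("^" if negate else "") + members + "]")
            i = j + 1
        elif c == "^":                       # [11] anchor only at the very start
            out.append("^" if i == 0 else r"\^")
            i += 1
        elif c == "$":                       # [11] anchor only at the very end
            out.append("$" if i == n - 1 else r"\$")
            i += 1
        elif c in ".*+":                     # [2], [5], [6] mean the same in Python
            out.append(c)
            i += 1
        else:                                # [1] ordinary character
            out.append(re.escape(c))
            i += 1
    return "".join(out)

# Show the translation of a few patterns, then try one against sample text.
for pattern in (r"\(fo.*\)-\1", r"\<foo\>", r"a^b"):
    print(pattern, "->", bsd_to_python(pattern))

tagged = re.compile(bsd_to_python(r"\(fo.*\)-\1"), re.ASCII)
print(bool(tagged.fullmatch("foobar-foobar")))
```

The re.ASCII flag is used so that Python's notion of a word character stays close to the A-Z a-z 0-9 _ definition given for [9].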