The UNIX Option uses regular expressions to perform the search/replace operations that are possible during both the import and publish operations.
During search/replace operations, each text file is read as a series of lines. Each line is processed by applying all of the applicable search/replace patterns to it. The patterns are applied in the order in which they were specified in the UNIX Option Setup.
Note: Because each line is processed individually, it is not possible to write a pattern that can search across multiple lines.
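The line-by-line processing described above can be sketched in Python. This is only an illustration using Python's `re` module, not the UNIX Option's own engine; the function name `apply_rules` and the sample rules are hypothetical, and the sketch assumes that each later pattern sees the output of the earlier ones.

```python
import re

def apply_rules(text, rules):
    """Apply each (search, replace) rule, in order, to every line of text."""
    compiled = [(re.compile(search), replace) for search, replace in rules]
    out_lines = []
    for line in text.splitlines():
        # Line terminators are removed before matching,
        # so no pattern can ever span two lines.
        for pattern, replacement in compiled:
            line = pattern.sub(replacement, line)
        out_lines.append(line)
    return "\n".join(out_lines)

# Hypothetical rules, applied in the order listed.
rules = [
    (r"\.htm$", ".html"),
    (r"^index", "home"),
]

result = apply_rules("index.htm\nabout.htm", rules)
print(result)

# A pattern that tries to span a line break never matches.
unchanged = apply_rules("a\nb", [(r"a\nb", "X")])
print(unchanged == "a\nb")
```

Because each line is handed to the patterns on its own, the second call leaves the text untouched even though `a\nb` appears in the file as a whole.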
The syntax for the regular expressions is very similar to the syntax used by the UNIX grep command. Regular expressions include both normal characters and metacharacters. Metacharacters have a special meaning, or change the meaning of other characters. For example, if you have used the DOS prompt on a PC, you will be familiar with the dir *.* command; in this case, the asterisks (wildcards) are metacharacters, each of which stands for zero or more normal characters.
The following metacharacters are supported for defining search patterns:
| Metacharacter | Meaning |
|---|---|
| ^ | Matches the start of the line. Inside a character class, it negates the class. |
| $ | Matches the end of the line |
| . | Matches any character |
| [ | Start of character class |
| ] | End of character class |
| * | Matches 0 or more occurrences of the preceding regular expression |
| + | Matches 1 or more occurrences of the preceding regular expression |
| ? | Matches exactly 0 or 1 occurrence of the preceding regular expression |
| \| | Matches the expression on either its left side or its right side |
| ( | Start of substring |
| ) | End of substring |
| " | Delimiter character for a literal string |
| \ | Escape character |
The following metacharacters are supported for defining replace patterns:
| Metacharacter | Meaning |
|---|---|
| & | The string that the search pattern matched. If it is followed by a number (n) between 1 and 9, it is the string that matched substring number n. |
| \ | Escape character |
The escape character is used to escape the special meaning of metacharacters. For example, a search pattern of $HOME would fail because the $ character has a special meaning. To make this work correctly, you would specify the pattern as \$HOME; the backslash indicates that the special meaning of the character that follows it should be ignored.
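The same behaviour can be observed in Python's `re` module, which treats `$` and `\` in a comparable way. This is only an analogy, not the UNIX Option's engine; the sample line is made up for illustration.

```python
import re

line = "cd $HOME/bin"

# Unescaped: $ is an end-of-line anchor, so nothing can follow it
# and the pattern never matches this line.
unescaped = re.search(r"$HOME", line)

# Escaped: \$ stands for a literal dollar sign.
escaped = re.search(r"\$HOME", line)

print(unescaped)         # None
print(escaped.group(0))  # $HOME
```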
In addition, the escape character is used to define some special characters that are difficult or impossible to represent otherwise. These are termed escape sequences and the following are recognized:
| Escape Sequence | Meaning |
|---|---|
| \b | Backspace |
| \e | ASCII escape character |
| \f | Form feed |
| \n | New line |
| \r | Carriage return |
| \s | Space |
| \t | Tab |
| \\ | Backslash character |
| \ddd | Character specified by 1 - 3 octal digits (d) |
| \xdd | Character specified by 1 - 2 hexadecimal digits (d) |
| \x^c | Control character specified by letter (c) |
The filename patterns on the search/replace dialogs use a standard UNIX-style wildcard matching syntax instead of full regular expressions. The following metacharacters are recognized:
| Metacharacter | Meaning |
|---|---|
| * | Any string of 0 or more characters |
| ? | Any single character |
| [] | Define a character class for a single character |
| \ | Escape any of the previous special characters. Use \\ to match a backslash |
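As a rough illustration of this wildcard style, Python's standard `fnmatch` module implements a similar syntax for `*`, `?` and `[]`. It is not the matcher used by the UNIX Option, and unlike the table above it does not treat the backslash as an escape character. The file names below are made up.

```python
from fnmatch import fnmatchcase

# Hypothetical file names used only for this illustration.
names = ["report.txt", "report1.txt", "reportA.txt", "notes.doc"]

def matching(pattern):
    """Return the names that match a wildcard pattern (case-sensitive)."""
    return [name for name in names if fnmatchcase(name, pattern)]

print(matching("*.txt"))            # any name ending in .txt
print(matching("report?.txt"))      # exactly one character after "report"
print(matching("report[0-9].txt"))  # exactly one digit after "report"
```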
The following examples introduce the various metacharacters.
| Search Pattern | Meaning |
|---|---|
| ^Start | Matches the word Start if it is the first thing on the line of text. |
| End$ | Matches the word End if it is the last thing on a line of text. Note: The UNIX Option does not pass the line termination characters (for example, CRLF or LF) to the search pattern. |
| file\.dat | Matches the exact string file.dat anywhere on the line of text. Note: The escape character is used before the . because the period is a metacharacter. |
| file.\.dat | This is an example of a metacharacter. The . matches any one valid character. This pattern matches strings such as filea.dat, fileX.dat, file9.dat, and so on. |
| file..\.dat | Metacharacters can be used multiple times. This example matches any string that contains file, followed by exactly two characters, followed by .dat. |
| file..?\.dat | This is an example of a repeating metacharacter. The ? character matches exactly 0 or 1 occurrences of the previous regular expression, which in this case is a . metacharacter. This example therefore matches any string that contains file, followed by 1 or 2 other characters, followed by .dat. |
| file.*\.dat | This example contains another repeating metacharacter. The * matches 0 or more of the preceding regular expression, which again is a . metacharacter. This example matches file, followed by any number of valid characters, followed by .dat. |
| file[ABC]\.dat | This is an example of a character class. A character class contains a list of valid characters, in this case the letters A, B and C. This pattern matches fileA.dat, fileB.dat or fileC.dat. |
| file[0-9]\.dat | A character class can contain a range of characters; this is specified using a hyphen. This example defines a character class that matches any digit from 0 to 9. This pattern matches file, followed by a numeric digit, followed by .dat. |
| file[0-9A-F]+\.dat | This example is the most complex so far. The character class contains two ranges, 0 through 9 and A through F; that is, a hexadecimal digit. The + metacharacter matches 1 or more of the preceding regular expression, which is the character class. This pattern therefore matches file, followed by 1 or more hexadecimal digits, followed by .dat. |
| file(\.dat)? | Substrings can be used to group multiple characters together into one logical regular expression. In this example, the \.dat pattern is within a substring that is followed by a ? metacharacter. The ? matches exactly 0 or 1 occurrences of the preceding regular expression, which in this case is the entire substring. This pattern therefore matches file or file.dat. Note: Without the substring, the pattern file\.dat? would match file.da or file.dat. |
| file\.(dat)\|(idx) | This example contains substrings and the option metacharacter \|. The option metacharacter matches either the regular expression on the left or the regular expression on the right. This pattern matches file. followed by dat or idx. |
| "file.dat" | When a search string is enclosed in double quotes, all other metacharacters within the quotes (except the escape character) are ignored. This example matches file.dat. |
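Several of the search patterns above can be tried out with Python's `re` module, which shares this part of the syntax. This is a sketch for experimentation only: the sample lines are made up, and the UNIX Option's engine may differ in details. In particular, the double-quote literal syntax and the grouping of the | metacharacter are not reproduced here.

```python
import re

# (pattern, sample line, whether a match is expected)
cases = [
    (r"^Start", "Start here", True),
    (r"^Start", "Do not Start", False),
    (r"End$", "The End", True),
    (r"file\.dat", "copy file.dat now", True),
    (r"file\.dat", "copy fileXdat now", False),
    (r"file..?\.dat", "file12.dat", True),
    (r"file..?\.dat", "file.dat", False),
    (r"file[0-9A-F]+\.dat", "file9F.dat", True),
    (r"file[0-9A-F]+\.dat", "fileG.dat", False),
]

# re.search looks for the pattern anywhere on the line.
outcomes = [re.search(pattern, line) is not None for pattern, line, _ in cases]

for (pattern, line, expected), found in zip(cases, outcomes):
    print(f"{pattern!r:26} {line!r:22} match={found}")

# A quoted literal such as "file.dat" has no direct Python equivalent;
# re.escape() gives a similar effect by neutralising every metacharacter.
literal = re.escape("file.dat")
literal_hits = [bool(re.search(literal, s)) for s in ("file.dat", "fileXdat")]
print(literal_hits)
```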
The true power of regular expressions becomes apparent when you can replace whatever it is that you matched as part of the search. The substring operator is essential for you to be able to set the focus on whatever it is that you want to replace.
| Search Pattern | Replace Pattern | Comment |
|---|---|---|
| "file.dat" | newfile | Searches for a literal string and replaces it directly with a different literal string. |
| (.*)\.htm | &1.html | Searches for any string ending in .htm and replaces it with the text that substring 1 matched (everything before .htm), followed by .html. |
| \"file([0-9A-F]+)\.dat\" | "newname&1.data" | This search statement is an extension of one of the previous examples. It searches for a hexadecimal-based filename within quotes. The quotes are escaped, and the substring delimiters around the hexadecimal digits set the focus on the part we want to keep. The replacement string is newname followed by the hexadecimal digits from the search string, then the new extension. So, "file9F.dat" would become "newname9F.data". |
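For comparison, the last two rows can be reproduced with Python's `re.sub`. This is a hand translation, not the UNIX Option's own syntax: Python writes the substring reference &1 as `\g<1>`, and a double quote needs no escaping in a Python pattern. The sample lines are made up.

```python
import re

# (.*)\.htm  ->  &1.html
renamed = re.sub(r"(.*)\.htm", r"\g<1>.html", "index.htm")
print(renamed)   # index.html

# \"file([0-9A-F]+)\.dat\"  ->  "newname&1.data"
line = 'open "file9F.dat" for input'
search = r'"file([0-9A-F]+)\.dat"'
replacement = r'"newname\g<1>.data"'
new_line = re.sub(search, replacement, line)
print(new_line)  # open "newname9F.data" for input
```

Only the text captured by the substring is carried over; everything else in the match is replaced by the literal text of the replace pattern.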
Copyright © 1998 Micro Focus Limited. All rights reserved.
This document and the proprietary marks and names
used herein are protected by international law.