Extracts (or captures) a value based on the specified regular expression, or extracts and substitutes a value based on the specified “sed” expression. The value can come from a previously specified field in the query or from the raw event message.
Synopsis
... | rex <regular_expression containing a field name>
Or
... | rex field=<field> mode=sed “s/<string to be substituted>/<substitution value>”
Understanding How Extraction Works
When the value is extracted based on a regular expression, the extracted value is assigned to a field name, which is specified as part of the regular expression. The syntax for defining the field name is ?<fieldname>, where fieldname is a string of alphanumeric characters. Using an underscore (“_”) is not recommended.
For example, consider the following event:
[Thu Jul 30 01:20:06 2009] [error] [client 69.63.180.245] PHP Warning: Can't connect to 10.4.31.4:11211
If you want to extract an IP address from the above event and assign it to a field called IPAddress, specify the following rex expression:
| rex “(?<IPAddress>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})”
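The behavior of this pattern can be tried outside the product. The following is a minimal sketch in Python, not the rex implementation: Python's re module writes a named group as (?P<name>...) rather than (?<name>...), and the variable names are illustrative only. It applies the same pattern to the sample event and shows that a single search returns the leftmost IP address, although the event contains two.

```python
import re

event = ("[Thu Jul 30 01:20:06 2009] [error] [client 69.63.180.245] "
         "PHP Warning: Can't connect to 10.4.31.4:11211")

# Same pattern as the rex expression; only the named-group spelling differs.
ip_pattern = re.compile(r"(?P<IPAddress>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})")

first = ip_pattern.search(event)     # leftmost match only
all_ips = ip_pattern.findall(event)  # every non-overlapping match

print(first.group("IPAddress"))
print(all_ips)
```

Because the pattern carries no surrounding context, it cannot tell the client address from the destination address.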
However, if you want to extract only the IP address that follows the word “client” in the same event and assign it to a field called SourceIP, you need to specify a start point and an end point for the extraction so that the second IP address in the event is not captured. The start point in this event can be [client and the end point can be ]. Thus, the rex expression is:
| rex “\[client (?<SourceIP>[^\]]*)”
In this rex expression, ?<SourceIP> defines the field name that captures the IP address, and \[client specifies the text in the event AFTER which data is extracted. The [^\]]* expression matches every character that is not a closing bracket. For the example event, the match therefore stops at the end of the first IP address and does not reach the second IP address, which appears after the word “to” in the event message.
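To see how the literal prefix and the negated character class bound the capture, here is a small Python sketch under the same caveat as any out-of-product test: it uses Python's (?P<name>...) spelling for the named group, and the helper name extract_source_ip is hypothetical.

```python
import re

# \[client anchors the start; [^\]]* stops at the first closing bracket.
client_pattern = re.compile(r"\[client (?P<SourceIP>[^\]]*)")

def extract_source_ip(event):
    """Return the text between '[client ' and the next ']', or None."""
    match = client_pattern.search(event)
    return match.group("SourceIP") if match else None

event = ("[Thu Jul 30 01:20:06 2009] [error] [client 69.63.180.245] "
         "PHP Warning: Can't connect to 10.4.31.4:11211")

source_ip = extract_source_ip(event)
no_client = extract_source_ip("PHP Warning: Can't connect to 10.4.31.4:11211")
print(source_ip, no_client)
```

Note that [^\]]* accepts any characters, not only digits and dots, so the capture is only as clean as the text inside the brackets.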
Understanding How Substitution Works
When the rex operator is used in sed mode, you can substitute the values of extracted fields with the values you specify. For example, if you are generating a report of events that contain credit card numbers, you might want to substitute the credit card numbers to obfuscate the real numbers.
The substitution only occurs in the search results. The actual event is not changed.
In the following example, the credit card numbers in the CCN field are substituted with “xxxx”, thus obfuscating sensitive data:
| rex field=CCN mode=sed “s/.*/xxxx/g”
The “/g” at the end of the command indicates a global replace, that is, all occurrences of the specified pattern will be replaced in all matching events. If “/g” is omitted, only the first occurrence of the specified pattern in each event is replaced.
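The difference between a global and a first-occurrence replacement can be illustrated with Python's re.sub, whose count argument plays a role similar to omitting “/g”. This is a sketch only; the card number is a made-up sample value and the \d{4} pattern is chosen for the illustration, not taken from the command above.

```python
import re

value = "4111-1111-1111-1111"  # made-up sample value
pattern = r"\d{4}"

# Comparable to s/\d{4}/xxxx/g: every match is replaced.
global_replace = re.sub(pattern, "xxxx", value)

# Comparable to s/\d{4}/xxxx/ without g: only the first match is replaced.
first_only = re.sub(pattern, "xxxx", value, count=1)

print(global_replace)
print(first_only)
```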
Multiple substitutions can be performed in a single command, as shown in the following example. In this example, the word “Authentication” in the msg field is substituted with “xxxx” globally (for all matching events), every occurrence of “192” in the agentAddress field is substituted with “xxxx”, and in the dst field everything from the first occurrence of “10” to the end of the value is substituted with “xxxx”.
| rex field=msg mode=sed “s/Authentication/xxxx/g” | rex field=agentAddress mode=sed “s/192/xxxx/g” | rex field=dst mode=sed “s/10.*/xxxx/g”
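The effect of chaining these three substitutions can be checked with a short Python sketch. The field values below are invented for illustration, and re.sub stands in for sed mode, so treat the result as an indication of how the patterns behave in a typical regular expression engine rather than as product output.

```python
import re

# Invented sample record; field names follow the command above.
record = {
    "msg": "Authentication failed; Authentication retry",
    "agentAddress": "192.168.192.7",
    "dst": "10.0.0.15",
}

# (field, pattern, replacement), applied in order like the piped rex calls.
steps = [
    ("msg", r"Authentication", "xxxx"),
    ("agentAddress", r"192", "xxxx"),
    ("dst", r"10.*", "xxxx"),
]

masked = dict(record)
for field, pattern, replacement in steps:
    masked[field] = re.sub(pattern, replacement, masked[field])

print(masked)
```

The pattern 192 is not anchored, so it matches wherever those digits occur in the address; anchoring it with ^ would limit the substitution to the start of the value.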
Usage Notes
A detailed tutorial on the rex operator is available at Using the Rex Operator.
A Regex Helper tool is available for formulating regular expressions of fields in which you are interested. The Regex Helper parses an event into fields. Then, you select the fields that you want to include in the rex expression. The regular expression for those fields is automatically inserted in the Search box. For detailed information on the Regex Helper tool, see Regex Helper Tool.
The extracted values are displayed as additional columns in the All Fields view (of the System Fieldsets). To view only the extracted columns, select User Defined Fieldsets from the System Fieldsets list. In the above example, an additional column with heading “SourceIP” is added to the All Fields view; IP address values extracted from events are listed in this column.
If you want to use other search operators such as fields, sort, chart, and so on to refine your search results, you must first use this operator to extract those fields.
The following example extracts the name and social security number from an event that contains data in name:John ssn:123-45-6789 format and assigns them to the Name and SSN fields:
... | rex “name:(?<Name>.*) ssn:(?<SSN>.*)”
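Literal spaces in a pattern must match the event exactly, so spacing after the colons matters. The Python sketch below uses \s* after each colon, an addition made here for illustration and not part of the rex expression above, to accept the data with or without a space. Python's (?P<name>...) spelling is used for the named groups.

```python
import re

# \s* tolerates zero or more spaces after each colon (illustrative addition).
pattern = re.compile(r"name:\s*(?P<Name>.*) ssn:\s*(?P<SSN>.*)")

compact = pattern.search("name:John ssn:123-45-6789")
spaced = pattern.search("name: John ssn: 123-45-6789")

print(compact.group("Name"), compact.group("SSN"))
print(spaced.group("Name"), spaced.group("SSN"))
```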
The following example extracts URLs from events and displays the top 10 of the extracted URLs:
... | rex “http://(?<URL>[^ ]*)” | top URL
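As a rough analogue of extracting a field and then ranking its values, the following Python sketch applies the same pattern to a few invented events and counts the captured values with collections.Counter. Counter.most_common(10) is only a stand-in for the top operator, and the events are hypothetical.

```python
import re
from collections import Counter

# Invented sample events.
events = [
    "GET http://example.com/index.html 200",
    "GET http://example.com/login 302",
    "GET http://example.com/index.html 200",
    "no url in this line",
]

url_pattern = re.compile(r"http://(?P<URL>[^ ]*)")

urls = []
for event in events:
    match = url_pattern.search(event)
    if match:  # events without a match contribute no value
        urls.append(match.group("URL"))

top_urls = Counter(urls).most_common(10)
print(top_urls)
```

Because http:// sits outside the named group, the captured value starts at the host name.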
The following example substitutes the last four digits of the social security numbers extracted in the earlier name and SSN example with xxxx:
... | rex field=SSN mode=sed “s/-\d{4}/-xxxx/g”
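The pattern -\d{4} matches only a hyphen followed by four digits, which is why the first two groups of the number are left alone. A minimal Python check of that pattern, using the sample value from this section and re.sub as a stand-in for sed mode:

```python
import re

ssn = "123-45-6789"

# "-45" is followed by a hyphen, not four digits, so only "-6789" matches.
masked_ssn = re.sub(r"-\d{4}", "-xxxx", ssn)

print(masked_ssn)
```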