Description
This step type allows you to validate an input field against a regular expression. A regular expression (regex or regexp for short) is a special text string that describes a search pattern. For example, the regex equivalent of the wildcard notation *.txt, which finds all text files in a file manager, is:
.*\.txt
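To see how this pattern behaves, here is a minimal Java sketch. It assumes the step matches the whole field value against the pattern, as `Matcher.matches()` does in `java.util.regex`. The class and method names are hypothetical and only serve the illustration.

```java
import java.util.regex.Pattern;

public class WildcardRegexDemo {
    // Regex equivalent of the wildcard *.txt; the backslash is doubled
    // because this is a Java string literal.
    static final Pattern TXT = Pattern.compile(".*\\.txt");

    // Hypothetical helper: true when the whole name matches the pattern.
    static boolean isTextFile(String name) {
        return TXT.matcher(name).matches();
    }

    public static void main(String[] args) {
        System.out.println(isTextFile("notes.txt"));  // true
        System.out.println(isTextFile("notes.txtx")); // false: does not end in ".txt"
        System.out.println(isTextFile("notestxt"));   // false: the literal dot is missing
    }
}
```

Note that the unescaped `.` means "any character", while `\.` means a literal dot, which is why the two appear side by side in the pattern.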
See also:
- Wikipedia on regular expressions
- A regular expressions tutorial
- External sample parsing Tomcat log records
IMPORTANT: Don't panic! For people new to regular expressions, the cryptic nature of the language can be a bit daunting. However, regular expressions pack a lot of punch and are well worth the time you spend on them.
Settings Tab
Option | Description |
---|---|
Step name | Name of the step. |
Field to evaluate | Name of the field to evaluate. |
Result Fieldname | Name of the (boolean) return field. |
Create fields for capture groups | Enable this if you want to create new fields based on capture groups in the regular expression. |
Regular expression | The regular expression to match. |
Use variable substitution | If the regular expression uses variables, select this option to substitute their contents. |
Capture group fields | The new fields you would like to create from the capture groups. |
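The relationship between the boolean result field and the capture group fields can be sketched in Java as follows. This is an illustration only: the key=value pattern, the class name, and the `evaluate` helper are hypothetical, and the sketch assumes whole-value matching with `java.util.regex`.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CaptureGroupDemo {
    // Hypothetical pattern with two capture groups: key and value.
    static final Pattern KEY_VALUE = Pattern.compile("(\\w+)=(\\w+)");

    // Returns {result, group 1, group 2}. The result plays the role of the
    // boolean return field; the groups play the role of the new fields.
    // Groups are null when the input does not match.
    static String[] evaluate(String input) {
        Matcher m = KEY_VALUE.matcher(input);
        boolean result = m.matches();
        return new String[] {
            String.valueOf(result),
            result ? m.group(1) : null,
            result ? m.group(2) : null
        };
    }

    public static void main(String[] args) {
        // prints "true, color, blue"
        System.out.println(String.join(", ", evaluate("color=blue")));
    }
}
```

Each pair of parentheses in the expression defines one capture group, numbered from 1 in the order of the opening parentheses.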
Content
Option | Description |
---|---|
Ignore differences in Unicode encodings | Check to ignore differences. |
Enables case-insensitive matching | Check to match regardless of case. By default, case-insensitive matching assumes that only characters in the US-ASCII charset are being matched. Unicode-aware case-insensitive matching can be enabled by specifying the 'Unicode-aware case...' flag in conjunction with this flag. |
Permit whitespace and comments in pattern | When enabled, the step will ignore whitespace and embedded comments starting with # through the end of the line. |
Enable dotall mode | When enabled, the expression '.' matches any character, including a line terminator. By default, this expression does not match line terminators. |
Enable multiline mode | When enabled, the expressions '^' and '$' match just after or just before, respectively, a line terminator or the end of the input sequence. By default, these expressions only match at the beginning and the end of the entire input sequence. |
Enable Unicode-aware case folding | When enabled, in conjunction with the case-insensitive flag, case-insensitive matching is done in a manner consistent with the Unicode standard. By default, case-insensitive matching assumes that only characters in the US-ASCII charset are being matched. |
Enables Unix lines mode | When enabled, only the '\n' line terminator is recognized in the behavior of '.', '^', and '$'. |
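The effect of these options can be tried out with the flags of `java.util.regex.Pattern`. The sketch below assumes that each option corresponds to the Java flag of the same meaning (for example, dotall mode to `Pattern.DOTALL`); that mapping is an assumption, not something stated on this page. The class and helper names are hypothetical.

```java
import java.util.regex.Pattern;

public class FlagDemo {
    // Hypothetical helper: whole-input match with the given flags.
    static boolean matches(String regex, int flags, String input) {
        return Pattern.compile(regex, flags).matcher(input).matches();
    }

    // Hypothetical helper: search for the pattern anywhere in the input.
    static boolean find(String regex, int flags, String input) {
        return Pattern.compile(regex, flags).matcher(input).find();
    }

    public static void main(String[] args) {
        // Case-insensitive matching (US-ASCII only unless combined with UNICODE_CASE)
        System.out.println(matches("error", 0, "ERROR"));                        // false
        System.out.println(matches("error", Pattern.CASE_INSENSITIVE, "ERROR")); // true

        // Unicode-aware case folding: \u00e9 is e-acute, \u00c9 its uppercase form
        int ci = Pattern.CASE_INSENSITIVE;
        System.out.println(matches("\u00e9", ci, "\u00c9"));                        // false
        System.out.println(matches("\u00e9", ci | Pattern.UNICODE_CASE, "\u00c9")); // true

        // Whitespace and comments in the pattern are ignored
        System.out.println(matches("\\d+  # one or more digits", Pattern.COMMENTS, "42")); // true

        // Dotall mode: '.' also matches a line terminator
        System.out.println(matches("a.b", 0, "a\nb"));              // false
        System.out.println(matches("a.b", Pattern.DOTALL, "a\nb")); // true

        // Multiline mode: '^' and '$' also match at line boundaries
        System.out.println(find("^second$", 0, "first\nsecond"));                 // false
        System.out.println(find("^second$", Pattern.MULTILINE, "first\nsecond")); // true

        // Unix lines mode: only '\n' counts as a line terminator, so '.' matches '\r'
        System.out.println(matches("a.b", 0, "a\rb"));                  // false
        System.out.println(matches("a.b", Pattern.UNIX_LINES, "a\rb")); // true
    }
}
```

Flags can be combined with the bitwise OR operator, as shown for the Unicode-aware case, which mirrors ticking several options at once.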
Example
samples/transformations/Regex Eval - parse NCSA access log records.ktr