...
This step type allows you to validate an input field against a regular expression. A regular expression (regex or regexp for short) is a special text string that describes a search pattern. For example, the regex equivalent of the wildcard notation *.txt, used to find all text files in a file manager, is:
Code Block |
---|
.*\.txt |
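As a quick check of that equivalence, here is a minimal sketch using Python's `re` module. Python is chosen only for illustration, and the step's own regex engine may differ in details. The file names are made up for the example.

```python
import re

# The regex equivalent of the wildcard *.txt, as given above.
pattern = re.compile(r".*\.txt")

# Hypothetical file names, invented for this illustration.
file_names = ["notes.txt", "report.pdf", "archive.txt.bak", ".txt"]

# fullmatch requires the whole name to match, like a file-manager wildcard.
text_files = [name for name in file_names if pattern.fullmatch(name)]
print(text_files)
```

Note that the dot before `txt` is escaped: an unescaped `.` would match any character, not just a literal dot.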
See also:
- Wikipedia on regular expressions
- A regular expressions tutorial
- External sample parsing Tomcat log records
IMPORTANT: Don't panic! For people new to regular expressions, the cryptic nature of the language can be a bit daunting. However, "RegExps" pack a lot of punch and are very much worth your time.
Settings Tab
Option | Description |
---|---|
Step name | Name of the step. |
Field to evaluate | Name of the field to evaluate |
Result Fieldname | The name of the return field (boolean) |
Create fields for capture groups | Enable this if you want to create new fields based on capture groups in the regular expression. |
Regular expression | The regular expression to match against the field. |
Use variable substitution | Select this option if the regular expression contains variables, so that each variable is replaced by its content. |
Capture group fields | Here you can specify the new fields you would like to capture. |
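To make the interplay of these settings concrete, the following is a rough sketch in Python of what the step does to a single row. It is not the step's actual implementation: the helper `evaluate_field`, the field names, and the sample row are all hypothetical, and it assumes the whole field value must match the expression.

```python
import re

def evaluate_field(row, field, regex, result_field, capture_fields):
    """Mimic the settings above on one row, represented as a plain dict.

    Assumption: the entire field value must match the expression.
    """
    match = re.fullmatch(regex, row[field])
    out = dict(row)
    # "Result Fieldname": a boolean telling whether the value matched.
    out[result_field] = match is not None
    # "Capture group fields": one new field per capture group, in order.
    for index, name in enumerate(capture_fields, start=1):
        out[name] = match.group(index) if match else None
    return out

# Hypothetical input row and settings.
row = {"filename": "report_2023.txt"}
result = evaluate_field(
    row,
    field="filename",
    regex=r"(.*)_(\d{4})\.txt",
    result_field="is_match",
    capture_fields=["base", "year"],
)
print(result)
```

In this sketch a row that does not match still passes through, with the result field set to false and the capture fields left empty.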
Content
Option | Description |
---|---|
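Ignore differences in Unicode encodings | When enabled, characters that are canonically equivalent in Unicode (for example, a precomposed 'é' and an 'e' followed by a combining acute accent) are treated as matching. |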
Enables case-insensitive matching | When enabled, matching ignores letter case. By default, case-insensitive matching assumes that only characters in the US-ASCII charset are being matched. Unicode-aware case-insensitive matching can be enabled by selecting 'Enable Unicode-aware case folding' in conjunction with this option. |
Permit whitespace and comments in pattern | When enabled, whitespace in the pattern is ignored, and embedded comments starting with # are ignored through the end of the line. |
Enable dotall mode | When enabled, the expression '.' matches any character, including a line terminator. By default, this expression does not match line terminators. |
Enable multiline mode | When enabled, the expressions '^' and '$' match just after or just before, respectively, a line terminator or the end of the input sequence. By default, these expressions only match at the beginning and the end of the entire input sequence. |
Enable Unicode-aware case folding | When enabled, in conjunction with the case-insensitive option, case-insensitive matching is done in a manner consistent with the Unicode standard. By default, case-insensitive matching assumes that only characters in the US-ASCII charset are being matched. |
Enables Unix lines mode | When enabled, only the '\n' line terminator is recognized in the behavior of '.', '^', and '$'. |
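The effect of several of these options can be seen in a small sketch using Python's `re` module, whose `IGNORECASE`, `VERBOSE`, `DOTALL`, and `MULTILINE` flags are close analogues. This is an illustration only: the step's own engine may behave differently in details, and the Unicode-related and Unix lines options are not shown because Python has no direct counterpart for them. The sample text is made up.

```python
import re

text = "first line\nSecond Line"

# Case-insensitive matching: "second" only matches "Second" with the flag.
case_default = re.search(r"second", text)
case_ignored = re.search(r"second", text, re.IGNORECASE)

# Dotall mode: '.' matches the line terminator only when the flag is set.
dot_default = re.fullmatch(r"first.*Line", text)
dot_all = re.fullmatch(r"first.*Line", text, re.DOTALL)

# Multiline mode: '^' also matches just after a line terminator.
starts_default = re.findall(r"^\w+", text)
starts_multiline = re.findall(r"^\w+", text, re.MULTILINE)

# Whitespace and comments in the pattern are ignored in verbose mode.
commented = re.compile(
    r"""
    (\w+)   # first word
    \s+     # separator
    (\w+)   # second word
    """,
    re.VERBOSE,
)
words = commented.match(text).groups()

print(case_default, case_ignored.group())
print(dot_default, dot_all is not None)
print(starts_default, starts_multiline)
print(words)
```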
Example
Code Block |
---|
samples/transformations/Regex Eval - parse NCSA access log records.ktr |
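For readers without the sample at hand, the sketch below shows what parsing an NCSA common log record with capture groups can look like. It is not taken from the sample transformation: the pattern, the field names, and the log line are assumptions chosen for illustration, written in Python rather than as a step configuration.

```python
import re

# Assumed layout of an NCSA common log record:
# host ident authuser [timestamp] "request" status bytes
LOG_PATTERN = re.compile(
    r'(\S+) (\S+) (\S+) \[([^\]]+)\] "([^"]*)" (\d{3}) (\d+|-)'
)

# Hypothetical names for the capture group fields, one per group.
FIELD_NAMES = ["host", "ident", "user", "timestamp", "request", "status", "bytes"]

# Illustrative log line, not real data.
line = '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326'

match = LOG_PATTERN.fullmatch(line)
is_match = match is not None
fields = dict(zip(FIELD_NAMES, match.groups())) if match else {}
print(is_match, fields)
```

Each capture group maps to one output field, which mirrors how the 'Capture group fields' setting assigns names to groups in order.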