Regular Expression Post-Processor
Regular expression post-processors are the most flexible type of post-processor. They allow you to easily apply custom post-processing to artifacts (including command output) in your build. By supplying patterns in Java regular expression syntax, you can search for errors, warnings and/or information in build artifacts. In addition, regular expression post-processors support other powerful features:
- capture leading and trailing lines, so you can easily report relevant context
- join overlapping features, for simplified reporting
- exclude false positive hits (without complicating the original regular expressions)
- choose whether errors and/or warnings found should trigger a build failure
Regular expression post-processors are powerful enough to form the basis of various other post-processors.
Regular expression post-processors are intended to be applied to plain text artifacts (including command output). When applied, a regular expression post-processor uses its configured patterns to search the artifact line by line for matching features. When a match is found, a corresponding feature (with context if requested) is added to the artifact. A line matches if a configured pattern is found anywhere within the line (i.e. the whole line need not match, just some substring of it). To restrict matches to the whole line, use the start (^) and end ($) of line anchors within individual expressions.
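The difference between substring matching and anchored matching can be sketched in plain Java. This is a minimal illustration that assumes the post-processor behaves like java.util.regex's Matcher.find() applied to each line; that is inferred from the description above, not taken from the Pulse implementation. The class name and sample line are hypothetical.

```java
import java.util.regex.Pattern;

final class LineMatchSketch {
    // True if the pattern is found anywhere within the line (substring search).
    static boolean found(String regex, String line) {
        return Pattern.compile(regex).matcher(line).find();
    }

    public static void main(String[] args) {
        String line = "src/main.c:42: error: 'x' undeclared";

        // Unanchored: "error" appears somewhere in the line.
        System.out.println(found("error", line));      // true

        // Anchored: the whole line would have to be exactly "error".
        System.out.println(found("^error$", line));    // false
        System.out.println(found("^error$", "error")); // true
    }
}
```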
Escaping: As the dollar sign character ($) and backslash character (\) have special meanings both in Pulse files and in regular expressions, you must take care when writing expressions that include these characters. See the section Escaping in Regular Expressions below.
Escaping in Regular Expressions
Backslashes
The backslash character (\) is used as an escape character in both Pulse files and regular expressions. Thus, you must pay close attention to how backslashes are used in your regular expressions. Often, you will want to escape special characters (such as '.') in your expressions. To do so, you must precede them with a backslash in the expression. For example, to match file names ending in ".cpp", you need the regular expression:
.*\.cpp
Now, to enter this expression in a Pulse file, you must add another backslash, as the Pulse file itself treats backslashes specially. So you would need something like:
<pattern category="info" expression=".*\\.cpp"/>
If you need to match the backslash character itself within a regular expression, your expression must contain two backslashes. For example:
sample\\file\\path
To express this pattern in a Pulse file, you must escape each of the backslashes, giving:
<pattern category="info" expression="sample\\\\file\\\\path"/>
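Java string literals happen to add their own layer of backslash escaping, much as a Pulse file does, so a small Java program is a convenient way to check what a regular expression means once that outer layer has been removed. The following is a sketch only: the class name and sample inputs are hypothetical, and it exercises java.util.regex directly rather than Pulse itself.

```java
import java.util.regex.Pattern;

final class BackslashEscapeSketch {
    // True if the pattern is found anywhere within the line.
    static boolean found(String regex, String line) {
        return Pattern.compile(regex).matcher(line).find();
    }

    public static void main(String[] args) {
        // The Java literal ".*\\.cpp" is the regular expression .*\.cpp (a literal dot).
        String escapedDot = ".*\\.cpp";
        // Without the backslash, '.' matches any single character.
        String bareDot = ".*.cpp";

        System.out.println(found(escapedDot, "main.cpp")); // true
        System.out.println(found(escapedDot, "main_cpp")); // false
        System.out.println(found(bareDot, "main_cpp"));    // true

        // The Java literal "sample\\\\file\\\\path" is the regular expression
        // sample\\file\\path, which matches the text sample\file\path.
        System.out.println(found("sample\\\\file\\\\path", "C:\\sample\\file\\path")); // true
    }
}
```

The second and third checks show why the escape matters: an unescaped dot silently widens the match rather than causing an error.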
Dollar Signs
The dollar sign ($) character also has a special meaning in both Pulse files (a property reference) and regular expressions (end of line). Thus, to use the dollar sign to match the end of line in an expression, you must escape it in your Pulse file:
<pattern category="info" expression="line ends here\$"/>
To match a literal dollar sign in your expression, it must be escaped with a backslash in the pattern:
\$100
To express this pattern in a Pulse file, the backslash itself needs escaping, hence:
<pattern category="info" expression="\\$100"/>
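The two roles of the dollar sign inside a regular expression (after any Pulse-level escaping has been removed) can be checked with a short Java sketch. The class name and sample lines are hypothetical, and the code uses java.util.regex directly rather than Pulse.

```java
import java.util.regex.Pattern;

final class DollarSketch {
    // True if the pattern is found anywhere within the line.
    static boolean found(String regex, String line) {
        return Pattern.compile(regex).matcher(line).find();
    }

    public static void main(String[] args) {
        // Unescaped $ anchors the match to the end of the line.
        System.out.println(found("line ends here$", "the line ends here"));     // true
        System.out.println(found("line ends here$", "line ends here, or not")); // false

        // The Java literal "\\$100" is the regular expression \$100 (a literal dollar sign).
        System.out.println(found("\\$100", "total: $100")); // true

        // Unescaped, $ is an end-of-line anchor, so nothing can follow it on the same line.
        System.out.println(found("$100", "total: $100"));   // false
    }
}
```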
Attributes
| Attribute | Description | Required? | Default |
| fail-on-error | If true, any error features detected will cause the command (and hence build) to fail. | No | true |
| fail-on-warning | If true, any warning features detected will cause the command (and hence build) to fail. | No | false |
| join-overlapping | If true, any features found that overlap one another (i.e. share at least one common line) will be joined into a single feature. | No | true |
| leading-context | Number of lines preceding a matching line that should be captured as part of the feature summary. | No | 0 |
| name | The name of this post-processor. | Yes | |
| trailing-context | Number of lines following a matching line that should be captured as part of the feature summary. | No | 0 |
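How leading-context, trailing-context and join-overlapping interact can be illustrated with a small Java sketch. This is not the Pulse implementation: the class, the features method and the representation of a feature as a pair of zero-based line indices are all hypothetical, and "overlap" is taken to mean sharing at least one line, as the table states.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

final class ContextSketch {
    // Returns features as {firstLine, lastLine} pairs (zero-based, inclusive).
    static List<int[]> features(List<String> lines, String regex,
                                int leading, int trailing, boolean joinOverlapping) {
        Pattern pattern = Pattern.compile(regex);
        List<int[]> result = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            if (!pattern.matcher(lines.get(i)).find()) {
                continue;
            }
            // Extend the matching line by the requested context, clamped to the artifact.
            int first = Math.max(0, i - leading);
            int last = Math.min(lines.size() - 1, i + trailing);
            int prev = result.size() - 1;
            if (joinOverlapping && prev >= 0 && result.get(prev)[1] >= first) {
                // Shares at least one line with the previous feature: join them.
                result.get(prev)[1] = last;
            } else {
                result.add(new int[] {first, last});
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
                "ok", "error A", "ok", "error B", "ok", "ok", "ok", "error C");
        // With joining: prints 0-4 then 6-7.
        for (int[] f : features(lines, "error", 1, 1, true)) {
            System.out.println(f[0] + "-" + f[1]);
        }
        // Without joining: prints 0-2, 2-4, then 6-7.
        for (int[] f : features(lines, "error", 1, 1, false)) {
            System.out.println(f[0] + "-" + f[1]);
        }
    }
}
```

In the joined case the first two matches share line 2 through their context, so they are reported as one feature; the third match is far enough away to stay separate.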
Child Elements
Each regular expression post-processor contains zero or more nested patterns. These patterns define the expressions used to match lines. When the processor is applied to a line, each pattern is applied in turn to that line:
| Element | Description | Number |
| pattern | Defines a single regular expression to search for. | 0 or more |
Examples
A simple regular expression post-processor to find compiler errors and warnings reported for C source files:
<regex.pp name="compile.pp">
    <pattern category="error" expression="\\.[ch]:[0-9]+: error"/>
    <pattern category="warning" expression="\\.[ch]:[0-9]+: warning"/>
</regex.pp>
A more complicated processor that does a broad search for errors with surrounding lines of context, excluding known false positives:
<regex.pp name="errors.pp" fail-on-error="false" leading-context="3" trailing-context="5">
    <pattern category="error" expression="[Ee]rror">
        <exclude expression="MyError.java"/>
        <exclude expression="terror.txt"/>
    </pattern>
</regex.pp>
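The effect of the exclusions in this example can be approximated in Java. This is a sketch under assumptions: it supposes that each exclude expression is searched for within the same line as the pattern, and that any hit suppresses the feature. The class and method names are hypothetical and are not part of Pulse.

```java
import java.util.List;
import java.util.regex.Pattern;

final class ExcludeSketch {
    // A line yields a feature if the expression is found and no exclusion is found.
    static boolean isFeature(String line, String expression, List<String> excludes) {
        if (!Pattern.compile(expression).matcher(line).find()) {
            return false;
        }
        for (String exclude : excludes) {
            if (Pattern.compile(exclude).matcher(line).find()) {
                return false; // known false positive
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> excludes = List.of("MyError.java", "terror.txt");
        System.out.println(isFeature("Error: disk full", "[Ee]rror", excludes));       // true
        System.out.println(isFeature("Compiling MyError.java", "[Ee]rror", excludes)); // false
        System.out.println(isFeature("copying terror.txt", "[Ee]rror", excludes));     // false
        System.out.println(isFeature("build ok", "[Ee]rror", excludes));               // false
    }
}
```

Keeping the broad expression and listing exceptions separately avoids having to encode the exceptions into the pattern itself.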