Description
Filters the variants based on a regular expression defined over the activities.
MATCH_PROCESS_REGEX matches the variants of a process based on a regular expression. The regular expression defines a pattern over the activities of the variant. It returns an INT value which is 1 if the variant matches the pattern or 0 if it does not match.
Tips
- You can define an alias for an activity and use this alias as an abbreviation for the activity name inside the regular expression. This improves the readability of the pattern if you have long activity names.
- You can also define an alias for a subexpression. This enables you to give a logical name to a certain subexpression and use this subexpression at different places inside the regular expression. This allows you to structure your regular expressions and improve its understandability
- MATCH_PROCESS_REGEX is very useful if you want to filter on variants by a very specific and/or complex pattern, e.g. containing sequences or loops. If you want to filter for the occurrence of a single activity or very simple patterns, it is far more efficient to use normal conditional expressions instead of regular expressions.
Syntax
MATCH_PROCESS_REGEX ( activity_table.activity_column, regular_expression )
Patterns
This is an overview over all patterns that can be used within the regular expression. Full example queries as well as more detailed descriptions can be found below.
Syntax | Meaning | Example |
---|---|---|
' ' | Case contains the activity | 'A' |
^ | Case starts with the activity | ^ 'Scan Invoice' |
$ | Case ends with the activity | 'B' $ |
>> | Activities directly follow | 'A' >> 'B' |
| | Logical OR | 'A' | 'B' |
( ) | Group of activities | ('A' | 'B') >> ('C' >> 'D') |
* | 0 or more occurences | ('A' >> 'B')* |
+ | 1 or more occurences | ('A' >> 'B')+ |
? | 0 or 1 occurences | ('A' >> 'B')? |
'*' | Any activity matches | 'A' >> ('*')+ >> 'B' |
ANY | Any activity matches | 'A' >> (ANY)+ >> 'B' |
. | Any activity matches | 'A' >> . >> 'C' |
LIKE '%...%' | Activities that contain string | 'A' >> LIKE '% Invoice%' |
[' ',' '] | Set of activities of which one needs to match | 'A' >> ['B','D','E'] >> 'C' |
[! ' '] | Set of activities of which none matches | 'A' >> [! 'B'] >> 'C' |
AS | Gives an alias to a regex | ('A' >> (ANY)*) AS sequence, sequence >> 'B' |
Example
'A'
followed by activity 'B', followed by one or more
activities 'A'
or 'B'
in arbitrary order. After that, an activity 'C'
has
to occur. However, before and after that activity any number (zero or more) of any possible activity may occur.
Finally, the variant has to end with an activity 'H'
.
Filter
FILTER MATCH_PROCESS_REGEX ("Table1"."ACTIVITY", ^ 'A' >> 'B' >> ('B' | 'A')+ >> (ANY)* >> 'C' >> ('*')* >> 'H' $ ) = 1
Column1
"Table1"."Case"
Column2
"Table1"."Activity"
Table1
Case : INT | Activity : STRING | Timestamp : DATE |
---|---|---|
1 | 'A' | Fri Jan 01 2016 01:00:00.000 |
1 | 'B' | Fri Jan 01 2016 02:00:00.000 |
1 | 'C' | Fri Jan 01 2016 03:00:00.000 |
1 | 'D' | Fri Jan 01 2016 04:00:00.000 |
1 | 'E' | Fri Jan 01 2016 05:00:00.000 |
1 | 'F' | Fri Jan 01 2016 06:00:00.000 |
1 | 'G' | Fri Jan 01 2016 07:00:00.000 |
2 | 'A' | Fri Jan 01 2016 08:00:00.000 |
2 | 'B' | Fri Jan 01 2016 09:00:00.000 |
2 | 'D' | Fri Jan 01 2016 10:00:00.000 |
2 | 'E' | Fri Jan 01 2016 11:00:00.000 |
2 | 'H' | Fri Jan 01 2016 12:00:00.000 |
2 | 'I' | Fri Jan 01 2016 13:00:00.000 |
3 | 'A' | Fri Jan 01 2016 14:00:00.000 |
3 | 'B' | Fri Jan 01 2016 15:00:00.000 |
3 | 'B' | Fri Jan 01 2016 16:00:00.000 |
3 | 'C' | Fri Jan 01 2016 17:00:00.000 |
3 | 'G' | Fri Jan 01 2016 18:00:00.000 |
3 | 'H' | Fri Jan 01 2016 19:00:00.000 |
4 | 'A' | Fri Jan 01 2016 20:00:00.000 |
4 | 'B' | Fri Jan 01 2016 21:00:00.000 |
4 | 'A' | Fri Jan 01 2016 22:00:00.000 |
4 | 'D' | Fri Jan 01 2016 23:00:00.000 |
4 | 'C' | Sat Jan 02 2016 00:00:00.000 |
4 | 'H' | Sat Jan 02 2016 01:00:00.000 |
5 | 'D' | Sat Jan 02 2016 02:00:00.000 |
5 | 'A' | Sat Jan 02 2016 03:00:00.000 |
5 | 'A' | Sat Jan 02 2016 04:00:00.000 |
5 | 'B' | Sat Jan 02 2016 05:00:00.000 |
5 | 'B' | Sat Jan 02 2016 06:00:00.000 |
5 | 'H' | Sat Jan 02 2016 07:00:00.000 |
6 | 'X' | Sat Jan 02 2016 08:00:00.000 |
6 | 'A' | Sat Jan 02 2016 09:00:00.000 |
6 | 'B' | Sat Jan 02 2016 10:00:00.000 |
6 | 'B' | Sat Jan 02 2016 11:00:00.000 |
6 | 'B' | Sat Jan 02 2016 12:00:00.000 |
6 | 'C' | Sat Jan 02 2016 13:00:00.000 |
6 | 'H' | Sat Jan 02 2016 14:00:00.000 |
7 | 'A' | Sat Jan 02 2016 15:00:00.000 |
7 | 'B' | Sat Jan 02 2016 16:00:00.000 |
7 | 'A' | Sat Jan 02 2016 17:00:00.000 |
7 | 'D' | Sat Jan 02 2016 18:00:00.000 |
7 | 'C' | Sat Jan 02 2016 19:00:00.000 |
7 | 'H' | Sat Jan 02 2016 20:00:00.000 |
7 | 'Y' | Sat Jan 02 2016 21:00:00.000 |
8 | 'X' | Sat Jan 02 2016 22:00:00.000 |
8 | 'Y' | Sat Jan 02 2016 23:00:00.000 |
8 | 'Z' | Sun Jan 03 2016 00:00:00.000 |
8 | 'A' | Sun Jan 03 2016 01:00:00.000 |
8 | 'B' | Sun Jan 03 2016 02:00:00.000 |
8 | 'C' | Sun Jan 03 2016 03:00:00.000 |
Regular expression constructs
The regular expression can be constructed from several special constructs. In the following sections, these construts will be explained in more detail.
Activities
Activities are given by their respective name (including all spaces, special characters etc.) enclosed by single quotes:
Syntax
"'" <string> "'"
Example
'B'
). Every variant having at least one instance
of this activity will be contained in the result, regardless of the number of occurrences or its position in the
variant.
Filter
FILTER MATCH_PROCESS_REGEX ("Table1"."ACTIVITY", 'B') = 1
Column1
"Table1"."CASE_ID"
Column2
"Table1"."ACTIVITY"
Table1
CASE_ID : STRING | ACTIVITY : STRING | startTimestamp : DATE |
---|---|---|
'1' | 'A' | Fri Jan 01 2016 01:00:00.000 |
'1' | 'B' | Fri Jan 01 2016 02:00:00.000 |
'1' | 'C' | Fri Jan 01 2016 03:00:00.000 |
'1' | 'D' | Fri Jan 01 2016 04:00:00.000 |
'2' | 'E' | Fri Jan 01 2016 05:00:00.000 |
'2' | 'F' | Fri Jan 01 2016 06:00:00.000 |
'2' | 'G' | Fri Jan 01 2016 07:00:00.000 |
'2' | 'I' | Fri Jan 01 2016 08:00:00.000 |
If multiple activities with similar names should be mapped at the same time, it is possible to use the keyword
LIKE
with a SQL-style wildcard match (read here for
more details about the LIKE
operator).
Syntax
LIKE "'" <identifier_with_wildcards> "'"
Example
ask
in its name.
Filter
FILTER MATCH_PROCESS_REGEX ("Table1"."ACTIVITY", LIKE '%ask%' ) = 1
Column1
"Table1"."CASE_ID"
Column2
"Table1"."ACTIVITY"
Table1
CASE_ID : STRING | ACTIVITY : STRING | startTimestamp : DATE |
---|---|---|
'1' | 'Activity A' | Fri Jan 01 2016 01:00:00.000 |
'1' | 'Task B' | Fri Jan 01 2016 02:00:00.000 |
'1' | 'Activity C' | Fri Jan 01 2016 03:00:00.000 |
'1' | 'Activity D' | Fri Jan 01 2016 04:00:00.000 |
'2' | 'Activity A' | Fri Jan 01 2016 05:00:00.000 |
'2' | 'Activity A' | Fri Jan 01 2016 06:00:00.000 |
'2' | 'Activity B' | Fri Jan 01 2016 07:00:00.000 |
'2' | 'Activity C' | Fri Jan 01 2016 08:00:00.000 |
'3' | 'Activity A' | Fri Jan 01 2016 09:00:00.000 |
'3' | 'Activity B' | Fri Jan 01 2016 10:00:00.000 |
'3' | 'Activity B' | Fri Jan 01 2016 11:00:00.000 |
'3' | 'Activity C' | Fri Jan 01 2016 12:00:00.000 |
Concatenation
The concatenation allows to match for sequences of certain regular expressions (e.g. activity names). Sequences of two or more expressions are
given in the order of occurance, separated by two greater than symbols (>>
).
Syntax
<regular_expression> >> <regular_expression>
Example
Filter
FILTER MATCH_PROCESS_REGEX ("Table1"."ACTIVITY", 'B' >> 'C') = 1
Column1
"Table1"."CASE_ID"
Column2
"Table1"."ACTIVITY"
Table1
CASE_ID : STRING | ACTIVITY : STRING | startTimestamp : DATE |
---|---|---|
'1' | 'A' | Fri Jan 01 2016 01:00:00.000 |
'1' | 'B' | Fri Jan 01 2016 02:00:00.000 |
'1' | 'C' | Fri Jan 01 2016 03:00:00.000 |
'1' | 'D' | Fri Jan 01 2016 04:00:00.000 |
'2' | 'E' | Fri Jan 01 2016 05:00:00.000 |
'2' | 'F' | Fri Jan 01 2016 06:00:00.000 |
'2' | 'G' | Fri Jan 01 2016 07:00:00.000 |
'2' | 'I' | Fri Jan 01 2016 08:00:00.000 |
'3' | 'B' | Fri Jan 01 2016 09:00:00.000 |
'3' | 'A' | Fri Jan 01 2016 10:00:00.000 |
'3' | 'C' | Fri Jan 01 2016 11:00:00.000 |
'3' | 'D' | Fri Jan 01 2016 12:00:00.000 |
Matching any activity
Furthermore, it is possible to match any activity, regardless of its name. For convenience, there are three different ways to do this:
- by a an activity name that only contains the star symbol (
'*'
), - by the keyword
ANY
, - or by the dot symbol (
.
) which is well-known from the widely-used Perl regular expression syntax.
Example
Filter
FILTER MATCH_PROCESS_REGEX ("Table1"."ACTIVITY", 'A' >> . >> 'B' >> 'C') = 1
Column1
"Table1"."CASE_ID"
Column2
"Table1"."ACTIVITY"
Table1
CASE_ID : STRING | ACTIVITY : STRING | startTimestamp : DATE |
---|---|---|
'1' | 'A' | Fri Jan 01 2016 01:00:00.000 |
'1' | 'B' | Fri Jan 01 2016 02:00:00.000 |
'1' | 'C' | Fri Jan 01 2016 03:00:00.000 |
'1' | 'D' | Fri Jan 01 2016 04:00:00.000 |
'2' | 'A' | Fri Jan 01 2016 05:00:00.000 |
'2' | 'A' | Fri Jan 01 2016 06:00:00.000 |
'2' | 'B' | Fri Jan 01 2016 07:00:00.000 |
'2' | 'C' | Fri Jan 01 2016 08:00:00.000 |
'3' | 'A' | Fri Jan 01 2016 09:00:00.000 |
'3' | 'B' | Fri Jan 01 2016 10:00:00.000 |
'3' | 'B' | Fri Jan 01 2016 11:00:00.000 |
'3' | 'C' | Fri Jan 01 2016 12:00:00.000 |
Choice
The choice construct allows to define two or more different alternatives for the matching regular expression. The
overall regular expression matches if one of the given alternatives is matched. The alternative regular expressions
are separated by the pipe symbol (|
). In contrast to the concatenation, the order of the alternatives is
irrelevant.
Syntax
<regular_expression> "|" <regular_expression>
Example
'A'
, 'B'
, or
'H'
.
Filter
FILTER MATCH_PROCESS_REGEX ("Table1"."ACTIVITY", 'A' | 'B' | 'H' ) = 1
Column1
"Table1"."CASE_ID"
Column2
"Table1"."ACTIVITY"
Table1
CASE_ID : STRING | ACTIVITY : STRING | startTimestamp : DATE |
---|---|---|
'1' | 'A' | Fri Jan 01 2016 01:00:00.000 |
'1' | 'B' | Fri Jan 01 2016 02:00:00.000 |
'1' | 'C' | Fri Jan 01 2016 03:00:00.000 |
'2' | 'E' | Fri Jan 01 2016 04:00:00.000 |
'2' | 'F' | Fri Jan 01 2016 05:00:00.000 |
'2' | 'G' | Fri Jan 01 2016 06:00:00.000 |
'3' | 'H' | Fri Jan 01 2016 07:00:00.000 |
'3' | 'I' | Fri Jan 01 2016 08:00:00.000 |
'3' | 'K' | Fri Jan 01 2016 09:00:00.000 |
Match start/end
In general, the regular expressions may match at any position in a variant. However, sometimes it is desirable to
define a pattern each matching variants should start and / or end with. To define an exact match at the beginning of
each variant, the ^
symbol must be added before the definition of the regular expression. To define an
exact match at the end of each variant, the $
symbol must be added after the definnition of the regular
expression.
Syntax
[ ^ ] <regular_expression> [ $ ]
Examples
'A'
followed by an activity 'B'
.
Filter
FILTER MATCH_PROCESS_REGEX ("Table1"."ACTIVITY", ^ 'A' >> 'B') = 1
Column1
"Table1"."CASE_ID"
Column2
"Table1"."ACTIVITY"
Table1
CASE_ID : STRING | ACTIVITY : STRING | startTimestamp : DATE |
---|---|---|
'1' | 'A' | Fri Jan 01 2016 01:00:00.000 |
'1' | 'B' | Fri Jan 01 2016 02:00:00.000 |
'1' | 'C' | Fri Jan 01 2016 03:00:00.000 |
'1' | 'D' | Fri Jan 01 2016 04:00:00.000 |
'2' | 'A' | Fri Jan 01 2016 05:00:00.000 |
'2' | 'A' | Fri Jan 01 2016 06:00:00.000 |
'2' | 'B' | Fri Jan 01 2016 07:00:00.000 |
'2' | 'C' | Fri Jan 01 2016 08:00:00.000 |
'C'
followed by an activity 'D'
.
Filter
FILTER MATCH_PROCESS_REGEX ("Table1"."ACTIVITY", 'C' >> 'D' $) = 1
Column1
"Table1"."CASE_ID"
Column2
"Table1"."ACTIVITY"
Table1
CASE_ID : STRING | ACTIVITY : STRING | startTimestamp : DATE |
---|---|---|
'1' | 'A' | Fri Jan 01 2016 01:00:00.000 |
'1' | 'B' | Fri Jan 01 2016 02:00:00.000 |
'1' | 'C' | Fri Jan 01 2016 03:00:00.000 |
'1' | 'D' | Fri Jan 01 2016 04:00:00.000 |
'2' | 'A' | Fri Jan 01 2016 05:00:00.000 |
'2' | 'A' | Fri Jan 01 2016 06:00:00.000 |
'2' | 'B' | Fri Jan 01 2016 07:00:00.000 |
'2' | 'C' | Fri Jan 01 2016 08:00:00.000 |
'3' | 'A' | Fri Jan 01 2016 09:00:00.000 |
'3' | 'C' | Fri Jan 01 2016 10:00:00.000 |
'3' | 'D' | Fri Jan 01 2016 11:00:00.000 |
'3' | 'A' | Fri Jan 01 2016 12:00:00.000 |
'3' | 'C' | Fri Jan 01 2016 13:00:00.000 |
'3' | 'D' | Fri Jan 01 2016 14:00:00.000 |
Activity sets
Activity sets are an alternative way to express a choice between different activities. An activity set matches any
activity that is given in its defintion. Activity sets are defined as a (optionally comma-separated) list of activity
names enclosed by square brackets. In contrast to choices, activity sets can only be defined for activities,
but not for arbitrary regular expressions. However, activity sets can be inverted by adding a !
between
the opening bracket and the first activity name, so the activity set matches the complement of the defined
activities. This way, it is easy to express that all activities except one ore more certain activities should match.
Syntax
"[" [ ! ] <activity_name> ( , <activity_name> )* "]"
Examples
'A'
, 'B'
, or 'F'
.
Filter
FILTER MATCH_PROCESS_REGEX ("Table1"."ACTIVITY", ['A', 'B', 'F'] ) = 1
Column1
"Table1"."CASE_ID"
Column2
"Table1"."ACTIVITY"
Table1
CASE_ID : STRING | ACTIVITY : STRING | startTimestamp : DATE |
---|---|---|
'1' | 'A' | Fri Jan 01 2016 01:00:00.000 |
'1' | 'B' | Fri Jan 01 2016 02:00:00.000 |
'1' | 'C' | Fri Jan 01 2016 03:00:00.000 |
'2' | 'E' | Fri Jan 01 2016 04:00:00.000 |
'2' | 'F' | Fri Jan 01 2016 05:00:00.000 |
'2' | 'G' | Fri Jan 01 2016 06:00:00.000 |
'3' | 'X' | Fri Jan 01 2016 07:00:00.000 |
'3' | 'Y' | Fri Jan 01 2016 08:00:00.000 |
'3' | 'Z' | Fri Jan 01 2016 09:00:00.000 |
'A'
, 'B'
, or 'X'
.
Filter
FILTER MATCH_PROCESS_REGEX ("Table1"."ACTIVITY", [ ! 'A', 'B', 'X'] ) = 1
Column1
"Table1"."CASE_ID"
Column2
"Table1"."ACTIVITY"
Table1
CASE_ID : STRING | ACTIVITY : STRING | startTimestamp : DATE |
---|---|---|
'1' | 'A' | Fri Jan 01 2016 01:00:00.000 |
'1' | 'B' | Fri Jan 01 2016 02:00:00.000 |
'1' | 'C' | Fri Jan 01 2016 03:00:00.000 |
'2' | 'A' | Fri Jan 01 2016 04:00:00.000 |
'2' | 'B' | Fri Jan 01 2016 05:00:00.000 |
'2' | 'A' | Fri Jan 01 2016 06:00:00.000 |
'3' | 'X' | Fri Jan 01 2016 07:00:00.000 |
'3' | 'Y' | Fri Jan 01 2016 08:00:00.000 |
'3' | 'Z' | Fri Jan 01 2016 09:00:00.000 |
Grouping
Regular expressions which contain multiple sub-expressions (e.g., choices or concatenations) can be bracketed to group them in order to to clearify the structure. Grouping may be also necessary to nest regular expressions.
Syntax
"(" <regular_expression> ")"
Example
'A'
, 'B'
, and 'C'
.
The regular expression is grouped by a pair of brackets (without quantifier).
Filter
FILTER MATCH_PROCESS_REGEX ("Table1"."ACTIVITY", ('A' >> 'B' >> 'C' ) ) = 1
Column1
"Table1"."CASE_ID"
Column2
"Table1"."ACTIVITY"
Table1
CASE_ID : STRING | ACTIVITY : STRING | startTimestamp : DATE |
---|---|---|
'1' | 'A' | Fri Jan 01 2016 01:00:00.000 |
'1' | 'B' | Fri Jan 01 2016 02:00:00.000 |
'1' | 'C' | Fri Jan 01 2016 03:00:00.000 |
'1' | 'D' | Fri Jan 01 2016 04:00:00.000 |
'2' | 'A' | Fri Jan 01 2016 05:00:00.000 |
'2' | 'A' | Fri Jan 01 2016 06:00:00.000 |
'2' | 'B' | Fri Jan 01 2016 07:00:00.000 |
'2' | 'C' | Fri Jan 01 2016 08:00:00.000 |
'3' | 'A' | Fri Jan 01 2016 09:00:00.000 |
'3' | 'B' | Fri Jan 01 2016 10:00:00.000 |
'3' | 'B' | Fri Jan 01 2016 11:00:00.000 |
'3' | 'C' | Fri Jan 01 2016 12:00:00.000 |
Quantifiers
A quantifier defines that a regular expression may occur a variable number of times. Quantifiers must always be applied to a group expression. There are three different quantifiers available:
- * The regular expression may occur an arbitrary number of times, i.e. it occurs zero or more times.
- + The regular expression occurs at least once, i.e. it occurs one ore more times.
- ? The regular expression is optional, i.e. it may occur exactly once or not.
Syntax
"(" <regular_expression> ")" ( "+" | "?" | "*")
Examples
'B'
between activities 'A'
and 'C'
.
Filter
FILTER MATCH_PROCESS_REGEX ("Table1"."ACTIVITY", 'A' >> ('B')* >> 'C' ) = 1
Column1
"Table1"."Case"
Column2
"Table1"."Activity"
Table1
Case : INT | Activity : STRING | Timestamp : DATE |
---|---|---|
1 | 'A' | Fri Jan 01 2016 01:00:00.000 |
1 | 'B' | Fri Jan 01 2016 02:00:00.000 |
1 | 'C' | Fri Jan 01 2016 03:00:00.000 |
2 | 'A' | Fri Jan 01 2016 04:00:00.000 |
2 | 'B' | Fri Jan 01 2016 05:00:00.000 |
2 | 'B' | Fri Jan 01 2016 06:00:00.000 |
2 | 'C' | Fri Jan 01 2016 07:00:00.000 |
3 | 'A' | Fri Jan 01 2016 08:00:00.000 |
3 | 'C' | Fri Jan 01 2016 09:00:00.000 |
'B'
between activities 'A'
and 'C'
.
Filter
FILTER MATCH_PROCESS_REGEX ("Table1"."ACTIVITY", 'A' >> ('B')+ >> 'C' ) = 1
Column1
"Table1"."Case"
Column2
"Table1"."Activity"
Table1
Case : INT | Activity : STRING | Timestamp : DATE |
---|---|---|
1 | 'A' | Fri Jan 01 2016 01:00:00.000 |
1 | 'B' | Fri Jan 01 2016 02:00:00.000 |
1 | 'C' | Fri Jan 01 2016 03:00:00.000 |
2 | 'A' | Fri Jan 01 2016 04:00:00.000 |
2 | 'B' | Fri Jan 01 2016 05:00:00.000 |
2 | 'B' | Fri Jan 01 2016 06:00:00.000 |
2 | 'C' | Fri Jan 01 2016 07:00:00.000 |
3 | 'A' | Fri Jan 01 2016 08:00:00.000 |
3 | 'C' | Fri Jan 01 2016 09:00:00.000 |
'B'
between activities 'A'
and 'C'
.
Filter
FILTER MATCH_PROCESS_REGEX ("Table1"."ACTIVITY", 'A' >> ('B')? >> 'C' ) = 1
Column1
"Table1"."Case"
Column2
"Table1"."Activity"
Table1
Case : INT | Activity : STRING | Timestamp : DATE |
---|---|---|
1 | 'A' | Fri Jan 01 2016 01:00:00.000 |
1 | 'B' | Fri Jan 01 2016 02:00:00.000 |
1 | 'C' | Fri Jan 01 2016 03:00:00.000 |
2 | 'A' | Fri Jan 01 2016 04:00:00.000 |
2 | 'B' | Fri Jan 01 2016 05:00:00.000 |
2 | 'B' | Fri Jan 01 2016 06:00:00.000 |
2 | 'C' | Fri Jan 01 2016 07:00:00.000 |
3 | 'A' | Fri Jan 01 2016 08:00:00.000 |
3 | 'C' | Fri Jan 01 2016 09:00:00.000 |
Pitfall
- If not specified to match exactly at the start and/or at the end of the variant, a regular expression may match at any position in the variant as an arbitrary number of activities at beginning / end may be skipped. Consequently, a ? quantifier for example may also match any number of occurance of its subexpression, if it is the only subexpression or the first/last subexpression of a concatenation.
Example
'B'
. Because there is no exact match at start or end of
the variant defined, all given variants are matched as the additional activity 'B'
for variant 2 is
implicitly skipped.
Filter
FILTER MATCH_PROCESS_REGEX ("Table1"."ACTIVITY", ('B')? ) = 1
Column1
"Table1"."Case"
Column2
"Table1"."Activity"
Table1
Case : INT | Activity : STRING | Timestamp : DATE |
---|---|---|
1 | 'A' | Fri Jan 01 2016 01:00:00.000 |
1 | 'B' | Fri Jan 01 2016 02:00:00.000 |
1 | 'C' | Fri Jan 01 2016 03:00:00.000 |
2 | 'A' | Fri Jan 01 2016 04:00:00.000 |
2 | 'B' | Fri Jan 01 2016 05:00:00.000 |
2 | 'B' | Fri Jan 01 2016 06:00:00.000 |
2 | 'C' | Fri Jan 01 2016 07:00:00.000 |
3 | 'A' | Fri Jan 01 2016 08:00:00.000 |
3 | 'C' | Fri Jan 01 2016 09:00:00.000 |
Defining and referencing an alias
Sometimes, similar subexpressions may occur at different positions in the regular expression. This may be simple activities as well as complex subexpressions. An alias provides an elegant way to avoid repeating these subexpressions at different positions in the regular expression. It is a simple name that is assigned to a specific regular expression. By this name, the assigned regular expression can be referneced in order to re-use it in another regular expresssion. The regular expressions and their related aliases are given as a comma-separated list. The last regular expression of this list is the root expression which is the entry point for the pattern matching. Therefore, it cannot be referenced by another regular expression. Consequently, the root expression is never assigned to an alias.
Tips/Pitfalls
- Avoid to create cyclic definitions of regular expressions when using aliases as cycles will result in an error.
- Aliases can be used as a shortcut for long activity names. This may save a lot of typing and increase the readability of complex regular expressions.
- Be aware that the match at start symbol must be placed before the first aliased expressions, the match at end symbol must be placed after the root expression.
- It is not possible to reference aliases defined in another PROCESS_MATCH_REGEX operator. Only aliases within the same operator can be referenced.
Syntax
<regular_expression> ( AS <alias_name> "," <regular_expression> )*
Example
'A'
, 'B'
, 'C'
, and 'D'
.
Activities 'A'
and 'B'
are renamed to 'AliasA'
and 'AliasB'
, respectively.
Filter
FILTER MATCH_PROCESS_REGEX ("Table1"."ACTIVITY", 'B' AS AliasB, 'A' AS AliasA, AliasA >> AliasB >> 'C' >> 'D') = 1
Column1
"Table1"."CASE_ID"
Column2
"Table1"."ACTIVITY"
Table1
CASE_ID : STRING | ACTIVITY : STRING | startTimestamp : DATE |
---|---|---|
'1' | 'A' | Fri Jan 01 2016 01:00:00.000 |
'1' | 'B' | Fri Jan 01 2016 02:00:00.000 |
'1' | 'C' | Fri Jan 01 2016 03:00:00.000 |
'1' | 'D' | Fri Jan 01 2016 04:00:00.000 |
'2' | 'E' | Fri Jan 01 2016 05:00:00.000 |
'2' | 'F' | Fri Jan 01 2016 06:00:00.000 |
'2' | 'G' | Fri Jan 01 2016 07:00:00.000 |
'2' | 'I' | Fri Jan 01 2016 08:00:00.000 |