Applies to: CELONIS 4.0 CELONIS 4.2 CELONIS 4.3 CELONIS 4.4
Description
MATCH_PROCESS matches the variants of a process against a given pattern.
MATCH_PROCESS uses Nodes and Edges to match the cases. Nodes consist either of a single activity or a list of activities. Edges describe how the nodes are linked together.
Syntax
MATCH_PROCESS ( [ activity_table.activity_column ,] node (, node)* CONNECTED BY edge (, edge)* )
- activity_column: string column joinable to the activity table
- node: NODE | OPTIONAL | LOOP | OPTIONAL_LOOP | STARTING | ENDING | single_activity (, single_activity )* AS node_name
- single_activity: [LIKE] activity (Activity name. LIKE allows you to use wildcards in your activity name)
- edge: DIRECT | EVENTUALLY [ edge_start_node, edge_end_node ]
- edge_start_node: node_name
- edge_end_node: node_name
Node
A node consists of one or more activities. If multiple activities are given, it means one of those activities.
Node Types
- NODE: Node which has to be part once in the case, without any restrictions on where the node has to be.
- STARTING: Node which has to happen at the beginning of a case.
- ENDING: Node which has to happen at the end of a case.
- LOOP: Node which occurs at least once but can also be repeated.
Edge Types
- DIRECT: edge_end_node has to follow directly after the edge_start_node
- EVENTUALLY: between edge_start_node and edge_end_node other activities can be placed
Result
MATCH_PROCESS returns an integer column, which flags all matching cases with 1 and all non matching columns with a 0. The result is often used in combination with a filter.
Examples
Here MATCH_PROCESS flags all cases in which one activity A is followed directly by activity B with a 1.
Column1
"Table1_CASES"."CASE_ID"
Column2
MATCH_PROCESS("Table1"."ACTIVITY", NODE['A'] as src, NODE['B'] as trg CONNECTED BY DIRECT [ src, trg ])
Here is MATCH_PROCESS combined with a filter. The result are only cases in which one activity A is followed by activity B.
Filter
FILTER MATCH_PROCESS("Table1"."ACTIVITY", NODE['A'] as src, NODE['B'] as trg CONNECTED BY DIRECT [ src, trg ]) = 1
Column1
"Table1"."CASE_ID"
Column2
"Table1"."ACTIVITY"
If an activity has not only to be directly followed by another activity but can come any time later the keyword EVENTUALLY can be used. In this example MATCH_PROCESS flags all cases in which one activity A is followed eventually by activity B with a 1.
Column1
"Table1_CASES"."CASE_ID"
Column2
MATCH_PROCESS("Table1"."ACTIVITY", NODE['A'] as src, NODE['B'] as trg CONNECTED BY EVENTUALLY [ src, trg ])
Here is an example in which node 'node_ab' has two activities. This means that a matching case needs an activity C which comes either after A or B.
Column1
"Table1_CASES"."CASE_ID"
Column2
MATCH_PROCESS("Table1"."ACTIVITY", NODE['A', 'B'] as node_ab, NODE['C'] as node_c CONNECTED BY DIRECT [ node_ab, node_c ])
Nodes can represent loops. Here matching cases can have between activity A and C at least one or more activities of type B.
Column1
"Table1_CASES"."CASE_ID"
Column2
MATCH_PROCESS("Table1"."ACTIVITY", NODE['A'] AS node_a, LOOP['B'] AS loop_b, NODE['C'] AS node_c CONNECTED BY DIRECT [ node_a, loop_b ], DIRECT [ loop_b, node_c ])
Table1
CASE_ID : STRING | ACTIVITY : STRING | TIMESTAMP : DATE |
---|---|---|
'1' | 'A' | Sun Jan 01 2017 13:00:00.000 |
'1' | 'B' | Sun Jan 01 2017 13:01:00.000 |
'1' | 'B' | Sun Jan 01 2017 13:02:00.000 |
'1' | 'C' | Sun Jan 01 2017 13:04:00.000 |
'2' | 'A' | Sun Jan 01 2017 13:00:00.000 |
'2' | 'B' | Sun Jan 01 2017 13:01:00.000 |
'2' | 'C' | Sun Jan 01 2017 13:02:00.000 |
'3' | 'A' | Sun Jan 01 2017 13:00:00.000 |
'3' | 'C' | Sun Jan 01 2017 13:02:00.000 |
A loop node can also consist of multiple activities. PROCESS_MATCH accepts than all given activities, without regarding order or number of occurrences till another activity is found.
Column1
"Table1_CASES"."CASE_ID"
Column2
MATCH_PROCESS("Table1"."ACTIVITY", NODE['A'] AS node_a, LOOP['B', 'C'] AS loop_bc, NODE['D'] AS node_d CONNECTED BY DIRECT [ node_a, loop_bc ], DIRECT [ loop_bc, node_d ])
Nodes can be forced to be at the start or the end of a case.
Column1
"Table1_CASES"."CASE_ID"
Column2
MATCH_PROCESS("Table1"."ACTIVITY", STARTING['A'] AS node_a, ENDING['B'] AS node_b CONNECTED BY DIRECT [ node_a, node_b ])
Table1
CASE_ID : STRING | ACTIVITY : STRING | TIMESTAMP : DATE |
---|---|---|
'1' | 'A' | Sun Jan 01 2017 13:00:00.000 |
'1' | 'B' | Sun Jan 01 2017 13:01:00.000 |
'2' | 'A' | Sun Jan 01 2017 13:02:00.000 |
'2' | 'B' | Sun Jan 01 2017 13:03:00.000 |
'2' | 'C' | Sun Jan 01 2017 13:05:00.000 |
'3' | 'A' | Sun Jan 01 2017 13:06:00.000 |
'3' | 'A' | Sun Jan 01 2017 13:07:00.000 |
'3' | 'B' | Sun Jan 01 2017 13:08:00.000 |