Applies to: CELONIS 4.0 CELONIS 4.2 CELONIS 4.3 CELONIS 4.4 CELONIS 4.5
Description
MATCH_PROCESS matches the variants of a process against a given pattern.
Similar functionality is provided by MATCH_PROCESS_REGEX. MATCH_PROCESS uses Nodes and Edges to match the cases. Nodes consist either of a single activity or a list of activities. Edges describe how the nodes are linked together.
Syntax
MATCH_PROCESS ( [ activity_table.activity_column ,] node (, node)* CONNECTED BY edge (, edge)* )
- activity_column: string column joinable to the activity table
- node: NODE | OPTIONAL | LOOP | OPTIONAL_LOOP | STARTING | ENDING | single_activity (, single_activity )* AS node_name
- single_activity: [LIKE] activity (Activity name. LIKE allows you to use wildcards in your activity name)
- edge: DIRECT | EVENTUALLY [ edge_start_node, edge_end_node ]
- edge_start_node: node_name
- edge_end_node: node_name
Node
A node consists of one or more activities. If multiple activities are given, it means one of those activities.
Node Types
- NODE: Node which has to be part once in the case, without any restrictions on where the node has to be.
- STARTING: Node which has to happen at the beginning of a case.
- ENDING: Node which has to happen at the end of a case.
- LOOP: Node which occurs at least once but can also be repeated.
Edge Types
- DIRECT: edge_end_node has to follow directly after the edge_start_node
- EVENTUALLY: between edge_start_node and edge_end_node other activities can be placed
Result
MATCH_PROCESS returns an integer column, which flags all matching cases with 1 and all non matching columns with a 0. The result is often used in combination with a filter.
Examples
Here MATCH_PROCESS flags all cases in which one activity A is followed directly by activity B with a 1.
Query
Column1
"Activities_CASES"."CASE_ID"
Column2
MATCH_PROCESS ( "Activities"."ACTIVITY", NODE['A'] as src, NODE['B'] as trg CONNECTED BY DIRECT [ src, trg ] )
Input
Activities
Activities_CASES
CASE_ID : STRING |
---|
'1' |
'2' |
Foreign Keys
Activities_CASES.CASE_ID | Activities.CASE_ID |
Here is MATCH_PROCESS combined with a filter. The result are only cases in which one activity A is followed by activity B.
Query
Filter
FILTER MATCH_PROCESS ( "Activities"."ACTIVITY", NODE['A'] as src, NODE['B'] as trg CONNECTED BY DIRECT [ src, trg ] ) = 1
Column1
"Activities"."CASE_ID"
Column2
"Activities"."ACTIVITY"
Input
Activities
Activities_CASES
CASE_ID : STRING |
---|
'1' |
'2' |
Foreign Keys
Activities_CASES.CASE_ID | Activities.CASE_ID |
If an activity has not only to be directly followed by another activity but can come any time later the keyword EVENTUALLY can be used.
In this example MATCH_PROCESS flags all cases in which one activity A is followed eventually by activity B with a 1.
Query
Column1
"Activities_CASES"."CASE_ID"
Column2
MATCH_PROCESS ( "Activities"."ACTIVITY", NODE['A'] as src, NODE['B'] as trg CONNECTED BY EVENTUALLY [ src, trg ] )
Input
Activities
Activities_CASES
CASE_ID : STRING |
---|
'1' |
'2' |
Foreign Keys
Activities_CASES.CASE_ID | Activities.CASE_ID |
Here is an example in which node 'node_ab' has two activities. This means that a matching case needs an activity C
which comes either after A or B.
Query
Column1
"Activities_CASES"."CASE_ID"
Column2
MATCH_PROCESS ( "Activities"."ACTIVITY", NODE['A', 'B'] as node_ab, NODE['C'] as node_c CONNECTED BY DIRECT [ node_ab, node_c ] )
Input
Activities
Activities_CASES
CASE_ID : STRING |
---|
'1' |
'2' |
Foreign Keys
Activities_CASES.CASE_ID | Activities.CASE_ID |
Nodes can represent loops. Here matching cases can have between activity A and C at least one or more activities of type B.
Query
Column1
"Activities_CASES"."CASE_ID"
Column2
MATCH_PROCESS ( "Activities"."ACTIVITY", NODE['A'] AS node_a, LOOP['B'] AS loop_b, NODE['C'] AS node_c CONNECTED BY DIRECT [ node_a, loop_b ], DIRECT [ loop_b, node_c ] )
Input
Activities
CASE_ID : STRING | ACTIVITY : STRING | TIMESTAMP : DATE |
---|---|---|
'1' | 'A' | Sun Jan 01 2017 13:00:00.000 |
'1' | 'B' | Sun Jan 01 2017 13:01:00.000 |
'1' | 'B' | Sun Jan 01 2017 13:02:00.000 |
'1' | 'C' | Sun Jan 01 2017 13:04:00.000 |
'2' | 'A' | Sun Jan 01 2017 13:00:00.000 |
'2' | 'B' | Sun Jan 01 2017 13:01:00.000 |
'2' | 'C' | Sun Jan 01 2017 13:02:00.000 |
'3' | 'A' | Sun Jan 01 2017 13:00:00.000 |
'3' | 'C' | Sun Jan 01 2017 13:02:00.000 |
Activities_CASES
CASE_ID : STRING |
---|
'1' |
'2' |
'3' |
Foreign Keys
Activities_CASES.CASE_ID | Activities.CASE_ID |
A loop node can also consist of multiple activities. PROCESS_MATCH accepts than all given activities, without
regarding order or number of occurrences till another activity is found.
Query
Column1
"Activities_CASES"."CASE_ID"
Column2
MATCH_PROCESS ( "Activities"."ACTIVITY", NODE['A'] AS node_a, LOOP['B', 'C'] AS loop_bc, NODE['D'] AS node_d CONNECTED BY DIRECT [ node_a, loop_bc ], DIRECT [ loop_bc, node_d ] )
Input
Nodes can be forced to be at the start or the end of a case.
Query
Column1
"Activities_CASES"."CASE_ID"
Column2
MATCH_PROCESS ( "Activities"."ACTIVITY", STARTING['A'] AS node_a, ENDING['B'] AS node_b CONNECTED BY DIRECT [ node_a, node_b ] )
Input
Activities
CASE_ID : STRING | ACTIVITY : STRING | TIMESTAMP : DATE |
---|---|---|
'1' | 'A' | Sun Jan 01 2017 13:00:00.000 |
'1' | 'B' | Sun Jan 01 2017 13:01:00.000 |
'2' | 'A' | Sun Jan 01 2017 13:02:00.000 |
'2' | 'B' | Sun Jan 01 2017 13:03:00.000 |
'2' | 'C' | Sun Jan 01 2017 13:05:00.000 |
'3' | 'A' | Sun Jan 01 2017 13:06:00.000 |
'3' | 'A' | Sun Jan 01 2017 13:07:00.000 |
'3' | 'B' | Sun Jan 01 2017 13:08:00.000 |
Activities_CASES
CASE_ID : STRING |
---|
'1' |
'2' |
'3' |
Foreign Keys
Activities_CASES.CASE_ID | Activities.CASE_ID |