Applies to:  CELONIS 4.2 CELONIS 4.3 CELONIS 4.4 CELONIS 4.5 CELONIS 4.6 CELONIS 4.7 

Description

VARIANT aggregates all activities of a case into a string, which represents a variant of the case. SHORTENED can be used to shorten self-loops to a maximum number of occurrences.

Syntax

VARIANT ( activity_table.column ) 
  • activity_table.column: String column of the activity table.
SHORTENED ( VARIANT ( activity_table.column ) [, max_cycle_length ] )
  • activity_table.column: String column of the activity table.
  • max_cycle_length: More than max_cycle_length consecutive occurrences of the same activity are reduced to max_cycle_length occurrences (Since:  CELONIS 4.6). Default Value is 2.

SHORTENED

Sometimes, different cases may have self-loops of the same activity but with a different number of activities. Consequently, these cases are related to different variants. However, in some applications it is not of interest how often an activity is repeated but only if there is a self-loop or not. For such cases, the VARIANT operator can be wrapped by the SHORTENED command which shortens self-loops to a maximum number of occurrences. In this way, it is possible to abstract from repeated activities and reduce the number of distinct variants. The limit for the length of the self-loops can be specified by an optional parameter. The default value for the maximum cycle length is 2.

Input column

The input column can be any (modified) string column of the activity table. Usually, the activity column is used, but it is also possible to create variants based on a resource column, like user name or country information. Activities can also be grouped together by mapping activity names, for example using REMAP_VALUES. VARIANT ignores NULL values.

NULL handling

If an input value is NULL, VARIANT ignores the value.

Examples


[1] This example shows how VARIANT takes all activities per case and creates variants out of it.

Query
Column1
"Cases"."CASE_ID"
Column2
VARIANT ( "Activities"."ACTIVITY" )
Input
Activities
CASE_ID : INTACTIVITY : STRINGTIMESTAMP : DATE
1
'A'
Tue Jan 01 2019 13:00:00.000
1
'B'
Tue Jan 01 2019 13:01:00.000
1
'C'
Tue Jan 01 2019 13:02:00.000
2
'A'
Tue Jan 01 2019 13:03:00.000
2
'B'
Tue Jan 01 2019 13:04:00.000
2
'B'
Tue Jan 01 2019 13:05:00.000
2
'C'
Tue Jan 01 2019 13:06:00.000
3
'A'
Tue Jan 01 2019 13:07:00.000
3
'B'
Tue Jan 01 2019 13:08:00.000
3
'B'
Tue Jan 01 2019 13:09:00.000
3
'B'
Tue Jan 01 2019 13:10:00.000
3
'C'
Tue Jan 01 2019 13:11:00.000
Cases
CASE_ID : INT
1
2
3

Foreign Keys
Activities.CASE_IDCases.CASE_ID
Output
Result
Column1 : INTColumn2 : STRING
1
'A, B, C'
2
'A, B, B, C'
3
'A, B, B, B, C'



[2] If SHORTENED is applied to VARIANT, the loop in case '3' is reduced to a simple loop (with a default length of two):

Query
Column1
"Cases"."CASE_ID"
Column2
SHORTENED ( VARIANT ( "Activities"."ACTIVITY" ) )
Input
Activities
CASE_ID : INTACTIVITY : STRINGTIMESTAMP : DATE
1
'A'
Tue Jan 01 2019 13:00:00.000
1
'B'
Tue Jan 01 2019 13:01:00.000
1
'C'
Tue Jan 01 2019 13:02:00.000
2
'A'
Tue Jan 01 2019 13:03:00.000
2
'B'
Tue Jan 01 2019 13:04:00.000
2
'B'
Tue Jan 01 2019 13:05:00.000
2
'C'
Tue Jan 01 2019 13:06:00.000
3
'A'
Tue Jan 01 2019 13:07:00.000
3
'B'
Tue Jan 01 2019 13:08:00.000
3
'B'
Tue Jan 01 2019 13:09:00.000
3
'B'
Tue Jan 01 2019 13:10:00.000
3
'C'
Tue Jan 01 2019 13:11:00.000
Cases
CASE_ID : INT
1
2
3

Foreign Keys
Activities.CASE_IDCases.CASE_ID
Output
Result
Column1 : INTColumn2 : STRING
1
'A, B, C'
2
'A, B, B, C'
3
'A, B, B, C'



[3] Here is an example in which, instead of the ACTIVITY column, another column is used as input for VARIANT:

Query
Column1
"Cases"."CASE_ID"
Column2
VARIANT ( "Activities"."COUNTRY" )
Input
Activities
CASE_ID : INTACTIVITY : STRINGTIMESTAMP : DATECOUNTRY : STRING
1
'A'
Tue Jan 01 2019 13:00:00.000
'US'
1
'B'
Tue Jan 01 2019 13:01:00.000
'GB'
1
'C'
Tue Jan 01 2019 13:02:00.000
'GB'
2
'A'
Tue Jan 01 2019 13:03:00.000
'FR'
2
'B'
Tue Jan 01 2019 13:04:00.000
'AT'
2
'B'
Tue Jan 01 2019 13:05:00.000
'DE'
Cases
CASE_ID : INT
1
2

Foreign Keys
Activities.CASE_IDCases.CASE_ID
Output
Result
Column1 : INTColumn2 : STRING
1
'US, GB, GB'
2
'FR, AT, DE'



[4] In this example, activity 'B' is ignored in VARIANT by setting it to NULL using REMAP_VALUES:

Query
Column1
"Cases"."CASE_ID"
Column2
VARIANT ( REMAP_VALUES ( "Activities"."ACTIVITY" , [ 'B' , NULL ] ) )
Input
Activities
CASE_ID : INTACTIVITY : STRINGTIMESTAMP : DATE
1
'A'
Tue Jan 01 2019 13:00:00.000
1
'B'
Tue Jan 01 2019 13:01:00.000
1
'C'
Tue Jan 01 2019 13:02:00.000
2
'A'
Tue Jan 01 2019 13:03:00.000
2
'C'
Tue Jan 01 2019 13:04:00.000
2
'B'
Tue Jan 01 2019 13:05:00.000
Cases
CASE_ID : INT
1
2

Foreign Keys
Activities.CASE_IDCases.CASE_ID
Output
Result
Column1 : INTColumn2 : STRING
1
'A, C'
2
'A, C'



[5] Here all occurrences of the activities 'B' and 'C' are remapped to 'BC' using the REMAP_VALUES operator:

Query
Column1
"Cases"."CASE_ID"
Column2
VARIANT ( REMAP_VALUES ( "Activities"."ACTIVITY" , [ 'B' , 'BC' ] , [ 'C' , 'BC' ] ) )
Input
Activities
CASE_ID : INTACTIVITY : STRINGTIMESTAMP : DATE
1
'A'
Tue Jan 01 2019 13:00:00.000
1
'B'
Tue Jan 01 2019 13:01:00.000
1
'C'
Tue Jan 01 2019 13:02:00.000
2
'A'
Tue Jan 01 2019 13:03:00.000
2
'C'
Tue Jan 01 2019 13:04:00.000
2
'B'
Tue Jan 01 2019 13:05:00.000
Cases
CASE_ID : INT
1
2

Foreign Keys
Activities.CASE_IDCases.CASE_ID
Output
Result
Column1 : INTColumn2 : STRING
1
'A, BC, BC'
2
'A, BC, BC'


  • No labels