Skip to main content

PQL47 (PQL Function Library - CPM 4.7)

VARIANT

Applies to: CELONIS 4.2 CELONIS 4.3 CELONIS 4.4 CELONIS 4.5 CELONIS 4.6 CELONIS 4.7

Description

VARIANT aggregates all activities of a case into a string, which represents a variant of the case. SHORTENED can be used to shorten self-loops to a maximum number of occurrences.

Syntax
VARIANT ( activity_table.column ) 
  • activity_table.column: String column of the activity table.

SHORTENED ( VARIANT ( activity_table.column ) [, max_cycle_length ] )
  • activity_table.column: String column of the activity table.

  • max_cycle_length: More than max_cycle_length consecutive occurrences of the same activity are reduced to max_cycle_length occurrences (Since: CELONIS 4.6). Default Value is 2.

SHORTENED

Sometimes, different cases may have self-loops of the same activity but with a different number of activities. Consequently, these cases are related to different variants. However, in some applications it is not of interest how often an activity is repeated but only if there is a self-loop or not. For such cases, the VARIANT operator can be wrapped by the SHORTENED command which shortens self-loops to a maximum number of occurrences. In this way, it is possible to abstract from repeated activities and reduce the number of distinct variants. The limit for the length of the self-loops can be specified by an optional parameter. The default value for the maximum cycle length is 2.

Input column

The input column can be any (modified) string column of the activity table. Usually, the activity column is used, but it is also possible to create variants based on a resource column, like user name or country information. Activities can also be grouped together by mapping activity names, for example using REMAP_VALUES. VARIANT ignores NULL values.

NULL handling

If an input value is NULL, VARIANT ignores the value.

Examples

[1] This example shows how VARIANT takes all activities per case and creates variants out of it.

Query

Column1

"Cases"."CASE_ID"

Column2

VARIANT ( "Activities"."ACTIVITY" )

Input

Output

Activities

CASE_ID : INT

ACTIVITY : STRING

TIMESTAMP : DATE

1

'A'

Tue Jan 01 2019 13:00:00.000

1

'B'

Tue Jan 01 2019 13:01:00.000

1

'C'

Tue Jan 01 2019 13:02:00.000

2

'A'

Tue Jan 01 2019 13:03:00.000

2

'B'

Tue Jan 01 2019 13:04:00.000

2

'B'

Tue Jan 01 2019 13:05:00.000

2

'C'

Tue Jan 01 2019 13:06:00.000

3

'A'

Tue Jan 01 2019 13:07:00.000

3

'B'

Tue Jan 01 2019 13:08:00.000

3

'B'

Tue Jan 01 2019 13:09:00.000

3

'B'

Tue Jan 01 2019 13:09:00.000

3

'C'

Tue Jan 01 2019 13:11:00.000

Cases

CASE_ID : INT

1

2

3

Foreign Keys

Activities.CASE_ID

Cases.CASE_ID

Result

Column1 : INT

Column2 : STRING

1

'A, B, C'

2

'A, B, B, C'

3

'A, B, B, B, C'

[2] If SHORTENED is applied to VARIANT, the loop in case '3' is reduced to a simple loop (with a default length of two).

Query

Column1

"Cases"."CASE_ID"

Column2

SHORTENED ( VARIANT ( "Activities"."ACTIVITY" ) )

Input

Output

Activities

CASE_ID : INT

ACTIVITY : STRING

TIMESTAMP : DATE

1

'A'

Tue Jan 01 2019 13:00:00.000

1

'B'

Tue Jan 01 2019 13:01:00.000

1

'C'

Tue Jan 01 2019 13:02:00.000

2

'A'

Tue Jan 01 2019 13:03:00.000

2

'B'

Tue Jan 01 2019 13:04:00.000

2

'B'

Tue Jan 01 2019 13:05:00.000

2

'C'

Tue Jan 01 2019 13:06:00.000

3

'A'

Tue Jan 01 2019 13:07:00.000

3

'B'

Tue Jan 01 2019 13:08:00.000

3

'B'

Tue Jan 01 2019 13:09:00.000

3

'B'

Tue Jan 01 2019 13:10:00.000

3

'C'

Tue Jan 01 2019 13:11:00.000

Cases

CASE_ID : INT

1

2

3

Foreign Keys

Cases.CASE_ID

Result

Column1 : INT

Column2 : STRING

1

'A, B, C'

2

'A, B, B, C'

3

'A, B, B, C'

[3] Here is an example in which, instead of the ACTIVITY column, another column is used as input for VARIANT.

Query

Column1

"Cases"."CASE_ID"

Column2

VARIANT ( "Activities"."COUNTRY" )

Input

Output

Activities

CASE_ID : INT

ACTIVITY : STRING

TIMESTAMP : DATE

COUNTRY : STRING

1

'A'

Tue Jan 01 2019 13:00:00.000

'US'

1

'B'

Tue Jan 01 2019 13:01:00.000

'GB'

1

'C'

Tue Jan 01 2019 13:02:00.000

'GB'

2

'A'

Tue Jan 01 2019 13:03:00.000

'FR'

2

'B'

Tue Jan 01 2019 13:04:00.000

'AT'

2

'B'

Tue Jan 01 2019 13:05:00.000

'DE'

Cases

CASE_ID : INT

1

2

Foreign Keys

Activities.CASE_ID

Cases.CASE_ID

Result

Column1 : INT

Column1 : INT

1

'US, GB, GB'

2

'FR, AT, DE'

[4] In this example, activity 'B' is ignored in VARIANT by setting it to NULL using REMAP_VALUES.

Query

Column1

"Cases"."CASE_ID"

Column2

VARIANT ( REMAP_VALUES ( "Activities"."ACTIVITY" , [ 'B' , NULL ] ) )

Input

Output

Activities

CASE_ID : INT

ACTIVITY : STRING

TIMESTAMP : DATE

1

'A'

Tue Jan 01 2019 13:00:00.000

1

'B'

Tue Jan 01 2019 13:01:00.000

1

'C'

Tue Jan 01 2019 13:02:00.000

2

'A'

Tue Jan 01 2019 13:03:00.000

2

'C'

Tue Jan 01 2019 13:04:00.000

2

'B'

Tue Jan 01 2019 13:05:00.000

Cases

CASE_ID : INT

1

2

Foreign Keys

Activities.CASE_ID

Cases.CASE_ID

Result

Column1 : INT

Column2 : STRING

1

'A, C'

2

'A, C'

[5] Here all occurrences of the activities 'B' and 'C' are remapped to 'BC' using the REMAP_VALUES operator.

Query

Column1

"Cases"."CASE_ID"

Column2

VARIANT ( REMAP_VALUES ( "Activities"."ACTIVITY" , [ 'B' , 'BC' ] , [ 'C' , 'BC' ] ) )

Input

Output

Activities

CASE_ID : INT

ACTIVITY : STRING

TIMESTAMP : DATE

1

'A'

Tue Jan 01 2019 13:00:00.000

1

'B'

Tue Jan 01 2019 13:01:00.000

1

'C'

Tue Jan 01 2019 13:02:00.000

2

'A'

Tue Jan 01 2019 13:03:00.000

2

'C'

Tue Jan 01 2019 13:04:00.000

2

'B'

Tue Jan 01 2019 13:05:00.000

Cases

CASE_ID : INT

1

2

Foreign Key

Activities.CASE_ID

Cases.CASE_ID

Result

Column1 : INT

Column2 : STRING

1

'A, BC, BC'

2

'A, BC, BC'

See also: