Skip to main content

PQL46 (PQL Function Library - CPM 4.6)

Calculations on activity sequences

Applies to: CELONIS 4.6

Description

This example shows how to identify and perform calculations on activity sequences containing more than two activities with PQL.

Identifying recurrent sequences containing e.g. unnecessary rework is an important requirement to ensure a smooth process handling. For example, the smooth processing of the purchase approval workflow is crucial to ensure a prompt delivery of ordered materials. As purchase orders often have to pass extensive approval procedures, the avoidance of unnecessary rework steps associated with the release process are therefore essential.

Challenge

For each case, show the occurring activity sequences that contain the activities 'Approval requested', 'Change PO for approval requested' and 'PO changed'.

The number of occurrences per case should be shown in a separate column.

The average throughput time for the activity sequence per case should be shown in a separate column.

Solution

[1] The activity sequences can be identified using the following query. In this example, case 1 and case 3 contain the specific activity sequence ('Approval requested' followed by 'Change PO for approval request' followed by 'PO changed'), and case 2 does not.

Query

Filter

FILTER
    "Activities"."Activity" = 'Approval requested'
    AND
    ACTIVITY_LEAD ( "Activities"."Activity" ) = 'Change of PO for approval requested'
    AND
    ACTIVITY_LEAD ( "Activities"."Activity" , 2 ) = 'PO changed';

Column1

"Activities"."Case"

Column2

"Activities"."Activity"
||
' - '
||
ACTIVITY_LEAD ( "Activities"."Activity" )
||
' - '
||
ACTIVITY_LEAD ( "Activities"."Activity" , 2 )

Column3

COUNT ( "Activities"."Activity" )

Column4

AVG ( DATEDIFF ( hh , "Activities"."Timestamp" , ACTIVITY_LEAD ( "Activities"."Timestamp" , 2 ) ) )

Input

Output

Activities

Case : INT

Activity : STRING

Timestamp : DATE

1

'PO created'

Mon Feb 01 2016 01:00:00.000

1

'Approval requested'

Mon Feb 01 2016 02:00:00.000

1

'Change of PO for approval requested'

Mon Feb 01 2016 03:00:00.000

1

'PO changed'

Mon Feb 01 2016 04:00:00.000

1

'Approval requested'

Mon Feb 01 2016 05:00:00.000

1

'Change of PO for approval requested'

Mon Feb 01 2016 06:00:00.000

1

'PO changed'

Mon Feb 01 2016 07:00:00.000

1

'Approval requested'

Mon Feb 01 2016 08:00:00.000

1

'Approval granted'

Mon Feb 01 2016 09:00:00.000

2

'PO created'

Mon Feb 01 2016 01:00:00.000

2

'Approval requested'

Mon Feb 01 2016 02:00:00.000

2

'Approval granted'

Mon Feb 01 2016 03:00:00.000

3

'PO created'

Mon Feb 01 2016 02:00:00.000

3

'Change of PO for approval requested'

Mon Feb 01 2016 03:00:00.000

3

'PO changed'

Mon Feb 01 2016 04:00:00.000

3

Approval granted'

Mon Feb 01 2016 05:00:00.000

Result

Column1 : INT

Column2 : STRING

Column3 : INT

Column4 :  FLOAT

1

'Approval requested - Change of PO for approval requested - PO changed'

2

2.0

3

'Approval requested - Change of PO for approval requested - PO changed'

1

2.0

Step-by-step solution

In order to find (specific) activity sequences, we need to compare each activity with the following activities. This can be done with the ACTIVITY_LEAD operator. Now let's take a closer look on this operator.

[2] To be able to compare the current activity with the next and the second next activity, we need to have them in one line. We can do so by using ACTIVITY_LEAD.

Query

Column1

"Activities"."Case"

Column2

"Activities"."Activity"

Column3

ACTIVITY_LEAD ( "Activities"."Activity" )

Column4

ACTIVITY_LEAD ( "Activities"."Activity" , 2 )

Input

Output

Activities

Case : INT

Activity : STRING

Timestamp : DATE

1

'PO created'

Mon Feb 01 2016 01:00:00.000

1

'Approval requested'

Mon Feb 01 2016 02:00:00.000

1

'Change of PO for approval requested'

Mon Feb 01 2016 03:00:00.000

1

'PO changed'

Mon Feb 01 2016 04:00:00.000

1

'Approval requested'

Mon Feb 01 2016 05:00:00.000

1

'Change of PO for approval requested'

Mon Feb 01 2016 06:00:00.000

1

'PO changed'

Mon Feb 01 2016 07:00:00.000

1

'Approval requested'

Mon Feb 01 2016 08:00:00.000

1

'Approval granted'

Mon Feb 01 2016 09:00:00.000

2

'PO created'

Mon Feb 01 2016 01:00:00.000

2

'Approval requested'

Mon Feb 01 2016 02:00:00.000

2

'Approval granted'

Mon Feb 01 2016 03:00:00.000

3

'PO created'

Mon Feb 01 2016 01:00:00.000

3

'Approval requested'

Mon Feb 01 2016 02:00:00.000

3

'Change of PO for approval requested'

Mon Feb 01 2016 03:00:00.000

3

'PO changed'

Mon Feb 01 2016 04:00:00.000

3

'Approval requested'

Mon Feb 01 2016 05:00:00.000

3

'Approval granted'

Mon Feb 01 2016 06:00:00.000

Result

Column1 : INT

Column2 : STRING

Column3 : STRING

Column4 : STRING

1

'PO created'

'Approval requested'

'Change of PO for approval requested'

1

'Approval requested'

'Change of PO for approval requested'

'PO changed'

1

'Change of PO for approval requested'

'PO changed'

'Approval requested'

1

'PO changed'

'Approval requested'

'Change of PO for approval requested'

1

'Approval requested'

'Change of PO for approval requested'

'PO changed'

1

'Change of PO for approval requested'

'PO changed'

'Approval requested'

1

'PO changed'

'Approval requested'

'Approval granted'

1

'Approval requested'

'Approval granted'

null

1

'Approval granted'

null

null

2

'PO created'

'Approval requested'

'Approval granted'

2

'Approval requested'

'Approval granted'

null

2

'Approval granted'

null

null

3

'PO created'

'Approval requested'

'Change of PO for approval requested'

3

'Approval requested'

'Change of PO for approval requested'

'PO changed'

3

'Change of PO for approval requested'

'PO changed'

'Approval requested'

3

'PO changed'

'Approval requested'

'Approval granted'

3

Approval requested'

'Approval granted'

null

3

'Approval granted'

null

null

[3] In order to get the specific activity sequences starting with activity 'Approval requested', we are applying a filter to the table.

Query

Filter

FILTER "Activities"."Activity" = 'Approval requested'

Column1

"Activities"."Case"

Column2

"Activities"."Activity"

Column3

ACTIVITY_LEAD ( "Activities"."Activity" )

Column4

ACTIVITY_LEAD ( "Activities"."Activity" , 2 )

Input

Output

Activities

Case : INT

Activity : STRING

Timestamp : DATE

1

'PO created'

Mon Feb 01 2016 01:00:00.000

1

'Approval requested'

Mon Feb 01 2016 02:00:00.000

1

'Change of PO for approval requested'

Mon Feb 01 2016 03:00:00.000

1

'PO changed'

Mon Feb 01 2016 04:00:00.000

1

'Approval requested'

Mon Feb 01 2016 05:00:00.000

1

'Change of PO for approval requested'

'Change of PO for approval requested'

1

'PO changed'

Mon Feb 01 2016 07:00:00.000

1

'Approval requested'

Mon Feb 01 2016 08:00:00.000

1

'Approval granted'

Mon Feb 01 2016 09:00:00.000

2

'PO created'

Mon Feb 01 2016 01:00:00.000

2

'Approval requested'

Mon Feb 01 2016 02:00:00.000

2

'Approval granted'

Mon Feb 01 2016 03:00:00.000

3

PO created'

Mon Feb 01 2016 01:00:00.000

3

'Approval requested'

Mon Feb 01 2016 02:00:00.000

3

'Change of PO for approval requested'

Mon Feb 01 2016 03:00:00.000

3

'PO changed'

Mon Feb 01 2016 04:00:00.000

3

'Approval requested'

Mon Feb 01 2016 05:00:00.000

3

'Approval granted'

Mon Feb 01 2016 06:00:00.000

Result

Column1 : INT

Column2 : STRING

Column3 : STRING

Column4 : STRING

1

'Approval requested'

'Change of PO for approval requested'

'PO changed'

1

'Approval requested'

'Change of PO for approval requested'

'PO changed'

1

'Approval requested'

'Approval granted'

null

2

'Approval requested'

'Approval granted'

null

3

'Approval requested'

'Change of PO for approval requested'

'PO changed'

3

'Approval requested'

'Approval granted'

null

[4] In the next step, we want only activity sequences starting with activity 'Approval requested' followed by activity 'Change of PO for approval requested' and ending with activity 'PO changed' in the second next activity. Therefore, we are applying an additional filter to the table.

Query

Filter

FILTER
    "Activities"."Activity" = 'Approval requested'
    AND
    ACTIVITY_LEAD ( "Activities"."Activity" ) = 'Change of PO for approval requested'
    AND
    ACTIVITY_LEAD ( "Activities"."Activity" , 2 ) = 'PO changed';

Column1

"Activities"."Case"

Column2

"Activities"."Activity"

Column3

ACTIVITY_LEAD ( "Activities"."Activity" )

Column4

ACTIVITY_LEAD ( "Activities"."Activity" , 2 )

Input

Output

Activities

Case : INT

Activity : STRING

Timestamp : DATE

1

'PO created'

Mon Feb 01 2016 01:00:00.000

1

'Approval requested'

Mon Feb 01 2016 02:00:00.000

1

'Change of PO for approval requested'

Mon Feb 01 2016 03:00:00.000

1

'PO changed'

Mon Feb 01 2016 04:00:00.000

1

'Approval requested'

Mon Feb 01 2016 05:00:00.000

1

'Change of PO for approval requested'

Mon Feb 01 2016 06:00:00.000

1

'PO changed'

Mon Feb 01 2016 07:00:00.000

1

'Approval requested'

Mon Feb 01 2016 08:00:00.000

1

'Approval granted'

Mon Feb 01 2016 09:00:00.000

2

'PO created'

Mon Feb 01 2016 01:00:00.000

2

'Approval requested'

Mon Feb 01 2016 02:00:00.000

2

'Approval granted'

Mon Feb 01 2016 03:00:00.000

3

'PO created'

Mon Feb 01 2016 01:00:00.000

3

'Approval requested'

Mon Feb 01 2016 02:00:00.000

3

'Change of PO for approval requested'

Mon Feb 01 2016 03:00:00.000

3

'PO changed'

Mon Feb 01 2016 04:00:00.000

3

'Approval requested'

Mon Feb 01 2016 05:00:00.000

3

'Approval granted'

Mon Feb 01 2016 06:00:00.000

Result

Column1 : INT

Column2 : STRING

Column3 : STRING

Column4 : STRING

1

'Approval requested'

'Change of PO for approval requested'

'PO changed'

1

'Approval requested'

'Change of PO for approval requested'

'PO changed'

3

'Approval requested'

'Change of PO for approval requested'

'PO changed'

[5] Next, we want to count the occurrences of the specific activity sequence in the table. To do so, we are using a COUNTaggregation and additionally concatenate the columns with activity names.

Query

Filter

FILTER
    "Activities"."Activity" = 'Approval requested'
    AND
    ACTIVITY_LEAD ( "Activities"."Activity" ) = 'Change of PO for approval requested'
    AND
    ACTIVITY_LEAD ( "Activities"."Activity" , 2 ) = 'PO changed';

Column1

"Activities"."Case"

Column2

"Activities"."Activity"
||
' - '
||
ACTIVITY_LEAD ( "Activities"."Activity" )
||
' - '
||
ACTIVITY_LEAD ( "Activities"."Activity" , 2 )

Column3

COUNT ( "Activities"."Activity" )

Input

Output

Activities

Case : INT

Activity : STRING

Timestamp : DATE

1

'PO created'

Mon Feb 01 2016 01:00:00.000

1

'Approval requested'

Mon Feb 01 2016 02:00:00.000

1

'Change of PO for approval requested'

Mon Feb 01 2016 03:00:00.000

1

'PO changed'

Mon Feb 01 2016 04:00:00.000

1

'Approval requested'

Mon Feb 01 2016 05:00:00.000

1

Change of PO for approval requested'

Mon Feb 01 2016 06:00:00.000

1

'PO changed'

Mon Feb 01 2016 07:00:00.000

1

'Approval requested'

Mon Feb 01 2016 08:00:00.000

1

'Approval granted'

Mon Feb 01 2016 09:00:00.000

2

'PO created'

Mon Feb 01 2016 01:00:00.000

2

'Approval requested'

Mon Feb 01 2016 02:00:00.000

2

'Approval granted'

Mon Feb 01 2016 03:00:00.000

3

'PO created'

Mon Feb 01 2016 01:00:00.000

3

'PO created'

Mon Feb 01 2016 02:00:00.000

3

Change of PO for approval requested'

Mon Feb 01 2016 03:00:00.000

3

'PO changed'

Mon Feb 01 2016 04:00:00.000

3

'Approval requested'

Mon Feb 01 2016 05:00:00.000

3

'Approval granted'

Mon Feb 01 2016 06:00:00.000

Result

Column1 : INT

Column2 : STRING

Column3 : INT

1

Approval requested - Change of PO for approval requested - PO changed'

2

3

Approval requested - Change of PO for approval requested - PO changed'

1

[6] In the last step, we additionally want to calculate the average throughput time for the activity sequences per case. For this purpose, we are using the AVG aggregation around the DATEDIFF function in a separate column.

Query

Filter

FILTER
    "Activities"."Activity" = 'Approval requested'
    AND
    ACTIVITY_LEAD ( "Activities"."Activity" ) = 'Change of PO for approval requested'
    AND
    ACTIVITY_LEAD ( "Activities"."Activity" , 2 ) = 'PO changed';

Column1

"Activities"."Case"

Column2

"Activities"."Activity"
||
' - '
||
ACTIVITY_LEAD ( "Activities"."Activity" )
||
' - '
||
ACTIVITY_LEAD ( "Activities"."Activity" , 2 )

Column3

COUNT ( "Activities"."Activity" )

Column4

AVG ( DATEDIFF ( hh , "Activities"."Timestamp" , ACTIVITY_LEAD ( "Activities"."Timestamp" , 2 ) ) )

Input

Output

Activities

Case : INT

Activity : STRING

Timestamp : DATE

1

'PO created'

Mon Feb 01 2016 01:00:00.000

1

'Approval requested'

Mon Feb 01 2016 02:00:00.000

1

'Change of PO for approval requested'

Mon Feb 01 2016 03:00:00.000

1

'PO changed'

Mon Feb 01 2016 04:00:00.000

1

'Approval requested'

Mon Feb 01 2016 05:00:00.000

1

'Change of PO for approval requested'

Mon Feb 01 2016 06:00:00.000

1

'PO changed'

Mon Feb 01 2016 07:00:00.000

1

'Approval requested'

Mon Feb 01 2016 08:00:00.000

1

'Approval granted'

Mon Feb 01 2016 09:00:00.000

2

'PO created'

Mon Feb 01 2016 01:00:00.000

2

'Approval requested'

Mon Feb 01 2016 02:00:00.000

2

'Approval granted'

Mon Feb 01 2016 03:00:00.000

3

'PO created'

Mon Feb 01 2016 01:00:00.000

3

'Approval requested'

Mon Feb 01 2016 02:00:00.000

3

'Change of PO for approval requested'

Mon Feb 01 2016 03:00:00.000

3

'PO changed'

Mon Feb 01 2016 04:00:00.000

3

'Approval requested'

Mon Feb 01 2016 05:00:00.000

3

'Approval granted'

Mon Feb 01 2016 06:00:00.000

Result

Column1 : INT

Column2 : STRING

Column3 : INT

Column4 : FLOAT

1

'Approval requested - Change of PO for approval requested - PO changed'

2

2.0

3

'Approval requested - Change of PO for approval requested - PO changed'

1

2.0