Applies to:  CELONIS 4.6 CELONIS 4.7 

Description

These examples show various ways of throughput time calculations in PQL.

Throughput time calculations are vital in powerful process mining applications. In the following, different kinds of throughput times and possible calculations using PQL are presented:

  • Throughput Times per Case: Learn how to calculate throughput times inside a case, e.g. the time from the start of each case to the end of the case or between specified activities within a case.
  • Throughput Times between Activities: Learn how to calculate the throughput times between consecutive activities inside a case.
  • Throughput Times over multiple Cases: Learn how to calculate throughput times over multiple cases, e.g. the throughput times of orders consisting of multiple cases.

Throughput Times per Case

Throughput time calculations on case level can be solved using CALC_THROUGHPUT. It is usually used together with REMAP_TIMESTAMPS, which adds many options for calendar specifications.


[1] Here, the case throughput time from the first activity to the last activity of each case is calculated in days. As DAYS is specified as the time unit, the minutes of the input timestamps are ignored:

Query
Column1
"Cases"."Case"
Column2
CALC_THROUGHPUT ( CASE_START TO CASE_END , REMAP_TIMESTAMPS ( "Activities"."Timestamp" , DAYS ) )
Input
Activities
Case : INTActivity : STRINGTimestamp : DATE
1
'A'
Wed Jan 01 2020 00:00:00.000
1
'C'
Mon Jan 06 2020 00:00:00.000
1
'C'
Tue Jan 07 2020 12:00:00.000
2
'A'
Fri Jan 03 2020 00:00:00.000
2
'A'
Thu Jan 09 2020 12:00:00.000
2
'C'
Sat Jan 11 2020 00:00:00.000
Cases
Case : INT
1
2

Foreign Keys
Activities.CaseCases.Case
Output
Result
Column1 : INTColumn2 : INT
1
6
2
8



[2] In this example, the throughput time between the last A-activity and the first C-activity of each case is calculated:

Query
Column1
"Cases"."Case"
Column2
CALC_THROUGHPUT (
    LAST_OCCURRENCE [ 'A' ]
    TO
    FIRST_OCCURRENCE [ 'C' ] ,
    REMAP_TIMESTAMPS ( "Activities"."Timestamp" , DAYS )
)
Input
Activities
Case : INTActivity : STRINGTimestamp : DATE
1
'A'
Wed Jan 01 2020 00:00:00.000
1
'C'
Mon Jan 06 2020 00:00:00.000
1
'C'
Tue Jan 07 2020 12:00:00.000
2
'A'
Fri Jan 03 2020 00:00:00.000
2
'A'
Thu Jan 09 2020 12:00:00.000
2
'C'
Sat Jan 11 2020 00:00:00.000
Cases
Case : INT
1
2

Foreign Keys
Activities.CaseCases.Case
Output
Result
Column1 : INTColumn2 : INT
1
5
2
2



[3] If the case throughput time should be calculated in days, but the hours should not be ignored, we can calculate the throughput times in hours first and then divide the result by 24 (hours per day):

Query
Column1
"Cases"."Case"
Column2
CALC_THROUGHPUT ( CASE_START TO CASE_END , REMAP_TIMESTAMPS ( "Activities"."Timestamp" , HOURS ) )
/
24
Input
Activities
Case : INTActivity : STRINGTimestamp : DATE
1
'A'
Wed Jan 01 2020 00:00:00.000
1
'C'
Mon Jan 06 2020 00:00:00.000
1
'C'
Tue Jan 07 2020 12:00:00.000
2
'A'
Fri Jan 03 2020 00:00:00.000
2
'A'
Thu Jan 09 2020 12:00:00.000
2
'C'
Sat Jan 11 2020 00:00:00.000
Cases
Case : INT
1
2

Foreign Keys
Activities.CaseCases.Case
Output
Result
Column1 : INTColumn2 : FLOAT
1
6.5
2
8.0



[4] In this example, only weekdays should be counted towards the case throughput time, ignoring Saturdays and Sundays. This can be done using the WEEKDAY_CALENDAR in REMAP_TIMESTAMPS:

Query
Column1
"Cases"."Case"
Column2
CALC_THROUGHPUT (
    CASE_START
    TO
    CASE_END ,
    REMAP_TIMESTAMPS (
        "Activities"."Timestamp" ,
        DAYS ,
        WEEKDAY_CALENDAR ( MONDAY TUESDAY WEDNESDAY THURSDAY FRIDAY )
    )
)
Input
Activities
Case : INTActivity : STRINGTimestamp : DATE
1
'A'
Wed Jan 01 2020 00:00:00.000
1
'C'
Mon Jan 06 2020 00:00:00.000
1
'C'
Tue Jan 07 2020 12:00:00.000
2
'A'
Fri Jan 03 2020 00:00:00.000
2
'A'
Thu Jan 09 2020 12:00:00.000
2
'C'
Sat Jan 11 2020 00:00:00.000
Cases
Case : INT
1
2

Foreign Keys
Activities.CaseCases.Case
Output
Result
Column1 : INTColumn2 : INT
1
4
2
6



[5] The activity table of this example contains several Error-activities. This example shows how to calculate the throughput time between the case start and the first Error-activity of each case. This can be done by mapping all Error-activities (those are all activities starting with Error) to a common string (ErrorActivity in this example) using a CASE WHEN in the last argument of the CALC_THROUGHPUT function. As the result of this CASE WHEN is now taken as the activity column to be used in CALC_THROUGHPUT, the ErrorActivity can now be used inside the FIRST_OCCURRENCE range specifier:

Query
Column1
"Cases"."Case"
Column2
CALC_THROUGHPUT (
    CASE_START
    TO
    FIRST_OCCURRENCE [ 'ErrorActivity' ] ,
    REMAP_TIMESTAMPS ( "Activities"."Timestamp" , DAYS ) ,
    CASE
        WHEN "Activities"."Activity" LIKE 'Error%' THEN 'ErrorActivity'
        ELSE
            "Activities"."Activity"
    END
)
Input
Activities
Case : INTActivity : STRINGTimestamp : DATE
1
'A'
Wed Jan 01 2020 00:00:00.000
1
'C'
Mon Jan 06 2020 00:00:00.000
1
'Error X'
Tue Jan 07 2020 12:00:00.000
2
'A'
Fri Jan 03 2020 00:00:00.000
2
'Error Y'
Thu Jan 09 2020 12:00:00.000
2
'B'
Sat Jan 11 2020 00:00:00.000
Cases
Case : INT
1
2

Foreign Keys
Activities.CaseCases.Case
Output
Result
Column1 : INTColumn2 : INT
1
6
2
6


Throughput Times between Activities

Throughput times between consecutive activities can be calculated using SOURCE and TARGET, usually combined with a DateTime Difference function or REMAP_TIMESTAMPS. The Process Explorer and the Variant Explorer use this approach to calculate the throughput time for each edge in the graph.


[6] This simple example shows how to calculate the number of days between consecutive activities of each case using SOURCE/TARGET and DAYS_BETWEEN:

Query
Column1
"Cases"."Case"
Column2
SOURCE ( "Activities"."Activity" )
Column3
TARGET ( "Activities"."Activity" )
Column4
DAYS_BETWEEN ( SOURCE ( "Activities"."Timestamp" ) , TARGET ( "Activities"."Timestamp" ) )
Input
Activities
Case : INTActivity : STRINGTimestamp : DATE
1
'A'
Thu Jan 02 2020 00:00:00.000
1
'B'
Sat Jan 04 2020 00:00:00.000
1
'C'
Mon Jan 06 2020 00:00:00.000
2
'A'
Thu Jan 02 2020 00:00:00.000
2
'B'
Mon Jan 20 2020 00:00:00.000
2
'D'
Sat Jan 25 2020 00:00:00.000
2
'C'
Tue Jan 28 2020 00:00:00.000
Cases
Case : INT
1
2

Foreign Keys
Activities.CaseCases.Case
Output
Result
Column1 : INTColumn2 : STRINGColumn3 : STRINGColumn4 : FLOAT
1
'A''B'
2.0
1
'B''C'
2.0
2
'A''B'
18.0
2
'B''D'
5.0
2
'D''C'
3.0



[7] Here, we use REMAP_TIMESTAMPS on the SOURCE/TARGET timestamps and subtract those values to get the difference between consecutive timestamps in days. By using the WEEKDAY_CALENDAR option, we only take weekdays into account, meaning that Saturday and Sunday do not count towards the time difference:

Query
Column1
"Cases"."Case"
Column2
SOURCE ( "Activities"."Activity" )
Column3
TARGET ( "Activities"."Activity" )
Column4
REMAP_TIMESTAMPS (
    TARGET ( "Activities"."Timestamp" ) ,
    DAYS ,
    WEEKDAY_CALENDAR ( MONDAY TUESDAY WEDNESDAY THURSDAY FRIDAY )
)
-
REMAP_TIMESTAMPS (
    SOURCE ( "Activities"."Timestamp" ) ,
    DAYS ,
    WEEKDAY_CALENDAR ( MONDAY TUESDAY WEDNESDAY THURSDAY FRIDAY )
)
Input
Activities
Case : INTActivity : STRINGTimestamp : DATE
1
'A'
Thu Jan 02 2020 00:00:00.000
1
'B'
Sat Jan 04 2020 00:00:00.000
1
'C'
Mon Jan 06 2020 00:00:00.000
2
'A'
Thu Jan 02 2020 00:00:00.000
2
'B'
Mon Jan 20 2020 00:00:00.000
2
'D'
Sat Jan 25 2020 00:00:00.000
2
'C'
Tue Jan 28 2020 00:00:00.000
Cases
Case : INT
1
2

Foreign Keys
Activities.CaseCases.Case
Output
Result
Column1 : INTColumn2 : STRINGColumn3 : STRINGColumn4 : INT
1
'A''B'
2
1
'B''C'
0
2
'A''B'
12
2
'B''D'
5
2
'D''C'
1


Connection to the Process Explorer

The SOURCE and TARGET functions are used in the Process Explorer and Variant Explorer to calculate the edge KPIs, as shown in the following examples:


[8] This example shows the query that is used to calculate the average throughput time between activities in days. Therefore, the difference of the two timestamps is first calculated in SECONDS and then converted to DAYS by dividing the difference by 60*60*24. Finally, the result is rounded. The corresponding Process Explorer would look like this:


Query
Column1
SOURCE ( "Activities"."Activity" )
Column2
TARGET ( "Activities"."Activity" )
Column3
ROUND (
    AVG (
        (
            REMAP_TIMESTAMPS ( TARGET ( "Activities"."Timestamp" ) , SECONDS )
            -
            REMAP_TIMESTAMPS ( SOURCE ( "Activities"."Timestamp" ) , SECONDS )
        )
        /
        ( 60 * 60 * 24 )
    )
)
Input
Activities
Case : INTActivity : STRINGTimestamp : DATE
1
'A'
Thu Jan 02 2020 00:00:00.000
1
'B'
Sat Jan 04 2020 00:00:00.000
1
'C'
Mon Jan 06 2020 00:00:00.000
2
'A'
Thu Jan 02 2020 00:00:00.000
2
'B'
Mon Jan 20 2020 00:00:00.000
2
'D'
Sat Jan 25 2020 00:00:00.000
2
'C'
Tue Jan 28 2020 00:00:00.000
Cases
Case : INT
1
2

Foreign Keys
Activities.CaseCases.Case
Output
Result
Column1 : STRINGColumn2 : STRINGColumn3 : INT
'A''B'
10
'B''C'
2
'D''C'
3
'B''D'
5



[9] The query below calculates the average number of days between consecutive activities, with activity 'B' being ignored. This is the query that is used when activities are ignored using the 'Eye' icon in the Process Explorer:

The corresponding Process Explorer looks like this:


Query
Column1
SOURCE ( "Activities"."Activity" , REMAP_VALUES ( "Activities"."Activity" , [ 'B' , null ] ) )
Column2
TARGET ( "Activities"."Activity" )
Column3
ROUND (
    AVG (
        (
            REMAP_TIMESTAMPS ( TARGET ( "Activities"."Timestamp" ) , SECONDS )
            -
            REMAP_TIMESTAMPS ( SOURCE ( "Activities"."Timestamp" ) , SECONDS )
        )
        /
        ( 60 * 60 * 24 )
    )
)
Input
Activities
Case : INTActivity : STRINGTimestamp : DATE
1
'A'
Thu Jan 02 2020 00:00:00.000
1
'B'
Sat Jan 04 2020 00:00:00.000
1
'C'
Mon Jan 06 2020 00:00:00.000
2
'A'
Thu Jan 02 2020 00:00:00.000
2
'B'
Mon Jan 20 2020 00:00:00.000
2
'D'
Sat Jan 25 2020 00:00:00.000
2
'C'
Tue Jan 28 2020 00:00:00.000
Cases
Case : INT
1
2

Foreign Keys
Activities.CaseCases.Case
Output
Result
Column1 : STRINGColumn2 : STRINGColumn3 : INT
'A''C'
4
'D''C'
3
'A''D'
23


Throughput Times over multiple Cases

Throughput times over multiple cases can be calculated using PU_MIN and PU_MAX, combined with a DateTime Difference function or REMAP_TIMESTAMPS. This can be used to calculate the time range covered by all cases that are grouped together based on a certain property.


[10] This example shows how to calculate the throughput time of each order. An order consists of one or more order items, which are the cases in the example. The throughput time of an order is the time difference between the first activity among all cases related to that order and the last activity among all cases of that order. We can use PU_MIN and PU_MAX to get the smallest (earliest) and largest (latest) timestamps related to each order and compare those using DAYS_BETWEEN:

Query
Column1
"Orders"."ID"
Column2
DAYS_BETWEEN (
    PU_MIN ( "Orders" , "Activities"."Timestamp" ) , PU_MAX ( "Orders" , "Activities"."Timestamp" )
)
Input
Activities
Case : INTActivity : STRINGTimestamp : DATE
1
'A'
Wed Jan 01 2020 00:00:00.000
1
'B'
Mon Jan 06 2020 00:00:00.000
1
'C'
Tue Jan 07 2020 00:00:00.000
2
'A'
Fri Jan 03 2020 00:00:00.000
2
'C'
Thu Jan 09 2020 00:00:00.000
3
'A'
Fri Jan 03 2020 00:00:00.000
3
'B'
Sun Jan 05 2020 12:00:00.000
Cases
Case : INTOrder : INT
1
100
2
100
3
200
Orders
ID : INT
100
200

Foreign Keys
Activities.CaseCases.Case
Cases.OrderOrders.ID
Output
Result
Column1 : INTColumn2 : FLOAT
100
8.0
200
2.5



[11] This example shows how to calculate the average throughput time of an order, using an AVG around the query that calculates the throughput time of each order. Instead of using DAYS_BETWEEN like in the previous example, we can also use REMAP_TIMESTAMPS again to get the number of days between two timestamps. As explained above, REMAP_TIMESTAMPS only counts full days with the DAYS specification. Using REMAP_TIMESTAMPS allows you to take calendars into account, as already shown above:

Query
Column1
AVG (
    REMAP_TIMESTAMPS ( PU_MAX ( "Orders" , "Activities"."Timestamp" ) , DAYS )
    -
    REMAP_TIMESTAMPS ( PU_MIN ( "Orders" , "Activities"."Timestamp" ) , DAYS )
)
Input
Activities
Case : INTActivity : STRINGTimestamp : DATE
1
'A'
Wed Jan 01 2020 00:00:00.000
1
'B'
Mon Jan 06 2020 00:00:00.000
1
'C'
Tue Jan 07 2020 00:00:00.000
2
'A'
Fri Jan 03 2020 00:00:00.000
2
'C'
Thu Jan 09 2020 00:00:00.000
3
'A'
Fri Jan 03 2020 00:00:00.000
3
'B'
Sun Jan 05 2020 12:00:00.000
Cases
Case : INTOrder : INT
1
100
2
100
3
200
Orders
ID : INT
100
200

Foreign Keys
Activities.CaseCases.Case
Cases.OrderOrders.ID
Output
Result
Column1 : FLOAT
5.0


  • No labels