Applies to:  CELONIS 4.7 

Description

This example shows how to calculate the throughput time of a machine with PQL.

In production use cases machine throughput time is often a number of interest. Calculating this number is not straight forward as the event log is usually built around the products which go through the assembly line and not the machine. Functions which are usually used for calculating the throughput time like SOURCE and TARGET or CALC_THROUGHPUT operate on case level. But a machine is normally involved in many cases. Therefore, LEAD and LAG can be used to calculate the duration of activities through a variety of cases. These functions can also be used to trace errors.

Duration of Stations

The duration of a station can be calculated by comparing the timestamp of an Activity to the timestamp of the next Activity, that can be identified using LEAD. The duration can also be calculated using LAG, but this would return the duration of the previous activity.


[1] With LEAD, we can compare the timestamps of Activities to the timestamps of their following Activities, that belong to the same Activity type. A function that calculates Time Differences like HOURS_BETWEEN can be used to then return the duration for each Activity.

Query
Filter
FILTER LEAD ( "Activities"."Timestamp" , PARTITION BY ( "Activities"."Activity" ) ) IS NOT NULL;
Column1
"Activities"."Activity"
Column2
HOURS_BETWEEN (
    "Activities"."Timestamp" ,
    LEAD ( "Activities"."Timestamp" , PARTITION BY ( "Activities"."Activity" ) )
)
Input
Activities
Case : INTActivity : STRINGTimestamp : DATE
1
'A'
Mon Feb 01 2016 01:00:00.000
1
'B'
Mon Feb 01 2016 02:00:00.000
2
'A'
Mon Feb 01 2016 02:00:00.000
2
'B'
Mon Feb 01 2016 03:00:00.000
3
'A'
Mon Feb 01 2016 04:00:00.000
3
'B'
Mon Feb 01 2016 06:00:00.000
Output
Result
Column1 : STRINGColumn2 : FLOAT
'A'
1.0
'B'
1.0
'A'
2.0
'B'
3.0



[2] In this example we can see, how the average duration of each activity type can be calculated.

Query
Filter
FILTER LEAD ( "Activities"."Timestamp" , PARTITION BY ( "Activities"."Activity" ) ) IS NOT NULL;
Column1
"Activities"."Activity"
Column2
AVG (
    HOURS_BETWEEN (
        "Activities"."Timestamp" ,
        LEAD ( "Activities"."Timestamp" , PARTITION BY ( "Activities"."Activity" ) )
    )
)
Input
Activities
Case : INTActivity : STRINGTimestamp : DATE
1
'A'
Mon Feb 01 2016 01:00:00.000
1
'B'
Mon Feb 01 2016 02:00:00.000
2
'A'
Mon Feb 01 2016 02:00:00.000
2
'B'
Mon Feb 01 2016 03:00:00.000
3
'A'
Mon Feb 01 2016 04:00:00.000
3
'B'
Mon Feb 01 2016 06:00:00.000
Output
Result
Column1 : STRINGColumn2 : FLOAT
'A'
1.5
'B'
2.0



[3] This is an example how to calculate the duration of one activity at its station with the usage of LAG. As B could identify a certain machine, this would show the duration of every previously performed work at this station.

Query
Filter
FILTER LAG ( "Activities"."Timestamp" , PARTITION BY ( "Activities"."Activity" ) ) IS NOT NULL;
Filter
FILTER "Activities"."Activity" = 'B';
Column1
"Activities"."Case"
Column2
HOURS_BETWEEN (
    LAG ( "Activities"."Timestamp" , PARTITION BY ( "Activities"."Activity" ) ) ,
    "Activities"."Timestamp"
)
Input
Activities
Case : INTActivity : STRINGTimestamp : DATE
1
'A'
Mon Feb 01 2016 01:00:00.000
1
'B'
Mon Feb 01 2016 02:00:00.000
2
'A'
Mon Feb 01 2016 02:00:00.000
2
'B'
Mon Feb 01 2016 03:00:00.000
3
'A'
Mon Feb 01 2016 04:00:00.000
3
'B'
Mon Feb 01 2016 06:00:00.000
Output
Result
Column1 : INTColumn2 : FLOAT
2
1.0
3
3.0


Error Cause

In case of errors occurring in processes, it is always interesting to identify the reason behind this error. With the usage of LEAD and LAG, it is easy to find the activities that happened before or after an error.


[4] This example shows how the cause for an Error could be identified using LAG.

Query
Filter
FILTER ( "Activities"."Activity" ) LIKE 'Error%';
Column1
"Activities"."Activity"
Column2
LAG ( "Activities"."Activity" )
Input
Activities
Case : INTActivity : STRINGTimestamp : DATE
1
'A'
Mon Feb 01 2016 01:00:00.000
1
'B'
Mon Feb 01 2016 02:00:00.000
1
'Error X'
Mon Feb 01 2016 04:00:00.000
2
'A'
Mon Feb 01 2016 02:00:00.000
2
'Error Y'
Mon Feb 01 2016 03:00:00.000
2
'B'
Mon Feb 01 2016 04:00:00.000
3
'A'
Mon Feb 01 2016 04:00:00.000
3
'Error Z'
Mon Feb 01 2016 05:00:00.000
3
'B'
Mon Feb 01 2016 07:00:00.000
Output
Result
Column1 : STRINGColumn2 : STRING
'Error X''B'
'Error Y''A'
'Error Z''A'


Error Duration

Furthermore the duration of errors is also interesting with regard to delays in production or the duration of production in general. LEAD and LAG can be used to calculate the duration of errors.


[5] This example calculates the duration of each station after an error occured with the usage of LAG.

Query
Column1
CASE
    WHEN
        LAG ( "Activities"."Activity" , PARTITION BY ( "Activities"."Case" ) ) LIKE 'Error%'
    THEN
        HOURS_BETWEEN (
            LAG ( "Activities"."Timestamp" , PARTITION BY ( "Activities"."Case" ) ) ,
            "Activities"."Timestamp"
        )
    ELSE
        null
END
Input
Activities
Case : INTActivity : STRINGTimestamp : DATE
1
'A'
Mon Feb 01 2016 01:00:00.000
1
'B'
Mon Feb 01 2016 02:00:00.000
1
'Error X'
Mon Feb 01 2016 04:00:00.000
2
'A'
Mon Feb 01 2016 02:00:00.000
2
'Error Y'
Mon Feb 01 2016 03:00:00.000
2
'B'
Mon Feb 01 2016 04:00:00.000
3
'A'
Mon Feb 01 2016 04:00:00.000
3
'Error Z'
Mon Feb 01 2016 05:00:00.000
3
'B'
Mon Feb 01 2016 07:00:00.000
Output
Result
Column1 : FLOAT
null
null
null
null
null
1.0
null
null
2.0



[6] In this example, LAG is used to calculate the average duration of an Error.

Query
Column1
AVG (
    CASE
        WHEN
            LAG ( "Activities"."Activity" , PARTITION BY ( "Activities"."Case" ) ) LIKE 'Error%'
        THEN
            HOURS_BETWEEN (
                LAG ( "Activities"."Timestamp" , PARTITION BY ( "Activities"."Case" ) ) ,
                "Activities"."Timestamp"
            )
        ELSE
            null
    END
)
Input
Activities
Case : INTActivity : STRINGTimestamp : DATE
1
'A'
Mon Feb 01 2016 01:00:00.000
1
'B'
Mon Feb 01 2016 02:00:00.000
1
'Error X'
Mon Feb 01 2016 04:00:00.000
2
'A'
Mon Feb 01 2016 02:00:00.000
2
'Error Y'
Mon Feb 01 2016 03:00:00.000
2
'B'
Mon Feb 01 2016 04:00:00.000
3
'A'
Mon Feb 01 2016 04:00:00.000
3
'Error Z'
Mon Feb 01 2016 05:00:00.000
3
'B'
Mon Feb 01 2016 07:00:00.000
Output
Result
Column1 : FLOAT
1.5


  • No labels