Skip to main content

PQL47 (PQL Function Library - CPM 4.7)

Machine utilization in Production

Applies to: CELONIS 4.7

Description

This example shows how to calculate the throughput time of a machine with PQL.

In production use cases machine throughput time is often a number of interest. Calculating this number is not straight forward as the event log is usually built around the products which go through the assembly line and not the machine. Functions which are usually used for calculating the throughput time like SOURCE and TARGET or CALC_THROUGHPUT operate on case level. But a machine is normally involved in many cases. Therefore, LEAD and LAG can be used to calculate the duration of activities through a variety of cases. These functions can also be used to trace errors.

Duration of stations

The duration of a station can be calculated by comparing the timestamp of an Activity to the timestamp of the next Activity, that can be identified using LEAD. The duration can also be calculated using LAG, but this would return the duration of the previous activity.

[1] With LEAD, we can compare the timestamps of Activities to the timestamps of their following Activities, that belong to the same activity type. A function that calculates time differences like HOURS_BETWEEN can be used to then return the duration for each activity.

Query

Filter

FILTER LEAD ( "Activities"."Timestamp" , PARTITION BY ( "Activities"."Activity" ) ) IS NOT NULL;

Column1

"Activities"."Activity"

Column2

HOURS_BETWEEN (
    "Activities"."Timestamp" ,
    LEAD ( "Activities"."Timestamp" , PARTITION BY ( "Activities"."Activity" ) )
)

Input

Output

Activities

Case : INT

Activity : STRING

Timestamp : DATE

1

'A'

Mon Feb 01 2016 01:00:00.000

1

'B'

Mon Feb 01 2016 02:00:00.000

2

'A'

Mon Feb 01 2016 02:00:00.000

2

'B'

Mon Feb 01 2016 03:00:00.000

3

'A'

Mon Feb 01 2016 04:00:00.000

3

'B'

Mon Feb 01 2016 06:00:00.000

Result

Column1 : STRING

Column2 : FLOAT

'A'

1.0

'B'

1.0

'A'

2.0

'B'

3.0

[2] In this example we can see, how the average duration of each activity type can be calculated.

Query

Filter

FILTER LEAD ( "Activities"."Timestamp" , PARTITION BY ( "Activities"."Activity" ) ) IS NOT NULL;

Column1

"Activities"."Activity"

Column2

AVG (
    HOURS_BETWEEN (
        "Activities"."Timestamp" ,
        LEAD ( "Activities"."Timestamp" , PARTITION BY ( "Activities"."Activity" ) )
    )
)

Input

Output

Activities

Case : INT

Activity : STRING

Timestamp : DATE

1

'A'

Mon Feb 01 2016 01:00:00.000

1

'B'

Mon Feb 01 2016 02:00:00.000

2

'A'

Mon Feb 01 2016 02:00:00.000

2

'B'

Mon Feb 01 2016 03:00:00.000

3

'A'

Mon Feb 01 2016 04:00:00.000

4

'B'

Mon Feb 01 2016 06:00:00.000

Result

Column1 : STRING

Column2 : FLOAT

'A'

1.5

'B'

2.0

[3] This is an example how to calculate the duration of one activity at its station with the usage of LAG. As B could identify a certain machine, this would show the duration of every previously performed work at this station.

Query

Filter

FILTER LAG ( "Activities"."Timestamp" , PARTITION BY ( "Activities"."Activity" ) ) IS NOT NULL;

Filter

FILTER "Activities"."Activity" = 'B';

Column1

"Activities"."Case"

Column2

HOURS_BETWEEN (
    LAG ( "Activities"."Timestamp" , PARTITION BY ( "Activities"."Activity" ) ) ,
    "Activities"."Timestamp"
)

Input

Output

Case : INT

Activity : STRING

Timestamp : DATE

1

'A'

Mon Feb 01 2016 01:00:00.000

1

'B'

Mon Feb 01 2016 02:00:00.000

2

'A'

Mon Feb 01 2016 02:00:00.000

2

'B'

Mon Feb 01 2016 03:00:00.000

3

'A'

Mon Feb 01 2016 03:00:00.000

3

'B'

Mon Feb 01 2016 06:00:00.000

Result

Column1 : INT

Column2 : FLOAT

2

1.0

3

3.0

Error cause

In case of errors occurring in processes, it is always interesting to identify the reason behind this error. With the usage of LEAD and LAG, it is easy to find the activities that happened before or after an error.

[4] This example shows how the cause for an error could be identified using LAG.

Query

Filter

FILTER ( "Activities"."Activity" ) LIKE 'Error%';

Column1

"Activities"."Activity"

Column2

LAG ( "Activities"."Activity" )

Input

Output

Activities

Case : INT

Activity : STRING

Timestamp : DATE

1

'A'

Mon Feb 01 2016 01:00:00.000

1

'B'

Mon Feb 01 2016 02:00:00.000

1

'Error X'

Mon Feb 01 2016 04:00:00.000

2

'A'

Mon Feb 01 2016 02:00:00.000

2

'Error Y'

Mon Feb 01 2016 03:00:00.000

2

'B'

Mon Feb 01 2016 04:00:00.000

3

'A'

Mon Feb 01 2016 04:00:00.000

3

'Error Z'

Mon Feb 01 2016 05:00:00.000

3

'B'

Mon Feb 01 2016 07:00:00.000

Result

Column1 : STRING

Column2 : STRING

'Error X'

'B'

'Error Y'

'A'

'Error Z'

'A'

Error duration

Furthermore the duration of errors is also interesting with regard to delays in production or the duration of production in general. LEAD and LAG can be used to calculate the duration of errors.

[5] This example calculates the duration of each station after an error occured with the usage of LAG.

Query

Column

CASE
    WHEN
        LAG ( "Activities"."Activity" , PARTITION BY ( "Activities"."Case" ) ) LIKE 'Error%'
    THEN
        HOURS_BETWEEN (
            LAG ( "Activities"."Timestamp" , PARTITION BY ( "Activities"."Case" ) ) ,
            "Activities"."Timestamp"
        )
    ELSE
        null
END

Input

Output

Activities

Case : INT

Activity : STRING

Timestamp : DATE

1

'A'

Mon Feb 01 2016 01:00:00.000

1

'B'

Mon Feb 01 2016 02:00:00.000

1

'Error X'

Mon Feb 01 2016 04:00:00.000

2

'A'

Mon Feb 01 2016 02:00:00.000

2

'Error Y'

Mon Feb 01 2016 03:00:00.000

2

'B'

Mon Feb 01 2016 04:00:00.000

3

'A'

Mon Feb 01 2016 04:00:00.000

3

'Error Z'

Mon Feb 01 2016 05:00:00.000

3

'B'

Mon Feb 01 2016 07:00:00.000

Result

Column1 : FLOAT

null

null

null

null

null

10

null

null

2.0

[6] In this example, LAG is used to calculate the average duration of an error.

Query

Column1

AVG (
    CASE
        WHEN
            LAG ( "Activities"."Activity" , PARTITION BY ( "Activities"."Case" ) ) LIKE 'Error%'
        THEN
            HOURS_BETWEEN (
                LAG ( "Activities"."Timestamp" , PARTITION BY ( "Activities"."Case" ) ) ,
                "Activities"."Timestamp"
            )
        ELSE
            null
    END
)

Input

Output

Activities

Case : INT

Activity : STRING

Timestamp : DATE

1

'A'

Mon Feb 01 2016 01:00:00.000

1

'B'

Mon Feb 01 2016 02:00:00.000

1

'Error X'

Mon Feb 01 2016 04:00:00.000

2

'A'

Mon Feb 01 2016 02:00:00.000

2

'Error Y'

Mon Feb 01 2016 03:00:00.000

2

'B'

Mon Feb 01 2016 04:00:00.000

3

'A'

Mon Feb 01 2016 04:00:00.000

3

'Error Z'

Mon Feb 01 2016 05:00:00.000

3

'B'

Mon Feb 01 2016 07:00:00.000

Result

Column1 : FLOAT

1.5