Skip to main content

PQL46 (PQL Function Library - CPM 4.6)

SOURCE - TARGET

Applies to: CELONIS 4.0 CELONIS 4.2 CELONIS 4.3 CELONIS 4.4 CELONIS 4.5 CELONIS 4.6

Description

SOURCE / TARGET operators create a temporary table with values from different activities (Rows), which can be used to do calculations based on two actions of a case.

In Celonis, a case is stored as a list of activities, which fits nicely in the relational storage model used by Celonis. Unfortunately, it makes it hard to do calculations based on two actions of a case, like calculating the time spent between two activities, because the math operators work only on single rows. But to calculate something based on two activities an operator needs input from two rows.

The problem is solved via a temporary table in which the values from two rows are moved in to one single row. The input from this temporary table can than be used by all operators. Creating a temporary table, with values from different activities (Rows), can be done by the Source and Target operators. The SOURCE / TARGET operators regard a case as edges and nodes. Nodes are the activities. Edges connect the activities in the order in which they happened. An edge points from a source activity to a target activity.

Syntax

Adds the values from a column, which correspond to a source activity, to a temporary table based on the optional filter column and the optional edge configuration:

SOURCE ( table.column [, table.filter_column ] [, edge_configuration ] )

Adds the values from a column, which correspond to a target activity, to a temporary table based on the optional filter column and the optional edge configuration:

TARGET ( table.column [, table.filter_column ] [, edge_configuration ] )
  • table.column: Data transformed to SOURCE / TARGET layout.

  • table.filter_column: Rows in which the value of the filter column are null, are ignored by the SOURCE / TARGET operators.

  • table.edge_configuration: Describes which edges between the activities of a case are considered. See below for detailed explanation.

The following example illustrates how SOURCE and TARGET can be used to compute the throughput time between an event and its direct successor. While SOURCE always refers to the actual event, TARGET refers to its following event inside the case. Such a connection is called an edge. Consequently, SOURCE and TARGET can be used to combine an event with its following event in the same row of a table. Both operators accept a column of the activity table as input and return the respective value of the referred event, as illustrated in the following example.Besides the computation of custom process KPIs, like the throughput time between certain activities, SOURCE and TARGET also enable more advanced use cases, like the Segregation of Duties.

    Examples

    [1] This example shows how SOURCE / TARGET operators interpret the case by default (no filter or edge configuration).

    27263143.png

    Query

    Column1

    SOURCE ( "ACTIVITIES"."ACTIVITY" )

    Column2

    SOURCE ( "ACTIVITIES"."TIMESTAMP" )

    Column3

    TARGET ( "ACTIVITIES"."ACTIVITY" )

    Column4

    TARGET ( "ACTIVITIES"."TIMESTAMP" )

    Input

    Output

    Activities

    CASE_ID : STRING

    ACTIVITY : STRING

    TIMESTAMP : DATE

    '1'

    'A'

    Tue Jan 01 2019 13:00:00.000

    '1'

    'B'

    Tue Jan 01 2019 13:15:00.000

    '1'

    'C'

    Tue Jan 01 2019 13:45:00.000

    Result

    Column1 : STRING

    Column2 : DATE

    Column3 : STRING

    Column4 : DATE

    'A'

    Tue Jan 01 2019 13:00:00.000

    'B'

    Tue Jan 01 2019 13:15:00.000

    'B'

    Tue Jan 01 2019 13:15:00.000

    'C'

    Tue Jan 01 2019 13:45:00.000

    [2] This can be combined with further operators. In the following example the time between two activities is calculated. REMAP_TIMESTAMPS is required to convert the timestamp to a number such that the subtraction can be done. Alternatively, MINUTES_BETWEEN can be used.

    Query

    Column1

    SOURCE ( "ACTIVITIES"."ACTIVITY" )

    Column2

    TARGET ( "ACTIVITIES"."ACTIVITY" )

    Column3

    TARGET ( REMAP_TIMESTAMPS ( "ACTIVITIES"."TIMESTAMP", MINUTES ) ) - SOURCE ( REMAP_TIMESTAMPS ( "ACTIVITIES"."TIMESTAMP", MINUTES ) )

    Column4

    MINUTES_BETWEEN ( SOURCE ( "ACTIVITIES"."TIMESTAMP" ), TARGET ( "ACTIVITIES"."TIMESTAMP") )

    Input

    Output

    ACTIVITIES

    CASE_ID : STRING

    ACTIVITY : STRING

    TIMESTAMP : DATE

    1

    'A'

    Tue Jan 01 2019 13:00:00.000

    1

    'B'

    Tue Jan 01 2019 13:15:00.000

    1

    'C'

    Tue Jan 01 2019 13:45:00.000

    Result

    Column1 : STRING

    Column2 : STRING

    Column3 : INT

    Column4 : FLOAT

    'A'

    'B'

    15

    15.0

    'B'

    'C'

    30

    30.0

    [3] In this example, the case frequency of each edge is calculated. The case frequency is the number of cases in which the edge occurs.

    Query

    Column1

    SOURCE ( "ACTIVITIES"."ACTIVITY" )

    Column2

    TARGET ( "ACTIVITIES"."ACTIVITY" )

    Column3

    COUNT ( DISTINCT TARGET ( "ACTIVITIES"."CASE_ID" ) )

    Input

    Output

    ACTIVITIES

    CASE_ID : STRING

    ACTIVITY : STRING

    TIMESTAMP : DATE

    '1'

    'A'

    Tue Jan 01 2019 13:00:00.000

    '1'

    'B'

    Tue Jan 01 2019 13:01:00.000

    '1'

    'C'

    Tue Jan 01 2019 13:02:00.000

    '1'

    'A'

    Tue Jan 01 2019 13:03:00.000

    '1'

    'B'

    Tue Jan 01 2019 13:04:00.000

    '1'

    'C'

    Tue Jan 01 2019 13:05:00.000

    '2'

    'A'

    Tue Jan 01 2019 13:06:00.000

    '2'

    'B'

    Tue Jan 01 2019 13:07:00.000

    '2'

    'D'

    Tue Jan 01 2019 13:08:00.000

    '2'

    'B'

    Tue Jan 01 2019 13:09:00.000

    '2'

    'D'

    Tue Jan 01 2019 13:10:00.000

    Result

    Column1 : STRING

    Column2 : STRING

    Column3 : INT

    'C'

    'A'

    1

    'A'

    'B'

    2

    'D'

    'B'

    1

    'B'

    'C'

    1

    'B'

    'D'

    1

    [4] In this example, the activity frequency of each edge is calculated. The activity frequency is the number of times the edge occurs.

    Query

    Column1

    SOURCE ( "ACTIVITIES"."ACTIVITY" )

    Column2

    TARGET ( "ACTIVITIES"."ACTIVITY" )

    Column3

    COUNT ( TARGET ( "ACTIVITIES"."ACTIVITY" ) )

    Input

    Output

    ACTIVITIES

    CASE_ID : STRING

    ACTIVITY : STRING

    TIMESTAMP : DATE

    '1'

    'A'

    Tue Jan 01 2019 13:00:00.000

    '1'

    'B'

    Tue Jan 01 2019 13:01:00.000

    '1'

    'C'

    Tue Jan 01 2019 13:02:00.000

    '1'

    'A'

    Tue Jan 01 2019 13:03:00.000

    '1'

    'B'

    Tue Jan 01 2019 13:04:00.000

    '1'

    'C'

    Tue Jan 01 2019 13:05:00.000

    '2'

    'A'

    Tue Jan 01 2019 13:06:00.000

    '2'

    'B'

    Tue Jan 01 2019 13:07:00.000

    '2'

    'D'

    Tue Jan 01 2019 13:08:00.000

    '2'

    'B'

    Tue Jan 01 2019 13:09:00.000

    '2'

    'D'

    Tue Jan 01 2019 13:10:00.000

    Result

    Column1 : STRING

    Column2 : STRING

    Column3 : INT

    'C'

    'A'

    1

    'A'

    'B'

    3

    'D'

    'B'

    1

    'B'

    'C'

    2

    'B'

    'D'

    2

    [5] In the following example the filter column is used to filter out all activities B. In this example that leads to a direct edge between activity A and C.

    Query

    Column1

    SOURCE ( "ACTIVITIES"."ACTIVITY", REMAP_VALUES ( "ACTIVITIES"."ACTIVITY", [ 'B', NULL ] ) )

    Column2

    TARGET ( "ACTIVITIES"."ACTIVITY", REMAP_VALUES ( "ACTIVITIES"."ACTIVITY", [ 'B', NULL ] ) )

    Column3

    TARGET ( REMAP_TIMESTAMPS ( "ACTIVITIES"."TIMESTAMP", MINUTES ), REMAP_VALUES ( "ACTIVITIES"."ACTIVITY", [ 'B', NULL ] ) ) - SOURCE ( REMAP_TIMESTAMPS ( "ACTIVITIES"."TIMESTAMP", MINUTES ), REMAP_VALUES ( "ACTIVITIES"."ACTIVITY", [ 'B', NULL ] ) )

    Input

    Output

    ACTIVITIES

    CASE_ID : STRING

    ACTIVITY : STRING

    TIMESTAMP : DATE

    '1'

    'A'

    Tue Jan 01 2019 13:00:00.000

    '1'

    'B'

    Tue Jan 01 2019 13:15:00.000

    '1'

    'C'

    Tue Jan 01 2019 13:45:00.000

    Result

    Column1 : STRING

    Column2 : STRING

    Column3 : INT

    'A'

    'C'

    45

    Edge Configuration

    With this parameter the edge configuration can be defined. The edge configuration describes which edges between the activities of a case are considered.

    • edge_configuration: The following combinations are supported:

      SourceTarget_1.png
      • ANY_OCCURRENCE[] TO ANY_OCCURRENCE[]: Default configuration, edges are drawn between consecutive activities within cases.

      • FIRST_OCCURRENCE[] TO ANY_OCCURRENCE[]: From the first activity of each case an edge is drawn to all other activities in this case.

        27263144.png
      • FIRST_OCCURRENCE[] TO ANY_OCCURRENCE_WITH_SELF[]: From the first activity of each case an edge is drawn to all activities in this case, including the first one.

        27263145.png
      • ANY_OCCURRENCE[] TO LAST_OCCURRENCE[]: All activities, besides the last one of the case, draw an edge to the last activity of the case.

        27263146.png
      • FIRST_OCCURRENCE[] TO LAST_OCCURRENCE[]: From the first activity of a case an edge is drawn to the last activity in this case.

        27263147.png
    Example

    [6] Calculate the number of minutes since case start.

    Query

    Column1

    SOURCE ( "ACTIVITIES"."ACTIVITY" )

    Column2

    TARGET ( "ACTIVITIES"."ACTIVITY" )

    Column3

    TARGET ( REMAP_TIMESTAMPS ( "ACTIVITIES"."TIMESTAMP", MINUTES ), FIRST_OCCURRENCE[] TO ANY_OCCURRENCE[] ) - SOURCE ( REMAP_TIMESTAMPS ( "ACTIVITIES"."TIMESTAMP", MINUTES ), FIRST_OCCURRENCE[] TO ANY_OCCURRENCE[] )

    Input

    Output

    ACTIVITIES

    CASE_ID : STRING

    ACTIVITY : STRING

    TIMESTAMP : DATE

    '1'

    'A'

    Tue Jan 01 2019 13:00:00.000

    '1'

    'B'

    Tue Jan 01 2019 13:15:00.000

    '1'

    'C'

    Tue Jan 01 2019 13:45:00.000

    Result

    Column1 : STRING

    Column2 : STRING

    Column3 : INT

    'A'

    'B'

    15

    'A'

    'C'

    45

    Configuration propagation

    SOURCE / TARGET can be configured in many ways. Except in very rare corner cases, it is desired to apply the same configuration to all SOURCE / TARGET operators in a table or a filter statement. Therefore a configuration has to be specified only once and is automatically propagated to all other occurences of SOURCE / TARGET in the same table or filter statement. A configuration is only propageted to SOURCE / TARGET operators which don't have a own configuration and it is also never done across a filter and a table statement, even if they are part of the same component.

    Join behavior

    SOURCE / TARGET generate temporary tables. For each used unique edge and filter column configuration a separate temporary table is created. SOURCE / TARGET operators with identical edge and filter column add columns to the same temporary table. The temporary tables are joined to the case table.

    Example

    In this example several calls to the SOURCE / TARGET operator are done. After each call the tables status is shown.

    • Temporary edge table is created. SOURCE column is added to it:

      SOURCE ( Activity, ANY_OCCURRENCE[] TO ANY_OCCURRENCE[] )
      27263148.png
    • SOURCE column is added to existing temporary table:

      TARGET ( Activity, ANY_OCCURRENCE[] TO ANY_OCCURRENCE[] )
      27263149.png
    • This column is added to a new temporary table, because the configuration is different to the previous ones:

      SOURCE ( Activity, Filter, ANY_OCCURRENCE[] TO ANY_OCCURRENCE[] )
      27263150.png

    Multiple temporary tables can lead to problems if used within the same query. This happens if SOURCE / TARGET operators with different edge and filter configurations are used within one query. The generated temporary tables can by default not be joined together and an error is returned. If one really needs to have two different configurations, the values in the temporary tables have be pulled up with PU functions to the case table.

    Examples

    [7] In this example, the filter set inside the SOURCE operator is also applied to the TARGET, generating one single temporary edge table.

    Query

    Column1

    SOURCE ( "ACTIVITIES"."ACTIVITY", REMAP_VALUES ( "ACTIVITIES"."ACTIVITY", [ 'B', NULL ], [ 'C', NULL ] ) )

    Column2

    TARGET ( "ACTIVITIES"."ACTIVITY" )

    Input

    Output

    ACTIVITIES

    CASE_ID : STRING

    ACTIVITY : STRING

    TIMESTAMP : DATE

    '1'

    'A'

    Tue Jan 01 2019 13:00:00.000

    '1'

    'B'

    Tue Jan 01 2019 13:01:00.000

    '1'

    'C'

    Tue Jan 01 2019 13:02:00.000

    '1'

    'D'

    Tue Jan 01 2019 13:03:00.000

    Result

    Column1 : STRING

    Column2 : STRING

    'A'

    'D'

    [8] Two different configurations are passed to SOURCE / TARGET operators inside the same query. Two different temporary tables are generated, which are pulled up to the case table using PU_FIRST. The first result column contains 'B', which is the first source activity name not filtered out by the SOURCE configuration. The second result column contains 'D', which is the target activity of the first result edge ('A' → 'D') resulting from the TARGET configuration.

    Query

    Column1

    PU_FIRST ( "ACTIVITIES_CASES", SOURCE ( "ACTIVITIES"."ACTIVITY", REMAP_VALUES ( "ACTIVITIES"."ACTIVITY", [ 'A', NULL ], [ 'D', NULL ] ) ) )

    Column2

    PU_FIRST ( "ACTIVITIES_CASES", TARGET ( "ACTIVITIES"."ACTIVITY", ANY_OCCURRENCE[] TO LAST_OCCURRENCE[] ) )

    Input

    Output

    ACTIVITIES

    CASE_ID : STRING

    ACTIVITY : STRING

    TIMESTAMP : DATE

    '1'

    'A'

    Tue Jan 01 2019 13:00:00.000

    '1'

    'B'

    Tue Jan 01 2019 13:01:00.000

    '1'

    'C'

    Tue Jan 01 2019 13:02:00.000

    '1'

    'D'

    Tue Jan 01 2019 13:03:00.000

    ACTIVITIES_CASES

    CASE_ID : STRING

    '1'

    Foreign Keys

    ACTIVITIES.CASE_ID

    ACTIVITIES_CASES.CASE_ID

    Result

    Column1 : STRING

    Column2 : STRING

    'B'

    'D'

    Filter Behavior

    We have seen that the temporary edge tables are joined to the case table. This means that if a case is filtered out in the case table, all activities related to that case in the temporary tables are also filtered out. This holds also vice versa, so if all activities related to a case are filtered out, the case itself is also filtered out in the case table. If not all activities of a case are filtered out, the case stays in the case table.

    SourceTarget_9.png