Applies to:  CELONIS 4.0 CELONIS 4.2 CELONIS 4.3 CELONIS 4.4 CELONIS 4.5 

Description

SOURCE / TARGET operators create a temporary table with values from different activities (Rows), which can be used to do calculations based on two actions of a case.

In Celonis, a case is stored as a list of activities, which fits nicely in the relational storage model used by Celonis. Unfortunately, it makes it hard to do calculations based on two actions of a case, like calculating the time spent between two activities, because the math operators work only on single rows. But to calculate something based on two activities an operator needs input from two rows.

The problem is solved via a temporary table in which the values from two rows are moved in to one single row. The input from this temporary table can than be used by all operators. Creating a temporary table, with values from different activities (Rows), can be done by the Source and Target operators. The SOURCE / TARGET operators regard a case as edges and nodes. Nodes are the activities. Edges connect the activities in the order in which they happened. An edge points from a source activity to a target activity.

Syntax

Adds the values from a column, which correspond to a source activity, to a temporary table based on the optional filter column and the optional edge configuration:

SOURCE ( table.column [, table.filter_column ] [, edge_configuration ] )

Adds the values from a column, which correspond to a target activity, to a temporary table based on the optional filter column and the optional edge configuration:

TARGET ( table.column [, table.filter_column ] [, edge_configuration ] )
  • table.column: Data transformed to SOURCE / TARGET layout.
  • table.filter_column: Rows in which the value of the filter column are null, are ignored by the SOURCE / TARGET operators.
  • table.edge_configuration: Describes which edges between the activities of a case are considered. See below for detailed explanation.

Examples


This example shows how SOURCE / TARGET operators interpret the case by default (no filter or edge configuration):


Query
Column1
SOURCE ( "ACTIVITIES"."ACTIVITY" )
Column2
SOURCE ( "ACTIVITIES"."TIMESTAMP" )
Column3
TARGET ( "ACTIVITIES"."ACTIVITY" )
Column4
TARGET ( "ACTIVITIES"."TIMESTAMP" )
Input
ACTIVITIES
CASE_ID : STRINGACTIVITY : STRINGTIMESTAMP : DATE
'1''A'
Sun Jan 01 2017 13:00:00.000
'1''B'
Sun Jan 01 2017 13:15:00.000
'1''C'
Sun Jan 01 2017 13:45:00.000
Output
Result
Column1 : STRINGColumn2 : DATEColumn3 : STRINGColumn4 : DATE
'A'
Sun Jan 01 2017 13:00:00.000
'B'
Sun Jan 01 2017 13:15:00.000
'B'
Sun Jan 01 2017 13:15:00.000
'C'
Sun Jan 01 2017 13:45:00.000



This can be combined with further operators. In the following example the time between two activities is calculated. REMAP_TIMESTAMPS is required to convert the timestamp to a number such that the subtraction can be done. Alternatively, MINUTES_BETWEEN can be used.

Query
Column1
SOURCE ( "ACTIVITIES"."ACTIVITY" )
Column2
TARGET ( "ACTIVITIES"."ACTIVITY" )
Column3
TARGET ( REMAP_TIMESTAMPS ( "ACTIVITIES"."TIMESTAMP", MINUTES ) ) - SOURCE ( REMAP_TIMESTAMPS ( "ACTIVITIES"."TIMESTAMP", MINUTES ) )
Column4
MINUTES_BETWEEN ( SOURCE ( "ACTIVITIES"."TIMESTAMP" ), TARGET ( "ACTIVITIES"."TIMESTAMP") )
Input
ACTIVITIES
CASE_ID : STRINGACTIVITY : STRINGTIMESTAMP : DATE
'1''A'
Sun Jan 01 2017 13:00:00.000
'1''B'
Sun Jan 01 2017 13:15:00.000
'1''C'
Sun Jan 01 2017 13:45:00.000
Output
Result
Column1 : STRINGColumn2 : STRINGColumn3 : INTColumn4 : FLOAT
'A''B'
15
15.0
'B''C'
30
30.0



In this example, the case frequency of each edge is calculated. The case frequency is the number of cases in which the edge occurs.

Query
Column1
SOURCE ( "ACTIVITIES"."ACTIVITY" )
Column2
TARGET ( "ACTIVITIES"."ACTIVITY" )
Column3
COUNT ( DISTINCT TARGET ( "ACTIVITIES"."CASE_ID" ) )
Input
ACTIVITIES
CASE_ID : STRINGACTIVITY : STRINGTIMESTAMP : DATE
'1''A'
Sun Jan 01 2017 13:00:00.000
'1''B'
Sun Jan 01 2017 13:01:00.000
'1''C'
Sun Jan 01 2017 13:02:00.000
'1''A'
Sun Jan 01 2017 13:03:00.000
'1''B'
Sun Jan 01 2017 13:04:00.000
'1''C'
Sun Jan 01 2017 13:05:00.000
'2''A'
Sun Jan 01 2017 13:06:00.000
'2''B'
Sun Jan 01 2017 13:07:00.000
'2''D'
Sun Jan 01 2017 13:08:00.000
'2''B'
Sun Jan 01 2017 13:09:00.000
'2''D'
Sun Jan 01 2017 13:10:00.000
Output
Result
Column1 : STRINGColumn2 : STRINGColumn3 : INT
'C''A'
1
'A''B'
2
'D''B'
1
'B''C'
1
'B''D'
1



In this example, the activity frequency of each edge is calculated. The activity frequency is the number of times the edge occurs.

Query
Column1
SOURCE ( "ACTIVITIES"."ACTIVITY" )
Column2
TARGET ( "ACTIVITIES"."ACTIVITY" )
Column3
COUNT ( TARGET ( "ACTIVITIES"."ACTIVITY" ) )
Input
ACTIVITIES
CASE_ID : STRINGACTIVITY : STRINGTIMESTAMP : DATE
'1''A'
Sun Jan 01 2017 13:00:00.000
'1''B'
Sun Jan 01 2017 13:01:00.000
'1''C'
Sun Jan 01 2017 13:02:00.000
'1''A'
Sun Jan 01 2017 13:03:00.000
'1''B'
Sun Jan 01 2017 13:04:00.000
'1''C'
Sun Jan 01 2017 13:05:00.000
'2''A'
Sun Jan 01 2017 13:06:00.000
'2''B'
Sun Jan 01 2017 13:07:00.000
'2''D'
Sun Jan 01 2017 13:08:00.000
'2''B'
Sun Jan 01 2017 13:09:00.000
'2''D'
Sun Jan 01 2017 13:10:00.000
Output
Result
Column1 : STRINGColumn2 : STRINGColumn3 : INT
'C''A'
1
'A''B'
3
'D''B'
1
'B''C'
2
'B''D'
2



In the following example the filter column is used to filter out all activities B. In this example that leads to a direct edge between activity A and C.

Query
Column1
SOURCE ( "ACTIVITIES"."ACTIVITY", REMAP_VALUES ( "ACTIVITIES"."ACTIVITY", [ 'B', NULL ] ) )
Column2
TARGET ( "ACTIVITIES"."ACTIVITY", REMAP_VALUES ( "ACTIVITIES"."ACTIVITY", [ 'B', NULL ] ) )
Column3
TARGET ( REMAP_TIMESTAMPS ( "ACTIVITIES"."TIMESTAMP", MINUTES ), REMAP_VALUES ( "ACTIVITIES"."ACTIVITY", [ 'B', NULL ] ) ) - SOURCE ( REMAP_TIMESTAMPS ( "ACTIVITIES"."TIMESTAMP", MINUTES ), REMAP_VALUES ( "ACTIVITIES"."ACTIVITY", [ 'B', NULL ] ) )
Input
ACTIVITIES
CASE_ID : STRINGACTIVITY : STRINGTIMESTAMP : DATE
'1''A'
Sun Jan 01 2017 13:00:00.000
'1''B'
Sun Jan 01 2017 13:15:00.000
'1''C'
Sun Jan 01 2017 13:45:00.000
Output
Result
Column1 : STRINGColumn2 : STRINGColumn3 : INT
'A''C'
45


Edge Configuration

With this parameter the edge configuration can be defined. The edge configuration describes which edges between the activities of a case are considered.

  • edge_configuration: The following combinations are supported:
    • ANY_OCCURRENCE[] TO ANY_OCCURRENCE[]: Default configuration, edges are drawn between consecutive activities within cases.

    • FIRST_OCCURRENCE[] TO ANY_OCCURRENCE[]: From the first activity of each case an edge is drawn to all other activities in this case.

    • FIRST_OCCURRENCE[] TO ANY_OCCURRENCE_WITH_SELF[]: From the first activity of each case an edge is drawn to all activities in this case, including the first one.

    • ANY_OCCURRENCE[] TO LAST_OCCURRENCE[]: All activities, besides the last one of the case, draw an edge to the last activity of the case.

    • FIRST_OCCURRENCE[] TO LAST_OCCURRENCE[]: From the first activity of a case an edge is drawn to the last activity in this case.

Example


Calculate the number of minutes since case start:

Query
Column1
SOURCE ( "ACTIVITIES"."ACTIVITY" )
Column2
TARGET ( "ACTIVITIES"."ACTIVITY" )
Column3
TARGET ( REMAP_TIMESTAMPS ( "ACTIVITIES"."TIMESTAMP", MINUTES ), FIRST_OCCURRENCE[] TO ANY_OCCURRENCE[] ) - SOURCE ( REMAP_TIMESTAMPS ( "ACTIVITIES"."TIMESTAMP", MINUTES ), FIRST_OCCURRENCE[] TO ANY_OCCURRENCE[] )
Input
ACTIVITIES
CASE_ID : STRINGACTIVITY : STRINGTIMESTAMP : DATE
'1''A'
Sun Jan 01 2017 13:00:00.000
'1''B'
Sun Jan 01 2017 13:15:00.000
'1''C'
Sun Jan 01 2017 13:45:00.000
Output
Result
Column1 : STRINGColumn2 : STRINGColumn3 : INT
'A''B'
15
'A''C'
45


Configuration propagation

SOURCE / TARGET can be configured in many ways. Except in very rare corner cases, it is desired to apply the same configuration to all SOURCE / TARGET operators in a table or a filter statement. Therefore a configuration has to be specified only once and is automatically propagated to all other occurences of SOURCE / TARGET in the same table or filter statement. A configuration is only propageted to SOURCE / TARGET operators which don't have a own configuration and it is also never done across a filter and a table statement, even if they are part of the same component.

Join behavior

SOURCE / TARGET generate temporary tables. For each used unique edge and filter column configuration a separate temporary table is created. SOURCE / TARGET operators with identical edge and filter column add columns to the same temporary table. The temporary tables are joined to the case table.

Example

In this example several calls to the SOURCE / TARGET operator are done. After each call the tables status is shown.

  • Temporary edge table is created. SOURCE column is added to it:
    SOURCE ( Activity, ANY_OCCURRENCE[] TO ANY_OCCURRENCE[] )

  • SOURCE column is added to existing temporary table:
    TARGET ( Activity, ANY_OCCURRENCE[] TO ANY_OCCURRENCE[] )

  • This column is added to a new temporary table, because the configuration is different to the previous ones:
    SOURCE ( Activity, Filter, ANY_OCCURRENCE[] TO ANY_OCCURRENCE[] )

Multiple temporary tables can lead to problems if used within the same query. This happens if SOURCE / TARGET operators with different edge and filter configurations are used within one query. The generated temporary tables can by default not be joined together and an error is returned. If one really needs to have two different configurations, the values in the temporary tables have be pulled up with PU functions to the case table.

Examples


In this example, the filter set inside the SOURCE operator is also applied to the TARGET, generating one single temporary edge table:

Query
Column1
SOURCE ( "ACTIVITIES"."ACTIVITY", REMAP_VALUES ( "ACTIVITIES"."ACTIVITY", [ 'B', NULL ], [ 'C', NULL ] ) )
Column2
TARGET ( "ACTIVITIES"."ACTIVITY" )
Input
ACTIVITIES
CASE_ID : STRINGACTIVITY : STRINGTIMESTAMP : DATE
'1''A'
Sun Jan 01 2017 13:00:00.000
'1''B'
Sun Jan 01 2017 13:01:00.000
'1''C'
Sun Jan 01 2017 13:02:00.000
'1''D'
Sun Jan 01 2017 13:03:00.000
Output
Result
Column1 : STRINGColumn2 : STRING
'A''D'



Two different configurations are passed to SOURCE / TARGET operators inside the same query. Two different temporary tables are generated, which are pulled up to the case table using PU_FIRST. The first result column contains 'B', which is the first source activity name not filtered out by the SOURCE configuration. The second result column contains 'D', which is the target activity of the first result edge ('A' → 'D') resulting from the TARGET configuration.

Query
Column1
PU_FIRST ( "ACTIVITIES_CASES", SOURCE ( "ACTIVITIES"."ACTIVITY", REMAP_VALUES ( "ACTIVITIES"."ACTIVITY", [ 'A', NULL ], [ 'D', NULL ] ) ) )
Column2
PU_FIRST ( "ACTIVITIES_CASES", TARGET ( "ACTIVITIES"."ACTIVITY", ANY_OCCURRENCE[] TO LAST_OCCURRENCE[] ) )
Input
ACTIVITIES
CASE_ID : STRINGACTIVITY : STRINGTIMESTAMP : DATE
'1''A'
Sun Jan 01 2017 13:00:00.000
'1''B'
Sun Jan 01 2017 13:01:00.000
'1''C'
Sun Jan 01 2017 13:02:00.000
'1''D'
Sun Jan 01 2017 13:03:00.000
ACTIVITIES_CASES
CASE_ID : STRING
'1'

Foreign Keys
ACTIVITIES_CASES.CASE_IDACTIVITIES.CASE_ID
Output
Result
Column1 : STRINGColumn2 : STRING
'B''D'


Filter Behavior

We have seen that the temporary edge tables are joined to the case table. This means that if a case is filtered out in the case table, all activities related to that case in the temporary tables are also filtered out. This holds also vice versa, so if all activities related to a case are filtered out, the case itself is also filtered out in the case table. If not all activities of a case are filtered out, the case stays in the case table.

Filter behavior example

  • No labels