Skip to main content

PQL46 (PQL Function Library - CPM 4.6)

Join functionality

Applies to: CELONIS 4.0 CELONIS 4.2 CELONIS 4.3 CELONIS 4.4 CELONIS 4.5 CELONIS 4.6

Description

Celonis performs implicit joins when a query accesses several tables. This means that the user does not have to define the join as part of a query, but the tables are joined as defined in the Data Model Editor.

Implicit joins

[1] Here "ActivityTable" and "CaseTable" are queried. The customer information of the "CaseTable" is joined to the activities of the "ActivityTable".

Query

Column1

"ActivityTable"."Activity"

Column2

"CaseTable"."Customer"

Input

Output

ActivityTable

ACTIVITY : STRING

CASE : INT

'A'

1

'B'

1

'A'

2

'C'

2

'D'

3

'E'

null

CaseTable

ID : INT

Customer : STRING

1

'Customer - X'

2

'Customer - Y'

4

'Customer - Z'

null

'Customer - NULL'

Foreign Keys

ActivityTable.Case

CaseTable.ID

Result

Column1 : STRING

Column2 : STRING

'A'

'Customer - X'

'B'

'Customer - X'

'A'

'Customer - 'Y

'C'

'Customer - Y'

'D'

null

'E'

null

Joins are always done as outer left equi join. In this example the ActivityTable is on the left side of the join. Therefore the activity D which doesn't have a join partner in the case table, is part of the result, while Case ID 4 is not. Also it is noteworthy that null values are never joined, as it is the case for activity E.

Which table is on the left side of the join, depends on the join cardinality. Celonis supports 1 - N relations. 1 - N relation means that if for example Table A and B are joined together, every row in B can have zero or one join partner in table A but not more. The other way around there is no limitation. Every row in A can have an arbitrary number of join partners. This sounds maybe limiting but 1 - N join can actually also process N - 1 and 1 - 1 relations. N - 1 relations will be simply turned around to 1 - N relations by Celonis. A 1 - 1 relation is only a special case of a 1 - N relation. But N - M joins are not supported because they are rare, can be transformed into simpler relations and make it much harder to understand the result.

The N side is always put on the left side of the join. In this example, it is easy to see that case 1 and 2 have two join partners in the activity table, while every activity relates to exactly one case. Therefore the ActivityTable is on the left side of the join.

Parent table

Not every table can be joined to every other table within one data model. To understand which tables can be joined together one has to know that Celonis supports an extended Snowflake schema for a data model. A Snowflake schema consists of one fact table which can have multiple N - 1 relations to dimension tables. These dimension tables can themselves have N - 1 relations to other tables. Celonis extends the classic Snowflake schema by supporting multiple fact tables but no cyclic join graphs. To resolve a cyclic join graph you may have to add a table multiple times to the data model.

We categorize the tables in Parent tables and Child tables. The term becomes clear if a schema is interpreted as a family tree where the fact table is the root as can be seen in the following example:

child_parent_table1.png

For Dim. Table (DT) A and DT B the Fact Table is the parent table. For DT C and DT D the DT A is the common parent table and the Fact Table the "grand parent" table. To process a join Celonis first identifies the closest common ancestor. For example if a column from DT A should be joined with a column from DT B, Celonis identifies the Fact Table as closest common parent. It then joins the needed columns from DT A and DT B to the Fact Table. We call this step pulling up the data. The closest common ancestor doesn't always have to be the Fact Table. It also can be another dimension table like for DT. C and DT D for which DT A is the closest common ancestor.

This works perfectly as long as you have only one Fact Table. If you have multiple it might not be possible to identify a common ancestor for your join. To get a better understanding of the issue lets look at the following schema:

child_parent_table2.png

DT A and DT B still can be joined. DT B can be joined with Fact Table 1 and also with Fact Table 2.But DT A can not be joined with Fact Table 2. Also the two Fact Tables can not be joined and would result in the following error:

no_common_parent_error.png