Applies to:  CELONIS 4.6 CELONIS 4.7 

Description

INDEX_ORDER returns a column with integer indices, starting from 1. The indices indicate the order of the rows.

The INDEX_ORDER function creates an INT index column. The index column contains a unique integer value for every row of a required reference input column.

INDEX_ORDER can be applied to INT, FLOAT, DATE or STRING columns. The result is a column of type INT.

Syntax

INDEX_ORDER returns integer indices based on the given sorting and partition. In this preferred syntax, ORDER BY and PARTITION BY can be used to define the sorting and partitioning that should be used (Since:  CELONIS 4.7):

INDEX_ORDER ( column [, ORDER BY ( sort_column [sorting], ... )] [, PARTITION BY ( partition_column, ... )] )
  • column: The source column on which the index column will be based.
  • sort_column: Optional sorting column to specify an order.
  • sorting: Each of these columns can have an optional tag specifying the ordering of the column. Default is ascending:
    • ASC: Ascending order
    • DESC: Descending order
  • partition_column: Optional partition column to specify groups in which INDEX_ORDER should operate.

Ordering

One or more columns can be given to specify an ordering. This tells the INDEX_ORDER function what the preceding element actually is. Optionally every column can be tagged as ascending or descending. Equal input values are always given distinct output indices. The indices for a given range of equal input values are incremented starting from the topmost value of this range of the input column:


[1] Create an index order for an input column that contains equal values.

Query
Column1
INDEX_ORDER ( "Table1"."Column1" )
Input
Table1
Column1 : INT
1
1
2
2
2
Output
Result
Column1 : INT
1
2
3
4
5


Partitioning

The partition columns specify groups. The INDEX_ORDER function operates independently within every group. This means when an ordering is given it is applied within every group.

Null handling

NULL values are ignored, meaning that the NULL value rows stay NULL in the result column and will not count as an index order value.


[2] Create an index order for an integer input column that includes NULL values:

Query
Column1
INDEX_ORDER ( "Table1"."Column1" )
Input
Table1
Column1 : INT
11
null
14
null
15
Output
Result
Column1 : INT
1
null
2
null
3


Deprecated Behavior

As an alternative to the recommended syntax, the INDEX_ORDER function can be called as follows: [ DEPRECATED SINCE 4.7  ].

INDEX_ORDER ( <column> [, <sorting> ] [, GROUP( <group_column>, ... ) ] )
  • column: The source column on which the index column will be based.
  • sorting: An optional sorting of the indices, which defaults to ascending.
    • ASC: Ascending index order values.
    • DESC: Descending index order values.
  • grouping: Optional grouping columns. Independent index order ranges will be created for each group, all starting from 1.
The values of the indices are based on the order of the values in the input column, meaning that the input column is automatically sorted.

Examples


[3] Create an index order for an integer input column:

Query
Column1
INDEX_ORDER ( "Table1"."Column1" )
Input
Table1
Column1 : INT
11
13
14
12
15
Output
Result
Column1 : INT
1
2
3
4
5



[4] Create an index order for a date input column:

Query
Column1
INDEX_ORDER ( "Table1"."Date" )
Input
Table1
Date : DATE
Tue Jan 01 2019 00:12:30.000
Tue Jan 01 2019 00:12:40.000
Tue Jan 01 2019 00:12:00.000
Tue Jan 01 2019 00:11:30.000
Tue Jan 01 2019 00:18:00.000
Tue Jan 01 2019 00:19:00.000
Output
Result
Column1 : INT
1
2
3
4
5
6



[5] Create an index order for an input column in ascending order:

Query
Column1
INDEX_ORDER ( "Table1"."Column1" , ORDER BY ( "Table1"."Column1" ASC ) )
Input
Table1
Column1 : INT
11
13
14
12
15
Output
Result
Column1 : INT
1
3
4
2
5



[6] Create an index order with a joined case table:

Query
Column1
"companyDetail"."country"
Column2
INDEX_ORDER ( "caseTable"."value" )
Input
caseTable
caseId : INTcompanyCode : STRINGvalue : INT
1
'001'
600
2
'002'
400
3
'003'
200
companyDetail
companyCode : STRINGcountry : STRING
'001''DE'
'002''FR'
'003''US'

Foreign Keys
companyDetail.companyCodecaseTable.companyCode
Output
Result
Column1 : STRINGColumn2 : INT
'DE'
1
'FR'
2
'US'
3



[7] Create an index order on an aggregation function:

Query
Column1
"companyDetail"."country"
Column2
INDEX_ORDER ( COUNT_TABLE ( "caseTable" ) )
Input
caseTable
caseId : INTcompanyCode : STRINGvalue : INT
1
'001'
600
2
'002'
400
3
'003'
200
companyDetail
companyCode : STRINGcountry : STRING
'001''DE'
'002''FR'
'003''US'

Foreign Keys
companyDetail.companyCodecaseTable.companyCode
Output
Result
Column1 : STRINGColumn2 : INT
'DE'
1
'FR'
2
'US'
3



[8] Create an index order for a date input column with equal values:

Query
Column1
INDEX_ORDER ( "Table1"."Date" , ORDER BY ( "Table1"."Date" ) )
Input
Table1
Date : DATE
Tue Jan 01 2019 00:12:30.000
Tue Jan 01 2019 00:12:30.000
Tue Jan 01 2019 00:12:30.000
Tue Jan 01 2019 00:11:30.000
Tue Jan 01 2019 00:18:00.000
Tue Jan 01 2019 00:12:30.000
Output
Result
Column1 : INT
2
3
4
1
6
5



[9] Create an index order for a string input column with a function:

Query
Column1
INDEX_ORDER ( RIGHT ( "Table1"."Column1" , 1 ) )
Input
Table1
Column1 : STRING
'AO'
'BP'
'CQ'
'DR'
'ES'
Output
Result
Column1 : INT
1
2
3
4
5



[10] Create an descending index order for an integer input column that includes NULL values:

Query
Column1
INDEX_ORDER ( "Table1"."Column1" , ORDER BY ( "Table1"."Column1" DESC ) )
Input
Table1
Column1 : INT
11
null
14
null
15
Output
Result
Column1 : INT
3
null
2
null
1



[11] Create an index order for each partition:

Query
Column1
"countryDetail"."country"
Column2
INDEX_ORDER (
    "countryDetail"."value" ,
    ORDER BY ( "countryDetail"."value" ) ,
    PARTITION BY ( "countryDetail"."country" )
)
Input
caseTable
caseId : INTcompanyCode : STRINGvalue : INT
1
'001'
600
2
'002'
400
3
'003'
200
countryDetail
value : INTcountry : STRINGcity : STRING
100
'DE''Munich'
300
'FR''Amiens'
300
'US''Tucson'
200
'DE''Munich'
200
'FR''Paris'
100
'US''Seattle'
300
'DE''Hildesheim'
100
'FR''Amiens'
200
'US''Tucson'

Foreign Keys
countryDetail.valuecaseTable.value
Output
Result
Column1 : STRINGColumn2 : INT
'DE'
1
'FR'
3
'US'
3
'DE'
2
'FR'
2
'US'
1
'DE'
3
'FR'
1
'US'
2



[12] Create an index order for each group, as given by more columns:

Query
Column1
"countryDetail"."country"
Column2
"countryDetail"."city"
Column3
INDEX_ORDER (
    "countryDetail"."value" ,
    ORDER BY ( "countryDetail"."value" ) ,
    PARTITION BY ( "countryDetail"."country" , "countryDetail"."city" )
)
Input
caseTable
caseId : INTcompanyCode : STRINGvalue : INT
1
'001'
600
2
'002'
400
3
'003'
200
countryDetail
value : INTcountry : STRINGcity : STRING
100
'DE''Munich'
300
'FR''Amiens'
300
'US''Tucson'
200
'DE''Munich'
200
'FR''Paris'
100
'US''Seattle'
300
'DE''Hildesheim'
100
'FR''Amiens'
200
'US''Tucson'

Foreign Keys
countryDetail.valuecaseTable.value
Output
Result
Column1 : STRINGColumn2 : STRINGColumn3 : INT
'DE''Munich'
1
'FR''Amiens'
2
'US''Tucson'
2
'DE''Munich'
2
'FR''Paris'
1
'US''Seattle'
1
'DE''Hildesheim'
1
'FR''Amiens'
1
'US''Tucson'
1


Advanced Example: Case Coverage

The INDEX_ORDER function allows to calculate KPIs, that require sorting. One example is calculating how many cases are covered by the N most common process variants:


[13] Given two cities, we want to find out how many cases are covered by the two most common variants for each city. Let us first look at the process variants.

Query
Column1
"Cases"."CITY"
Column2
VARIANT ( "Activities"."ACTIVITY" )
Column3
COUNT ( "Cases"."CASE_ID" )
Input
Activities
CASE_ID : INTACTIVITY : STRINGTIMESTAMP : DATE
1
'A'
Fri Feb 01 2019 00:00:00.000
1
'B'
Sat Feb 02 2019 00:00:00.000
1
'C'
Sun Feb 03 2019 00:00:00.000
2
'A'
Fri Mar 01 2019 00:00:00.000
2
'B'
Sat Mar 02 2019 00:00:00.000
2
'C'
Sun Mar 03 2019 00:00:00.000
3
'A'
Mon Apr 01 2019 00:00:00.000
3
'B'
Tue Apr 02 2019 00:00:00.000
3
'C'
Wed Apr 03 2019 00:00:00.000
4
'A'
Wed May 01 2019 00:00:00.000
4
'B'
Thu May 02 2019 00:00:00.000
4
'C'
Fri May 03 2019 00:00:00.000
5
'R'
Sat Jun 01 2019 00:00:00.000
5
'S'
Sun Jun 02 2019 00:00:00.000
5
'T'
Mon Jun 03 2019 00:00:00.000
6
'A'
Mon Jul 01 2019 00:00:00.000
6
'B'
Tue Jul 02 2019 00:00:00.000
6
'C'
Wed Jul 03 2019 00:00:00.000
7
'R'
Thu Aug 01 2019 00:00:00.000
7
'S'
Fri Aug 02 2019 00:00:00.000
7
'T'
Sat Aug 03 2019 00:00:00.000
8
'R'
Sun Sep 01 2019 00:00:00.000
8
'S'
Mon Sep 02 2019 00:00:00.000
8
'T'
Tue Sep 03 2019 00:00:00.000
9
'R'
Tue Oct 01 2019 00:00:00.000
9
'S'
Wed Oct 02 2019 00:00:00.000
9
'T'
Thu Oct 03 2019 00:00:00.000
10
'X'
Fri Nov 01 2019 00:00:00.000
10
'Y'
Sat Nov 02 2019 00:00:00.000
10
'Z'
Sun Nov 03 2019 00:00:00.000
Cases
CASE_ID : INTCITY : STRING
1
'Casablanca'
2
'Casablanca'
3
'Casablanca'
4
'Casablanca'
5
'Casablanca'
6
'G?ttingen'
7
'G?ttingen'
8
'G?ttingen'
9
'G?ttingen'
10
'G?ttingen'

Foreign Keys
Cases.CASE_IDActivities.CASE_ID
Output
Result
Column1 : STRINGColumn2 : STRINGColumn3 : INT
'Casablanca''A, B, C'
4
'G?ttingen''A, B, C'
1
'Casablanca''R, S, T'
1
'G?ttingen''R, S, T'
3
'G?ttingen''X, Y, Z'
1



[14] In this step we add:
  • a count of how often a variant occurs per city.
  • an index order, which gives an index sorted by how often the different variants occur, per city.

Query
Column1
"Cases"."CITY"
Column2
VARIANT ( "Activities"."ACTIVITY" )
Column3
COUNT ( "Cases"."CASE_ID" )
Column4
INDEX_ORDER (
    PU_COUNT (
        DOMAIN_TABLE ( VARIANT ( "Activities"."ACTIVITY" ) , "Cases"."CITY" ) ,
        "Cases"."CASE_ID"
    ) ,
    ORDER BY (
        PU_COUNT (
            DOMAIN_TABLE ( VARIANT ( "Activities"."ACTIVITY" ) , "Cases"."CITY" ) ,
            "Cases"."CASE_ID"
        ) DESC
    ) ,
    PARTITION BY (
        PU_FIRST (
            DOMAIN_TABLE ( VARIANT ( "Activities"."ACTIVITY" ) , "Cases"."CITY" ) ,
            "Cases"."CITY"
        )
    )
)
Input
Activities
CASE_ID : INTACTIVITY : STRINGTIMESTAMP : DATE
1
'A'
Fri Feb 01 2019 00:00:00.000
1
'B'
Sat Feb 02 2019 00:00:00.000
1
'C'
Sun Feb 03 2019 00:00:00.000
2
'A'
Fri Mar 01 2019 00:00:00.000
2
'B'
Sat Mar 02 2019 00:00:00.000
2
'C'
Sun Mar 03 2019 00:00:00.000
3
'A'
Mon Apr 01 2019 00:00:00.000
3
'B'
Tue Apr 02 2019 00:00:00.000
3
'C'
Wed Apr 03 2019 00:00:00.000
4
'A'
Wed May 01 2019 00:00:00.000
4
'B'
Thu May 02 2019 00:00:00.000
4
'C'
Fri May 03 2019 00:00:00.000
5
'R'
Sat Jun 01 2019 00:00:00.000
5
'S'
Sun Jun 02 2019 00:00:00.000
5
'T'
Mon Jun 03 2019 00:00:00.000
6
'A'
Mon Jul 01 2019 00:00:00.000
6
'B'
Tue Jul 02 2019 00:00:00.000
6
'C'
Wed Jul 03 2019 00:00:00.000
7
'R'
Thu Aug 01 2019 00:00:00.000
7
'S'
Fri Aug 02 2019 00:00:00.000
7
'T'
Sat Aug 03 2019 00:00:00.000
8
'R'
Sun Sep 01 2019 00:00:00.000
8
'S'
Mon Sep 02 2019 00:00:00.000
8
'T'
Tue Sep 03 2019 00:00:00.000
9
'R'
Tue Oct 01 2019 00:00:00.000
9
'S'
Wed Oct 02 2019 00:00:00.000
9
'T'
Thu Oct 03 2019 00:00:00.000
10
'X'
Fri Nov 01 2019 00:00:00.000
10
'Y'
Sat Nov 02 2019 00:00:00.000
10
'Z'
Sun Nov 03 2019 00:00:00.000
Cases
CASE_ID : INTCITY : STRING
1
'Casablanca'
2
'Casablanca'
3
'Casablanca'
4
'Casablanca'
5
'Casablanca'
6
'G?ttingen'
7
'G?ttingen'
8
'G?ttingen'
9
'G?ttingen'
10
'G?ttingen'

Foreign Keys
Cases.CASE_IDActivities.CASE_ID
Output
Result
Column1 : STRINGColumn2 : STRINGColumn3 : INTColumn4 : INT
'Casablanca''A, B, C'
4
1
'Casablanca''R, S, T'
1
2
'G?ttingen''A, B, C'
1
2
'G?ttingen''R, S, T'
3
1
'G?ttingen''X, Y, Z'
1
3



[15] Finally we want to report the fraction of cases which are covered by the two most common process variants, per city. To calculate the fraction, we use the PU_COUNT function to count cases per city:
  • Numerator: Inside the PU_COUNT function we only select the variants that are among the two most common (< 3) for that city.
  • Denominator: We apply the PU_COUNT function on all variants.

Query
Column1
"Cases"."CITY"
Column2
COUNT ( "Cases"."CASE_ID" )
Column3
PU_COUNT (
    DOMAIN_TABLE ( "Cases"."CITY" ) ,
    "Cases"."CASE_ID" ,
    INDEX_ORDER (
        PU_COUNT (
            DOMAIN_TABLE ( VARIANT ( "Activities"."ACTIVITY" ) , "Cases"."CITY" ) ,
            "Cases"."CASE_ID"
        ) ,
        ORDER BY (
            PU_COUNT (
                DOMAIN_TABLE ( VARIANT ( "Activities"."ACTIVITY" ) , "Cases"."CITY" ) ,
                "Cases"."CASE_ID"
            ) DESC
        ) ,
        PARTITION BY (
            PU_FIRST (
                DOMAIN_TABLE ( VARIANT ( "Activities"."ACTIVITY" ) , "Cases"."CITY" ) ,
                "Cases"."CITY"
            )
        )
    )
    <
    3
)
/
PU_COUNT ( DOMAIN_TABLE ( "Cases"."CITY" ) , "Cases"."CASE_ID" )
Input
Activities
CASE_ID : INTACTIVITY : STRINGTIMESTAMP : DATE
1
'A'
Fri Feb 01 2019 00:00:00.000
1
'B'
Sat Feb 02 2019 00:00:00.000
1
'C'
Sun Feb 03 2019 00:00:00.000
2
'A'
Fri Mar 01 2019 00:00:00.000
2
'B'
Sat Mar 02 2019 00:00:00.000
2
'C'
Sun Mar 03 2019 00:00:00.000
3
'A'
Mon Apr 01 2019 00:00:00.000
3
'B'
Tue Apr 02 2019 00:00:00.000
3
'C'
Wed Apr 03 2019 00:00:00.000
4
'A'
Wed May 01 2019 00:00:00.000
4
'B'
Thu May 02 2019 00:00:00.000
4
'C'
Fri May 03 2019 00:00:00.000
5
'R'
Sat Jun 01 2019 00:00:00.000
5
'S'
Sun Jun 02 2019 00:00:00.000
5
'T'
Mon Jun 03 2019 00:00:00.000
6
'A'
Mon Jul 01 2019 00:00:00.000
6
'B'
Tue Jul 02 2019 00:00:00.000
6
'C'
Wed Jul 03 2019 00:00:00.000
7
'R'
Thu Aug 01 2019 00:00:00.000
7
'S'
Fri Aug 02 2019 00:00:00.000
7
'T'
Sat Aug 03 2019 00:00:00.000
8
'R'
Sun Sep 01 2019 00:00:00.000
8
'S'
Mon Sep 02 2019 00:00:00.000
8
'T'
Tue Sep 03 2019 00:00:00.000
9
'R'
Tue Oct 01 2019 00:00:00.000
9
'S'
Wed Oct 02 2019 00:00:00.000
9
'T'
Thu Oct 03 2019 00:00:00.000
10
'X'
Fri Nov 01 2019 00:00:00.000
10
'Y'
Sat Nov 02 2019 00:00:00.000
10
'Z'
Sun Nov 03 2019 00:00:00.000
Cases
CASE_ID : INTCITY : STRING
1
'Casablanca'
2
'Casablanca'
3
'Casablanca'
4
'Casablanca'
5
'Casablanca'
6
'G?ttingen'
7
'G?ttingen'
8
'G?ttingen'
9
'G?ttingen'
10
'G?ttingen'

Foreign Keys
Cases.CASE_IDActivities.CASE_ID
Output
Result
Column1 : STRINGColumn2 : INTColumn3 : FLOAT
'G?ttingen'
5
0.8
'Casablanca'
5
1.0


  • No labels