Applies to: CELONIS 4.3, CELONIS 4.4, CELONIS 4.5, CELONIS 4.6

Description

This function calculates the z-score of each value in an INT or FLOAT column. The output type is always FLOAT.

ZScore standardizes data by mapping each value to its distance from the column mean, expressed in multiples of the standard deviation. This is especially useful for evaluating simple 2-, 3- or 6-sigma rules for outlier detection on a column.
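As an illustration of the sigma rules mentioned above, here is a minimal Python sketch (not PQL) that computes z-scores and flags values whose absolute z-score exceeds a threshold k. It assumes the sample standard deviation (n - 1 denominator), which is consistent with the outputs shown in the examples on this page. The helper names `zscores` and `sigma_outliers` and the `readings` data are hypothetical.

```python
import math

def zscores(values):
    """Return (x - mean) / s for each x, where s is the sample standard deviation."""
    n = len(values)
    mean = sum(values) / n
    # Assumption: n - 1 denominator, inferred from the example outputs on this page.
    variance = sum((x - mean) ** 2 for x in values) / (n - 1)
    std = math.sqrt(variance)
    return [(x - mean) / std for x in values]

def sigma_outliers(values, k=3.0):
    """Return the values whose absolute z-score exceeds k (a k-sigma rule)."""
    return [x for x, z in zip(values, zscores(values)) if abs(z) > k]

# Hypothetical data: nineteen identical readings and one extreme value.
readings = [10.0] * 19 + [100.0]
flagged = sigma_outliers(readings, k=3.0)
```

With this data only the value 100.0 lies more than three standard deviations from the mean, so it is the only value flagged. The sketch assumes at least two values and a non-constant input; otherwise the standard deviation is undefined or zero.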

Syntax

ZSCORE ( table.column )

Tips

  • The calculation is parallelized. The function first calculates the variance in two parallelized passes over the data in the column, then calculates the ZScore over the unique values of the column in parallel.
  • The ZScore of large FLOAT columns can have a large numerical error.
  • To calculate the ZScore over a DATE column, first apply a DateTime projection function such as HOURS, as in example [2].
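The order of operations described in the first tip (two passes for the variance, then one z-score per unique value) can be sketched sequentially in Python. This is only an illustration of the steps, not the engine's implementation: the parallelization is not shown, the n - 1 denominator is an assumption inferred from the example outputs on this page, and the function name `zscore_two_pass` is hypothetical.

```python
import math

def zscore_two_pass(column):
    """Sequential sketch: two passes for the variance, then z per unique value.

    Assumes at least two rows and a non-constant column.
    """
    n = len(column)
    # Pass 1: mean of the column. math.fsum is used here to limit accumulated
    # rounding error; that is a choice made for this sketch only.
    mean = math.fsum(column) / n
    # Pass 2: sum of squared deviations from the mean.
    squared_deviations = math.fsum((x - mean) ** 2 for x in column)
    std = math.sqrt(squared_deviations / (n - 1))  # assumption: sample std
    # Compute the z-score once per distinct value, then map it back to every row.
    z_by_value = {v: (v - mean) / std for v in set(column)}
    return [z_by_value[x] for x in column]

scores = zscore_two_pass([1.0, 1.0, 7.0, 7.0, 4.0])
```

Computing the z-score once per distinct value means repeated values share a single division, which pays off when a column has few unique values relative to its row count.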

Examples


[1] Simple ZSCORE calculation over FLOAT column.

Query
Column1
ZSCORE("TABLE"."COLUMN")
Input
TABLE
COLUMN : FLOAT
1.0
1.0
7.0
7.0
4.0
Output
Result
Column1 : FLOAT
-1.0
-1.0
1.0
1.0
0.0
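
The output of example [1] can be checked by hand. The working below assumes the sample standard deviation (n - 1 denominator); with that assumption the arithmetic reproduces the output shown above.

```latex
\bar{x} = \frac{1 + 1 + 7 + 7 + 4}{5} = 4
\qquad
s = \sqrt{\frac{(-3)^2 + (-3)^2 + 3^2 + 3^2 + 0^2}{5 - 1}} = \sqrt{\frac{36}{4}} = 3
\qquad
z_i = \frac{x_i - \bar{x}}{s} = \frac{x_i - 4}{3}
\;\Rightarrow\; (-1,\; -1,\; 1,\; 1,\; 0)
```

With a population denominator (n instead of n - 1), s would be about 2.683 and the z-scores of 1.0 and 7.0 would be about -1.118 and 1.118, which does not match the output.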



[2] ZSCORE in combination with a date column using HOURS as projection function.

Query
Column1
ZSCORE(HOURS("TABLE"."COLUMN"))
Input
TABLE
COLUMN : DATE
Thu Jan 01 1970 06:00:00.000
Thu Jan 01 1970 07:00:00.000
Thu Jan 01 1970 08:00:00.000
Thu Jan 01 1970 09:00:00.000
Thu Jan 01 1970 10:00:00.000
Thu Jan 01 1970 11:00:00.000
Thu Jan 01 1970 12:00:00.000
Thu Jan 01 1970 13:00:00.000
Thu Jan 01 1970 14:00:00.000
Thu Jan 01 1970 18:00:00.000
Thu Jan 01 1970 08:00:00.000
Thu Jan 01 1970 09:00:00.000
Thu Jan 01 1970 08:00:00.000
Output
Result
Column1 : FLOAT
-1.2741407711983999
-0.9729802252787779
-0.6718196793591561
-0.3706591334395343
-0.06949858751991247
0.23166195839970935
0.5328225043193312
0.833983050238953
1.1351435961585747
2.339785779837062
-0.6718196793591561
-0.3706591334395343
-0.6718196793591561
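
The output of example [2] can be reproduced outside PQL with a short Python sketch. It assumes that HOURS projects each timestamp to its hour of day, modeled here with `datetime.hour`, and that the standard deviation uses the n - 1 denominator. The helper name `zscores` is hypothetical.

```python
import math
from datetime import datetime

def zscores(values):
    """Return (x - mean) / s for each x, where s is the sample standard deviation."""
    n = len(values)
    mean = sum(values) / n
    # Assumption: n - 1 denominator, inferred from the example outputs.
    std = math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))
    return [(x - mean) / std for x in values]

# The thirteen input timestamps of example [2], all on Jan 01 1970.
input_hours = (6, 7, 8, 9, 10, 11, 12, 13, 14, 18, 8, 9, 8)
timestamps = [datetime(1970, 1, 1, h) for h in input_hours]

# Stand-in for HOURS("TABLE"."COLUMN"): project each timestamp to its hour.
hours = [float(t.hour) for t in timestamps]
result = zscores(hours)

for z in result:
    print(z)
```

The printed values agree with the output column above up to floating-point rounding in the last digits.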

