Steps in case of a data model crash or a load failure in CPM4
A data model crash occurs either during a data model load or during query execution.
Engine crash during query execution
If a data model crashes during query execution, you'll find entries like the following in the CPM4 stdout log files.
java.util.concurrent.ExecutionException: com.celonis.engine.transports.exceptions.CrashedEngineException: Process Analysis Engine node 6ad2b785-d79f-4e8d-a356-5516ac5649fb has been shut down., Engine_Version=2.3.25, Stacktrace="null"
The log message contains the version number of the query engine and, if available, a stacktrace. The log also records the queries that caused the exception. All of this information is important if you forward the crash to the development team.
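If you need to collect these details from many log lines, a small script can pull them out. The following is a minimal sketch: the pattern is modeled only on the single sample line shown above, the function name `parse_crash_line` is made up for illustration, and other engine versions may format the message differently.

```python
import re

# Pattern derived from the one sample log line in this article (an assumption,
# not an official log format specification).
CRASH_RE = re.compile(
    r"CrashedEngineException: Process Analysis Engine node (?P<node>[0-9a-f-]+) "
    r"has been shut down\., Engine_Version=(?P<version>[^,\s]+), "
    r'Stacktrace="(?P<stacktrace>.*)"'
)


def parse_crash_line(line):
    """Return node ID, engine version, and stacktrace, or None if no match."""
    match = CRASH_RE.search(line)
    if match is None:
        return None
    info = match.groupdict()
    # The sample shows Stacktrace="null"; treating that literal as
    # "no stacktrace available" is an assumption.
    if info["stacktrace"] == "null":
        info["stacktrace"] = None
    return info


sample = (
    "java.util.concurrent.ExecutionException: "
    "com.celonis.engine.transports.exceptions.CrashedEngineException: "
    "Process Analysis Engine node 6ad2b785-d79f-4e8d-a356-5516ac5649fb "
    'has been shut down., Engine_Version=2.3.25, Stacktrace="null"'
)
print(parse_crash_line(sample))
```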
Engine crash during data model load
If a load fails without any further explanation, it is very likely that the engine crashed.
The number one root cause for a crash is the CPM4 server running out of memory. Verifying this type of crash is described below.
If that is not the case, you can try to narrow down the root cause by checking the data model configuration. The general approach is to check whether custom settings caused the issue. If they did, deactivate them and try another reload.
This can be a time-consuming process, but it is usually effective. Here is a ranked list of the issues that have caused the most problems in the past:
Check if parallel processes are activated.
Check if the end timestamp column is set.
Turn off pre-caching.
Turn off query caching.
Remove activity table configuration.
Remove joins.
Remove tables.
Try to simplify the data model step by step. When the data model successfully loads again, the failure is probably related to the last change.
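The step-by-step procedure above can be pictured as a simple loop. This is only a conceptual sketch: CPM4 settings are changed in the data model editor, and both `apply_step` and `reload_ok` are hypothetical callables that you would have to supply yourself (for example, as prompts that wait for you to make the change and reload manually).

```python
def find_breaking_setting(steps, reload_ok):
    """Apply simplification steps in order.

    steps:     list of (name, apply_step) pairs, ordered as in the list above;
               apply_step is a hypothetical callable that performs the change.
    reload_ok: hypothetical callable that reloads the data model and returns
               True if the load succeeded.

    Returns the name of the step after which the load first succeeds,
    or None if the load still fails after every step.
    """
    for name, apply_step in steps:
        apply_step()
        if reload_ok():
            # The failure is probably related to this last change.
            return name
    return None
```

Changing one setting per reload is what makes the result interpretable: if several settings are changed at once, a successful load no longer points to a single cause.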
Common root cause - out of memory
The number one reason for a data model crashing is insufficient main memory. On Linux, the operating system kills the data model process if there is not enough memory left. Such an event is recorded in the kernel log (see https://askubuntu.com/questions/709336/how-to-find-out-why-process-was-killed-on-server).
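To check the Linux kernel log for such events, you can search it for the messages written by the kernel's out-of-memory killer. The sketch below assumes the common wording "Out of memory: Killed process <pid> (<name>)" (older kernels write "Kill process"); the function name and the sample line are illustrative, and the process name of the engine on your system may differ.

```python
import re

# Matches "Out of memory: Killed process 4321 (java)" as well as the older
# wording "Out of memory: Kill process 4321 (java)".
OOM_RE = re.compile(
    r"Out of memory: Kill(?:ed)? process (?P<pid>\d+) \((?P<name>[^)]+)\)"
)


def find_oom_kills(kernel_log_text):
    """Return a list of (pid, process_name) for every OOM kill in the text."""
    return [(int(m["pid"]), m["name"]) for m in OOM_RE.finditer(kernel_log_text)]


# Illustrative kernel log line, not taken from a real CPM4 server.
sample = (
    "[1234567.890123] Out of memory: Killed process 4321 (java) "
    "total-vm:33554432kB, anon-rss:16252928kB, file-rss:0kB, shmem-rss:0kB\n"
)
print(find_oom_kills(sample))
```

The kernel log text can be obtained, for example, from the output of `dmesg` or `journalctl -k`; reading it may require elevated privileges depending on the system configuration.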
Windows systems only throw exceptions when they run out of memory, which can in turn cause a crash.
You can also check the CPM4 log, which contains entries similar to the following when available memory is low:
2019-10-02 23:01:00 INFO SchedulingService - Memory available 16383 MB, Memory used 15535 MB
2019-10-02 23:01:00 WARN SchedulingService - Memory is to 94 % full.
2019-10-02 23:02:00 INFO SchedulingService - Memory available 16383 MB, Memory used 15579 MB
2019-10-02 23:02:00 WARN SchedulingService - Memory is to 95 % full.
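To scan a long log for such entries, you can extract the memory figures and compute the usage yourself. The sketch below assumes the line format of the sample entries above and interprets "Memory available" as the total memory, which matches the sample (15535 of 16383 MB is 94 %). The function name and the 90 % threshold are arbitrary choices for illustration.

```python
import re

# Pattern derived from the sample INFO entries in this article.
MEMORY_RE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) INFO SchedulingService - "
    r"Memory available (?P<available>\d+) MB, Memory used (?P<used>\d+) MB"
)


def high_memory_readings(log_text, threshold_percent=90):
    """Return (timestamp, percent_used) for readings at or above the threshold."""
    readings = []
    for m in MEMORY_RE.finditer(log_text):
        # Integer percentage, rounded down, as in the sample WARN lines.
        percent = int(m["used"]) * 100 // int(m["available"])
        if percent >= threshold_percent:
            readings.append((m["ts"], percent))
    return readings


sample = (
    "2019-10-02 23:01:00 INFO SchedulingService - "
    "Memory available 16383 MB, Memory used 15535 MB\n"
    "2019-10-02 23:02:00 INFO SchedulingService - "
    "Memory available 16383 MB, Memory used 15579 MB\n"
)
print(high_memory_readings(sample))
```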
If you believe that insufficient memory is the root cause, please compare the hardware specifications with our sizing guidelines.
Reporting a crash
Please provide the following information in case of a crash:
CPM4 stdout log file.
Data Model ID.
Data model size: rows per table (if available).
Hardware configuration.
Operating system.
Engine log file for data model.
Providing engine log files:
Engine log files often provide more information about a crash than the CPM4 stdout logs.
On Linux, the log file can be found in $CPM4_HOME/root/logs/engine/$DATA_MODEL_ID-LOADTIMESTAMP.log.
On Windows, it can be found in $CPM4_HOME/appfiles/logs/engine/$DATA_MODEL_ID-LOADTIMESTAMP.log.
By default, the engine log file does not contain a lot of data. To get more information, switch the engine log level to debug. This can be done on the loading tab of the data model editor.
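Because each load writes its own log file, a data model can have several engine logs. The following sketch lists them, newest first, based on the two paths given above. The function name is made up for illustration, and sorting by file modification time is an assumption, since the exact format of LOADTIMESTAMP is not described here.

```python
from pathlib import Path


def engine_log_files(cpm4_home, data_model_id, windows=False):
    """Return the engine log files of one data model, newest first."""
    # Linux installations use "root", Windows installations use "appfiles".
    subdir = "appfiles" if windows else "root"
    log_dir = Path(cpm4_home) / subdir / "logs" / "engine"
    if not log_dir.is_dir():
        return []
    # File names follow the pattern $DATA_MODEL_ID-LOADTIMESTAMP.log.
    matches = log_dir.glob(f"{data_model_id}-*.log")
    return sorted(matches, key=lambda p: p.stat().st_mtime, reverse=True)
```

For example, `engine_log_files(os.environ["CPM4_HOME"], "<data model ID>")` returns the candidate files on a Linux server, assuming the CPM4_HOME environment variable is set in your shell. The first entry is normally the log of the most recent load.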