Development

The source code is available at https://github.com/azazellochg/em_health

Changing Dashboards

By default, the provisioned dashboards are read-only. If you set EMHEALTH_DEBUG=true in the docker/.env, you can modify and save changes via the Grafana UI. However, if you then update the provisioned dashboards (e.g. via git pull), the changes made via UI will be lost. See details here. The workaround is the following:

Make changes to a dashboard via Grafana UI.
Save and export dashboard to JSON using Export > Export as code, toggle Share dashboard with another instance.
Overwrite existing dashboard file (they are in docker/grafana/provisioning/dashboards/) with the saved json file.

Any file changes in the provisioning folder are immediately picked up by Grafana. There’s no need to restart it.

There are a few limitations:

You can create nested folders for dashboards. Max level depth is 4.
You should not rename dashboards or folders via GUI as this will conflict with provisioned files. Do it directly in the files if really needed.
Some provisioned resources (alerts, contact points, datasources) cannot be modified from the GUI. You can create new ones though.

DB performance metrics

After installation the DB performance monitoring is enabled by default. You can check the dashboards under DB performance folder.

Performance statistics is inspired by Pganalyze and includes:

database statistics (updated every 10 min)
tables statistics (updated every 10 min)
index statistics (updated every 10 min)
auto-VACUUM statistics (updated every 1 min)
query statistics (updated every 1 min)
CPU and RAM host statistics (updated every 1 min)
auto-EXPLAIN plans (for queries longer than 500ms)

Statistics retention time is 3 months.

SQL commands

Below are some frequently used commands for psql database client:

connect: psql -U postgres -h localhost -d tem
change db to sem: \c sem
list tables: \dt
list materialized views: dm
list table structure: \d data;
list table content: SELECT * FROM parameters;
disconnect: \q

For more examples refer to the command line cheetsheet

Logs

All EMHealth application actions are saved into emhealth.log. PostgreSQL logs are in CSV format and can be accessed through:

docker exec -it emhealth-db bash
cd /var/lib/postgresql/data/log
cat *.csv

Grafana logs are accessible via:

docker logs emhealth-grafana

Database structure

We have two databases: tem and sem, both have the same structure at the moment. Each database has several schemas:

public - default schema
- schema_info - table to store the current schema version
events - schema for storing HM events data
- instruments - global metadata for each microscope
- enum_types - enumeration names for each instrument
- enum_values - enumeration values for each enum
- parameters - parameters metadata
- enum_values_history - old/replaced enumeration values
- parameters_history - old/replaced parameters
- data - main events data table for all instruments
- data_staging - staging table for bulk data inserts with COPY
uec - schema for storing UECs / Alarms. UEC codes are unified across different instruments
- device_type
- device_instance
- error_code
- subsystem
- error_definitions
- errors - main UEC data table for all instruments
fdw_ms_IID - foreign server schema for MSSQL with UECs (for each instrument ID)
- error_definitions
- error_notifications
fdw_pg_IID - foreign server schema for PostgreSQL with HM data (for each instrument ID)
- event_property
- event_property_type
- event_type
- parameter_type
- instrument_event_config
pganalyze - schema to store database statistics
- database_stats
- table_stats
- index_stats
- vacuum_stats
- stat_statements
- stat_snapshots
- queries
- sys_stats
- stat_explains

Measuring ingestion performance

These benchmarks compare different ingestion strategies for loading timeseries-like CSV data into TimescaleDB.

Workflow:

Generate test data with the desired number of rows.
Run one or more ingestion tests (COPY, EXECUTEMANY, UNNEST).
Compare performance metrics such as rows/s, query planning, and execution times.

Input dataset

The input is a simulated CSV file with N rows. Each row contains:

time (timestamp, millisecond precision)
instrument_id (integer)
param_id (integer)
value_num (float)
value_text (string, optional)

Data generation parameters:

30 days of data
10 instruments
500–1500 parameters per instrument

To generate 1,000,000 rows:

emhealth db test-data 1000000

Benchmarking COPY

The COPY test uses psycopg3 text-format COPY with a configurable chunk size. Each chunk is a Python string containing concatenated rows. This test allows tuning both chunk size and Postgres server settings.

Run with an 8 MB chunk size:

emhealth db test-copy 8388608

Benchmarking EXECUTEMANY

The EXECUTEMANY test uses cursor.executemany() in psycopg3. Internally this leverages libpq’s pipeline mode to run batched INSERT .. VALUES statements. We still commit transactions in batches.

Each run inserts batch_size × num_columns values.

Example with batch size 1000:

emhealth db test-execmany 1000

Benchmarking UNNEST

The UNNEST test uses cursor.execute() to run an INSERT .. UNNEST query. Instead of sending row-by-row inserts, this method sends arrays (one per column) and expands them into rows in PostgreSQL. This reduces query planning overhead compared to EXECUTEMANY.

Example with batch size 1000:

emhealth db test-unnest 1000

Example output

Each test is run 5 times. Results include raw wall times, throughput (rows/s), and query planning/execution stats from pg_stat_statements.

Example output (truncated):

Using insert_copy to insert 997,905 rows into data_staging table:
    Batch size: 8000000
    Raw run times: [0.8793832040391862, 0.903236785903573, 0.8884079209528863, 0.8673364277929068, 0.8408198338001966], rows/s: [1134778.3257815468, 1104809.9629840956, 1123250.903627322, 1150539.7075726986, 1186823.8115766563]
    Avg time over 5 runs: 0.8758 s
    Avg performance: 1,140,040.5423 rows/s
    Calls per run: 1
    Plan time per call: 0.0000 ms
    Exec time per call: 865.8786 ms

Interpreting results

COPY is typically the fastest for bulk ingestion. Experiment with chunk sizes (e.g. 4 MB, 8 MB, 16 MB) to balance client/server memory usage.
EXECUTEMANY is slower but more flexible when UPSERTs are required.
UNNEST can outperform EXECUTEMANY for medium batch sizes, since fewer query plans are created.
Always run with different batch sizes (1,000, 5,000, 10,000) and average results across trials for reliable benchmarks.

Import HM data from PostgreSQL

Note

This functionality is currently under development

On a MPC running Windows 10 and TEM server 7.2+, the health monitor data (not UEC) is stored in a PostgreSQL database and linked to MSSQL using foreign-data wrapper (DSPostgres). The client to import this data directly from MPC is under development.