Metrics
Performance metrics for all workstations are available through netdata, which collects 2k-20k metrics per second per host, depending on the installed hardware and software.
Dashboards
The main dashboard at netdata.monitor.phys.ethz.ch shows all available
metrics on a single page. Click on the arrow > and choose a workstation.
Hold Shift and scroll the mouse wheel over a chart to zoom out and view earlier data.
Our server currently keeps about 3 days' worth of data at multiple levels of resolution (tiering).
We also provide custom dashboards for some use cases at /dash.
Select a host by appending the query parameter ?host=hostname to the URL of a dashboard.
- basic resource usage: cpu-mem-io-net.html
- CPU and memory usage by user: cpu-mem-io-net-users.html
- GPU usage: gpu.html
- GPU memory usage by user: gpu_mem_user.html
- resource control: resctl.html
The resource control dashboard gives an overview of the metrics that matter when you want to push the available system resources to their limit. It is also useful for debugging performance issues or out-of-memory (OOM) conditions. Refer to resource control for details.
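The ?host=hostname query parameter described above can be composed by hand or with a small helper. The following Python sketch builds such a URL. It assumes that the dashboards are served over HTTPS under /dash/ on the main dashboard host; the function name dashboard_url and the hostname myworkstation are placeholders for illustration only.

```python
from urllib.parse import urlencode, urlunsplit

# Assumption: custom dashboards live under /dash/ on the main dashboard host
# and are reachable over HTTPS. Adjust if your setup differs.
BASE_HOST = "netdata.monitor.phys.ethz.ch"
DASH_PATH = "/dash/"


def dashboard_url(page: str, host: str) -> str:
    """Return the URL of a custom dashboard page for one workstation.

    page: one of the dashboard files listed above, e.g. "gpu.html"
    host: the workstation hostname to pass as the ?host= query parameter
    """
    query = urlencode({"host": host})  # escapes the hostname if necessary
    return urlunsplit(("https", BASE_HOST, DASH_PATH + page, query, ""))


if __name__ == "__main__":
    # "myworkstation" is a placeholder, not a real hostname.
    print(dashboard_url("gpu.html", "myworkstation"))
```

Open the printed URL in a browser to view the chosen dashboard for that workstation.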
Custom dashboards
You can create your own custom dashboards (see documentation). To get started, edit one of our dashboard HTML
files or create your own and publish it on your personal homepage.