Monitoring
How to monitor your NodeFoundry cluster’s health and performance.
Task Monitoring
All background operations (OSD creation, image downloads, etc.) are tracked as tasks:
$ nf tasks list
Failed tasks keep their logs for debugging. Retrieve them through the API with GET /task/run/{runId}/logs, where {runId} identifies the task run.
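As a minimal sketch, the log request could be scripted as shown here. Only the path GET /task/run/{runId}/logs comes from this page. The base URL, the absence of authentication, and the helper names `task_logs_url` and `fetch_task_logs` are assumptions for illustration, and the response body is returned as raw text because its format is not described here.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Assumption: address of the NodeFoundry API. Replace with your own.
BASE_URL = "http://localhost:8080"


def task_logs_url(run_id: str, base_url: str = BASE_URL) -> str:
    """Build the URL for GET /task/run/{runId}/logs."""
    # Percent-encode the run ID so it cannot alter the path structure.
    return f"{base_url.rstrip('/')}/task/run/{quote(run_id, safe='')}/logs"


def fetch_task_logs(run_id: str, base_url: str = BASE_URL,
                    timeout: float = 10.0) -> str:
    """Return the raw response body as text (hypothetical helper).

    No assumption is made about the body format (plain text, JSON, ...).
    """
    with urlopen(task_logs_url(run_id, base_url), timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

Calling `fetch_task_logs("<run id>")` with a run ID taken from the task list would then return the logs for that run, assuming the API is reachable without credentials.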
Cluster Health
The Ceph metrics endpoint provides real-time cluster health:
| Metric | Description |
|---|---|
| Health status | HEALTH_OK, HEALTH_WARN, or HEALTH_ERR |
| OSD count | Total, up, and in OSDs |
| Capacity | Total, used, and available storage |
| PG status | Placement group health |
Access these metrics via the API at GET /ceph/metrics.
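A health check built on this endpoint might look like the following sketch. The three status strings come from the table above. Everything else is an assumption: the base URL, that the response is a JSON object, and that the status lives in a field named `health` (hypothetical, so check the actual response and adjust `field`). The numeric severities follow the common 0/1/2/3 convention for OK, warning, critical, and unknown.

```python
import json
from urllib.request import urlopen

# Assumption: address of the NodeFoundry API. Replace with your own.
BASE_URL = "http://localhost:8080"

# Status strings as listed in the metrics table.
SEVERITY = {"HEALTH_OK": 0, "HEALTH_WARN": 1, "HEALTH_ERR": 2}


def severity(status: str) -> int:
    """Map a Ceph health status to a numeric severity; 3 means unknown."""
    return SEVERITY.get(status, 3)


def check_cluster(base_url: str = BASE_URL, field: str = "health",
                  timeout: float = 10.0) -> int:
    """Fetch GET /ceph/metrics and return the severity of the health status.

    Assumption: the response is a JSON object and `field` names the key
    that holds the status string. The real schema may differ.
    """
    with urlopen(f"{base_url.rstrip('/')}/ceph/metrics", timeout=timeout) as resp:
        metrics = json.load(resp)
    return severity(str(metrics.get(field, "")))
```

Returning a number rather than a string makes the result easy to use as a process exit code in a cron job or an external monitoring probe.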
Real-Time Events
Subscribe to the SSE event stream for live updates:
| Event | Description |
|---|---|
| image.download.progress | Image download progress |
| task.state.changed | Task started, completed, or failed |
| node.state.changed | Node status or daemon state changed |
Connect to GET /events for a persistent event stream.
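To illustrate consuming the stream, here is a small sketch of a client that parses the standard SSE wire format (`event:` and `data:` fields, with a blank line ending each event). It assumes that NodeFoundry puts the event names from the table in the `event:` field and that the endpoint needs no authentication. Neither is confirmed on this page, and the names `parse_sse` and `stream_events` are hypothetical.

```python
from typing import Iterable, Iterator, Tuple
from urllib.request import urlopen


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (event_name, data) pairs from lines in SSE wire format."""
    event, data = "message", []  # "message" is the SSE default event type
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the event; events without data are dropped.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space is not part of the value
        if field == "event":
            event = value
        elif field == "data":
            data.append(value)
        # Other fields (id, retry) are ignored in this sketch.


def stream_events(url: str) -> Iterator[Tuple[str, str]]:
    """Connect to the event stream and yield events until it closes."""
    with urlopen(url) as resp:
        yield from parse_sse(line.decode("utf-8") for line in resp)
```

A caller could loop over `stream_events("http://localhost:8080/events")` (the host is an assumption) and react only to the events it cares about, for example `task.state.changed`. The sketch does not reconnect when the connection drops, so a long-running client would need to add that.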
Node Health
Check which daemons are running on each node:
$ nf node list
For each node, the output shows the status of the MON, MGR, MDS, and RGW daemons, with a checkmark next to each daemon that is running.