Troubleshooting

Common issues and how to resolve them.


Node Won’t PXE Boot

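| Check | Fix |
|---|---|
| DHCP not reaching node | Verify that switch DHCP options 66/67 (TFTP server name and boot file name) point to the master IP |
| Wrong boot mode | Ensure the node's firmware boot mode (BIOS or UEFI) matches the mode the PXE setup is configured for |
| No boot image set | Run nf images default --id <id> to set a default image |
| Firewall blocking | Ensure the master node's ports are accessible from the boot network |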

Node Registers but Shows “pending”

This is normal. Nodes enter pending status after registration and stay there until you deploy daemons to them.

Task Failed

Check the task logs:

Terminal
$ nf tasks list                  # find the task ID

Then check the logs via the API: GET /task/run/{runId}/logs
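
A minimal sketch of that request with curl, assuming the master serves its API over HTTP at a base URL you supply. The helper names and the example address are hypothetical; only the /task/run/{runId}/logs path comes from this page.

```shell
# Build the logs URL for a task run.
# $1 = API base URL (assumption: wherever your master exposes its API)
# $2 = run ID
task_logs_url() {
  # ${1%/} drops one trailing slash so the path is not doubled up
  printf '%s/task/run/%s/logs\n' "${1%/}" "$2"
}

# Fetch the logs for a run.
# -f makes curl exit non-zero on HTTP errors, -sS hides the progress bar
# but still prints error messages.
fetch_task_logs() {
  curl -fsS "$(task_logs_url "$1" "$2")"
}
```

For example, fetch_task_logs http://master.example:8080 <runId> would request http://master.example:8080/task/run/<runId>/logs. Adjust the scheme, host, and port to match your deployment.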

Common failures:

  • Disk already in use — the disk has an existing filesystem or OSD
  • Node unreachable — network connectivity issue between master and worker
  • Insufficient resources — not enough disk space or memory

OSD Won’t Start

  • Check that the disk is not already formatted: a formatted disk shows hasFilesystem: true in nf disks list
  • Ensure the node has network connectivity to the monitors
  • Check task logs for specific error messages
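
To cross-check the disk state directly on the node, the standard Linux lsblk tool can report any filesystem signature it finds. This is a sketch under the assumption that you have shell access to the node; the helper names are hypothetical, and this page does not say whether nf derives hasFilesystem the same way.

```shell
# Print the signature types lsblk reports for a device and its partitions.
# -n omits the header, -o FSTYPE selects the filesystem-type column.
# Blank lines (devices without a signature) are filtered out.
# $1 = device path, e.g. /dev/sdb
disk_signatures() {
  lsblk -no FSTYPE "$1" 2>/dev/null | grep -v '^$'
}

# Turn the (possibly empty) signature list into a short verdict.
# $1 = output of disk_signatures
classify_signatures() {
  if [ -z "$1" ]; then
    echo "clean"
  else
    echo "in use ($1)"
  fi
}
```

Usage: classify_signatures "$(disk_signatures /dev/sdb)". A disk that previously held an OSD typically shows a signature such as ceph_bluestore or LVM2_member rather than an ordinary filesystem type.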

CephFS Not Accessible

  • Verify that at least one MDS is running: check the MDS column in nf node list
  • Ensure the CephFS file system was created: confirm that nf ceph fs was run
  • Verify that the client has the correct Ceph keyring
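
The same three checks can also be made with the standard ceph CLI. This is a rough sketch, assuming ceph is installed on a host with an admin keyring, that the cluster uses the default name "ceph", and that keyrings live in /etc/ceph. The function names and the client name in the usage note are hypothetical.

```shell
# Default keyring path for a Ceph entity, following Ceph's
# /etc/ceph/<cluster>.<name>.keyring convention with cluster = "ceph".
# $1 = entity name, e.g. client.admin
client_keyring_path() {
  printf '/etc/ceph/ceph.%s.keyring\n' "$1"
}

# Run on a host that has the ceph CLI and admin credentials.
# $1 = client entity name
cephfs_checks() {
  ceph fs ls            # lists the file systems the cluster knows about
  ceph mds stat         # summarizes MDS daemons and their states
  ceph auth get "$1"    # prints the key and caps stored for this client
}
```

Usage: run cephfs_checks client.myapp, then compare the key it prints with the key line in the file at client_keyring_path client.myapp on the client machine. A mismatch or a missing file points to a keyring problem.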

S3/RGW Not Working

  • Verify at least one RGW instance is running: nf ceph rgw list
  • Check that an RGW user exists with valid credentials
  • Test with nf s3 --accessKey ... --secretKey ... bucket list
  • Ensure the RGW port (default 7480) is accessible
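
One way to confirm the port is reachable is an unauthenticated HTTP request with curl. This is a sketch that assumes RGW is serving plain HTTP; the helper names and host are hypothetical, and 7480 is the default port mentioned above.

```shell
# Build the RGW endpoint URL.
# $1 = RGW host, $2 = port (optional, falls back to 7480)
rgw_url() {
  printf 'http://%s:%s/\n' "$1" "${2:-7480}"
}

# Print the HTTP status code returned by the RGW endpoint.
# -m 5 gives up after five seconds; curl prints 000 when it could not
# get an HTTP response at all (refused, filtered, or timed out).
rgw_http_status() {
  curl -s -o /dev/null -m 5 -w '%{http_code}' "$(rgw_url "$1" "${2:-}")"
}
```

Usage: rgw_http_status <rgw-host>. Any real status code, even 403, means the port answered, so remaining problems are more likely credentials or configuration. A result of 000 suggests a firewall, a stopped RGW instance, or a wrong host or port.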

Getting Help

  • Check task logs for detailed error messages
  • Review the Architecture page to understand component relationships
  • Use the --json flag on CLI commands to get machine-readable output when debugging