If you have major problems with your Proxmox VE host, for example hardware issues, it could be helpful to copy the pmxcfs database file /var/lib/pve-cluster/config.db and move it to a new Proxmox VE host. On the new host (with nothing running), you need to stop the pve-cluster service and replace the config.db file (required permissions 0600). Following this, adapt /etc/hostname and /etc/hosts according to the lost Proxmox VE host, then reboot and check (and don’t forget your VM/CT data).
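A minimal sketch of these steps on the new host, assuming the database copy was placed at /root/config.db (that source path and the root:root ownership are assumptions, not requirements stated above):

# stop the cluster filesystem service before touching the database
systemctl stop pve-cluster
# replace the database and set the required permissions (ownership root:root assumed)
cp /root/config.db /var/lib/pve-cluster/config.db
chown root:root /var/lib/pve-cluster/config.db
chmod 0600 /var/lib/pve-cluster/config.db
# adapt /etc/hostname and /etc/hosts to match the lost host, then reboot
reboot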
The recommended way is to reinstall the node after you remove it from your cluster. This ensures that all secret cluster/ssh keys and any shared configuration data is destroyed.
In some cases, you might prefer to put a node back into local mode without reinstalling, which is described in Separate A Node Without Reinstalling.
For the guest configuration files in nodes/<NAME>/qemu-server/ (VMs) and nodes/<NAME>/lxc/ (containers), Proxmox VE sees the containing node <NAME> as the owner of the respective guest. This concept enables the use of local locks instead of expensive cluster-wide locks to prevent concurrent guest configuration changes.
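For example, you can see which guests a given node owns by listing its per-node directories below /etc/pve (the node name node1 is just a placeholder):

# VM configuration files owned by node1
ls /etc/pve/nodes/node1/qemu-server/
# container configuration files owned by node1
ls /etc/pve/nodes/node1/lxc/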
As a consequence, if the owning node of a guest fails (for example, due to a power outage, fencing event, etc.), a regular migration is not possible (even if all the disks are located on shared storage), because such a local lock on the (offline) owning node is unobtainable. This is not a problem for HA-managed guests, as Proxmox VE’s High Availability stack includes the necessary (cluster-wide) locking and watchdog functionality to ensure correct and automatic recovery of guests from fenced nodes.
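For illustration, a guest becomes HA-managed once it is added as an HA resource, for example with ha-manager (VM ID 100 is only an example):

# put VM 100 under HA management and check the HA stack’s view of it
ha-manager add vm:100
ha-manager status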
If a non-HA-managed guest has only shared disks (and no other local resources which are only available on the failed node), a manual recovery is possible by simply moving the guest configuration file from the failed node’s directory in /etc/pve/ to an online node’s directory (which changes the logical owner or location of the guest).
For example, recovering the VM with ID 100 from an offline node1 to another node node2 works by running the following command as root on any member node of the cluster:
mv /etc/pve/nodes/node1/qemu-server/100.conf /etc/pve/nodes/node2/
Before manually recovering a guest like this, make absolutely sure that the failed source node is really powered off/fenced. Otherwise Proxmox VE’s locking principles are violated by the mv command, which can have unexpected consequences.
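One way to double-check from an online cluster member before moving the configuration file is to inspect the cluster membership; the failed node should no longer appear in the member list. Note that this only reflects corosync membership and does not by itself prove the node is powered off:

# show quorum information and the list of currently joined nodes
pvecm status
pvecm nodes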
Guests with local disks (or other local resources which are only available on the offline node) are not recoverable like this. Either wait for the failed node to rejoin the cluster or restore such guests from backups.