The cluster network is the core of a cluster. All messages sent over it have to
be delivered reliably to all nodes in their respective order. In Proxmox VE this
part is done by corosync, an implementation of a high performance, low overhead,
high availability development toolkit. It serves our decentralized configuration
file system (pmxcfs).
Corosync needs a reliable network with latencies under 2 milliseconds (LAN performance) to work properly. The network should not be used heavily by other members; ideally corosync runs on its own network. Do not use a shared network for corosync and storage (except as a potential low-priority fallback in a redundant configuration; see Section 5.8, “Corosync Redundancy”).
Before setting up a cluster, it is good practice to check if the network is fit
for that purpose. To ensure that the nodes can connect to each other on the
cluster network, you can test the connectivity between them with the ping tool.
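The connectivity test can be sketched as a small shell helper. This is only an illustration: it assumes the iputils version of ping (whose summary line starts with "rtt min/avg/max/mdev"), the function names are made up for this example, and the peer addresses in the usage comment are placeholders.

```shell
#!/bin/sh
# Extract the average round-trip time (in ms) from an iputils ping
# summary read on stdin, e.g. "rtt min/avg/max/mdev = 0.045/0.052/... ms".
avg_rtt_ms() {
    awk -F'/' '/^rtt / { print $5 }'
}

# Succeed if the RTT in $1 is below the limit in $2 (default: 2 ms,
# the latency bound mentioned in the text).
rtt_ok() {
    awk -v rtt="$1" -v limit="${2:-2}" 'BEGIN { exit !(rtt + 0 < limit + 0) }'
}

# Ping every peer given as an argument and report its average latency.
check_peers() {
    for peer in "$@"; do
        rtt=$(ping -c 10 -q "$peer" | avg_rtt_ms)
        if [ -n "$rtt" ] && rtt_ok "$rtt"; then
            echo "$peer: ok (${rtt} ms)"
        else
            echo "$peer: unreachable or too slow (${rtt:-no reply})"
        fi
    done
}

# Example (placeholder addresses): check_peers 10.10.10.2 10.10.10.3
```

Run the check from every node, since a link that works in one direction is not guaranteed to work in the other.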
If the Proxmox VE firewall is enabled, ACCEPT rules for corosync will automatically be generated - no manual action is required.
Corosync used Multicast before version 3.0 (introduced in Proxmox VE 6.0). Modern versions rely on Kronosnet for cluster communication, which, for now, only supports regular UDP unicast.
You can still enable Multicast or legacy unicast by setting your transport to udp or udpu in your corosync.conf (see Section 5.11.1, “Edit corosync.conf”), but keep in mind that this will disable all cryptography and redundancy support. This is therefore not recommended.
When creating a cluster without any parameters, the corosync cluster network is generally shared with the web interface and the VMs' network. Depending on your setup, even storage traffic may get sent over the same network. It’s recommended to change that, as corosync is a time-critical, real-time application.
First, you have to set up a new network interface. It should be on a physically separate network. Ensure that your network fulfills the cluster network requirements (see Section 5.7.1, “Network Requirements”).
This is possible via the linkX parameters of the pvecm create command, used for creating a new cluster.
If you have set up an additional NIC with a static address on 10.10.10.1/25, and want to send and receive all cluster communication over this interface, you would execute:
pvecm create test --link0 10.10.10.1
To check if everything is working properly, execute:
systemctl status corosync
Afterwards, proceed as described above to add nodes with a separated cluster network (see Section 5.4.3, “Adding Nodes with Separated Cluster Network”).
You can also move corosync to a separate network if you have already created a cluster and want to switch its communication to another network, without rebuilding the whole cluster. This change may lead to short periods of quorum loss in the cluster, as nodes have to restart corosync and come up one after the other on the new network.
First, check how to edit the corosync.conf file (see Section 5.11.1, “Edit corosync.conf”). Then, open it and you should see a file similar to:
logging {
  debug: off
  to_syslog: yes
}

nodelist {
  node {
    name: due
    nodeid: 2
    quorum_votes: 1
    ring0_addr: due
  }
  node {
    name: tre
    nodeid: 3
    quorum_votes: 1
    ring0_addr: tre
  }
  node {
    name: uno
    nodeid: 1
    quorum_votes: 1
    ring0_addr: uno
  }
}

quorum {
  provider: corosync_votequorum
}

totem {
  cluster_name: testcluster
  config_version: 3
  ip_version: ipv4-6
  secauth: on
  version: 2
  interface {
    linknumber: 0
  }
}
ringX_addr actually specifies a corosync link address. The name "ring" is a remnant of older corosync versions that is kept for backwards compatibility.
The first thing you want to do is add the name properties in the node entries, if you do not see them already. Those must match the node name.
Then replace all addresses from the ring0_addr properties of all nodes with the new addresses. You may use plain IP addresses or hostnames here. If you use hostnames, ensure that they are resolvable from all nodes (see also Section 5.7.3, “Corosync Addresses”).
In this example, we want to switch cluster communication to the 10.10.10.0/25 network, so we change the ring0_addr of each node accordingly.
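A /25 prefix leaves 7 host bits, so the network used in this example spans 10.10.10.0 through 10.10.10.127. Before editing the file, it can help to confirm that each planned address really lies in that range. The following is a minimal sketch with hypothetical helper names; it handles IPv4 only and assumes octets are written without leading zeros.

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address into a single integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the address into its four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed if address $1 lies in network $2 with prefix length $3.
in_subnet() {
    addr=$(ip_to_int "$1")
    net=$(ip_to_int "$2")
    mask=$(( (0xFFFFFFFF << (32 - $3)) & 0xFFFFFFFF ))
    [ $(( addr & mask )) -eq $(( net & mask )) ]
}

# Example: in_subnet 10.10.10.3 10.10.10.0 25 && echo "address fits"
```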
The exact same procedure can be used to change other ringX_addr values as well. However, we recommend only changing one link address at a time, so that it’s easier to recover if something goes wrong.
After we increase the config_version property, the new configuration file should look like:
logging {
  debug: off
  to_syslog: yes
}

nodelist {
  node {
    name: due
    nodeid: 2
    quorum_votes: 1
    ring0_addr: 10.10.10.2
  }
  node {
    name: tre
    nodeid: 3
    quorum_votes: 1
    ring0_addr: 10.10.10.3
  }
  node {
    name: uno
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 10.10.10.1
  }
}

quorum {
  provider: corosync_votequorum
}

totem {
  cluster_name: testcluster
  config_version: 4
  ip_version: ipv4-6
  secauth: on
  version: 2
  interface {
    linknumber: 0
  }
}
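The two changes (new ring0_addr values and an incremented config_version) are usually made by hand in an editor. If you prefer to script them, they could be sketched as two awk filters. This is an illustration only: the function names are hypothetical, the filters expect one property per line as in a real corosync.conf, and set_ring0_addr assumes the name property appears before ring0_addr within each node block.

```shell
#!/bin/sh
# Print the config read on stdin with the ring0_addr of node $1 set to $2.
set_ring0_addr() {
    awk -v name="$1" -v addr="$2" '
        /^[ \t]*node[ \t]*[{]/ { in_node = 1; match_node = 0 }
        in_node && $1 == "name:" && $2 == name { match_node = 1 }
        in_node && match_node && $1 == "ring0_addr:" {
            sub(/ring0_addr:.*/, "ring0_addr: " addr)
        }
        /[}]/ { in_node = 0 }
        { print }
    '
}

# Print the config read on stdin with config_version increased by one.
bump_config_version() {
    awk '
        $1 == "config_version:" {
            sub(/config_version:[ \t]*[0-9]+/, "config_version: " ($2 + 1))
        }
        { print }
    '
}

# Example (the output file name is a placeholder for a working copy):
#   set_ring0_addr uno 10.10.10.1 < corosync.conf \
#       | bump_config_version > corosync.conf.new
```

Write the result to a working copy and review it, rather than overwriting the live file directly.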
Then, after a final check to see that all changed information is correct, we save it and once again follow Section 5.11.1, “Edit corosync.conf” to bring it into effect.
The changes will be applied live, so restarting corosync is not strictly necessary. If you changed other settings as well, or notice corosync complaining, you can optionally trigger a restart.
On a single node execute:
systemctl restart corosync
Now check if everything is okay:
systemctl status corosync
If corosync begins to work again, restart it on all other nodes too. They will then join the cluster membership one by one on the new network.
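Besides the service status, it is worth confirming that the cluster is quorate again once the nodes have rejoined. The sketch below parses the output of pvecm status; it assumes that output contains a line of the form "Quorate: Yes", and the function name is made up for this example.

```shell
#!/bin/sh
# Succeed if the `pvecm status` output read on stdin reports a quorate
# cluster, i.e. contains a "Quorate:" line whose value is "Yes".
is_quorate() {
    awk '$1 == "Quorate:" { found = ($2 == "Yes") } END { exit !found }'
}

# Example:
#   pvecm status | is_quorate && echo "cluster is quorate"
```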
A corosync link address (for backwards compatibility denoted by ringX_addr in corosync.conf) can be specified in two ways: as a plain IP address or as a hostname. Hostnames are resolved using getaddrinfo, which means that by default, IPv6 addresses will be used first, if available (see also man gai.conf). Keep this in mind, especially when upgrading an existing cluster to IPv6.
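To see which address a hostname would resolve to first, you can inspect the output of getent ahosts, which also goes through getaddrinfo. The helpers below are a small sketch: the function names and the hostname corosync-uno in the usage comment are hypothetical.

```shell
#!/bin/sh
# Print the first address from `getent ahosts` output read on stdin.
first_addr() {
    awk 'NR == 1 { print $1 }'
}

# Report the address family of $1 based on its textual form.
addr_family() {
    case "$1" in
        *:*) echo ipv6 ;;
        *.*) echo ipv4 ;;
        *)   echo unknown ;;
    esac
}

# Example (hypothetical hostname):
#   addr_family "$(getent ahosts corosync-uno | first_addr)"
```

Running this on every node shows whether they all agree on the address and the address family.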
Hostnames should be used with care, since the addresses they resolve to can be changed without touching corosync or the node it runs on - which may lead to a situation where an address is changed without thinking about implications for corosync.
A separate, static hostname specifically for corosync is recommended, if hostnames are preferred. Also, make sure that every node in the cluster can resolve all hostnames correctly.
Since Proxmox VE 5.1, while supported, hostnames will be resolved at the time of entry. Only the resolved IP is saved to the configuration.
Nodes that joined the cluster on earlier versions likely still use their unresolved hostname in corosync.conf. It might be a good idea to replace them with IPs or a separate hostname, as mentioned above.