My muse and inspiration

Although no longer this young and small, my granddaughter has shown an interest in technology from a very early age. Seeing her interest and thrill at new discoveries inspires me each day. Thank you Maisie, love you loads xxx


VMware vSphere, MS Clusters and vMotion resulting in cluster service failures

A quick guide, based on our own experience.

When running an active/passive MS Failover Cluster (MSFoC) in a VMware environment, you need to be aware of how the cluster behaves when a vMotion event happens on the active node. During the final cutover of the VM to the new host, networking and activity on the active node are momentarily interrupted, and the passive node can often see this as a failure and try to take over. As the active node is not actually down, both nodes then try to run the services, and ‘split-brain’ detection shuts the service down on both nodes.
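To make the failure mode concrete, here is a toy model in Python. It assumes the passive node decides the active node has failed once it misses a set number of consecutive heartbeats. The heartbeat mechanism, the function names and all of the numbers are assumptions made for illustration, not values taken from our clusters.

```python
def missed_heartbeats(pause_ms: int, interval_ms: int) -> int:
    """Whole heartbeat intervals that fit inside the pause (toy approximation)."""
    return pause_ms // interval_ms


def passive_declares_failure(pause_ms: int, interval_ms: int, threshold: int) -> bool:
    """True if the pause covers enough missed heartbeats to reach the threshold.

    All three values are hypothetical inputs; check your own cluster settings.
    """
    return missed_heartbeats(pause_ms, interval_ms) >= threshold


if __name__ == "__main__":
    # Hypothetical settings: one heartbeat per second, failure after 5 misses.
    for pause in (500, 3000, 6000):
        print(pause, "ms pause ->", passive_declares_failure(pause, 1000, 5))
```

The point of the sketch is only that the active node does not need to be down for long: a pause that spans the failure threshold is enough for the passive node to attempt a takeover.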

Our way around this issue is to set DRS for these cluster nodes to manual, making sure we know which VMs they are. When a vMotion is needed (generally for maintenance, or for manual load balancing), we make sure the MSFoC service is shut down on the passive node first. That way we can move whatever we need to without triggering a false takeover.
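The order of operations can be sketched in Python as below. The three callables are hypothetical placeholders for whatever tooling you use to stop the cluster service, run the vMotion and start the service again; they are not real VMware or Microsoft API calls. Restarting the passive node's service afterwards is our assumption about the natural last step.

```python
from typing import Callable, List


def migrate_active_node(
    stop_passive_cluster_service: Callable[[], None],
    vmotion_active_node: Callable[[], None],
    start_passive_cluster_service: Callable[[], None],
) -> List[str]:
    """Run the steps in a safe order and return a log of what was done."""
    log: List[str] = []

    # 1. Stop the cluster service on the passive node so it cannot take over.
    stop_passive_cluster_service()
    log.append("passive cluster service stopped")

    try:
        # 2. Move the active node while nothing is watching for a failure.
        vmotion_active_node()
        log.append("active node migrated")
    finally:
        # 3. Bring the passive node back, even if the migration failed.
        start_passive_cluster_service()
        log.append("passive cluster service restarted")

    return log


if __name__ == "__main__":
    # Stub actions that only print, standing in for real tooling.
    steps = migrate_active_node(
        lambda: print("stopping cluster service on passive node"),
        lambda: print("vMotion of active node"),
        lambda: print("starting cluster service on passive node"),
    )
    print(steps)
```

Wrapping the migration in `try`/`finally` means the passive node is not left with its cluster service stopped if the vMotion itself fails.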

Bear this in mind when running MSFoC in VMware. The impact seems to be more frequent with shared-disk clusters, but we have also seen it on non-shared-disk clusters.