Last night I was setting up a new VSAN cluster on Ravello and got hit with a network issue: apparently my network was partitioned!
The network had been set up identically on all hosts, so the partition didn’t make much sense. Each host has two VMkernel adapters:
- MGMT Kernel: MGMT and vMotion traffic
  - vmk0 10.1.0.1x
- VSAN Kernel: VSAN traffic
  - vmk1 10.1.0.2x
Unfortunately, esxi004.vmusketeers.local had ended up in a separate partition!
Rather than checking each host separately, I used the VSAN PowerCLI commands to figure out if my hosts were properly configured for VSAN networking.
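The `$vsanhealth` variable queried in this post is never defined here, so the following is only a sketch of how it could be populated. It assumes an existing `Connect-VIServer` session and a PowerCLI release that ships the vSAN cmdlets; the cluster name `VSAN-Cluster` is a placeholder, not the name used in this lab.

```powershell
# Assumption: already connected with Connect-VIServer.
# 'VSAN-Cluster' is a hypothetical cluster name; replace it with your own.
$cluster = Get-Cluster -Name 'VSAN-Cluster'

# Run the vSAN health check and keep the result object for inspection.
$vsanhealth = Test-VsanClusterHealth -Cluster $cluster

# Optional cross-check: list every VMkernel adapter in the cluster
# and whether it is tagged for vSAN traffic.
Get-VMHost -Location $cluster |
    Get-VMHostNetworkAdapter -VMKernel |
    Select-Object VMHost, Name, IP, SubnetMask, VsanTrafficEnabled
```

The adapter listing is a quick way to confirm that vmk1 on every host carries the vSAN tag before digging into the health result itself.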
PS C:\Windows\system32> $vsanhealth.networkhealth

HostResult                : {VMware.VimAutomation.Storage.Impl.V1.Vsan.Health.VsanHostNetworkHealthResultImpl, VMware.VimAutomation.Storage.Impl.V1.Vsan.Health.VsanHostNetworkHealthResultImpl, VMware.VimAutomation.Storage.Impl.V1.Vsan.Health.VsanHostNetworkHealthResultImpl, VMware.VimAutomation.Storage.Impl.V1.Vsan.Health.VsanHostNetworkHealthResultImpl}
HostCommunicationFailure  :
HostDisconnected          :
HostInEsxMaintenanceMode  :
HostInVsanMaintenanceMode :
HostWithVsanDisabled      :
IssueFound                : True
LargePingTestSuccess      : False
MatchingIPSubnets         : True
MatchingMulticastConfig   : True
NetworkPartition          : {VMware.VimAutomation.Storage.Impl.V1.Vsan.Health.VsanClusterNetworkPartitionImpl, VMware.VimAutomation.Storage.Impl.V1.Vsan.Health.VsanClusterNetworkPartitionImpl}
PingTestSuccess           : False
PotentialMulticastIssue   : False
VsanVmknicPresent         : True
Looking at the output of the PowerCLI command, you can see I did have some issues: IssueFound is True, both ping tests failed, and NetworkPartition lists two partitions. So let’s dig one layer deeper and look at how the VSAN network has been set up on each host.
As you can see below, the VSAN network looks properly configured on every host.
PS C:\Windows\system32> $vsanhealth.networkhealth.hostresult

Host              : esxi002.vmusketeers.local
IPSubnet          : {10.1.0.0/16}
IssueFound        : False
MulticastConfig   : 224.2.3.4/224.1.2.3
VsanVmknicPresent : True
PeerNetworkHealth : {vmk1, vmk1, vmk1}

Host              : esxi001.vmusketeers.local
IPSubnet          : {10.1.0.0/16}
IssueFound        : False
MulticastConfig   : 224.2.3.4/224.1.2.3
VsanVmknicPresent : True
PeerNetworkHealth : {vmk1, vmk1, vmk1}

Host              : esxi004.vmusketeers.local
IPSubnet          : {10.1.0.0/16}
IssueFound        : False
MulticastConfig   : 224.2.3.4/224.1.2.3
VsanVmknicPresent : True
PeerNetworkHealth : {vmk1, vmk1, vmk1}

Host              : esxi003.vmusketeers.local
IPSubnet          : {10.1.0.0/16}
IssueFound        : False
MulticastConfig   : 224.2.3.4/224.1.2.3
VsanVmknicPresent : True
PeerNetworkHealth : {vmk1, vmk1, vmk1}
So that leaves just one thing to check: my Ravello VMs! We know that only esxi004.vmusketeers.local ended up in a separate partition, so let’s have a look at the network configuration that I set up for the VM representing esxi004.vmusketeers.local. There is the error: the secondary NIC had been attached to a different network!
Once this had been corrected, my VSAN was no longer partitioned.
So remember kids, always keep an eye on the network settings of the Ravello VMs themselves!
Kim