RAVELLO: Troubleshooting VSAN Network connectivity issues


Last night, I was setting up a new VSAN cluster on Ravello and got hit with a network issue: apparently my network was partitioned!

[Screenshot: VSAN health check]

The network had been set up the same way on all hosts, so the network partition issue didn’t make much sense.

  • MGMT Kernel: MGMT and vMotion traffic
    • vmk0 10.1.0.1x
  • VSAN Kernel: VSAN traffic
    • vmk1 10.1.0.2x
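This layout can also be checked from PowerCLI instead of clicking through each host. The following is only a sketch: it assumes an active Connect-VIServer session, and the cluster name 'VSAN-Cluster' is a placeholder rather than the real name of the lab cluster.

```powershell
# Sketch only: assumes you are already connected with Connect-VIServer.
# 'VSAN-Cluster' is a placeholder cluster name, replace it with your own.
Get-Cluster -Name 'VSAN-Cluster' |
    Get-VMHost |
    Get-VMHostNetworkAdapter -VMKernel |
    Sort-Object { $_.VMHost.Name }, Name |
    Format-Table VMHost, Name, IP, SubnetMask, ManagementTrafficEnabled, VMotionEnabled, VsanTrafficEnabled -AutoSize
```

With the layout described above, vmk0 should show management and vMotion traffic enabled, and vmk1 should show VSAN traffic enabled on every host.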

Unfortunately esxi004.vmusketeers.local had been added to a separate partition!

[Screenshot: partition]

Rather than checking each host separately, I used the VSAN PowerCLI commands to figure out if my hosts were properly configured for VSAN networking.
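The $vsanhealth variable used in the output has to be populated first. One way to do that, sketched here under the assumption that the Test-VsanClusterHealth cmdlet from the PowerCLI storage module is available and that you are connected to vCenter, is shown below. The cluster name is again a placeholder.

```powershell
# Sketch only: assumes an active Connect-VIServer session and the
# VMware.VimAutomation.Storage module. 'VSAN-Cluster' is a placeholder.
$cluster = Get-Cluster -Name 'VSAN-Cluster'

# Run the VSAN health test for the cluster and keep the result object.
$vsanhealth = Test-VsanClusterHealth -Cluster $cluster

# The network section of the result is what the rest of this post looks at.
$vsanhealth.NetworkHealth
```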

PS C:\Windows\system32> $vsanhealth.networkhealth
HostResult : {VMware.VimAutomation.Storage.Impl.V1.Vsan.Health.VsanHostNetworkHealthResultImpl,
 VMware.VimAutomation.Storage.Impl.V1.Vsan.Health.VsanHostNetworkHealthResultImpl,
 VMware.VimAutomation.Storage.Impl.V1.Vsan.Health.VsanHostNetworkHealthResultImpl,
 VMware.VimAutomation.Storage.Impl.V1.Vsan.Health.VsanHostNetworkHealthResultImpl}
HostCommunicationFailure :
HostDisconnected :
HostInEsxMaintenanceMode :
HostInVsanMaintenanceMode :
HostWithVsanDisabled :
IssueFound : True
LargePingTestSuccess : False
MatchingIPSubnets : True
MatchingMulticastConfig : True
NetworkPartition : {VMware.VimAutomation.Storage.Impl.V1.Vsan.Health.VsanClusterNetworkPartitionImpl,
 VMware.VimAutomation.Storage.Impl.V1.Vsan.Health.VsanClusterNetworkPartitionImpl}
PingTestSuccess : False
PotentialMulticastIssue : False
VsanVmknicPresent : True

Looking at the output of the PowerCLI command, you can see I did have some issues: IssueFound is True, both PingTestSuccess and LargePingTestSuccess are False, and NetworkPartition lists two entries. So let’s dig one layer deeper and look at how the VSAN network has been set up on each host.
As you can see below, the VSAN network on each host looks properly configured.

PS C:\Windows\system32> $vsanhealth.networkhealth.hostresult
Host : esxi002.vmusketeers.local
IPSubnet : {10.1.0.0/16}
IssueFound : False
MulticastConfig : 224.2.3.4/224.1.2.3
VsanVmknicPresent : True
PeerNetworkHealth : {vmk1, vmk1, vmk1}

Host : esxi001.vmusketeers.local
IPSubnet : {10.1.0.0/16}
IssueFound : False
MulticastConfig : 224.2.3.4/224.1.2.3
VsanVmknicPresent : True
PeerNetworkHealth : {vmk1, vmk1, vmk1}

Host : esxi004.vmusketeers.local
IPSubnet : {10.1.0.0/16}
IssueFound : False
MulticastConfig : 224.2.3.4/224.1.2.3
VsanVmknicPresent : True
PeerNetworkHealth : {vmk1, vmk1, vmk1}

Host : esxi003.vmusketeers.local
IPSubnet : {10.1.0.0/16}
IssueFound : False
MulticastConfig : 224.2.3.4/224.1.2.3
VsanVmknicPresent : True
PeerNetworkHealth : {vmk1, vmk1, vmk1}
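With more than a handful of hosts, the per-host list gets long. A more compact view, using only the property names visible in the output above and assuming $vsanhealth still holds the health result, could look like this:

```powershell
# Sketch only: $vsanhealth is assumed to hold the cluster health result
# shown earlier. Property names are taken from the console output above.
$vsanhealth.NetworkHealth.HostResult |
    Format-Table Host, IssueFound, VsanVmknicPresent, IPSubnet, MulticastConfig -AutoSize
```

One row per host makes it easier to spot a host whose subnet or multicast settings differ from the rest. In my case every row looked the same, which is what pointed away from the ESXi configuration.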

So that leaves just one thing to check: my Ravello VMs! We know that only esxi004.vmusketeers.local has been added to a separate partition, so let’s have a look at the network configuration that I set up for the VM representing esxi004.vmusketeers.local. There is the error: the secondary NIC has been added to a different network!

[Screenshot: NICs]

Once this was corrected, my VSAN was no longer partitioned.

[Screenshot: retest]

So remember kids, always keep an eye on the network settings on the Ravello VMs themselves!

Kim

