So, it has been a while since I have blogged anything. Life is busy and HCX has been slowly improving with every release of the product.
Now with 4.8, there is a big change that many customers, and we internally, have been requesting for a while: the ability to use multiple service meshes within the same cluster!
Now let’s rewind slightly.
If you had a service mesh going from one source cluster to another, that has always been supported.
If you had one source cluster that wanted to go to 2 or more destination clusters, that was fully supported with multiple HCX Service meshes. It was called the One to Many approach:
As the diagram below shows, this was the only way to scale: one side needed to have more physical clusters.
Now for bulk migration, in my experience this was never really an issue, BUT when it came to RAV/vMotion it became one. With RAV, the cutover is a serial operation, and that limit is tied to each mesh. So if you had 1 mesh, you could only do 1 VM cutover at a time, and each one would take however long it needed to finish, which in turn meant you had no real idea how long your migrations were going to take. It was the same issue for vMotion.
Now the only way to get some concurrency was to migrate to multiple clusters and spread your cutovers across them. This way you could have multiple concurrent switchovers happening at any given time. It was really the only way to improve throughput, and you had to factor this into your migration plan.
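To put rough numbers on why this matters, here is a minimal Python sketch. It is only a planning aid, not anything HCX provides: the function name and the cutover durations are made up for illustration, and it assumes each mesh behaves as one serial lane where the next cutover goes to whichever lane frees up first.

```python
import heapq

def estimate_cutover_minutes(cutover_minutes, lanes):
    """Estimate wall-clock time to finish all cutovers.

    cutover_minutes: per-VM cutover durations (hypothetical values).
    lanes: number of serial cutover lanes (one per mesh, as assumed above).
    """
    if lanes < 1:
        raise ValueError("need at least one lane")
    # Each heap entry is the time at which a lane becomes free.
    free_at = [0] * lanes
    heapq.heapify(free_at)
    for minutes in cutover_minutes:
        start = heapq.heappop(free_at)   # earliest available lane
        heapq.heappush(free_at, start + minutes)
    return max(free_at)

durations = [30, 20, 10, 40]  # made-up cutover times in minutes
print(estimate_cutover_minutes(durations, lanes=1))
print(estimate_cutover_minutes(durations, lanes=2))
```

With one lane the total is simply the sum of every cutover, which is why a single mesh made migration windows so hard to predict.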
Some customers would try to deploy this kind of topology:
Now on the face of it, you’d think that would work, as HCX would let you deploy it. I’m afraid to say it did nothing but waste resources. The first mesh would be the one that was always used and the others would just sit idle doing nothing.
With 4.8 this has now changed:
Let us take a look at the release notes:
Single-Cluster Multi-Mesh Scale Out (Selectable Mesh)
We’ve improved scaling options in HCX 4.8 by removing the notion of clusters as the limiting factor for HCX migration architectures. Single and multi-cluster architectures can now leverage multiple meshes to scale transfer beyond the single IX transfer limitations. The new scale-out patterns help mitigate the concurrency limits for Replication Assisted vMotion and HCX vMotion.
You can now explicitly select the Service Mesh during a migration operation. You might choose a specific Service Mesh to use for a migration based on the parameters or resources associated with that Service Mesh configuration, or to manually load balance operations across the cluster. If no Service Mesh is selected, HCX determines the Service Mesh to use for the migration. To migrate workloads using HCX, see Migrating Virtual Machines with HCX.
Note:
A scaled-out HCX 4.8 deployment increases the potential for exceeding vSphere maximums for vMotion migration limits. Please observe all vSphere migration maximums when increasing the HCX scale within a cluster.
So you can do that kind of deployment and get much more concurrency with your migration waves!
Just remember that the same Service Mesh limits apply:
- 2Gb maximum total bandwidth per IX in a mesh.
- 1 concurrent RAV/vMotion/Cold switchover at any given time.
So if you wanted to have 4 RAV migrations cutting over at the same time, you would need 4 service meshes.
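The arithmetic is simple enough to do in your head, but a small sketch helps when planning bigger waves. This is a rough Python illustration using only the two limits listed above. The function names are hypothetical, and the bandwidth figure assumes one IX per mesh and that every mesh could actually reach its maximum, which your network may not allow.

```python
import math

# Per-mesh limits as quoted in this post.
SWITCHOVERS_PER_MESH = 1
MAX_GB_PER_IX = 2

def meshes_needed(concurrent_cutovers):
    """Service meshes required for the desired cutover concurrency."""
    if concurrent_cutovers < 1:
        raise ValueError("concurrent_cutovers must be at least 1")
    return math.ceil(concurrent_cutovers / SWITCHOVERS_PER_MESH)

def max_aggregate_bandwidth(meshes, ix_per_mesh=1):
    """Upper bound on transfer bandwidth; ix_per_mesh=1 is an assumption."""
    return meshes * ix_per_mesh * MAX_GB_PER_IX

meshes = meshes_needed(4)
print(meshes, max_aggregate_bandwidth(meshes))
```

The second number is worth a look before you scale out, because it is the load your underlying network could be asked to carry.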
The downside is the number of appliances you’d have to deploy, but you can always scale up and down depending on your requirements. I have always told customers that everyone wants to push big numbers, but it can become a nightmare to manage. Please start small and then grow as you need and as your confidence with the product grows.
Also remember that migration data can eat a lot of bandwidth, and you need to factor that in with your cutovers. Having lots of data syncing while a lot of VMs are cutting over can put a lot of strain on the underlying network…and that is not an HCX problem!
HCX does not automatically load balance across meshes, so when you are selecting your VMs to migrate, you manually select which mesh to use. It is up to you to distribute them across all available meshes however you see fit.
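If you just want an even spread, a simple round-robin over your mesh list does the job. The sketch below is plain Python for building a plan on paper. It does not call any HCX API, and the VM and mesh names are invented for the example; you would still select the mesh for each VM yourself in the migration wizard.

```python
from itertools import cycle

def assign_round_robin(vm_names, mesh_names):
    """Spread VMs evenly across meshes, in order."""
    if not mesh_names:
        raise ValueError("at least one mesh is required")
    plan = {mesh: [] for mesh in mesh_names}
    # cycle() repeats the mesh list for as long as there are VMs left.
    for vm, mesh in zip(vm_names, cycle(mesh_names)):
        plan[mesh].append(vm)
    return plan

# Hypothetical names, purely for illustration.
plan = assign_round_robin(
    ["vm1", "vm2", "vm3", "vm4", "vm5"],
    ["mesh-a", "mesh-b"],
)
for mesh, vms in plan.items():
    print(mesh, vms)
```

Round-robin ignores VM size, so if a few VMs are much larger than the rest you may prefer to place those by hand first.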
https://docs.vmware.com/en/VMware-HCX/4.8/rn/vmware-hcx-48-release-notes/index.html
My colleague Chris Dooks goes over it in more detail on his blog, and there is no need for me to rewrite what he has said.