At this point, I started deploying the basic building block of any NSX implementation: Logical Switches. Each Logical Switch is deployed as a distributed port group on your vSphere Distributed Switch (vDS). Logical Switches are created within a logical entity called a Transport Zone. A Transport Zone can span one or more clusters; a single host can’t be added to a Transport Zone on its own.
A couple of questions came to mind: What are the limits of a Transport Zone? How does it represent a logical boundary for your NSX implementation? How does it relate to the clusters added to it and to the vDS(es) in the environment? And how does all of that affect the design of an NSX implementation?
To answer these questions, let’s start with a quote from the VMware documentation that explains Transport Zones perfectly:
A Transport Zone controls to which hosts a logical switch can reach. It can span one or more vSphere clusters. Transport zones dictate which clusters and, therefore, which VMs can participate in the use of a particular network.
These couple of lines perfectly describe the Transport Zone. VMware clearly stated in the documentation that you must have at least one Transport Zone in which you’ll put your Logical Switches.
To answer my first question, I searched for any hard limits on a Transport Zone. The book Learning VMware NSX by Ranjit Singh Thakurratan states that each Transport Zone can host at most 256 ESXi hosts and that the maximum number of Logical Switches is 10,000, assuming there are no other Transport Zones in your NSX environment.
To answer my second question, let’s quote another part of the documentation:
An NSX environment can contain one or more transport zones based on your requirements. A host cluster can belong to multiple transport zones. A logical switch can belong to only one transport zone.
NSX does not allow connection of VMs that are in different transport zones. The span of a logical switch is limited to a transport zone, so virtual machines in different transport zones cannot be on the same Layer 2 network. A distributed logical router cannot connect to logical switches that are in different transport zones. After you connect the first logical switch, the selection of further logical switches is limited to those that are in the same transport zone. Similarly, an edge services gateway (ESG) has access to logical switches from only one transport zone.
It’s clear that Transport Zones and clusters have a many-to-many relationship, i.e. one Transport Zone can contain many clusters, and one cluster can join multiple Transport Zones. Also, each Transport Zone is a closed layer: its components cannot communicate with, or connect to, components in other Transport Zones. All NSX services and features used in a certain Transport Zone are limited by the Transport Zone boundary. It’s something similar to the Parallel Universes Theory (I’m waiting to see if we can communicate with something from any Parallel Universe!!). For example, for VM (A) in Transport Zone (TZ-A) to communicate with VM (B) in Transport Zone (TZ-B), you need an Edge Services Gateway (ESG) in each Transport Zone. Both ESGs need an uplink interface, in the same or in different port groups, and routing must be adjusted to steer traffic from one to the other, as per the following figure:
For the third question, vDSes and Transport Zones also have a many-to-many relationship, i.e. one Transport Zone can span one or many vDSes, and one vDS can serve multiple Transport Zones. There’s no official limitation on how Transport Zones, clusters and vDSes relate to each other or how they should be organized.
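The relationships described above can be written down as a small model. The following is only a minimal sketch in plain Python, not anything NSX itself exposes: the class names, the helper functions and the cluster and switch names are all made up for illustration. It captures two rules from the documentation quotes: a cluster may join several Transport Zones, and a distributed logical router (DLR) may only connect Logical Switches from a single Transport Zone.

```python
from dataclasses import dataclass, field


@dataclass
class TransportZone:
    # Hypothetical model class: a zone is a name plus its member clusters.
    name: str
    clusters: set = field(default_factory=set)


@dataclass(frozen=True)
class LogicalSwitch:
    # A logical switch belongs to exactly one transport zone.
    name: str
    transport_zone: str


def zones_of_cluster(cluster, zones):
    """Return the names of every transport zone the cluster has joined."""
    return {tz.name for tz in zones if cluster in tz.clusters}


def dlr_can_connect(switches):
    """A DLR can only connect logical switches from one transport zone."""
    return len({ls.transport_zone for ls in switches}) <= 1


# Cluster-2 is a member of both zones (many-to-many relationship).
tz_a = TransportZone("TZ-A", {"Cluster-1", "Cluster-2"})
tz_b = TransportZone("TZ-B", {"Cluster-2", "Cluster-3"})

web = LogicalSwitch("LS-Web", "TZ-A")
app = LogicalSwitch("LS-App", "TZ-A")
db = LogicalSwitch("LS-DB", "TZ-B")

print(sorted(zones_of_cluster("Cluster-2", [tz_a, tz_b])))  # ['TZ-A', 'TZ-B']
print(dlr_can_connect([web, app]))  # True
print(dlr_can_connect([web, db]))   # False
```

The last call returns False because LS-Web and LS-DB live in different Transport Zones, which is exactly the case where you would need an ESG per zone and routing between them.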
Now, let’s see how all of this may affect our design for any NSX environment:
- We all know that Logical Switches are represented by virtual-wire port groups on a vDS. When preparing a cluster for VXLAN, a vDS is chosen to host the VTEPs. This vDS will hold the virtual-wire port groups of all Logical Switches created. We can call this vDS a fully-connected vDS. Any other vDS connected to the cluster may be called a partially-connected vDS.
- If a cluster under a certain Transport Zone is fully-connected to one vDS and partially-connected to another, only the fully-connected vDS will have the virtual-wire port groups, as per the following figure:
- If a Transport Zone contains two or more clusters, each with its own fully-connected vDS, and a Logical Switch is created, it will be attached to each of these vDSes, as per the following figure:
Any partially-connected vDS will not have Logical Switches on it. This may be the case for certain environments that use a dedicated cluster/vDS pair for isolated workloads in a DMZ, which have their uplinks on distributed port groups and their internal communication on Logical Switches.
- If you have a cluster joined to two or more Transport Zones, each Logical Switch created in any one of these Transport Zones will be connected to the fully-connected vDS of this cluster, as per the following figure:
This may introduce a huge risk! Any administrator who can change VM settings can easily change the port group to which the VM vNIC is connected. That means a VM could be connected to Logical Switch 2 in Transport Zone 2 by mistake, although it should be connected to Logical Switch 1 in Transport Zone 1. This will disrupt the VM’s connectivity to other subnets/Logical Switches, because the VM will not be able to communicate with the DLR connected to its own Transport Zone.
- If there are two Transport Zones, each with its own clusters, and a single vDS spans all of these clusters and Transport Zones, this vDS will have all Logical Switches connected to it, as per the following figure:
This also introduces the same risk as the previous point. A VM can be mistakenly connected to the wrong Logical Switch by editing the VM settings. This will disrupt the VM’s connectivity to other subnets/Logical Switches, because the VM will not be able to communicate with the DLR connected to its own Transport Zone. This design also introduces some complexity in designing the vDS uplink configuration. Caution must be taken when assigning which uplinks are used for which port groups.
- If a vDS is fully-connected to two or more clusters and the Transport Zone contains only a subset of these clusters, Logical Switches created on this vDS will be available to all compute clusters connected to that vDS, whether they are in the Transport Zone or not, as per the following figure:
This also introduces the same risk as the previous two points. A VM from the clusters outside the Transport Zone can be mistakenly connected to any Logical Switch by editing the VM settings. Even if connected, communication will be disrupted, as the required modules (VIBs)
are not available on the underlying hosts.
- IDEAL SCENARIO: A single Transport Zone contains one or more clusters. Each cluster is fully-connected to a single vDS only. That vDS is limited to this Transport Zone. This makes the virtual-wire port groups (Logical Switches) and other NSX services under this Transport Zone available only to the VMs running on the cluster(s) under the same Transport Zone. In addition, designing the vDS(es) will be much easier. The only drawback is the number of vDS(es) in the environment, which introduces more management overhead. The following figure shows that case:
I hope this post summarizes the relations and design aspects of Transport Zones and how they relate to clusters and vDSes. In the end, it all depends on the scenario and the customer’s requirements, constraints and use case. There’s no right or wrong answer. Each topology has its own pros and cons. The most important thing is to understand these pros and cons in order to apply the topology that satisfies the customer’s requirements within their constraints and fulfills their use case.
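The scenarios above can be checked on paper before building anything. The following is a rough sketch in plain Python under stated assumptions: the function `switch_exposure` and all cluster, vDS, zone and switch names are hypothetical, and the model only encodes the behaviour described in this post (a Logical Switch gets a virtual-wire port group on the fully-connected vDS of every cluster in its Transport Zone). It flags clusters whose vDS carries Logical Switches from a Transport Zone the cluster has not joined, which covers the shared-vDS scenarios. It does not flag a cluster that has itself joined several Transport Zones; that risk has to be reviewed separately.

```python
from collections import defaultdict


def switch_exposure(cluster_vds, zone_clusters, switch_zone):
    """Map each cluster to the switches visible on its vDS from zones it has not joined."""
    # Assumption from the post: a logical switch gets a virtual-wire port
    # group on the fully-connected vDS of every cluster in its zone.
    vds_switches = defaultdict(set)
    for switch, zone in switch_zone.items():
        for cluster in zone_clusters[zone]:
            vds_switches[cluster_vds[cluster]].add(switch)

    exposure = {}
    for cluster, vds in cluster_vds.items():
        joined = {z for z, members in zone_clusters.items() if cluster in members}
        stray = {s for s in vds_switches[vds] if switch_zone[s] not in joined}
        if stray:
            exposure[cluster] = stray
    return exposure


zones = {"TZ-1": {"C1"}, "TZ-2": {"C2"}}
switches = {"LS-1": "TZ-1", "LS-2": "TZ-2"}

# One vDS spanning both zones: each cluster sees the other zone's switch.
shared = switch_exposure({"C1": "vDS-1", "C2": "vDS-1"}, zones, switches)
print(shared)  # {'C1': {'LS-2'}, 'C2': {'LS-1'}}

# Ideal scenario: one vDS per zone, nothing leaks across the boundary.
ideal = switch_exposure({"C1": "vDS-1", "C2": "vDS-2"}, zones, switches)
print(ideal)  # {}

# A cluster outside any zone that shares the vDS still sees the switch.
outside = switch_exposure(
    {"C1": "vDS-1", "C3": "vDS-1"}, {"TZ-1": {"C1"}}, {"LS-1": "TZ-1"}
)
print(outside)  # {'C3': {'LS-1'}}
```

An empty result corresponds to the ideal scenario; any non-empty entry marks a cluster where a VM could be attached to a Logical Switch outside its own Transport Zone by editing the VM settings.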
What I tested in my lab:
- Doing Host Preparation tasks from the NSX tab in the vCenter Web Client: installing the VXLAN and Firewall VIBs and adding VTEPs.
- Doing Logical Networks Preparation tasks from NSX Tab in vCenter Web Client: configuring Segment ID pool and creating Transport Zones.
- Deploying Logical Switches needed for the lab and adding VMs to their corresponding Logical Switches.
The following diagram shows what my lab looks like at this stage.