Working with VMware NSX – Logical networking

In my last post, we wrapped up the base components required to deploy NSX.  In this post, we’re going to configure some logical routing and switching.  I’m specifically referring to this as ‘logical’ since we are only going to deal with VM to VM traffic in this post.  NSX allows you to logically connect VMs at either layer 2 or layer 3.  So let’s look at our lab diagram…

image

If you recall, we had just finished creating the transport zones at the end of the last post.  The next step is to provision logical switches.  Since we want to test layer 2 and layer 3 connectivity, we're going to provision NSX in two separate fashions.  The first method will use the logical distributed router functionality of NSX.  In this method, tenant 1 will have two logical switches: one for the app layer and one for the web layer.  We will then use the logical distributed router to allow the VMs to route to one another.  The second method will be to put both the web and app VMs on the same logical layer 2 segment.  We will apply this method to tenant 2.  So let's get started…

Tenant 1
So the first thing we need to do is create the logical switches for our tenant.  This is done on the ‘Logical Switches’ tab of the NSX menu.  Navigate there and then click on the plus sign to add a new one…

image

Give it a descriptive name, and ensure the control plane is set to Unicast.  Do the same thing for the App switch for tenant 1…

image

Once both switches are created, you should see them both under logical switches showing a status of ‘Normal’.  Note that NSX allocated a segment ID for each of the switches out of the pool we created in the last post. 
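The same switches can also be created through the NSX Manager REST API rather than the UI.  The sketch below only builds the XML payload; the endpoint and element names are from my recollection of the NSX 6.x API guide, so treat them as assumptions and verify against the docs for your version before using this:

```python
import xml.etree.ElementTree as ET

def logical_switch_payload(name: str, tenant: str, mode: str = "UNICAST_MODE") -> str:
    """Build the virtualWireCreateSpec XML body for creating a logical switch."""
    spec = ET.Element("virtualWireCreateSpec")
    ET.SubElement(spec, "name").text = name
    ET.SubElement(spec, "tenantId").text = tenant
    # UNICAST_MODE matches the control plane mode we picked in the UI
    ET.SubElement(spec, "controlPlaneMode").text = mode
    return ET.tostring(spec, encoding="unicode")

payload = logical_switch_payload("tenant1_app", "tenant1")
print(payload)
```

You would POST this body (with basic auth) to `https://<nsx-manager>/api/2.0/vdn/scopes/<scope-id>/virtualwires`, where the scope ID is the transport zone we created last time; the response should contain the new virtualwire ID.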

The next step is to attach the tenant VMs to the logical switch.  If we look at the DVS, we see that any DVS we associated with the transport zone has a port-group associated with this new logical switch…

image

So each logical switch really has its own port-group.  One would think this means we could just manually edit a VM’s properties and select the port-group to associate it with the logical switch.  From my testing, this didn’t work.  The association needs to occur from the NSX management portal.  This is done by clicking on the ‘Add Virtual Machine’ button on the logical switch menu (highlighted in red below)…

image

So let’s start with the tenant1-app VM.  Select it and click next.  On the next screen, select the VNIC you want to move to the logical switch and then click next…

image

The last screen has you confirm the move.  Click finish to submit the change…

image

That’s all it takes.  Now tenant1-app is successfully associated with the logical switch tenant1_app.  Let’s do the same changes for tenant1-web…

image

image

image

So now we have two VMs connected to two different logical network segments.  How do we get them to talk to each other?  We need some sort of layer 3 gateway that each host can use to get off subnet.  This is where the distributed local router (DLR) comes in.  The DLR is considered to be an ‘edge’ in NSX, so let’s click on the edges menu in NSX and add an edge…

image

Make sure you select ‘Logical (Distributed) Router’.  Just a quick FYI: I’ve seen it referred to as both the logical distributed router and the distributed local router.  The Edge services gateway is used to connect your logical networks to the physical network.  We’ll use those later in upcoming posts.  Above I’ve filled out the basic information.  Click next to continue…

image

Enter the credentials you want to use and enable SSH access to the edge (I’m a network guy, I still need CLI).  Click Next to move on…

Note: For packet capture reasons, I decided to deploy my DLR control VMs on the management cluster.  In order to do this, I had to go back and add the management cluster to the transport zone for tenant1.  Despite doing this, the change didn’t seem to ‘take’.  I rebooted the NSX controllers, then the managers.  I still couldn’t provision the DLRs to the management cluster.  I finally rebooted vCenter, which resolved the issue.

image

As you can see above, I chose to deploy the DLRs to the management cluster.  This is so I’ll be able to more easily implement packet captures between nodes later on.  The next thing we need to do is allocate the IP addressing for the DLR.  We’ll need an IP in each logical switch for the VMs to use as a gateway.  In addition, the DLR allows you to provision a management interface.  We’ll pick an IP out of the ESX management VLAN for the management interface and use the following networks for the logical switch interfaces…

image
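Working backwards from the VM addresses we'll see later (10.20.20.74/29 for app and 10.20.20.66/29 for web), the two segments work out to the 10.20.20.72/29 and 10.20.20.64/29 networks.  A quick sanity check on the interface addressing is to let Python's ipaddress module hand you the first usable host in each /29 — note that "first usable host as the DLR gateway" is my assumed convention here, not something NSX enforces:

```python
import ipaddress

# Segment networks inferred from the VM addresses shown later in the post
segments = {
    "tenant1_app": ipaddress.ip_network("10.20.20.72/29"),
    "tenant1_web": ipaddress.ip_network("10.20.20.64/29"),
}

# Assume the DLR internal interface takes the first usable host in each /29
gateways = {name: next(net.hosts()) for name, net in segments.items()}
for name, gw in gateways.items():
    print(f"{name}: gateway {gw}/{segments[name].prefixlen}")
```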
When provisioning the interfaces on the DLR, ensure you select the ‘Internal’ option for the Type.  Then select the logical switch you want the interface on and assign it an IP address…

image

Do the same for the second interface for the web segment…

image

When you’re done, you should have both interfaces configured as shown below…

image

Note: You may have noticed that you don’t get to pick a default gateway for the management IP.  What does this mean?  As far as I can tell, it makes the management IP useless.  I’ll need to follow up on this to see what it can be used for.

The configure HA screen asks for two IP addresses to use in the HA configuration.  As far as I know, these IPs only need to be locally significant, so I just picked two randomly…

image 

On the last screen, make sure that everything looks correct and then click finish…

image

After you click finish, NSX will begin deploying the DLR VMs.  Keep in mind that these VMs aren’t in the data path; they just provide control plane operations for the DLR instance located on each physical ESX host.

Once they’re deployed, we should see two DLR VMs in the management cluster…

image

In addition, NSX should report the edge as deployed…

image

So now that the DLR is deployed, let’s check our VMs.  As shown above, the App VM has an IP address of 10.20.20.74/29 and the Web VM has an IP address of 10.20.20.66/29.  Since they’re in different subnets, they’ll have to route to talk to each other…
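You can confirm the two VMs really are off subnet from each other in a couple of lines of Python, using the addresses from the screenshots:

```python
import ipaddress

app = ipaddress.ip_interface("10.20.20.74/29")  # Tenant1-App VM
web = ipaddress.ip_interface("10.20.20.66/29")  # Tenant1-Web VM

print(app.network)            # 10.20.20.72/29
print(web.network)            # 10.20.20.64/29
print(web.ip in app.network)  # False -> off subnet, so the traffic must route
```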

So let’s take a look at the App VM…

image

As you can see, its IP is correct and it can ping the web VM off subnet.  Let’s check the web VM and see if we see similar results…

image

Yep, so it looks like we’re routing just fine.  However, this doesn’t seem like anything crazy at this point.  We’ve connected two VMs, on two different hosts, that normally would have had to route to talk to each other anyway.  All we did was make them route through a logical router.  The tenant 1 configuration looks like this on our diagram…

image

The black dotted line shows the path the app VM’s traffic took to reach the web server.  Seems like nothing’s really changed, right?  We’re still routing.  Actually, a lot has changed: we’re tunneling the routed VM traffic inside VXLAN.

1 – The app server tries to talk to the web server.  Being off subnet, it needs to talk to its default gateway, which is on the DLR.
2 – The DLR on the local ESX host receives the traffic and knows that the destination (the web server) is on one of its directly connected interfaces.  That directly connected interface is logical switch tenant1_web.  Through the NSX control plane, the DLR knows that the web server is actually on another host.  It encapsulates the original packet in a VXLAN header and sends the packet towards the physical network.
3 – The encapsulated VXLAN packet now reaches the physical NIC with a source of the VTEP interface on Thumper3 and a destination of the VTEP interface on Thumper2.  The ESX host must now encapsulate the packet in a dot1q header for VLAN 119 to get it onto the physical network.
4 – The MLS receives the dot1q packet, strips the layer 2 header, and routes the outer VXLAN packet.
5 – Leaving the physical switch, the packet is retagged with another dot1q header for VLAN 118, which is the VLAN where the VTEP interface for Thumper2 resides.
6 – Thumper2 receives the packet, strips the header, and passes the VXLAN packet to the DLR.
7 – The DLR strips off the VXLAN encapsulation and examines the inner IP packet.  Since we are now on the right host, the DLR forwards the IP packet accordingly.
8 – The Web server receives the packet.
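The encapsulation in steps 2 and 3 boils down to an 8-byte VXLAN header (RFC 7348) slipped inside an outer UDP/IP packet.  Here's a minimal sketch of that header using the segment ID NSX allocated to one of our logical switches:

```python
import struct

def vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header: flags, 3 reserved bytes, 24-bit VNI, 1 reserved byte."""
    flags = 0x08  # 'I' flag set: the VNI field is valid
    return struct.pack("!B3s3sB", flags, b"\x00\x00\x00", vni.to_bytes(3, "big"), 0)

def parse_vni(header: bytes) -> int:
    """Extract the 24-bit segment ID (VNI) from bytes 4-6 of a VXLAN header."""
    return int.from_bytes(header[4:7], "big")

hdr = vxlan_header(5000)   # segment ID NSX allocated from our pool
print(parse_vni(hdr))      # 5000
```

The outer UDP destination port for VXLAN is 4789 per the RFC, though if I recall correctly NSX for vSphere historically defaulted to the older 8472 port.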

To see this in action, let’s look at a packet capture I pulled off the wire of a ping between the app and web servers…

Note: The VLAN tag shows as 118 in both packet captures.  This is because I was only spanning the packets on the switch interface facing Thumper2.

image

Let’s see if we can match what happened up with some of the steps from above…

-The original data packet has a source of 10.20.20.74 (Tenant1-App VM) and a destination of 10.20.20.66 (Tenant1-Web VM).
-The original data packet is encapsulated in VXLAN.  Note the segment ID is 5000, which matches the segment ID assigned to the web logical switch.
-The outer VXLAN packet has a source of 10.20.20.42 (the VTEP on Thumper3) and a destination of 10.20.20.35 (the VTEP on Thumper2).

The return looks similar but in reverse…

image

The VXLAN encapsulation is what makes NSX so powerful.  We can fully encapsulate layer 2 and layer 3 traffic in an outer layer 3 header and route it.  So like I said, this is nice and all, but let’s look at an example of where NSX can really shine with VXLAN.  Let’s move ahead with setting up tenant 2.

Tenant 2
As we said earlier, we’re going to deploy the tenant 2 app and web VMs on the same subnet so they’re layer 2 adjacent.  Normally, this would mean that the two VMs would need to be on the same host in the same port-group (or same VLAN), or on separate hosts that trunked the same VLAN.  You’ll note that in our case, each of the tenant 2 VMs is on a different physical host per the diagram above.  In addition, I’m not trunking a common VLAN to both hosts for this purpose.  So let’s deploy a new logical switch just called ‘tenant2’…

image

Now let’s add both of the tenant2 VMs onto the logical switch (I’m not going to show how to do that since we did it above for tenant1).  Once both of the VMs are on the same logical switch, let’s take a look at the IP address allocation we have for tenant2…

image

So, pretty straightforward.  Now let’s try to ping from host to host and see what happens…

image

Cool, it works.  Wait, what?!?!  Yeah, that’s right.  NSX just extended layer 2 for us across layer 3 boundaries.  Let’s look at one of those pings in the packet capture so you can see it in action…

image

Check out the MAC addresses in the VXLAN encapsulated packet.  They match up perfectly with the MAC addresses on the VMs…

image

image
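This works because the inner frame sits immediately after the 8-byte VXLAN header as an ordinary Ethernet frame, so the original MACs ride through the physical network untouched.  Here's a little sketch of pulling the MACs out of an inner frame — the MAC values below are placeholders, not the real ones from my captures:

```python
def inner_macs(inner_frame: bytes):
    """Return (dst_mac, src_mac) from the Ethernet frame inside a VXLAN payload.

    An Ethernet frame starts with dst MAC (6 bytes), src MAC (6 bytes),
    then the EtherType (2 bytes)."""
    fmt = lambda raw: ":".join(f"{b:02x}" for b in raw)
    return fmt(inner_frame[0:6]), fmt(inner_frame[6:12])

# Placeholder frame: made-up VMware-style MACs plus an IPv4 EtherType (0x0800)
frame = bytes.fromhex("005056000002" "005056000001" "0800") + b"payload"
dst, src = inner_macs(frame)
print(dst, src)  # 00:50:56:00:00:02 00:50:56:00:00:01
```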

As we’ve seen today, NSX can create logical networks that encapsulate layer 2 and layer 3 network traffic inside of VXLAN.  In addition, NSX’s control plane removes the need to support multicast on the physical network gear when using VXLAN.  In other words, this all just sort of worked without my having to tweak the physical network config.  Next up we’ll start looking at routing logical networks back to the physical network.
