In our last post, we saw a glimpse of what MPLS was capable of.  We demonstrated how routers could forward traffic to IP end points without looking at the IP header.  Rather, the routers performed label operations by adding (pushing), swapping, or removing (popping) the labels on and off the packet.  This worked well and meant that the core routers didn’t need to have IP reachability information for all destinations.  However – setting this up was time consuming.  We had to configure static paths and operations on each MPLS enabled router.  Even in our small example, that was time consuming and tedious.  So in this post we’ll look at leveraging the Label Distribution Protocol (LDP) to do some of the work for us.  For the sake of clarity, we’re going to once again start with a blank slate.  So back to our base lab that looked like this…

Note: I refer to the devices as routers 1-4 but you’ll notice in the CLI output that their names are vMX1-4.

Each device had the following base configuration…

Again – nothing exciting here. The only difference between this base config and the one in the previous post is that I’ve given all of the routers loopback addresses rather than just routers 1 and 4. But while we’re at it – we know from the previous post that to use MPLS we need to turn MPLS on for each interface which will pass labelled traffic.  So as we did in our first post, let’s once again enable MPLS on all of the interfaces that we expect to pass labelled traffic…

So now we’re back to pretty much where we were in the last post before we started programming our static LSP.  As you’ll recall – that was tedious and we had to manually generate and track what labels we were using.  To overcome this – a group of protocols exist called “label distribution protocols” which take care of much of this tedious work for us.  The two primary label distribution protocols are LDP (Label Distribution Protocol) and RSVP (Resource Reservation Protocol).  In this post, we’ll be covering LDP since it’s the easiest to get up and running with and we’ll tackle RSVP in an upcoming post.

Enabling LDP is just as easy as enabling MPLS was.  We simply turn it on for any interface which we expect to handle labelled packets…

Now we can verify that LDP is on with various show commands…

Notice that router 1 believes it has a single LDP interface which is accurate. Also notice that LDP has identified the IP address of this interface. We’ll also note that the neighbor count (Nbr count) is listed as 1 so if we run the same command on router 2 we should see that its interfaces are now also enabled for LDP…

Indeed it is – so the neighbor count being 1 implies that there is an LDP relationship between router 1 and router 2 on their directly connected interfaces. If we performed the same command on router 3 and router 4 we would see similar output. You might be wondering what the Label space ID field is.  This is often referred to simply as the LDP ID and consists of two components separated by a colon.  The first piece identifies the router (the LSR ID), which in this lab is the same loopback address the router uses as its LDP transport address.  In the above output, you can see that each router is using its own local loopback address (2.2.2.2 in the case of router 2) for each of its LDP enabled interfaces. If we look at the LDP neighbors for router 2 we’ll see the LDP ID for its adjacent neighbors…

As you can see – router 2 has two neighbors, 10.1.1.0 (ge-0/0/1 on router 1) and 10.1.1.3 (ge-0/0/0 on router 3).  Notice that each neighbor’s LDP ID starts with the loopback address of the neighboring router: 1.1.1.1 for router 1 and 3.3.3.3 for router 3.  With MPLS (especially on Juniper) it’s common for a node’s transport address to be the loopback address given that the transport address is expected to be unique.  The second part of the LDP ID in all cases appears to be 0.  So what does that mean?

The number after the colon indicates the means in which the node is allocating MPLS labels.  0 means that the node is using what is called a per-router (sometimes also called per-platform) label allocation method which is the default on Juniper devices.  This means that regardless of interface – all labels advertised for FECs need to be unique.  The other option is per-interface label space and means that in addition to labels you also track the interface they are associated with.  That is – with per-interface label space you could advertise the same label, out of two different interfaces, for two different FECs.  With per-router label space you would need to advertise two unique labels one for each FEC.  For now we’ll be dealing only with per-router label space.
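If it helps to see the difference in code, here’s a minimal Python sketch of the two label spaces. This is not how Junos implements anything; the function names, interface names, and the label value 100 are all made up for illustration. The only thing borrowed from the output above is the LDP ID format.

```python
def parse_ldp_id(ldp_id):
    """Split an LDP ID such as '2.2.2.2:0' into (router address, label space)."""
    address, _, space = ldp_id.rpartition(":")
    return address, int(space)


def is_per_platform(ldp_id):
    """A label space of 0 means per-platform (per-router) allocation."""
    return parse_ldp_id(ldp_id)[1] == 0


def find_conflicts(advertisements, per_interface):
    """Return the labels that would be ambiguous in the chosen label space.

    advertisements is a list of (interface, label, fec) tuples (hypothetical data).
    Per-interface space keys on (interface, label); per-platform keys on label alone.
    """
    seen = {}
    conflicts = set()
    for interface, label, fec in advertisements:
        key = (interface, label) if per_interface else label
        if key in seen and seen[key] != fec:
            conflicts.add(label)
        seen[key] = fec
    return conflicts


# Hypothetical: the same label advertised out of two interfaces for two FECs.
ads = [
    ("ge-0/0/0.0", 100, "3.3.3.3/32"),
    ("ge-0/0/1.0", 100, "4.4.4.4/32"),
]
print(is_per_platform("2.2.2.2:0"))
print(find_conflicts(ads, per_interface=True))
print(find_conflicts(ads, per_interface=False))
```

With per-interface label space the two advertisements don’t collide, while with per-platform label space label 100 would be ambiguous, which is why each FEC needs its own label.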

Note: We talked a little bit about what FECs were in our last post.  In the majority of cases, a FEC is represented by a prefix.  That being said, it’s safe in most cases to use the two terms interchangeably.  However – FECs get labels assigned to them and in JunOS only a router’s loopback address, by default, is considered a FEC.  So when I refer to FECs throughout this post, I’m referring to the router’s /32 loopback address which is (naturally) a prefix.  I’ll try to only use the term prefix when referring to a prefix that is routed across an MPLS network and FEC when talking about a prefix that MPLS is aware of (has a label).

So we’ve verified our interfaces are enabled for LDP and we’ve even seen that we have some neighbors at this point.  The catch here is that there is a distinction between finding neighbors and building sessions.  Just because we have a neighbor doesn’t mean we have a valid LDP session.  LDP has merely discovered that there are other possible peers reachable through its interfaces by using LDP hellos.  The real magic happens once an LDP session is established.  If we look at router 2 we’ll see that we don’t currently have any active LDP sessions…

So why is this? The routers are obviously talking otherwise we wouldn’t see any neighbors. One of the requirements for LDP is that a given router can reach the transport address of any LDP peers. In this case, router 2 doesn’t know about any of the other routers’ loopbacks…

The simple fix here is to set up an IGP like OSPF and advertise the loopbacks into it. Let’s do that on all 4 routers now…

Notice that on the edge routers (routers 1 and 4) I don’t put the user facing interfaces in OSPF. If I did, we wouldn’t need MPLS 🙂

We can quickly validate on one of our hosts that OSPF adjacencies have formed…

And if we look at the LDP sessions we should now see they are operational…

If you check all of your routers, you should see a similar story.  So what do we get out of an LDP session?  Let’s tack on the term extensive to our show ldp session command and see…

As you can see there’s quite a lot of info here but let’s focus on one item for now – the received next-hop addresses for each peer.  These are learned as part of the LDP session establishment. If we looked at the output of show ldp neighbor we’d see that a couple of these addresses (for directly connected interfaces) appeared in that output as well.  However, this output gives us the complete list of possible IP addresses for each peer.  We’ll get to why these are important shortly.

So now that our sessions are established the next thing LDP will start doing for us is sending LDP label mappings.  LDP will automatically assign a label to each FEC that a router knows about and advertise that label and FEC (typically a prefix) to all of its peers.  LDP does this the easiest way possible – by flooding the advertisements out of every LDP enabled interface.  You might recall from the last post that Juniper devices by default advertise their own /32 loopback address as a FEC.  So in our case, each router already knows about a FEC to advertise since each router has a local loopback address.  In order to advertise the FEC with an LDP label mapping message the router needs to assign a label to the loopback.  To see what the router assigned, we can take a look at the LDP database…

The top table, Input label database, shows the labels that vMX1 has received.  These are the labels that router 1 will push onto a frame’s label stack when sending traffic to these destination FECs through a given node.  The “given node” is an important distinction to make here.  Since labels are unique per router we need to track which labels we learn through which LDP peering session.  Notice in the output above that the router is showing the input and output label database for the peering session between 1.1.1.1:0 and 2.2.2.2:0.  As we’ll see later on – since LDP floods you’ll receive label mappings on any LDP enabled interface but that doesn’t necessarily mean that the router will use them in forwarding.  Let’s look at the output of the LDP database on the other 3 nodes as well…

So let’s see what this looks like visually just in terms of the set of labels that router 1 is advertising…

This is where things can get a little confusing if you aren’t used to working with MPLS.  Let’s just talk about how to reach the 1.1.1.1/32 prefix to start with.  In this case router 1 is advertising a label of 3 to router 2 for the FEC 1.1.1.1/32.  3 is a special label which signals the receiving router to perform PHP (Penultimate Hop Pop) so that the sending router receives a native packet and can directly process that rather than dealing with an MPLS header.

Note: Label 3 falls into the reserved label range and is actually called the “implicit null” label (the “explicit null” labels are 0 for IPv4 and 2 for IPv6).  If an upstream router receives this label in a mapping it automatically knows to pop the top label off of the label stack before forwarding the frame downstream, so label 3 itself never shows up on the wire (I talk about upstream/downstream below).

Router 2 receives the label mapping advertisement for the 1.1.1.1/32 FEC from router 1 and then assigns a label to the FEC locally.  In this case, it assigns a label of 299904 to that FEC and then advertises that label to all of its LDP peers.

Note: While not pictured, you might have caught from the output that router 2 advertises the label not only to router 3, but also back to router 1.  This is because LDP is flooding and based on the output we can see that no split-horizon rules are in play for LDP.  Also note that it advertises the same label for the FEC since we’re using a per-router (per-platform) label space.

Router 3 receives the LDP label mapping from router 2 for FEC 1.1.1.1/32 and assigns a label of 299856 to the FEC locally.  It then advertises the label mapping to all of its LDP peers.  This method of advertising labels is referred to as “downstream unsolicited” since each LSR (Label Switching Router, a fancy term for an MPLS enabled router) is sending all of the label mappings to each of its peers without being asked to.  This is the default means of label distribution with LDP.  So the unsolicited piece makes sense because the LSRs are sending LDP label mapping advertisements whether you like it or not.  But what does downstream mean?

I’ll admit that the upstream and downstream terminology was initially confusing for me when I started using MPLS because it’s all relative.  The key to understanding the terms “upstream” and “downstream” in the MPLS world is that they refer to a specific FEC and how an LSR perceives that FEC.  So in our example above – router 2 would be considered to be an upstream router (or LSR, I use the terms interchangeably here) to router 1 for the FEC 1.1.1.1/32.  Likewise, router 2 would be a downstream LSR to router 3 for the same FEC.  The bottom line is that the closer you are to the originator of the FEC – the further downstream you are.  Label advertisement is downstream to upstream.  So in this case – since the downstream router is advertising label mapping upstream without being asked for it – it’s called downstream unsolicited label distribution.  So if label advertisements flow from downstream to upstream (the blue dotted line) it also makes sense that actual traffic flows from upstream to downstream (the green dotted line).
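To make the flooding behavior a bit more concrete, here’s a small Python simulation of downstream unsolicited advertisement across our four router chain. Treat it as a sketch under assumptions: the router names, the `flood` function, and the label allocators (which start at 2000, 3000 and so on so you can tell them apart) are invented for this example and don’t match the labels the vMX actually handed out. The only real value is label 3 for the originator.

```python
from collections import deque
from itertools import count

IMPLICIT_NULL = 3  # what the originating router advertises for its own loopback

# LDP sessions in the lab: a simple chain r1 - r2 - r3 - r4.
PEERS = {
    "r1": ["r2"],
    "r2": ["r1", "r3"],
    "r3": ["r2", "r4"],
    "r4": ["r3"],
}


def flood(fec, origin):
    """Simulate unsolicited label mappings for one FEC.

    Returns (local, received) where local[router] is the label that router
    assigned to the FEC and received[router][peer][fec] is the label that
    router learned from that peer.
    """
    # Hypothetical allocators: r1 starts at 1000, r2 at 2000, and so on.
    allocators = {r: count(1000 * (i + 1)) for i, r in enumerate(PEERS)}
    local = {origin: IMPLICIT_NULL}
    received = {r: {} for r in PEERS}
    queue = deque([origin])
    while queue:
        sender = queue.popleft()
        # No split horizon: the mapping goes to every peer, including the
        # one the sender learned the FEC from.
        for peer in PEERS[sender]:
            received[peer].setdefault(sender, {})[fec] = local[sender]
            if peer not in local:
                local[peer] = next(allocators[peer])
                queue.append(peer)
    return local, received


local, received = flood("1.1.1.1/32", "r1")
print(local)
print(received["r2"])
```

Notice that router 2 ends up with a mapping from both router 1 (downstream for this FEC) and router 3 (upstream for this FEC). The simulation makes no attempt to decide which one is usable; that’s the IGP’s job.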

Now let’s take a minute to address the LDP flooding and lack of split-horizon rules we saw above.  For instance, let’s look at the LDP database on router 2 more closely…

This is the oddity we saw above when we looked at the LDP database on router 1.  Why is router 2 receiving an LDP mapping for 3.3.3.3/32 from both its peering to router 1 and its peering to router 3? Clearly, router 2 only has a single path to 3.3.3.3/32 and that’s toward router 3. So why is router 1 advertising a path to 3.3.3.3/32 as well? The reason is that LDP floods completely indiscriminately. In other words, its job is just to tell every peer about every FEC it knows about. So in this case, this is what’s happening with the 3.3.3.3/32 FEC…

So let’s walk through this…

  1. Router 3 has the FEC of 3.3.3.3/32 locally which it tells all of its neighbors about by advertising the prefix to all of its established LDP peers with a label of 3 (only showing the conversation to router 2 in the diagram).
  2. Router 2 receives the label mapping, assigns a local label of 299872 to the FEC and then advertises that label mapping to all of its LDP peers.  All of its LDP peers happens to include router 1 AND router 3 which was the originator of the FEC to start with.  At this point, LDP just told router 3 that it can reach the FEC of 3.3.3.3/32 by passing router 2 a label of 299872.
  3. Router 1 has also received the label mapping for 3.3.3.3/32 and happily told router 2 that it can reach the FEC by passing it a label of 299872 as well.

So this is a mess.  Moreover – this sort of advertising is looking like it’s going to cause nothing but loops.  So how does a router know which way is the correct downstream path to reach a FEC?  The answer is that each LSR relies on the underlying IGP (in this case OSPF) to determine the right label to use.   So let’s look at the inet.0 routing table on router 2 for the destination 3.3.3.3/32 to make this determination…

The RFC for LDP (RFC 5036) says that…

An LSR receiving a Label Mapping message from a downstream LSR for a Prefix SHOULD NOT use the label for forwarding unless its routing table contains an entry that exactly matches the FEC Element.

In other words, we need an entry in the inet.0 table for the FEC in order to even consider using the label we received to try and reach it.  As we can see above, we do have a route for 3.3.3.3/32 in the inet.0 table so we’ve passed that requirement.  The question now is how do we determine which label to use?  The catch is that the information required to make this assessment lives in a couple of different places.  Recall that when we looked at the output of show ldp session extensive on router 2 we learned all of the MPLS enabled addresses of each LDP peer.  In this case, router 2 learned that router 1 has interface 10.1.1.0 and router 3 has interfaces 10.1.1.3 and 10.1.1.4.  If we consult the inet.0 entry for 3.3.3.3/32 we can see that it’s reachable through a physical next hop of 10.1.1.3 which the router knows is associated with the LDP peering toward router 3.  That means that router 3 is the true downstream LSR for this prefix and that router 2 should use the label it learned from router 3 to reach that FEC.  The logic looks like this…
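For those who prefer code to diagrams, the same decision can be sketched in a few lines of Python. This is my own simplified model of what router 2 is doing, not anything lifted from Junos: the function name and data structures are made up, and the tables only hold the handful of values we discussed above for the 3.3.3.3/32 FEC.

```python
def select_label(fec, routing_table, peer_addresses, input_labels):
    """Pick the label to use for a FEC, or None if no mapping is usable.

    routing_table:  {prefix: IGP next-hop address} (exact match only)
    peer_addresses: {LDP ID: set of addresses that peer told us about}
    input_labels:   {LDP ID: {prefix: label received from that peer}}
    """
    # Step 1: the routing table must contain an exact match for the FEC.
    next_hop = routing_table.get(fec)
    if next_hop is None:
        return None
    # Step 2: find the LDP peer that owns the IGP next-hop address.
    for peer, addresses in peer_addresses.items():
        if next_hop in addresses:
            # Step 3: use the label learned from that peer, if there is one.
            label = input_labels.get(peer, {}).get(fec)
            if label is not None:
                return peer, label
    return None


# Router 2's view, using the values from the walkthrough above.
routing_table = {"3.3.3.3/32": "10.1.1.3"}
peer_addresses = {
    "1.1.1.1:0": {"10.1.1.0"},
    "3.3.3.3:0": {"10.1.1.3", "10.1.1.4"},
}
input_labels = {
    "1.1.1.1:0": {"3.3.3.3/32": 299872},
    "3.3.3.3:0": {"3.3.3.3/32": 3},
}
print(select_label("3.3.3.3/32", routing_table, peer_addresses, input_labels))
```

The mapping from router 1 is still sitting in the database, it just never wins because the IGP next hop doesn’t belong to that peer.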

So you now might be wondering why the inet.0 table on all of the routers doesn’t show any label operations since we just validated that the routers at least have some usable labels.  As in the first post, while the routers know what labels to use to reach a specific FEC, they still don’t have a means to get traffic into an LSP to reach a particular FEC.  As we talked about in the first post, forwarding entries for MPLS are in the inet.3 routing table on Juniper equipment.  Recall that this is the routing table where Juniper stores its FECs and is sometimes called the FEC mapping table.  More specifically, it’s a recursive lookup table leveraged by BGP.  So let’s see what router 1 has in its inet.3 table…

Ahah!  The inet.3 table does list some label operations to reach all of the FECs we currently know about.  If we compare these to the LDP input database on router 1, we’ll see that these entries line up perfectly with the entries in the LDP peering session between router 1 and router 2 that we examined earlier (output above).  Namely…

  • To reach 2.2.2.2 on router 2 we send the traffic out of the ge-0/0/1.0 interface towards 10.1.1.1.  Notice that we do not push a label for this action.  This is because router 1 was told through its LDP peering session that to reach 2.2.2.2 it should use a label of 3.  3 indicates that the penultimate (the second to last) router should pop the label and deliver the traffic normally.  Since router 1 is the second to last router it simply doesn’t impose a label.
  • To reach 3.3.3.3 on router 3 we send the traffic out of the ge-0/0/1.0 interface towards 10.1.1.1.  In this case, we do impose a label of 299872.  This is the label we learned from the downstream router (router 2) for the FEC/prefix 3.3.3.3/32.  You can confirm that by consulting the LDP database on the peering session between router 1 and router 2.  You’ll see that router 2 advertised a label of 299872 to reach the prefix of 3.3.3.3/32.
  • To reach 4.4.4.4 on router 4 we send the traffic out of the ge-0/0/1.0 interface towards 10.1.1.1.  In this case, we do impose a label of 299888.  This is the label we learned from the downstream router (router 2) for the FEC/prefix 4.4.4.4/32.  You can confirm that by consulting the LDP database on the peering session between router 1 and router 2.  You’ll see that router 2 advertised a label of 299888 to reach the prefix of 4.4.4.4/32.

So this all seems to be lining up but let’s do the exercise one more time on router 2 so that we’re sure it makes sense.  So let’s look at the inet.3 table on router 2…

Notice that in this case, only one of the FECs imposes a label (4.4.4.4/32).  That’s because the other LDP peers (router 1 and 3) are directly connected so they advertise router 2 a label of 3 which is the reserved implicit null label and asks the upstream router to pop the label before forwarding the frame downstream.  The only other label we learned was from router 3 and was for how to reach the 4.4.4.4/32 FEC.  In this case router 3 is saying “Send the frame out of your ge-0/0/1.0 interface and push a label of 299808 onto the frame”.  If we consult the output of show ldp database above from router 3 we’ll see that 299808 is indeed the label router 3 advertised to router 2 for that FEC/prefix.

One thing I hope you’ve noticed is that the labels here are locally significant.  If you replicate this lab on your own, there’s a chance that you’ll see each router allocate the same label for the same prefix.  That would work just as well.  Since each router keeps track of its own set of labels for each specific peering session there’s no reason they can’t overlap.

Looking back at the inet.0 table we’ll see that the router wants to use OSPF to reach all of these prefixes and doesn’t list a label to be pushed…

So despite having the entries in inet.3 they aren’t being used since we don’t have any forwarding entries in inet.0.  The solution to this is to put recursive routes in the inet.0 table which will force the router to reference the inet.3 table.  In a real environment this would be done with BGP but, like we did in the first post in this series, we’re going to leverage static routes to keep things simple.  Let’s use static routes to make our client subnets reachable.  To do this, we simply add the following routes on router 1 and router 4…

At this point if we consult the inet.0 table on router 1 and router 4 we should see that the client subnets exist and that each router is showing a label operation to reach the destination prefix…
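If the recursion is hard to picture, here’s a rough Python model of it (this is not router output). The inet.3 entries mirror what we saw on router 1 above, but the client subnet 192.0.2.0/24 is a placeholder I made up for the example, and the function names are mine. The idea is simply that the static route’s next hop is a loopback, and that loopback is then resolved through inet.3 to find the label operation.

```python
import ipaddress

# Router 1's inet.3 entries as described above (None means no label is pushed).
INET3 = {
    "2.2.2.2/32": {"next_hop": "10.1.1.1", "interface": "ge-0/0/1.0", "push": None},
    "3.3.3.3/32": {"next_hop": "10.1.1.1", "interface": "ge-0/0/1.0", "push": 299872},
    "4.4.4.4/32": {"next_hop": "10.1.1.1", "interface": "ge-0/0/1.0", "push": 299888},
}

# Hypothetical static route: the remote client subnet points at router 4's loopback.
INET0_STATIC = {"192.0.2.0/24": "4.4.4.4"}


def longest_match(table, address):
    """Return the most specific prefix in table that contains address."""
    addr = ipaddress.ip_address(address)
    matches = [p for p in table if addr in ipaddress.ip_network(p)]
    if not matches:
        return None
    return max(matches, key=lambda p: ipaddress.ip_network(p).prefixlen)


def resolve(destination):
    """Resolve a destination through the static route and then inet.3."""
    prefix = longest_match(INET0_STATIC, destination)
    if prefix is None:
        return None
    protocol_next_hop = INET0_STATIC[prefix]
    lsp_prefix = longest_match(INET3, protocol_next_hop)
    if lsp_prefix is None:
        return None
    return INET3[lsp_prefix]


print(resolve("192.0.2.10"))
```

The result for the client address is the 4.4.4.4/32 entry, so traffic to the remote client gets label 299888 pushed and heads out ge-0/0/1.0.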

If you look at the LDP input label database on router 1 and router 4 you’ll find that the labels being pushed to reach the client subnets line up with the label the router learned for that prefix or FEC…

And if you check connectivity on each client, you’ll see that they are now able to ping each other successfully so long as they have a route pointing to their directly connected router interface for the remote client network. To validate we’re doing MPLS, let’s start a ping on the top client to the bottom client and do a capture on the link between router 2 and router 3.  Based on the information we know from looking at the LDP database, we can determine what labels we should see in the capture as the traffic traverses the link in between routers 2 and 3.  For the ICMP echo request I’d expect to see a label value of 299808 and for the ICMP echo reply I’d expect to see a value of 299904.  Let’s see if we’re right…

If that didn’t add up to you, look at the LDP database tables on router 2 and 3 again.  The important thing to remember is upstream versus downstream.  In our case, looking between router 2 and 3 the connectivity looks like this…

The blue arrows show the label advertisements and the green arrows show the actual data path label use.  Don’t be put off if this seems overly confusing.  It’s a relatively simple abstraction but one that can be hard to follow without walking through it step by step.  Fortunately, there’s an easier way to see some of this as well.  What we’ve been looking at so far has mostly been the pieces of control plane that make MPLS work.  The real culmination of all these efforts is displayed in the mpls.0 table.  Let’s take a look at that on router 2 so you can see what I mean…

This table describes all label operations that the router will take on labeled packets.  This table confirms what we saw above.  If router 2 receives a frame with label 299888 it will swap the label with 299808 and forward it out of the router’s ge-0/0/1.0 interface toward router 3.  This would be the ICMP echo request coming from the top client headed to the bottom client.  On the return, router 2 told router 3 that it needed a label of 299904 in order to reach the downstream FEC of 1.1.1.1/32.  We can see that if router 2 receives a frame with label 299904 it will pop the label and forward the native frame to router 1 (penultimate hop popping).
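Here’s a tiny Python model of those two mpls.0 entries on router 2, just to show how little the router has to think about once the table is built. The dictionary layout and the `forward` function are my own invention; the labels and next hops are the ones from the walkthrough above.

```python
# Router 2's label operations as described above:
# incoming label -> (operation, outgoing label, next hop)
MPLS0_ROUTER2 = {
    299888: ("swap", 299808, "10.1.1.3"),  # toward router 3, FEC 4.4.4.4/32
    299904: ("pop", None, "10.1.1.0"),     # toward router 1, FEC 1.1.1.1/32 (PHP)
}


def forward(label_stack):
    """Apply router 2's operation to the top label; return (new stack, next hop)."""
    operation, out_label, next_hop = MPLS0_ROUTER2[label_stack[0]]
    if operation == "swap":
        return [out_label] + label_stack[1:], next_hop
    if operation == "pop":
        return label_stack[1:], next_hop
    raise ValueError(f"unknown operation {operation}")


print(forward([299888]))  # the echo request heading to the bottom client
print(forward([299904]))  # the echo reply heading back to the top client
```

The echo request leaves with label 299808 toward router 3, and the echo reply leaves with an empty label stack toward router 1, which matches what we saw in the capture.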

This wasn’t meant to be an exhaustive view of LDP but I hope it gave you an idea of what LDP is capable of.  I also hope this demonstrated how necessary a label distribution protocol is to any MPLS network.  Doing static LSPs just isn’t feasible for any sort of scalable deployment so you’ll want to make sure you’re comfortable with deploying something like LDP to help out with label assignment and distribution.  In the next post, we’ll layer on BGP so we don’t need to use static routes and talk about how LDP handles failures and reroutes.  Stay tuned!

Here we are – the first day of 2018 and I’m anxious and excited to get 2018 off to a good start.  Looking back – it just occurred to me that I didn’t write one of these for last year.  Not sure what happened there, but I’m glad to be getting back on track.  So let’s start with 2017…

2017 was a great year for me.  I started the year continuing my work at IBM with the Watson group.  About halfway through the year (I think) I was offered the opportunity to transition to a role in the Cloud Networking group.  It was an opportunity I couldn’t pass up to work with folks whom I had an incredible amount of respect for.  So I began the transition and within 3 months had fully transitioned to the new team.  Since then, I’ve been heads down working (the reason for the lack of blog posts recently, sorry!).  But being busy at work is a good thing for me.  For those of you that know me well you know that “bored Jon” is “not happy Jon” so I’m in my own little heaven right now.  I’m learning a lot and getting to work on new things and ideas that I wouldn’t have had exposure to elsewhere.

2018 looks to be another busy year for me on the job front but, as usual, I have other goals as well.  So without further blabbering from me, here are my 2018 goals…

  • Get better at time management – This is a big area I can work on with many facets.  First and foremost – I’d like to work on being more productive in the morning.  A typical day for me involves getting up between 6 and 7am.  I help my wife get the kids ready – she takes them to day care – and I usually go to my office and get to work.  I’d like to try and see if it’s feasible for me to get up earlier to try and get some things done before I start my work day.  5am seems a bit early – so I’m not sure what the right time is but any time I have in the morning would be free time that I could get things done in.  The bigger challenge for me here will be making sure I go to bed at a decent time to make this feasible.  As it stands now – I typically work until midnight or 1am and I often get caught up in something and push this to 3 or 4.  While it’s great to get work done – this isn’t a great model for me to follow and I realize that.  So I need to work on that.  The other piece of this is working to be better at sorting out my work/life balance.  It seems almost trite to bring up – but it’s often hard for me to draw clear lines between work and life.
  • Be healthier – The downside of working from home is that the cafeteria is always open.  I snack almost all day long.  This is driven by many things – skipping breakfast and/or lunch is a big reason I typically snack so I’m going to get better at having actual meals.  I’m also going to make a point of buying better food to snack on.  The other area that bites me is that I live in Minnesota.  During the summer I can run in the morning or over lunch.  During the winter (when it’s -15 without the windchill right now!) I don’t care to go out to run.  I’m going to try and find a better way to work out inside and try and stick with it.
  • Blog more – Since starting at IBM the amount of blogging I’ve done has gone down considerably.  I want to fix that.  I view blogging as one of the means that I can give back to the networking community and I’m disappointed with my current cadence of blog posts.
  • Give another talk – While I’ll admit I was hesitant to do it – I managed to give my first talk last year at CHINOG07 (see here!).  I actually really enjoyed doing it and I’m hoping I can find another topic and another conference I can give a talk at.

So there you have it – my goals for 2018.  You’ll note that I’ve taken running a marathon off the list.  I still want to do this, so maybe I’ll add it back onto the goal list later on 🙂

I hope you all had a great 2017 and have an even better 2018!

MPLS 101 – The Basics

In this series of posts, I want to spend some time reviewing MPLS fundamentals.  This has been covered in many places many times before – but I’ve noticed lately that oftentimes the basics are missed or skipped when looking at MPLS.  How many “Introduction to MPLS” articles have you read where the first step is “Enable LDP and MPLS on the interface” and they don’t actually explain what’s happening?  I disagree with that being a valid starting point so in this post I’d like to start with the basics.  Subsequent posts will build from here as we get more and more advanced with the configuration.

Warning: In order to get up and running with even a basic configuration we’ll need to introduce ourselves to some MPLS terminology and concepts in a very brief fashion.  The descriptions of these terms and concepts are being kept brief intentionally in this post and will be covered in much greater depth in a future post.

Enough rambling from me, let’s get right into it…

So what is MPLS?  MPLS stands for Multi-Protocol Label Switching and it provides a means to forward multiple different protocols across a network.  To see what it’s capable of, let’s dive into a real working example of MPLS.

Note: I encourage you to follow along with the examples by using virtual routing instances.  I’ll be using the vMX from Juniper which is free to try and full featured.  You can check it out here.

Above we have a very simple network topology.  Four routers connected together in a chain with two clients (apparently people pointing their fingers in the air) at either end of the chain.  At this point, the routers have a simple configuration that includes their interface IP addressing and that’s about it.  The configurations look like this…

So nothing exciting here – what we currently have is a fairly broken IP network. None of the routers or clients can communicate to anything besides their directly connected interfaces (perhaps the reason the clients are raising their hands). To be fair, the clients can at this point talk to their directly connected router’s interfaces since they are using them as their default gateway. But they can’t talk past there.

So how do we fix this? Well typically we’d just configure something like OSPF on all of the routers, let them advertise their connected networks to each other, and things would magically work. That’s all well and good, but that also means we need to share a lot of state with all of the routers. Each router has to learn about all of the prefixes in the network. Certainly in this case, that’s not a concern, but what if this network was 100 million times bigger? If it was – it might be worthwhile to look at other options to keep the amount of state being shared between the devices as small as possible, especially in the case of routers 2 and 3 which aren’t even directly connected to a client. However – without knowledge of all possible networks, we’d need to find another means to forward traffic instead of relying on standard IP forwarding. Enter MPLS.

MPLS offers a new means of forwarding traffic that relies on labels rather than IP addresses to perform forwarding actions.  So rather than relying on doing IP lookups at each hop, the router will perform a label lookup.  MPLS tags are inserted in between the layer 2 header and the IP packet…

Given the tag placement, MPLS is often said to use a ‘shim’ header.  While the MPLS header includes more than just a label (tag) let’s just focus on the label for now.  We’ll revisit this later on, but the fact that this is a shim header is important to understand.  Many people incorrectly believe that MPLS represents a totally different means to transport data across a network.  That’s not the case at all.  Since the L2 header is still the outer header that means that we’re still using L2 forwarding semantics to get the packet from point A to point B.  The MPLS header is just additional information that can be acted on once a device discards the L2 header.  This is the beginning of a longer theme about how tightly coupled MPLS is to the underlay network it rides on top of.
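To show what else lives in that shim header besides the label, here’s a short Python sketch that packs and unpacks a single 4-byte label stack entry: a 20-bit label, 3 traffic class bits, a bottom-of-stack bit, and an 8-bit TTL. The function names are mine and the label value in the example is arbitrary.

```python
import struct


def encode_entry(label, tc=0, bottom=True, ttl=64):
    """Pack one MPLS label stack entry into 4 bytes (network byte order)."""
    if not 0 <= label < 2 ** 20:
        raise ValueError("label must fit in 20 bits")
    if not 0 <= tc < 8 or not 0 <= ttl < 256:
        raise ValueError("tc must fit in 3 bits and ttl in 8 bits")
    word = (label << 12) | (tc << 9) | (int(bottom) << 8) | ttl
    return struct.pack("!I", word)


def decode_entry(data):
    """Unpack 4 bytes into the fields of a label stack entry."""
    (word,) = struct.unpack("!I", data)
    return {
        "label": word >> 12,
        "tc": (word >> 9) & 0x7,
        "bottom": bool((word >> 8) & 0x1),
        "ttl": word & 0xFF,
    }


entry = encode_entry(1000, ttl=63)  # arbitrary example label
print(len(entry), decode_entry(entry))
```

Because each entry carries its own bottom-of-stack bit, several of these can be stacked back to back between the L2 header and the IP packet, which is where the term “label stack” comes from.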

So now that we have a label we need to do something with it.  When an MPLS enabled router receives a packet – it can perform three basic actions.  It can push a label, swap a label, or pop a label.  Pushing a label would occur when you wish to have traffic enter the MPLS network.  Swapping happens inside of the MPLS network and represents the basic forwarding action MPLS relies on.  Popping a tag occurs as traffic leaves the MPLS network and egresses back onto a normal IP network.

Note: These are certainly not the only times these actions are used but for the sake of keeping things simple in this intro post let’s assume they are.  We’ll cover more advanced use cases in later posts.
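If you think of the labels on a packet as a stack (which is exactly what they are), the three actions are easy to sketch in Python. The labels 100, 200 and 300 below are made up, and the walk assumes the simplest possible case from the paragraph above: push at the ingress router, swap at each router in the middle, and pop at the egress router.

```python
def push(stack, label):
    """Add a new label to the top of the stack (entering the MPLS network)."""
    return [label] + stack


def swap(stack, label):
    """Replace the top label (the basic forwarding action inside the network)."""
    if not stack:
        raise ValueError("cannot swap on an unlabeled packet")
    return [label] + stack[1:]


def pop(stack):
    """Remove the top label (leaving the MPLS network)."""
    if not stack:
        raise ValueError("cannot pop an unlabeled packet")
    return stack[1:]


def walk(operations):
    """Apply (action, label) pairs in order and record the stack after each hop."""
    stack, trace = [], []
    for action, label in operations:
        if action == "push":
            stack = push(stack, label)
        elif action == "swap":
            stack = swap(stack, label)
        elif action == "pop":
            stack = pop(stack)
        else:
            raise ValueError(f"unknown action {action}")
        trace.append(list(stack))
    return trace


# Hypothetical labels across the four router chain.
print(walk([("push", 100), ("swap", 200), ("swap", 300), ("pop", None)]))
```

The trace shows one label on the packet at every hop until the last router pops it and we’re back to a plain IP packet.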

So now that we know some of the basics – let’s get some of the configuration done and then come back and explain it.

The first thing we have to do is configure the router’s interfaces for MPLS. To do so on a Juniper router you configure both the physical interface for MPLS as well as specify the interface under the MPLS protocol configuration. Our router 1 configuration would now look like this…

Notice the highlighted lines above. These enable the interfaces for MPLS transport. Let’s make the same changes to the rest of the routers to enable their transit interfaces for MPLS…

Once again – nothing exciting about this. All we’ve done at this point is enabled the interfaces for MPLS transport.

Note: This is where things diverge drastically between vendors, specifically Cisco and Juniper. Since I’ll be using the vMX for this series of posts we’ll be talking about how this is handled with Juniper. While many things are similar, and they certainly interoperate, just keep in mind that the default behavior for many things is different between the two vendors.

So let’s take a step back here and talk about how this works with IP. If the top client is using router 1 as its default gateway, when it sends traffic destined to the bottom client, router 1 has to sort out what to do with it. In the case of normal IP forwarding the router would look at the incoming packet’s destination IP, consult its routing table, hopefully find a prefix that matches the destination IP of the packet, and forward the packet on to the next hop the route specifies. If we wish to forward the top client’s packet using MPLS transport, we need a means to tell the router to change its normal forwarding behavior. In other words – the router needs some mechanism to tell it what traffic should be sent using MPLS tags. That mechanism comes in the form of an LSP – or a label switched path. An LSP defines a path through an MPLS network. For the sake of keeping things simple as we begin our MPLS labs we’ll rely on statically defined LSPs. Static LSPs are defined under the MPLS protocol section of the configuration as follows…

Above we have a new LSP called router1->router4. The LSP defines…

  • A method of ingress, meaning that this will be the entry point of traffic entering the LSP.
  • A destination using the to definition. In this case the LSP is headed to, or terminates at, 4.4.4.4, which is the loopback IP address of router 4.
  • An action of push, meaning this router will push a label (1001001) onto the packet.
  • A next-hop of 10.1.1.1.

Most of these items should make at least some sense to you.  We know we want to use MPLS to forward the traffic so the ingress method makes sense.  We know that we want to get the traffic all the way to router 4 since that’s what the bottom client is directly connected to, so saying that we’re heading to 4.4.4.4 also seems to add up.  We also talked about how when we enter an MPLS network we need to push a label so I’m also OK with making that conclusion.  What’s curious is the next-hop of 10.1.1.1.  Why do we need a next hop IP if we aren’t relying on IP forwarding to move the traffic?  The next-hop is required so we know what interface to send the traffic out of.  While slightly deceiving, the IP address you specify for the next hop is only used to work out which interface the labelled traffic should leave on.  The router will consult the IP routing table, determine which interface is used to reach that IP, and then use that as the egress interface when sending traffic down that LSP.  Remember, MPLS isn’t using the IP header to make any forwarding decisions.  The fact that you specify an IP for the next-hop is simply a matter of convenience and is then resolved to a next-hop interface for the local router.
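A rough way to picture that next-hop resolution is a longest-prefix match against the router’s connected routes, returning only the interface. The sketch below is Python, not anything Junos runs. The interface names and the /31 prefix lengths in the table are assumptions I made up for illustration; only the 10.1.1.1 next-hop comes from the LSP definition.

```python
import ipaddress

# Hypothetical connected routes on router 1.
# Interface names and prefix lengths are assumed for this sketch.
CONNECTED_ROUTES = {
    "10.1.1.0/31": "ge-0/0/0.0",  # assumed link toward router 2
    "10.2.2.0/31": "ge-0/0/1.0",  # assumed link toward the top client
}

def resolve_egress_interface(next_hop, routes=CONNECTED_ROUTES):
    """Return the interface of the longest prefix that covers next_hop."""
    address = ipaddress.ip_address(next_hop)
    best = None
    for prefix, interface in routes.items():
        network = ipaddress.ip_network(prefix)
        if address in network:
            if best is None or network.prefixlen > best[0].prefixlen:
                best = (network, interface)
    if best is None:
        raise LookupError("no route covers " + next_hop)
    return best[1]

print(resolve_egress_interface("10.1.1.1"))
```

The IP address is only consulted once, when the LSP is installed. After that the labelled packets simply leave on the resolved interface.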

So let’s install this static LSP on router 1.  If you want to copy and paste the configuration, load merge terminal is your friend here.  So now that we have defined the ingress to our LSP, we need to define the rest of the path.  We just told router 1 to push a label of 1001001 onto the packet as it enters the LSP.  We also told it to send the newly labeled packet over to router 2.  When router 2 receives the labeled packet it needs to perform an MPLS forwarding operation.  In this case, it will be a swap.  Let’s tell router 2 to swap the label 1001001 with 1001002…

The above label operation will be part of the same LSP we started on router 1 and defines…

  • A method of transit, meaning that this router will not be starting or terminating an LSP. Rather, we’ll be transiting traffic across the router.
  • An action of swap, meaning this router will examine the incoming label and, if it matches the label defined in the method (1001001), swap it for 1001002.
  • A next-hop of 10.1.1.3. As we saw above this means that it will send the relabeled packet out of its interface towards router 3.

Now our MPLS packet is making its way to router 3.  Router 3 will need to deal with the MPLS packet as well, however, it’s not going to perform a swap operation.  Rather, it’s going to perform a pop operation.  This might seem strange at first since there is another router (router 4) in the path to reach the bottom client, however we want to try and minimize the operations each router has to do.  If we sent a packet with an MPLS header to router 4 it would not only need to pop the MPLS tag, it would also need to do an IP lookup.  So rather than making router 4 do all that work, we simply tell router 3 to pop the label off and send the packet on its way to router 4, which can then just act on the IP packet alone.  This function is called PHP, or penultimate hop popping, and is very common in MPLS networks.  So let’s configure this on router 3…

Once again – the above label operation will be part of the same LSP we started on router 1 and 2 and defines…

  • A method of transit, meaning that this router will not be starting or terminating an LSP. Rather, we’ll be transiting traffic across the router.
  • An action of pop, meaning this router will examine the incoming label and, if it matches the label defined in the method (1001002), pop the label off before forwarding.
  • A next-hop of 10.1.1.5 to send the packet toward router 4.
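Putting the three routers together, the static LSP is really just a small lookup table on each hop. Here is a Python sketch that walks a packet through those tables. The labels and next-hops are the ones used above; the dictionary layout and function name are my own invention for illustration and have nothing to do with how Junos stores this internally.

```python
# Ingress entry on router 1, keyed by the LSP endpoint.
INGRESS = {"router1": {"4.4.4.4": ("push", 1001001, "10.1.1.1")}}

# Transit entries on routers 2 and 3, keyed by the incoming label.
TRANSIT = {
    "router2": {1001001: ("swap", 1001002, "10.1.1.3")},
    "router3": {1001002: ("pop", None, "10.1.1.5")},
}

def walk_lsp(endpoint):
    """Return (router, action, label stack after action, next-hop) per hop."""
    action, label, next_hop = INGRESS["router1"][endpoint]
    stack = [label]  # the push on router 1
    trace = [("router1", action, list(stack), next_hop)]
    for router in ("router2", "router3"):
        # A transit router only looks at the top label, never the IP header.
        action, new_label, next_hop = TRANSIT[router][stack[0]]
        if action == "swap":
            stack[0] = new_label
        elif action == "pop":
            stack.pop(0)
        trace.append((router, action, list(stack), next_hop))
    return trace

for hop in walk_lsp("4.4.4.4"):
    print(hop)
```

Notice that router 3 hands router 4 an empty label stack, which is the penultimate hop popping behavior described above.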

Now that we’ve defined the LSP we need to tell the router to use it to get to the bottom client.  Let’s look at our routing table now on router 1 and see what it looks like…

Notice that in addition to the normal inet.0 routing table we now also have an inet.3 routing table.  Where did this come from?  When we defined the ingress LSP ending at 4.4.4.4 the router created an entry for 4.4.4.4 in the inet.3 table.  The table is often called by other names, but it’s a special table in Junos that is used to look up BGP next hops.  Recall that BGP is unique in that it can put a next hop in the routing table that is not directly connected.  To understand how this is used with MPLS we need to talk briefly about some more MPLS terminology that will be discussed in more detail in later posts.  The inet.3 table is sometimes also called the FEC mapping table.  So what is a FEC?  FEC stands for Forwarding Equivalence Class and defines a group of packets that are sent to the same next-hop, out the same interface, using the same behavior (think QoS etc).  FECs aren’t unique to MPLS.  In standard IP routing FECs are defined the same way; the difference is that they are defined on a hop by hop basis.  In MPLS, since we build LSPs across an MPLS cloud (virtual circuits), the FEC is the same end to end.  So how is this different than an LSP?  An LSP is a generic path through an MPLS cloud.  Many FECs could use the same LSP.  That is, a FEC with a lower priority could use the same LSP as one with a higher priority.  So LSPs define the path of the virtual circuit while FECs define more granular policy on a more specific set of classified flows.

So why is the inet.3 table also called the FEC mapping table?  Because it’s used to determine the FEC for a given packet flow.  If we look at the routing table above we can see that the inet.3 table shows an entry for 4.4.4.4/32.  That’s the loopback of router 4 and the endpoint for the static LSP we defined.  It lists the other information we need to get to that endpoint.  Namely, the next hop (which resolves to the egress interface), and the MPLS label operation that we need to use to get into that FEC or LSP.  The problem we have now is that while we know how to get there, we don’t know how to get traffic into the LSP.  Note that the inet.0 table still lacks an entry for the subnet of the bottom client (10.2.2.2/31).  So how do we fix that?  Well, typically BGP would advertise the remote prefix for us (we’ll see this in a later post) and BGP would look up the next-hop in the inet.3 table, which would then put our entry into the inet.0 table.  Since we are doing everything manually, we can still make this happen with a static route.  For instance we can put this config in router 1…

Much like BGP would do for us, we’re telling router 1 here that 10.2.2.2/31 is reachable through 4.4.4.4. Since 4.4.4.4 is not directly connected we need to tell the router to resolve 4.4.4.4 into a usable next hop.

Note: Another difference between IOS and Junos is that in Junos you need to tell the router to recurse (resolve) a static route.  IOS does that automatically.

When the router resolves the route, it consults the inet.3 table just like BGP would. When it does that, it finds an entry for 4.4.4.4 because of our static LSP. Now if we look at the inet.0 table we’ll see a usable entry to reach the bottom client…
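As a rough model of that resolution step (this is a sketch, not router output), the Python below treats inet.3 and the static route as dictionaries. The prefixes, next-hop, and label are the ones from this lab; the dictionary shape, key names, and function name are hypothetical and only meant to show the order of the lookups as described above.

```python
import ipaddress

# inet.3 as populated by the static ingress LSP on router 1.
INET3 = {
    "4.4.4.4/32": {"next_hop": "10.1.1.1", "operation": ("push", 1001001)},
}

# The static route toward the bottom client's subnet, with resolve enabled.
STATIC_ROUTES = {
    "10.2.2.2/31": {"protocol_next_hop": "4.4.4.4", "resolve": True},
}

def build_inet0_entry(prefix):
    """Resolve a static route's indirect next hop through inet.3."""
    route = STATIC_ROUTES[prefix]
    if not route["resolve"]:
        raise LookupError("next hop is not directly connected and resolve is off")
    target = ipaddress.ip_address(route["protocol_next_hop"])
    for lsp_prefix, entry in INET3.items():
        if target in ipaddress.ip_network(lsp_prefix):
            # The usable entry inherits the LSP's next hop and label operation.
            return {
                "prefix": prefix,
                "via": entry["next_hop"],
                "operation": entry["operation"],
            }
    raise LookupError("no LSP in inet.3 covers " + str(target))

print(build_inet0_entry("10.2.2.2/31"))
```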

Success! At this point, we should be able to see some of this in action. Let’s start a ping on the top client toward the bottom client and take a couple of packet captures along the way.

Here is the ICMP request on the wire before it gets to router 1. Looks like a normal frame and packet…

Now below is the same frame and packet as it traverses the link between router 1 and router 2. Notice the addition of the MPLS header with its label of 1001001.  Based on the ingress LSP we configured on router 1 this makes total sense.  Also notice that in the Ethernet (L2) header the type has changed from 0x0800 (IPv4) to 0x8847 (MPLS).  This is how a receiving router knows to process this frame as an MPLS datagram…

If we capture the frame again on the link between routers 2 and 3 we’ll see the MPLS header has changed to reflect the new label.  Also notice that the MPLS TTL has been decremented.  Since we’re dealing with a new forwarding paradigm here (not IP) we still need a means to keep track of TTL and MPLS acts in much the same way that IP does for TTL – decrementing it at each hop…
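To illustrate those two observations (the EtherType telling the receiver what follows, and the MPLS TTL dropping by one per hop), here is a Python sketch that builds a toy frame and rewrites it the way a transit router would. It is not a reproduction of the captures: the MAC addresses are zeroed out, the payload is a placeholder string, the starting TTL of 64 is arbitrary, and the sketch assumes an untagged Ethernet frame.

```python
import struct

ETHERTYPE_IPV4 = 0x0800
ETHERTYPE_MPLS = 0x8847

def classify_frame(frame):
    """Return (kind, top label, MPLS TTL) based on the EtherType field."""
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype == ETHERTYPE_MPLS:
        (word,) = struct.unpack("!I", frame[14:18])
        return ("mpls", word >> 12, word & 0xFF)
    if ethertype == ETHERTYPE_IPV4:
        return ("ipv4", None, None)
    return ("other", None, None)

def swap_top_label(frame, new_label):
    """Swap the top label and decrement the MPLS TTL of an MPLS frame."""
    (word,) = struct.unpack("!I", frame[14:18])
    ttl = word & 0xFF
    if ttl <= 1:
        raise ValueError("MPLS TTL expired")
    # Keep the traffic class and bottom-of-stack bits (0xF00) untouched.
    new_word = (new_label << 12) | (word & 0xF00) | (ttl - 1)
    return frame[:14] + struct.pack("!I", new_word) + frame[18:]

# Toy frame: zeroed MACs, MPLS EtherType, one label entry, fake payload.
header = bytes(12) + struct.pack("!H", ETHERTYPE_MPLS)
shim = struct.pack("!I", (1001001 << 12) | (1 << 8) | 64)
frame_r1_r2 = header + shim + b"ip-packet"
frame_r2_r3 = swap_top_label(frame_r1_r2, 1001002)

print(classify_frame(frame_r1_r2))
print(classify_frame(frame_r2_r3))
```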

The capture between routers 3 and 4 is more interesting because, as you can see, we no longer have an MPLS header.  This is the PHP action I mentioned earlier.  Router 3 pops the MPLS header off before sending it to router 4 so that all router 4 has to do is perform an IP lookup…

And lastly we can see the normal frame and packet making it all the way to the bottom client as we’d expect using a normal L3 forwarding mechanism…

What the capture on the link between the bottom client and router 4 will also show is router 4 generating ICMP unreachable packets back to the bottom client (not pictured).  When the bottom client attempts to return the traffic to the top client it sends its traffic to router 4.  Router 4 has no means to reach the top client at 10.2.2.1 since its forwarding table has no entry for it…

So this is strange. Why can’t the return traffic from the bottom client use the same LSP that the top client used? It’s because LSPs are unidirectional. At this point our MPLS LSP looks like this…

When we defined the LSP on router 1 we defined router 4 as its endpoint, so it only carries traffic in that one direction.  To get return traffic to work we’ll need to define an LSP in the other direction as well.  So our LSPs will look like this…

Above we now show bidirectional LSPs.  With this configuration, the top and bottom client should be able to communicate normally.  The additional configuration for the second LSP on each router is shown below (again, load merge terminal is your friend here)…

Notice how the router 4 configuration also includes the static route to get the traffic into the LSP.  At this point, the top and bottom client should be able to communicate with one another successfully.  And if we look at the routing table of router 2 or 3, we’ll see that they still have no clue about either client’s subnet…

Now we’ve just scratched the surface of MPLS and if you’re new to MPLS I’m sure you’ll still have many, many questions. Hang in there – in the next post we’ll talk about LDP and BGP and how they work in conjunction with each other. Comments and questions welcome!
