Understanding IP Multicast


We’ve covered IP multicast many times in previous posts, but until now, we’ve never really talked about what it is and how it works.  Let’s start with the basics. 

Layer 2 and Layer 3 Unicast/Broadcast addressing
We know what unicast traffic and broadcast traffic are. Both can be represented at layer 2 as well as at layer 3.

From a layer 2 perspective, a unicast address is the BIA (Burned In Address) of the actual NIC unless it’s manually changed. Layer 2 BIAs (MAC addresses) should be globally unique. If a device wishes to send data to a specific device on a layer 2 segment, it sends it to that device’s BIA. For instance, my home computer’s BIA is 10:BF:48:88:3E:92. Note that the MAC is represented in hex. If I wish to broadcast traffic on a layer 2 segment, I would send a frame with a source of my MAC address and a destination of FF:FF:FF:FF:FF:FF, which is known as the broadcast MAC address. All devices on that segment listen not only for their local BIA but also for this broadcast address, and they process frames destined to either.

From a layer 3 perspective, a unicast address is an IP address (we’ll deal with IPv4 only during this post). For instance, my PC’s IP address is 10.20.30.41 /24. Another layer 3 device sending me direct unicast traffic would send it directly to that IP address. Note that I list not only the actual IP address, but I also include the subnet mask. The subnet mask is crucial in defining the layer 3 broadcast address. The layer 3 broadcast address is the last address in the subnet, the one with all of the host bits set to 1. I’d call it ‘useable’ only in the sense that it has a purpose; it can’t be assigned to an end host. For instance, on my home network the broadcast IP for my computer would be 10.20.30.255. Packets sent to that IP address should be processed by all devices on that layer 3 segment.
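If you want to check the broadcast address for a given IP and mask, here is a minimal sketch using Python’s standard `ipaddress` module. The function name `broadcast_address` is just something I made up for illustration.

```python
import ipaddress

def broadcast_address(cidr: str) -> str:
    """Return the layer 3 broadcast address for an address/mask pair."""
    # ip_interface keeps the host address and derives the enclosing network
    iface = ipaddress.ip_interface(cidr)
    return str(iface.network.broadcast_address)

if __name__ == "__main__":
    # The PC from the example above
    print(broadcast_address("10.20.30.41/24"))
```

Changing the mask changes the answer, which is why the mask matters as much as the address itself.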

Taking a look at a sample of unicast traffic, we can see that these rules hold true…

image

This shows an ICMP echo reply packet sent from my PC (10.20.30.41) to another host on my network (10.20.30.40). The layer 3 unicast addresses are shown in green boxes. The layer 2 unicast addresses are shown in the red boxes and you can see that the source matches the PC BIA that I listed above.

Broadcasts can also be seen on the wire…

image

I generated this traffic by sending an ICMP echo request to the broadcast address of the network (10.20.30.255). Note what happens when we send a layer 3 broadcast packet. The source layer 2 and layer 3 addresses are as we expect (the host sending the broadcast). The layer 3 destination is also what we expected since we were directly pinging the 10.20.30.255 address, but note how the layer 2 destination also turned into the layer 2 broadcast destination of ff:ff:ff:ff:ff:ff. The destination MAC address of the frame HAS to be the layer 2 broadcast. The NIC on my computer is currently only listening to frames destined to two addresses: the BIA and the broadcast MAC address. If the layer 3 broadcast didn’t use the layer 2 broadcast destination, it wouldn’t work at all.

Ok, I know I was kind of beating the dead horse on that, but these are important concepts to know before you start dealing with multicast since the exact same principles apply. 

Layer 2 and Layer 3 Multicast addressing
Now that we have a firm hold on how unicast and broadcast handle layer 2 and layer 3 addressing, let’s talk about how multicast does so (I know I haven’t actually explained what multicast is yet, just stick with me for a few more paragraphs!).

Like unicast and broadcast traffic, multicast has a predetermined set of layer 2 and layer 3 addresses.  Let’s start with layer 3 multicast…

image

The entire layer 3 multicast range (224.0.0.0 through 239.255.255.255) is split into a few different allocations. Let’s walk through them one at a time…

224.0.0.0 to 224.0.0.255
This range is considered to be part of the permanent IANA allocation.  This range is used for network protocols on a local segment.  Multicast routers do NOT forward packets with a destination in this range.  Things like EIGRP (224.0.0.10) and OSPF (224.0.0.5 and .6) live in this range. 

224.0.1.0 to 224.0.1.255
This range is also considered to be part of the permanent IANA allocation.  This range is used for network protocols that ARE forwarded.  That is, multicast routers will forward packets with a destination IP in this range. 

232.0.0.0 to 232.255.255.255
This range is used for source specific multicast (SSM). We’ll touch on SSM later, but for now we should just know that SSM is a means for a host to indicate that it would only like to receive packets from a specific source.

233.0.0.0 to 233.255.255.255
This range is referred to as GLOP addressing.  What does GLOP stand for?  I don’t think anyone actually knows to be honest.  The range can/should be used on an experimental basis by anyone that owns a registered ASN.  If I happened to own ASN 5660, I could convert my ASN to a 16 bit number.  In this case…

image

So you convert your ASN to binary. Then you chop it into 8 bit chunks and turn each chunk back into decimal. Those two decimal numbers make up the second and third octets of your GLOP range, with the first octet always being 233.
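The conversion is easy to script. This is a minimal sketch that assumes a 16 bit ASN; the function name `glop_range` is mine, and the /24 length follows from the fourth octet being the only part left for you to assign.

```python
def glop_range(asn: int) -> str:
    """Map a 16-bit ASN to its GLOP multicast block (233.x.y.0/24)."""
    if not 0 <= asn <= 0xFFFF:
        raise ValueError("this sketch only handles 16-bit ASNs")
    high_octet = asn >> 8    # upper 8 bits become the second octet
    low_octet = asn & 0xFF   # lower 8 bits become the third octet
    return f"233.{high_octet}.{low_octet}.0/24"

if __name__ == "__main__":
    # ASN 5660 is 00010110 00011100 in binary, which is 22 and 28
    print(glop_range(5660))
```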

239.0.0.0 to 239.255.255.255
This is considered the ‘administratively scoped’ range of addresses. Much like how RFC1918 specifies private IP space to be used on private networks, this range specifies the private range of multicast addresses that can be used on private networks. This space can NOT be advertised ‘outside’.

The rest of the space
The rest of the multicast allocation is said to be ‘transient’ space.  Technically, any enterprise can use this space and then release it back once done.  To be frank, I’m not sure how this actually works but if it comes up in the course of my studies I’ll be sure to clarify.
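To pull the ranges above together, here is a small sketch that classifies an address into the allocations I just listed. The labels and the `classify` function name are my own shorthand, and anything in the multicast range that doesn’t match a listed block is simply reported as ‘transient’.

```python
import ipaddress

# (prefix, label) pairs for the allocations described above
RANGES = [
    ("224.0.0.0/24", "local network protocols (not forwarded)"),
    ("224.0.1.0/24", "network protocols (forwarded)"),
    ("232.0.0.0/8", "source specific multicast (SSM)"),
    ("233.0.0.0/8", "GLOP"),
    ("239.0.0.0/8", "administratively scoped"),
]

def classify(address: str) -> str:
    """Return the allocation a given IPv4 address falls into."""
    ip = ipaddress.ip_address(address)
    if not ip.is_multicast:
        return "not multicast"
    for prefix, label in RANGES:
        if ip in ipaddress.ip_network(prefix):
            return label
    return "transient"

if __name__ == "__main__":
    for addr in ("224.0.0.10", "232.1.1.1", "239.1.2.3", "225.1.1.1"):
        print(addr, "->", classify(addr))
```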

Now that we’ve covered the layer 3 addressing, it’s time to talk about what multicast does for layer 2.  As we know, unicast and broadcast traffic have specific layer 2 addresses they use in their communication.  Multicast does as well, it’s just a little harder to determine.

Layer 2 multicast addresses are based entirely off of the layer 3 multicast address being used. The OUI piece of the multicast MAC address is registered with the IEEE as 01:00:5e. That piece (the first half) will always be the same. The second half of the MAC is based off of the IP address of the multicast group, with the exception of its first bit, which is always 0. The remaining 23 bits are copied from the last 23 bits of the IP address and are calculated in this fashion…

image

Those steps outline how to build the entire multicast MAC address. 
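The same steps can be sketched in a few lines of Python. This assumes the mapping described above (fixed 01:00:5e prefix, a 0 bit, then the low 23 bits of the group address); `multicast_mac` is just a name I picked.

```python
import ipaddress

def multicast_mac(group: str) -> str:
    """Build the layer 2 multicast MAC for an IPv4 multicast group."""
    ip = ipaddress.IPv4Address(group)
    if not ip.is_multicast:
        raise ValueError(f"{group} is not in 224.0.0.0/4")
    low_23_bits = int(ip) & 0x7FFFFF          # drop the top 9 bits of the IP
    mac = (0x01005E << 24) | low_23_bits      # prepend the fixed OUI
    # Print the 48-bit value as six colon-separated hex bytes
    return ":".join(f"{(mac >> shift) & 0xFF:02x}" for shift in range(40, -1, -8))

if __name__ == "__main__":
    print(multicast_mac("239.1.2.3"))
    print(multicast_mac("224.129.2.3"))
```

Because only 23 bits of the IP survive the mapping, different groups (like the two in the example) can land on the same MAC address.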

Ok, so now we know how layer 2 and layer 3 addresses are created for unicast, multicast, and broadcast.  Now we can start talking about what multicast actually solves for.

Multicast – the problem definition
The classic example given in most books can be summarized quite easily.  If you have multiple downstream hosts that want to receive the same traffic, why send it multiple times?  Why not just send it once to the end layer 3 device and then distribute it to each host that wants the traffic?  Not only is this the smart thing to do, but it can seriously reduce the bandwidth utilization of the mid-stream links. 

Take for instance the example of streaming video.  If we were to build a topology like this…

image

You have a user sitting off of an access layer switch and a content server somewhere else on the network.  In the unicast world, the green stream would be a distinct unicast stream from the content server to user 1.  Likewise, the orange stream would be another separate unicast stream directly to user 2.  While this is a small example, there’s no reason for us to send the traffic as two distinct unicast streams to each user.  Why not just send it once and have both users use the same stream?  In Multicast, it would look more like this…

image

Here one stream is sent from the content server and delivered across the network.  Ports that are interested in receiving the data get the data while other ports do not. 

Layer 3 Multicast
I think it’s best to start with a discussion of how layer 3 multicast is routed across the network.  Let’s first talk about the problem with routing multicast traffic. 

With unicast traffic, the router has a rather simple job.  Look at the packet destination, find the most specific prefix in the FIB, and send the traffic out the interface that the FIB indicates for egress.  Multicast isn’t that easy.  For instance, if you were trying to route the multicast IP address of 239.1.2.3 across the network, it’s fairly safe to say that you won’t find that prefix in the normal IP FIB.  So how does the router know how to forward this traffic?

Multicast layer 3 forwarding is accomplished with PIM (Protocol Independent Multicast).  PIM can run in either dense or sparse mode.  In this post I’m going to only examine PIM dense mode since it’s a little easier to get going with a base configuration.  We’ll discuss sparse mode PIM in the next post. 

PIM Dense Mode
Dense mode PIM makes the assumption that all devices (or at least the bulk of them) want to hear the multicast traffic. This means that when a router receives a multicast packet it should, by default, forward it out all interfaces except the one on which it was heard. Routers can request to be ‘pruned’ out of a multicast group if they have not heard of any directly connected subnets that have requested the group and if the router does not have any other downstream routers that need to receive a copy of the packet.

An important part of multicast routing is the RPF (reverse path forwarding) check.  Since we truly don’t know where the exact destinations of the multicast traffic live, we need some means to ensure that the multicast packets aren’t looped through the network.  As I stated above, the first basic loop prevention mechanism is not to flood the multicast traffic out of the interface you heard the initial packet on.  But we can still have problems even with this check. 

Take this part of our topology for example…

image

The packet comes into router2 and then is sent out all of the interfaces other than the one it was received on.  Blue goes one way, red goes another.  Following the same logic each other router sends the packet out ‘any interface except the one it received it on’ and we start looping packets in a big way. 

Reverse path forwarding fixes this for us by verifying the reverse path.  For instance, look at what happens when the RPF check is in place with the same packet…

image

When a new multicast group starts sending traffic, the router receiving the traffic picks an RPF interface for that group. That is, the router does an RPF check by checking the source of the multicast stream and doing a unicast lookup of that IP address. The interface that the router would use to forward unicast packets to that source is this multicast group’s RPF interface. Let’s walk through an example of how this works…

For the sake of this example, let’s assume the source is 10.20.30.41 and multicast destination is 239.1.2.3.  The device interested in the multicast stream is the user at 10.0.0.42 hanging off of switch1. 

Please note that steps 1, 2, and 3 occur at the same time that 4 and 5 do.  The numbers are just to identify pieces of the process.

Step 1 – The packet is received on router2 and sent out of all of router2’s interfaces except the interface on which the multicast packet was received. The packet reaches router1, which examines the source of the packet and does a unicast lookup on the source. The lookup tells router1 that it would normally forward traffic towards 10.20.30.41 out of its 10.0.0.2 interface facing router2. This interface becomes router1’s RPF interface for this particular multicast group.

Step 2 – Router1 forwards the packet out all of the interfaces except the one that it received the packet on. The packet reaches switch1 in this step. Switch1 does a unicast lookup on 10.20.30.41 and determines that it actually has 2 equal cost paths to that unicast address. In this case, the switch picks the interface with the highest IP address as its RPF interface. Switch1 marks the 10.0.0.25 interface as RPF, and forwards the packet out of all of the other interfaces except the one it received the multicast packet on.

Step 3 – Router4 receives the packet from switch1 and does a unicast lookup on the source. It determines that its best path to 10.20.30.41 is out through router2, not back through switch1. Router4 then marks the 10.0.0.13 interface as the RPF interface and completely disregards the packet it just heard from switch1.

Step 4 – Going back to when the initial packet arrived, router2 forwards the packet down to router4. Router4 does a unicast lookup on the source of 10.20.30.41 and determines that its best path is back through router2. Router4 marks the 10.0.0.13 interface as its RPF interface for this multicast group and forwards the packet out of all of the interfaces it has except the one it heard the initial packet on.

Step 5 – Switch1 receives the multicast packet from router4 and has already determined that its RPF interface is 10.0.0.25. Since the packet arrived on an interface other than the RPF interface, switch1 completely disregards the packet.

The crucial piece to take away from this process is that each device will pick an RPF interface for each multicast source and group. This is based on the shortest path to the source of the multicast packet. Once this is selected each router follows two simple rules…

1 – If you receive a multicast packet on your RPF interface, forward it out all other interfaces

2 – If you receive a multicast packet on a non-RPF interface, drop the packet. 

If each router follows these rules, we can ensure a loop free topology.
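The two rules are simple enough to sketch in code. This is only an illustration of the logic, not how a router actually implements it: the route table below is hypothetical, and the interface names are borrowed from router1 in this lab.

```python
import ipaddress

def rpf_interface(routes, source):
    """Longest-prefix unicast lookup: which interface leads back to the source?"""
    src = ipaddress.ip_address(source)
    matches = [
        (ipaddress.ip_network(prefix), interface)
        for prefix, interface in routes
        if src in ipaddress.ip_network(prefix)
    ]
    if not matches:
        return None
    # The most specific (longest) prefix wins, just like a unicast lookup
    return max(matches, key=lambda match: match[0].prefixlen)[1]

def handle_multicast(routes, interfaces, source, arrival_interface):
    """Return the interfaces a dense mode router would flood the packet out of."""
    if arrival_interface != rpf_interface(routes, source):
        return []  # rule 2: arrived on a non-RPF interface, drop it
    # rule 1: arrived on the RPF interface, forward out everything else
    return [i for i in interfaces if i != arrival_interface]

if __name__ == "__main__":
    # Hypothetical unicast routes for a router shaped like router1
    routes = [("10.20.30.0/24", "fa0/0.10"), ("0.0.0.0/0", "fa0/0.50")]
    interfaces = ["fa0/0.10", "fa0/0.50"]
    print(handle_multicast(routes, interfaces, "10.20.30.41", "fa0/0.10"))
    print(handle_multicast(routes, interfaces, "10.20.30.41", "fa0/0.50"))
```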

When the RPF selection occurs, the routers all exchange pruning messages to make each other aware of links that can be pruned out of the multicast tree. Recall that dense mode multicast assumes that all links should hear the multicast traffic to begin with and then prunes links out as needed. We can examine a great deal of information on the multicast tree with the ‘show ip mroute’ command. Let’s look at that output on each of the devices to make sure we understand what happened…

image

Let’s tear apart the output on router2 so we make sure we understand what we are seeing. The top red box shows a multicast group registration for (*, 239.1.2.3). These (*, G) entries don’t actually have anything to do with PIM dense mode, but Cisco devices create a (*, G) entry for each (S, G) entry. Confused? Understandably so at this point. I don’t want to dive into the details on this just yet, but I can tell you briefly about the two kinds of entries you might see.

(*, G) – The G here stands for group which would be the particular multicast group we are talking about.  In our example, that group is 239.1.2.3.

(S, G) – The S stands for source.  In our case, that source is 10.20.30.41.  So the (S, G) entry for the group we are looking for is (10.20.30.41, 239.1.2.3).  S could also be * as listed above which would indicate ‘any’ source. 

The only thing you need to know at this point is that (S, G) entries deal with dense mode PIM which is what we are showing. The (*, G) entries have to do with other types of multicast. We’ll cover the (*, G) entries later, but for now just be aware that for each (S, G) entry, Cisco also generates a (*, G) entry which is what we are showing in the top red box. Interestingly enough, we can glean some information from this entry despite the fact that it’s not actually an entry for dense mode PIM. For instance, the (*, 239.1.2.3) entry tells us that there are 3 interfaces with PIM-DM (dense mode) neighbors or directly connected group members for the 239.1.2.3 group. Those would be fa0/0.4, fa0/0.30, and fa0/0.10.

Note: If you are confused about (something, something) right now, don’t spin on it for too long. We’ll cover that in much greater detail later. The bottom line is that the pairing is source and destination (group).

The next red box  shows us the actual entry we are looking for.  This box tells us what the incoming interface for the multicast packet was (fa0/0.40) as well as what the RPF neighbor is for the packet.  In this case, it’s 10.0.0.10 which is the interface from which the multicast packet came. 

The last red box tells us what the outgoing interfaces are for this packet.  That is, where router2 is going to forward this multicast packet.  In this case, it’s going to forward it toward router1 and router4. Now let’s look at the same output from router1…

image

Here we find the same kind of information.  The RPF interface in router1’s case is fa0/0.10 pointing toward router2.  This makes sense since router1’s best unicast path back to 10.20.30.41 is through router2.  We can also see here that router1 is forwarding the multicast traffic down to switch1 out of the fa0/0.50 interface.  Now let’s check out router4…

image

Here we see something a little different. While the RPF interface is what we expected (heading back up to router2), check out the outgoing interface list. Router4 would like to forward the multicast packet out of its fa0/0.20 interface towards switch1, but notice how the state lists that entry as ‘Pruned’. After the initial multicast routing is built, routers that have decided to ignore the multicast traffic coming to them send prune messages to the routers sending them the multicast traffic they don’t want to hear. Recall that routers don’t want to hear multicast traffic on links that are not the RPF interface. In this case, we can assume that switch1’s interface to router1 is its RPF interface, so it’s telling router4 to prune the traffic it was sending over to switch1. Let’s take a look at switch1 to confirm…

image

As expected, switch1’s RPF interface is  up to router1 through VLAN 50.  You can see that it has also been asked by router4 to prune the multicast traffic it had been sending over to router4.

Now that we know how the traffic is getting to the client, let’s look at the actual data being sent to the client…

image

Note that the layer 3 address is 239.1.2.3. More importantly, note that the layer 2 address is 01:00:5e:01:02:03. That means that, for this to be working, the client at 10.0.0.42 must be listening for (and accepting) frames with the destination MAC of 01:00:5e:01:02:03. Recall that this MAC is generated based on the multicast IP address. A quick conversion using this site…

IP – MAC Calculator

Shows us that the IP of 239.1.2.3 in MAC form is 01:00:5e:01:02:03. 

Now let’s look at what happens after this is all running and a new host wants to join the group or an existing host wants to leave the group. 

Joining a multicast group
Let’s go back to our larger lab topology…

image

In this case, we have a multicast server (10.0.0.42) sending a multicast stream to one user (192.168.1.2) at this time. Things have converged and the multicast routing table is stable, but now we’d like to add a second user to the group so he too can watch the multicast video stream.

Let’s watch the packets on the wire to determine how this happens.

When a host joins the group, it immediately sends an IGMP group membership report to the multicast address, which signals the router that this host in particular would like to get the stream for the given multicast address…

image

This is referred to as an ‘unsolicited’ host membership report. That is, the router didn’t ask for this report, the host just told the router that it wanted to be in the group. As expected, there is a ‘solicited’ group membership report as well. These are sent in response to a router’s IGMP query.

image

Note that the query is sent to the 224.0.0.1 IP address which is used for ‘all multicast devices’.  The response of the client to the query is identical to the IGMP group join that we saw earlier…

image
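From the host’s side, an application doesn’t build that membership report itself. It asks the operating system to join the group and the OS sends the IGMP report on its behalf. Here is a minimal sketch with Python’s standard `socket` module; the helper names and the UDP port in the usage example are assumptions I made for illustration, and the exact behavior depends on the host OS.

```python
import socket
import struct

def membership_request(group: str, local_ip: str = "0.0.0.0") -> bytes:
    """Pack the group address and local interface address for IP_ADD_MEMBERSHIP."""
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(local_ip))

def join_group(group: str, port: int, local_ip: str = "0.0.0.0") -> socket.socket:
    """Open a UDP socket and join the multicast group on it."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    # Joining here is what typically triggers the unsolicited membership report
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(group, local_ip))
    return sock

if __name__ == "__main__":
    # Port 5004 is a made-up example value
    receiver = join_group("239.1.2.3", 5004)
    data, sender = receiver.recvfrom(2048)
    print(f"received {len(data)} bytes from {sender}")
```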

So let’s take into consideration the case of user1 joining the multicast stream that user2 is already listening to.  The process we described above works to tell the local router (switch3) that user1 would like to hear the stream.  But how do we get the stream from switch2 to extend over to switch3?

image

The process is rather similar to the IGMP join process that user1 used. Except this time, the router (switch3) needs to do the same thing. Recall that earlier switch3 would have sent switch2 a prune message indicating that it had no need to hear the traffic for 239.1.2.3 since it had no connected clients interested in hearing it. Since routers use PIM to route multicast at layer 3, switch3 needs to send some sort of PIM message to switch2 to tell it that it once again wants to hear the stream. To do this switch3 sends what is called a PIM ‘graft’ message to switch2. This message would look like this…

image

Switch3 sends switch2 a direct unicast message indicating that it would like to join the 239.1.2.3 stream.  To finish the graft process, switch2 sends an ACK…

image

Once this occurs, switch2 changes its OIL (Outgoing Interface List) for the (10.0.0.42,239.1.2.3) group to indicate that switch3 is no longer pruned…

image

If user2 had not already been listening to the stream, then this entire graft process would take place all the way back to the source.  The entire process for user1 joining the stream would look like this…

image

1 – User1 sends an IGMP membership report to switch3 indicating that it would like to hear the stream for 239.1.2.3.

2 – Switch3 knows where the stream initially came from since it had sent a prune to switch2 indicating that it no longer wanted to hear it since it had no interested clients. Switch3 sends a PIM graft message to switch2 indicating that it would like to be ‘grafted’ back into the multicast group.

3 – Switch2 changes the OIL for the group and changes the interface pointing towards switch3 from prune to forward. 
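One way to picture what switch2 is tracking is a tiny model of an (S, G) entry and its OIL. This is purely a teaching sketch: the class, the method names, and the interface names in the usage example are hypothetical and don’t correspond to any real router API.

```python
class MrouteEntry:
    """Toy model of one (S, G) entry and its outgoing interface list (OIL)."""

    def __init__(self, source, group, interfaces):
        self.source = source
        self.group = group
        # Dense mode starts by forwarding everywhere, then prunes as needed
        self.oil = {interface: "Forward" for interface in interfaces}

    def receive_prune(self, interface):
        """A downstream neighbor no longer wants the stream on this link."""
        self.oil[interface] = "Prune"

    def receive_graft(self, interface):
        """A downstream neighbor wants back in; resume forwarding and ACK."""
        self.oil[interface] = "Forward"
        return "Graft-Ack"

    def forwarding_interfaces(self):
        return [i for i, state in self.oil.items() if state == "Forward"]

if __name__ == "__main__":
    # Interface names are invented for this example
    entry = MrouteEntry("10.0.0.42", "239.1.2.3", ["to-user2", "to-switch3"])
    entry.receive_prune("to-switch3")
    print(entry.forwarding_interfaces())
    print(entry.receive_graft("to-switch3"))
    print(entry.forwarding_interfaces())
```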

Leaving a Multicast group
When a device wants to leave a multicast group, it sends an IGMP leave message to the 224.0.0.2 (All Multicast routers) IP address. The message indicates that the host wants to leave as well as the group IP that the host wants to leave…

image

Immediately after receiving the group leave message, the router sends a query back out the interface it received the leave on asking if there are any other devices that still want to receive the stream…

image

If there are other devices that still want to hear the stream, they can reply with a group membership report indicating that they still need to hear that multicast traffic. On the other hand, if the router doesn’t receive any replies to its query, the router knows it can stop forwarding multicast traffic out of that given interface.
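The reports, leaves, and queries in these captures are all tiny messages. Here is a sketch of how one could be packed, assuming the IGMPv2 layout (type, max response time, checksum, group address) and the IGMPv2 type codes 0x11 (query), 0x16 (report), and 0x17 (leave). I haven’t stated which IGMP version the lab hosts use, so treat the version as an assumption; the function names are mine.

```python
import socket
import struct

IGMP_QUERY = 0x11
IGMP_V2_REPORT = 0x16
IGMP_LEAVE = 0x17

def internet_checksum(data: bytes) -> int:
    """16-bit one's complement checksum used by IGMP (and IP, ICMP)."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold the carries back in
    return ~total & 0xFFFF

def igmpv2_message(msg_type: int, group: str, max_resp: int = 0) -> bytes:
    """Build an 8-byte IGMPv2 message for the given type and group."""
    group_bytes = socket.inet_aton(group)
    # Checksum is computed with the checksum field set to zero
    unchecked = struct.pack("!BBH4s", msg_type, max_resp, 0, group_bytes)
    checksum = internet_checksum(unchecked)
    return struct.pack("!BBH4s", msg_type, max_resp, checksum, group_bytes)

if __name__ == "__main__":
    leave = igmpv2_message(IGMP_LEAVE, "239.1.2.3")
    print(leave.hex())
```

A receiver can validate the message by running the same checksum over all 8 bytes and checking that the result is zero.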

That takes care of the local router and the host, but now we need to tell switch2 to prune the traffic off of the link heading to switch3.  PIM can take care of this for us…

image

As we saw before, the device (switch3) will just send a PIM prune message to switch2 which will then update its OIL for that multicast group to once again prune that interface…

image

Multicast with multiple paths
We talked about how the multicast SPT (shortest path tree) is built and how it works to prevent loops. What we didn’t talk about was what happens when a layer 2 segment has two paths to the multicast source. For instance, if we slightly modify our topology…

image

In this case, user1 is hanging off of a layer2 switch with the IP address of 172.16.0.3.  The switch has connections to two routers both of which have interfaces on the 172.16.0.0/24 network.  The content server is now sitting at 10.20.30.190 and streaming content to the same multicast IP address of 239.1.2.3.

In this case, we could have another multicast issue. Since both routers will pick their interface towards router2 as their RPF interface, they would be more than happy to send multicast traffic down to the layer2 switch, meaning that there would be duplicate copies of the multicast traffic heading down toward user1. Luckily for us, PIM has a method to solve this using PIM assert messages. The actual message on the wire looks like…

image

One of the routers asserting itself needs to be elected as the winning router.  The rules for winning go like this with a tie moving onto the next step…

1 – The router advertising the lowest admin distance of the route used to reach the source of the multicast stream wins.

2 – The router with the lowest metric to the route wins

3 – The router with the highest IP address on the LAN segment wins.

In this case, each router has an admin distance of 120 (RIP) and a metric of 6 (10.20.30.190 is 6 hops away).  So the winner becomes router4 since he has the highest LAN IP address (172.16.0.2).  We can also see the process by watching a ‘debug ip pim’ output…

image

Additionally, we can see the router that won by examining the output of the ‘show ip mroute’ command…

image

image

Note that router4 has the ‘A’ after the (S,G) entry symbolizing that it won the assert for that multicast group.
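The three tie-breakers boil down to a single comparison. This is a minimal sketch of that ordering using the values from this lab; the `assert_winner` function and the tuple layout are my own invention for illustration.

```python
import ipaddress

def assert_winner(candidates):
    """Pick the PIM assert winner from (name, admin_distance, metric, lan_ip) tuples."""
    def sort_key(candidate):
        name, admin_distance, metric, lan_ip = candidate
        # Lowest admin distance, then lowest metric, then HIGHEST IP address.
        # Negating the IP turns "highest wins" into "lowest sorts first".
        return (admin_distance, metric, -int(ipaddress.ip_address(lan_ip)))
    return min(candidates, key=sort_key)[0]

if __name__ == "__main__":
    candidates = [
        ("router1", 120, 6, "172.16.0.1"),
        ("router4", 120, 6, "172.16.0.2"),
    ]
    print(assert_winner(candidates))
```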

Report Suppression and the IGMP Querier
We spoke above about the fact that routers will routinely (default every 60 seconds) send an IGMP query out onto a LAN segment to ensure that there are still hosts that need to receive traffic for a particular multicast stream. Regardless of how many devices reply, we really only need to hear one response so the router knows to keep sending traffic onto the segment. But what happens if there are 200 hosts on a segment? Won’t they all send group membership reports to answer the query?

Here’s where a feature called ‘report suppression’ kicks in. Upon receiving a query from the router, each device will set a timer based on a random number between 0 and the IGMP MRT (Max Response Time, which defaults to 10 seconds). When a host’s timer expires, it sends the response to the multicast address of the group it wants to continue hearing traffic for. The instant any other host in that group hears the response, it cancels its own reply. This mechanism ‘suppresses’ the responses from other devices that the router doesn’t really need to hear to keep forwarding the traffic.
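Here is a rough simulation of that behavior for a single group. It’s a simplification that assumes every host hears the first report instantly (no propagation delay), and the function name and host names are made up.

```python
import random

def simulate_query(hosts, max_response_time=10.0, rng=None):
    """Return (responder, suppressed_hosts) for one IGMP query on a segment."""
    rng = rng or random.Random()
    # Each host picks a random delay between 0 and the Max Response Time
    timers = {host: rng.uniform(0, max_response_time) for host in hosts}
    # The host whose timer expires first sends the report...
    responder = min(timers, key=timers.get)
    # ...and everyone else hears it and cancels their own reply
    suppressed = [host for host in hosts if host != responder]
    return responder, suppressed

if __name__ == "__main__":
    hosts = [f"host{n}" for n in range(200)]
    responder, suppressed = simulate_query(hosts)
    print(f"{responder} answered, {len(suppressed)} reports suppressed")
```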

In addition, there is a mechanism in place to make sure that you don’t send more queries than required.  For instance, take our lab topology where routers 1 and 4 share the same common LAN.  Does it make sense for each router to send queries for the same groups?  Not really.  In this case, the router with the lowest IP address assumes the role of the ‘IGMP querier’ and takes ownership of sending the queries onto the segment.  This works out sort of nicely since the router with the highest IP address takes the assert role and forwards multicast traffic onto the segment.  Recall that the responses to the queries are multicast so even though the assert router isn’t sending queries, it’s still hearing the response.
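The querier election is just ‘lowest IP address wins’, which is the mirror image of the assert tie-breaker. A tiny sketch, with a function name of my own choosing and the two router addresses from this lab:

```python
import ipaddress

def elect_querier(router_ips):
    """Return the IGMP querier for a segment: the router with the lowest IP."""
    # Compare as addresses, not strings, so 172.16.0.10 sorts after 172.16.0.2
    return str(min(ipaddress.ip_address(ip) for ip in router_ips))

if __name__ == "__main__":
    print(elect_querier(["172.16.0.2", "172.16.0.1"]))
```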

Here you can see router1 (172.16.0.1) has taken the role of querier and is sending queries onto the segment…

image

You can also see the multicast response to the query from 172.16.0.3 showing that it is still interested in hearing about traffic in the multicast group 239.1.2.3…

image

And here you can see router4 receiving the response to the query that router1 sent…

image

The Configuration
So at this point, I’ve spent a lot of time talking about multicast and showing examples, but I haven’t actually told you how to configure it.  You might have noticed that the lab topology I was using in my examples was very similar to the lab topology I used for the OSPF and RIP labs in previous posts.  That’s entirely true and the only change that I made to get multicast working was to add the command…

‘ip pim dense-mode’

Onto every layer 3 interface in the lab.  In addition, you have to enable multicast globally on the router with the command…

‘ip multicast-routing’

This was part of the reason I wanted to start with PIM dense mode since the configuration is really quite easy.  You’ll note that we used other protocols like IGMP but those don’t require any configuration to run in a ‘default’ manner.  In the coming posts, we’ll examine ways to tune IGMP as well as configure PIM sparse mode.

I hope that this was enough to get you thinking about multicast if you’ve never used it before.  And if you have used it before, I’d love to hear your comments and feedback!  Thanks for reading!

5 thoughts on “Understanding IP Multicast”

  1. Mike

    Thanks so much for this article, Jon! Will you be covering what to do with the multicast MAC address overlap issue in your coming articles?

    This is a definite bookmark and new weekly read – great job!

  2. Pingback: Multicast – White Board Engineer

  3. Tom

    You convert Binary to Octal,not hex,so isn’t 00 01 32?Thanks for your exciting post that help me a lot.
