GLBP – Gateway Load Balancing Protocol


GLBP is a Cisco proprietary protocol that takes HSRP and VRRP to the next level (or so I thought it would). In addition to providing first hop redundancy, it also provides a load balancing mechanism for the clients. That is, several routers in the group actively forward traffic at the same time. As with HSRP and VRRP, routers that are to participate in GLBP must be members of the same group. Once the routers are part of the same group, they elect one router to be the AVG (Active Virtual Gateway). The AVG is elected based on highest priority, falling back to highest IP address if the priorities match. Up to 4 routers in the same GLBP group can actively forward traffic as AVFs (Active Virtual Forwarders), and the AVG can be one of them. The 4 router limit only applies to routers that will actively forward traffic. If a 5th (or higher number) router joins the group it will become an SVF (Standby Virtual Forwarder) and will take the place of an AVF in case of failure. There is also an SVG (Standby Virtual Gateway) role.
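The election rule above (highest priority, then highest IP address) can be sketched in a few lines of Python. This is only a model of the rule as described here, not of the protocol's actual message exchange; the router names and the `elect_gateways` helper are made up for illustration, and the addresses are the ones used in the lab later in this post.

```python
from ipaddress import IPv4Address


def elect_gateways(routers):
    """Return (avg, svg) names from a {name: (priority, ip_string)} mapping.

    Routers are ranked by priority first and IP address second, highest
    first. The top entry models the AVG and the runner-up models the SVG.
    """
    ranked = sorted(
        routers,
        key=lambda name: (routers[name][0], IPv4Address(routers[name][1])),
        reverse=True,
    )
    return ranked[0], ranked[1]


# Every router at the default priority of 100: the highest IP wins.
defaults = {
    "router1": (100, "192.168.0.11"),
    "router2": (100, "192.168.0.12"),
    "router3": (100, "192.168.0.13"),
    "router4": (100, "192.168.0.14"),
    "router5": (100, "192.168.0.15"),
}
print(elect_gateways(defaults))  # ('router5', 'router4')

# Explicit priorities override the IP address tie-break.
tuned = {
    "router1": (200, "192.168.0.11"),
    "router2": (190, "192.168.0.12"),
    "router3": (180, "192.168.0.13"),
    "router4": (170, "192.168.0.14"),
    "router5": (160, "192.168.0.15"),
}
print(elect_gateways(tuned))  # ('router1', 'router2')
```

Comparing `IPv4Address` objects rather than strings keeps the tie-break numeric, so 192.168.0.9 would not sort above 192.168.0.15.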

GLBP balances traffic by having the AVG assign each AVF a virtual MAC address.  When the GLBP group sees an ARP request come in for the virtual IP, the AVG responds to the client’s ARP request with one of the AVFs’ virtual MACs.

Note: Some documentation I’ve found uses the SVF term to describe a router that is above and beyond the 4 router AVF limit and is waiting to join the group (the way I use it).  Other documentation uses SVF to describe an active AVF that is ready to take over another AVF’s role in case of failure.  That is, router1 is an SVF for routers 2, 3, 4 and 5.

image

In this example, we have 5 GLBP routers.  Let’s put the bare minimum GLBP config on each router and examine what occurs.  On each router, we’ll run this command to enable GLBP…

Router#config t
Enter configuration commands, one per line.  End with CNTL/Z.
Router(config)#int fa0/0
Router(config-if)#glbp 5 ip 192.168.0.1
Router(config-if)#

This tells the router that it should be part of GLBP group 5 which has a virtual IP of 192.168.0.1.  Once we have that configured on each router, let’s take a look at the ‘show glbp’ output on Router 5…

Router5#show glbp
FastEthernet0/0 – Group 5
  State is Active
    1 state change, last state change 00:05:38
  Virtual IP address is 192.168.0.1
  Hello time 3 sec, hold time 10 sec
    Next hello sent in 1.856 secs
  Redirect time 600 sec, forwarder timeout 14400 sec
  Preemption enabled, min delay 0 sec
  Active is local
  Standby is 192.168.0.14, priority 100 (expires in 9.920 sec)
  Priority 100 (default)
  Weighting 100 (default 100), thresholds: lower 1, upper 100
  Load balancing: round-robin
  Group members:
    0013.19d7.6990 (192.168.0.12)
    0018.19f3.86fa (192.168.0.11)
    001d.704c.0dac (192.168.0.13)
    0021.a009.993c (192.168.0.14)
    0021.a0f1.61fc (192.168.0.15) local
  There are 4 forwarders (1 active)
  Forwarder 1
    State is Listen
      4 state changes, last state change 00:04:40
    MAC address is 0007.b400.0501 (learnt)
    Owner ID is 0013.19d7.6990
    Redirection enabled, 597.888 sec remaining (maximum 600 sec)
    Time to live: 14397.888 sec (maximum 14400 sec)
    Preemption enabled, min delay 30 sec
    Active is 192.168.0.12 (primary), weighting 100 (expires in 8.224 sec)
    Client selection count: 1
  Forwarder 2
    State is Active
      1 state change, last state change 00:05:08
    MAC address is 0007.b400.0502 (default)
    Owner ID is 0021.a0f1.61fc
    Redirection enabled
    Preemption enabled, min delay 30 sec
    Active is local, weighting 100
  Forwarder 3
    State is Listen
    MAC address is 0007.b400.0503 (learnt)
    Owner ID is 001d.704c.0dac
    Redirection enabled, 597.408 sec remaining (maximum 600 sec)
    Time to live: 14397.408 sec (maximum 14400 sec)
    Preemption enabled, min delay 30 sec
    Active is 192.168.0.13 (primary), weighting 100 (expires in 8.384 sec)
  Forwarder 4
    State is Listen
    MAC address is 0007.b400.0504 (learnt)
    Owner ID is 0021.a009.993c
    Redirection enabled, 597.760 sec remaining (maximum 600 sec)
    Time to live: 14397.760 sec (maximum 14400 sec)
    Preemption enabled, min delay 30 sec
    Active is 192.168.0.14 (primary), weighting 100 (expires in 8.288 sec)
Router5#

Let’s dissect this output piece by piece and talk about what it means.

The top portion of the output tells us who the AVG is, as well as the general state of the group.

FastEthernet0/0 – Group 5
State is Active

The first line tells us about the group we are looking at as well as the interface that GLBP is running on.  The second line tells us that this router is the active AVG.  Why is this router the active AVG?  Since we haven’t configured the priority yet, GLBP picks the router with the highest IP address to be the active AVG.  If we look at the other 4 routers this is what we see…

router1#show glbp brief
Interface   Grp  Fwd Pri State    Address         Active router   Standby router
Fa0/0       5    –   100 Listen   192.168.0.1     192.168.0.15    192.168.0.14
<additional output not shown>

router2#show glbp brief
Interface   Grp  Fwd Pri State    Address         Active router   Standby router
Fa0/0       5    –   100 Listen   192.168.0.1    192.168.0.15    192.168.0.14
<additional output not shown>

router3#show glbp brief
Interface   Grp  Fwd Pri State    Address         Active router   Standby router
Fa0/0       5    –   100 Listen   192.168.0.1     192.168.0.15    192.168.0.14
<additional output not shown>

router4#show glbp brief
Interface   Grp  Fwd Pri State    Address         Active router   Standby router
Fa0/0       5    –   100 Standby  192.168.0.1     192.168.0.15    local
<additional output not shown>

So all the other routers know who the AVG is as well as who the SVG is.  Router4 is the SVG, so it marks itself as ‘local’ under the Standby router field…

1 state change, last state change 00:05:38
Virtual IP address is 192.168.0.1
Hello time 3 sec, hold time 10 sec
Next hello sent in 1.856 secs
Redirect time 600 sec, forwarder timeout 14400 sec
Preemption enabled, min delay 0 sec
Active is local
Standby is 192.168.0.14, priority 100 (expires in 9.920 sec)
Priority 100 (default)
Weighting 100 (default 100), thresholds: lower 1, upper 100

The next chunk of output gives us some general information about the local GLBP host as well as the group in general.  We can see the virtual IP that the group is responsible for, which was configured on all of the hosts to start the GLBP process.  We can also see the local priority of this GLBP host.  Note that it’s set to 100.  As the output states, this is the default value.  The priority is used to determine who the active AVG is.  With preemption enabled, the router with the highest priority will always be the AVG, and the second highest will always be the SVG.  Let’s take a look at this by enabling GLBP preemption on the hosts and changing the priorities of all the routers…

Router1
interface FastEthernet0/0
glbp 5 priority 200
glbp 5 preempt

Router2
interface FastEthernet0/0
glbp 5 priority 190
glbp 5 preempt

Router3
interface FastEthernet0/0
glbp 5 priority 180
glbp 5 preempt

Router4
interface FastEthernet0/0
glbp 5 priority 170
glbp 5 preempt

Router5
interface FastEthernet0/0
glbp 5 priority 160
glbp 5 preempt

Now let’s take a look at the output of the ‘show glbp brief’ command on router1…

image

As you can see, router1 is now the active AVG with router2 (with the second highest priority) being the SVG.  Let’s take a second to talk about the output from this command.

image

The first line in the output talks about the group in general.  It tells you the priority of the AVG, the GLBP group IP, the AVG and the SVG.  In this case, the priority of the AVG is 200, the group IP is 192.168.0.1, the AVG is local, and the SVG is router2.

image

The second line talks about the first AVF in the group.  The meaning of the ‘state’ column changes here slightly.  As far as router1 is concerned, it is listening to this AVF to make sure that it is still online.  This does NOT imply that this AVF is not active.  This is just the viewpoint from router1.  The rest of the line shows the virtual MAC that this AVF is responsible for as well as the router’s IP address.

image

The third line talks about the second virtual forwarder.  Again, from router1’s perspective it is listening to this AVF.  We see the virtual MAC that this AVF is using and responsible for as well as its IP address.

image

The fourth line talks about the third virtual forwarder.  The state is shown as active here since the third AVF is the local router itself.  This shows that a router can own both the AVG and an AVF role.  We see the virtual MAC as well as ‘local’ to indicate that this router has this role.

image

The fifth line shows the fourth AVF, its virtual MAC and IP address.

The remainder of the initial ‘show glbp’ output talks about the load balancing method used as well as the forwarders.  Let’s spend some time talking about the AVFs and then talk about load balancing methods.

You might be wondering how the AVFs are picked.  You might have noticed that after we changed the AVG by manipulating the priorities, we ended up with this…

AVG – Router1
SVG – Router2
AVFs…
Router1
Router2
Router3
Router5

So what happened to router4?  Why isn’t it being used as an AVF?  The answer is hard to come by, but I can tell you what I’ve discerned from testing this in my lab.

The decision of which routers should be AVFs is supposedly based on weight.  In practice, it appears to be based on the order in which routers join the group.  While it’s quite easy to move the AVG and SVG around, moving a router between the AVF and SVF roles is considerably harder to do.  Let’s first start by talking about weight.  Some documentation I’ve read indicates that the weight is what’s used to determine which routers will be AVFs and which will be SVFs.  Additionally, AVF/SVF preemption is a default setting in GLBP.  The AVFs or SVFs will preempt automatically with a 30 second delay.  This all being said, one would assume that you could set the weights in the same fashion that we set the priorities and move the AVF and SVF roles around.  That’s not the case.  It appears that AVF preemption based on weight doesn’t actually exist.

image

Let’s start by looking at this example.  As stated, it’s pretty easy to move the AVG role around with priorities, so it’s easy to tell here that router1 is the AVG and router2 is the SVG based on their active priorities.  Since we have 4 routers here, it’s safe to say that each one of these routers will successfully claim an AVF role since we can have a max of 4.  Let’s take a look and verify…

image

As we can see, all 4 of the routers have an AVF role as we expected.  Now, let’s add the 5th router in with a priority of 160 and a weight of 200.  With a weight of 200, we can very easily assume that it will take one of the AVF roles from one of the other routers.  Let’s see what happens…

image

So what happened?  Nothing…

image

Taking a look at router 5 we see that it’s in a listen state for all of the AVFs…

image

At this point, it’s pretty easy to see that AVFs don’t really preempt each other based on the assigned weight.  If they did, then router5 would have taken an AVF role from one of the other 4 routers.

The other way to use weight is to determine if the AVF is a feasible forwarder.  That is, if it should even be forwarding packets.  This is done using the same tracking method we saw in HSRP and VRRP.  Additionally, when you configure the weight of a router, you also configure its upper and lower limits.  The upper and lower limits dictate when an AVF should and shouldn’t be an AVF.  For instance, let’s look at this configuration on router1…

track 1 interface Loopback0 line-protocol

interface Loopback0
ip address 1.1.1.1 255.255.255.255

interface FastEthernet0/0
ip address 192.168.0.11 255.255.255.0
duplex auto
speed auto
glbp 5 ip 192.168.0.1
glbp 5 timers redirect 60 660
glbp 5 priority 200
glbp 5 preempt
glbp 5 weighting 160 lower 150 upper 155
glbp 5 weighting track 1 decrement 15

For now, focus just on the ‘track 1’ line and the two ‘glbp 5 weighting’ lines.  This configuration says that the router should have a weight of 160.  If the weight falls below the lower limit of 150, then we should take away its AVF role.  When the weight once again returns to be equal to or above the upper limit of 155, we can give it the AVF role back.  We are also telling the router to track the line protocol of interface loop0.  If loop0 goes down, decrement the weight by 15.
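To make the threshold logic concrete, here is a small Python sketch of the weighting behaviour as described above: tracked objects subtract from the configured weight, the forwarder role is given up below the lower limit, and it is only taken back at or above the upper limit. The `WeightedForwarder` class and its method names are invented for this illustration and only model the rule, not real IOS internals.

```python
class WeightedForwarder:
    """Models one router's GLBP weighting with lower/upper thresholds."""

    def __init__(self, weight, lower, upper):
        self.configured_weight = weight
        self.lower = lower
        self.upper = upper
        self.active_decrements = {}  # track id -> decrement currently applied
        self.forwarding = True

    @property
    def weight(self):
        """Configured weight minus the decrement of every tracked object that is down."""
        return self.configured_weight - sum(self.active_decrements.values())

    def track_change(self, track_id, is_up, decrement):
        """Apply a tracked object state change and return the forwarding state."""
        if is_up:
            self.active_decrements.pop(track_id, None)
        else:
            self.active_decrements[track_id] = decrement

        if self.forwarding and self.weight < self.lower:
            self.forwarding = False  # fell below the lower limit
        elif not self.forwarding and self.weight >= self.upper:
            self.forwarding = True   # recovered to the upper limit
        return self.forwarding


# router1: glbp 5 weighting 160 lower 150 upper 155, track 1 decrement 15
router1 = WeightedForwarder(weight=160, lower=150, upper=155)

router1.track_change(1, is_up=False, decrement=15)
print(router1.weight, router1.forwarding)  # 145 False

router1.track_change(1, is_up=True, decrement=15)
print(router1.weight, router1.forwarding)  # 160 True
```

The gap between the two limits gives some hysteresis: a weight that recovers to a value between 150 and 155 would not hand the AVF role back yet.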

I have placed a similar configuration on each of the 5 routers with the following weights…

Router1
glbp 5 weighting 160 lower 150 upper 155

Router2
glbp 5 weighting 170 lower 160 upper 165

Router3
glbp 5 weighting 180 lower 170 upper 175

Router4
glbp 5 weighting 190 lower 180 upper 185

Router5
glbp 5 weighting 200 lower 190 upper 195

So let’s shut down loop0 on router1 and see what happens.  On router1 we see…

*Jan  6 21:16:45.955: GLBP: Fa0/0 5 Track 1 object changed, state Up -> Down
*Jan  6 21:16:45.955: GLBP: Fa0/0 5 Weighting 160 -> 145
*Jan  6 21:16:47.955: %LINK-5-CHANGED: Interface Loopback0, changed state to administratively down
*Jan  6 21:16:48.955: %LINEPROTO-5-UPDOWN: Line protocol on Interface Loopback0, changed state to down

On all of the other routers we see…

*Jan  6 21:22:43.011: GLBP: Fa0/0 5.1 Preemption delayed, 30 secs remaining

The preemption delay defaults to 30 seconds and is the time that a router will wait before attempting to preempt a downed AVF.  In the end, we see that router5 claims the forwarder 1 position from router1…

image

Let’s talk briefly about what occurs during an AVF failure.  When an AVF fails, another AVF will assume responsibility for the failed AVF’s virtual MAC address.  That is, it will continue to respond to requests for its own virtual MAC, as well as the MAC for the failed AVF that it is assuming responsibility for.

Now, let’s quickly talk about two other timers you need to be aware of that handle AVF failure.  The first is the redirect timer, which dictates how long the AVG will continue to hand the failed AVF’s virtual MAC address out to clients.  This defaults to 600 seconds.  During this time, if the AVF returns to the GLBP group, it will reclaim its role as the active AVF and things will resume as they should.  If the AVF does not return to the group by the time the redirect timer expires, the AVF that is handling the failed AVF’s virtual MAC will continue to do so, but the AVG will no longer hand out the virtual MAC to clients.

The second timer is the forwarder timeout timer.  This timer defaults to 14400 seconds and is used to determine when the failed AVF should be entirely flushed from GLBP.  When this timer expires, the AVG will take the forwarder out of the GLBP table entirely and revoke the virtual MAC it had been assigned.  Let’s take a look at this in action by decreasing the redirect and forwarder timeout values on the AVG.  Note, all other AVFs will learn these two timer values from the AVG…

interface FastEthernet0/0
glbp 5 timers redirect 60 660
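As a rough model of how the two timers are supposed to interact, the Python sketch below maps the time since an AVF failed to the phase its virtual MAC should be in. It is an idealized reading of the description above, not something taken from IOS; the function name and the phase labels are made up for illustration.

```python
def failed_forwarder_phase(elapsed, redirect=600, timeout=14400):
    """Return the expected phase of a failed AVF's virtual MAC.

    elapsed  -- seconds since the AVF failed
    redirect -- redirect timer in seconds (default 600)
    timeout  -- forwarder timeout in seconds (default 14400)
    """
    if elapsed < redirect:
        # The AVG still hands this virtual MAC out in ARP replies.
        return "redirect"
    if elapsed < timeout:
        # Another AVF still forwards for the MAC, but no new clients get it.
        return "no-new-clients"
    # The forwarder and its virtual MAC are removed from the group.
    return "flushed"


# With "glbp 5 timers redirect 60 660":
for seconds in (30, 120, 660):
    print(seconds, failed_forwarder_phase(seconds, redirect=60, timeout=660))
# 30 redirect
# 120 no-new-clients
# 660 flushed
```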

Now, let’s watch and see what happens when the forwarder timeout timer expires for forwarder 1.  You’ll recall that forwarder 1 had previously been owned by router1 (the AVG).  We then shut down the loopback 0 interface on router1, which decremented the weight and removed it from being an active AVF.  At that point, router5 took router1’s place.  If we look at GLBP on router1 we can see that the AVG still thinks that it ‘owns’ the AVF role…

Group members:
   0013.19d7.6990 (192.168.0.12)
   0013.19d7.aaa4 (192.168.0.15)
   0018.19f3.86fa (192.168.0.11) local
   001d.704c.0dac (192.168.0.13)
   0021.a009.993c (192.168.0.14)
There are 4 forwarders (0 active)
Forwarder 1
   State is Listen
     4 state changes, last state change 00:06:47
   MAC address is 0007.b400.0501 (default)
   Owner ID is 0018.19f3.86fa
   Redirection enabled
   Preemption enabled, min delay 30 sec
   Active is 192.168.0.15 (secondary), weighting 200 (expires in 7.432 sec)
   Arp replies sent: 1

At this point, we need to wait 11 minutes (660 seconds) for the forwarder timeout timer to expire before we see what the AVG does with the role.  So after 11 minutes pass, let’s see what happened…

Group members:
  0013.19d7.6990 (192.168.0.12)
  0013.19d7.aaa4 (192.168.0.15)
  0018.19f3.86fa (192.168.0.11) local
  001d.704c.0dac (192.168.0.13)
  0021.a009.993c (192.168.0.14)
There are 4 forwarders (0 active)
Forwarder 1
  State is Listen
    4 state changes, last state change 00:11:40
  MAC address is 0007.b400.0501 (default)
  Owner ID is 0018.19f3.86fa
  Redirection enabled
  Preemption enabled, min delay 30 sec
  Active is 192.168.0.15 (secondary), weighting 200 (expires in 8.684 sec)
  Arp replies sent: 1

Again, nothing happened.  Why would router1 still own the forwarder role?  Let’s shut down the loopback interfaces on routers 2 and 3 and see what happens…

image

And if we look at the forwarders we can see that the owner IDs are, for the most part, still listed as the original owners…

  Group members:
    0013.19d7.6990 (192.168.0.12)
    0013.19d7.aaa4 (192.168.0.15)
    0018.19f3.86fa (192.168.0.11) local
    001d.704c.0dac (192.168.0.13)
    0021.a009.993c (192.168.0.14)
  There are 4 forwarders (0 active)
  Forwarder 1
    State is Listen
      6 state changes, last state change 00:01:48
    MAC address is 0007.b400.0501 (default)
    Owner ID is 0018.19f3.86fa
    Redirection enabled
    Preemption enabled, min delay 30 sec
    Active is 192.168.0.15 (secondary), weighting 200 (expires in 7.576 sec)
  Forwarder 2
    State is Listen
    MAC address is 0007.b400.0502 (learnt)
    Owner ID is 0013.19d7.6990
    Redirection disabled
    Time to live: 563.576 sec (maximum 566 sec)
    Preemption enabled, min delay 30 sec
    Active is 192.168.0.15 (secondary), weighting 200 (expires in 7.576 sec)
  Forwarder 3
    State is Listen
    MAC address is 0007.b400.0503 (learnt)
    Owner ID is 0013.19d7.aaa4
    Redirection enabled, 57.576 sec remaining (maximum 60 sec)
    Time to live: 657.576 sec (maximum 660 sec)
    Preemption enabled, min delay 30 sec
    Active is 192.168.0.15 (primary), weighting 200 (expires in 7.572 sec)
  Forwarder 4
    State is Listen
    MAC address is 0007.b400.0504 (learnt)
    Owner ID is 0021.a009.993c
    Redirection disabled
    Time to live: 583.976 sec (maximum 585 sec)
    Preemption enabled, min delay 30 sec
    Active is 192.168.0.15 (secondary), weighting 200 (expires in 8.976 sec)
router1#

Note that the owner for forwarder 3 has changed to the MAC of router5.  Why this happened, I can’t be sure, but it does seem odd.  Nothing from my packet captures or debugs seemed to show why.  Also note that, in my testing, the router with the highest IP address always won an AVF preemption event.  This means that we could end up with a scenario like this…

image

Where you have very uneven load balancing occurring between AVFs.  You’d think that GLBP would have a mechanism in play to level the playing field when something like this occurs.

I also find it interesting that when a router is removed using weight, it’s never actually pulled from the GLBP table.  That is, the AVG still thinks that the original owner owns the virtual MAC despite the forwarder timeout expiring.  If you want to pull a router out of GLBP entirely, you need to completely cut connectivity between the AVG and the AVF.  If you do, after the forwarder timeout expires, GLBP will pull the router out of the group entirely.  If there is another router to take its place, that router will take the position.  If there’s not, the forwarder count will reduce and the virtual MAC for the legacy forwarder will be removed…

image

Now, we should probably talk about how GLBP actually handles traffic.  GLBP balances traffic across AVFs by using virtual MAC addresses.  As you’ve seen, each router in the GLBP group will be assigned a virtual MAC…

image

In the case of GLBP, the MACs will always start with 0007.b400.  The last two octets represent the GLBP group (05 in our case) followed by the forwarder number (01 through 04).  Once the virtual MACs are assigned, it’s up to the AVG to answer the client’s ARP requests with one of the virtual MACs assigned to the AVFs.  The manner in which this is done can be changed by manipulating the load balancing method…

image

The default load balancing method is round-robin.  I think we all know what that is so I won’t explain it.  Weighted uses each AVF’s defined weight to send a proportionate amount of traffic to that AVF.  Host-dependent ensures that the same virtual MAC is always given to the same client.
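The three methods can be pictured as three different ways for the AVG to pick which virtual MAC goes into an ARP reply. The Python sketch below is a toy model only: the function names are made up, the weighted picker is just one simple way to get proportional selection, and the CRC-based hash in `host_dependent` is an assumption standing in for whatever IOS really does with the client’s MAC.

```python
from itertools import cycle
from zlib import crc32

VMACS = ["0007.b400.0501", "0007.b400.0502", "0007.b400.0503", "0007.b400.0504"]


def round_robin(vmacs):
    """Yield the virtual MACs in order, over and over."""
    return cycle(vmacs)


def weighted(weights):
    """Yield virtual MACs in proportion to their weights.

    weights maps virtual MAC -> weight (must be positive). Each pick goes
    to the MAC with the lowest replies-per-weight ratio so far.
    """
    counts = {mac: 0 for mac in weights}
    while True:
        mac = min(weights, key=lambda m: counts[m] / weights[m])
        counts[mac] += 1
        yield mac


def host_dependent(client_mac, vmacs):
    """Always map the same client MAC to the same virtual MAC (hash is assumed)."""
    return vmacs[crc32(client_mac.encode()) % len(vmacs)]


rr = round_robin(VMACS)
print([next(rr) for _ in range(5)])
# ['0007.b400.0501', '0007.b400.0502', '0007.b400.0503', '0007.b400.0504', '0007.b400.0501']

w = weighted({"0007.b400.0501": 100, "0007.b400.0502": 200})
print([next(w) for _ in range(6)])  # 0502 is handed out twice as often as 0501

# The same client always gets the same answer.
print(host_dependent("0011.2233.4455", VMACS) == host_dependent("0011.2233.4455", VMACS))  # True
```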

Let’s take a look at what happens when a client tries talking to the GLBP VIP in our original example…

image

Let’s assume that host 192.168.0.157 is trying to talk to 10.0.0.1 over at the data center.  GLBP group 5 is fronting the default gateway for the 192.168.0.0/24 subnet at .1.  So let’s look at this in action.  Here is the client’s ARP request…

image

And here is the ARP reply…

image

The ARP request is pretty straightforward, but the ARP reply is the interesting part.  Note that the reply came from 0018.19f3.86fa, which just so happens to be the MAC of router1’s (the AVG) FA0/0 interface…

router1#show int fa0/0
FastEthernet0/0 is up, line protocol is up
  Hardware is Gt96k FE, address is 0018.19f3.86fa (bia 0018.19f3.86fa)
  Internet address is 192.168.0.11/24
  MTU 1500 bytes, BW 100000 Kbit, DLY 100 usec,
     reliability 255/255, txload 1/255, rxload 1/255

You can see that the reply contains the MAC address of 0007.b400.0503, the virtual MAC of GLBP forwarder 3…

image
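For reference, the virtual MACs seen throughout this post can be built and decoded with a couple of small Python helpers. The layout used here (fixed 0007.b400 prefix, then one octet for the group and one for the forwarder) is inferred from the MACs in the outputs above; the function names are invented, and groups above 255 are deliberately left out of this sketch.

```python
PREFIX = "0007b400"


def build_glbp_vmac(group, forwarder):
    """Return the virtual MAC for a group/forwarder pair in Cisco dotted notation."""
    if not 0 <= group <= 255:
        raise ValueError("this sketch only handles groups 0-255")
    if not 1 <= forwarder <= 4:
        raise ValueError("forwarder number must be 1-4")
    return f"0007.b400.{group:02x}{forwarder:02x}"


def parse_glbp_vmac(mac):
    """Split a virtual MAC such as 0007.b400.0503 into (group, forwarder)."""
    digits = mac.replace(".", "").lower()
    if len(digits) != 12 or not digits.startswith(PREFIX):
        raise ValueError(f"not a GLBP virtual MAC of the form 0007.b400.GGFF: {mac}")
    return int(digits[8:10], 16), int(digits[10:12], 16)


print(parse_glbp_vmac("0007.b400.0503"))  # (5, 3)
print(build_glbp_vmac(5, 3))              # 0007.b400.0503
```

So the MAC in the ARP reply decodes to group 5, forwarder 3, which matches what the capture shows.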

I’m not going to dig into the other load balancing methods since they are rather straightforward, but I would like to close with some comments on GLBP.

After spending close to a week testing the protocol, I can honestly say I’m not a fan.  Here’s why…

-I don’t care for the way that AVFs are elected and that they can’t truly be actively preempted. 

-I found through my testing that I wasn’t always receiving the same results.  For instance, one time I shut off all of the FA0/0 interfaces on all of the GLBP routers on the switch side (using the interface range command).  I then turned them all back on at the same time.  Despite having priorities set, router2 became the AVG.  There were just too many times that GLBP didn’t do what it should have done.

-The config seems sticky.  I ran a couple of different code versions (12.x and 15.x) and both seemed to experience random GLBP config problems.  For instance, there were two different times that a router wouldn’t let me remove the GLBP IP command (glbp 5 ip 192.168.0.1) from an interface.  There were even more times that it wouldn’t let me put one on.  A reboot always solved the problem.

-It seemed that even when an AVF shouldn’t be forwarding due to a tracked interface failing, it sometimes temporarily came back into the group.  No idea why, but I saw it happen several times.

Bottom line, I’m glad I spent the time researching this one.  It’s not a protocol I had ever used before in production environments and it certainly operated differently than I had thought it would.  I can’t say that I’d ever recommend putting this in anywhere though; there are too many inconsistencies.
 

5 thoughts on “GLBP – Gateway Load Balancing Protocol”

  1. Michael

    Nice write-up, mate. I think the idea of AVF non-preemption becomes a bit clearer if you think of the AVF as the representation of a router. It will always try to claim its own AVF as long as the weighting requirements are met.

  2. kurt fedee

    I thought that the router with the next highest weight would preempt the failed AVF. If the weights were the same (default 100) then any of the other available AVFs could randomly become the AVF for the failed one after the 30 sec default forwarder preempt delay has expired. This action can be influenced by changing the default timers on the AVF you would like to succeed as the preemptor, i.e. “(config t)# glbp 1 forwarder preempt delay minimum 20”. For example, if this is configured on R2, then R2 will become the AVF for R1’s MAC because its timer would have expired before the remaining AVFs’ timers.

