Linux

In the first post of this series we talked about some of the CNI basics.  We then followed that up with a second post showing a more real world example of how you could use CNI to network a container.  We’ve covered IPAM lightly at this point since CNI relies on it for IP allocation but we haven’t talked about what it’s doing or how it works.  In addition – DNS was discussed from a parameter perspective in the first post where we talked about the CNI spec but that’s about it.  The reason for that is that CNI doesn’t actually configure container DNS.  Confused?  I was too.  I mean why is it in the spec if I can’t configure it?

To answer these questions, and see how IPAM and DNS work with CNI, I think a deep dive into an actual CNI implementation would be helpful.  That is – let’s look at a tool that actually implements CNI to see how it uses it.  To do that we’re going to look at the container runtime from the folks at CoreOS – Rocket (rkt).  Rkt can be installed fairly easily using this set of commands…
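Something along these lines will do the trick. Note that the version and URL below are only an example – grab whatever the current release is from the rkt GitHub releases page…

    # Download and unpack a rkt release (the version shown is only an example - check the releases page)
    wget https://github.com/coreos/rkt/releases/download/v1.25.0/rkt-v1.25.0.tar.gz
    tar -xzvf rkt-v1.25.0.tar.gz
    cd rkt-v1.25.0
    # either run it from here as ./rkt, or copy the directory somewhere and put the binary on your PATH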

After you install rkt check to make sure it’s working…
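For instance (assuming you’re running rkt straight out of the directory you unpacked it into)…

    ./rkt version
    sudo ./rkt list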

Note: This post is not intended to be a ‘How to get started with rkt’ guide.  I might do something similar in the future but right now the focus is on CNI.

Great, so now what? I mentioned above that rkt implements CNI. In other words, rkt uses CNI to configure a container’s network interface.  Before we jump into that though – let’s talk about what’s already in place from the work we did in the first two posts. Let’s take a look at some files on the system to see what CNI has done up to this point…
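Here’s roughly what that looks like on my host (your output will vary depending on what you did in the earlier posts)…

    sudo -s
    ls /var/lib/cni/networks/
    # mybridge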

Notice we switched over to the root user to make looking at these files easier. If we look in the ‘/var/lib/cni/networks’ path we should see a directory using the name of the network we defined. If you go back and look at the two previous posts you’ll notice that despite the networks being different – I neglected to change the name of the network between definitions. I only changed the ‘bridge’ parameter. If we look in the ‘mybridge’ folder we should see a few files…
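Something like this…

    cd /var/lib/cni/networks/mybridge
    ls
    # 10.15.20.2  10.15.30.100  last_reserved_ip
    cat 10.15.20.2
    # 1234567890
    cat 10.15.30.100
    # 1018026ebc02fa0cbf2be35325f4833ec1086cf6364c7b2cf17d80255d7d4a27
    cat last_reserved_ip
    # 10.15.30.100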

Looking at the files we see some familiar values. The ‘10.15.20.2’ file has ‘1234567890’ in it which is the name of the network namespace from the first post. The ‘10.15.30.100’ file has the value of ‘1018026ebc02fa0cbf2be35325f4833ec1086cf6364c7b2cf17d80255d7d4a27’ which is the container ID we passed to CNI when we connected a Docker container with CNI in the second post. The last file is called ‘last_reserved_ip’ and has the value of 10.15.30.100 in it.  The last_reserved_ip file is sort of a helper file to tell CNI what the next IP is that it can allocate.  In this case, since the last IP was allocated out of the 10.15.30.0/24 network it lists that IP address.

So why are these files here?  Well they’re here because in both of the previous posts we told CNI to use the ‘host-local’ IPAM driver.  This is what host-local does – it stores all of the allocation information locally on the host.  Pretty straightforward.  Let’s create another network definition on this host and use it in conjunction with rkt so you can see it in action…

The first thing we want to do is to create a new network definition.  In the previous posts, we were storing that in our ‘~/cni’ directory and passing it directly to the CNI plugin.  In this case, we want rkt to consume the configuration so we need to put it where rkt can find it.  By default, rkt searches for network configuration files in ‘/etc/rkt/net.d/’.  So we’ll create the ‘net.d’ directory and then create this new network configuration in it.  Notice that the name of this network is ‘customrktbridge’.  Now let’s run a simple container on the host using rkt…
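Here’s a sketch of what that might look like. The network name ‘customrktbridge’ is the part that matters; the bridge name and subnet below are just values I picked for this example. Save something like this as ‘/etc/rkt/net.d/10-customrktbridge.conf’…

    {
        "cniVersion": "0.2.0",
        "name": "customrktbridge",
        "type": "bridge",
        "bridge": "cni_rktbridge0",
        "isGateway": true,
        "ipMasq": true,
        "ipam": {
            "type": "host-local",
            "subnet": "10.15.40.0/24",
            "routes": [
                { "dst": "0.0.0.0/0" }
            ]
        }
    }

…and then start the container (we’ll walk through the command itself in a second)…

    sudo rkt run --interactive --net=customrktbridge quay.io/coreos/alpine-sh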

To exit the container’s interactive shell press Ctrl + ] three times (^]^]^])

The command we executed told rkt to run a container in interactive mode, using the network ‘customrktbridge’, from the image ‘quay.io/coreos/alpine-sh’.  Once the container was running we looked at its interfaces and found that in addition to a loopback interface, it also has an eth0 and an eth1 interface.  Eth0 lines up with what we defined as part of our custom CNI network, but what about eth1?  Well eth1 is an interface on what rkt refers to as the ‘default-restricted’ network.  This is one of the built-in network types that rkt provides by default.  So now you’re wondering what rkt provides by default.  Rkt defines two networks by default – the ‘default’ and the ‘default-restricted’ networks.  As you might expect, the definitions for these networks are CNI network definitions and you can take a look at them right here in the GitHub repo.  Let’s review them quickly so we can get an idea of what each provides…
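The ‘default’ network definition looks more or less like this (paraphrased – see the repo link above for the authoritative version)…

    {
        "cniVersion": "0.1.0",
        "name": "default",
        "type": "ptp",
        "ipMasq": true,
        "ipam": {
            "type": "host-local",
            "subnet": "172.16.28.0/24",
            "routes": [
                { "dst": "0.0.0.0/0" }
            ]
        }
    }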

The above CNI network definition describes the default network.  We can tell that this network uses the ‘ptp’ CNI driver, enables outbound masquerading, uses the host-local IPAM plugin, allocates container IPs from the 172.16.28.0/24 subnet, and installs a default route in the container.  Most of this seems pretty straight forward except for the ptp type.  That’s something we haven’t talked about yet but for now just know that it creates a VETH pair for each container.  One end lives on the host and the other lives in the container.  This is different from the default Docker model where the host side of the VETH pair goes into the docker0 bridge which acts as the container’s gateway.  In the ptp case, the host side VETH pairs are IP’d.  In fact – they’re IP’d using the same IP.  If you created multiple containers with rkt using the default network you’d see a bunch of VETH pair interfaces on the host all with 172.16.28.1/24.  In addition, you’d see routes for each container on the host pointing to the host side VETH pair for each destination IP in the container.
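And the ‘default-restricted’ definition looks more or less like this (again paraphrased from the repo)…

    {
        "cniVersion": "0.1.0",
        "name": "default-restricted",
        "type": "ptp",
        "ipMasq": false,
        "ipam": {
            "type": "host-local",
            "subnet": "172.17.0.0/16"
        }
    }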

The above shows the CNI network definition for the default-restricted network which is what we saw in our output above.  We can tell this network uses the ptp CNI driver, disables outbound masquerading, uses the host-local IPAM plugin, and allocates container IPs out of the 172.17.0.0/16 subnet.  So the real question is why does our container have an interface on this network?  The answer lies in the docs (taken from here)…

The default-restricted network does not set up the default route and IP masquerading. It only allows communication with the host via the veth interface and thus enables the pod to communicate with the metadata service which runs on the host. If default is not among the specified networks, the default-restricted network will be added to the list of networks automatically. It can also be loaded directly by explicitly passing --net=default-restricted.

So that interface is put there intentionally for communication with the metadata service.  Again – this article isn’t intended to be a deep dive on rkt networking – but I felt it was important to explain where all of the container interfaces come from.  Ok – So now that we ran our container – let’s now go and look at our ‘/var/lib/cni/networks’ directory again…
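A quick look (the exact IP files you see will depend on the subnets you used)…

    ls /var/lib/cni/networks/
    # customrktbridge  default-restricted  mybridge
    ls /var/lib/cni/networks/customrktbridge/
    # 10.15.40.2  last_reserved_ip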

This is what I’d expect to see. Rkt launched a container using CNI that ended up having two interfaces. One of which was the ‘customrktbridge’ network we defined and the other was the ‘default-restricted’ network that rkt connected for us by default. Since both networks use the host-local IPAM driver they both got folders in ‘/var/lib/cni/networks/’ and they both have entries showing the assigned IP address as well as the container ID.

If you did a ‘sudo rkt list --full’ you’d see the full container ID which is ‘8d7152a7-9c53-48d8-859e-c8469d5adbdb’

At this point – we’ve shown how rkt uses CNI to provision container networks and how the host-local IPAM driver stores that information on the host locally.  You might now be wondering if there are other options for IPAM (I know I was).  If so – you’re in luck because by default, CNI also comes with the DHCP IPAM plugin.  So let’s take a look at a custom CNI network definition that uses DHCP for IPAM…
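Here’s the definition I’m using – saved as ‘/etc/rkt/net.d/10-customdhcpnetwork.conf’ in my case (the network name and file name are my choices; ‘ens32’ needs to match a physical interface on your host)…

    {
        "cniVersion": "0.2.0",
        "name": "customdhcpnetwork",
        "type": "macvlan",
        "master": "ens32",
        "ipam": {
            "type": "dhcp"
        }
    }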

There are again some new things in this CNI network definition. Namely – you should see that the type of this network is defined as macvlan. In order to use an external DHCP service we need to get the container’s network interface right onto the physical network. The easiest way to do this is to use macvlan which will put the container’s interface directly onto the host network. This isn’t a post on macvlan so I’ll be leaving the details of how that works out. For now just know that this works by using the host’s interface (in this case ens32) as the parent or master interface for the container’s interface. You’ll also note that we are now using an IPAM type of dhcp rather than host-local. The dhcp plugin acts just the way you’d expect – it relies on an external DHCP server to get IP address information for the container. The only catch is that for this to work we need to run CNI’s DHCP daemon to allow the container to get a DHCP address. The DHCP daemon acts as a proxy between the client in the container and the DHCP service that already exists on your network. If you’ve completed the first two posts in this series you already have that binary in your ~/cni directory. To test this we’ll need two SSH sessions to our server. In the first, we’ll start CNI’s DHCP binary…
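In my case that looks like this…

    cd ~/cni
    sudo ./dhcp daemon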

Since we’re just running the executable here the process will just hang until it needs to do something. In our second window, let’s start a new container using our new network definition…
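For example (using the network name from the definition above and the same Alpine image as before)…

    sudo rkt run --interactive --net=customdhcpnetwork quay.io/coreos/alpine-sh
    # then, inside the container, check the interface and routing table:
    ifconfig
    ip route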

In this case, my DHCP server is allocating IP addresses out of 10.20.30.0/24 so our container ended up with 10.20.30.152. However, if we check the routing table, we’ll see that the container does not have a default route (this seems like something that should work so I opened a GH issue on it here.  In other words – there’s a chance I’m doing this wrong but I don’t think I am)…

My assumption was that this should have been added by the DHCP plugin and captured as a DHCP option but it was not. If we look back at our first window we can see that the DHCP daemon is working though…

So we can see how the DHCP plugin can work – but in its current state it doesn’t seem quite usable to me.  I will stress that the CNI plugins provided by default are meant to showcase the possibilities for what CNI can do. I don’t believe all of them are meant to be, or are, used in ‘production’.  As we’ll see in later posts – other systems use CNI and write their own CNI-compatible plugins.

So what about DNS? We haven’t touched on that yet. Do you recall from our first and second posts that when we manually ran the CNI plugin we got a JSON return? Here’s a copy and paste from the first post of the output I’m referring to…
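It looked something like this (your addresses will differ – the piece to focus on is the ‘dns’ key at the bottom)…

    {
        "ip4": {
            "ip": "10.15.20.2/24",
            "gateway": "10.15.20.1",
            "routes": [
                { "dst": "0.0.0.0/0" },
                { "dst": "1.1.1.1/32", "gw": "10.15.20.1" }
            ]
        },
        "dns": {}
    }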

See that empty DNS dictionary at the bottom? It’s empty because we were using the host-local IPAM driver which doesn’t currently support DNS. But what does supporting DNS even mean in the context of CNI? It doesn’t mean what I thought it meant initially. My assumption was that I could pass DNS related parameters to CNI and have it install those settings (DNS name server, search domain, etc) in the container. That was an incorrect assumption. The DNS parameters are return parameters that CNI can pass to whatever invoked it. In the case of DHCP – you could see how that would be useful as CNI could return information it learned from the DHCP server back to rkt in order to configure DNS in the container. Unfortunately, neither of the bundled IPAM drivers (host-local and DHCP) currently supports returning DNS related information, which is why you see an empty DNS dictionary in the CNI JSON response.  There is a current PR in the repo for adding this functionality to the DHCP plugin so if and when that happens we’ll revisit it.

Next up we’re going to revisit another system that uses CNI (cough, Kubernetes, cough).

Using CNI with Docker

In our last post we introduced ourselves to CNI (if you haven’t read that yet, I suggest you start there) as we worked through a simple example of connecting a network namespace to a bridge.  CNI managed both the creation of the bridge as well as connecting the namespace to the bridge using a VETH pair.  In this post we’ll explore how to do this same thing but with a container created by Docker.  As you’ll see, the process is largely the same.  Let’s jump right in.

This post assumes that you followed the steps in the first post (Understanding CNI) and have a ‘cni’ directory (~/cni) that contains the CNI binaries.  If you don’t have that – head back to the first post and follow the steps to download the pre-compiled CNI binaries.  It also assumes that you have a default Docker installation.  In my case, I’m using Docker version 1.12.

The first thing we need to do is to create a Docker container.  To do that we’ll run this command…
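Something like this (the image and container name here are just placeholders – any image that runs a simple web service will work)…

    # 'cnitest' and the busybox httpd image are stand-ins for this walkthrough
    sudo docker run --name cnitest --net=none -d busybox httpd -f -p 80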

Notice that when we ran the command we told Docker to use a network of ‘none’. When Docker is told to do this, it will create the network namespace for the container, but it will not attempt to connect the container’s network namespace to anything else.  If we look in the container we should see that it only has a loopback interface…
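For example (using the placeholder container name from above)…

    sudo docker exec cnitest ifconfig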

So now we want to use CNI to connect the container to something. Before we do that we need some information. Namely, we need a network definition for CNI to consume as well as some information about the container itself.  For the network definition, we’ll create a new definition and specify a few more options to see how they work.  Create the configuration with this command (I assume you’re creating this file in ~/cni)…
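I’m saving mine as ‘~/cni/mybridge2.conf’ with contents along these lines (the file name and the exact range values are my choices – the rest mirrors what we verify below)…

    {
        "cniVersion": "0.2.0",
        "name": "mybridge",
        "type": "bridge",
        "bridge": "cni_bridge1",
        "isGateway": true,
        "ipMasq": true,
        "ipam": {
            "type": "host-local",
            "subnet": "10.15.30.0/24",
            "rangeStart": "10.15.30.100",
            "rangeEnd": "10.15.30.200",
            "gateway": "10.15.30.99",
            "routes": [
                { "dst": "0.0.0.0/0" },
                { "dst": "1.1.1.1/32", "gw": "10.15.30.1" }
            ]
        }
    }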

In addition to the parameters we saw in the last post, we’ve also added the following…

  • rangeStart: Defines where CNI should start allocating container IPs from within the defined subnet
  • rangeEnd: Defines the end of the range CNI can use to allocate container IPs
  • gateway: Defines the IP address CNI should assign to the bridge, which the containers then use as their gateway.  Previously we hadn’t defined this so CNI picked the first IP in the subnet for the bridge interface.

One thing you’ll notice that’s lacking in this configuration is anything related to DNS.  Hold that thought for now (it’s the topic of the next post).

So now that the network is defined we need some info about the container. Specifically we need the path to the container network namespace as well as the container ID. To get that info, we can grep the info from the ‘docker inspect’ command…
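In my case that looks like this (again using the placeholder container name)…

    sudo docker inspect cnitest | grep -E 'SandboxKey|Id'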

In this example I used the ‘-E’ flag with grep to enable extended regular expression matching since I’m looking for both the container ID as well as the SandboxKey. In the world of Docker, the network namespace file location is referred to as the ‘SandboxKey’ and the ‘Id’ is the container ID assigned by Docker.  So now that we have that info, we can build the environmental variables that we’re going to use with the CNI plugin.  Those would be…

  • CNI_COMMAND=ADD
  • CNI_CONTAINERID=1018026ebc02fa0cbf2be35325f4833ec1086cf6364c7b2cf17d80255d7d4a27
  • CNI_NETNS=/var/run/docker/netns/2e4813b1a912
  • CNI_IFNAME=eth0
  • CNI_PATH=`pwd`

Put that all together in a command and you end up with this…
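Something like this (assuming the configuration file from earlier is saved as ‘mybridge2.conf’)…

    # assumes the config file from earlier was saved as mybridge2.conf
    sudo CNI_COMMAND=ADD CNI_CONTAINERID=1018026ebc02fa0cbf2be35325f4833ec1086cf6364c7b2cf17d80255d7d4a27 CNI_NETNS=/var/run/docker/netns/2e4813b1a912 CNI_IFNAME=eth0 CNI_PATH=`pwd` ./bridge < mybridge2.conf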

The only thing left to do at this point is to run the plugin…

As we saw in the last post, the plugin executes and then provides us some return JSON about what it did.  So let’s look at our host and container again to see what we have…

From a host perspective, we have quite a few interfaces now. Since we picked up right where we left off with the last post we still have the cni_bridge0 interface along with its associated VETH pair. We now also have the cni_bridge1 bridge that we just created along with its associated VETH pair interface.  You can see that the cni_bridge1 interface has the IP address we defined as the ‘gateway’ as part of the network configuration.   You’ll also notice that the docker0 bridge is there since it was created by default when Docker was installed.

So now what about our container?  Let’s look…
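For instance…

    sudo docker exec cnitest ifconfig eth0
    sudo docker exec cnitest ip route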

As you can see, the container has the network configuration we’d expect…

  • It has an IP address within the defined range (10.15.30.100)
  • Its interface is named ‘eth0’
  • It has a default route pointing at the gateway IP address of 10.15.30.99
  • It has an additional route for 1.1.1.1/32 pointing at 10.15.30.1

And as a final quick test we can attempt to access the service in the container from the host…
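Since the container got 10.15.30.100, something like this should do it (what exactly comes back depends on the service your container is running)…

    curl http://10.15.30.100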

So as you can see – connecting a Docker container wasn’t much different from connecting a network namespace. In fact – the process was identical, we just had to account for where Docker stores its network namespace definitions. In our next post we’re going to talk about DNS related settings for a container and how those play into CNI.

If you’ve been paying attention to the discussions around container networking you’ve likely heard the acronym CNI being used.  CNI stands for Container Networking Interface and its goal is to create a generic plugin-based networking solution for containers.  CNI is defined by a spec (read it now, it’s not very long) that has some interesting language in it.  Here are a couple of points I found interesting during my first read through…

  • The spec defines a container as being a Linux network namespace.  We should be comfortable with that definition as container runtimes like Docker create a new network namespace for each container.
  • Network definitions for CNI are stored as JSON files.
  • The network definitions are streamed to the plugin through STDIN.  That is – there are no configuration files sitting on the host for the network configuration.
  • Other arguments are passed to the plugin via environmental variables
  • A CNI plugin is implemented as an executable.
  • The CNI plugin is responsible for wiring up the container.  That is – it needs to do all the work to get the container on the network.  In Docker, this would include connecting the container network namespace back to the host somehow.
  • The CNI plugin is responsible for IPAM which includes IP address assignment and installing any required routes.

If you’re used to dealing with Docker this doesn’t quite seem to fit the mold.  It’s apparent to me that the CNI plugin is responsible for the network end of the container, but it wasn’t initially clear to me how that was actually implemented.  So the next question might be, can I use CNI with Docker?  The answer is yes, but not as an all in one solution.  Docker has its own network plugin system called CNM.  CNM allows plugins to interact directly with Docker.  A CNM plugin can be registered to Docker and used directly from it.  That is, you can use Docker to run containers and directly assign their network to the CNM registered plugin.  This works well, but because Docker has CNM, it doesn’t directly integrate with CNI (as far as I can tell).  That does not mean however, that you can’t use CNI with Docker.  Recall from the sixth bullet above that the plugin is responsible for wiring up the container.  So it seems possible that Docker could be the container runtime – but not handle the networking end of things (more on this in a future post).

At this point – I think it’s fair to start looking at what CNI actually does to try to get a better feel for how it fits into the picture.  Let’s look at a quick example of using one of the plugins.

Let’s start by downloading the pre-built CNI binaries…
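The commands below are an example – the release version and archive name may have changed by the time you read this (the plugins have since moved to the ‘containernetworking/plugins’ repo), so check the releases page…

    # the version shown here is an example - check the releases page for the current bundle
    mkdir cni
    cd cni
    curl -O -L https://github.com/containernetworking/cni/releases/download/v0.4.0/cni-amd64-v0.4.0.tgz
    tar -xzvf cni-amd64-v0.4.0.tgz
    ls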

Ok – let’s make sure we understand what we just did there.  We first created a directory called ‘cni’ to store the binaries in.  We then used the curl command to download the CNI release bundle.  When using curl to download a file we need to pass the ‘-O’ flag to tell curl to output to a file.  We also need to pass the ‘-L’ flag in this case to allow curl to follow redirects since the URL we’re downloading from is actually redirecting us elsewhere.  Once downloaded, we unpack the archive using the tar command.

After all of that we can see that we have a few new files.  For right now, let’s focus on the ‘bridge’ file which is the bridge plugin.  Bridge is one of the included plugins that ships with CNI.  Its job, as you might have guessed, is to attach a container to a bridge interface.  So now that we have the plugins, how do we actually use them?  One of the earlier bullet points mentioned that network configuration is streamed into the plugin through STDIN.  So we know we need to use STDIN to get information about the network into the plugin but that’s not all the info the plugin needs.  The plugin also needs more information such as the action you wish to perform, the namespace you wish to work with, and other various information.  This information is passed to the plugin via environmental variables.  Confused?  No worries, let’s walk through an example.  Let’s first define a network configuration file we wish to use for our bridge…
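Here’s what I’m using – saved as ‘mybridge.conf’ in the ~/cni directory (the file name is my choice; we’ll verify the values later in the post)…

    {
        "cniVersion": "0.2.0",
        "name": "mybridge",
        "type": "bridge",
        "bridge": "cni_bridge0",
        "isGateway": true,
        "ipMasq": true,
        "ipam": {
            "type": "host-local",
            "subnet": "10.15.20.0/24",
            "routes": [
                { "dst": "0.0.0.0/0" },
                { "dst": "1.1.1.1/32", "gw": "10.15.20.1" }
            ]
        }
    }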

Above we create a JSON definition for our bridge network.  There are some CNI generic definitions listed above as well as some specific to the bridge plugin itself.  Let’s walk through them one at a time.

CNI generic parameters

  • cniVersion: The version of the CNI spec that the definition works with
  • name: The network name
  • type: The name of the plugin you wish to use.  In this case, the actual name of the plugin executable
  • args: Optional additional parameters
  • ipMasq: Configure outbound masquerade (source NAT) for this network
  • ipam:
    • type: The name of the IPAM plugin executable
    • subnet: The subnet to allocate out of (this is actually part of the IPAM plugin)
    • routes:
      • dst: The subnet you wish to reach
      • gw: The IP address of the next hop to reach the dst.  If not specified the default gateway for the subnet is assumed
  • dns:
    • nameservers: A list of nameservers you wish to use with this network
    • domain: The search domain to use for DNS requests
    • search: A list of search domains
    • options: A list of options to be passed to the resolver

Plugin (bridge) specific parameters

  • isGateway: If true, assigns an IP address to the bridge so containers connected to it may use it as a gateway.
  • isDefaultGateway: If true, sets the assigned IP address as the default route.
  • forceAddress: Tells the plugin to allocate a new IP address if the previous value has changed.
  • mtu: Define the MTU of the bridge.
  • hairpinMode: Set hairpin mode for the interfaces on the bridge

We’re only using a handful of these items in the example definition above.  You should play around with the others to get a feeling for how they work but most are fairly straightforward.  You’ll also note that some of the items are part of the IPAM plugin.  We aren’t going to cover those in this post (we will later!) but for now just know that we’re using multiple CNI plugins to make this work.

Ok – so now that we have our network definition, we want to run it.  However – at this point we’ve only defined characteristics of the bridge.  The point of CNI is to network containers so we need to tell the plugin about the container we want to work with as well.  These variables are passed to the plugin via environmental variables.  So our command might look like this…
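Something like this (assuming the network definition from earlier lives in ‘mybridge.conf’)…

    # assumes the network definition above was saved as mybridge.conf
    sudo CNI_COMMAND=ADD CNI_CONTAINERID=1234567890 CNI_NETNS=/var/run/netns/1234567890 CNI_IFNAME=eth12 CNI_PATH=`pwd` ./bridge < mybridge.conf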

Let’s walk through this.  I think most of you are probably familiar with using environmental variables on systems by setting them at the shell or system level.  In addition to that, you can also pass them directly to a command.  When you do this, they will be used only by the executable you are calling and only during that execution.  So in this case, the following variables will be passed to the bridge executable…

  • CNI_COMMAND=ADD – We are telling CNI that we want to add a connection
  • CNI_CONTAINERID=1234567890 – We’re telling CNI that the container (in our case, the network namespace) we want to work with is called ‘1234567890’ (more on this below)
  • CNI_NETNS=/var/run/netns/1234567890 – The path to the namespace in question
  • CNI_IFNAME=eth12 – The name of the interface we wish to use on the container side of the connection
  • CNI_PATH=`pwd` – We always need to tell CNI where the plugin executables live.  In this case, since we’re already in the ‘cni’ directory we just have the variable reference `pwd` (the present working directory).  Note that the backticks around pwd are required so that the command gets evaluated.

Once the variables you wish to pass to the executable are defined, we then pick the plugin we want to use which in this case is bridge.  Lastly – we feed the network configuration file into the plugin using STDIN.  To do this just use the input redirection character ‘<‘.  Before we run the command, we need to create the network namespace that the plugin is going to work with.  Typically the container runtime would handle this but since we’re keeping things simple this first go around we’ll just create one ourselves…
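Creating the namespace is a one-liner…

    sudo ip netns add 1234567890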

Once that’s created let’s run the plugin…

Running the command returns a couple of things.  First – it returns an error since the IPAM driver can’t find the file it uses to store IP information locally.  If we ran this again for a different namespace, we wouldn’t get this error since the file is created the first time we run the plugin.  The second thing we get is a JSON return indicating the relevant IP configuration that was configured by the plugin.  In this case, the bridge itself should have received the IP address of 10.15.20.1/24 and the namespace interface would have received 10.15.20.2/24.  It also added the default route and the 1.1.1.1/32 route that we defined in the network configuration JSON.  So let’s look and see what it did…

Notice we now have a bridge interface called ‘cni_bridge0’ which has the IP address we expected to see.  Also note at the bottom we have one side of a VETH pair.  Recall that we also asked it to enable masquerading.  If we look at our host’s iptables rules we’ll see the masquerade and accept rules…
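One quick way to check is to list the NAT table (the CNI-generated chain names on your host will differ)…

    sudo iptables -t nat -L -n -v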

Let’s now look in the network namespace…
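Something like this will show it…

    sudo ip netns exec 1234567890 ip addr
    sudo ip netns exec 1234567890 ip route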

Our namespace is also configured as we expected.  The namespace has an interface named ‘eth12’ with an IP address of 10.15.20.2/24 and the routes we defined are also there.  So it worked!

This was a simple example but I think it highlights how CNI is implemented and works.  Next week we’ll dig further into the CNI plugins as we examine an example of how to use CNI with a container runtime.

Before I wrap up – I do want to comment briefly on one item that I initially got hung up on and that’s how the plugin is actually called.  In our example – we’re calling a specific plugin directly.  As such – I was initially confused as to why you needed to specify the location of the plugins with the ‘CNI_PATH’.  After all – we’re calling a plugin directly so obviously we already know where it is.  The reason for this is that this is not how CNI is typically used.  Typically – you have another application or system that is reading the CNI network definitions and running them.  In those cases, the CNI_PATH will already be defined within the system.  Since the network configuration file defines what plugin to use (in our case bridge) all the system would need to know is where to find the plugins.  To find them, it references the CNI_PATH variable.  We’ll talk more about this in future posts where we discuss what other applications use CNI (cough, Kubernetes, cough) so for now just know that the example above shows how CNI works, but does not show a typical use case outside of testing.
