Interest in network automation has grown a lot during the last couple of years. More and more people are using Ansible, Salt or Nornir to configure their network devices. However, these tools tend to focus on devices which are already up and running. A lot of organizations still don’t have a good way to provision new devices or to replace broken ones. This guide aims to address this issue and help people who want to do zero-touch deployments of their network devices.
While there will be quite a lot of handholding throughout the tutorial, it is going to be easier if you, as a reader, are somewhat familiar with Linux. Being comfortable with installing Python packages will also help. If you have gotten started with network automation and know how to push configuration to your devices, but are unsure about how to provision new ones, you are the target audience.
During the last couple of years Cisco has come out with a few different methods of provisioning new devices. First, there was the Plug and Play app within APIC-EM, and later their zero-touch deployment concept, which uses the Python guestshell. A problem with those two options is that only newer Cisco devices are supported. The other obvious problem is that there’s a good chance that you have other vendors within your network. This tutorial will use what Cisco calls AutoInstall, where a DHCP server points to a TFTP server from which the configuration is downloaded. It will, however, be done in a flexible way which will allow you to hand off the device to your other automation solution once done.
While this guide has been written with Cisco IOS in mind, it should be applicable to just about any type of network device.
The goal is to help you understand how devices can be provisioned and give you different options. There’s a good chance that the workflow presented here won’t be a perfect fit for your organization, but that’s not really the point. What I want is for you to see how it can be done and find a way to expand it to suit your needs.
I hope you enjoy it. :)
The code used in this tutorial is available on GitHub. Please raise an issue there if there is anything that doesn't work. Also, if you like or dislike this tutorial, I would be grateful if you could let me know. Any feedback is welcome!
If you found this tutorial useful, please share it with your peers.
Start from the beginning or choose your chapter
It’s possible to set up a zero-touch provisioning system using only switches and routers. As you can imagine, this doesn’t provide much flexibility.
The network we are going to provision is very small; in fact, it fits on the desktop in my home office. The main reason why I chose this equipment is that I had it all at home and that none of it has fans. If you are following along as we set things up, you probably won’t have the same devices, as they are basically end of life. It won’t, however, be hard to make small modifications so that it works with your devices. Mostly it will be a matter of interface configuration.
Before starting this step, please note that as previously mentioned a Linux server is required. The other requirement is that you have to be running Python 3.
We will be using ISC DHCP as our server. While this guide describes the steps using Ubuntu 16.04, it should be similar enough on other Linux distributions.
While testing, I would recommend that you start by having console access to your device so that you can see what’s going on. As mentioned previously, my first device will be a Catalyst 2960 switch, but you can use anything when following along. As long as you make slight changes to the configurations to suit your environment, it will work just fine.
There’s one person who might laugh at you when you say you are doing zero-touch provisioning. Can you guess who it is?
The problem described in the installer feedback section was that some devices send several requests to the TFTP server. Since we are using that event as the trigger, we want to avoid sending notifications and starting other tasks twice.
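One way to picture this is a small filter that remembers when each device last triggered an event and ignores repeats inside a time window. The sketch below is only an illustration and not the tutorial's actual implementation: the class name `RequestDeduplicator`, the 60-second window, and the in-memory dictionary are all assumptions. If several processes handle requests, the timestamps would have to live in a shared store instead.

```python
import time


class RequestDeduplicator:
    """Suppress repeated triggers from the same device within a time window.

    Hypothetical helper: the name, the window length and the in-memory
    storage are illustrative assumptions, not part of the tutorial's code.
    """

    def __init__(self, window_seconds=60.0, clock=time.monotonic):
        self.window_seconds = window_seconds
        self._clock = clock          # injectable so the logic can be tested
        self._last_trigger = {}      # device IP -> time of last accepted trigger

    def should_trigger(self, device_ip):
        """Return True only for the first request seen inside the window."""
        now = self._clock()
        last = self._last_trigger.get(device_ip)
        if last is not None and (now - last) < self.window_seconds:
            # A request from this device was already handled recently.
            return False
        self._last_trigger[device_ip] = now
        return True
```

The TFTP handler would call `should_trigger()` with the address of the requesting device and only send notifications or start follow-up tasks when it returns `True`.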
As of yet, we haven’t done anything about the actual configuration of the device. Before we can apply the desired configuration we need some kind of input data. Lots of things could potentially be used for this including:
You might have noticed that we defined the staging credentials in two places when connecting to the device: in the network-confg file, which is a static file, and in /opt/ztp/app/network.py. There are a few problems with this, ranging from security concerns depending on where we store the files, to practical issues such as having to change the password in two places.
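A common way to avoid the duplication is to keep the credentials in one place and have both the configuration template and the connection code read from it. The following is a minimal sketch under assumptions: the environment variable names `ZTP_STAGING_USER` and `ZTP_STAGING_PASSWORD` and both function names are hypothetical, and the tutorial itself may solve this differently.

```python
import os


def load_staging_credentials(env=os.environ):
    """Return (username, password) from a single source of truth.

    The variable names are illustrative assumptions; any secret store
    that both consumers can read would serve the same purpose.
    """
    try:
        return env["ZTP_STAGING_USER"], env["ZTP_STAGING_PASSWORD"]
    except KeyError as missing:
        raise RuntimeError(f"missing environment variable: {missing}") from None


def render_staging_user_line(username, password):
    """Build the user line for the staging configuration sent to the device."""
    return f"username {username} privilege 15 secret {password}"
```

With this in place, the code that connects to the device and the code that generates the staging configuration both call `load_staging_credentials()`, so a password change only has to happen once.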
I dare say that most organizations know less about their network than they want to, especially when there’s a need to describe the network or use the information in any way. In theory, there might be a lot of things that are known about the network. However, often it’s hidden away in someone’s head, or some wiki page, old Visio diagrams or just lost to the ages.
Up until now, the TFTP server has only sent out the network-confg file, as that is the one that new devices are requesting. Even if we can use dynamic templates, the ZTP server doesn’t have a lot of information to act upon. We aren’t, however, limited to the requested filename network-confg; we also have the IP address of the requesting device. The IP address can tell us where the device is located logically. The other obvious way would be to use static assignments for the device, but this would require us to know the MAC address of each device we are going to provision.
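As a rough illustration of how an IP address can be turned into a logical location, the sketch below matches the requesting address against a table of subnets using Python's standard `ipaddress` module. The subnets, the site names and the function name `site_for_ip` are made up for the example and would need to reflect your own DHCP scopes.

```python
import ipaddress

# Hypothetical mapping of provisioning subnets to logical locations.
SITE_SUBNETS = {
    "10.10.0.0/24": "site-a",
    "10.20.0.0/24": "site-b",
}


def site_for_ip(address, subnets=SITE_SUBNETS):
    """Return the site whose subnet contains the address, or None."""
    ip = ipaddress.ip_address(address)
    for cidr, site in subnets.items():
        if ip in ipaddress.ip_network(cidr):
            return site
    return None  # unknown subnet: fall back to a generic template


print(site_for_ip("10.10.0.17"))
```

The returned site name could then be passed to the template as extra input, so that a device requesting network-confg from one subnet gets different settings than a device on another.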
While things often feel better if you’re always positive and believe that things will work, in reality, it’s better to plan for failure. There are several things which can break down or go wrong in a ZTP solution.
Saving the configuration to flash makes sure that everything keeps working once devices are reloaded. Like a lot of things in this tutorial, there are many ways to solve this problem. All of the configuration commands get applied in config mode on the Cisco device. Since you save the configuration in enable mode with “write mem” or “copy run start”, you don’t have direct access to those commands from config mode. There are, however, workarounds, so you could have this in your template:
As you never know when you need to apply a security patch to your devices, you should have a good way to upgrade devices at any time. You do, however, also want the ability to control which image will be used as you are provisioning new devices. This could be just to comply with your standard, or it could be because you have to run a specific version, since the configuration you want to use might not be available in the version installed on the device from the beginning.
While we have been setting up the ZTP application we have been running the TFTP server and RQ worker in separate windows just by running the commands. This works fine when testing things and developing the service, but if we are to use this in production we need something more robust.
If you’ve made it this far, thank you for your time! I hope you found this tutorial useful!