Providing feedback to the installer

  • by Patrick Ogenstad
  • April 25, 2018

There’s one person who might laugh at you when you say you are doing zero-touch provisioning. Can you guess who it is?

It’s the person who is physically installing the devices. There’s no concept of zero-touch in that regard. It could be that this person is you, or it could be someone else who doesn’t have any access to the device after it has been installed. A common problem is that the installer doesn’t have a good way to get feedback about how the provisioning is going. If that person has traveled many miles to install the device, it would be stupid to leave the site without verifying that it worked. A quick way to get that verification is to pick up the phone and call you. While you in some sense have a zero-touch provisioning solution, you’ve also created a support hotline for yourself.

Installer feedback

I’ve seen this done at scale, where the preparation for deploying new switches is a matter of quickly adding them to the inventory and then scheduling a time to be available when the installer dials in. After a couple of hundred devices, you would think someone would have grown tired of it. Wouldn’t you?

A better approach

We should keep in mind that automation, like progress in general, is an iterative process. It can be a huge win just to avoid shipping pre-configured devices, or to avoid requiring the person physically installing the device to also configure it from scratch. Once you have solved that problem, just make sure you don’t stop there.

The first thing we should ask is what kind of notifications we want, and in which situations. Some that come to mind are notifications for a) when the process has started, b) when there’s an error, along with actionable feedback, and c) when the provisioning has completed successfully.
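As a minimal sketch of those three situations, the messages could be built by one small helper. The `Stage` names and the `format_notification` function are hypothetical and only meant to illustrate the idea; they are not part of the tutorial code.

```python
from enum import Enum


class Stage(Enum):
    """Hypothetical names for the three situations listed above."""
    STARTED = 'started'
    FAILED = 'failed'
    COMPLETED = 'completed'


def format_notification(stage, device, detail=''):
    """Build a human-readable message for the installer."""
    if stage is Stage.STARTED:
        return 'Provisioning started for {}'.format(device)
    if stage is Stage.FAILED:
        # The detail should tell the installer what to do next.
        return 'Provisioning failed for {}: {}'.format(device, detail)
    if stage is Stage.COMPLETED:
        return 'Provisioning completed for {}'.format(device)
    raise ValueError('Unknown stage: {!r}'.format(stage))
```

Keeping the wording in one place makes it easier to send the same messages to something other than Slack later on.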

To initialize a process like this we need an event to trigger it. There are several of these kinds of events which could do the trick, such as:

  • SNMP trap from the upstream switch, for interface up
  • Syslog message from the upstream switch, for interface up
  • Notification from the DHCP server when it hands out an address
  • Notification from the TFTP server when a file is requested
  • Notification from the provisioned device once it’s online

As it happens, we already have a place where we can insert one of these hooks. If you remember, when we were setting up the TFTP server we defined a Python function called session_stats().

def session_stats(stats):
    print('')
    print('#' * 60)
    print('Peer: {} UDP/{}'.format(stats.peer[0], stats.peer[1]))
    print('File: {}'.format(stats.file_path))
    print('Sent Packets: {}'.format(stats.packets_sent))
    print('#' * 60)

In this function, the “stats” object gives us access to information such as the IP address of the new device and which file was requested. Currently, we’re only printing it to the screen, which is nice when we are testing things but isn’t very practical in the long run.

The actual notification we use to start a process could be anything, such as starting a job in StackStorm, ServiceNow or Salt. You will have to decide what makes sense for you. In this tutorial, I’m going to send the notification to Slack. While we could just add this code to the ztp_tftp.py file, we don’t want to clutter it too much. We might as well add this new code to a separate file and have some structure in the project.

On the TFTP server:

mkdir /opt/ztp/app
touch /opt/ztp/app/__init__.py

Then we paste the below code into /opt/ztp/app/notifications.py

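import json
import os
import requests


# Everything that follows /services/ in the incoming webhook URL
slack_token = os.environ.get('SLACK_TOKEN')

headers = {
    'Content-Type': 'application/json',
    'Accept': 'application/json',
    'User-Agent': 'ZTP Server'
}


def notify_slack(msg):
    """Post msg to the Slack incoming webhook."""
    url = 'https://hooks.slack.com/services/' + slack_token
    data = {'text': msg}
    # A timeout keeps a slow response from Slack from blocking the caller
    requests.post(url, headers=headers, data=json.dumps(data), timeout=10)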

In this file we use the Python requests module which isn’t installed by default, so we need to add it.

pip install requests

As you might have noticed, the code also requires a token for sending messages to Slack. Here I use an environment variable called SLACK_TOKEN. You also need to set up an incoming webhook in Slack. Again, note that this doesn’t have to be Slack. I only use it here because it’s an easy example; a trigger like this could point to anything.
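Since the token comes from the environment, it can be handy to fail early with a clear message when it is missing, rather than hitting a TypeError on the string concatenation later. The sketch below is only an illustration; the helper name `build_slack_request` is an assumption and not part of the tutorial code.

```python
import json
import os


def build_slack_request(msg, token=None):
    """Return the (url, body) pair for a Slack incoming webhook call.

    Hypothetical helper: reads SLACK_TOKEN when no token is passed in
    and raises a clear error if it is missing or empty.
    """
    if token is None:
        token = os.environ.get('SLACK_TOKEN')
    if not token:
        raise RuntimeError('SLACK_TOKEN is not set')
    url = 'https://hooks.slack.com/services/' + token
    body = json.dumps({'text': msg})
    return url, body
```

The returned pair can be handed to requests.post(url, headers=headers, data=body) in the same way as in notifications.py.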

The final step before we can test this is to call the new function from our main ztp_tftp.py code.

Add the new line under the imports of fbtftp:

from fbtftp.base_handler import BaseHandler
from fbtftp.base_server import BaseServer
from app.notifications import notify_slack

Then add a call to notify_slack at the end of the session_stats function.

def session_stats(stats):
    print('')
    print('#' * 60)
    print('Peer: {} UDP/{}'.format(stats.peer[0], stats.peer[1]))
    print('File: {}'.format(stats.file_path))
    print('Sent Packets: {}'.format(stats.packets_sent))
    print('#' * 60)
    if stats.packets_sent > 0:
        notify_slack(
            'New request from {}, downloading {}'.format(
                stats.peer[0], stats.file_path))

Before we can restart the TFTP server, we need to set the environment variable with the Slack token, which you get when you set up the webhook.

export SLACK_TOKEN=[long_random_string_from_slack]

python ztp_tftp.py
WARNING:root:No callback specified for server statistics logging, will continue without

After reloading our first device again we should get a notification in Slack.

Installer webhook

We do get a notification, two in fact. Looking at the output from the TFTP server we also see this behavior:

############################################################
Peer: 172.29.50.18 UDP/57713
File: network-confg
Sent Packets: 1
############################################################

############################################################
Peer: 172.29.50.18 UDP/59347
File: network-confg
Sent Packets: 1
############################################################

It looks like the device is sending multiple requests from different UDP source ports. The switch must really want that file. While this doesn’t really matter for provisioning, it doesn’t look great from a notification point of view.

After the initial notification

We’ve now added some feedback, but it isn’t enough, is it? The installer can now see that the device has booted up and downloaded a file. This is a good start: if the notification doesn’t appear in Slack, the installer knows something is wrong.

What about after this? We still don’t know if the configuration was correctly applied to the device. We don’t know if the correct ports were used. We don’t know if the device is actually reachable. In short, this is a start.

It was fairly easy to add the hook to send a notification to Slack. How about we add other hooks there too? In the best-case scenario, we would want something to log in to the device and validate its configuration. It should probably also validate the device’s LLDP (or CDP) neighbors to make sure it’s connected to the correct port on the upstream device.
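A neighbor check like that can be sketched as a pure function. The example assumes the neighbor data has the shape returned by NAPALM’s get_lldp_neighbors(), a dictionary keyed by local interface where each value is a list of dictionaries with “hostname” and “port” keys. The function name `check_uplink` and the idea of keeping an expected uplink per device are assumptions for illustration.

```python
def check_uplink(neighbors, local_port, expected_host, expected_port):
    """Compare the LLDP neighbor on local_port with the expected one.

    Returns a (ok, message) tuple so the message can be forwarded
    to the installer as it is.
    """
    seen = neighbors.get(local_port, [])
    if not seen:
        return False, 'No LLDP neighbor seen on {}'.format(local_port)
    for entry in seen:
        if (entry.get('hostname') == expected_host
                and entry.get('port') == expected_port):
            return True, '{} is connected to {} {}'.format(
                local_port, expected_host, expected_port)
    # Something is connected, but not what we expected.
    found = ', '.join(
        '{} {}'.format(e.get('hostname'), e.get('port')) for e in seen)
    return False, '{} is connected to {}, expected {} {}'.format(
        local_port, found, expected_host, expected_port)
```

Returning the message together with the result gives the installer actionable feedback, for example which port the cable actually ended up in.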

Just below the “notify_slack” function, we could add another function for this. As it will take a while until the device is actually reachable, the function will have to start by waiting before trying to connect. Within this function, we can add more calls to notify_slack.

For example, if the device isn’t accessible after two minutes, we can let the installer know. A different notification could be sent if the device is accessible. We can also use these kinds of triggers to hand over the device to some other automation framework like Nornir, Ansible or Salt, or just configure the device with NAPALM.
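One way to sketch such a function is to poll a TCP port until it answers or a deadline passes, and then send one of two messages. This is a rough illustration under a few assumptions: that SSH on port 22 is a reasonable sign the device is up, that polling is acceptable instead of a single fixed wait, and that the names `wait_for_port` and `verify_device` are hypothetical. The `notify` argument would be notify_slack in this tutorial.

```python
import socket
import time


def wait_for_port(host, port=22, timeout=120, interval=5,
                  connect=socket.create_connection,
                  sleep=time.sleep, clock=time.monotonic):
    """Return True once host:port accepts a TCP connection.

    Returns False if no connection succeeded before `timeout` seconds.
    connect, sleep and clock are injectable to make testing easy.
    """
    deadline = clock() + timeout
    while True:
        try:
            conn = connect((host, port), timeout=interval)
            conn.close()
            return True
        except OSError:
            # Refused, unreachable or timed out: try again later.
            pass
        if clock() >= deadline:
            return False
        sleep(interval)


def verify_device(host, notify, **kwargs):
    """Wait for the device and tell the installer how it went."""
    if wait_for_port(host, **kwargs):
        notify('{} is reachable over SSH'.format(host))
        return True
    notify('{} is still unreachable, check cabling and power'.format(host))
    return False
```

Because this blocks for up to two minutes, it would most likely need to run outside the code path that serves files, for example in a background job.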

Before we get ahead of ourselves, remember that we got two notifications for the first device? While this might not be universal across devices, we don’t really want those duplicates. Especially if we trigger an Ansible playbook, we don’t want its tasks to run twice, or to have two jobs trying to configure the device at the same time.
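One possible way to suppress the duplicates is to let the first request from a device claim a marker, and have later requests within a short window back off. The sketch below uses a marker file created with O_EXCL, so the claim also holds if requests are handled by separate processes. The function name `claim_device`, the lock directory and the 60 second window are all assumptions; a shared store such as Redis would be another option.

```python
import os
import time


def claim_device(peer_ip, lock_dir, window=60, now=time.time):
    """Return True for the first request from peer_ip within `window` seconds."""
    path = os.path.join(lock_dir, '{}.lock'.format(peer_ip))
    try:
        # Remove a marker left over from an earlier provisioning run.
        if now() - os.path.getmtime(path) >= window:
            os.remove(path)
    except FileNotFoundError:
        pass
    try:
        # O_EXCL makes the creation atomic: only one caller can succeed.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    os.close(fd)
    return True
```

In session_stats, the notification would then only be sent when both stats.packets_sent > 0 and claim_device(stats.peer[0], lock_dir) are true, where lock_dir is an existing directory of your choice.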