I want to start this blog post by clearing something up. When you decide to start exploring network automation you’re essentially starting at the bottom of the mountain, so you’d be ill-advised to try to plot a new path and get to the top on your own. You’re much better off taking the well-trodden path, following the examples of others and ultimately re-using people’s code and ideas.
Now that might sound like plagiarism but it’s really not. As you get more involved in the coding community (community being the key word here) you come to realise that a lot of code is made up of little bits of other people’s code. If you need to solve a problem while coding, chances are someone else has already faced the same challenge, come up with an answer and shared it.
Whether it be a code snippet, a Python module or a fully functioning script, these solutions have been shared with the sole intention of helping others, and attempting to find an alternative is frankly just re-inventing the wheel.

I guess the point I’m making here is that Google is your friend. Sites such as Stack Overflow and the Cisco DevNet community forums are brimming with information, help and solutions, so use them as much as you can and eventually give something back by contributing some answers (or a humble blog) yourself.

Now here’s the but……

Having said all of that, there is a bit of an issue out there that forms the main topic of this post. Over the last couple of years I’ve spent a lot of time online researching, looking for answers to problems and generally looking into how other people are achieving their automation goals. There are quite a few how-to articles popping up but there seems to be a recurring theme: as a general rule, the automation is done in an ideal world!

Now that’s a pretty big generalisation and I should point out that there is a lot of excellent content out there that really gets into the technical nuts and bolts of how network automation works. What I mean is that these guides are focussed on delivering the how from a tooling perspective, and in order to demonstrate that tooling (and crucially for the reader to understand it), the scenarios used are often structured in a very uniform and consistent way.

Unfortunately for us, most networks aren’t very uniform and they certainly aren’t consistent. I’m sure that every team of network engineers starts off with the best of intentions, but over time inconsistencies or configuration drift will creep into your network until no two devices are the same. There are no guarantees that every device conforms to the assumed build standards, and some devices will just become complete anomalies for one reason or another. This is the inevitable result of manual changes being carried out by a changing group of human network engineers over many years. Every network device becomes a ‘snowflake’, and this is one of the biggest challenges you’ll face when introducing automation to a long-established network.

Let’s look at a simple example….

You have a network that consists of 50 routers and a request has been raised to configure a new interface on each one. The details are as follows:

Interface: GigabitEthernet 0/1/0
Description: New interface for 2nd ISP link
Status: Shutdown
IP Address: 10.1.x.1/24
VRF: ISP2-VRF

This is a perfect example of where you could use automation to carry out a repetitive task to consistently configure many interfaces across many devices. One method you could use is a simple Jinja2 template to generate each individual configuration, which can then be pushed out using Ansible:

{% for host_ip in range(1,51) %}
configure terminal
interface GigabitEthernet 0/1/0
ip vrf forwarding ISP2-VRF
description New interface for 2nd ISP link
shutdown
ip address 10.1.{{host_ip}}.1 255.255.255.0
{% endfor %}

So far so good……

Now let’s take a look at the effect of running that automation on the first five routers on your network

Router  Result
1       Success!
2       Failed - VRF does not exist (configured on device as “ISP2_VRF”)
3       Failed - Interface was already in use for an ad-hoc connection to AWS, the config change placed this interface in a shutdown state and caused an outage
4       Success!
5       Failed - GigabitEthernet 0/1/0 does not exist on this older device

Now this is a hypothetical scenario with a lot of bad luck included, but you get the picture. The big lesson here (and the real point of this blog) is…

Before you start to automate configuration changes in a brownfield network, you must first understand the current state of your network devices!

OK, so now we’ve got that out of the way, how on earth, I hear you ask, do we go about understanding the current state of the network?

Well the simple answer is exactly how you would if you were doing it manually: by issuing show commands on your devices and processing the output. Where we differ from the manual process is that we can make a computer do most of the heavy lifting when it comes to gathering, assessing and reporting on that data. The caveat to all of this, of course, is that the data we feed into the computer needs to be in a machine-readable format, something which is not natively the case with a lot of network devices.

If we issue a simple show ip interface brief on a Cisco IOS router, for example, the output is very readable to a human but for a computer, not so much:

Router# show ip interface brief
Interface             IP-Address      OK?    Method Status     	Protocol
GigabitEthernet0/1    unassigned      YES    unset  up         	up
GigabitEthernet0/2    192.168.190.235 YES    unset  up         	up
GigabitEthernet0/3    unassigned      YES    unset  up         	up
GigabitEthernet0/4    192.168.191.2   YES    unset  down       	down

Now to a computer the above is just a collection of text; it means absolutely nothing. To give it meaning we need to convert this output into a machine-readable format, and for that we need a parser.

Parsers are simply a way of passing over data, in our case the plain text above, looking for certain patterns or keywords in the text and converting it into a format that a computer can understand. One tool commonly used for this purpose is the regular expression, or regex for short. A regex can be described as a pattern used to match character combinations within a string or body of text. Patterns can be relatively simple but can also combine many special symbols in order to perform complex matches on large data sets.

Using a parser specifically written to extract values related to our show command we can convert the above output into a machine readable format, in this case JSON:

{
    "interface": {
        "GigabitEthernet0/1": {
            "ip_address": "unassigned",
            "interface_status": "up",
            "protocol_status": "up"
        },
        "GigabitEthernet0/2": {
            "ip_address": "192.168.190.235",
            "interface_status": "up",
            "protocol_status": "up"
        },
        "GigabitEthernet0/3": {
            "ip_address": "unassigned",
            "interface_status": "up",
            "protocol_status": "up"
        },
        "GigabitEthernet0/4": {
            "ip_address": "192.168.191.2",
            "interface_status": "down",
            "protocol_status": "down"
        }
    }
}

As you can see we now have the same device information from a manual show command but crucially, in a machine readable format.
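To make the regex idea a little more concrete, here’s a minimal sketch of a parser for the show ip interface brief output above, using only Python’s standard library. The function name, the pattern and the key names are my own choices for illustration (the keys simply mirror the JSON above), and a real-world parser would need to cope with far more variation than these four lines.

```python
import json
import re

# Sample output copied from the show command above (header row included).
SHOW_IP_INTERFACE_BRIEF = """\
Interface             IP-Address      OK?    Method Status      Protocol
GigabitEthernet0/1    unassigned      YES    unset  up          up
GigabitEthernet0/2    192.168.190.235 YES    unset  up          up
GigabitEthernet0/3    unassigned      YES    unset  up          up
GigabitEthernet0/4    192.168.191.2   YES    unset  down        down
"""

# One piece of the pattern per column. "administratively down" contains a
# space, so it is listed explicitly ahead of the single-word states.
ROW_PATTERN = re.compile(
    r"^(?P<name>\S+)\s+"                 # interface name
    r"(?P<ip_address>\S+)\s+"            # IP address or "unassigned"
    r"(?:YES|NO)\s+"                     # OK? column (not kept)
    r"\S+\s+"                            # Method column (not kept)
    r"(?P<interface_status>administratively down|up|down)\s+"
    r"(?P<protocol_status>up|down)\s*$"
)


def parse_show_ip_interface_brief(text):
    """Return the interface data as a nested dictionary."""
    interfaces = {}
    for line in text.splitlines():
        match = ROW_PATTERN.match(line)
        if match is None:
            continue  # header row, prompt or blank line
        interfaces[match.group("name")] = {
            "ip_address": match.group("ip_address"),
            "interface_status": match.group("interface_status"),
            "protocol_status": match.group("protocol_status"),
        }
    return {"interface": interfaces}


parsed = parse_show_ip_interface_brief(SHOW_IP_INTERFACE_BRIEF)
print(json.dumps(parsed, indent=4))
```

The header row is skipped because “OK?” doesn’t match the YES/NO column, so no special handling is needed for it.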

So the big question is…. how is that useful to me?

Well the answer is that you can now make decisions during your automation on a far more granular scale. Now that you know the state of your device or devices, you can take different actions on an item-by-item basis (items could be anything, such as a group of devices, but in this example would be a group of interfaces).

If we circle back to our original example of configuring a new interface on a number of devices, let’s assume that we’ve used a parser to understand the state of the interface on each device before we do any configuration.

{
    "Devices": {
      "router1":{
          "GigabitEthernet_0/1/0_present": "True",
          "interface_status": "down",
          "ispvrf_present" : "True"},
      "router2":{
          "GigabitEthernet_0/1/0_present": "True",
          "interface_status": "down",
          "ispvrf_present" : "False"},
      "router3":{
          "GigabitEthernet_0/1/0_present": "True",
          "interface_status": "up",
          "ispvrf_present" : "True"},
      "router4":{
          "GigabitEthernet_0/1/0_present": "True",
          "interface_status": "down",
          "ispvrf_present" : "True"},
      "router5":{
          "GigabitEthernet_0/1/0_present": "False",
          "interface_status": "N/A",
          "ispvrf_present" : "True"}
      }
}
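As a rough idea of how a per-device summary like the one above could be put together, here’s a small Python sketch. I’m assuming two inputs that aren’t shown in this post: parsed interface data in the same shape as the earlier JSON example, and a plain list of the VRF names configured on the device. The function name and the sample data are hypothetical.

```python
INTERFACE = "GigabitEthernet0/1/0"
VRF = "ISP2-VRF"


def summarise_device(parsed_interfaces, vrf_names):
    """Reduce parsed show output to the three facts we care about."""
    entry = parsed_interfaces["interface"].get(INTERFACE)
    return {
        # Values are strings to match the JSON summary above.
        "GigabitEthernet_0/1/0_present": str(entry is not None),
        "interface_status": entry["interface_status"] if entry else "N/A",
        "ispvrf_present": str(VRF in vrf_names),
    }


# Hypothetical inputs resembling router2, where the VRF name uses an
# underscore instead of a hyphen.
router2_interfaces = {
    "interface": {
        "GigabitEthernet0/1/0": {
            "ip_address": "unassigned",
            "interface_status": "down",
            "protocol_status": "down",
        }
    }
}
router2_vrfs = ["ISP2_VRF"]

router2_summary = summarise_device(router2_interfaces, router2_vrfs)
print(router2_summary)
```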

Armed with this information we can add some if logic to our Jinja2 template in order to take different actions depending on the results of the tests:

{% for name, device in Devices.items() %}
{%   if device["GigabitEthernet_0/1/0_present"] == "True" %}{# Test 1: interface exists #}
{%     if device["ispvrf_present"] == "True" %}{# Test 2: VRF exists #}
{%       if device["interface_status"] == "down" %}{# Test 3: interface is not in use #}
configure terminal
interface GigabitEthernet 0/1/0
ip vrf forwarding ISP2-VRF
description New interface for 2nd ISP link
shutdown
ip address 10.1.{{ loop.index }}.1 255.255.255.0
{%       endif %}
{%     endif %}
{%   endif %}
{% endfor %}

Hopefully you can follow the logic in the above template. Essentially it’s carrying out a series of tests before finally generating the device configuration if all tests pass. Using the devices from our previous example, the results table would now be as follows:

Router  Result
1       Success!
2       Skipped - Test 2 failed - VRF does not exist
3       Skipped - Test 3 failed - Interface was not in a shutdown state
4       Success!
5       Skipped - Test 1 failed - GigabitEthernet 0/1/0 does not exist on this older device

Now the above example is very simplistic but hopefully you get the point that by using the power of automation to first understand the existing state of your network, you can avoid some of the common pitfalls associated with making the assumption that all devices are configured the same.
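If you’d rather run the checks in Python before any template is rendered, the same three tests could look something like the sketch below. The device data is copied from the JSON summary earlier in the post; the function name and the wording of the result strings are just my own illustration, not part of any particular tool.

```python
# Device state copied from the JSON summary earlier in the post.
DEVICES = {
    "router1": {"GigabitEthernet_0/1/0_present": "True",
                "interface_status": "down", "ispvrf_present": "True"},
    "router2": {"GigabitEthernet_0/1/0_present": "True",
                "interface_status": "down", "ispvrf_present": "False"},
    "router3": {"GigabitEthernet_0/1/0_present": "True",
                "interface_status": "up", "ispvrf_present": "True"},
    "router4": {"GigabitEthernet_0/1/0_present": "True",
                "interface_status": "down", "ispvrf_present": "True"},
    "router5": {"GigabitEthernet_0/1/0_present": "False",
                "interface_status": "N/A", "ispvrf_present": "True"},
}


def precheck(device):
    """Run the three tests in order and report the first one that fails."""
    if device["GigabitEthernet_0/1/0_present"] != "True":
        return "Skipped - Test 1 failed"  # interface missing
    if device["ispvrf_present"] != "True":
        return "Skipped - Test 2 failed"  # VRF missing
    if device["interface_status"] != "down":
        return "Skipped - Test 3 failed"  # interface appears to be in use
    return "Success!"


results = {name: precheck(state) for name, state in DEVICES.items()}
for name, result in results.items():
    print(name, result)
```

Returning a reason for each skipped device gives you a ready-made report of which routers need a closer look by hand.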

Ultimately this is the difference between what I would describe as ‘fire and forget’ and intelligent automation. There are of course many use cases for both, but I would always advise caution when pushing out change en masse without fully understanding existing state.

Look out for another blog coming soon where I will explore the world of pyATS and specifically the huge library of parsers available to help you really understand the state of your network.