Hi all,
For a while now, I have wanted to write about network automation. I see a lot of different views on this market and a lot of confusion. Because automation can really impact your business, you want to make the right decisions and investments going forward.
I have been working in the network automation space for over 18 years and I hope I can share some valuable insights to help you. When I started in 2002, I was part of a team that built the first release of a network provisioning system for the Rabobank Group. Since then, I have been involved in evaluating, selecting, implementing and building network automation solutions for many customers. I started NetYCE together with my business partner Eric Yspeert because we saw a gap in the market.
Let me start by giving you my definition of network automation. I define it as:
“The ability to automate configuration changes and related engineering change processes that allow you to create, update and delete end-to-end networks and services across multi-vendor and multi-domain networks. Ideally, with the flexibility to build any automation use-case, irrespective of the chosen network design.”
In essence, any network goes through (re)design, build and deploy cycles. Network designers, engineers and operators work with Word, Visio, Excel and Notepad to document what they do, and they end up building configs and jobs by hand to make the required network changes. In some cases, scripts are used.
Once deployed, there are lots of validation and troubleshooting activities that require logging into the production devices to fix things. This approach is often chaotic and unstructured, to say the least. A well-defined process to manage the design, build and deploy cycles is non-existent and left to the interpretation of individual persons.
So, in the end, network devices get configured as deemed best by the individual, resulting in configuration drift and different logic applied on production devices (both physical and virtual). In all my years in this business, I have seldom seen it done more efficiently. But as network automation becomes more popular, these processes need to be optimized.
When looking at solution requirements for network automation, you need to consider the full life cycle of designing, building, deploying and validating network and config changes. This is where the confusion really starts, as most people consider network automation to deal only with job automation and config change management (backing up and diffing configs from production devices), or they focus only on a single domain or on single use-case solutions.
To control the full life cycle, you need to be able to automate all different process steps. Let’s quickly run through them, starting at the top.
Let me give you a bit of a historical perspective and categorize the different types of solutions that I have encountered so far. My objective is to highlight the different approaches that vendors have taken to solve the challenges around automating network configuration changes. It is by no means complete or a comparison of different solutions, but merely to give you more insight into how they match the requirements I defined above.
Originally, the domain of network automation was led by vendors like Opsware (acquired by HP, now HPNA), Intelliden (acquired by IBM), NetMRI (acquired by Infoblox) and BladeLogic (acquired by BMC). They called themselves automation tools, but if you looked more closely, the main focus was network configuration management, i.e. doing configuration backups of (multi-vendor) network devices with some basic command-job automation and possibly some compliance checks. Typically, these solutions also had some kind of network monitoring functionality to check for changes happening in the network.
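To make "backup and diff" concrete, here is a minimal sketch in Python of what such a configuration check boils down to: compare an intended config with the config pulled from the device and report the lines that differ. It uses only the standard library. The function name `config_drift` and the sample configs are my own illustrative assumptions, not taken from any of the products mentioned above.

```python
import difflib


def config_drift(intended: str, running: str) -> list[tuple[str, str]]:
    """Return (sign, line) pairs for lines that differ between two configs.

    "-" means the line is in the intended config but missing on the device,
    "+" means the line is on the device but not in the intended config.
    """
    diff = difflib.ndiff(intended.splitlines(), running.splitlines())
    # ndiff prefixes every line with a two-character code; keep only the
    # removed ("- ") and added ("+ ") lines and drop context and hint lines.
    return [(line[0], line[2:]) for line in diff if line.startswith(("- ", "+ "))]


# Hypothetical sample data: someone changed the access VLAN by hand.
intended_cfg = "hostname edge-1\ninterface Gi0/1\n switchport access vlan 10"
running_cfg = "hostname edge-1\ninterface Gi0/1\n switchport access vlan 20"

for sign, text in config_drift(intended_cfg, running_cfg):
    print(sign, text)
```

Note that this only tells you that a device has drifted. It says nothing about how the intended config was designed or built in the first place, which is exactly the part of the life cycle these early tools left untouched.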
The second wave of solutions came from software vendors and can best be classified as commercial applications offering a predefined way of dealing with certain use cases or types of networks. These can be a good fit for those use cases, but as they lack the flexibility to build your own specific use cases, they cannot be classified as (open) frameworks. I would say that vendors like Anuta Networks, SolarWinds, Apstra and Gluware fall into this category.
Another category consists of solutions from network equipment vendors, such as Blue Planet (Ciena) and Cisco NSO (formerly Tail-f). These are a mix of an application with some elements of a framework, but they typically lack the openness, flexibility and ease of use that engineers need. And they can be quite expensive. What characterizes them is a focus on runtime execution with vendor modules and libraries that contain the vendor’s OS syntax options in an effort to abstract the CLI for network engineers.
This can work well for service providers that have fairly standardized designs and services, but once you have specific design choices (as most do) and/or multi-vendor and multi-domain requirements, you end up with customization that requires lots of (new) software development skills. What I often see happening is that organizations buy network equipment from these vendors and start with their solutions, but as customization is cumbersome and labour-intensive, network engineers start looking for other options.
In today's world, when it comes to network automation, engineers are looking for agile and cheap frameworks that offer the flexibility to build any type of solution and use case they want for any type of network. So it’s logical that engineers started looking for open source solutions. In this category, you see tools that originated from the systems management domain, offering configuration management for (large) server environments, like Ansible, Puppet, Chef and SaltStack.
Some of them have evolved quite well over the last few years to also automate configuration changes and jobs on network devices. Ansible in particular is very popular amongst network engineers and has a big user community. Their main challenge, however, is that their key focus lies in device automation, or what I call runtime automation. So instead of servers, you can now automate configuration changes on (many) network devices. But as these solutions didn’t originate from the network domain, many things you need for network and service provisioning are not readily available, such as managing the data, topology, dependencies, network-specific parameters, relationships and design abstractions. I am not saying it can’t be done, but you need lots of programming skills and time to be able to build what you want. And you end up buying commercial versions like Ansible Tower in order to do this.
Then you have the programming frameworks (languages) that offer full flexibility, like Python, Perl and other scripting languages. Python in particular has become very popular, with numerous open source libraries on GitHub such as Netmiko, NAPALM, Jinja2 and others developed by the community. Again, and even more so, the main challenge here is that you need to be more of a software developer than a network engineer to achieve what you want. And support is a key concern, as whatever gets built will need to be supported by whoever built it. In many cases, when those people leave the company, others stop using it.
Another challenge with this programming approach is that solutions end up being developed for a single domain or a single use case. So you end up with different solutions, scripts and data sources that don’t work together and are not shared between different teams in the company.
Then there is another category altogether, with solutions like ONAP, OpenMANO, OpenStack and others that also originated from the open source domain and aim to offer complete framework solutions, mainly for virtual domains. I would categorise them as an ‘IT approach’ to network automation, as they deal with the creation and orchestration of VNFs, storage, hypervisors etc. As far as network automation is concerned, they primarily deal with the process of spinning up (or down) VNFs that have a preset configuration.
Of course, it’s now far easier to spin up a VNF (e.g. a virtual firewall) with a base config than to set up and configure a hardware box. But in the end, these VNFs still run a specific vendor OS (Cisco IOS/XR, Junos, Check Point etc.) and they still need specific syntax commands to be configured in a certain way in order to build end-to-end services. And in many cases, you also need to configure the PEs, access devices and CPEs with attributes like VLANs, IP addresses, NNIs and what have you. Therefore, these solutions don’t fit my definition of network automation solutions.
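To illustrate the vendor syntax point, here is a small Python sketch that renders one and the same service parameter set (a VLAN with an ID and a name) into two different vendor syntaxes. The templates are deliberately simplified, and the names `TEMPLATES` and `render_vlan` are hypothetical helpers for this example only. A real deployment would need far more context than this, such as interfaces, addressing and dependencies between devices.

```python
from string import Template

# Simplified, illustrative templates: one design parameter set,
# two vendor-specific ways of expressing it.
TEMPLATES = {
    "cisco_ios": Template("vlan $vlan_id\n name $name"),
    "junos": Template("set vlans $name vlan-id $vlan_id"),
}


def render_vlan(vendor: str, vlan_id: int, name: str) -> str:
    """Render the VLAN definition in the syntax of the given vendor OS."""
    return TEMPLATES[vendor].substitute(vlan_id=vlan_id, name=name)


for vendor in TEMPLATES:
    print(f"--- {vendor} ---")
    print(render_vlan(vendor, 10, "users"))
```

The data (VLAN 10, named "users") is the same in both cases; only the rendering differs. Spinning up the VNF does not remove this step, it just changes where the resulting config ends up.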
Finally, there is the category that I call domain-specific solutions, offered by networking vendors: SD-WAN solutions (Versa), SDN controllers (Nokia Nuage), Cisco ACI and many others. These also cannot be classified as network automation solutions, as in the end they are merely innovative network solutions for a specific domain with some additional automation capabilities.
Let’s look back at my earlier definition of network automation:
“The ability to automate configuration changes and related engineering change processes that allow you to create, update and delete end-to-end networks and services across multi-vendor and multi-domain networks. Ideally, with the flexibility to build any automation use case, irrespective of the chosen network design.”
If I look at solutions that match this definition and actually offer the freedom and flexibility that engineers want, this only leaves development frameworks like Ansible, Python and NetYCE, in my opinion. There are some differences of course, but they all represent an open framework that enables engineers to develop what they want. Depending on the level of programming skills you want to rely on (from high to low), you can choose between Python, Ansible and NetYCE. I may be biased, but many other thought leaders in this industry share this opinion.
Let me finish with some key takeaways.
Network automation is not:
What network automation is all about: