Parse Cisco IOS configurations using RegEx
In one of my earlier posts, I parsed IP parameters from an existing Cisco IOS configuration using ciscoconfparse. In this post, I’d like to provide some basic patterns for parsing (almost any) information from a running configuration. At first, I’d like to use only the standard Python library. For this type of task, we will use regular expressions (RegEx). In the next post, I’ll take a look at how to do the same with the ciscoconfparse library.
This post will focus on the parameter extraction part using regular expressions. If you’d like to generate a configuration based on the extracted values, please take a look at the Configuration Generator with Python and Jinja2 post.
The examples in this post show how to parse a Cisco IOS configuration to get the following information:
- check if OSPFv2 is used as routing protocol
- extract the interface name and description
- extract the current IPv4 address (if any) from the interface
You can find the entire example code in my Python script examples repository on GitHub.
find commands with regular expressions
I know that regular expressions (aka regex) are quite difficult for many people. I’ll try to keep it as simple as possible in this post. At this point, you should know what regular expressions are. In a nutshell: regular expressions are a structured way to define search patterns within text.
Python contains a standard module to work with regular expressions: the re module. You don’t need to install anything to run the example code from this post.
The first example script checks if the command router ospf is used within the configuration.
# check if OSPF is used as the routing protocol
import re

# sample_config holds the configuration text as a string (not shown here)

# the following regex_pattern matches only the "router ospf <process-id>" command (no VRFs)
ospf_regex_pattern = r"^router ospf \d+$"

# we use re.search() because re.match() only matches at the start of the string,
# even with the MULTILINE flag; if the command is not found, the return value is None
is_ospf_in_use = re.search(ospf_regex_pattern, sample_config, re.MULTILINE) is not None

if is_ospf_in_use:
    print("==> OSPF is used in this configuration")
else:
    print("==> OSPF is not used in this configuration")
First, let’s take a look at the regular expression:
- the ^ sign matches the start of a line/string
- then we look for the static string router ospf
- the \d+ expression matches one or more digits (in our case the process ID)
- the $ sign matches the end of a line/string
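To see what each part of the pattern accepts and rejects, the following sketch runs it against a few hand-written lines. The sample lines and the names candidates and results are made up for illustration and are not part of the example script.

```python
import re

ospf_regex_pattern = r"^router ospf \d+$"

# hypothetical sample lines, not taken from a real device
candidates = [
    "router ospf 1",            # plain OSPF process
    "router ospf 10 vrf BLUE",  # VRF-aware process, rejected by the $ anchor
    "router ospfv3 1",          # different command, rejected by the static string
    " router ospf 1",           # leading blank, rejected by the ^ anchor
]

# remember for every line whether the pattern matched
results = {line: bool(re.search(ospf_regex_pattern, line)) for line in candidates}
for line, matched in results.items():
    print("%-26r matches: %s" % (line, matched))
```

Only the first line should be reported as a match. The other three each fail on a different part of the pattern.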
You have a ton of options when using regular expressions. There are many cheatsheets available that explain the possible expressions and control structures in more detail. I frequently use the following one: regex cheatsheet at cheatography.
Before this regular expression works in Python, we need to change the default behavior of the ^ and the $ sign. By default, these match the start/end of the entire string. If you set the re.MULTILINE flag, they match the start/end of every line, which is what we expect in this case. This is also why we use the re.search() function: according to the official Python documentation for the regular expression module, re.match() only matches at the beginning of the string, even if this flag is set.
That’s it: if the search function does not return None, OSPF is used in the configuration.
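The difference between the two functions and the effect of the flag can be checked with a short sketch. The two-line configuration and the variable names are assumptions for illustration only.

```python
import re

# hypothetical two-line configuration: the OSPF command is on the second line
config = "hostname R1\nrouter ospf 1\n"
pattern = r"^router ospf \d+$"

no_flag = re.search(pattern, config)                   # ^ and $ anchor the whole string
with_flag = re.search(pattern, config, re.MULTILINE)   # ^ and $ anchor every line
with_match = re.match(pattern, config, re.MULTILINE)   # still only tried at position 0

print(no_flag)              # None
print(with_flag.group(0))   # router ospf 1
print(with_match)           # None
```

Only the combination of re.search() and re.MULTILINE finds a command that is not on the first line of the text.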
parse the interface names and description
To know that something is used in a given configuration is good, but to read parameters from it is better. We can do this with group definitions in regular expressions. The following example shows how to read the interface name and the description for multiple interfaces within the configuration:
# extract the interface name and description
interface_descriptions = re.finditer(r"^(interface (?P<intf_name>\S+))\n"
                                     r"( .*\n)*"
                                     r"( description (?P<description>.*))\n",
                                     sample_config,
                                     re.MULTILINE)

for intf_part in interface_descriptions:
    print("==> found interface '%s' with description '%s'" % (intf_part.group("intf_name"),
                                                              intf_part.group("description")))
You see, we need a somewhat more complex expression in this case, because we need to deal with the indentation of the Cisco IOS configuration format. Furthermore, it is possible that other lines sit between the interface and the description command.
The most important statements within the entire regex are the group definitions, which are expressed using the following syntax:
(?P<name>regular_expression)
The name parameter is used later in the script to identify the value, which is the content that matches the regular_expression within the parentheses.
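As a small illustration of a named group on its own, the sketch below applies one to a single configuration line. The line and the variable names are made up for this example.

```python
import re

# hypothetical single configuration line
line = "interface GigabitEthernet0/1"

match = re.search(r"^interface (?P<intf_name>\S+)$", line)
if match:
    print(match.group("intf_name"))   # access the group by name
    print(match.group(1))             # the same group by position
    print(match.groupdict())          # all named groups as a dictionary
```

Named access keeps working if you later add more groups to the pattern, whereas positional indexes would shift.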
I divided the expression into multiple parts, one for every (possible) line within the configuration. The first statement (interface (?P<intf_name>\S+))\n will match a line that starts with the string interface followed by a string without whitespace until the end of the line. This string is used as the interface name.
The second statement ( .*\n)* matches zero or more lines with a single blank at the beginning (same configuration level), followed by any characters until the end of the line. We need this statement to maintain the configuration hierarchy if the description command doesn’t directly follow the interface statement.
The third statement is similar to our interface regex:
( description (?P<description>.*))\n
First we check if the line starts with a blank followed by the static string description. If this matches, we use another group definition that matches any character until the end of the line.
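To illustrate why the middle statement matters, the following sketch compares the pattern with and without it on an interface whose description is not the first sub-command. The configuration excerpt and the names strict and relaxed are assumptions for this example.

```python
import re

# hypothetical excerpt: the description is the second line below the interface
config = (
    "interface GigabitEthernet0/1\n"
    " ip address 10.0.0.1 255.255.255.0\n"
    " description Uplink\n"
    "!\n"
)

# without the middle statement, the description must directly follow the interface line
strict = r"^interface (?P<intf_name>\S+)\n description (?P<description>.*)\n"
# with the middle statement, any number of indented lines may sit in between
relaxed = r"^interface (?P<intf_name>\S+)\n( .*\n)* description (?P<description>.*)\n"

strict_match = re.search(strict, config, re.MULTILINE)
relaxed_match = re.search(relaxed, config, re.MULTILINE)

print(strict_match)                          # None
print(relaxed_match.group("description"))    # Uplink
```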
Now we have the pattern to read all interface names and descriptions from an existing configuration. To iterate over the results, we can use the re.finditer() function. The resulting iterator can be used with a for-loop to get the matching statements within the existing configuration. You can now read the values by name from the group definitions in our regular expression with the group() method of the match object.
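The iteration can also be used to collect the results in a dictionary instead of printing them. The following is a minimal sketch under the assumption of a small, made-up configuration excerpt. It is a variation of the example code, not the script from the repository.

```python
import re

# hypothetical configuration excerpt, not taken from a real device
sample_config = """\
interface GigabitEthernet0/1
 description Uplink to core
 ip address 10.0.0.1 255.255.255.0
!
interface GigabitEthernet0/2
 ip address 10.0.1.1 255.255.255.0
 description Access switch
!
interface GigabitEthernet0/3
 shutdown
!
"""

pattern = re.compile(r"^interface (?P<intf_name>\S+)\n"
                     r"( .*\n)*"
                     r" description (?P<description>.*)\n",
                     re.MULTILINE)

# map every interface that has a description to its description text
descriptions = {m.group("intf_name"): m.group("description")
                for m in pattern.finditer(sample_config)}
print(descriptions)
```

Note that GigabitEthernet0/3 does not show up in the result: an interface without a description command never matches the pattern, so it is skipped silently.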
parse the IPv4 address of an interface
To extract the IPv4 address from an interface, you only need minor changes to the regular expression that we already used to read the description value. The following code example works in exactly the same way, but this time we match two groups in a single line: the IPv4 address followed by the subnet mask.
# extract the IPv4 address of the interfaces
interface_ips = re.finditer(r"^(interface (?P<intf_name>.*)\n)"
                            r"( .*\n)*"
                            r"( ip address (?P<ipv4_address>\S+) (?P<subnet_mask>\S+))\n",
                            sample_config,
                            re.MULTILINE)

for intf_ip in interface_ips:
    print("==> found interface '%s' with ip '%s/%s'" % (intf_ip.group("intf_name"),
                                                        intf_ip.group("ipv4_address"),
                                                        intf_ip.group("subnet_mask")))
You see, the functionality is the same as with the description. Now you can, for example, use the subnet_mask group to create an IPv4 object with the standard Python library. This is quite useful if you plan to do some readdressing. I already wrote about the Python ipaddress module in the past, so I’ll skip further explanations in this post.
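As a brief sketch of that idea, the two extracted strings can be combined and passed to ipaddress.ip_interface(), which accepts the address/netmask notation. The values below are placeholders for what the two named groups would return.

```python
import ipaddress

# hypothetical values, as they would come out of the two named groups
ipv4_address = "10.0.0.1"
subnet_mask = "255.255.255.0"

# ip_interface() accepts "address/netmask" as well as "address/prefixlen"
interface = ipaddress.ip_interface("%s/%s" % (ipv4_address, subnet_mask))

print(interface.with_prefixlen)   # 10.0.0.1/24
print(interface.network)          # 10.0.0.0/24
```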
First, the positive side of this approach: it works without any dependency on a third-party Python library. We get a generic set of parameters that we can use for whatever task (e.g. generating documentation or creating a configuration in a different format).
On the other hand, the list of drawbacks and risks associated with this approach is quite long. If you plan to use this in a larger project, you might run into some maintenance issues, because at some point the code becomes difficult to read and to understand. Furthermore, there is some risk associated with the identification of the command hierarchy, because this association relies on the proper definition within the regular expression (e.g. if you plan to work with existing scripts, this approach might fail).
As you can see, the extraction of configuration parameters using only regular expressions is possible but quite difficult. In the next post, I’d like to show you how to accomplish similar results using the ciscoconfparse library.
That’s it for today. Thank you for reading.
Links within this post
- regular expressions (Wikipedia)
- regex cheatsheet (cheatography)
- documentation about the standard regular expression module (python 3.4)