What? Who needs regular expressions in the age of automation? I do! And maybe you need them too.

Personally I think that there are times when regular expressions are the right tool for the job. They have been around for a while and they are still a very useful aid that can be used to solve many problems in the network engineering space.

Need to quickly find where an ACL is applied, or which config contains a specific IP? How about quickly auditing config files for a missing configuration line? Regex to the rescue!

Introduction

If you have never heard of regular expressions, or regex for short, they can be described in simple terms as a language for describing search patterns. These patterns, when applied to text, allow for finding, as well as replacing, matching characters.

Regular expressions have seen widespread adoption over the years and can be used in Linux utilities like grep, sed, awk, and countless others. Quite a lot of software out there has some kind of support for regex; you'll find them in Notepad++ and Wireshark, for example.

Most of the networking vendors use regex in the CLI to filter output of the commands, and in some configuration commands, like as-path access lists. Regular expressions are also supported by pretty much all of the programming languages out there, including Perl, Go, and Python.

Basic syntax and metacharacters

I will use only basic regex syntax in this blog post as this is meant as an introduction, and because even basic regex can help us solve many problems.

We'll begin with metacharacters, characters that have special meaning. When combined with regular characters they allow us to do pretty funky things. Below is a short list of metacharacters and their meaning.

Matching single character

The metacharacters below are used to match single characters in the text.

  • . – dot matches any character once, to match dot character itself you need to escape it i.e. use "\."

  • [ ] – character class, matches any one character listed; examples of character classes and their meaning:

    • [0-9] – digits from 0-9
    • [a-z] – letters from a to z
    • [A-Z] – capital letters
    • [-a-z_] – a to z and _ (underscore) as well as - (dash); the dash has to come first (or last) as it has special meaning when used between other characters
    • [0-9a-zA-Z] – digits and all of the small and capital letters
  • [^ ] – negated character class, matches any one character that is not listed; examples:

    • [^a-d] - matches any one character apart from lower case a, b, c and d
    • [^-_|] - matches any one character apart from -, _ and |
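
To make the list above a bit more tangible, here is a minimal sketch that runs a few of these patterns through grep. The four sample lines are invented for illustration, and LC_ALL=C is set on the assumption that we want ranges like [a-z] to mean plain ASCII letters only.

```shell
# Use the C locale so that ranges such as [a-z] mean exactly the ASCII letters.
export LC_ALL=C

# Four invented sample lines.
sample='10.1.1.1
10x1x1x1
core-sw_01
CORE'

# An unescaped dot matches any character, so the first two lines both match.
any_char=$(printf '%s\n' "$sample" | grep '10.1.1.1')

# Escaped dots match only literal dots, so only the real address matches.
literal_dot=$(printf '%s\n' "$sample" | grep '10\.1\.1\.1')

# [A-Z] finds lines containing at least one capital letter.
has_upper=$(printf '%s\n' "$sample" | grep '[A-Z]')

# The negated class finds lines containing at least one character that is
# NOT a dash, a lower case letter, a digit or an underscore.
has_other=$(printf '%s\n' "$sample" | grep '[^-a-z0-9_]')

printf 'unescaped dot:\n%s\n\n' "$any_char"
printf 'escaped dot:\n%s\n\n' "$literal_dot"
printf 'capital letters:\n%s\n\n' "$has_upper"
printf 'other characters:\n%s\n' "$has_other"
```

The unescaped dot is the one to watch: it quietly matches more than you probably meant.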

Quantifiers

These allow us to specify the number of occurrences of the preceding element, either a single character or a group of characters.

  • ? – whatever precedes is allowed once but it is optional, e.g. "akira?" - matches "akir" and "akira"

  • * - any number of the preceding element allowed, including none; often used with . (dot)

  • + - at least one required, any extra ones are optional

  • {min,max} – allowed between min and max times, e.g. [0-9]{1,3} matches 3, 50 and 980
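
The quantifiers are easier to tell apart when you see them side by side. The sketch below is only an illustration on invented sample lines. It assumes grep's -E option (extended regular expressions, so ?, + and {} need no escaping) and -x, which makes grep accept a line only if the whole line matches the pattern.

```shell
export LC_ALL=C

# Invented sample lines.
sample='akir
akira
akiraa
vlan 7
vlan 150
vlan 20010'

# ? - the final "a" may appear once or not at all.
optional=$(printf '%s\n' "$sample" | grep -E -x 'akira?')

# + - at least one final "a" is required.
one_or_more=$(printf '%s\n' "$sample" | grep -E -x 'akira+')

# * - any number of final "a" characters, including none.
zero_or_more=$(printf '%s\n' "$sample" | grep -E -x 'akira*')

# {1,3} - between one and three digits after "vlan ".
bounded=$(printf '%s\n' "$sample" | grep -E -x 'vlan [0-9]{1,3}')

printf '?:\n%s\n\n' "$optional"
printf '+:\n%s\n\n' "$one_or_more"
printf '*:\n%s\n\n' "$zero_or_more"
printf '{1,3}:\n%s\n' "$bounded"
```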

Match position

Used to anchor the match, i.e. limit the positions at which a match can occur.

  • ^ - matches the position at the start of the line

  • $ - matches the position at the end of the line
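
Here is a small sketch of what the anchors do in practice. The three sample lines are made up to resemble switch config; nothing here comes from a real device.

```shell
# Invented sample lines; note the leading space on the second one.
sample='vlan 10
 switchport access vlan 10
interface Vlan10'

# ^ - only lines that begin with "vlan".
at_start=$(printf '%s\n' "$sample" | grep '^vlan')

# $ - only lines that end with "vlan 10".
at_end=$(printf '%s\n' "$sample" | grep 'vlan 10$')

# Both anchors together - the line must be exactly "vlan 10".
whole_line=$(printf '%s\n' "$sample" | grep '^vlan 10$')

printf 'start:\n%s\n\n' "$at_start"
printf 'end:\n%s\n\n' "$at_end"
printf 'whole line:\n%s\n' "$whole_line"
```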

Other metacharacters

Other useful metacharacters.

  • | - matches either element on the left or right, (cat|dog) will match both "cat" and "dog"

  • ( ) – can be used to group for quantifiers, capture for backreferences, or limit the scope of alternation "|"

  • \1 , \2 – backreferences, these refer to text matched within first, second, etc., set of parentheses.
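
Alternation and backreferences are probably the least intuitive items on this list, so here is a minimal sketch on invented lines. One assumption to flag: the backreference example uses grep's basic syntax, where grouping parentheses are written \( and \), because that is where POSIX defines backreferences.

```shell
export LC_ALL=C

# Invented sample lines.
sample='permit tcp any any
deny udp any any
remark lab rule
10.1.1.1 10.1.1.1
10.1.1.1 10.2.2.2'

# Alternation inside a group: the line must start with "permit " or "deny ".
either=$(printf '%s\n' "$sample" | grep -E '^(permit|deny) ')

# Backreference: \1 must repeat exactly what the first group matched,
# so only the line with the same address twice is returned.
repeated=$(printf '%s\n' "$sample" | grep '^\([0-9.][0-9.]*\) \1$')

printf 'permit or deny:\n%s\n\n' "$either"
printf 'same address twice:\n%s\n' "$repeated"
```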

Regular expressions examples

I realise that some of the above might make little sense so I'll move right to examples, as the best way to learn regexes is to use them.

I'll be using grep, a popular Unix tool, also available for Windows, as it comes pretty much with every flavour of Linux out there. Occasionally other Unix tools will be used to help with filtering output.

Example 1 - find hostname in the config file

The first one is very simple, but hey, we have to start somewhere. This allows for quick extraction of the hostname from the config file. For example, you could use this to validate that the configured hostname matches the name of the config file, e.g. one downloaded by Rancid.

[przemek@quasar configs]$ grep 'hostname' festive_curran.cfg
hostname festive_curran
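
The hostname check mentioned above could be sketched roughly like this. It is only an illustration under a few assumptions: config files are named <hostname>.cfg, each contains a single 'hostname' line, and the two files created here are throwaway samples (the mismatching 'old_name' is invented).

```shell
# Build two throwaway sample configs in a temporary directory.
workdir=$(mktemp -d)
printf 'hostname festive_curran\n' > "$workdir/festive_curran.cfg"
printf 'hostname old_name\n' > "$workdir/stoic_davinci.cfg"

mismatches=""
for cfg in "$workdir"/*.cfg; do
    # Name we expect, taken from the file name without its extension.
    expected=$(basename "$cfg" .cfg)
    # Name actually configured: second word of the line starting with "hostname ".
    configured=$(grep '^hostname ' "$cfg" | awk '{print $2}')
    if [ "$configured" != "$expected" ]; then
        mismatches="$mismatches$expected:$configured "
        echo "$expected.cfg: configured hostname is '$configured'"
    fi
done

# Clean up the sample files.
rm -rf "$workdir"
```

Anchoring the pattern with ^ keeps the check from tripping over other lines that merely contain the word 'hostname'.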

Example 2 - Find all of the IP addresses configured on the device

Sometimes you need to extract all of the interface IPs from the config. This regex does just that:

[przemek@quasar configs]$ grep '^ ip address' stoic_davinci.cfg
 ip address 10.1.1.1 255.255.255.0
 ip address 10.197.154.1 255.255.255.0
 ip address 10.116.81.1 255.255.255.0
 ip address 10.216.198.1 255.255.255.0
 ip address 10.83.224.1 255.255.255.0
 ip address 10.112.175.1 255.255.255.0
 ip address 10.204.200.1 255.255.255.0
 ip address 10.100.125.1 255.255.255.0
 ip address 10.101.237.1 255.255.255.0
 ip address 10.238.220.1 255.255.255.0
 ip address 10.142.52.1 255.255.255.0
 ip address 10.85.34.1 255.255.255.0
 ip address 10.199.65.1 255.255.255.0
 ip address 10.160.62.1 255.255.255.0
 ip address 10.232.123.1 255.255.255.0
 ip address 10.79.73.1 255.255.255.0
 ip address 10.97.179.1 255.255.255.0
 ip address 10.54.60.1 255.255.255.0
 ip address 10.19.173.1 255.255.255.0
 ip address 10.255.177.1 255.255.255.0
 ip address 10.59.254.1 255.255.255.0
 ip address 10.73.101.1 255.255.255.0
 ip address 10.15.120.1 255.255.255.0
 ip address 10.252.116.1 255.255.255.0
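
If only the addresses themselves are needed, one option is to pipe the matches to awk and print the third field. This is a sketch on an invented config snippet, not output from the device above; it also shows that the leading '^ ' keeps 'no ip address' lines out of the result.

```shell
# Invented config snippet.
config='interface Vlan10
 ip address 10.1.1.1 255.255.255.0
interface Vlan20
 ip address 10.197.154.1 255.255.255.0
interface Vlan30
 no ip address'

# Keep only lines starting with " ip address", then print the third
# whitespace-separated field, which is the address itself.
addresses=$(printf '%s\n' "$config" | grep '^ ip address' | awk '{print $3}')

printf '%s\n' "$addresses"
```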

Example 3 - Finding all of the IPs belonging to given network

If you have an aggregate/network that is used across many devices, you can quickly find out which IPs have been used, and where, using this single regex. Note the '*' at the end of the line: the shell expands it to the names of all files in the current directory, so grep looks for matches in every one of them.

[przemek@quasar configs]$ grep '^ ip address 10\.66\.' *
boring_lamport.cfg: ip address 10.66.85.1 255.255.255.0
boring_lamport.cfg: ip address 10.66.14.1 255.255.255.0
cocky_carson.cfg: ip address 10.66.221.1 255.255.255.0
confident_kowalevski.cfg: ip address 10.66.161.1 255.255.255.0
frosty_lamarr.cfg: ip address 10.66.248.1 255.255.255.0
mystifying_montalcini.cfg: ip address 10.66.207.1 255.255.255.0

Example 4 - Find devices that have specific config line in them

Despite being very simple in nature, this regex comes in incredibly useful when auditing network configs, as it allows you to quickly find devices with the specified config line. We tell grep to show only the names of files with a match by using option '-l' (lower case L).

[przemek@quasar configs]$ grep -l 'ip name-server 8\.8\.8\.8' *
festive_curran.cfg
flamboyant_noether.cfg
frosty_lamarr.cfg
goofy_varahamihira.cfg
laughing_hermann.cfg
modest_panini.cfg
mystifying_montalcini.cfg
nostalgic_ptolemy.cfg
optimistic_nightingale.cfg
practical_benz.cfg
stoic_davinci.cfg
stupefied_stonebraker.cfg
thirsty_shannon.cfg
vigilant_heyrovsky.cfg
zen_mirzakhani.cfg
zen_snyder.cfg

It can happen that you are redirecting the output somewhere and you don't want to see file extensions. By using another Unix utility, called 'cut', we can get rid of the file extension. In our case -d '.' specifies the delimiter, i.e. a dot, and -f 1 tells cut to display the first field only.

[przemek@quasar configs]$ grep -l 'ip name-server 8\.8\.8\.8' * | cut -d '.' -f 1
festive_curran
flamboyant_noether
frosty_lamarr
goofy_varahamihira
laughing_hermann
modest_panini
mystifying_montalcini
nostalgic_ptolemy
optimistic_nightingale
practical_benz
stoic_davinci
stupefied_stonebraker
thirsty_shannon
vigilant_heyrovsky
zen_mirzakhani
zen_snyder

Example 5 - Find devices missing specific config

This is similar to Example 4, except here we want to know which devices are missing the specified config line. Again, very handy for conducting configuration audits. To display the names of files without a match, use grep with the '-L' option.

[przemek@quasar configs]$ grep -L 'no ip domain lookup' *
optimistic_nightingale.cfg
stoic_davinci.cfg
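
For a slightly bigger audit, the same idea can be wrapped in a loop over several required lines. The sketch below is an illustration only: the three config files are throwaway samples created on the spot, the list of required lines is invented, and -F is used on the assumption that the lines should be matched as plain strings rather than as regexes.

```shell
# Build three throwaway sample configs in a temporary directory.
workdir=$(mktemp -d)
printf 'no ip domain lookup\nip name-server 8.8.8.8\n' > "$workdir/zen_snyder.cfg"
printf 'ip name-server 8.8.8.8\n' > "$workdir/stoic_davinci.cfg"
printf 'no ip domain lookup\n' > "$workdir/frosty_lamarr.cfg"

report=$(
    cd "$workdir" || exit 1
    # For each required line, list the files that do not contain it.
    for line in 'no ip domain lookup' 'ip name-server 8.8.8.8'; do
        for cfg in $(grep -L -F -e "$line" *.cfg); do
            echo "$cfg is missing: $line"
        done
    done
)

printf '%s\n' "$report"

# Clean up the sample files.
rm -rf "$workdir"
```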

Example 6 - Find number of occurrences of given config item

Depending on the hardware, there might be limitations on the number of configured instances of a given feature. By crafting a simple grep and piping the output to another Unix tool, called wc, we can quickly arrive at the number of occurrences of a given element. Option -l (lower case L) tells wc to display the number of lines only. By default wc shows the number of lines, words, and characters.

[przemek@quasar configs]$ grep 'ip access-list extended' dazzling_brattain.cfg | wc -l
49
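
As a side note, grep can do the counting itself with the -c option, which prints the number of matching lines. A minimal sketch on an invented config snippet, comparing both approaches:

```shell
# Invented config snippet with two extended ACLs and one standard ACL.
config='ip access-list extended MGMT
 permit tcp any any eq 22
ip access-list extended USERS
 permit ip any any
ip access-list standard SNMP'

# grep piped to wc -l; tr strips the padding some wc versions add.
piped=$(printf '%s\n' "$config" | grep 'ip access-list extended' | wc -l | tr -d ' ')

# grep -c counts matching lines directly.
counted=$(printf '%s\n' "$config" | grep -c 'ip access-list extended')

echo "wc -l: $piped, grep -c: $counted"
```

Note that both variants count matching lines, not individual matches, which is what we want for one-item-per-line config.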

Example 7 - Find all configured VLANs

Let's talk about something a bit more complex that appears simple at first. We compose a naïve regex for matching lines with VLAN IDs, 'vlan', and run it against a config file:

[przemek@quasar configs]$ grep 'vlan' zen_snyder.cfg
switchport access vlan 150
switchport trunk allowed vlan
switchport trunk native vlan

Right, this didn't work quite as expected. We need to be more strict with our regex. We know that the config line will look like 'vlan <1-4094>', so let's try composing a regex that matches that.

'^vlan [0-9]{1,4}$'

A little breakdown of the above regex:

  • ^ - start from the beginning of the line (no spaces or other preceding characters)
  • 'vlan ' - literal word 'vlan' followed by space
  • [0-9]{1,4} - digit from 0 to 9, between 1 and 4 times
  • $ - end of the line, so no characters allowed after last digit match

Strictly speaking, this could also match VLANs with an invalid ID, like 9501 or 0001, but we know that these should never show up in the config, so it's ok to use it here.

I'm using grep with the -E flag here to enable extended regular expressions; without it we'd have to escape some of the metacharacters.
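
To show what that escaping looks like, here is a minimal sketch comparing the two syntaxes on an invented config snippet. In grep's basic mode (no -E), the braces have to be written as \{ and \}; left unescaped they are treated as ordinary characters.

```shell
export LC_ALL=C

# Invented config snippet.
config='vlan 10
 name VOICE
vlan 2001
interface Vlan10'

# Extended syntax: braces work as a quantifier without escaping.
extended=$(printf '%s\n' "$config" | grep -E '^vlan [0-9]{1,4}$')

# Basic syntax: the same quantifier needs backslashes.
basic=$(printf '%s\n' "$config" | grep '^vlan [0-9]\{1,4\}$')

# Basic syntax with unescaped braces: they are literal characters here,
# so nothing matches (|| true hides grep's "no match" exit status).
unescaped=$(printf '%s\n' "$config" | grep '^vlan [0-9]{1,4}$' || true)

printf 'extended:\n%s\n\n' "$extended"
printf 'basic:\n%s\n\n' "$basic"
printf 'unescaped, basic mode: [%s]\n' "$unescaped"
```

The extended form is easier on the eyes, which is why the command that follows uses it.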

[przemek@quasar configs]$ grep -E '^vlan [0-9]{1,4}$' zen_snyder.cfg
vlan 10
vlan 150
vlan 160
vlan 170
vlan 180
vlan 195
vlan 2001
vlan 2002

Looking better now, we have all of the VLANs. This might, or might not, be useful. What if we want to extract names instead of VLAN IDs? We'll see how to do it in Example 8.

Example 8 - Find names of configured VLANs

We know that the name of a VLAN comes on the line after its VLAN ID (for most Cisco-like config files). It so happens that grep with the -A option will print N lines after the match. Taking advantage of this, we make a modification to our grep command:

[przemek@quasar configs]$ grep -E -A 1 '^vlan [0-9]{1,4}$' zen_snyder.cfg
vlan 10
 name VOICE
vlan 150
 name PROCUREMENT
vlan 160
 name SALES
vlan 170
 name ACCOUNTS
vlan 180
 name MARKETING
vlan 195
 name IT-SUPPORT
vlan 2001
 name INET_NTT
vlan 2002
 name INET_LEVEL3

Option -A works as advertised, great stuff. Now, say we want to see only the names of the VLANs and nothing else. We could pipe the above output to another grep, to show lines with 'name' in them, and then pipe it to a Unix tool like cut or awk, to extract the actual names. I'm going to use awk here:

awk '{print $NF}'

$NF - refers to the last field, and this awk expression is very handy for displaying last column of the given output:

[przemek@quasar configs]$ grep -E -A 1 '^vlan [0-9]{1,4}$' zen_snyder.cfg | grep 'name' | awk '{print $NF}'
VOICE
PROCUREMENT
SALES
ACCOUNTS
MARKETING
IT-SUPPORT
INET_NTT
INET_LEVEL3
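
For comparison, here is a sketch of the same pipeline with cut in place of awk, run on an invented config snippet. It assumes VLAN names contain no spaces. Because each ' name ...' line starts with a space, cut sees an empty first field, so the name ends up in field 3 rather than field 2.

```shell
export LC_ALL=C

# Invented config snippet.
config='vlan 10
 name VOICE
vlan 150
 name PROCUREMENT
interface GigabitEthernet0/1
 switchport access vlan 150'

# awk splits on runs of whitespace, so the last field is the name.
with_awk=$(printf '%s\n' "$config" \
    | grep -E -A 1 '^vlan [0-9]{1,4}$' | grep '^ name ' | awk '{print $NF}')

# cut splits on every single space: field 1 is empty, field 2 is "name".
with_cut=$(printf '%s\n' "$config" \
    | grep -E -A 1 '^vlan [0-9]{1,4}$' | grep '^ name ' | cut -d ' ' -f 3)

printf 'awk:\n%s\n\n' "$with_awk"
printf 'cut:\n%s\n' "$with_cut"
```

awk is the more forgiving of the two when indentation varies, which is one reason to prefer it for this job.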

It is possible to do the name extraction using regex in the Perl mode, but it's a more advanced topic that I might talk about in one of the future blog posts.

Conclusion

I hope that this short introduction to regular expressions, with a brief mention of other Unix tools, showed you how quickly you can perform some of the tasks when working with configuration files. The examples presented show how, with one command and only rudimentary knowledge of regex syntax, you can get useful results.

Personally, I find that it's difficult to beat regex for answering ad hoc questions related to network devices and their config. This is especially true when you're only given minutes to accomplish the task.

Once you have learned a few basic expressions you might find yourself wanting more. I plan to write more on the topic of regular expressions and the use of Unix tools, so stay tuned!

In the meantime you can use one of the online tools for practicing your regex, https://regex101.com/ being one that I use quite often.