NAPALM-Ansible - Automatic validation - part 2

This is the second post in the series on Napalm validation, in which we explore more of the options available for writing validation tests. The first part can be found here: NAPALM-Ansible - Automatic validation - part 1.

First we'll amend the YAML files we used to describe the desired LLDP state and see how that affects the validation results. After that we'll introduce two more examples: one checking BGP peerings and one verifying the interfaces reported by the device.

Partial string matching and regex

To start off, let's change the YAML file for vEOS-01 so that, instead of the full hostnames of the LLDP neighbours, it contains partial matches only, and see what happens. We modify the validation file so that the neighbour on Ethernet1 is given as 'EOS-0' and the neighbour on Ethernet2 as 'vEOS'.

Amended YAML file for vEOS-01:

[przemek@quasar validate]$ cat vEOS-01_lldp.yml
---
- get_lldp_neighbors:
    Ethernet1:
      - hostname: EOS-0
        port: Ethernet1
    Ethernet2:
      - hostname: vEOS
    Ethernet3:
      - hostname: CSR-01.lab
        port: Gi5

Before we run Napalm validation, let's check LLDP neighbours as reported by the device:

[przemek@quasar napalm_validate_p2]$ ansible-playbook get_facts_lldp.yml -e "device=vEOS-01"

PLAY [Get lldp neighbours] *******************************************************************************************************

TASK [Use napalm_get_facts to retrieve lldp neighbours] **************************************************************************
ok: [vEOS-01]

TASK [Debug lldp neighbours] *****************************************************************************************************
ok: [vEOS-01] => {
    "lldp_facts": {
        "ansible_facts": {
            "napalm_lldp_neighbors": {
                "Ethernet1": [
                    {
                        "hostname": "vEOS-02.lab",
                        "port": "Ethernet1"
                    }
                ],
                "Ethernet2": [
                    {
                        "hostname": "vEOS-03.lab",
                        "port": "Ethernet1"
                    }
                ],
                "Ethernet3": [
                    {
                        "hostname": "CSR-01.lab",
                        "port": "Gi5"
                    }
                ]
            }
        },
        "changed": false,
        "failed": false
    }
}

Now it's time to ask Napalm to verify this state against what we expect to find, as expressed in the YAML file:

[przemek@quasar napalm_validate_p2]$ ansible-playbook validate_lldp.yml -e "device=vEOS-01"

PLAY [Validate LLDP neighbours] **************************************************************************************************

TASK [Use Napalm to automatically validate LLDP neighbours] **********************************************************************
ok: [vEOS-01]

TASK [Display full compliance report] ********************************************************************************************
ok: [vEOS-01] => {
    "val_lldp.compliance_report": {
        "complies": true,
        "get_lldp_neighbors": {
            "complies": true,
            "extra": [],
            "missing": [],
            "present": {
                "Ethernet1": {
                    "complies": true,
                    "nested": false
                },
                "Ethernet2": {
                    "complies": true,
                    "nested": false
                },
                "Ethernet3": {
                    "complies": true,
                    "nested": false
                }
            }
        },
        "skipped": []
    }
}

TASK [Compliance check failed] ***************************************************************************************************
skipping: [vEOS-01]

PLAY RECAP ***********************************************************************************************************************
vEOS-01                    : ok=2    changed=0    unreachable=0    failed=0

The state validates just fine, which might come as a surprise. This is because, by default, Napalm checks whether the string from the validation file is contained within the value returned by the device. Both 'EOS-0' and 'vEOS' fit the bill, so Napalm is happy with this.

If you dynamically generate validation files, and always include full hostnames, this should not cause any problems. You won't generally have a false positive match here unless your device names overlap.
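The default behaviour can be pictured with a few lines of Python. This is only a sketch of the matching described above, not Napalm's own code: `loosely_matches` is a hypothetical helper, it assumes the expected string is searched for anywhere in the reported value, and `vEOS-10.lab` is a made-up hostname used to show what overlapping names would do.

```python
import re


def loosely_matches(expected: str, actual: str) -> bool:
    """Hypothetical helper: True if `expected` is found anywhere in `actual`."""
    return re.search(expected, actual) is not None


# Values reported by vEOS-01 in the output above.
print(loosely_matches("EOS-0", "vEOS-02.lab"))   # True
print(loosely_matches("vEOS", "vEOS-03.lab"))    # True

# Overlapping names: a made-up neighbour vEOS-10.lab also satisfies 'vEOS-1'.
print(loosely_matches("vEOS-1", "vEOS-10.lab"))  # True
```

The last line is the kind of false positive to watch out for when device names share a common prefix.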

But what can we do if we want to ensure that the hostnames strictly comply with our intent? Luckily Napalm allows the use of regular expressions to validate string values. For example, if you want to match a hostname exactly, anchor it with the ^ and $ characters, e.g.

- hostname: ^vEOS-01.lab$

To show the use of regex, we first check the LLDP state on vEOS-02 with a partial hostname anchored with ^ and $, something that would pass validation without the anchors:

[przemek@quasar napalm_validate_p2]$ cat validate/vEOS-02_lldp.yml
---
- get_lldp_neighbors:
   Ethernet1:
    - hostname: vEOS-01.lab
      port: Ethernet1
   Ethernet2:
    - hostname: ^vEOS-03$
      port: Ethernet2

Validation output:

[przemek@quasar napalm_validate_p2]$ ansible-playbook validate_lldp.yml -e "device=vEOS-02"

PLAY [Validate LLDP neighbours] **************************************************************************************************

TASK [Use Napalm to automatically validate LLDP neighbours] **********************************************************************
fatal: [vEOS-02]: FAILED! => {"changed": false, "compliance_report": {"complies": false, "get_lldp_neighbors": {"complies": false, "extra": [], "missing": [], "present": {"Ethernet1": {"complies": true, "nested": false}, "Ethernet2": {"actual_value": [{"hostname": "vEOS-03.lab", "port": "Ethernet2"}], "complies": false, "expected_value": [{"hostname": "^vEOS-03$", "port": "Ethernet2"}], "nested": false}}}, "skipped": []}, "failed": true, "msg": "Device does not comply with policy"}
...ignoring

TASK [Display full compliance report] ********************************************************************************************
ok: [vEOS-02] => {
    "val_lldp.compliance_report": {
        "complies": false,
        "get_lldp_neighbors": {
            "complies": false,
            "extra": [],
            "missing": [],
            "present": {
                "Ethernet1": {
                    "complies": true,
                    "nested": false
                },
                "Ethernet2": {
                    "actual_value": [
                        {
                            "hostname": "vEOS-03.lab",
                            "port": "Ethernet2"
                        }
                    ],
                    "complies": false,
                    "expected_value": [
                        {
                            "hostname": "^vEOS-03$",
                            "port": "Ethernet2"
                        }
                    ],
                    "nested": false
                }
            }
        },
        "skipped": []
    }
}

TASK [Compliance check failed] ***************************************************************************************************
fatal: [vEOS-02]: FAILED! => {"changed": false, "failed": true, "msg": "Non-compliant state encountered. Refer to the full report."}

PLAY RECAP ***********************************************************************************************************************
vEOS-02                    : ok=2    changed=0    unreachable=0    failed=1

This time Napalm reported a non-compliant state because we forced an exact match for the neighbour found on Ethernet2.

Below is one more example, this time for vEOS-03, showing exact matches with regex.

[przemek@quasar napalm_validate_p2]$ cat validate/vEOS-03_lldp.yml
---
- get_lldp_neighbors:
   Ethernet1:
    - hostname: ^vEOS-01.lab$
   Ethernet2:
    - hostname: ^vEOS-02.lab$

Validation output:

[przemek@quasar napalm_validate_p2]$ ansible-playbook validate_lldp.yml -e "device=vEOS-03"

PLAY [Validate LLDP neighbours] **************************************************************************************************

TASK [Use Napalm to automatically validate LLDP neighbours] **********************************************************************
ok: [vEOS-03]

TASK [Display full compliance report] ********************************************************************************************
ok: [vEOS-03] => {
    "val_lldp.compliance_report": {
        "complies": true,
        "get_lldp_neighbors": {
            "complies": true,
            "extra": [],
            "missing": [],
            "present": {
                "Ethernet1": {
                    "complies": true,
                    "nested": false
                },
                "Ethernet2": {
                    "complies": true,
                    "nested": false
                }
            }
        },
        "skipped": []
    }
}

TASK [Compliance check failed] ***************************************************************************************************
skipping: [vEOS-03]

PLAY RECAP ***********************************************************************************************************************
vEOS-03                    : ok=2    changed=0    unreachable=0    failed=0

This time the state is compliant, as the hostnames in the validation file match exactly the hostnames reported by the device.
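The effect of the anchors can be reproduced with Python's `re` module. Treat this as a sketch, assuming the pattern from the validation file is searched for in the value reported by the device; `anchored_match` is a hypothetical helper and `vEOS-03xlab` is a made-up hostname. It also shows one subtlety: an unescaped `.` in a pattern matches any character, so escaping it as `\.` is the stricter choice.

```python
import re


def anchored_match(pattern: str, value: str) -> bool:
    """Hypothetical helper: search `value` for `pattern`, anchors included."""
    return re.search(pattern, value) is not None


reported = "vEOS-03.lab"

# Anchored but incomplete name: fails, as in the vEOS-02 example.
print(anchored_match(r"^vEOS-03$", reported))            # False
# Anchored full name: passes.
print(anchored_match(r"^vEOS-03.lab$", reported))        # True
# The unescaped '.' matches any character, so a made-up name passes too.
print(anchored_match(r"^vEOS-03.lab$", "vEOS-03xlab"))   # True
# Escaping the dot makes it match a literal '.' only.
print(anchored_match(r"^vEOS-03\.lab$", "vEOS-03xlab"))  # False
```

For lab hostnames like the ones used here the difference rarely matters, but it is worth keeping in mind when validation files are generated for a larger estate.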

BGP peering validation example

Now, let's move to another example. Let's say we just turned up a new BGP peering with a 3rd party and we expect to receive no more than 10 prefixes from them. We will verify that BGP is up and that we receive the correct number of prefixes.

Our peer's IP is 10.1.11.2 and its AS is 65101. We'll have Napalm check the status of the peering on vEOS-01. The output below omits the state of the internal BGP peers for brevity.

[przemek@quasar napalm_validate_p2]$ ansible-playbook get_bgp_peers.yml -e "device=vEOS-01"

PLAY [Get BGP neighbours] ********************************************************************************************************

TASK [Use napalm_get_facts to retrieve BGP neighbours] ***************************************************************************
ok: [vEOS-01]

TASK [Debug BGP neighbours] ******************************************************************************************************
ok: [vEOS-01] => {
    "bgp_peers": {
        "ansible_facts": {
            "napalm_bgp_neighbors": {
                "global": {
                    "peers": {
                        "10.1.11.2": {
                            "address_family": {
                                "ipv4": {
                                    "accepted_prefixes": -1,
                                    "received_prefixes": 16,
                                    "sent_prefixes": 7
                                },
                                "ipv6": {
                                    "accepted_prefixes": -1,
                                    "received_prefixes": 0,
                                    "sent_prefixes": 0
                                }
                            },
                            "description": "",
                            "is_enabled": true,
                            "is_up": true,
                            "local_as": 65001,
                            "remote_as": 65101,
                            "remote_id": "10.200.10.1",
                            "uptime": 86437
                        }
                    },
                    "router_id": "10.50.255.1"
                }
            }
        },
        "changed": false,
        "failed": false
    }
}

PLAY RECAP ***********************************************************************************************************************
vEOS-01                    : ok=2    changed=0    unreachable=0    failed=0

As with LLDP validation, we need to create a YAML file with a data structure matching the one returned by Napalm's getter. We will ask Napalm to check that we have a peer with an IP of 10.1.11.2 and an AS of 65101. This peer needs to be enabled and up, and we should be sending exactly 7 IPv4 prefixes and receiving no more than 10.

This desired state is captured in the below validation file:

[przemek@quasar napalm_validate_p2]$ cat validate/vEOS-01_bgp.yml
---
- get_bgp_neighbors:
    global:
      peers:
        10.1.11.2:
          is_enabled: true
          is_up: true
          remote_as: 65101
          address_family:
            ipv4:
              sent_prefixes: 7
              received_prefixes: '<=10'

Armed with the above, we ask Napalm to perform the check. Because the full output is quite verbose, I'm only including the interesting bits below:

[przemek@quasar napalm_validate_p2]$ ansible-playbook validate_bgp.yml -e "device=vEOS-01"

PLAY [Validate BGP peerings] *****************************************************************************************************

TASK [Use Napalm to automatically validate BGP peerings] *************************************************************************
fatal: [vEOS-01]: FAILED! => {"changed": false, "compliance_report": {"complies": false, "get_bgp_neighbors": {"complies": false, "extra": [], "missing": [], "present": {"global": {"complies": false, "diff": {"complies": false, "extra": [], "missing": [], "present": {"peers": {"complies": false, "diff": {"complies": false, "extra": [], "missing": [], "present": {"10.1.11.2": {"complies": false, "diff": {"complies": false, "extra": [], "missing": [], "present": {"address_family": {"complies": false, "diff": {"complies": false, "extra": [], "missing": [], "present": {"ipv4": {"complies": false, "diff": {"complies": false, "extra": [], "missing": [], "present": {"received_prefixes": {"actual_value": 16, "complies": false, "expected_value": "<=10", "nested": false}, "sent_prefixes": {"complies": true, "nested": false}}}, "nested": true}}}, "nested": true}, "is_enabled": {"complies": true, "nested": false}, "is_up": {"complies": true, "nested": false}, "remote_as": {"complies": true, "nested": false}}}, "nested": true}}}, "nested": true}}}, "nested": true}}}, "skipped": []}, "failed": true, "msg": "Device does not comply with policy"}
...ignoring

TASK [Display full compliance report] ********************************************************************************************
ok: [vEOS-01] => {
    "val_bgp.compliance_report": {
        "complies": false,
        "get_bgp_neighbors": {
            "complies": false,
            "extra": [],
            "missing": [],
            "present": {
                "global": {
                    ... (omitted for brevity)
                                    "present": {
                                        "10.1.11.2": {
                                            "complies": false,
                                            
                                                                "ipv4": {
                                                                    "complies": false,
                                                                    "diff": {
                                                                        "complies": false,
                                                                        "extra": [],
                                                                        "missing": [],
                                                                        "present": {
                                                                            "received_prefixes": {
                                                                                "actual_value": 16,
                                                                                "complies": false,
                                                                                "expected_value": "<=10",
                                                                                "nested": false
                                                                            },
                                                                            "sent_prefixes": {
                                                                                "complies": true,
                                                                                "nested": false
                                                                            }
    
                                                    },
                                                    "is_enabled": {
                                                        "complies": true,
                                                        "nested": false
                                                    },
                                                    "is_up": {
                                                        "complies": true,
                                                        "nested": false
                                                    },
                                                    "remote_as": {
                                                        "complies": true,
                                                        "nested": false
                                                    }

}

TASK [Compliance check failed] ***************************************************************************************************
fatal: [vEOS-01]: FAILED! => {"changed": false, "failed": true, "msg": "Non-compliant state encountered. Refer to the full report."}

PLAY RECAP ***********************************************************************************************************************
vEOS-01                    : ok=2    changed=0    unreachable=0    failed=1

Napalm informs us that peer 10.1.11.2 is enabled, is up, and has the expected AS number. The number of prefixes we send is also as expected, however the number of prefixes received from the peer is more than 10, hence the overall state is non-compliant.
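To make the '<=10' rule concrete, here is a minimal Python sketch of how such a string could be evaluated against the number reported by the device. It is an illustration only, not Napalm's implementation: `meets_numeric_rule` is a hypothetical helper and it handles just the four operators `<`, `<=`, `>` and `>=`.

```python
import operator
import re

# Map each comparison symbol to the function that implements it.
_OPERATORS = {
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
}


def meets_numeric_rule(rule: str, actual: float) -> bool:
    """Hypothetical helper: evaluate a rule such as '<=10' against a number."""
    match = re.fullmatch(r"\s*(<=|>=|<|>)\s*(-?\d+(?:\.\d+)?)\s*", rule)
    if match is None:
        raise ValueError(f"unsupported rule: {rule!r}")
    symbol, threshold = match.groups()
    return _OPERATORS[symbol](actual, float(threshold))


# received_prefixes was 16 in the output above, so the rule is not met.
print(meets_numeric_rule("<=10", 16))  # False
# A peer sending 7 prefixes would satisfy the same rule.
print(meets_numeric_rule("<=10", 7))   # True
```

Plain numbers in the validation file, such as `sent_prefixes: 7`, are simply compared for equality, which is why that check passed.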

Strict mode and 'list' key

In the last example we will look at how strict mode works and learn how to validate a list of objects.

By default Napalm ignores any extra state found on the device and only checks the state we defined in the validation files. That is, if we assert that a specific BGP peer exists, Napalm will report a compliant state as long as that BGP peer is found on the device, even if there are 100 additional peers.

In strict mode we can, for example, ask Napalm to check that there is one, and only one, BGP peer on the device. Strict mode applies to the hierarchy level at which it is used. This way we can be selective and apply it only to the state that we feel requires it.

To showcase strict mode I'll validate that my device has all of the specified interfaces, no more and no fewer. This example also includes the 'list' key, which Napalm requires in order to validate a list of objects. If the getter output shows multiple objects enclosed in '[]', you need to include the 'list' key when writing the validator for that level.

YAML file with desired state:

[przemek@quasar validate]$ cat vEOS-03_interfaces.yml
---
- get_facts:
    interface_list:
      _mode: strict
      list:
        - Ethernet1
        - Ethernet2
        - Loopback0
        - Loopback100
        - Management1
        - Vlan700
        - Vlan721
        - Vlan4094

Validation results:

[przemek@quasar napalm_validate_p2]$ ansible-playbook validate_interface_list.yml -e "device=vEOS-03"

PLAY [Validate interfaces] *******************************************************************************************************

TASK [Use Napalm to automatically validate existing interfaces] ******************************************************************
fatal: [vEOS-03]: FAILED! => {"changed": false, "compliance_report": {"complies": false, "get_facts": {"complies": false, "extra": [], "missing": [], "present": {"interface_list": {"complies": false, "diff": {"complies": false, "extra": ["Loopback999"], "missing": [], "present": ["Ethernet1", "Ethernet2", "Loopback0", "Loopback100", "Management1", "Vlan700", "Vlan721", "Vlan4094"]}, "nested": true}}}, "skipped": []}, "failed": true, "msg": "Device does not comply with policy"}
...ignoring

TASK [Display full compliance report] ********************************************************************************************
ok: [vEOS-03] => {
    "val_intf.compliance_report": {
        "complies": false,
        "get_facts": {
            "complies": false,
            "extra": [],
            "missing": [],
            "present": {
                "interface_list": {
                    "complies": false,
                    "diff": {
                        "complies": false,
                        "extra": [
                            "Loopback999"
                        ],
                        "missing": [],
                        "present": [
                            "Ethernet1",
                            "Ethernet2",
                            "Loopback0",
                            "Loopback100",
                            "Management1",
                            "Vlan700",
                            "Vlan721",
                            "Vlan4094"
                        ]
                    },
                    "nested": true
                }
            }
        },
        "skipped": []
    }
}

TASK [Compliance check failed] ***************************************************************************************************
fatal: [vEOS-03]: FAILED! => {"changed": false, "failed": true, "msg": "Non-compliant state encountered. Refer to the full report."}

PLAY RECAP ***********************************************************************************************************************
vEOS-03                    : ok=2    changed=0    unreachable=0    failed=1

Napalm checked all of the interfaces and it noticed there is an extra interface it didn't expect to see there, so the overall state is non-compliant. Let's remove this interface, which seems to have been added in error, and re-run the playbook.

[przemek@quasar napalm_validate_p2]$ ansible-playbook validate_interface_list.yml -e "device=vEOS-03"

PLAY [Validate interfaces] *******************************************************************************************************

TASK [Use Napalm to automatically validate existing interfaces] ******************************************************************
ok: [vEOS-03]

TASK [Display full compliance report] ********************************************************************************************
ok: [vEOS-03] => {
    "val_intf.compliance_report": {
        "complies": true,
        "get_facts": {
            "complies": true,
            "extra": [],
            "missing": [],
            "present": {
                "interface_list": {
                    "complies": true,
                    "nested": true
                }
            }
        },
        "skipped": []
    }
}

TASK [Compliance check failed] ***************************************************************************************************
skipping: [vEOS-03]

PLAY RECAP ***********************************************************************************************************************
vEOS-03                    : ok=2    changed=0    unreachable=0    failed=0

Once the erroneously added interface has been removed, Napalm happily reports that everything is in order. This shows that strict mode does indeed require the state reported by the device to match exactly the desired state captured in the validation file.
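The difference between the default and strict modes for a list can be sketched in Python as follows. This is a simplified illustration that mirrors the 'extra', 'missing' and 'present' keys seen in the reports above; `compare_list` is a hypothetical helper, not a Napalm function, and it compares items for plain equality only.

```python
def compare_list(expected: list, actual: list, strict: bool = False) -> dict:
    """Hypothetical helper: compare an expected list with the reported one."""
    present = [item for item in expected if item in actual]
    missing = [item for item in expected if item not in actual]
    # Extra items only count against compliance in strict mode.
    extra = [item for item in actual if item not in expected] if strict else []
    return {
        "complies": not missing and not extra,
        "extra": extra,
        "missing": missing,
        "present": present,
    }


expected_interfaces = [
    "Ethernet1", "Ethernet2", "Loopback0", "Loopback100",
    "Management1", "Vlan700", "Vlan721", "Vlan4094",
]
# State of vEOS-03 before the stray interface was removed.
reported_interfaces = expected_interfaces + ["Loopback999"]

default_result = compare_list(expected_interfaces, reported_interfaces)
strict_result = compare_list(expected_interfaces, reported_interfaces, strict=True)

print(default_result["complies"])  # True
print(strict_result["complies"])   # False
print(strict_result["extra"])      # ['Loopback999']
```

Without strict mode the extra Loopback999 would have gone unnoticed, which is exactly the situation strict mode is meant to catch.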

To summarise, the additional options we discussed in this post allow us to be more precise with our validation tests and should help increase confidence in the correct state of the infrastructure under our control. Even simple testing performed after maintenance can save hours of troubleshooting later, so I'd highly encourage introducing Napalm, and its validation module, into your toolbox.

Writing the YAML files used by Napalm is fairly intuitive, and small time investments can have big returns, especially if we automate the creation of these files, which we will talk about in part 3 of this series. Stay tuned!

Playbook listings


get_facts_lldp.yml

---
- name: Get lldp neighbours
  hosts: "{{ device }}"
  connection: local

  tasks:
  - name: Use napalm_get_facts to retrieve lldp neighbours
    napalm_get_facts:
      provider: "{{ napalm_provider }}"
      optional_args:
        eos_transport: http
      filter: 'lldp_neighbors'
    register: lldp_facts

  - name: Debug lldp neighbours
    debug:
      var: lldp_facts


validate_lldp.yml

---
- name: Validate LLDP neighbours
  hosts: "{{ device }}"
  connection: local

  vars:
    val_dir: "{{ playbook_dir }}/validate"

  tasks:
  - name: Use Napalm to automatically validate LLDP neighbours
    napalm_validate:
      provider: "{{ napalm_provider }}"
      validation_file: "{{ val_dir }}/{{ inventory_hostname }}_lldp.yml"
      optional_args:
        eos_transport: http
    register: val_lldp
    ignore_errors: yes

  - name: Display full compliance report
    debug:
      var: val_lldp.compliance_report

  - name: Compliance check failed
    fail:
      msg: "Non-compliant state encountered. Refer to the full report."
    when: not val_lldp.compliance_report.complies


get_bgp_peers.yml

---
- name: Get BGP neighbours
  hosts: "{{ device }}"
  connection: local

  tasks:
  - name: Use napalm_get_facts to retrieve BGP neighbours
    napalm_get_facts:
      provider: "{{ napalm_provider }}"
      optional_args:
        eos_transport: http
      filter: 'bgp_neighbors'
    register: bgp_peers

  - name: Debug BGP neighbours
    debug:
      var: bgp_peers


validate_bgp.yml

---
- name: Validate BGP peerings
  hosts: "{{ device }}"
  connection: local

  vars:
    val_dir: "{{ playbook_dir }}/validate"

  tasks:
  - name: Use Napalm to automatically validate BGP peerings
    napalm_validate:
      provider: "{{ napalm_provider }}"
      validation_file: "{{ val_dir }}/{{ inventory_hostname }}_bgp.yml"
      optional_args:
        eos_transport: http
    register: val_bgp
    ignore_errors: yes

  - name: Display full compliance report
    debug:
      var: val_bgp.compliance_report

  - name: Compliance check failed
    fail:
      msg: "Non-compliant state encountered. Refer to the full report."
    when: not val_bgp.compliance_report.complies


validate_interface_list.yml

---
- name: Validate interfaces
  hosts: "{{ device }}"
  connection: local

  vars:
    val_dir: "{{ playbook_dir }}/validate"

  tasks:
  - name: Use Napalm to automatically validate existing interfaces
    napalm_validate:
      provider: "{{ napalm_provider }}"
      validation_file: "{{ val_dir }}/{{ inventory_hostname }}_interfaces.yml"
      optional_args:
        eos_transport: http
    register: val_intf
    ignore_errors: yes

  - name: Display full compliance report
    debug:
      var: val_intf.compliance_report

  - name: Compliance check failed
    fail:
      msg: "Non-compliant state encountered. Refer to the full report."
    when: not val_intf.compliance_report.complies
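
The compliance reports displayed by these playbooks are deeply nested, which makes the failing checks hard to spot. One possible way to condense a report is sketched below in Python. It assumes the report layout seen in the outputs in this post ('present', 'diff', 'extra', 'missing', 'expected_value', 'actual_value'); `failed_checks` and `summarise` are hypothetical helpers and are not part of Napalm or napalm-ansible.

```python
def failed_checks(node: dict, path: tuple = ()) -> list:
    """Hypothetical helper: describe the non-compliant items below `node`."""
    where = "/".join(path)
    if "expected_value" in node:
        # Leaf: a single value that did not match the validation file.
        return [f"{where}: expected {node['expected_value']!r}, "
                f"got {node.get('actual_value')!r}"]
    failures = []
    # Nested levels keep their details under "diff"; getter levels hold them directly.
    details = node.get("diff", node)
    for kind in ("missing", "extra"):
        for item in details.get(kind, []):
            failures.append(f"{where}: {kind} {item!r}")
    present = details.get("present", {})
    if isinstance(present, dict):  # for validated lists "present" is a plain list
        for key, child in present.items():
            if not child.get("complies", True):
                failures.extend(failed_checks(child, path + (key,)))
    return failures


def summarise(report: dict) -> list:
    """Hypothetical helper: walk every getter in a full compliance report."""
    failures = []
    for getter, result in report.items():
        if isinstance(result, dict) and not result.get("complies", True):
            failures.extend(failed_checks(result, (getter,)))
    return failures


# Trimmed copy of the non-compliant interface report shown earlier.
interface_report = {
    "complies": False,
    "get_facts": {
        "complies": False, "extra": [], "missing": [],
        "present": {
            "interface_list": {
                "complies": False,
                "diff": {"complies": False, "extra": ["Loopback999"],
                         "missing": [], "present": ["Ethernet1", "Ethernet2"]},
                "nested": True,
            }
        },
    },
    "skipped": [],
}

print(summarise(interface_report))
# ["get_facts/interface_list: extra 'Loopback999'"]
```

A helper along these lines could live in a standalone script that reads a saved report, or be adapted into a custom Ansible filter.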

Full listings of the playbooks can also be found in my GitHub repository:
https://github.com/progala/ttl255.com/tree/master/ansible/napalm-validate-p2