
Using a tunnelled network as router gateway in Neutron ML2/OVN

One symptom of narcissism is making self-referring quotes; thus this is how I’ll start this post. Previously in this fantastic blog, we talked about the highly available router gateway ports in OVN. In that post we briefly commented that the gateway Logical Router Ports are scheduled across the gateway chassis, using any of the implemented OVN L3 schedulers. A router gateway port is the port that connects the router with the gateway network; that means all the external traffic is routed through this port. In OVN, all the SNAT traffic uses this (centralized) port; the DNAT traffic (floating IP traffic) also uses this port if DVR is not enabled or if the compute node hosting the port has no external connectivity (external connectivity is enabled with the option enable-chassis-as-gw in the local Open vSwitch database).

The OVN L3 scheduler.

The OVN L3 scheduler is the piece of code that takes a gateway Logical Router Port, retrieves the current chassis with external connectivity (that is, the gateway chassis) and assigns a set of chassis to this gateway Logical Router Port. That’s all, quite simple. But of course we love to make things complex in Neutron, so the logic behind it deserves some initial explanation. We have recently added a specific documentation page for it. As I always say, I recommend reading the latest version of the document to understand how the different OVN L3 schedulers work.

But despite all the complexity of the logic, the goal is to create a set of gateway chassis and assign it to the Logical Router Port gateway_chassis field. Our colleague Numan Siddique has a very good post describing how to manually create an OVN router with a distributed gateway. The section “Scheduling in HA mode” lists the commands and the output needed to schedule a gateway port on multiple gateway chassis (the same thing the OVN L3 scheduler does).

“Bring me the candidate chassis.”

But before assigning the gateway chassis, the scheduler needs to select them. This is the method that gets the candidate chassis to be scheduled. Leaving aside the availability zone filtering, this method (1) retrieves all chassis with external connectivity (that is, with the key enable-chassis-as-gw in the CMS options) and then (2) filters those chassis by the physical network (physnet) present in the bridge mappings. Thus only the chassis with connectivity to a specific physical network can be candidates.
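The two-step selection above can be sketched in a few lines of Python. This is a hypothetical illustration of the logic, not the actual Neutron code; the data shapes and names are assumptions.

```python
# Hypothetical sketch of the candidate-selection logic: a chassis is a
# gateway candidate only if it advertises "enable-chassis-as-gw" in its
# CMS options AND maps the requested physical network in its bridge
# mappings. Data shapes and names are illustrative.

def get_candidates(chassis_list, physnet):
    candidates = []
    for chassis in chassis_list:
        cms_options = chassis.get("ovn-cms-options", "").split(",")
        if "enable-chassis-as-gw" not in cms_options:
            continue  # (1) no external connectivity advertised
        # (2) bridge mappings look like "physnet1:br-ex,physnet2:br-ex2"
        mappings = chassis.get("ovn-bridge-mappings", "")
        mapped = [m.split(":")[0] for m in mappings.split(",") if m]
        if physnet in mapped:
            candidates.append(chassis["name"])
    return candidates

chassis_list = [
    {"name": "cmp-1", "ovn-cms-options": "",
     "ovn-bridge-mappings": ""},
    {"name": "gw-1", "ovn-cms-options": "enable-chassis-as-gw",
     "ovn-bridge-mappings": "physnet1:br-ex"},
    {"name": "gw-2", "ovn-cms-options": "enable-chassis-as-gw",
     "ovn-bridge-mappings": "physnet2:br-ex"},
]
print(get_candidates(chassis_list, "physnet1"))  # ['gw-1']
```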

What happens if the external network is tunnelled? There are no candidates, so the Logical Router Port cannot be scheduled on any chassis. In other words, this router won’t have external connectivity.

L3 gateway router.

Ordinary OVN logical routers are distributed: they are not implemented in a single place but rather in every hypervisor chassis. However, OVN supports L3 gateway routers, which are OVN logical routers implemented in a single designated chassis.

ovn-northd uses a special l3gateway port (instead of a patch binding) in the Southbound database to connect the logical router to its neighbors. In turn, ovn-controller tunnels packets to this port binding to the designated L3 gateway chassis, instead of processing them locally.
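If you want to see this special port binding yourself, you can query the Southbound database on a node with access to it; something like the following should work (the command is a read-only inspection, and the output depends entirely on your deployment):

```shell
# List the Port_Binding records of type "l3gateway" in the
# Southbound database (run where ovn-sbctl can reach the SB DB):
ovn-sbctl find Port_Binding type=l3gateway
```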

This functionality allows us to bind a router to a chassis, instead of binding the gateway Logical Router Port. If the chassis has connectivity to the external tunnelled network, the router will have external connectivity. By design, all gateway chassis have connectivity to the overlay network.

Easy peasy! If we assign a tunnelled network as the gateway network of a router, instead of using the OVN L3 scheduler to assign the chassis to the Logical Router Port, we create an L3 gateway router by assigning a chassis to the router. That is implemented in the router chassis pinning feature.
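Manually, the pinning boils down to setting the chassis key in the Logical Router options, as described in ovn-nb(5). A minimal sketch, with example router and chassis names:

```shell
# Pin the logical router to a chassis, turning it into a gateway
# router implemented in that single chassis (names are examples):
ovn-nbctl set Logical_Router my-router options:chassis=gw-1

# To undo the pinning, remove the key from the options map:
ovn-nbctl remove Logical_Router my-router options chassis
```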

The only drawback of this feature is that all DNAT and SNAT traffic will be processed in the designated chassis and we won’t have distributed floating IP traffic.

High availability.

OVN has its own method to provide highly available services: the HA Chassis Group table. Each record in this table is a list of HA Chassis records, each of which is, in turn, a chassis with a priority. For example, to provide connectivity to external ports, Neutron creates an HA Chassis Group with a set of HA Chassis, each one pointing to a chassis with external connectivity. This HA Chassis Group is assigned to the Logical Switch Port. OVN binds the port to the highest priority chassis on the list; if this chassis fails, OVN catches the event, reads the HA Chassis list and rebinds the port to the next highest priority chassis.
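For reference, this is roughly how the same plumbing looks when done by hand with the ovn-nbctl HA chassis group subcommands; group, chassis and port names here are made up for the example:

```shell
# Create an HA Chassis Group and add two chassis with priorities
# (the highest priority chassis will get the port binding):
ovn-nbctl ha-chassis-group-add my-ha-group
ovn-nbctl ha-chassis-group-add-chassis my-ha-group gw-1 30
ovn-nbctl ha-chassis-group-add-chassis my-ha-group gw-2 20

# Attach the group to the external Logical Switch Port by UUID:
group_uuid=$(ovn-nbctl --bare --columns=_uuid \
             find HA_Chassis_Group name=my-ha-group)
ovn-nbctl set Logical_Switch_Port my-ext-port \
    ha_chassis_group="$group_uuid"
```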

But we still don’t have this functionality for Logical Routers (although it is on its way!). Therefore we need to “manually” implement this high availability functionality in Neutron.

I didn’t mention this before: to pin a Logical Router to a chassis, we need to write the key chassis in its options field. Where does this chassis come from? From the same artifact OVN uses for high availability: an HA Chassis Group record that is created per Logical Router. But instead of assigning this HA Chassis Group to the Logical Router (which is not possible right now), we retrieve the highest priority chassis from it.
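The "retrieve the highest priority chassis" step is straightforward; here is a minimal Python sketch of that selection, with an illustrative data shape (a list of name/priority pairs), not Neutron's actual objects:

```python
# Minimal sketch: pick the chassis to write into the Logical Router's
# options:chassis key, i.e. the highest-priority member of the
# router's HA Chassis Group. Data shape is illustrative.

def highest_priority_chassis(ha_chassis):
    """ha_chassis: list of (chassis_name, priority) tuples."""
    if not ha_chassis:
        return None
    return max(ha_chassis, key=lambda hc: hc[1])[0]

group = [("gw-1", 30), ("gw-2", 20), ("gw-3", 10)]
print(highest_priority_chassis(group))  # gw-1
```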

At the same time, Neutron has two event watchers. The first one captures any chassis event; it detects whether a chassis was removed or added, and updates the HA Chassis Groups accordingly. The second one captures any HA Chassis Group change. If the first event has updated any HA Chassis Group associated with a Logical Router, this second event will capture it. If the highest priority chassis has changed, the event watcher updates the Logical Router options:chassis value, which rebinds the Logical Router to the new chassis.
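The second event watcher can be sketched as a small pure function: compare the group's new highest-priority chassis with the one currently written in the router options, and rewrite it only on a change. All names and data shapes below are hypothetical, for illustration only:

```python
# Sketch of the second event watch: when an HA Chassis Group changes,
# rebind the router only if its highest-priority chassis changed.
# Names and data shapes are hypothetical.

def on_ha_chassis_group_update(router, ha_chassis, set_option):
    """router: dict with "name" and "options"; ha_chassis: list of
    (chassis_name, priority); set_option: callback that writes the
    options:chassis key in the NB database."""
    new_chassis = max(ha_chassis, key=lambda hc: hc[1])[0]
    if router["options"].get("chassis") != new_chassis:
        set_option(router["name"], "chassis", new_chassis)
        router["options"]["chassis"] = new_chassis
        return True   # router rebound to a new chassis
    return False      # highest priority chassis unchanged

updates = []
router = {"name": "r1", "options": {"chassis": "gw-1"}}
# gw-1 was removed from the group; gw-2 now has the top priority.
on_ha_chassis_group_update(
    router, [("gw-2", 20), ("gw-3", 10)],
    lambda name, key, value: updates.append((name, key, value)))
print(updates)  # [('r1', 'chassis', 'gw-2')]
```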

Stay tuned for more posts!
