OpenStack OVS/OVN QoS

Tunnelled networks support in Neutron Placement API report

A long time ago (only one year after I started “playing” with OpenStack), Miguel Angel Ajo filed an RFE in Launchpad: strict minimum bandwidth support. After several releases, we finally added this functionality to OpenStack Neutron, and we can now model and report to the Placement API the available bandwidth of the physical interfaces connected to the physical networks (flat and VLAN networks).

But nowadays most OpenStack administrators deploy Neutron using overlay networks (VXLAN or Geneve), and that leads to a problem: how do we report the available bandwidth for tunnelled networks?

How to represent the tunnelled traffic.

For physically backed networks (flat or VLAN networks), the resource provider is the physical network interface. Both ML2/OVS and ML2/OVN have configuration parameters that link a physical network to an OVS bridge (called bridge mappings in both mechanism drivers). Each physical network has its own resource provider under the parent resource provider of the corresponding mechanism driver, which in turn sits under the top-level parent, the compute node.

But we don’t have this information for tunnelled networks, so at first sight we can’t model this traffic in the Placement API. However, we know that in ML2/OVS the tunnelled traffic goes through the local_ip VTEP interface, which is connected to a single physical network; in ML2/OVN, the tunnelled traffic between chassis uses its own overlay network. Knowing that all the ingress and egress tunnelled traffic of a compute node uses a single network, why don’t we create a specific resource provider name for this kind of network? The default name provided in the implementation of this feature is “rp_tunnelled”, and it can be changed.

This configurable string can be used like any other physical network name in the Neutron Placement API configuration, and it represents the available bandwidth (ingress and egress) of the tunnelled network connected to the compute node. The implementation patch provides some examples for both the ML2/OVS and ML2/OVN mechanism drivers.
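To make the idea concrete, here is a minimal Python sketch of how a bandwidth configuration value could list the tunnelled resource provider next to a regular bridge. The comma-separated `name:egress:ingress` format, the helper name `parse_bandwidths` and the numeric values are assumptions made for illustration; check the implementation patch for the exact option names and syntax of each mechanism driver.

```python
from typing import Dict, NamedTuple


class Bandwidth(NamedTuple):
    """Available bandwidth of one resource provider, in kilobits per second."""
    egress_kbps: int
    ingress_kbps: int


def parse_bandwidths(value: str) -> Dict[str, Bandwidth]:
    """Parse 'name:egress:ingress,name:egress:ingress' into a dict.

    Hypothetical helper: the format is assumed, not taken from Neutron code.
    """
    result: Dict[str, Bandwidth] = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate empty entries such as a trailing comma
        name, egress, ingress = item.split(":")
        if name in result:
            raise ValueError(f"duplicate entry for {name!r}")
        result[name] = Bandwidth(int(egress), int(ingress))
    return result


# Illustrative values: one bridge mapped to a physical network, plus the
# tunnelled resource provider name ("rp_tunnelled" is the default name).
bandwidths = parse_bandwidths("br-physnet:10000:10000,rp_tunnelled:5000:5000")
for name, bw in bandwidths.items():
    print(name, bw.egress_kbps, bw.ingress_kbps)
```

The point of the sketch is only that the tunnelled name sits in the same list as the bridges, so it is treated as one more source of bandwidth on the compute node.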

Shared resource providers.

And what happens if we share the same network interface for VLAN tagged traffic (physical network) and tunnelled traffic (VXLAN, Geneve)? This is a very common situation, where the compute node has a single user traffic interface for every network type.

Let’s first review how the Placement API works. We have a resource class (network bandwidth in this case); we have a resource provider (that is, something that has a finite and countable supply of resource class units); and then we have the “traits”, which describe qualitative characteristics of a resource provider.

Let’s take an example: we have an OVS agent in a compute node, and this compute node is connected to the physical network “physnet” through the OVS bridge “br-physnet”. When the OVS agent starts, it reads the configuration and creates a resource provider (that provides network bandwidth, which is the resource class), and this resource provider has a trait describing it; by default Neutron creates traits with the name of the physical network, as in CUSTOM_PHYSNET.

Compute RP (name=hostname)
+-------+OVS agent RP (for OVS agent) inventory:
              +------+Physnet network interface RP,
                      traits: CUSTOM_PHYSNET, CUSTOM_VNIC_TYPE_NORMAL
                        {NET_BW_IGR_KILOBIT_PER_SEC: 10000,
                         NET_BW_EGR_KILOBIT_PER_SEC: 10000}

When bandwidth is requested, the request specifies an amount of a resource class and the trait, or set of traits, that the resource provider must have. This is described in the “QoS minimum bandwidth allocation in Placement API” spec.

Then, if we have two networks (one physical, one tunnelled) using the same physical interface, why don’t we describe the single resource provider with two different traits? For instance, in the example above, we can use the traits CUSTOM_PHYSNET and CUSTOM_RP_TUNNELLED in the same resource provider. The Placement API requests for both networks, each using a different trait, will subtract the available bandwidth from the same inventory. This is what was implemented in “Allow shared resources between physical and tunnelled networks”.
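The effect of sharing one resource provider between two traits can be sketched with a toy model in Python. This is not the Placement API nor Neutron code: the `ResourceProvider` class, its `allocate` method and the numbers are hypothetical, and only mimic the matching rule described above (the provider must have the requested traits and enough free units of the resource class).

```python
from dataclasses import dataclass, field
from typing import Dict, Set

EGR = "NET_BW_EGR_KILOBIT_PER_SEC"


@dataclass
class ResourceProvider:
    """Toy stand-in for a Placement resource provider (illustrative only)."""
    name: str
    traits: Set[str]
    inventory: Dict[str, int]
    used: Dict[str, int] = field(default_factory=dict)

    def available(self, resource_class: str) -> int:
        total = self.inventory.get(resource_class, 0)
        return total - self.used.get(resource_class, 0)

    def allocate(self, resource_class: str, amount: int,
                 required_traits: Set[str]) -> bool:
        # The provider must carry every requested trait...
        if not required_traits <= self.traits:
            return False
        # ...and still have enough free units of the resource class.
        if amount > self.available(resource_class):
            return False
        self.used[resource_class] = self.used.get(resource_class, 0) + amount
        return True


# One interface shared by the physical network and the tunnelled network.
shared_nic = ResourceProvider(
    name="shared-interface",
    traits={"CUSTOM_PHYSNET", "CUSTOM_RP_TUNNELLED", "CUSTOM_VNIC_TYPE_NORMAL"},
    inventory={EGR: 10000},
)

vlan_ok = shared_nic.allocate(EGR, 6000, {"CUSTOM_PHYSNET"})
tunnel_ok = shared_nic.allocate(EGR, 3000, {"CUSTOM_RP_TUNNELLED"})
too_much = shared_nic.allocate(EGR, 2000, {"CUSTOM_RP_TUNNELLED"})
print(vlan_ok, tunnel_ok, too_much, shared_nic.available(EGR))
# prints: True True False 1000
```

Both requests draw from the same 10000 kbps inventory even though they ask for different traits, so the third request fails once only 1000 kbps remain.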

Stay tuned for more posts!
