
Optimize live migration in OpenStack

Before diving in, I want to highlight the people responsible for this new optimization: Stefan Hoffman, Yatin Karel and Sean Mooney.

One of the major pain points in OpenStack is live migration, particularly regarding connectivity immediately after the virtual machine resumes on the destination node. In clusters with mission-critical workloads, it is imperative that network connectivity is restored instantly and seamlessly. Sometimes that doesn’t happen and connectivity is lost for a short time (a few seconds); this is more frequent in heavily loaded environments hosting large virtual machines.

The problem.

The live migration can be summarized in the following steps:

  • Preparation: Nova initiates the migration by verifying resources and instructing Neutron to provision the necessary logical network ports on the destination host. At this point Nova updates the Neutron port with the migrating_to key (see the sketch after this list).
  • Memory Transfer: QEMU copies the running virtual machine’s memory pages from the source hypervisor to the destination hypervisor while the virtual machine continues to handle traffic.
  • Cutover: The virtual machine is momentarily paused on the source host to transfer the final CPU state and remaining dirty memory pages to the destination.
  • Activation: The virtual machine resumes execution on the destination host, activates its network interface, and broadcasts a RARP packet to announce its presence. The resume is triggered by the Nova Compute agent after the Neutron API confirms to the Nova API that the port is plugged.
  • Port Claiming: The ovn-controller on the destination host detects the active interface and updates the OVN Southbound Database to officially claim the physical port binding.
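
To make the preparation step more tangible, here is a minimal sketch (using openstacksdk) that checks whether a port currently carries the migrating_to hint. The cloud name and port UUID are placeholders, and the exact location of the key in the binding profile depends on your Nova and Neutron versions, so take it as an illustration rather than a stable API.

```python
import openstack

# Placeholder cloud name (from clouds.yaml) and Neutron port UUID.
conn = openstack.connect(cloud="mycloud")
port = conn.network.get_port("11111111-2222-3333-4444-555555555555")

# During a live migration Nova annotates the port binding with the destination
# host; here we assume the hint shows up as "migrating_to" in the binding
# profile (illustration only, not a stable contract).
profile = port.binding_profile or {}
if "migrating_to" in profile:
    print(f"Port {port.id} is being migrated to {profile['migrating_to']}")
else:
    print(f"Port {port.id} is not being live-migrated right now")
```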

(I think little Timmy has a question) “How is it possible that the Neutron API sends the confirmation that the port is plugged, when the plugging actually happens only after the virtual machine is activated and the port is claimed?”

Because Neutron lies. The ML2/OVN mechanism driver receives a port update after Nova starts the migration; if the port is down and it has the “migrating_to” field, Neutron considers that the port is in the middle of a live migration and sets it up in order to send the vif-interface-plugged event. Nova waits for this event before resuming the virtual machine on the destination host (activation phase).
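
The behaviour described above boils down to a simple check. The following snippet is not the actual Neutron code, just a paraphrase of the logic with hypothetical helper and field names:

```python
def send_vif_plugged_event(port):
    """Stand-in for Neutron notifying Nova (hypothetical helper)."""
    print(f"vif-interface-plugged sent for port {port['id']}")


def handle_port_update(port):
    """Paraphrase of the behaviour described above, with hypothetical names.

    If a DOWN port carries the migrating_to hint, Neutron assumes a live
    migration is in progress and emits vif-interface-plugged early, before
    the TAP device actually exists on the destination host.
    """
    profile = port.get("binding:profile", {})
    migrating = bool(profile.get("migrating_to"))

    if port["status"] == "DOWN" and migrating:
        # The "lie": tell Nova the interface is plugged so that it starts
        # the activation phase on the destination host.
        send_vif_plugged_event(port)


# Example: a port that Nova just annotated for a live migration.
handle_port_update({
    "id": "11111111-2222-3333-4444-555555555555",
    "status": "DOWN",
    "binding:profile": {"migrating_to": "dst-compute-1"},
})
```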

In the described sequence, the TAP interface is created on the destination host by QEMU during the activation phase. At this point, the local ovn-controller detects the new port and starts the port claiming phase. That triggers the flow recalculation and the update of the OpenFlow rule table of the local Open vSwitch instance. As you can figure out, that is too late if the virtual machine was already using the network. Unless the port claiming and the flow table update are done very fast, some packets will inevitably be dropped.
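
If you want to see this race with your own eyes, a quick way is to watch the Port_Binding claim from a node with access to the OVN Southbound database. This is only an observation sketch; the port UUID is a placeholder and ovn-sbctl must be able to reach the SB DB:

```python
import subprocess
import time

PORT_ID = "11111111-2222-3333-4444-555555555555"  # placeholder Neutron port UUID


def bound_chassis(port_id: str) -> str:
    """Return the chassis UUID currently claiming the port (empty until claimed)."""
    return subprocess.check_output(
        ["ovn-sbctl", "--bare", "--columns=chassis",
         "find", "Port_Binding", f"logical_port={port_id}"],
        text=True).strip()


# Poll once per second during the migration to see when the port is claimed
# (and, by extension, when the flow recalculation can even start).
for _ in range(60):
    print(time.strftime("%H:%M:%S"), bound_chassis(PORT_ID) or "<not claimed>")
    time.sleep(1)
```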

The solution.

Like my grandma used to say, quit lying. But that requires a little extra effort from the Nova Compute agent: the TAP device must be created on the destination host before the activation, and the virtual machine XML must specify that this interface won't be managed by libvirt.

So the first patch implemented (TAP creation by os-vif) makes the os-vif library responsible for the TAP device creation when a virtual machine is created or migrated. Of course, Nova must be aware of this new functionality, handled by os-vif, and must build the XML definition in accordance with it (Support the pre-creation of OVS/OVN ports).
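
The following is a rough sketch of the idea, not the actual os-vif code: pre-create the TAP device on the destination host and then tell libvirt, in the interface XML, not to manage that device (the managed attribute on the target element requires a reasonably recent libvirt).

```python
import subprocess

TAP_NAME = "tap-demo0"  # placeholder; os-vif derives the real name from the port UUID

# Pre-create the TAP device on the destination host (simplified version of
# what the os-vif patch does; requires root, no MTU/ownership handling here).
subprocess.check_call(["ip", "tuntap", "add", "dev", TAP_NAME, "mode", "tap"])
subprocess.check_call(["ip", "link", "set", TAP_NAME, "up"])

# Sketch of the interface definition Nova would then put in the domain XML:
# an ethernet-type interface whose target device is NOT managed by libvirt,
# so libvirt reuses the pre-created TAP instead of creating its own
# (the managed attribute needs a reasonably recent libvirt).
INTERFACE_XML = f"""
<interface type="ethernet">
  <target dev="{TAP_NAME}" managed="no"/>
  <model type="virtio"/>
</interface>
"""
print(INTERFACE_XML)
```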

The next step is the port binding event handled by Neutron that triggers the vif-interface-plugged event. As commented in the previous section, Nova starts the activation of the virtual machine on the destination host when Neutron informs it that the TAP device is present there. But this time Neutron won't be lying: it will actually wait for the TAP device on the destination host (Wait for the additional chassis in the Port_Binding in live migration).
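
Again as an observation aid, and assuming OVN 22.06 or newer, these are the Port_Binding columns involved in that check; the port UUID is a placeholder:

```python
import subprocess

PORT_ID = "11111111-2222-3333-4444-555555555555"  # placeholder Neutron port UUID

# The destination chassis is first listed in requested_additional_chassis
# (what Neutron asks for) and later in additional_chassis, once the
# destination ovn-controller has seen the pre-created TAP device and
# claimed the port as an additional binding.
out = subprocess.check_output(
    ["ovn-sbctl",
     "--columns=chassis,additional_chassis,requested_additional_chassis",
     "find", "Port_Binding", f"logical_port={PORT_ID}"],
    text=True)
print(out)
```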

The lever: OVN support for multiple requested-chassis.

All this is possible in OVN thanks to the support for the multiple requested chassis feature. This functionality is described by the author himself (Ihar Hrachyshka) in his own blog post. If you want to dive deep into this feature, I recommend reading it. Before this functionality, live migration was a bit painful in OVN because a port had only one single chassis associated. That is true most of the time: when a virtual machine is created, the TAP device associated with it is created on the compute node where the virtual machine is spawned. The ovn-controller of this node detects this new port and claims it; that updates the Port_Binding record associated with this port and binds it to this chassis.

But during the live migration, for a short period of time, the port exists on two compute nodes at the same time. Before this feature, both ovn-controller services tried to claim the same port, causing Port_Binding flapping in the Southbound database.

But with this new feature, OVN is capable of assigning two chassis to the same port, using the requested_chassis and requested_additional_chassis fields of the Port_Binding record. That is what Neutron now expects in order to consider that a port is being migrated.
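
From the northbound side, this is roughly how a CMS asks for the double binding. The chassis names and the port UUID are placeholders, and the comma-separated requested-chassis syntax requires OVN 22.06 or newer:

```python
import subprocess

LSP = "11111111-2222-3333-4444-555555555555"  # logical switch port = Neutron port UUID

# During the migration window, request the port on both the source and the
# destination chassis; ovn-northd turns this into requested_chassis plus
# requested_additional_chassis in the southbound Port_Binding record.
subprocess.check_call(
    ["ovn-nbctl", "lsp-set-options", LSP,
     "requested-chassis=src-compute-0,dst-compute-1"])

# Once the migration has finished, collapse back to a single chassis.
subprocess.check_call(
    ["ovn-nbctl", "lsp-set-options", LSP,
     "requested-chassis=dst-compute-1"])
```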

Next steps: more room for improvement.

OVN (the ovn-controller) is in charge of updating the OpenFlow rules of the local Open vSwitch instance when a new port is claimed and bound to a chassis. When the ovn-controller detects a change, it recomputes the logical flows. These are translated into OpenFlow rules that are installed in the local instance using the ofctrl module. Usually that takes very little time, but in busy environments with thousands of virtual machines, ports, networks and ACLs, it could take several seconds.
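
A crude way to see whether those flows have landed is to count the OpenFlow rules that reference the port on br-int. This is only a rough proxy (and the -O flag may need adjusting to the protocols configured on your bridge), with a placeholder TAP name:

```python
import subprocess

TAP_NAME = "tap-demo0"  # placeholder TAP device name on the destination host

# Find the OpenFlow port number assigned to the TAP device on br-int ...
ofport = subprocess.check_output(
    ["ovs-vsctl", "get", "Interface", TAP_NAME, "ofport"], text=True).strip()

# ... and count the flows that match it. A rough proxy for "ovn-controller
# has finished programming this port": the count jumps from zero once the
# recomputed flows land in the local Open vSwitch instance.
flows = subprocess.check_output(
    ["ovs-ofctl", "-O", "OpenFlow15", "dump-flows", "br-int", f"in_port={ofport}"],
    text=True)
n_flows = max(len(flows.strip().splitlines()) - 1, 0)  # first line is the reply header
print(f"{n_flows} flows currently match in_port={ofport} on br-int")
```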

The next step to ensure that everything is in place before moving the virtual machine to the destination host is to detect when the ovn-controller has effectively installed the new flows. This information, which could be stored in the Port_Binding record (for example), could be read by Neutron to declare that the destination host is ready and, at that point, send the vif-plugged message to the Nova API.

If OVN is capable of detecting and reporting that the OpenFlow rules are in place, and Neutron is able to read that status, we would effectively send the vif-interface-plugged event only when the destination backend is ready to handle traffic for the migrated port.
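
Purely as an illustration of that idea, the Neutron side could end up looking something like this. The flows-installed marker is completely hypothetical; nothing like it exists in OVN today:

```python
import subprocess
import time

PORT_ID = "11111111-2222-3333-4444-555555555555"  # placeholder Neutron port UUID


def destination_flows_ready(port_id: str) -> bool:
    """Hypothetical check: look for a readiness marker that ovn-controller
    would set on the Port_Binding once the OpenFlow rules for the port are
    actually installed. No such marker exists in OVN today."""
    out = subprocess.check_output(
        ["ovn-sbctl", "--bare", "--columns=external_ids",
         "find", "Port_Binding", f"logical_port={port_id}"],
        text=True)
    return "flows-installed=true" in out  # hypothetical key, illustration only


# Only when the destination reports the flows as installed would Neutron
# send the vif-interface-plugged event to the Nova API.
while not destination_flows_ready(PORT_ID):
    time.sleep(0.5)
print("destination ready: safe to send vif-interface-plugged")
```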
