Last week I had a customer with an issue in an OpenStack deployment running ML2/OVN. Randomly, when creating a virtual machine, the Nova server returned a timeout during the VIF plugging. It took us some time to discover that the IP address assigned to the new port was already assigned to a rogue Logical_Switch_Port present in the OVN Northbound database but absent from the Neutron database. The newly created virtual machine port was defined as virtual, and Neutron cannot attach a virtual port to a virtual machine.
This kind of issue can be quickly addressed with the neutron-ovn-db-sync-util script. This script compares the Neutron database and the OVN databases to detect this kind of discrepancy, returning the differences between them. Leaving aside how this error could have happened, I would like to describe how to debug issues between the Neutron and OVN databases, but locally, on your own computer.
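To make this kind of discrepancy concrete, here is a minimal Python sketch of the comparison idea. It is not the implementation used by neutron-ovn-db-sync-util; it only assumes that the port IDs from the Neutron database and the Logical_Switch_Port names from the OVN NB database have already been collected, and that both use the same identifier. The function name `diff_ports` and the sample IDs are hypothetical.

```python
def diff_ports(neutron_port_ids, ovn_lsp_names):
    """Compare two collections of port identifiers.

    Returns a tuple (missing_in_ovn, rogue_in_ovn) of sorted lists:
    ports known to Neutron only, and ports known to OVN NB only.
    """
    neutron = set(neutron_port_ids)
    ovn = set(ovn_lsp_names)
    missing_in_ovn = sorted(neutron - ovn)
    rogue_in_ovn = sorted(ovn - neutron)
    return missing_in_ovn, rogue_in_ovn


# Hypothetical sample data, only to show the two kinds of mismatch.
missing, rogue = diff_ports({"port-a", "port-b"}, {"port-b", "port-x"})
print("In Neutron but not in OVN NB:", missing)
print("In OVN NB but not in Neutron:", rogue)
```

The second list is the case described above: a rogue Logical_Switch_Port that exists only on the OVN side.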
What is needed.
The databases, of course. I know that could sound pedantic, but it is worth mentioning. You need the three databases: the Neutron database and the OVN Northbound and Southbound databases.
The Neutron database can be dumped using the mysqldump client:
$ mysqldump --all-databases > openstack.sql # All OpenStack databases
$ mysqldump neutron > neutron.sql # Only the Neutron database
The OVN databases can be retrieved directly from the filesystem. You can locate them by searching for the running processes and filtering by ovsdb-server:
$ ps aux | ag ovsdb
root 508586 0.2 0.0 12104 5944 ? S Jul12 0:40 ovsdb-server -vconsole:off -vfile:info --log-file=/var/log/ovn/ovsdb-server-nb.log --remote=punix:/var/run/ovn/ovnnb_db.sock --pidfile=/var/run/ovn/ovnnb_db.pid --unixctl=/var/run/ovn/ovnnb_db.ctl --detach --monitor --remote=db:OVN_Northbound,NB_Global,connections --private-key=db:OVN_Northbound,SSL,private_key --certificate=db:OVN_Northbound,SSL,certificate --ca-cert=db:OVN_Northbound,SSL,ca_cert --ssl-protocols=db:OVN_Northbound,SSL,ssl_protocols --ssl-ciphers=db:OVN_Northbound,SSL,ssl_ciphers /var/lib/ovn/ovnnb_db.db
The last file, the one with the .db extension, is the database file. If you are running the databases inside containers, you’ll need to get into them and execute the same command.
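If there are several ovsdb-server processes, picking the database path out of the long command lines by eye is error prone. The following Python sketch extracts it from captured `ps` output. The helper name `ovsdb_file_paths` is hypothetical, the sample line is a shortened version of the output shown above, and the sketch assumes the database file is passed as a plain argument ending in .db.

```python
def ovsdb_file_paths(ps_output):
    """Return the .db files passed to ovsdb-server in `ps aux` output."""
    paths = []
    for line in ps_output.splitlines():
        if "ovsdb-server" not in line:
            continue
        for token in line.split():
            # Skip options such as --remote=...; keep bare *.db arguments.
            if token.endswith(".db") and not token.startswith("-"):
                paths.append(token)
    return paths


# Shortened sample of the ps output (assumption: same layout as above).
SAMPLE_PS = (
    "root 508586 0.2 0.0 12104 5944 ? S Jul12 0:40 ovsdb-server "
    "-vconsole:off --remote=punix:/var/run/ovn/ovnnb_db.sock "
    "--pidfile=/var/run/ovn/ovnnb_db.pid /var/lib/ovn/ovnnb_db.db"
)

print(ovsdb_file_paths(SAMPLE_PS))
```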
You will also need the Neutron configuration files, which by default are /etc/neutron/neutron.conf and /etc/neutron/plugins/ml2/ml2_conf.ini.
Run the Neutron database locally.
We are going to run the dumped database locally, but in a disposable way: using containers. Once the debugging is done, the container can be deleted and nothing will remain in the local system. Thus the first step will be to install podman and download the MariaDB container. Remember that from this point, all commands should be executed as root:
$ dnf install podman
$ podman pull docker.io/library/mariadb:latest
To run a container using this downloaded image you just need to execute:
$ podman run --name=neutron_db -e MYSQL_ROOT_PASSWORD=pass -p 3306:3306 -d mariadb:latest
Now it is possible to perform any mysql command against this instance using the port (3306), user (“root”) and password (“pass”) defined. For example, we can load the OpenStack database file and perform any command:
$ mysql -h 127.0.0.1 -P 3306 -uroot -ppass < openstack.sql
$ mysql -h 127.0.0.1 -P 3306 -uroot -ppass -e "use neutron; select * from ports;"
mysql: [Warning] Using a password on the command line interface can be insecure.
+----------------------------------+--------------------------------------+------+--------------------------------------+-------------------+----------------+--------+----------------------------------------------+--------------------------+------------------+---------------+
| project_id                       | id                                   | name | network_id                           | mac_address       | admin_state_up | status | device_id                                    | device_owner             | standard_attr_id | ip_allocation |
+----------------------------------+--------------------------------------+------+--------------------------------------+-------------------+----------------+--------+----------------------------------------------+--------------------------+------------------+---------------+
| b9b53fc1293f42bc9717df55969ead9b | 73a79d2b-d052-48a9-bfd2-afb454e16f68 |      | 2d826f9f-1a47-4de4-b2d0-7de9108ea824 | fa:16:3e:1c:b0:bf |              1 | DOWN   | ovnmeta-2d826f9f-1a47-4de4-b2d0-7de9108ea824 | network:distributed      |               13 | none          |
|                                  | 77c3e112-8169-49eb-bdae-670f8ec4bbc3 |      | d58576fe-1d6e-4db0-85e2-e2cb71dc430f | fa:16:3e:8a:a1:20 |              1 | ACTIVE | 394896dd-4a76-40bd-85f6-e6efd28c8e53         | network:router_gateway   |               26 | immediate     |
| b9b53fc1293f42bc9717df55969ead9b | adeb7213-e97e-4f20-b26a-064593b0c944 |      | 2d826f9f-1a47-4de4-b2d0-7de9108ea824 | fa:16:3e:45:dc:63 |              1 | ACTIVE | 394896dd-4a76-40bd-85f6-e6efd28c8e53         | network:router_interface |               24 | immediate     |
| 9668cbc179c547d2ba70d9c6f48da7d2 | c23d0070-88c8-4ef1-8539-f283b290b50e |      | d58576fe-1d6e-4db0-85e2-e2cb71dc430f | fa:16:3e:d9:a5:96 |              1 | DOWN   | ovnmeta-d58576fe-1d6e-4db0-85e2-e2cb71dc430f | network:distributed      |               23 | none          |
| 9668cbc179c547d2ba70d9c6f48da7d2 | dc83f22a-c5bd-4315-9378-6b2e020f413c |      | cd0b4dc9-df3e-42a0-ae7f-2fb75f506d88 | fa:16:3e:77:07:82 |              1 | DOWN   | ovnmeta-cd0b4dc9-df3e-42a0-ae7f-2fb75f506d88 | network:distributed      |               29 | none          |
+----------------------------------+--------------------------------------+------+--------------------------------------+-------------------+----------------+--------+----------------------------------------------+--------------------------+------------------+---------------+
Loading a big database can take time. To speed up the loading process, it can be better to copy the SQL file into the container and load it there with autocommit disabled, committing once at the end:
$ podman cp openstack.sql neutron_db:/tmp/openstack.sql
$ podman exec -uroot -it neutron_db bash
# mysql -uroot -ppass
> set autocommit=0; source /tmp/openstack.sql; commit;
Run the OVN Northbound and Southbound databases locally.
Same as with the Neutron database, we are going to run the OVN databases in a container. We are going to download the Fedora container image and run a container:
$ podman pull docker.io/library/fedora:latest
$ podman run --name=ovn_nb -p 6641:6641 -d fedora:latest sleep infinity
This procedure should be done twice, once per OVN database; the steps here describe the OVN NB one. For the Southbound database, repeat them with its own container, database file and port (6642 in the configuration used later). The next steps install the OVN packages inside the container and run the ovsdb-server process. When the local Open vSwitch service is started, both the ovsdb-server and the vswitchd services are started, but the second one fails (we won’t have a virtual switch running in the container). That doesn’t matter: what is important here is the database server. If the database file comes from a RAFT deployment, the ovsdb-tool command will convert it to a standalone database.
$ podman cp ovnnb_db.db ovn_nb:/.
$ podman exec -uroot -it ovn_nb bash
$$ dnf install openvswitch ovn procps net-tools -y
$$ /usr/share/openvswitch/scripts/ovs-ctl start
$$ ovsdb-tool cluster-to-standalone ovnnb_db.db.sa ovnnb_db.db
$$ ovsdb-server ovnnb_db.db.sa --remote=ptcp:6641:0.0.0.0 --log-file=ovnnb-server.log --detach
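Before moving on, it can save time to check that the published ports actually accept connections. The following Python sketch does only a plain TCP connect; it does not speak the MySQL or OVSDB protocols. The ports are the ones used in this post (3306, 6641 and 6642), while the helper name `is_listening` and the `ovn_sb` label are assumptions for illustration.

```python
import socket


def is_listening(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and unreachable hosts.
        return False


# Ports published by the containers in this post; the labels are illustrative.
ENDPOINTS = {"neutron_db": 3306, "ovn_nb": 6641, "ovn_sb": 6642}

if __name__ == "__main__":
    for name, port in ENDPOINTS.items():
        state = "up" if is_listening("127.0.0.1", port) else "DOWN"
        print(f"{name:<10} 127.0.0.1:{port} {state}")
```

A port reported as DOWN usually means the container is not running or the ovsdb-server process inside it was not started with the expected ptcp remote.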
Execute the neutron-ovn-db-sync-util script.
At this point we have the three containers running the databases. It is possible to access any of them using the command line tools mysql, ovn-nbctl and ovn-sbctl. The last step is to install and execute the sync tool. We’ll deploy everything in /tmp because we don’t need to preserve anything after the debug analysis.
$ cd /tmp
$ git clone https://opendev.org/openstack/neutron.git
$ cd neutron
$ python -m venv venv # creates a Python virtual environment
$ . venv/bin/activate # activates the virtual environment
$ python -m pip install -Ue . # locally installs Neutron
$ python -m pip install -U pymysql
Now we have almost everything ready to execute the sync tool. We need a copy of the configuration files, which will be located in /tmp. In these files we need to change the database connection settings: the user, the IP address and the ports.
connection = mysql+pymysql://root:pass@127.0.0.1/neutron?charset=utf8
ovn_sb_connection = tcp:127.0.0.1:6642
ovn_nb_connection = tcp:127.0.0.1:6641
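Editing the copies by hand is perfectly fine. If you repeat this often, the change can be scripted; the sketch below uses Python's standard configparser and assumes that `connection` lives in the `[database]` section of neutron.conf and the two OVN options in the `[ovn]` section of ml2_conf.ini. The function names are hypothetical. Note that configparser drops comments when writing and keeps only the last value of a repeated option, so use it on disposable copies only.

```python
import configparser

LOCAL_DB = "mysql+pymysql://root:pass@127.0.0.1/neutron?charset=utf8"
LOCAL_NB = "tcp:127.0.0.1:6641"
LOCAL_SB = "tcp:127.0.0.1:6642"


def set_options(path, section, options):
    """Set the given options in one section of an INI-style file."""
    # strict=False tolerates repeated keys; interpolation=None keeps
    # '%' characters in existing values untouched.
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.optionxform = str  # preserve the case of option names
    parser.read(path)
    if section != "DEFAULT" and not parser.has_section(section):
        parser.add_section(section)
    for key, value in options.items():
        parser.set(section, key, value)
    with open(path, "w") as handle:
        parser.write(handle)


def point_to_local_databases(neutron_conf, ml2_conf):
    """Rewrite both copies so that they target the local containers."""
    set_options(neutron_conf, "database", {"connection": LOCAL_DB})
    set_options(ml2_conf, "ovn", {
        "ovn_nb_connection": LOCAL_NB,
        "ovn_sb_connection": LOCAL_SB,
    })
```

Calling `point_to_local_databases("/tmp/neutron.conf", "/tmp/ml2_conf.ini")` would then update both copies in place.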
Now, inside the virtual environment that is activated:
$ neutron-ovn-db-sync-util --config-file /tmp/neutron.conf --config-file /tmp/ml2_conf.ini --ovn-neutron_sync_mode=log --log-file /tmp/log_sync.log
Playing a bit with the databases, you can, for example, delete a Logical_Switch_Port from the OVN NB database. In the log file you’ll see the following message:
WARNING neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovn_db_sync [^[[01;36mNone req-6db7eb4b-7a41-4cd8-bde1-947c82814add ^[[00;36mNone None] ^[[01;35mPort found in Neutron but not in OVN NB DB, port_id=77c3e112-8169-49eb-bdae-670f8ec4bbc3
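With a real database the log can contain many of these lines, and the color escape sequences make them hard to grep. The following Python sketch strips the color codes and collects the port IDs. It only knows the single message shown above; other messages written by the sync tool are not covered, and the function name `ports_missing_in_ovn` is hypothetical.

```python
import re

# Matches real ESC-based color codes as well as the caret notation
# ("^[[01;36m") that some viewers show instead of the ESC byte.
ANSI_RE = re.compile(r"(?:\x1b|\^\[)\[[0-9;]*m")
PORT_RE = re.compile(r"port_id=([0-9a-f-]{36})")
MESSAGE = "Port found in Neutron but not in OVN NB DB"


def ports_missing_in_ovn(log_lines):
    """Return the port IDs reported as present in Neutron only."""
    ports = []
    for line in log_lines:
        clean = ANSI_RE.sub("", line)
        if MESSAGE not in clean:
            continue
        match = PORT_RE.search(clean)
        if match:
            ports.append(match.group(1))
    return ports


# The log line from this post, used as sample input.
SAMPLE_LOG = (
    "WARNING neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovn_db_sync "
    "[^[[01;36mNone req-6db7eb4b-7a41-4cd8-bde1-947c82814add "
    "^[[00;36mNone None] ^[[01;35mPort found in Neutron but not in OVN NB DB, "
    "port_id=77c3e112-8169-49eb-bdae-670f8ec4bbc3"
)

print(ports_missing_in_ovn([SAMPLE_LOG]))
```

To run it against the file produced above, pass the open file object, for example `ports_missing_in_ovn(open("/tmp/log_sync.log"))`.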
Stay tuned for more posts!