~ubuntu-server-dev/ubuntu/+source/openvswitch:upstream

Last commit made on 2023-06-29
Get this branch:
git clone -b upstream https://git.launchpad.net/~ubuntu-server-dev/ubuntu/+source/openvswitch
Members of Ubuntu Server Developers can upload to this branch. Log in for directions.

Branch merges

Branch information

Recent commits

c7cc7f0... by Frode Nordahl

New upstream version 2.17.7

5ed136c... by Frode Nordahl

New upstream version 3.1.2

9167774... by Ilya Maximets

Set release date for 3.1.2.

Acked-by: Eelco Chaudron <email address hidden>
Signed-off-by: Ilya Maximets <email address hidden>

aba1862... by Balazs Nemeth <email address hidden>

ofproto-dpif-upcall: Don't set statistics to 0 when they jump back.

The only way that stats->{n_packets,n_bytes} would decrease is due to an
overflow, or if there are bugs in how statistics are handled. In the
past, there were multiple issues that caused a jump backward. A
workaround was in place to set the statistics to 0 in that case. When
this happened while the revalidator was under heavy load, the workaround
had an unintended side effect where should_revalidate returned false
causing the flow to be removed because the metric it calculated was
based on a bogus value. Since many of those bugs have now been
identified and resolved, there is no need to set the statistics to 0. In
addition, the (unlikely) overflow still needs to be handled
appropriately. If an unexpected jump does happen, just log it as a
warning.

Signed-off-by: Balazs Nemeth <email address hidden>
Acked-by: Eelco Chaudron <email address hidden>
Signed-off-by: Ilya Maximets <email address hidden>

8590712... by Ilya Maximets

ovsdb: monitor: Destroy initial change set when new columns added.

Initial change set is preserved for as long as the monitor itself.
However, if a new client has a condition on a column that is not
one of the monitored columns, this column will be added to the
monitor via ovsdb_monitor_condition_bind(). This new column, however,
doesn't exist in the initial change set. That will cause ovsdb-server
to malfunction or crash trying to access non-existent column during
condition evaluation:

 ERROR: AddressSanitizer: heap-buffer-overflow
 READ of size 4 at 0x606000006780 thread T0
     0 ovsdb_clause_evaluate ovsdb/condition.c:328:26
     1 ovsdb_condition_match_any_clause ovsdb/condition.c:441:13
     2 ovsdb_condition_empty_or_match_any ovsdb/condition.h:84:13
     3 ovsdb_monitor_row_update_type_condition ovsdb/monitor.c:892:28
     4 ovsdb_monitor_compose_row_update2 ovsdb/monitor.c:1058:12
     5 ovsdb_monitor_compose_update ovsdb/monitor.c:1172:24
     6 ovsdb_monitor_get_update ovsdb/monitor.c:1276:24
     7 ovsdb_jsonrpc_monitor_create ovsdb/jsonrpc-server.c:1505:12
     8 ovsdb_jsonrpc_session_got_request ovsdb/jsonrpc-server.c:1030:21
     9 ovsdb_jsonrpc_session_run ovsdb/jsonrpc-server.c:572:17
    10 ovsdb_jsonrpc_session_run_all ovsdb/jsonrpc-server.c:602:21
    11 ovsdb_jsonrpc_server_run ovsdb/jsonrpc-server.c:417:9
    12 main_loop ovsdb/ovsdb-server.c:222:9
    13 main ovsdb/ovsdb-server.c:500:5
    14 __libc_start_call_main
    15 __libc_start_main@GLIBC_2.2.5
    16 _start (ovsdb/ovsdb-server+0x473034)

 Located 0 bytes after 64-byte region [0x606000006740,0x606000006780)
 allocated by thread T0 here:
     0 malloc (ovsdb/ovsdb-server+0x50dc82)
     1 xmalloc__ lib/util.c:140:15
     2 xmalloc lib/util.c:175:12
     3 clone_monitor_row_data ovsdb/monitor.c:336:12
     4 ovsdb_monitor_changes_update ovsdb/monitor.c:1384:23
     5 ovsdb_monitor_get_initial ovsdb/monitor.c:1535:21
     6 ovsdb_jsonrpc_monitor_create ovsdb/jsonrpc-server.c:1502:9
     7 ovsdb_jsonrpc_session_got_request ovsdb/jsonrpc-server.c:1030:21
     8 ovsdb_jsonrpc_session_run ovsdb/jsonrpc-server.c:572:17
     9 ovsdb_jsonrpc_session_run_all ovsdb/jsonrpc-server.c:602:21
    10 ovsdb_jsonrpc_server_run ovsdb/jsonrpc-server.c:417:9
    11 main_loop ovsdb/ovsdb-server.c:222:9
    12 main ovsdb/ovsdb-server.c:500:5
    13 __libc_start_call_main
    14 __libc_start_main@GLIBC_2.2.5
    15 _start (ovsdb/ovsdb-server+0x473034)

Fix that by destroying the initial change set every time new columns
are added to the monitor. This will trigger re-generation of the
change set and it will contain all the necessary columns afterwards.

Fixes: 07c27226ee96 ("ovsdb: Monitor: Keep and maintain the initial change set.")
Reported-by: Han Zhou <email address hidden>
Acked-by: Han Zhou <email address hidden>
Reviewed-by: Simon Horman <email address hidden>
Signed-off-by: Ilya Maximets <email address hidden>

54e45e3... by Ilya Maximets

ovsdb: Monitor: Keep and maintain the initial change set.

Change sets in OVSDB monitor are storing all the changes that happened
between a particular transaction ID and now. Initial change set
basically contains all the data.

On each monitor request a new initial change set is created by creating
an empty change set and adding all the database rows. Then it is
converted into JSON reply and immediately untracked and destroyed.

This is causing significant performance issues if many clients are
requesting new monitors at the same time. For example, that is
happening after database schema conversion, because conversion triggers
cancellation of all monitors. After cancellation, every client sends
a new monitor request. The server then creates a new initial change
set, sends a reply, destroys initial change set and repeats that for
each client. On a system with 200 MB database and 500 clients,
cluster of 3 servers spends 20 minutes replying to all the clients
(200 MB x 500 = 100 GB):

  timeval|WARN|Unreasonably long 1201525ms poll interval

Of course, all the clients are already disconnected due to inactivity
at this point. When they are re-connecting back, server accepts new
connections one at a time, so inactivity probes will not be triggered
anymore, but it still takes another 20 minutes to handle all the
incoming connections.

Let's keep the initial change set around for as long as the monitor
itself exists. This will allow us to not construct a new change set
on each new monitor request and even utilize the JSON cache in some
cases. All that at a relatively small maintenance cost, since we'll
need to commit changes to one extra change set on every transaction.
Measured memory usage increase due to keeping around a shallow copy
of a database is about 10%. Measured CPU usage difference during
normal operation is negligible.

With this change it takes only 30 seconds to send out all the monitor
replies in the example above. So, it's a 40x performance improvement.
On a more reasonable setup with 250 nodes, the process takes up to
8-10 seconds instead of 4-5 minutes.

Conditional monitoring will benefit from this change as well, however
results might be less impressive due to lack of JSON cache.

Reviewed-by: Simon Horman <email address hidden>
Acked-by: Dumitru Ceara <email address hidden>
Signed-off-by: Ilya Maximets <email address hidden>

5fe322e... by Ilya Maximets

fatal-signal: Don't share signal fds/handles with forked process.

The signal_fds pipe and wevent are a mechanism to wake up the process
after it received a signal and stored the number for the future
processing. They are not intended for inter-process communication.
However, in the current code, descriptors are not closed on fork().

The main scenario where we use fork() is a monitor process. Monitor
doesn't actually use poll loops and doesn't wait on the descriptor.
But when a child process is killed, it (child) sends a byte to itself,
then it wakes up due to POLLIN on the pipe and terminates itself after
processing all the callbacks. The byte stays unread. And the pipe is
still open in the monitor process. When child dies, the monitor wakes
up and forks again. New child inherits the same pipe that still
contains unread data. This data is never read, so the child will
constantly wake itself up for no reason.

Interestingly enough raise(SIGSEGV) doesn't immediately kill the
process. The execution continues til the end of a signal handler,
so we're still able to write a byte to a pipe even in this case.
Presumably because we don't have SA_NODEFER.

Fix the issue by re-creating the pipe/event on fork. This way
every new child will have its own notification channel and will
not wake up any other processes.

There was already an attempt to fix the issue, but it didn't get a
follow up (see the reported-at tag). This is an alternative solution.

Fixes: ff8decf1a318 ("daemon: Add support for process monitoring and restart.")
Reported-at: https://patchwork.ozlabs<email address hidden>/
Acked-by: Eelco Chaudron <email address hidden>
Signed-off-by: Ilya Maximets <email address hidden>

3fcb817... by David Marchand

cpu: Fix cpuid check for some AMD processors.

Some venerable AMD processors do not support querying extended features
(EAX=7) with cpuid.
In this case, it is not a programmatic error and the runtime check should
simply return the isa is unsupported.

Reported-by: Davide Repetto <email address hidden>
Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2211747
Fixes: b366fa2f4947 ("dpif-netdev: Call cpuid for x86 isa availability.")
Signed-off-by: David Marchand <email address hidden>
Signed-off-by: Ilya Maximets <email address hidden>

01f0668... by Frode Nordahl

tc: Fix crash on malformed reply from kernel.

The tc module combines the use of the `tc_transact` helper
function for communication with the in-kernel tc infrastructure
with assertions on the reply data by `ofpbuf_at_assert` on the
received data prior to further processing.

With the presence of bugs on the kernel side, we need to treat
the kernel as an unreliable service provider and replace assertions
on the reply from it with checks to avoid a fatal crash of OVS.

For the record, the symptom of the crash is this in the log:
  EMER|include/openvswitch/ofpbuf.h:194:
      assertion offset + size <= b->size failed in ofpbuf_at_assert()

And an excerpt of the backtrace looks like this:
  ofpbuf_at_assert (offset=16, size=20) at include/openvswitch/ofpbuf.h:194
  tc_replace_flower at lib/tc.c:3223
  netdev_tc_flow_put at lib/netdev-offload-tc.c:2096
  netdev_flow_put at lib/netdev-offload.c:257
  parse_flow_put at lib/dpif-netlink.c:2297
  try_send_to_netdev at lib/dpif-netlink.c:2384

Reported-At: https://launchpad.net/bugs/2018500
Fixes: 5c039ddc64ff ("netdev-linux: Add functions to manipulate tc police action")
Fixes: e7f6ba220e10 ("lib/tc: add ingress ratelimiting support for tc-offload")
Fixes: f98e418fbdb6 ("tc: Add tc flower functions")
Fixes: c1c9c9c4b636 ("Implement QoS framework.")
Signed-off-by: Frode Nordahl <email address hidden>
Signed-off-by: Ilya Maximets <email address hidden>

45dba48... by Robin Jarry <email address hidden>

netdev-dpdk: Fix warning with gcc 13.

GCC now reports uninitialized warnings from function return values.

../lib/netdev-dpdk.c: In function 'netdev_dpdk_mempool_configure':
../lib/netdev-dpdk.c:964:22: warning: 'dmp' may be used uninitialized [-Wmaybe-uninitialized]
  964 | dev->dpdk_mp = dmp;
      | ~~~~~~~~~~~~~^~~~~
../lib/netdev-dpdk.c:854:21: note: 'dmp' was declared here
  854 | struct dpdk_mp *dmp, *next;
      | ^~~

NB: this looks like a false positive, gcc 13 probably fails to see the link
between reuse and dmp in dpdk_mp_get().

Reviewed-by: David Marchand <email address hidden>
Signed-off-by: Robin Jarry <email address hidden>
Signed-off-by: Ilya Maximets <email address hidden>