Regressions in GnuTLS 3.5.3 break OpenConnect

Bug #1624856 reported by Anders Kaseorg
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
gnutls (Fedora)
Fix Released
Undecided
gnutls28 (Debian)
Fix Released
Unknown
gnutls28 (Ubuntu)
Fix Released
High
Unassigned

Bug Description

OpenConnect is unusable after upgrading to GnuTLS 3.5.3: the VPN connection drops after less than a minute. Please merge gnutls28 3.5.3-5 or 3.5.4-2 from Debian, where this problem was fixed (https://bugs.debian.org/835587).

Revision history for this message
In , Lars (lars-redhat-bugs) wrote :

Created attachment 1194946
verbose openconnect output (see end of file for failure)

Following the F25 update to gnutls-3.5.3-1, VPN connections established through OpenConnect (against an AnyConnect VPN provided by a Cisco ASA device) started to fail, with the remote side appearing dead to OpenConnect:

DTLS Dead Peer Detection detected dead peer!

After re-establishment, the connection works again for some time, then the process is repeated. The connection only fails if it's actually used for something. Not quite sure yet what exactly triggers it. Logging into a RHEV/oVirt web interface seems quite reliable, but I've also seen it happen during DNF package installs.

Disabling DTLS (--no-dtls to openconnect) makes things work again, as does downgrading of gnutls to the previous 3.5.2 version. Looking at the upstream changelog, 3.5.3 appears to introduce a new DTLS sliding window implementation, maybe related?

Version-Release number of selected component (if applicable):
gnutls-3.5.3-1.fc25.x86_64
openconnect-7.07-2.fc25.x86_64

Steps to Reproduce:
1. connect to AnyConnect VPN using OpenConnect
2. use it for some time (not sure what exactly triggers it, doesn't take long though)
3. connection dies with "DTLS Dead Peer Detection detected dead peer!"

Additional info:
I'm well aware that Cisco's DTLS implementation is quite non-standard, but grepping through the GnuTLS code, it seems to me that the intent is to support it (as DTLS0.9), which is why I'm filing this bug against GnuTLS.

OpenConnect does not show the issue when using gnutls-3.5.2-1.fc25 or when built against OpenSSL.

Revision history for this message
In , Nikos (nikos-redhat-bugs) wrote :

Hi, if you downgrade to 3.5.2 does it work? Do you have any debugging output when this issue happens?

Revision history for this message
In , Nikos (nikos-redhat-bugs) wrote :

The best would be if you could provide some gnutls debugging output while that issue occurs. To get that you can set GNUTLS_DEBUG_LEVEL=6 environment variable prior to running openconnect.

Revision history for this message
In , David (david-redhat-bugs) wrote :

You can also add --gnutls-debug=6 to the openconnect command line.

It might be worth getting a packet capture from the real network, showing the DTLS packets. The sequence numbers are in the clear, aren't they? And the only obvious DTLS change I see between 3.5.2 and 3.5.3 is related to the sequence number sliding window — so we hopefully don't even *need* to decrypt the packets to work out what the problem is.

Revision history for this message
In , Nikos (nikos-redhat-bugs) wrote :

David reproduced the issue and send the log back. I'll try to figure the issue from that.

Revision history for this message
In , David (david-redhat-bugs) wrote :
Download full text (7.2 KiB)

REC[0x15970d0]: Decrypted Packet[1.798] Application Data(23) with length: 1267
REC[0x15970d0]: Decrypted Packet[1.803] Application Data(23) with length: 1267
REC[0x15970d0]: Decrypted Packet[1.800] Application Data(23) with length: 1267
REC[0x15970d0]: Decrypted Packet[1.805] Application Data(23) with length: 1267
REC[0x15970d0]: Decrypted Packet[1.802] Application Data(23) with length: 1267
REC[0x15970d0]: Decrypted Packet[1.807] Application Data(23) with length: 1267
REC[0x15970d0]: Decrypted Packet[1.804] Application Data(23) with length: 1267
REC[0x15970d0]: Decrypted Packet[1.808] Application Data(23) with length: 1267
REC[0x15970d0]: Decrypted Packet[1.806] Application Data(23) with length: 1267
REC[0x15970d0]: Decrypted Packet[1.810] Application Data(23) with length: 1267
REC[0x15970d0]: Decrypted Packet[1.809] Application Data(23) with length: 1267
REC[0x15970d0]: Decrypted Packet[1.812] Application Data(23) with length: 1267
REC[0x15970d0]: Decrypted Packet[1.814] Application Data(23) with length: 1267
REC[0x15970d0]: Decrypted Packet[1.816] Application Data(23) with length: 1267
REC[0x15970d0]: Decrypted Packet[1.818] Application Data(23) with length: 1267
REC[0x15970d0]: Decrypted Packet[1.820] Application Data(23) with length: 1267
REC[0x15970d0]: Decrypted Packet[1.822] Application Data(23) with length: 1267
REC[0x15970d0]: Decrypted Packet[1.824] Application Data(23) with length: 1267
REC[0x15970d0]: Decrypted Packet[1.826] Application Data(23) with length: 1267
REC[0x15970d0]: Decrypted Packet[1.828] Application Data(23) with length: 1267
REC[0x15970d0]: Decrypted Packet[1.830] Application Data(23) with length: 1267
REC[0x15970d0]: Decrypted Packet[1.832] Application Data(23) with length: 1267
REC[0x15970d0]: Decrypted Packet[1.834] Application Data(23) with length: 1267
REC[0x15970d0]: Decrypted Packet[1.836] Application Data(23) with length: 1267
REC[0x15970d0]: Decrypted Packet[1.838] Application Data(23) with length: 1267
REC[0x15970d0]: Decrypted Packet[1.840] Application Data(23) with length: 1267
REC[0x15970d0]: Discarded duplicate message[1.842]: Application Data
REC[0x15970d0]: Discarded duplicate message[1.844]: Application Data
REC[0x15970d0]: Discarded duplicate message[1.846]: Application Data
REC[0x15970d0]: Discarded duplicate message[1.848]: Application Data
REC[0x15970d0]: Discarded duplicate message[1.849]: Application Data
REC[0x15970d0]: Discarded duplicate message[1.851]: Application Data
REC[0x15970d0]: Discarded duplicate message[1.853]: Application Data
REC[0x15970d0]: Discarded duplicate message[1.855]: Application Data
REC[0x15970d0]: Discarded duplicate message[1.857]: Application Data
REC[0x15970d0]: Discarded duplicate message[1.859]: Application Data
REC[0x15970d0]: Discarded duplicate message[1.861]: Application Data
REC[0x15970d0]: Discarded duplicate message[1.863]: Application Data
REC[0x15970d0]: Discarded duplicate message[1.865]: Application Data
REC[0x15970d0]: Discarded duplicate message[1.867]: Application Data
REC[0x15970d0]: Discarded duplicate message[1.869]: Application Data
REC[0x15970d0]: Discarded duplicate message[1.871]: Application Data
REC[0x15970d0]: Discarded duplicate messag...

Read more...

Revision history for this message
In , Nikos (nikos-redhat-bugs) wrote :
Revision history for this message
In , Fedora (fedora-redhat-bugs) wrote :

gnutls-3.5.3-2.fc25 has been submitted as an update to Fedora 25. https://bodhi.fedoraproject.org/updates/FEDORA-2016-66cab49b1f

Revision history for this message
In , David (david-redhat-bugs) wrote :

Not sure I understand DTLS_WINDOW_RECEIVED() being conditional. Assuming you only act on this when a packet passes the MAC validation, why do you not *always* advance the window so that it ends at the latest-received packet?

And if you are marking packets received before their MAC is checked, we have bigger things to talk about...

Revision history for this message
In , Fedora (fedora-redhat-bugs) wrote :

gnutls-3.5.3-2.fc25 has been pushed to the Fedora 25 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-66cab49b1f

Revision history for this message
In , Nikos (nikos-redhat-bugs) wrote :

(In reply to David Woodhouse from comment #8)
> Not sure I understand DTLS_WINDOW_RECEIVED() being conditional. Assuming you
> only act on this when a packet passes the MAC validation, why do you not
> *always* advance the window so that it ends at the latest-received packet?

What if first packet is 1 and you receive 3, 6, 5, 4, 2. If window is advanced to the latest 5, 4, 2 would be discarded.

> And if you are marking packets received before their MAC is checked, we have
> bigger things to talk about...

That's after MAC is checked.

Revision history for this message
In , David (david-redhat-bugs) wrote :

(In reply to Nikos Mavrogiannopoulos from comment #10)
> (In reply to David Woodhouse from comment #8)
> > Not sure I understand DTLS_WINDOW_RECEIVED() being conditional. Assuming you
> > only act on this when a packet passes the MAC validation, why do you not
> > *always* advance the window so that it ends at the latest-received packet?
>
> What if first packet is 1 and you receive 3, 6, 5, 4, 2. If window is
> advanced to the latest 5, 4, 2 would be discarded.

The way this usually works is the window advances so that the *END* of the window is the last-received packet.

So the *end* of the window is advanced to the latest 6. The bitmap then represents the packets leading up to that.

cf. http://git.infradead.org/users/dwmw2/openconnect.git/blob/HEAD:/esp.c#l33

Revision history for this message
In , David (david-redhat-bugs) wrote :

Created attachment 1195938
Import sliding window code from OpenConnect

Here's my code from OpenConnect. We *always* advance the sliding window so it ends at the latest received packet. We never discard a packet because it's "too new".

Revision history for this message
In , David (david-redhat-bugs) wrote :
Revision history for this message
In , Fedora (fedora-redhat-bugs) wrote :

gnutls-3.5.3-2.fc25 has been pushed to the Fedora 25 stable repository. If problems still persist, please make note of it in this bug report.

Revision history for this message
Anders Kaseorg (andersk) wrote :
Revision history for this message
Anders Kaseorg (andersk) wrote :
tags: added: patch
tags: added: patch-accepted-debian
Changed in gnutls28 (Debian):
status: Unknown → Fix Released
Revision history for this message
Marc Deslauriers (mdeslaur) wrote :

Debdiff looks good, uploaded. Thanks!

Mathew Hodson (mhodson)
Changed in gnutls28 (Ubuntu):
importance: Undecided → High
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package gnutls28 - 3.5.3-5ubuntu1

---------------
gnutls28 (3.5.3-5ubuntu1) yakkety; urgency=medium

  * Merge with Debian (LP: #1624856). Remaining changes:
    - debian/patches/disable_global_init_override_test.patch: disable failing
      test.
    - debian/patches/add-openssl-test-link.patch: add link for libssl

gnutls28 (3.5.3-5) experimental; urgency=medium

  * Pull DTLS fixes from upstream GIT master.
    45_01-tests-enhance-the-DTLS-window-unit-test-to-account-f.patch
    45_02-dtls-ensure-that-the-DTLS-window-doesn-t-get-stalled.patch
    45_03-tests-mini-dtls-record-modified-expected-order-to-ac.patch
    45_04-Import-DTLS-sliding-window-validation-from-OpenConne.patch
    Closes: #835587

 -- Anders Kaseorg <email address hidden> Sun, 18 Sep 2016 08:03:47 -0400

Changed in gnutls28 (Ubuntu):
status: New → Fix Released
Changed in gnutls (Fedora):
importance: Unknown → Undecided
status: Unknown → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.