~mamarley/openconnect/+git/gitlab-main:oncp-braindamage

Last commit made on 2023-04-19
Get this branch:
git clone -b oncp-braindamage https://git.launchpad.net/~mamarley/openconnect/+git/gitlab-main

Branch merges

Branch information

Name:
oncp-braindamage
Repository:
lp:~mamarley/openconnect/+git/gitlab-main

Recent commits

ac11fd7... by dwmw2

Fix oncp Legacy IP check

Signed-off-by: David Woodhouse <email address hidden>

8ad8486... by dwmw2

more oncp debug

Signed-off-by: David Woodhouse <email address hidden>

939e130... by dwmw2

Attempt to handle Legacy IP frames in the middle of oNCP config

Is there no end to the stupidity of these proprietary protocols?

So... TCP provides you with a byte stream. Forget IP packets underneath;
TCP hides all that for you, and all there is is a hosepipe of bytes.

On top of that, TLS provides 'records' of up to 16KiB, encrypting each
record and sending it down the TCP pipe, then collecting the whole record
on the receiving end, decrypting it and providing it to the user. Mostly,
nobody cares much about TLS records, and they choose to treat TLS just as
a byte stream too.

The Juniper oNCP protocol adds its own layers on top of that TLS byte
stream. First there's the thing which OpenConnect has been calling the
'SSL record', which has a little-endian 16-bit length followed by that
many bytes of data. Then another 16-bit length and that much data, etc.
Those "SSL records" may be correlated with the TLS records, or maybe I
just chose a really stupid name back in 2015 when I first did this. Who
knows, who cares, since we mostly ignore TLS records anyway.

On top of *this* there is an actual packet-based protocol which may
perhaps be called something like KMP. There's a 20-byte header, most of
which we don't understand but it has a type field and a length that we
do understand. This one is big-endian, unlike the one below. Type 300
is data, Type 301 is the configuration we get when we first connect.

Within the KMP300 packets there can be multiple actual IP packets; we
need to look at the length field of the IP packet and split them apart
for delivering them to the tun device. There is no apparent relationship
between the KMP300 packets and the SSL record below; you can have many
KMP300 packets within an SSL record, and the SSL record boundary can
fall *within* a KMP300 packet. So the SSL records have, thus far, just
been a pointless nuisance which means we occasionally have to drop a
2-byte length field from what would otherwise just be treated as a
simple byte stream of data which packetises itself.

Thus far.

If you have a large enough KMP301 config frame (for example, if you have
lots of split include routes), it might get sent in more than one SSL
record. So OpenConnect has code to keep reading those records until the
KMP301 frame is complete. And it *did* assume that the SSL record would
end at the end of the KMP301 frame, since we haven't even finished the
negotiation at this point; we have to send a KMP303 frame back to the
server with our idea of the MTU.

Today we have learned that the server might throw a KMP300 packet with
an IP frame into the *middle* between the two consecutive records that
comprise the KMP301 config frame:
https://gitlab.com/openconnect/openconnect/-/issues/562#note_1358466466

It's not clear if this will only ever be *one* IP packet, or if it can
end up being lots of packets and split across multiple SSL records. In
the example I've seen, it was a *unicast* packet for the VPN client
address.

Attempt to work around the case we've already seen, by discarding a
continuation that looks like a KMP300 with precisely one Legacy IP
packet in it. We *are* also assuming that we at least get the *start*
of the KMP301 config frame first.

Perhaps we'll end up having to keep a list of the records we receive,
then attempt to piece them together more flexibly. But this should
suffice for now.

Fixes: #562
Signed-off-by: David Woodhouse <email address hidden>

e1a3be3... by dwmw2

Fix EINTR handling for select() on cmd_fd

KDE plasma-nm was failing with external browser authentication, reporting
'Socket connect cancelled' immediately after spawning the browser command.

This was caused by the SIGCHLD which kded handles (instead of it being
ignored as in the standalone openconnect process). Thus, the select()
call returns with EINTR, and the fd sets are not changed.

So when check_cmd_fd(vpninfo, &rd_set) is called, it looks like the cmd_fd
was readable. And since plasma-nm uses the legacy cancel_fd setup, the
mere fact of it being *readable* is enough to set vpninfo->got_cancel_cmd.

Instead of forging ahead and interpreting the fd_sets after select()
returns with EINTR, loop and go straight back into select() instead.

For signals like SIGINT from Ctrl-C, we *handle* those and deliver the
cancel command anyway, so we don't depend on the EINTR return in that
case.

cf. https://gitlab.gnome.org/GNOME/NetworkManager-openconnect/-/merge_requests/49#note_1725203

Signed-off-by: David Woodhouse <email address hidden>

7e41de5... by Dan Lenski

More specific error message with proposed workaround for Pulse EAP-TLS requests

See https://gitlab.com/openconnect/openconnect/-/merge_requests/414#note_1149887354
for discussion of the apparent layering insanity that's going on in this
corner case.

Signed-off-by: Daniel Lenski <email address hidden>

f96dae3... by dwmw2

Report unexpected Pulse EAP requests more explicitly

Signed-off-by: David Woodhouse <email address hidden>

2560a29... by Dan Lenski

Add --no-external-auth option, and follow it for Cisco protocol

This option is intended to prevent OpenConnect from advertising to the
server that it supports any kind of authentication mode that requires an
external browser. Some servers will force the client to use such an
authentication mode if the client advertises it, but fallback to a
"scriptable" authentication mode if the client doesn't appear to support it.
See https://gitlab.com/openconnect/openconnect/-/issues/470#note_1028595620
for an example.

This option is only implemented here for the Cisco protocol, in which case
it causes OpenConnect not to advertise the 'single-sign-on-v2' or
'single-sign-on-external-browser' auth-methods.

I have verified on one Cisco VPN that this works as intended, as has one
other user (see
https://gitlab.com/openconnect/openconnect/-/issues/470#note_1045509425).

Signed-off-by: Daniel Lenski <email address hidden>

89ba799... by dwmw2

Fix --server vs. positional argument handling

If we have a --server argument, we shouldn't expect to find a positional
argument at argv[optind]. We mostly got this right, except that we still
called config_lookup_host() with argv[optind] even when --server was
given.

The only thing that saved us from dying with a strcmp() on NULL was
the fact that the loop over the XML elements is using the fact that
vpninfo->hostname gets set as its terminating condition, so it never
got that far.

Fix that, because it's icky. And make the --server argument work for
XML config lookups too.

Fixes: a2fd6f4f2e8a ("New option to define server name in config file")
Signed-off-by: David Woodhouse <email address hidden>

8560cef... by dwmw2

Set SOCK_CLOEXEC on listening socket for Cisco external browser support

Not entirely sure, but it seems to be accused of causing a hang in
https://gitlab.gnome.org/GNOME/NetworkManager-openconnect/-/merge_requests/49#note_1720281
and regardless of whether that's the case or not, we should be consistent
about using {SOCK,O}_CLOEXEC whenever we can.

Signed-off-by: David Woodhouse <email address hidden>

98f10f9... by dwmw2

Resync translations with sources

Signed-off-by: David Woodhouse <email address hidden>