~mamarley/openconnect/+git/gitlab-main:Windows_10_has_AF_UNIX_socket_w_debug_fprintf

Last commit made on 2022-01-13
Get this branch:
git clone -b Windows_10_has_AF_UNIX_socket_w_debug_fprintf https://git.launchpad.net/~mamarley/openconnect/+git/gitlab-main

Branch merges

Branch information

Name:
Windows_10_has_AF_UNIX_socket_w_debug_fprintf
Repository:
lp:~mamarley/openconnect/+git/gitlab-main

Recent commits

ea476b7... by Dan Lenski

dumb_socketpair(): extra debugging output via fprintf

Signed-off-by: Daniel Lenski <email address hidden>

b62764e... by Dan Lenski

Update changelog

Signed-off-by: Daniel Lenski <email address hidden>

a39a0ea... by Dan Lenski

dumb_socketpair(): Try a whole series of plausible temporary/writable directories for AF_UNIX sockets

This is probably trying too hard. Discussed in
https://gitlab.com/openconnect/openconnect/-/merge_requests/320#note_800005154

Signed-off-by: Daniel Lenski <email address hidden>

dd6e91c... by Dan Lenski

dumb_socketpair(): fallback from AF_UNIX to AF_INET if AF_UNIX fails

1) If bind() fails with an AF_UNIX socket, we should retry with
   AF_INET socket

   Because we have to used named paths for AF_UNIX sockets on Windows, a
   likely point of failure is that the temporary directory in
   nonexistent/non-writable, or even that we somehow have a collision in the
   filename.

2) If any of the other AF_UNIX operations (listen, socket, connect, accept)
   fail, we might as well also retry with AF_INET.

   We don't know of any expected points-of-failure, but all indications are
   that AF_UNIX support in Windows is incomplete, undocumented, and
   potentially buggy.

Signed-off-by: Daniel Lenski <email address hidden>

f7a24ed... by Dan Lenski

dumb_socketpair(): generate named socket path more carefully

Windows forces us to use named-path Unix sockets. Generating a path in the
temporary directory, combining current high-res time and PID, seems like a
less-bad option.

On GitHub, a commenter
[suggested](https://github.com/microsoft/WSL/issues/4240#issuecomment-1010545442)
that it would be better to use
[GetTempFileName](https://docs.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-gettempfilenamea)
here. However, that function:

- Only adds 16 bits of time-based random bits,
- Will currently fail if there aren't 14 characters available for the filename,
- Might conceivably generate paths longer than UNIX_PATH_MAX, and
- Offers no other apparent offsetting advantages

Signed-off-by: Daniel Lenski <email address hidden>

7920ffb... by Dan Lenski

dumb_socketpair(): try to use AF_UNIX socketpair on Windows 10 and newer

As a workaround for the lack of socketpair() on Windows, we use
dumb_socketpair() from https://github.com/ncm/selectable-socketpair/blob/master/socketpair.c,
which uses a pair of IPv4 local sockets (listening on 127.0.0.1).

Unfortunately, and maddeningly, it's possible for the local IPv4 routes
(127.0.0.0/8) to be deleted on Windows; this will prevent dumb_socketpair()
from working in its current form.

See https://gitlab.com/openconnect/openconnect/-/issues/228 and
https://gitlab.com/openconnect/openconnect/-/issues/361 for examples of how
to trigger this condition. The simplest way to do it is with `route /f`.

Fortunately, Windows 10+ supports AF_UNIX sockets, which we should be able
to use to sidestep this issue.

This feature was announced in December 2017 in
https://devblogs.microsoft.com/commandline/af_unix-comes-to-windows.
It is evidently incomplete, and also buggy:

1) "Abstract" sockets don't actually seem to work, and this is probably why
    the socketpair() function still isn't implemented, even though AF_UNIX
    support would naturally enable it:
    https://github.com/microsoft/WSL/issues/4240#issuecomment-506437851
2) Actual MSDN documentation for this feature is seemingly nonexistent.
3) MinGW lacks the expected <afunix.h> header, but other FLOSS projects show
   how to embed the needed `struct sockaddr_un` definition:
   - https://github.com/MisterDA/ocaml/commit/5855ce5ffd931a2802d5b9a5b2987ab0b276fd0a
   - https://github.com/curl/curl/blob/curl-7_74_0/lib/config-win32.h#L725-L734

Nevertheless, it works well enough that we can use it in OpenConnect. The
modified version of dumb_socketpair() in this patch tries to create an AF_UNIX
socketpair, and only uses IPv4 local sockets as a fallback. With this modified
version, I confirm that I can do the following on Windows 10:

1) Nuke routes with `route /f`.
2) Reconnect network adapter via GUI.
3) Confirm that IPv4 local route (127.0.0.0/8) still hasn't been recreated.
4) Run OpenConnect and successfully create the cmd pipe.

So this appears to fix https://gitlab.com/openconnect/openconnect/-/issues/228 and
https://gitlab.com/openconnect/openconnect/-/issues/361, at least on Windows 10
and newer.

Signed-off-by: Daniel Lenski <email address hidden>

f3e5278... by Dimitri Papadopoulos <email address hidden>

Latest version of vendored dumb_socketpair()

These are minor changes, except the return value which really is an
integer error status, not a socket.

Signed-off-by: Dimitri Papadopoulos <email address hidden>

fa028ee... by Dan Lenski

.mailmap update

Pick most-used name and email for those with multiple variants.

Signed-off-by: Daniel Lenski <email address hidden>

4a77079... by Dan Lenski

Merge branch 'wintun-0.10.2-0.13' into 'master'

Update usage of Wintun, avoid disconnections on Windows

Closes #338

See merge request openconnect/openconnect!306

9033a84... by Dimitri Papadopoulos <email address hidden>

Windows: fix instability with Wintun as tunnel device driver

In https://gitlab.com/openconnect/openconnect/-/issues/338, multiple users
reported that connections using Wintun as the tunnel device driver become
non-functional after 20-30 minutes of operation, without
any message from OpenConnect at all.

We analyzed this issue as follows:

1. There was an off-by-one error in the check of outgoing packet size
   against the tunnel device's MTU (`tun_len < pkt->len`). Outgoing packets
   of exactly the MTU size would be considered errors and silently discarded
   by OpenConnect.
2. However, OpenConnect failed to instruct the driver to release these
   discarded packets. They would accumulate in the Wintun driver buffer and
   probably cause an out-of-memory condition, eventually freezing the
   driver.

We fixed the issued as follows:

1. Fix the off-by-one error, changing to `tun_len <= pkt->len`.
2. Always release outgoing packets, even if discarded.
3. Print extended debugging messages when receiving/sending packets. Such
   messages would have helped us diagnose the error much sooner.

Developers and users have confirmed that, with these changes, Wintun
connections run stably for at least 60 minutes. (See
https://gitlab.com/openconnect/openconnect/-/merge_requests/306#note_745393731).

Also, in
https://gitlab.com/openconnect/openconnect/-/merge_requests/300?commit_id=b5ff6f3fb1b8d06cf56426b13c7af96e25cd922b,
we reverted to TAP-Windows as the default driver on Windows due to the
aforementioned stability issues. Although Wintun connections now appear to
be stable, we are not quite ready yet to un/re-revert, and make Wintun the
default driver again.

Signed-off-by: Dimitri Papadopoulos <email address hidden>
Signed-off-by: Daniel Lenski <email address hidden>