Comment 3 for bug 1696415

Sjors Gielen (sgielen) wrote :

I've narrowed the root cause down to the /usr/lib/NetworkManager/nm-dhcp-helper tool. Occasionally this binary runs but fails to deliver the lease update to NetworkManager. No errors are logged when this happens; NetworkManager in debug mode just reports "accepted connection on private socket" followed by "closed connection on private socket", and no update takes place.
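
To get a rough idea of whether a machine is affected, the log lines can be counted. This is only a sketch: the function name summarize_nm_helper_log is made up for illustration, the two socket messages are the ones quoted above, and " lease time " is the pattern the workaround script greps for. Connections made for non-lease events (such as PREINIT) are expected to have no lease line, so a difference between the counts is a hint rather than proof.

```shell
#!/bin/bash
# Sketch: summarise NetworkManager log lines read from stdin.
# Counts private-socket connections and lease updates so they can be compared.
summarize_nm_helper_log() {
  awk '
    /accepted connection on private socket/ { accepted++ }
    /closed connection on private socket/   { closed++ }
    / lease time /                          { leases++ }
    END { printf "accepted=%d closed=%d leases=%d\n", accepted, closed, leases }
  '
}
```

It could be fed from the journal, for example with: journalctl --since "1 hour ago" | grep NetworkManager | summarize_nm_helper_log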

I've managed to work around the issue by replacing /usr/lib/NetworkManager/nm-dhcp-helper with a wrapper script that repeats the same lease update until the logs indicate that NetworkManager received it, giving up after 10 attempts. This doesn't fix the communication problem, but it adds a safety net that prevents the resulting issues. It's been tested in an office network of some 12 PCs.

If anyone else runs into this issue, run the following script as root to work around it. Note that it also puts the dhclient AppArmor profile into complain mode:

-----8<-----
#!/bin/bash

HELPERSCRIPT="/usr/lib/NetworkManager/nm-dhcp-helper"
HELPERBIN="/usr/lib/NetworkManager/nm-dhcp-helper.bin"

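# Without readelf we cannot tell the binary from the wrapper, so stop
# before anything gets overwritten.
command -v readelf >/dev/null 2>&1 || { echo "readelf is required" >&2; exit 1; }

# Succeeds (exit status 0) only if readelf finds a valid ELF header in $1.
is_elf() {
 readelf -h "$1" >/dev/null 2>&1
}

# On the first run the helper is still the original binary: keep it as .bin.
# On later runs it is already the wrapper and must not replace the backup.
if is_elf "$HELPERSCRIPT"; then
  mv "$HELPERSCRIPT" "$HELPERBIN"
fi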

cat <<EOF >"$HELPERSCRIPT"
#!/usr/bin/perl
use strict;
use warnings;

if(\$< != 0) {
 die "Must run as root\n";
}

my \$reason = \$ENV{reason} || "";
if(\$reason eq "PREINIT") {
 # not lease information, so waiting for the journal will make
 # nm-dhcp-helper wait for too long, just send it once and exit so
 # dhclient will start to get a lease
 system("${HELPERBIN}");
 exit(0);
}

my \$attempts = 0;
my \$success = 0;
while(\$attempts < 10) {
 \$attempts++;
 my \$time = time();
 sleep(1);
 system("${HELPERBIN}");
 sleep(1);
 my \$leasetime = \`/bin/journalctl --since='\\@\$time' | grep NetworkManager | grep ' lease time ' | wc -l\`;
 if(\$leasetime >= 1) {
  \$success = 1;
  last;
 }
 # Try again in 5 seconds
 sleep(5);
}

if(\$attempts > 1) {
 open my \$fh, ">>", "/tmp/nm-helper-retries.log" or die \$!;
 my \$date = \`/bin/date\`;
 1 while chomp \$date;
 if(\$success) {
  print \$fh "\$date: needed \$attempts attempts to update NetworkManager (\$reason).\n";
 } else {
  print \$fh "\$date: gave up after \$attempts attempts (\$reason).\n";
 }
 close \$fh;
}

exit(0);
EOF
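chmod +x "$HELPERSCRIPT"
# Put the dhclient AppArmor profile into complain mode so that the wrapper
# is not blocked when dhclient runs it.
/usr/sbin/aa-complain /etc/apparmor.d/sbin.dhclient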
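
To undo the workaround later, something along these lines should do. It is a sketch that assumes the script above was used unmodified; restore_helper is a hypothetical name, not something NetworkManager ships.

```shell
#!/bin/bash
# Hypothetical helper: put the original binary back in place of the wrapper.
# $1 = path of the wrapper, $2 = backup of the original binary (the .bin file).
restore_helper() {
  local wrapper="$1" backup="$2"
  if [ ! -f "$backup" ]; then
    echo "no backup at $backup, nothing to restore" >&2
    return 1
  fi
  # Overwrite the wrapper with the saved binary.
  mv -f "$backup" "$wrapper"
}
```

As root, this would be called as restore_helper /usr/lib/NetworkManager/nm-dhcp-helper /usr/lib/NetworkManager/nm-dhcp-helper.bin, followed by /usr/sbin/aa-enforce /etc/apparmor.d/sbin.dhclient to put the AppArmor profile back into enforce mode.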