systemd-tmpfiles-setup.service fails to create /var/run directories

Bug #1818814 reported by Kalle Tuulos
This bug affects 1 person
Affects           Status     Importance  Assigned to  Milestone
systemd (Ubuntu)  Won't Fix  Undecided   Unassigned
Xenial            Won't Fix  Undecided   Unassigned

Bug Description

1) The release of Ubuntu you are using, via 'lsb_release -rd' or System -> About Ubuntu
Description: Ubuntu 16.04.6 LTS
Release: 16.04

2) The version of the package you are using, via 'apt-cache policy pkgname' or by checking in Software Center
systemd:
  Installed: 229-4ubuntu21.16
  Candidate: 229-4ubuntu21.16
  Version table:
 *** 229-4ubuntu21.16 500
        500 http://archive.ubuntu.com/ubuntu xenial-updates/main amd64 Packages
        500 http://security.ubuntu.com/ubuntu xenial-security/main amd64 Packages
        100 /var/lib/dpkg/status
     229-4ubuntu4 500
        500 http://archive.ubuntu.com/ubuntu xenial/main amd64 Packages

3) What you expected to happen
4) What happened instead

The Ubuntu server (running in an OpenVZ VPS farm, hence the old kernel version) had been running happily until I performed an apt-get upgrade and rebooted the server. After the reboot, I could not establish an SSH connection to the server; the connection to port 22 was refused.

I opened an HTML console to my server instance and checked the logs. They showed that the SSH server could not start because the /var/run/sshd directory did not exist. Scrolling back through /var/log/syslog, I noticed that many other /var/run subdirectories had not been created either. Here is a cut-and-paste from /var/log/syslog of the lines related to systemd-tmpfiles:

---8<---8<---
Mar 6 12:32:54 vspk systemd-tmpfiles[81]: [/usr/lib/tmpfiles.d/00rsyslog.conf:6] Duplicate line for path "/var/log", ignoring.
Mar 6 12:32:54 vspk systemd[1]: Starting Raise network interfaces...
Mar 6 12:32:54 vspk systemd-tmpfiles[81]: fchownat() of /run/named failed: Invalid argument
Mar 6 12:32:54 vspk systemd-tmpfiles[81]: Failed to validate path /var/run/fail2ban: Too many levels of symbolic links
Mar 6 12:32:54 vspk systemd-tmpfiles[81]: Failed to validate path /var/run/screen: Too many levels of symbolic links
Mar 6 12:32:54 vspk systemd-tmpfiles[81]: Failed to validate path /var/run/sshd: Too many levels of symbolic links
Mar 6 12:32:54 vspk systemd-tmpfiles[81]: Failed to validate path /var/run/sudo: Too many levels of symbolic links
Mar 6 12:32:54 vspk systemd-tmpfiles[81]: Failed to validate path /var/run/sudo/ts: Too many levels of symbolic links
Mar 6 12:32:54 vspk systemd-tmpfiles[81]: fchownat() of /run/utmp failed: Invalid argument
Mar 6 12:32:54 vspk systemd-tmpfiles[81]: fchownat() of /run/systemd/netif failed: Invalid argument
Mar 6 12:32:54 vspk systemd-tmpfiles[81]: fchownat() of /run/systemd/netif/links failed: Invalid argument
Mar 6 12:32:54 vspk systemd-tmpfiles[81]: fchownat() of /run/systemd/netif/leases failed: Invalid argument
Mar 6 12:32:54 vspk systemd-tmpfiles[81]: Failed to validate path /var/run/zabbix: Too many levels of symbolic links
Mar 6 12:32:54 vspk systemd-tmpfiles[81]: fchownat() of /run/log/journal failed: Invalid argument
Mar 6 12:32:54 vspk systemd-tmpfiles[81]: fchownat() of /run/log/journal/6d9c7cc322ee4c48af7c0ec3b492b5cc failed: Invalid argument
Mar 6 12:32:54 vspk systemd-tmpfiles[81]: fchownat() of /run/log/journal/6d9c7cc322ee4c48af7c0ec3b492b5cc/system.journal failed: Invalid argument
Mar 6 12:32:54 vspk systemd[1]: systemd-tmpfiles-setup.service: Main process exited, code=exited, status=1/FAILURE
Mar 6 12:32:54 vspk systemd[1]: Failed to start Create Volatile Files and Directories.
Mar 6 12:32:54 vspk systemd[1]: systemd-tmpfiles-setup.service: Unit entered failed state.
Mar 6 12:32:54 vspk systemd[1]: systemd-tmpfiles-setup.service: Failed with result 'exit-code'.
---8<---8<---

My first idea was that, for some reason, systemd-tmpfiles was not able to create the /var directory properly, so I renamed /usr/lib/tmpfiles.d/var.conf to 0000var.conf, but that did not help. The only difference was the first line of the above log, complaining about a duplicate line for path /var/log.
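The affected paths in a syslog excerpt like the one above can be listed with a small filter. This is only a sketch: the function name tmpfiles_failed_paths is made up for illustration, and it matches only the two message shapes that appear in this log ("Failed to validate path ..." and "fchownat() of ... failed").

```shell
# Print each path that systemd-tmpfiles reported a problem with, one per line.
# Reads syslog text on stdin. Only the two message shapes seen in this report
# are recognised; other tmpfiles errors are ignored.
tmpfiles_failed_paths() {
    sed -n \
        -e 's/.*systemd-tmpfiles\[[0-9]*\]: Failed to validate path \([^:]*\): .*/\1/p' \
        -e 's/.*systemd-tmpfiles\[[0-9]*\]: fchownat() of \(.*\) failed: .*/\1/p' \
    | sort -u
}
```

It would be run as `tmpfiles_failed_paths < /var/log/syslog`, which gives a deduplicated list to compare against the directories the failing services expect.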

As I filed this report using the "ubuntu-bug" command, I assume the necessary background information is attached automatically. I'll check it and add more if necessary.

ProblemType: Bug
DistroRelease: Ubuntu 16.04
Package: systemd 229-4ubuntu21.16
Uname: Linux 2.6.32-042stab128.2 x86_64
ApportVersion: 2.20.1-0ubuntu2.18
Architecture: amd64
CurrentDmesg:

Date: Wed Mar 6 12:43:51 2019
Lsusb: Error: command ['lsusb'] failed with exit code 1:
ProcEnviron:
 TERM=vt220
 PATH=(custom, no user)
ProcInterrupts: Error: [Errno 2] No such file or directory: '/proc/interrupts'
ProcKernelCmdLine: quiet
ProcModules:

SourcePackage: systemd
UpgradeStatus: No upgrade log present (probably fresh install)

Revision history for this message
Kalle Tuulos (kalle-tuulos) wrote :

Additional information: the problem appeared after updating libsystemd0:amd64 from 229-4ubuntu21.10 to 229-4ubuntu21.16

Kalle Tuulos (kalle-tuulos) wrote :

I got this fixed by manually downgrading packages libsystemd0 and systemd to versions 229-4ubuntu21.10. I just downloaded those from here: https://launchpad.net/~ubuntu-security-proposed/+archive/ubuntu/ppa/+build/15710450

After rebooting the server, the system works fine again. However, I now have to take care not to upgrade my other VPS servers before this bug has been fixed.

Dan Streetman (ddstreet) wrote :

@kalle-tuulos,

are you able to test with the latest systemd pkgs to see if this still happens? If it does, can you paste the output of:

$ grep . /etc/tmpfiles.d/* /run/tmpfiles.d/* /usr/lib/tmpfiles.d/*

Kalle Tuulos (kalle-tuulos) wrote :

The problem still exists in the latest (229-4ubuntu21.21) version.

The grep result is attached as "grep_result.txt" file:

kalle@nxld:~$ grep . /etc/tmpfiles.d/* /run/tmpfiles.d/* /usr/lib/tmpfiles.d/* > grep_result.txt
grep: /run/tmpfiles.d/*: No such file or directory
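The "No such file or directory" message above only means that /run/tmpfiles.d does not exist on this host. To see which configuration lines mention one particular path, a search along the following lines may be handier than grepping everything. It is a sketch; the function name tmpfiles_entries_for is hypothetical.

```shell
# List tmpfiles.d lines that mention a given path, as file:line:content.
# $1 = path to look for; remaining arguments = directories to search.
tmpfiles_entries_for() {
    tfe_path=$1
    shift
    for tfe_dir in "$@"; do
        # Skip directories that do not exist (here: /run/tmpfiles.d).
        [ -d "$tfe_dir" ] || continue
        # -r recurse, -H show file name, -n show line number, -F fixed string.
        # A directory without a match is not treated as an error.
        grep -rHnF -- "$tfe_path" "$tfe_dir" || true
    done
}
```

For example, `tmpfiles_entries_for /var/run/sshd /etc/tmpfiles.d /run/tmpfiles.d /usr/lib/tmpfiles.d` shows which file asks for the sshd directory.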

Kalle Tuulos (kalle-tuulos) wrote :

Correction to my previous comment: the /var/run directories seem to be pretty much OK now, i.e. openssh starts, but mysql still refuses to start. I'll try to find out the reason for that and report back here.

Dan Streetman (ddstreet)
Changed in systemd (Ubuntu):
status: New → Incomplete
Kalle Tuulos (kalle-tuulos) wrote :

Now I tested on the original host. On that machine, the update to 229-4ubuntu21.21 did not work: the /var/run directories are not created properly and e.g. the OpenSSH server can't start. The following is printed in the log:

Apr 10 13:31:24 vspk systemd[1]: Starting OpenBSD Secure Shell server...
Apr 10 13:31:24 vspk sshd[1928]: Missing privilege separation directory: /var/run/sshd
Apr 10 13:31:24 vspk systemd[1]: ssh.service: Control process exited, code=exited status=255
Apr 10 13:31:24 vspk systemd[1]: Failed to start OpenBSD Secure Shell server.
Apr 10 13:31:24 vspk systemd[1]: ssh.service: Unit entered failed state.
Apr 10 13:31:24 vspk systemd[1]: ssh.service: Failed with result 'exit-code'.

Grep results are attached.

Changed in systemd (Ubuntu):
status: Incomplete → New
Dan Streetman (ddstreet) wrote :

Did you create/copy this file yourself? systemd drops the var.conf file there, not 0000var.conf...not that I see any reason it would cause any problems, but it shouldn't be there.

/usr/lib/tmpfiles.d/0000var.conf

What do your /var/run and /run dirs look like?

$ ls -lad /run /var/run

Changed in systemd (Ubuntu):
status: New → Incomplete
Kalle Tuulos (kalle-tuulos) wrote :

About 0000var.conf, I wrote this in the initial bug report:
"My first idea was that, for some reason, systemd-tmpfiles was not able to create the /var directory properly, so I renamed /usr/lib/tmpfiles.d/var.conf to 0000var.conf, but that did not help."

The output of "ls -lad" is as follows:

kalle@vspk:~$ ls -lad /run /var/run
drwxr-xr-x 31 root root 980 Apr 10 23:03 /run
lrwxrwxrwx 1 root root 4 May 14 2018 /var/run -> /run

Changed in systemd (Ubuntu):
status: Incomplete → New
Dan Streetman (ddstreet) wrote :

Your /run dir and /var/run link look fine - I am 99% sure that tmpfiles has nothing to do with whatever problem you're having (tmpfiles doesn't even create the /var/run symlink, base-files does).

> /var/run directories are not created properly

what specifically do you mean by this? just that you see log errors?

What's the output of:

$ sudo find /run -ls

$ df

Changed in systemd (Ubuntu):
status: New → Incomplete
Kalle Tuulos (kalle-tuulos) wrote :

>> /var/run directories are not created properly

>what specifically do you mean by this? just that you see log errors?

The following directories were not created automatically; I had to create them manually to enable the services to start:

/var/run/fail2ban
/var/run/screen
/var/run/sshd
/var/run/zabbix

The following directory was created, but its ownership was not correct:

/var/run/redis
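The manual workaround described above could be scripted roughly as follows. This is a sketch under assumptions: the function name ensure_run_dirs is made up, and mode 0755 with the calling user as owner is a placeholder, so the mode and owner each service really needs should be taken from its tmpfiles.d entry.

```shell
# Create any missing runtime directories under a base directory.
# $1 = base directory (normally /run); remaining arguments = directory names.
# ASSUMPTION: mode 0755 and default ownership; check the service's
# tmpfiles.d entry for the real mode, user and group.
ensure_run_dirs() {
    erd_base=$1
    shift
    for erd_name in "$@"; do
        if [ ! -d "$erd_base/$erd_name" ]; then
            mkdir -p -m 0755 "$erd_base/$erd_name" && echo "created $erd_base/$erd_name"
        fi
    done
}
```

Run as root, for example `ensure_run_dirs /run fail2ban screen sshd zabbix`. Since /run is a tmpfs, the directories disappear at the next reboot and would have to be recreated before the services start.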

> What's the output of:

> $ sudo find /run -ls
Please see attached file: sudo_find_run_ls.txt (working situation) and find_run_ls_after_reboot.txt (after the system was rebooted i.e. not working situation).

> $ df
Filesystem     1K-blocks    Used Available Use% Mounted on
/dev/simfs      52428800 7818704  44610096  15% /
devtmpfs         1572864       0   1572864   0% /dev
tmpfs            1572864       0   1572864   0% /dev/shm
tmpfs            1572864    9488   1563376   1% /run
tmpfs               5120       4      5116   1% /run/lock
tmpfs            1572864       0   1572864   0% /sys/fs/cgroup
none             1572864       0   1572864   0% /run/shm

To make this a bit more complete, I removed the 0000var.conf and restarted the system. When I checked the md5sums of the files in /usr/lib/tmpfiles.d, they matched those on my other Ubuntu 16.04 server.

After the reboot, the relevant parts of syslog showed the following:
---8<---8<---
Apr 11 08:44:50 vspk systemd-tmpfiles[85]: [/usr/lib/tmpfiles.d/var.conf:14] Duplicate line for path "/var/log", ignoring.
Apr 11 08:44:50 vspk systemd-tmpfiles[85]: fchownat() of /run/named failed: Invalid argument
Apr 11 08:44:50 vspk systemd-tmpfiles[85]: Failed to validate path /var/run/fail2ban: Too many levels of symbolic links
Apr 11 08:44:50 vspk systemd-tmpfiles[85]: fchownat() of /run/redis failed: Invalid argument
Apr 11 08:44:50 vspk systemd-tmpfiles[85]: Failed to validate path /var/run/screen: Too many levels of symbolic links
Apr 11 08:44:50 vspk systemd-tmpfiles[85]: Failed to validate path /var/run/sshd: Too many levels of symbolic links
Apr 11 08:44:50 vspk systemd-tmpfiles[85]: Failed to validate path /var/run/sudo: Too many levels of symbolic links
Apr 11 08:44:50 vspk systemd-tmpfiles[85]: Failed to validate path /var/run/sudo/ts: Too many levels of symbolic links
Apr 11 08:44:50 vspk systemd-tmpfiles[85]: fchownat() of /run/utmp failed: Invalid argument
Apr 11 08:44:50 vspk systemd-tmpfiles[85]: fchownat() of /run/systemd/netif failed: Invalid argument
Apr 11 08:44:50 vspk systemd-tmpfiles[85]: fchownat() of /run/systemd/netif/links failed: Invalid argument
Apr 11 08:44:50 vspk systemd-tmpfiles[85]: fchownat() of /run/systemd/netif/leases failed: Invalid argument
Apr 11 08:44:50 vspk systemd-tmpfiles[85]: Failed to validate path /var/run/zabbix: Too many levels of symbolic links
Apr 11 08:44:50 vspk systemd-tmpfiles[85]: fchownat() of /run/log/journal failed: Invalid argument
Apr 11 08:44:50 vspk systemd-tmpfiles[85]: fchownat() of /run/log/journal/6d9c7cc322ee4c48af7c0ec3b492b5cc failed: Invalid argument
Apr 11 08:44:50 vspk systemd-tmpfiles[85]: fchownat() of /run/log/journal/6d9c7cc322ee4c48af7c0ec3b492b5cc/system.journal failed: Invalid argument
Apr 11 08:44:50 vspk systemd[1]: systemd-tmpfiles-setup.service: Main process e...

Kalle Tuulos (kalle-tuulos) wrote :
Changed in systemd (Ubuntu):
status: Incomplete → New
Dan Streetman (ddstreet) wrote :

ok, maybe there is a tmpfiles problem :)

can you test with the systemd pkg from this ppa?
https://launchpad.net/~ddstreet/+archive/ubuntu/lp1818814

Changed in systemd (Ubuntu):
status: New → Fix Released
Changed in systemd (Ubuntu Xenial):
status: New → In Progress
importance: Undecided → Medium
assignee: nobody → Dan Streetman (ddstreet)
Kalle Tuulos (kalle-tuulos) wrote :

Yes, tested, and sorry, it did not help. The same errors are still in syslog, and /var/run doesn't have the proper directories for sshd etc.

The systemd (etc) version is now as follows:
kalle@vspk:~$ apt-cache policy systemd
systemd:
  Installed: 229-4ubuntu21.21+bug1818814v20190411b1
  Candidate: 229-4ubuntu21.21+bug1818814v20190411b1
  Version table:
 *** 229-4ubuntu21.21+bug1818814v20190411b1 500
        500 http://ppa.launchpad.net/ddstreet/lp1818814/ubuntu xenial/main amd64 Packages
        100 /var/lib/dpkg/status
     229-4ubuntu21.21 500
        500 http://archive.ubuntu.com/ubuntu xenial-updates/main amd64 Packages
        500 http://security.ubuntu.com/ubuntu xenial-security/main amd64 Packages
     229-4ubuntu4 500
        500 http://archive.ubuntu.com/ubuntu xenial/main amd64 Packages

Dan Streetman (ddstreet) wrote :

ok, let's try this; please create a file with the contents and location shown below:

ubuntu@lp1818814:~$ cat /etc/systemd/system/systemd-tmpfiles-setup.service.d/debug.conf
[Service]
Environment="SYSTEMD_LOG_LEVEL=debug"
PassEnvironment=SYSTEMD_LOG_LEVEL

then, reboot (so the problem is reproduced) and get the output of

$ journalctl -b
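One way to create the drop-in shown above is sketched below. The function name write_tmpfiles_debug_dropin and its optional root-directory argument are assumptions made for illustration; the file contents are exactly the three lines quoted above.

```shell
# Write the debug drop-in for systemd-tmpfiles-setup.service.
# $1 = filesystem root to write under; omit it for the live system
#      (which requires root privileges).
write_tmpfiles_debug_dropin() {
    wtd_dir="${1:-}/etc/systemd/system/systemd-tmpfiles-setup.service.d"
    mkdir -p "$wtd_dir" || return 1
    cat > "$wtd_dir/debug.conf" <<'EOF'
[Service]
Environment="SYSTEMD_LOG_LEVEL=debug"
PassEnvironment=SYSTEMD_LOG_LEVEL
EOF
}
```

After the reboot, `journalctl -b -u systemd-tmpfiles-setup.service` narrows the journal to this unit if the full `journalctl -b` output is too long to read through.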

Dimitri John Ledkov (xnox) wrote :

In addition to debug output from systemd-tmpfiles-setup.service as above, it is also interesting to know the permissions on / itself, i.e. what's the output of:
$ ls -la /

Kalle Tuulos (kalle-tuulos) wrote :

For Dimitri:

> In addition to debug output from systemd-tmpfiles-setup.service
> as above, it is also interesting to know the permissions
> on / itself, i.e. what's the output of:
> $ ls -la /

kalle@vspk:~$ ls -la /
total 1272
drwxr-xr-x 24 root root 4096 Apr 11 15:18 .
drwxr-xr-x 24 root root 4096 Apr 11 15:18 ..
drwx------ 2 root root 4096 Nov 27 2016 .cpt_hardlink_dir_a920e4ddc233afddc9fb53d26c392319
-rw-r--r-- 1 root root 0 Apr 11 15:20 .vzfifo
drwxr-xr-x 2 root root 4096 Apr 11 15:17 bin
drwxr-xr-x 2 root root 4096 May 14 2018 boot
-rw------- 1 root root 1671168 Apr 11 15:19 core
drwxr-xr-x 5 root root 640 Apr 11 15:19 dev
drwxr-xr-x 154 root root 12288 Apr 11 15:18 etc
drwxr-xr-x 5 root root 4096 Apr 10 15:46 home
drwxr-xr-x 16 root root 4096 Jun 9 2018 lib
drwxr-xr-x 2 root root 4096 Mar 4 08:56 lib64
drwx------ 2 root root 4096 Nov 27 2016 lost+found
drwxr-xr-x 2 root root 4096 Nov 27 2016 media
drwxr-xr-x 2 root root 4096 Nov 27 2016 mnt
drwxr-xr-x 2 root root 4096 Nov 27 2016 opt
dr-xr-xr-x 139 root root 0 Apr 11 15:18 proc
drwx------ 9 root root 4096 Apr 11 08:29 root
drwxr-xr-x 32 root root 980 Apr 12 08:25 run
drwxr-xr-x 2 root root 12288 Apr 11 15:17 sbin
drwxr-xr-x 2 root root 4096 May 14 2018 snap
drwxr-xr-x 2 root root 4096 Nov 27 2016 srv
drwxr-xr-x 7 root root 0 Apr 11 15:18 sys
drwxrwxrwt 9 root root 4096 Apr 12 08:25 tmp
drwxr-xr-x 10 root root 4096 Nov 27 2016 usr
drwxr-xr-x 13 root root 4096 Dec 17 13:37 var

Kalle Tuulos (kalle-tuulos) wrote :

For Dan:

> then, reboot (so the problem is reproduced) and get the output of
> $ journalctl -b

Attached.

Dan Streetman (ddstreet) wrote :

AHA!

> Uname: Linux 2.6.32-042stab128.2 x86_64

you can't run this ancient kernel in a Xenial release (and expect things to work right). That's the cause of your problems; tmpfiles was recently updated to start using fchownat() with the AT_EMPTY_PATH flag, and:

$ man fchownat | grep AT_EMPTY_PATH
       AT_EMPTY_PATH (since Linux 2.6.39)

upgrade your kernel to a supported version provided by Ubuntu in Xenial.
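A rough version check against the 2.6.39 threshold from the man page can be sketched as follows. The function name kernel_at_least is hypothetical, and the check relies on `sort -V` from GNU coreutils. It compares only the upstream part of the release string, so it is a hint at best: a vendor kernel can carry backported features under an older version number.

```shell
# Succeed if the upstream part of kernel release $1 is at least version $2.
# Everything from the first "-" onwards in $1 is ignored, so vendor
# backports are NOT detected by this check.
kernel_at_least() {
    kal_have=${1%%-*}     # "2.6.32-042stab128.2" -> "2.6.32"
    kal_want=$2
    # Version-sort both values; the minimum must be the wanted version.
    kal_lowest=$(printf '%s\n%s\n' "$kal_want" "$kal_have" | sort -V | head -n 1)
    [ "$kal_lowest" = "$kal_want" ]
}
```

On the affected host, `kernel_at_least "$(uname -r)" 2.6.39 || echo "kernel predates AT_EMPTY_PATH"` would print the warning for the 2.6.32-042stab128.2 kernel reported above.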

Changed in systemd (Ubuntu Xenial):
status: In Progress → Invalid
Changed in systemd (Ubuntu):
status: Fix Released → Invalid
Changed in systemd (Ubuntu Xenial):
assignee: Dan Streetman (ddstreet) → nobody
importance: Medium → Undecided
Kalle Tuulos (kalle-tuulos) wrote :

>> Uname: Linux 2.6.32-042stab128.2 x86_64
> you can't run this ancient kernel in a Xenial release

This is something I (and quite a few others) can't do anything about, as the kernel is provided by OpenVZ. When I last asked the service provider, they estimated that an upgrade would happen in Q2/2019.

> tmpfiles was recently updated to start using fchownat()
> with the AT_EMPTY_PATH flag, and:
> $ man fchownat | grep AT_EMPTY_PATH
> AT_EMPTY_PATH (since Linux 2.6.39)

This means, that the Xenial LTS has been broken :(

I'm changing this status back to New, as there needs to be a way to support Xenial installations in OpenVZ environments, until service providers have upgraded their systems.

Changed in systemd (Ubuntu):
status: Invalid → New
Dan Streetman (ddstreet) wrote :

> This is something I (and quite a few others) can't do anything about, as the kernel
> is provided by OpenVZ

> This means, that the Xenial LTS has been broken :(

well...what you mean is that Xenial LTS is broken *when using a custom kernel from OpenVZ*. The Xenial LTS isn't broken when using any kernel provided by Ubuntu in the Xenial release. So, it's rather likely someone else will come along and set this bug back to Invalid.

> I'm changing this status back to New, as there needs to be a way to support
> Xenial installations in OpenVZ environments

a way for who to support Xenial, OpenVZ? Sure, they could provide a custom systemd package, that works with the custom kernel they provide.

Without a custom systemd package from OpenVZ, you have a couple options; you can pin your systemd version to the last one before the change to use fchownat(). You'll never get any more systemd upgrades for bugfixes, or security issues, though. Alternately, you can maintain your own systemd package with a patch to replace the fchownat() usage; I have a test build in this ppa
https://launchpad.net/~ddstreet/+archive/ubuntu/lp1818814

The last patch listed in the debian/patches/series file is what reverts the fchownat() back to a fchown().
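For the pinning option, an apt preferences stanza can be generated as sketched below. The function name apt_pin_stanza and the file name used in the usage note are made up for illustration, and the version shown there is simply the one the reporter downgraded to earlier in this report. As noted above, pinning means giving up later bug and security fixes.

```shell
# Print an apt preferences stanza that pins one package to one version.
# $1 = package name, $2 = exact version string.
# A priority above 1000 makes apt keep that version, or even downgrade to it.
apt_pin_stanza() {
    printf 'Package: %s\nPin: version %s\nPin-Priority: 1001\n\n' "$1" "$2"
}
```

The output could be written to a file under /etc/apt/preferences.d/, for example with `{ apt_pin_stanza systemd 229-4ubuntu21.10; apt_pin_stanza libsystemd0 229-4ubuntu21.10; } | sudo tee /etc/apt/preferences.d/systemd-pin`. Other binary packages built from the systemd source may need the same treatment.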

Dimitri John Ledkov (xnox) wrote :

OpenVZ has been proactive w.r.t. this issue and issued an update that includes the required backports a long time ago.

Please see this comment: https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1804847/comments/20

"""
Updated OpenVz6 kernel was released:
https://wiki.openvz.org/Download/kernel/rhel6/042stab134.7

We are very grateful for Ubuntu team for reverting of patches specially for OpenVz.

For affected hosters: OpenVz6 is great but it is really old,
and similar incidents can happen again and again.
Please think about switch to RHEL7-based OpenVz7.

Thank you,
   Vasily Averin
"""

That kernel was released in November 2018. All your provider needs to do is apply the OpenVZ updates.

From Ubuntu's point of view this is a wontfix, as providing systemd without using fchownat opens a security vulnerability, CVE-2018-6954.

Please upgrade to OpenVZ kernel 042stab134.7 or anything newer. I believe the latest kernel is currently 042stab136.1.

@ddstreet please delete your packages from the PPA, as you are intentionally distributing security vulnerable systemd.

Regards,

Dimitri.

Changed in systemd (Ubuntu Xenial):
status: Invalid → Won't Fix
Changed in systemd (Ubuntu):
status: New → Won't Fix
Dan Streetman (ddstreet) wrote :

> @ddstreet please delete your packages from the PPA, as you are intentionally
> distributing security vulnerable systemd.

hmm, i see you didn't actually look at my ppa, since if you did you'd know i didn't revert the security patches. so no need to delete it.
