open-vm-tools-desktop: resolutionKMS plugin sometimes fails to load at boot

Bug #1818473 reported by Christian Ehrhardt 
This bug affects 1 person
Affects                  Status         Importance  Assigned to  Milestone
VMWare tools             New            Unknown
open-vm-tools (Debian)   Fix Released   Unknown
open-vm-tools (Ubuntu)   Fix Released   Undecided   Unassigned
  Bionic                 Fix Released   Undecided   Unassigned
  Cosmic                 Fix Released   Undecided   Unassigned

Bug Description

[Impact]

 * The service for VMware-based kernel mode setting sometimes fails
   (racy) to work after boot, but succeeds when restarted. The reason
   is that the required kernel module (vmwgfx) is not always loaded in
   time.

 * This is fixed by a drop-in snippet extending open-vm-tools.service
   (the base .service file is not modified because the service and the
   drop-in are owned by different packages).
   The drop-in makes the service ensure the module is loaded in the
   ExecStartPre phase.

 * Backport the change [1] as part of the regular backport we do
   for the latest Ubuntu LTS
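
The drop-in described above can be sketched as follows. The file path and the ExecStartPre line are the ones shown by `systemctl cat` in the verification comment further down in this report. The helper name `write_dropin` and its staging-directory argument are assumptions made purely for illustration; in practice the file is shipped by the package rather than written by hand.

```shell
#!/bin/sh
# Sketch only: stage the drop-in below a scratch directory so the layout
# can be inspected without touching the real /lib/systemd tree.
# write_dropin and its destdir argument are hypothetical names.
write_dropin() {
    destdir="$1"
    dir="$destdir/lib/systemd/system/open-vm-tools.service.d"
    mkdir -p "$dir"
    # The leading "-" tells systemd to carry on even if modprobe fails.
    cat > "$dir/desktop.conf" <<'EOF'
[Service]
ExecStartPre=-/sbin/modprobe vmwgfx
EOF
}
```

For example, `write_dropin "$(mktemp -d)"` creates the snippet under a temporary directory for inspection.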

[Test Case]

 * Due to the racy nature of this issue there isn't a great 100%
   reliable test. But we can do something to at least try based on
   retries.

 * install a VMware guest with e.g. Ubuntu Desktop and install
   open-vm-tools-desktop
 * Put that guest into a reboot loop for a while, but wait until Desktop
   is fully up before that
 * Checking the resolution is tedious (manual on each reboot), but
   without the fix chances are (due to the race, only chances) that the
   log contains the error, see /var/log/vmware-vmsvc.log
   It will have:
       [ warning] [resolutionCommon] resolutionCheckForKMS: No system
       support for resolutionKMS.
   While with the fix it should always work which looks like:
       [ message] [resolutionCommon] resolutionCheckForKMS: System
       support available for resolutionKMS.
       [ message] [vmtoolsd] Plugin 'resolutionKMS' initialized.

 * For a somewhat faked test you could drop a file that prevents the
   module from loading e.g.
     $ echo "blacklist vmwgfx" | sudo tee /etc/modprobe.d/blacklist-vmware.conf
     $ echo "install vmwgfx /bin/false" | sudo tee -a /etc/modprobe.d/blacklist-vmware.conf
     $ sudo update-initramfs -u -k all
   Then reboot which will make it start without the module and trigger the
   error condition as if it would have raced (service up but the module
   not loaded).
   On a service restart you'll see the error:
     [2019-03-04T08:17:34.112Z] [ message] [resolutionCommon]
     resolutionCheckForKMS: dlopen succeeded.
     [2019-03-04T08:17:34.268Z] [ warning] [resolutionCommon]
     resolutionCheckForKMS: No system support for resolutionKMS.
   Even if you now remove the blacklist it will still fail that way.
   If you modprobe vmwgfx it will switch to
     [2019-03-04T08:18:42.983Z] [ message] [resolutionCommon]
     resolutionCheckForKMS: dlopen succeeded.
     [2019-03-04T08:18:42.986Z] [ message] [resolutionCommon]
     resolutionCheckForKMS: System support available for resolutionKMS.
   The latter should work without manual modprobe and with the fix it
   does.
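
To avoid reading the log by hand on every reboot, the check above could be scripted roughly like this. It is a minimal sketch: the function name `kms_status` is hypothetical, and it assumes each log entry sits on a single line in /var/log/vmware-vmsvc.log (the excerpts above are wrapped for readability).

```shell
#!/bin/sh
# Sketch only: report the most recent resolutionCheckForKMS verdict.
# kms_status is a hypothetical helper, not part of open-vm-tools.
kms_status() {
    log="${1:-/var/log/vmware-vmsvc.log}"
    # Keep only the two verdict messages and look at the latest one.
    last=$(grep -E 'resolutionCheckForKMS: (No system support|System support available) for resolutionKMS' "$log" 2>/dev/null | tail -n 1) || true
    case "$last" in
        *"System support available for resolutionKMS"*) echo ok ;;
        *"No system support for resolutionKMS"*)        echo failed ;;
        *)                                              echo unknown ;;
    esac
}
```

Running `kms_status` after each boot of the reboot loop prints `ok`, `failed`, or `unknown` (no verdict found or log unreadable), which makes it easier to count how often the race is hit.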

[Regression Potential]

 * I wondered first if a regression could be that for some users
   this always failed, so that after the change they now get e.g. a
   different guest resolution. But the race cannot reliably go either
   way, so those users would today be flaky between KMS working and
   not working, and the fix makes that reliable. Therefore that is
   no regression but an actual fix for those users.

 * Also, failing to load the module is not a (regressing) problem.
   In our kernel packaging that module is only part of the -oem,
   -lowlatency and all modules-extra-... packages. That said, there
   can be cases, e.g. when running the virt kernels, where the module
   isn't available. But that will not make the service fail, as the
   ExecStartPre command uses the prefix "-", which means that if it
   fails systemd still goes on to start the service itself [2].
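
The effect of the "-" prefix can be illustrated with a small shell model. This is only a simplified sketch of the behaviour documented in [2], not how systemd is implemented, and the function name `run_exec_start_pre` is hypothetical.

```shell
#!/bin/sh
# Sketch only: model how a leading "-" on an ExecStartPre= command line
# turns a failing command into a non-fatal one.
run_exec_start_pre() {
    line="$1"
    case "$line" in
        -*) ignore=1; cmd="${line#-}" ;;  # strip the "-" prefix
        *)  ignore=0; cmd="$line" ;;
    esac
    if sh -c "$cmd"; then
        return 0
    fi
    # The command failed: succeed anyway only if the prefix was given.
    [ "$ignore" -eq 1 ]
}
```

With this model, `run_exec_start_pre "-false"` returns success while `run_exec_start_pre "false"` returns failure, which mirrors why a missing vmwgfx module does not stop open-vm-tools.service from starting.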

[Other Info]

 * n/a

[1]: https://github.com/bzed/pkg-open-vm-tools/commit/db2a3642d45
[2]: https://www.freedesktop.org/software/systemd/man/systemd.service.html

---

This bug tracks an issue that was reported and fixed in Debian (and thereby already in Ubuntu 19.04) so that the fix can also be SRUed as part of our open-vm-tools backports to the latest LTS.

Related:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=915031
https://github.com/vmware/open-vm-tools/issues/214
https://github.com/bzed/pkg-open-vm-tools/pull/13

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Added remote bugs as bug tasks

Changed in open-vm-tools (Ubuntu):
status: New → Triaged
description: updated
Changed in open-vm-tools (Ubuntu):
status: Triaged → Fix Released
Changed in open-vm-tools (Ubuntu Bionic):
status: New → Triaged
Changed in open-vm-tools (Ubuntu Cosmic):
status: New → Triaged
Changed in open-vm-tools (Debian):
status: Unknown → Fix Released
description: updated
Changed in open-vm-tools:
status: Unknown → New

Christian Ehrhardt  (paelzer) wrote :

After addressing what the first SRU review turned up, this is ready again.
Changes:
- Despite being an MRE in general all associated individual bugs have full SRU templates now.
- maintainers correctly updated
- The change to the vgauth Dependencies is now safer

That said, this is uploaded to Bionic/Cosmic unapproved again and waiting for SRU Team re-review.

Steve Langasek (vorlon) wrote : Please test proposed package

Hello Christian, or anyone else affected,

Accepted open-vm-tools into cosmic-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/open-vm-tools/2:10.3.5-7~ubuntu0.18.10.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-cosmic to verification-done-cosmic. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-cosmic. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in open-vm-tools (Ubuntu Cosmic):
status: Triaged → Fix Committed
tags: added: verification-needed verification-needed-cosmic

Steve Langasek (vorlon) wrote :

Hello Christian, or anyone else affected,

Accepted open-vm-tools into bionic-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/open-vm-tools/2:10.3.5-7~ubuntu0.18.04.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-bionic to verification-done-bionic. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-bionic. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in open-vm-tools (Ubuntu Bionic):
status: Triaged → Fix Committed
tags: added: verification-needed-bionic

Christian Ehrhardt  (paelzer) wrote :

With the trick described in the test instructions:
Prior to the update even when restarting the service:
[2019-03-11T08:31:47.052Z] [ warning] [resolutionCommon] resolutionCheckForKMS: No system support for resolutionKMS.

Upgrade:
sudo apt install open-vm-tools-desktop open-vm-tools
Reading package lists... Done
Building dependency tree
Reading state information... Done
Suggested packages:
  xdg-utils
Recommended packages:
  xserver-xorg-input-vmmouse
The following packages will be upgraded:
  open-vm-tools open-vm-tools-desktop
2 upgraded, 0 newly installed, 0 to remove and 39 not upgraded.
Need to get 672 kB of archives.
After this operation, 6144 B of additional disk space will be used.
Get:1 http://archive.ubuntu.com/ubuntu bionic-proposed/universe amd64 open-vm-tools-desktop amd64 2:10.3.5-7~ubuntu0.18.04.1 [129 kB]
Get:2 http://archive.ubuntu.com/ubuntu bionic-proposed/main amd64 open-vm-tools amd64 2:10.3.5-7~ubuntu0.18.04.1 [543 kB]
Fetched 672 kB in 1s (482 kB/s)
dpkg: considering deconfiguration of open-vm-tools-desktop, which would be broken by installation of open-vm-tools ...
dpkg: yes, will deconfigure open-vm-tools-desktop (broken by open-vm-tools)
(Reading database ... 116821 files and directories currently installed.)
Preparing to unpack .../open-vm-tools_2%3a10.3.5-7~ubuntu0.18.04.1_amd64.deb ...
De-configuring open-vm-tools-desktop (2:10.3.0-0ubuntu1~18.04.3) ...
Unpacking open-vm-tools (2:10.3.5-7~ubuntu0.18.04.1) over (2:10.3.0-0ubuntu1~18.04.3) ...
Preparing to unpack .../open-vm-tools-desktop_2%3a10.3.5-7~ubuntu0.18.04.1_amd64.deb ...
Unpacking open-vm-tools-desktop (2:10.3.5-7~ubuntu0.18.04.1) over (2:10.3.0-0ubuntu1~18.04.3) ...
Processing triggers for ureadahead (0.100.0-20) ...
Setting up open-vm-tools (2:10.3.5-7~ubuntu0.18.04.1) ...
Processing triggers for libc-bin (2.27-3ubuntu1) ...
Setting up open-vm-tools-desktop (2:10.3.5-7~ubuntu0.18.04.1) ...
Processing triggers for systemd (237-3ubuntu10.13) ...
Processing triggers for man-db (2.8.3-2ubuntu0.1) ...

The service right after the upgrade has the new drop-in snippet:
$ systemctl cat open-vm-tools
# /lib/systemd/system/open-vm-tools.service
[Unit]
Description=Service for virtual machines hosted on VMware
Documentation=http://open-vm-tools.sourceforge.net/about.php
ConditionVirtualization=vmware
DefaultDependencies=no
Before=cloud-init-local.service
After=vgauth.service
After=apparmor.service
RequiresMountsFor=/tmp
After=systemd-remount-fs.service systemd-tmpfiles-setup.service

[Service]
ExecStart=/usr/bin/vmtoolsd
TimeoutStopSec=5

[Install]
WantedBy=multi-user.target

# /lib/systemd/system/open-vm-tools.service.d/desktop.conf
[Service]
ExecStartPre=-/sbin/modprobe vmwgfx
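
A quick way to confirm that the drop-in is merged into the unit is to filter the `systemctl cat` output for the ExecStartPre line shown above. The following is a minimal sketch; the function name `has_vmwgfx_preload` is hypothetical.

```shell
#!/bin/sh
# Sketch only: succeed if the unit text on stdin contains the modprobe
# pre-start step (with or without the "-" prefix).
has_vmwgfx_preload() {
    grep -Eq '^ExecStartPre=-?/sbin/modprobe vmwgfx$'
}
```

Usage: `systemctl cat open-vm-tools | has_vmwgfx_preload && echo "drop-in active"`.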

The log now shows the KMS plugin working:
[2019-03-11T08:37:35.074Z] [ message] [resolutionCommon] resolutionCheckForKMS: dlopen succeeded.
[2019-03-11T08:37:35.076Z] [ message] [resolutionCommon] resolutionCheckForKMS: System support available for resolutionKMS.

Setting verified

tags: added: verification-done-bionic
removed: verification-needed-bionic

Christian Ehrhardt  (paelzer) wrote :

Ran the same on cosmic, which other than dates in the log does not differ at all.
Setting verified as well.

tags: added: verification-done verification-done-cosmic
removed: verification-needed verification-needed-cosmic

Fifi Cek (fifi-cek) wrote :

Thank you for the fix. I can confirm it is effective on my Ubuntu 18.04 setup, using the update pulled from the proposed repository. The bug had been reproducing more and more often over time; in the end almost no boot came up with the right screen resolution. The fix resolved it immediately. If more details are needed, feel free to ask.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package open-vm-tools - 2:10.3.5-7~ubuntu0.18.10.1

---------------
open-vm-tools (2:10.3.5-7~ubuntu0.18.10.1) cosmic; urgency=medium

  * Backport recent open-vm-tools (LP: #1813944)
    - also addresses handling of quiesced snapshot failures (LP: #1814832)
    - also addresses issues with resolutionKMS plugins sometimes fails to
      load at boot (LP: #1818473)

open-vm-tools (2:10.3.5-7) unstable; urgency=medium

  [ Christian Ehrhardt ]
  * [71b468f] make vgauth service execution more reliable.
    Since d3d47039 "Start vgauth before vmtoolsd" there is a potential race
    of starting vgauth so early that it might have issues. This was
    discussed back in the day in [1] to [2], but confirmed to be ok by
    VMWare.
    We were all somewhat convinced by this, but a bad feeling remained not
    only with me but also with Bernd [4].
    A recent SRU review denial made me rethink all of it and I think we can
    make it safer without thwarting the purpose of the original change.
    Note: Disambiguation of service names used below:
    vgauth - open-vm-tools.vgauth.service
    vmtoolsd - open-vm-tools.service
    fs - systemd-remount-fs.service
    tmp - systemd-tmpfiles-setup.service
    cloud-init - cloud-init-local.service
    Currently we have these dependency requirements:
    - vgauth should be before vmtoolsd
    - vmtoolsd should be before cloud init
    - cloud init has to be really early in general
      - therefore this is using DefaultDependencies=No
    That led to this graph:
     fs / tmp -> vmtoolsd -> cloud-init
    And d3d47039 added it to be like:
     fs / tmp -> vmtoolsd -> cloud-init
                   ^
     vgauth -------|
    But there is no need to have vgauth without any pre-dependencies at all.
    It is only needed to be "before" vmtoolsd, therefore we can make it:
     fs / tmp -> vgauth -> vmtoolsd -> cloud-init
    That will make execution of vgauth much less error-prone (even though I
    have no hard issue to report) while at the same time holding up all
    known required ordering constraints.
    [1]: https://bugs.launchpad.net/ubuntu/+source/open-vm-tools/+bug/1804287/comments/3
    [2]: https://bugs.launchpad.net/ubuntu/+source/open-vm-tools/+bug/1804287/comments/12
    [3]: https://bugs.launchpad.net/ubuntu/+source/open-vm-tools/+bug/1804287/comments/25
    [4]: https://github.com/bzed/pkg-open-vm-tools/pull/15#issuecomment-447237910
    Signed-off-by: Christian Ehrhardt <email address hidden>

open-vm-tools (2:10.3.5-6) unstable; urgency=medium

  * [43ec618] Correct and/or improve handling of certain quiesced
    snapshot failures.
    Thanks to Oliver Kurth (Closes: #921470)

open-vm-tools (2:10.3.5-5) unstable; urgency=medium

  * [54cce3e] Start vmtoolsd after apparmor.service.
    Github issue #17

open-vm-tools (2:10.3.5-4) unstable; urgency=medium

  [ Alf Gaida ]
  * [e13792d] udevadm trigger should not fail (Closes: #917642)

open-vm-tools (2:10.3.5-3) unstable; urgency=medium

  [ Christian Ehrhardt ]
  * [d3d4703] Start vgauth before vmtoolsd.
    VGAuthService needs to be ready when vmtoolsd runs. Certain cases - e.g.
    Site Reco...

Changed in open-vm-tools (Ubuntu Cosmic):
status: Fix Committed → Fix Released

Brian Murray (brian-murray) wrote : Update Released

The verification of the Stable Release Update for open-vm-tools has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package open-vm-tools - 2:10.3.5-7~ubuntu0.18.04.1

---------------
open-vm-tools (2:10.3.5-7~ubuntu0.18.04.1) bionic; urgency=medium

  * Backport recent open-vm-tools (LP: #1813944)
    - also addresses handling of quiesced snapshot failures (LP: #1814832)
    - also addresses issues with resolutionKMS plugins sometimes fails to
      load at boot (LP: #1818473)

Changed in open-vm-tools (Ubuntu Bionic):
status: Fix Committed → Fix Released