[SRU] upstart: ceph-all service starts before networks up

Bug #1636322 reported by Billy Olsen
This bug affects 1 person
Affects               Status        Importance  Assigned to  Milestone
Ubuntu Cloud Archive  Fix Released  High        Unassigned
  Icehouse            Invalid       High        Unassigned
  Kilo                Fix Released  High        Unassigned
  Liberty             Fix Released  High        Unassigned
  Mitaka              Fix Released  High        Unassigned
ceph (Ubuntu)         Invalid       High        Billy Olsen
  Trusty              Fix Released  High        James Page
  Xenial              Fix Released  High        Unassigned
  Yakkety             Invalid       High        Unassigned
  Zesty               Invalid       High        Billy Olsen

Bug Description

As reported in upstream bug http://tracker.ceph.com/issues/17689, the ceph-all service starts at runlevels [2345], which introduces a race condition: a ceph service (e.g. ceph-mon) can start before the network it binds to is up on the server. The service then fails on start because it cannot bind to the specific network it is configured to listen on.

A workaround is to add a post-up directive, which restarts the necessary ceph service, to the stanza that configures the network device in the /etc/network/interfaces file.
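
As a rough illustration of that workaround, a stanza along the following lines could be used. The interface name, the addresses, and the choice of ceph-mon-all as the job to restart are all assumptions made for this sketch; substitute the device and the ceph job that actually apply on the affected host.

```
# /etc/network/interfaces (fragment, illustrative only)
# eth1 and the addresses are placeholders; ceph-mon-all is an assumed job name.
# Once the interface is up, the ceph job that binds to it is restarted.
# "|| true" keeps a failed restart from causing ifup to report an error.
auto eth1
iface eth1 inet static
    address 192.0.2.10
    netmask 255.255.255.0
    post-up restart ceph-mon-all || true
```

This only papers over the race for the interface carrying the directive; it does not change when ceph-all itself is started.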

[Impact]

 * Ceph service fails to start on reboot of machine/container when networking takes some time to come up.

 * The provided patch to the upstart service configuration adds the static-network-up event as a dependency in the job's "start on" directive. The static-network-up event is emitted after all the network stanzas in the relevant config files have been processed.
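
For readers unfamiliar with upstart event expressions, the change described above might look roughly like the fragment below. This is a sketch under assumptions, not the text of the attached patch: the file path and the exact form of the condition are guesses based on the description, and the shipped d/p/start-ceph-all-after-network.patch remains the authoritative version.

```
# /etc/init/ceph-all.conf (sketch, not the shipped patch)
description "Ceph"

# Before: start as soon as a multi-user runlevel is entered.
#   start on runlevel [2345]
# After: additionally wait until static-network-up has been emitted.
start on (runlevel [2345] and static-network-up)
stop on runlevel [!2345]
```

With "and", upstart waits until both events have been seen before starting the job, which is what delays ceph-all until the network stanzas have been processed.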

[Test Case]

* Configure multiple network interfaces and have the ceph service bind to one of the last configured network devices to introduce a delayed start of the network interface.
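
Before rebooting a test machine, it can help to confirm that the installed job file actually carries the new dependency. The helper below is a minimal sketch for that check; the function name job_waits_for_network is hypothetical, and /etc/init/ceph-all.conf in the usage note is an assumed location for the job file.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) if the "start on" line of the given
# upstart job file mentions static-network-up, fail (exit 1) otherwise.
# Simplification: only the line containing "start on" is inspected, so a
# condition wrapped across several lines would need a smarter check.
job_waits_for_network() {
    grep -E '^[[:space:]]*start on' "$1" | grep -q 'static-network-up'
}
```

For example, running `job_waits_for_network /etc/init/ceph-all.conf && echo "waits for network"` on a host with the proposed package installed should print the message if the patched job file is in place.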

[Regression Potential]

* Upstream previously had a directive to start the service after any net-device-up event for an interface other than the loopback interface. This caused some "weirdness" when multiple network interfaces were configured, likely because the events it keyed on were the local filesystems being available and a single network interface being available. This change instead starts the service only after all the network interface stanzas in the /etc/network/interfaces configuration files have been processed.

* Additionally, this will cause some ceph services to start later than they previously would have, since the change adds start dependencies. However, the result should be that the interfaces always have a chance to come up before the attempt to start the ceph service.

Changed in ceph (Ubuntu):
assignee: nobody → Billy Olsen (billy-olsen)
tags: added: sts
Revision history for this message
Ubuntu Foundations Team Bug Bot (crichton) wrote :

The attachment "Patch for xenial" seems to be a patch. If it isn't, please remove the "patch" flag from the attachment, remove the "patch" tag, and if you are a member of the ~ubuntu-reviewers, unsubscribe the team.

[This is an automated message performed by a Launchpad user owned by ~brian-murray, for any issues please contact him.]

tags: added: patch
James Page (james-page)
Changed in cloud-archive:
importance: Undecided → High
Changed in ceph (Ubuntu):
importance: Undecided → High
status: New → Triaged
Changed in cloud-archive:
status: New → Triaged
tags: added: sts-sru
Louis Bouchard (louis)
Changed in ceph (Ubuntu Yakkety):
status: New → Invalid
Changed in ceph (Ubuntu Xenial):
status: New → Invalid
Changed in ceph (Ubuntu Zesty):
status: Triaged → Invalid
Revision history for this message
Billy Olsen (billy-olsen) wrote :

Based on further discussion with louis-bouchard, it appears that upstart is still viable, so I am attaching debdiffs for zesty and yakkety for inclusion.

tags: added: ubuntu-sponsors
tags: added: sts-sponsor
Louis Bouchard (louis)
tags: removed: sts-sponsor
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package ceph - 10.2.5-0ubuntu3

---------------
ceph (10.2.5-0ubuntu3) zesty; urgency=medium

  * Start ceph-all after static-network-up (LP: #1636322).
    - d/p/start-ceph-all-after-network.patch: add dependency on
      the static-network-up event before starting ceph-all.

 -- Billy Olsen <email address hidden> Mon, 30 Jan 2017 09:50:25 -0700

Changed in ceph (Ubuntu Zesty):
status: Invalid → Fix Released
Dave Chiluk (chiluk)
tags: added: sts-sponsor
tags: added: sts-sru-needed
removed: sts-sru ubuntu-sponsors
Changed in ceph (Ubuntu Xenial):
status: Invalid → New
Changed in ceph (Ubuntu Yakkety):
status: Invalid → New
summary: - upstart: ceph-all service starts before networks up
+ [SRU] upstart: ceph-all service starts before networks up
Mathew Hodson (mhodson)
Changed in ceph (Ubuntu Trusty):
importance: Undecided → High
Changed in ceph (Ubuntu Xenial):
importance: Undecided → High
Changed in ceph (Ubuntu Yakkety):
importance: Undecided → High
tags: removed: sts-sponsor
Revision history for this message
James Page (james-page) wrote :

10.2.5-0ubuntu3 was not committed to the git repo in Debian, so it got lost in my upload of 10.2.6-0ubuntu1; that version does not include the fix. As this only affects trusty installs, I'm going to mark the tasks for yakkety and zesty as Invalid: the ceph packages in those archives are never backported to trusty (just the one in xenial).

As a side note, I'm sorting out the git repo so that we have a mirror on Launchpad and no excuse for dropping changes by mistake.

Changed in ceph (Ubuntu Zesty):
status: Fix Released → Invalid
Changed in ceph (Ubuntu Yakkety):
status: New → Invalid
Revision history for this message
James Page (james-page) wrote :
Changed in ceph (Ubuntu Trusty):
assignee: nobody → James Page (james-page)
status: New → In Progress
Revision history for this message
James Page (james-page) wrote :

Uploads for trusty-kilo and trusty-liberty have been made to staging PPAs; I will poke those through to -proposed for testing once built.

Revision history for this message
James Page (james-page) wrote : Please test proposed package

Hello Billy, or anyone else affected,

Accepted ceph into kilo-proposed. The package will build now and be available in the Ubuntu Cloud Archive in a few hours, and then in the -proposed repository.

Please help us by testing this new package. To enable the -proposed repository:

  sudo add-apt-repository cloud-archive:kilo-proposed
  sudo apt-get update

Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-kilo-needed to verification-kilo-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-kilo-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

tags: added: verification-kilo-needed
Revision history for this message
James Page (james-page) wrote :

Hello Billy, or anyone else affected,

Accepted ceph into liberty-proposed. The package will build now and be available in the Ubuntu Cloud Archive in a few hours, and then in the -proposed repository.

Please help us by testing this new package. To enable the -proposed repository:

  sudo add-apt-repository cloud-archive:liberty-proposed
  sudo apt-get update

Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-liberty-needed to verification-liberty-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-liberty-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

tags: added: verification-liberty-needed
Revision history for this message
James Page (james-page) wrote :

Sync from bileto PPA to trusty UNAPPROVED queue completed; pending SRU team review.

Revision history for this message
Łukasz Zemczak (sil2100) wrote :

Hello Billy, or anyone else affected,

Accepted ceph into xenial-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/ceph/10.2.7-0ubuntu0.16.04.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in ceph (Ubuntu Xenial):
status: New → Fix Committed
tags: added: verification-needed
Revision history for this message
Billy Olsen (billy-olsen) wrote :

Completed the verification for UCA Kilo & Liberty. Tested by installing the new packages and performing multiple reboots with multiple NICs and static networking.

tags: added: verification-kilo-done verification-liberty-done
removed: verification-kilo-needed verification-liberty-needed
Revision history for this message
Billy Olsen (billy-olsen) wrote :

Completed verification for Xenial. Ran the Xenial packages in a trusty-mitaka deployment (which runs equivalent packages from xenial for OpenStack and Ceph), and also tested by switching to upstart on a Xenial machine.

tags: added: verification-done
removed: verification-needed
Revision history for this message
James Page (james-page) wrote :

Hello Billy, or anyone else affected,

Accepted ceph into mitaka-proposed. The package will build now and be available in the Ubuntu Cloud Archive in a few hours, and then in the -proposed repository.

Please help us by testing this new package. To enable the -proposed repository:

  sudo add-apt-repository cloud-archive:mitaka-proposed
  sudo apt-get update

Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-mitaka-needed to verification-mitaka-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-mitaka-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

tags: added: verification-mitaka-needed
Revision history for this message
James Page (james-page) wrote : Update Released

The verification of the Stable Release Update for ceph has completed successfully and the package has now been released to -updates. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Revision history for this message
James Page (james-page) wrote :

This bug was fixed in the package ceph - 0.94.10-0ubuntu0.15.04.1~cloud1
---------------

 ceph (0.94.10-0ubuntu0.15.04.1~cloud1) trusty; urgency=medium
 .
   * Start ceph-all after static-network-up (LP: #1636322).
     - d/p/start-ceph-all-after-network.patch: add dependency on
       the static-network-up event before starting ceph-all.

Revision history for this message
James Page (james-page) wrote :

The verification of the Stable Release Update for ceph has completed successfully and the package has now been released to -updates. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Revision history for this message
James Page (james-page) wrote :

This bug was fixed in the package ceph - 0.94.10-0ubuntu0.15.10.1~cloud2
---------------

 ceph (0.94.10-0ubuntu0.15.10.1~cloud2) trusty; urgency=medium
 .
   * Start ceph-all after static-network-up (LP: #1636322).
     - d/p/start-ceph-all-after-network.patch: add dependency on
       the static-network-up event before starting ceph-all.

Revision history for this message
Ken Dreyer (Red Hat) (kdreyer-redhat) wrote :

James and Billy, was a patch sent upstream for this?

Revision history for this message
Billy Olsen (billy-olsen) wrote :
Revision history for this message
Ken Dreyer (Red Hat) (kdreyer-redhat) wrote :

Thanks Billy, I'll update upstream's tracker with that PR link.

Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package ceph - 10.2.7-0ubuntu0.16.04.1

---------------
ceph (10.2.7-0ubuntu0.16.04.1) xenial; urgency=medium

  [ Billy Olsen ]
  * Start ceph-all after static-network-up (LP: #1636322):
    - d/p/start-ceph-all-after-network.patch: add dependency on
      the static-network-up event before starting ceph-all.

  [ James Page ]
  * New upstream point release (LP: #1684527):
    - d/p/disable-openssl-linking.patch: Dropped, no longer required.
    - d/control: Add BD on libssl-dev to support optional runtime
      loading of openssl in the radosgw.

 -- James Page <email address hidden> Fri, 21 Apr 2017 09:21:10 +0100

Changed in ceph (Ubuntu Xenial):
status: Fix Committed → Fix Released
Revision history for this message
James Page (james-page) wrote :

The verification of the Stable Release Update for ceph has completed successfully and the package has now been released to -updates. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Revision history for this message
James Page (james-page) wrote :

This bug was fixed in the package ceph - 10.2.7-0ubuntu0.16.04.1~cloud0
---------------

 ceph (10.2.7-0ubuntu0.16.04.1~cloud0) trusty-mitaka; urgency=medium
 .
   * New upstream release for the Ubuntu Cloud Archive.
 .
 ceph (10.2.7-0ubuntu0.16.04.1) xenial; urgency=medium
 .
   [ Billy Olsen ]
   * Start ceph-all after static-network-up (LP: #1636322):
     - d/p/start-ceph-all-after-network.patch: add dependency on
       the static-network-up event before starting ceph-all.
 .
   [ James Page ]
   * New upstream point release (LP: #1684527):
     - d/p/disable-openssl-linking.patch: Dropped, no longer required.
     - d/control: Add BD on libssl-dev to support optional runtime
       loading of openssl in the radosgw.

Revision history for this message
Łukasz Zemczak (sil2100) wrote : Please test proposed package

Hello Billy, or anyone else affected,

Accepted ceph into trusty-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/ceph/0.80.11-0ubuntu1.14.04.2 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in ceph (Ubuntu Trusty):
status: In Progress → Fix Committed
tags: removed: verification-done
tags: added: verification-needed
Revision history for this message
Billy Olsen (billy-olsen) wrote :

Completed verifying this fix for trusty-proposed today. Tests include the same set that was run for the other releases: multiple reboots, static networking, and running inside containers.

tags: added: verification-done
removed: verification-needed
tags: added: verification-mitaka-done
removed: verification-mitaka-needed
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package ceph - 0.80.11-0ubuntu1.14.04.2

---------------
ceph (0.80.11-0ubuntu1.14.04.2) trusty; urgency=medium

  * Start ceph-all after static-network-up (LP: #1636322).
    - d/p/start-ceph-all-after-network.patch: add dependency on
      the static-network-up event before starting ceph-all.

 -- Billy Olsen <email address hidden> Thu, 20 Apr 2017 09:53:08 +0100

Changed in ceph (Ubuntu Trusty):
status: Fix Committed → Fix Released
Changed in cloud-archive:
status: Triaged → Fix Released
tags: added: sts-sru-done
removed: sts-sru-needed