cloud-run-user-script.conf upstart script needs to run after all other cloud-init processes

Bug #613309 reported by Pete Crossley
This bug affects 1 person
Affects               Status         Importance   Assigned to   Milestone
cloud-init (Ubuntu)   Fix Released   Medium       Unassigned
Lucid                 Fix Released   High         Unassigned

Bug Description

Binary package hint: cloud-init

All lucid/maverick source packages,

The following upstart scripts need to wait for the 'cloud-config' status in addition to what they currently wait for, otherwise userdata might not be loaded yet.

cloud-apt-update-upgrade.conf
cloud-config-misc.conf
cloud-config-mounts.conf
cloud-config-ssh.conf
cloud-disable-ec2-metadata.conf
cloud-ec2-ebs-mounts.conf
cloud-raid.conf
cloud-runurl.conf

~~
most of these need to just add 'and cloud-config' but one or two need to just 'start on cloud-config'

~~~
cloud-run-user-script.conf

needs 'start on (stopped rc RUNLEVEL=[2345] and stopped cloud-config-misc)'

We need this package to work in EC2 as well as UEC; otherwise it makes provisioning a new instance much harder and defeats the purpose of this package. Since Lucid is an LTS release, I would like to see an SRU for this issue as well, since it prevents out-of-the-box/repo use of this feature.

==== Begin SRU Justification ====
Impact: The impact of this bug is that a popular portion of "cloud-config" syntax [1] is not easily used in the 10.04 images. If the user specifies commands to run, they cannot rely on other portions of cloud-init having finished before those commands run. The big example is that if the user installs a package via 'packages', they cannot rely on it being present in their 'runcmd'. The ordering is simply not guaranteed.
Solution: The solution is to make the upstart script that executes the user's commands depend on 'stopped' of each of the other upstart jobs. In this manner, it will not execute until the other jobs are finished.
Patch: Available in branch attached to this bug [2]
Regression Potential: There should be low potential for regression and little realistic change of user expectations. Previously, the order was non-deterministic; this change guarantees that the user's commands run after packages are installed.

TEST Case:
 * launch ec2 instance (ubuntu-lucid-10.04-i386-server-20100427.1, such as ami-fd4aa494) with user data having 'packages' and 'runcmd' section. Such as:
| #cloud-config
| packages: [ bzr, ubuntu-dev-tools, ccache, vim-nox, git-core, lftp ]
| runcmd:
| - [ sudo, -Hu, ubuntu, sh, -c, 'bzr branch lp:ubuntu/lucid-proposed/cloud-init 2>&1 | tee cmd.log' ]
 * without a fix for this bug, the ordering is indeterminable, but most likely, the 'bzr branch' command will run before bzr is installed. With the fix, it is guaranteed to run afterwards.

--
[1] http://bazaar.launchpad.net/~ubuntu-branches/ubuntu/lucid/cloud-init/lucid-proposed/annotate/head%3A/doc/examples/cloud-config.txt
[2] http://bazaar.launchpad.net/~smoser/ubuntu/lucid/cloud-init/bug613309/revision/17
==== End SRU Justification =====

Pete Crossley (peterc)
description: updated
Martin Pitt (pitti) wrote :

I subscribed the server team. Once this is reviewed and fixed in maverick, we can discuss an SRU. Please subscribe ubuntu-sru then. Thank you!

tags: added: patch
Scott Moser (smoser) wrote :

Pete,
  What problem were you seeing exactly?
  The upstart jobs cloud-* do not explicitly list 'cloud-config' as a 'start on' requirement. That is by design. cloud-init starts on:
     (mounted MOUNTPOINT=/ and net-device-up IFACE=eth0)
  That set of events actually blocks any other upstart jobs from running until it is finished. Because of this, all the other cloud-config-* jobs are guaranteed to run after cloud-init finishes. Your suggestion to modify that to 'filesystem' from 'mounted MOUNTPOINT=/' will break that.

   Have you actually seen this not to be true?

   I do agree that the parallelization of the cloud-cfg jobs is annoying. This has been changed in maverick. Now the jobs run serially, and the order can actually be controlled by cloud-config input.

   For the case that you were interested in, with the debconf, the setting of debconf values will run prior to the running of 'apt-get' so that should be fine.

  The one thing I would consider most annoying is that user scripts (upstart/cloud-run-user-script.conf) are not guaranteed to run after any of the cloud-config jobs are run. So, you can't rely on the following cloud-config:
| #cloud-config
| packages:
| - axel
| runcmd:
| - [ axel, http://www.kernel.org/pub/linux/kernel/v2.6/linux-2.6.35.tar.bz2 ]

I find that to be annoying, and would agree to fix that.
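Until such a fix is available, a 'runcmd' script could defensively wait for the binary it needs. The following is only a workaround sketch, not something proposed in this thread: the function name wait_for_cmd and the timeout values are hypothetical, and a binary appearing on PATH does not prove that its package has finished configuring.

```shell
#!/bin/sh
# Hypothetical workaround helper (not part of cloud-init): poll until a
# command is found on PATH or a timeout in seconds expires.
# Returns 0 if the command was found, 1 on timeout.
wait_for_cmd() {
    cmd=$1
    timeout=${2:-300}
    waited=0
    until command -v "$cmd" >/dev/null 2>&1; do
        # Give up once we have waited the full timeout.
        [ "$waited" -ge "$timeout" ] && return 1
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

A user script might then run something like `wait_for_cmd axel 600 && axel <url>`, so the download only starts once the package's binary exists.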

Pete Crossley (peterc) wrote :

Sorry about that, Scott. Yes, I had an old source for the cloud-init upstart script; you're correct about the mount point.

The issue I was seeing is that the system was trying to query user data before it had been loaded from the metadata service. So by waiting for cloud-init to load the metadata/userdata, everything else worked. I will try to pull the maverick sources and see what I get just from the repo without any patching (but I will run them on a lucid UEC image), once I figure out my ssh issue in my cloud :)

More info to come! Thanks for the info.

Scott Moser (smoser) wrote : Re: [Bug 613309] Re: upstart scripts do not wait for 'cloud-config' status

I must not have been clear. The lucid cloud-init package should not
run the cloud-cfg jobs until after cloud-init is finished. You should
not need the maverick sources.

Pete Crossley (peterc) wrote : Re: upstart scripts do not wait for 'cloud-config' status

Ok Scott, yup I was seeing two issues but after I fixed my metadata issue in UEC you are correct. The main problem I am having is that the user scripts run while apt-get is installing packages that are used in my user run script.

So I will update the bug to reflect the following:

** Please have upstart/cloud-run-user-script.conf wait until the other cloud-init processes have completed. This will allow users to configure/execute programs that were installed during the cloud-init process.

Thanks for taking the time to understand my issue, and for letting me work through my misconfiguration.

summary: - upstart scripts do not wait for 'cloud-config' status
+ cloud-run-user-script.conf upstart script needs to run after all other
+ cloud-init processes
Pete Crossley (peterc) wrote :

Looking at 'cloud-config-misc.conf' script this only waits for the filesystem. What prevents this from running prior to the cloud-init upstart script? 'cloud-config-misc.conf' also is what the 'cloud-run-user-script.conf' waits for to stop to allow the user scripts to run after it. So I think there is still an issue with ordering and making sure these upstart scripts all complete prior to the execution of 'cloud-run-user-script.conf'

Scott Moser (smoser) wrote :

> Looking at 'cloud-config-misc.conf' script this only waits for the
> filesystem. What prevents this from running prior to the cloud-init
> upstart script?

filesystem won't be fired until after cloud-init has run. I'm not sure on
the particulars; I'm guessing all 'mounted MOUNTPOINT=' events for entries
in /etc/fstab are consumed before the 'filesystem' event is fired.

That said, I've never seen that fail, and this method was suggested to me
by the upstart author.
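To compare what each job actually waits on, something like the following could print the 'start on' stanza of every cloud-* job file, including backslash-continued lines. This is a sketch only: the function name show_start_on is hypothetical, and the default directory /etc/init is an assumption about where the upstart job files are installed.

```shell
#!/bin/sh
# Hypothetical helper (not part of cloud-init or upstart): print the
# 'start on' stanza of each cloud-*.conf job in a directory.
show_start_on() {
    dir=${1:-/etc/init}    # assumed location of upstart job files
    for conf in "$dir"/cloud-*.conf; do
        [ -f "$conf" ] || continue
        printf '%s:\n' "${conf##*/}"
        # Print from the 'start on' line through any lines that are
        # continued with a trailing backslash.
        awk '
            /^start on/ { printing = 1 }
            printing    { print "  " $0 }
            printing && !/\\$/ { printing = 0 }
        ' "$conf"
    done
}
```

Calling `show_start_on` with no argument inspects the assumed default directory; passing a path lets it run against an unpacked source tree instead.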

Scott Moser (smoser) wrote :

For maverick, this is "fix-released". In newer versions of cloud-init there are only three upstart jobs:
 cloud-init
 cloud-config
 cloud-run-user-script

cloud-run-user-script runs on "stopped rc RUNLEVEL=[2345] and stopped cloud-config"

Changed in cloud-init (Ubuntu):
importance: Undecided → Medium
status: New → Fix Released
Scott Moser (smoser) wrote :

Pete,
  A build of the linked branch above will be available shortly in my ppa for lucid. https://launchpad.net/~smoser/+archive/ppa/+packages .
  I'd appreciate some testing of that. My initial test shows that it functions correctly.
  To test:
- launch an instance with some user data and runcmd
- ssh to instance:
  sudo apt-add-repository ppa:smoser/ppa
  sudo apt-get install cloud-init
  sudo rm -Rf /var/lib/cloud
  sudo reboot
- ssh to instance
  - you should see changed ssh keys
  - verify that your user scripts run

I don't have a suggestion for a perfect way to prove that the dependencies are correct, but the fact that it runs generally indicates that it is good. I.e., I'm fairly sure that, given the new 'start on' for cloud-run-user-script, it is not going to run *before* any of the other scripts.
| start on (stopped rc RUNLEVEL=[2345] \
| and stopped cloud-apt-update-upgrade \
| and stopped cloud-config-misc \
| and stopped cloud-config-mounts \
| and stopped cloud-config-puppet \
| and stopped cloud-config-ssh \
| and stopped cloud-disable-ec2-metadata )
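The shape of that condition can be sketched with a small shell function that builds a 'start on' stanza from a list of job names. The name build_start_on is hypothetical and not part of cloud-init; the job names passed in are the ones listed above.

```shell
#!/bin/sh
# Hypothetical helper (not part of cloud-init): print an upstart
# 'start on' stanza that waits for the rc job to stop in runlevels 2-5
# and for each named job to stop.
build_start_on() {
    printf 'start on (stopped rc RUNLEVEL=[2345]'
    for job in "$@"; do
        # End the previous line with a backslash, then add one
        # 'and stopped <job>' clause.
        printf ' \\\n    and stopped %s' "$job"
    done
    printf ' )\n'
}

build_start_on cloud-apt-update-upgrade cloud-config-misc \
    cloud-config-mounts cloud-config-puppet cloud-config-ssh \
    cloud-disable-ec2-metadata
```

Because every clause is joined with 'and', the user script job cannot start until each listed job has emitted its 'stopped' event.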

Pete Crossley (peterc) wrote :

Sure will do, I am working on this stuff right now so once the build is done I will deploy.

Pete Crossley (peterc) wrote :

Looks good, Scott. Everything seems to start and execute in the correct order.

* I purged all packages and installed via user-data and executed custom user run script.

Which included the following:

 - Custom apt repo
 - Custom key for repo
 - packages:
    - curl
    - pwgen
    - pastebinit
    - python-software-properties
    - chef
    - rubygems

 - Custom script to configure chef
    - validation.pem
 role list and node id (instance-id) from meta-data service

Scott Moser (smoser)
description: updated
Scott Moser (smoser)
Changed in cloud-init (Ubuntu Lucid):
importance: Undecided → High
milestone: none → lucid-updates
status: New → In Progress
Martin Pitt (pitti) wrote :

Scott,

in lucid-proposed there are now two cloud-init packages. Please merge them into one upload (0.5.10-0ubuntu1.2); if that's impractical, then please build ubuntu1.3 with -v0.5.10-0ubuntu1.1, so that the source.changes contains both versions. Thanks!

Scott Moser (smoser) wrote :

Martin,
  Sorry for the confusion.
  I've just uploaded 0.5.10-0ubuntu1.2 that has fixes for this bug and bug 613309.

Scott Moser (smoser) wrote :

above should have said bug 613309 (this one) and bug 582667.

Martin Pitt (pitti) wrote : Please test proposed package

Accepted cloud-init into lucid-proposed, the package will build now and be available in a few hours. Please test and give feedback here. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you in advance!

Changed in cloud-init (Ubuntu Lucid):
status: In Progress → Fix Committed
tags: added: verification-needed
Scott Moser (smoser) wrote :

I've verified the update is functioning as it should. Here is how:

- booted ami-1eea0077 (us-east-1 ubuntu-lucid-daily-i386-server-20100817)
  with the following user data:
#cloud-config
debconf_selections: |
  debconf debconf/priority select low
  debconf debconf/frontend select readline
apt_update: true
apt_upgrade: true

sm_misc:
 - &user_setup |
   set -x; exec > ~/user_setup.log 2>&1
   echo "starting at $(date -R)"
   echo "ps grep apt"
   ps axw | grep apt
   echo "ps grep cloud-init-cfg"
   ps axw | grep cloud-init-cfg

runcmd:
 - [ sudo, -Hu, ubuntu, sh, -c, 'read up sleep < /proc/uptime; echo $(date): runcmd up at $up | tee -a ~/runcmd.log' ]
 - [ sudo, -Hu, ubuntu, sh, -c, *user_setup ]
### end user data ###

- on initial boot, the instance has old cloud-init from lucid,
  so old debconf values will be in place.
$ sudo apt-get install debconf-utils
$ sudo debconf-get-selections | grep debconf/
debconf debconf/frontend select Dialog
debconf debconf/priority select high

- save off old logs
  $ mkdir old-logs && mv *.log old-logs

- add proposed and get cloud-init
  $ echo deb http://us.archive.ubuntu.com/ubuntu/ \
     $(lsb_release -sc)-proposed restricted main multiverse universe |
     sudo tee -a /etc/apt/sources.list.d/proposed.list
  $ sudo apt-get install cloud-init
  $ dpkg-query --show cloud-init
  cloud-init 0.5.10-0ubuntu1.2

- remove /var/lib/cloud so cloud-init will run again on next boot
  $ sudo rm -Rf /var/lib/cloud
  $ sudo reboot

- come back to instance, this time
  * the debconf selections will have taken effect (bug 582667 is fixed)
    $ sudo debconf-get-selections | grep debconf/
    debconf debconf/frontend select readline
    debconf debconf/priority select low

  * the logs will not show apt or cloud-init-cfg processes ( bug 613309 is fixed) because
    the scripts will have run after those have finished (even though 'apt-get
    upgrade' will have pulled all the -proposed changes)

    the second time, runcmd.log showed it was running at 92.46 seconds of
    uptime, compared to 12.63 the first time. The second time, neither 'ps'
    finds cloud-init-cfg or apt processes.
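The manual check of the log could be automated along the following lines. This is a sketch under stated assumptions: the function name no_leftover_processes is hypothetical, rows from 'ps axw' are assumed to start with a numeric PID, and the grep processes themselves as well as 'set -x' trace lines are ignored.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) only if the log written by the
# user_setup script shows no apt or cloud-init-cfg process still running.
# Returns 2 if the log cannot be read.
no_leftover_processes() {
    log=$1
    [ -r "$log" ] || return 2
    # Keep only lines that look like 'ps axw' rows (leading PID), drop
    # the grep processes, then look for apt or cloud-init-cfg.
    ! grep -E '^[[:space:]]*[0-9]+[[:space:]]' "$log" \
        | grep -v 'grep' \
        | grep -Eq 'apt|cloud-init-cfg'
}
```

For the verification above, this would be run as `no_leftover_processes ~/user_setup.log` after the reboot.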

tags: added: ec2-images uec-images verification-done
removed: ec2 uec userdata verification-needed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package cloud-init - 0.5.10-0ubuntu1.2

---------------
cloud-init (0.5.10-0ubuntu1.2) lucid-proposed; urgency=low

  * add support for setting debconf selections (LP: #582667)
  * force runcmd scripts to run after cloud-init/cloud-config
    scripts are finished (LP: #613309)
 -- Scott Moser <email address hidden> Tue, 17 Aug 2010 08:45:39 -0400

Changed in cloud-init (Ubuntu Lucid):
status: Fix Committed → Fix Released