Merge lp:~jordane/cloud-init/sshd-systemd-fix into lp:~cloud-init-dev/cloud-init/trunk

Proposed by Jordan Evans
Status: Merged
Merged at revision: 995
Proposed branch: lp:~jordane/cloud-init/sshd-systemd-fix
Merge into: lp:~cloud-init-dev/cloud-init/trunk
Diff against target: 14 lines (+2/-1)
1 file modified
systemd/cloud-init.service (+2/-1)
To merge this branch: bzr merge lp:~jordane/cloud-init/sshd-systemd-fix
Reviewer              Review Type  Date Requested  Status
cloud-init Commiters                               Pending
Review via email: mp+226880@code.launchpad.net

This proposal supersedes a proposal from 2014-06-24.

Description of the change

Fixes bug 1333920, a race condition between sshd and cloud-init, by making cloud-init start before sshd.

Scott Moser (smoser) wrote : Posted in a previous version of this proposal

Hi Jordan,
Is this necessary?
In Ubuntu, my experience is that sshd actually does the right thing: it checks keys on each connection. I.e., you can start sshd whenever, cloud-init will write the keys later, and ssh will deny connections until those keys are found.

That is a bit less than ideal, because ideally the port wouldn't even be open, so the "poller" could just poll until the port is open and then expect everything to be functional.

So my question is: does this result in permanent breakage, or breakage only until the keys are generated?

Garrett Holmstrom (gholms) wrote : Posted in a previous version of this proposal

IIRC, since Ubuntu uses debconf to manage host keys, sshd is free to start without them and wait for something like cloud-init to generate them at its leisure. On Fedora, though, sshd-keygen.service generates them. Since that always runs immediately prior to sshd.service, it is much more difficult to make sshd start without host keys. If cloud-init.service does not finish before sshd-keygen.service starts, then sshd will need a restart to pick up whatever changes cloud-init happens to make to it.

Strictly speaking, this patch is incorrect -- cloud-init.service should precede sshd.service *and* sshd-keygen.service.

Jordan Evans (jordane) wrote :

Scott,

The issue isn't that you can't add keys later, but that sshd can start before cloud-init drops keys in. If you have an automated script that watches for sshd to come up and then tries to ssh in *before* cloud-init injects keys, the login will fail, whereas if it waits until cloud-init finishes, the login will succeed. This leaves two options:

1) Make the automated scripts that are trying to ssh wait an arbitrary amount of time, retry, or some mix of the two.
2) Make sshd wait until cloud-init is done
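
For illustration only, option 1 might look something like the retry helper below. This is a minimal sketch, not anything from the branch: the `retry` function, its parameters, and the `attempt` callable (a stand-in for whatever actually performs the ssh login) are all hypothetical names.

```python
import time


def retry(attempt, tries=5, delay=1.0, backoff=2.0, sleep=time.sleep):
    """Call attempt() until it returns True or the tries run out.

    attempt is a hypothetical callable standing in for "try to ssh in";
    it should return True on success and False on failure.
    The wait between attempts grows by the backoff factor each time.
    """
    for i in range(tries):
        if attempt():
            return True
        # Do not sleep after the final failed attempt.
        if i < tries - 1:
            sleep(delay)
            delay *= backoff
    return False
```

Even with backoff, the caller still has to guess how many tries are enough, which is the arbitrariness that option 2 avoids.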

The only negative effect this will have is

Garrett,

Hmm, I entirely missed that. Thanks! I've resubmitted with a fix for that as well.

Jordan Evans (jordane) wrote :

> Scott,

> The only negative effect this will have is

Oops, looks like part of my comment disappeared. The only negative effect this will have is slowing down bootup time.

Preview Diff

=== modified file 'systemd/cloud-init.service'
--- systemd/cloud-init.service 2013-09-20 23:04:49 +0000
+++ systemd/cloud-init.service 2014-07-15 16:31:57 +0000
@@ -1,8 +1,9 @@
 [Unit]
 Description=Initial cloud-init job (metadata service crawler)
 After=local-fs.target network.target cloud-init-local.service
+Before=sshd.service sshd-keygen.service
 Requires=network.target
-Wants=local-fs.target cloud-init-local.service
+Wants=local-fs.target cloud-init-local.service sshd.service sshd-keygen.service

 [Service]
 Type=oneshot
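
As a rough way to sanity-check the ordering this diff introduces, the patched `[Unit]` section can be read with Python's `configparser`. This is only a sketch under stated assumptions: `configparser` approximates the unit-file syntax and does not implement systemd's own parsing rules (for example, repeated keys are not merged), and the helper names `units_in` and `ordered_before` are hypothetical. The unit text is the post-patch content visible in the diff, truncated where the diff is.

```python
import configparser

# Post-patch unit content, limited to the lines visible in the diff above.
UNIT_TEXT = """\
[Unit]
Description=Initial cloud-init job (metadata service crawler)
After=local-fs.target network.target cloud-init-local.service
Before=sshd.service sshd-keygen.service
Requires=network.target
Wants=local-fs.target cloud-init-local.service sshd.service sshd-keygen.service

[Service]
Type=oneshot
"""


def units_in(text, key, section="Unit"):
    """Return the space-separated unit names listed under key (hypothetical helper)."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.optionxform = str  # keep key case; unit-file keys are case-sensitive
    parser.read_string(text)
    return parser.get(section, key, fallback="").split()


def ordered_before(text, *units):
    """True if every given unit appears in the Before= list (hypothetical helper)."""
    listed = set(units_in(text, "Before"))
    return all(unit in listed for unit in units)


print(ordered_before(UNIT_TEXT, "sshd.service", "sshd-keygen.service"))
```

On a running system, the unit's effective ordering as systemd itself resolves it is the authoritative answer; this check only looks at the file text.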