Bootstrapping Vivid: ERROR failed to bootstrap environment, Permission denied (publickey), ci-info: no authorized ssh keys fingerprints found for user ubuntu

Bug #1424900 reported by Ryan Beisner
10
This bug affects 1 person
Affects Status Importance Assigned to Milestone
cloud-init
Fix Released
High
Unassigned
cloud-init (Ubuntu)
Fix Released
High
Unassigned

Bug Description

Bootstrapping Vivid: ERROR failed to bootstrap environment, Permission denied (publickey), ci-info: no authorized ssh keys fingerprints found for user ubuntu

# Scenario
A user needs Unit 0 to be Vivid. The user manually sets default-series: vivid in ~/.juju/environments.yaml, then issues `juju bootstrap`.

The bootstrap times out after the default 10 minute wait, with: "ERROR failed to bootstrap environment: waited for 10m0s without being able to connect: Permission denied (publickey)."

The user cannot open an ssh session to the new instance using the local user keys, nor the keys in ~/.juju/. Juju stat and juju debug-log do not provide details as the bootstrap node cannot be contacted. Information is collected via nova console-log.

The same user procedure does bootstrap successfully when default-series is set to utopic or trusty. This issue has been confirmed using two juju providers: openstack provider (serverstack) and maas provider (dellstack). It has been confirmed on juju-core 1.21.1-0ubuntu1~14.04.1~juju1 and 1.22-beta3-0ubuntu1~14.04.1~juju1.

# The bootstrap node's nova console log reveals trouble:
Cloud-init v. 0.7.7 running 'modules:final' at Tue, 24 Feb 2015 00:44:33 +0000. Up 11.03 seconds.
ci-info: no authorized ssh keys fingerprints found for user ubuntu.
ci-info: no authorized ssh keys fingerprints found for user ubuntu.

# bootstrap output, with default-series: vivid
jenkins@juju-env0-machine-13:~$ juju bootstrap
Bootstrapping environment "osci-sv10-jdev"
Starting new instance for initial state server
Launching instance
 - 15abce53-fa1a-427e-83d9-551a642f4fd7
Installing Juju agent on bootstrap instance
Waiting for address
Attempting to connect to 172.17.110.41:22
ERROR failed to bootstrap environment: waited for 10m0s without being able to connect: Permission denied (publickey).

See attachment for additional details.

Related branches

Revision history for this message
Ryan Beisner (1chb1n) wrote :
Curtis Hovey (sinzui)
Changed in juju-core:
status: New → Triaged
importance: Undecided → High
milestone: none → 1.23
tags: added: bootstrap vivid
Revision history for this message
Scott Moser (smoser) wrote :

I believe this is likely bug https://launchpad.net/bugs/1423972 .
i'll investigate further.

Revision history for this message
Scott Moser (smoser) wrote :

it seems cloud-init is having issues consuming gzipped user-data.

Revision history for this message
Scott Moser (smoser) wrote :

$ printf "%s\n%s\n" '#!/bin/sh' 'echo "$(date -R)" === HELLO WORLD === | tee -a /run/my.log"' | gzip -c > my-userdata
$ zcat my-userdata
#!/bin/sh
echo "$(date -R)" === HELLO WORLD === | tee -a /run/my.log"

$ nova boot --user-data=my-userdata --key-name=brickies --flavor=m1.small \
   --nic=net-id=25006453-2caa-4aa4-bdeb-e4822dc700d6 \
   --image=719fc41a-ca89-4f18-b082-94d067d449f9 vivid-20150224-214427

inside the instance, /var/log/cloud-init.log will have a WARN like:
  2015-02-24 21:45:12,010 - __init__.py[WARNING]: Unhandled non-multipart (text/x-not-multipart) userdata: 'b'\\x1f\\\\ufffd\\x08\\x00~\\\\ufffd\\\\ufffdT\\x00'...'

$ sudo file /var/lib/cloud/instance/user-data.txt
/var/lib/cloud/instance/user-data.txt: data

That *should* show 'gzip compressed data'

Also
$ ls -l /var/lib/cloud/instance/user-data.txt
-rw------- 1 root root 155 Feb 24 21:45 /var/lib/cloud/instance/user-data.txt
but
$ wget http://169.254.169.254/openstack/latest/user_data -O user_data
$ ls -l user_data
-rw-rw-r-- 1 ubuntu ubuntu 86 Feb 24 21:50 user_data

something got hosed there.

my tests that i've just committed don't go far enough to ensure this is working.
I think we undergo some conversions bytes -> text -> bytes and lose data in the process.

Revision history for this message
Oleg Strikov (strikov-deactivatedaccount) wrote :

I was able to reproduce the issue while running 12.04 as a master machine (one which runs nova boot <...>).

When trying to repro the issue with 14.04 as a master machine I observer this bug:
https://bugs.launchpad.net/python-novaclient/+bug/1419859
$ ubuntu@strikov-bastion:~/bug-1424900$ nova boot --user-data=./my-userdata --key-name=strikov-yubikey --flavor=m1.small --image=57a76219-c397-4f7d-aca1-cd7d681eadbb vivid-20150224.5
ERROR: 'utf8' codec can't decode byte 0x8b in position 1: invalid start byte

Revision history for this message
Oleg Strikov (strikov-deactivatedaccount) wrote :

When I try to pull user-data *manually* from inside the instance -- it is not corrupted:

ubuntu@vivid-20150224:~$ wget -O - http://169.254.169.254/latest/user-data | zcat

--2015-02-25 11:47:26-- http://169.254.169.254/latest/user-data
Connecting to 169.254.169.254:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 85 [text/html]
Saving to: 'STDOUT'

100%[================================================>] 85 --.-KB/s in 0s

2015-02-25 11:47:27 (15.8 MB/s) - written to stdout [85/85]

#!/bin/sh
echo "$(date -R)" === HELLO WORLD === | tee -a /run/my.log"

Revision history for this message
Oleg Strikov (strikov-deactivatedaccount) wrote :

14.04 cloud instance:

>>> import cloudinit.url_helper as uh
>>> r = uh.readurl("http://169.254.169.254/latest/user-data")
>>> r.contents
'\x1f\x8b\x08\x00\xb8\xb1\xedT\x00\x03SV\xd4O\xca\xcc\xd3/\xce\xe0JM\xce\xc8WPR\xd1HI,IU\xd0\r\xd2TR\xb0\xb5\xb5U\xf0p\xf5\xf1\xf1W\x08\xf7\x0f\xf2q\x01\xf3k\x14JR\x81\xd2\x89\n\xfaE\xa5y\xfa\xb9\x95z9\xf9\xe9J\\\x00L\xeb\x10\x11F\x00\x00\x00'

15.04 cloud instance:

>>> import cloudinit.url_helper as uh
>>> r = uh.readurl("http://169.254.169.254/latest/user-data")
>>> r.contents
'\x1f�\x08\x00���T\x00\x03SV�O���/��JM��WPR�HI,IU�\r�TR���U�p���W\x08�\x0f�q\x01�k\x14JR�҉\n�E�y���z9��J\\\x00L�\x10\x11F\x00\x00\x00'

Looks like multibyte UTF encoding issue to me.
Keep investigating...

Revision history for this message
Oleg Strikov (strikov-deactivatedaccount) wrote :

Simplified repro:

from cloudinit.sources.helpers.openstack import MetadataReader
from cloudinit.util import write_file
userdata = MetadataReader("http://169.254.169.254/").read_v2().get("userdata")
write_file("/tmp/userdata", userdata, 0o600)

/tmp/userdata is corrupted on 15.04 but okay on 14.04

Revision history for this message
Oleg Strikov (strikov-deactivatedaccount) wrote :

15.04:

>>> userdata
'\x1f�\x08\x00���T\x00\x03SV�O���/��JM��WPR�HI,IU�\r�TR���U�p���W\x08�\x0f�q\x01�k\x14JR�҉\n�E�y���z9��J\\\x00L�\x10\x11F\x00\x00\x00'
>>> userdata.encode('utf-8')
b'\x1f\xef\xbf\xbd\x08\x00\xef\xbf\xbd\xef\xbf\xbd\xef\xbf\xbdT\x00\x03SV\xef\xbf\xbdO\xef\xbf\xbd\xef\xbf\xbd\xef\xbf\xbd/\xef\xbf\xbd\xef\xbf\xbdJM\xef\xbf\xbd\xef\xbf\xbdWPR\xef\xbf\xbdHI,IU\xef\xbf\xbd\r\xef\xbf\xbdTR\xef\xbf\xbd\xef\xbf\xbd\xef\xbf\xbdU\xef\xbf\xbdp\xef\xbf\xbd\xef\xbf\xbd\xef\xbf\xbdW\x08\xef\xbf\xbd\x0f\xef\xbf\xbdq\x01\xef\xbf\xbdk\x14JR\xef\xbf\xbd\xd2\x89\n\xef\xbf\xbdE\xef\xbf\xbdy\xef\xbf\xbd\xef\xbf\xbd\xef\xbf\xbdz9\xef\xbf\xbd\xef\xbf\xbdJ\\\x00L\xef\xbf\xbd\x10\x11F\x00\x00\x00'

14.04:

>>> userdata
'\x1f\x8b\x08\x00\xb8\xb1\xedT\x00\x03SV\xd4O\xca\xcc\xd3/\xce\xe0JM\xce\xc8WPR\xd1HI,IU\xd0\r\xd2TR\xb0\xb5\xb5U\xf0p\xf5\xf1\xf1W\x08\xf7\x0f\xf2q\x01\xf3k\x14JR\x81\xd2\x89\n\xfaE\xa5y\xfa\xb9\x95z9\xf9\xe9J\\\x00L\xeb\x10\x11F\x00\x00\x00'

You may see that 'str' version on 15.04 is equal to 14.04 (some codes are treated as symbols on 15.04 though). But encoded version is clearly different on 15.04. Encoding breaks the thing.

Revision history for this message
Oleg Strikov (strikov-deactivatedaccount) wrote :

Okay. Some of the thought above were wrong.
Here is what's happening.

Python3-based cloud-init on 15.04 reads binary data into UTF-8 string.
It thoughts that this bitstream is utf-8 but it doesn't.
If it meets sequence of bytes that can't be converted into single utf-8 character (some non-sense sequence) it puts 'unknown chracter' (\xef\xbf\xbd or �) instead.
Then, we want to right userdata string to a file we encode it into bitstream but get all this 'unknown character's encoded with \xef\xbf\xbd which makes output longer and add unexpected bits.
That breaks the whole thing.

Revision history for this message
Dimiter Naydenov (dimitern) wrote :

So then it doesn't seem juju-core is the reason, but it's a cloud-init thing, as per comment #10.

no longer affects: juju-core
Scott Moser (smoser)
Changed in cloud-init:
status: New → Confirmed
importance: Undecided → High
Revision history for this message
Oleg Strikov (strikov-deactivatedaccount) wrote :

I can't reproduce the bug with the latest cloud image:
ubuntu-daily/ubuntu-vivid-daily-amd64-server-20150228-disk1.img (d8f01900-bb0f-480b-a71c-06330d1720a0)
It seems to be fixed.

$ juju bootstrap -v
Bootstrapping environment "strikov"
Starting new instance for initial state server
Launching instance
 - 3878adac-5e16-465b-8415-5aec0d8a4173
Installing Juju agent on bootstrap instance
Waiting for address
Attempting to connect to 10.5.0.28:22
Warning: Permanently added '10.5.0.28' (ECDSA) to the list of known hosts.
Logging to /var/log/cloud-init-output.log on remote host
Running apt-get update
Running apt-get upgrade
Installing package: curl
Installing package: cpu-checker
Installing package: bridge-utils
Installing package: rsyslog-gnutls
Fetching tools: curl -sSfw 'tools from %{url_effective} downloaded: HTTP %{http_code}; time %{time_total}s; size %{size_download} bytes; speed %{speed_download} bytes/s ' --retry 10 -o $bin/tools.tar.gz <[https://streams.canonical.com/juju/tools/releases/juju-1.21.1-vivid-amd64.tgz]>
Bootstrapping Juju machine agent
Starting Juju machine agent (jujud-machine-0)
Bootstrap complete

$ juju ssh 0
<...>
ubuntu@juju-strikov-machine-0:~$ sudo file /var/lib/cloud/instance/user-data.txt
/var/lib/cloud/instance/user-data.txt: gzip compressed data

Revision history for this message
Ryan Beisner (1chb1n) wrote :

Also confirming that his appears to be resolved with the latest daily image.

#
ubuntu@beisner-bastion:~$ juju stat
environment: beis2
machines:
  "0":
    agent-state: started
    agent-version: 1.21.3
    dns-name: 10.5.5.197
    instance-id: a5ff05e0-46b0-406f-af63-a98fc790a2e3
    instance-state: ACTIVE
    series: vivid
    hardware: arch=amd64 cpu-cores=1 mem=1536M root-disk=10240M
    state-server-member-status: has-vote
services: {}

#
| image | ubuntu-daily/ubuntu-vivid-daily-amd64-server-20150228-disk1.img (d8f01900-bb0f-480b-a71c-06330d1720a0) |

Revision history for this message
Scott Moser (smoser) wrote :

this was fixed in 0.7.7~bzr1067-0ubuntu1 uploaded to vivid 2015-02-26.

Changed in cloud-init:
status: Confirmed → Fix Released
status: Fix Released → Fix Committed
Changed in cloud-init (Ubuntu):
status: New → Fix Released
importance: Undecided → Critical
importance: Critical → High
Changed in cloud-init:
status: Fix Committed → Fix Released
Revision history for this message
James Falcon (falcojr) wrote :
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.