Merge ~wesley-wiedenmeier/cloud-init:integration-testing into cloud-init:master
| Status: | Merged |
|---|---|
| Merged at revision: | 76d58265e34851b78e952a7f275340863c90a9f5 |
| Proposed branch: | ~wesley-wiedenmeier/cloud-init:integration-testing |
| Merge into: | cloud-init:master |
| Diff against target: |
1567 lines (+700/-271) 16 files modified
tests/cloud_tests/args.py (+11/-7) tests/cloud_tests/collect.py (+9/-14) tests/cloud_tests/config.py (+53/-12) tests/cloud_tests/images/base.py (+16/-9) tests/cloud_tests/images/lxd.py (+108/-18) tests/cloud_tests/instances/base.py (+65/-29) tests/cloud_tests/instances/lxd.py (+47/-21) tests/cloud_tests/platforms.yaml (+49/-1) tests/cloud_tests/platforms/base.py (+1/-24) tests/cloud_tests/platforms/lxd.py (+27/-15) tests/cloud_tests/releases.yaml (+133/-67) tests/cloud_tests/setup_image.py (+83/-42) tests/cloud_tests/snapshots/base.py (+2/-1) tests/cloud_tests/snapshots/lxd.py (+6/-6) tests/cloud_tests/util.py (+89/-4) tox.ini (+1/-1) |
| Related bugs: |
| Reviewer | Review Type | Date Requested | Status |
|---|---|---|---|
| Server Team CI bot | continuous-integration | Approve on 2017-03-17 | |
| Joshua Powers (community) | Approve on 2017-01-18 | ||
| cloud-init commiters | 2016-10-12 | Pending | |
|
Review via email:
|
|||
Description of the Change
integration-testing updates
- enable additional distros
- allow use of either pylxd 2.1.3 or 2.2.3
- detected at runtime during instance.execute()
- when using 2.1.3, the test suite will not have proper
error handling during setup phase, as 2.1.3 does not
return command exit codes. this is adequate for use on
jenkins until the fix for 2.2 is merged in upstream
- refactor image setup code
- improve error handling throughout
- change behavior of --upgrade setup_image to only
upgrade cloud-init
- add --upgrade-all setup_image flag to upgrade all
packages on system
- clean up detection of cloud-init version
- output cloud-init version at end of setup_image
- use 'yum install' rather than 'yum update' in
setup_
- update help in setup_image args
- clean up image config
- new image config format allows finer control over
platform specific image config, with less verbosity
- clean up handling of image config throughout platform
- added additional releases and distros
- re-ordered releases.yaml in order of release date, and
added EOL comments for future reference
- allow image config to override setup options with
'setup_
- allow image config to specify preserving base images
for lxd
- allow using images from images.
releases other than ubuntu
- improve waiting for system to boot
- separated out determining if the system has booted from
determining if cloud-init has finished
- do not wait for cloud-init during setup phases, as it
may not be installed
- use 'systemctl is-system-running' or 'runlevel'
depending on init system
- poll for system ready inside the image rather than
through instance.execute() to make pylxd too many open
files error take longer to strike
- clean up code for handling waiting
- update lxd image metadata and templates if necessary
- many lxd images that do not come with cloud-init
installed do not have the necessary templates to write
a nocloud seed into the image
- if necessary, export image, extract metadata, add new
template scripts, and import as new image
- add new tox environment to run tests
- new versions of pylxd only available through pip on
xenial, best to avoid using pip outside a venv
- 'citest' provides an entry points into the cli,
allowing use of any test commands, using pylxd 2.2.3
Switching to polling in the image rather than with repeated calls to pylxd's execute() seems to have fixed the too many open files issue, although once the test suite is made to run in parallel this may become an issue again. Centos tests are not running smoothly yet because of what looks like a bug in cc_set_hostname, but there is no longer an issue with the test infrastructure on centos and old releases of other distros.
It looks like the remaining centos70 issue is not an error in the test suite or in cloud-init; its a systemd bug in the centos images from images.
Setting the system hostname using hostnamectl fails with:
-[laptop1]- -[~]-
-[08:20 PM]-lxc launch images:
Creating picked-snapper
Starting picked-snapper
-[laptop1]- -[~]-
-[08:21 PM]-lxc exec picked-snapper bash
[root@picked-
Could not set property: Connection timed out
If 'preserve_hostname: true' is added to the cloud-config for a test that should work on centos the test will pass.
Once I figure out whether the bug with hostnamectl is in the image itself or in all of centos I will get a bug filed. I don't think that further work on the test suite should be done to resolve this, as modifying the test configs as a workaround does not seem right here.
This is blocked on pylxd:
https:/
A fix for the pylxd bug is available at:
https:/
This should be ready to merge and includes all changes from:
https:/
| Joshua Powers (powersj) wrote : | # |
@wesley, per the commit message you enabled additional distros. It looks like wheezy, jessie and centros66, and centos70. Do these require the newer version of pylxd for running? Should they work as expected? Trying to understand how to test these.
All distros will work on both pylxd 2.1.3 and pylxd 2.2, the only problem running with 2.1.3 is the lack of error handling, which makes debugging setup_image difficult when it isn't working.
Not all of the new distros work correctly, centos66 will fail due to not having cloud-init in its repos, although I have gotten a run to go through by installing a rpm + a couple deps. It probably shouldn't be used for any sort of testing. I'll go ahead and disable it.
Debian wheezy is not enabled, as it is very old. I could probably get it working but it wasn't a priority for testing.
Debian stretch works, at least for single tests. I haven't run through a full test yet, although I suspect that modules/
Debian jessie is enabled, but does not work due to what I believe is a bug in cc_apt_configure which I haven't had time to resolve yet. Other than that bug, it should work.
Centos70 does not work because of a bug in hostnamectl on the centos images from images.
Due to feature incompatibility between ubuntu/debian and centos, and lack of support for modules/
The supported distro/release matrix has been updated at: goo.gl/q78sY8
| Joshua Powers (powersj) wrote : | # |
LGTM, with two minor comments below. Reusing images has really sped up execution and printing the cloud-init version is great. I ran through the following tests:
$ tox
$ tox -e citest -- run -v -n zesty -t tests/cloud_
$ tox -e citest_new_pylxd -- run -v -n zesty -t tests/cloud_
$ tox -e citest_tree_run
$ python3 -m tests.cloud_tests run -v -n zesty -t tests/cloud_
$ python3 -m tests.cloud_tests run -v -n stretch -t tests/cloud_
$ python3 -m tests.cloud_tests run -v -n jessie -t tests/cloud_
Comment 1:
The warning messages about using pylxd < 2.2 [1]. There are far too many messages right now. I see the message in pylxd.py as it checks for the exit code; can we suppress the warning until 2.2 is stable or even consider changing it to show up when the user is using less than 2.1.3?
Comment 2:
Running on jessie I get a failure running no_stages_errors [2] about security. Is it trying to add the jessie security repo?
[1] https:/
[2] https:/
> LGTM, with two minor comments below. Reusing images has really sped up
> execution and printing the cloud-init version is great.
Thanks. Yeah, re-using the images should help quite a bit with runtime.
> Comment 1:
> The warning messages about using pylxd < 2.2 [1]. There are far too many
> messages right now. I see the message in pylxd.py as it checks for the exit
> code; can we suppress the warning until 2.2 is stable or even consider
> changing it to show up when the user is using less than 2.1.3?
I'll go ahead and comment the message out, we can always put it back later.
Having the message display each execute() does make things harder to read
> Comment 2:
> Running on jessie I get a failure running no_stages_errors [2] about security.
> Is it trying to add the jessie security repo?
Yeah, jessie fails right now due to cc_apt_configure trying to find the security
repo in sources. I'm not 100% sure why its looking for it, as it won't be there.
I think this is likely a bug that needs to be filed, I just haven't had time to
find a way to reproduce the issue outside of cloud tests yet. I'll try to get
that done later this week.
| Scott Moser (smoser) wrote : | # |
lots of comments...
Thanks for all the work here, we need to try to do things more iteratively. I know we're still bringing this all up, but there is just so much here that its almost impossible to review.
General:
* please lets try to do smaller merge proposals, doing one set of things
at a time.
* full chart of enabled/disabled at: http://
this is a private url, not suitable for a commit message
possibly we just move this into doc in cloud-init?
* where is the upstream bug that prevents 2.2 from working correctly?
* i think we can live without 'alias: t'.
* apparently we should be using dnf rather than yum in any recent RH distro
* "switch to using images.
why ?? we really need to be testing Ubuntu images, not linuxcontainers.org
I like using them for other distros where we have nothing else, but
ubuntu should test primarily on ubuntu images.
* system_ready_script can be simplified by being a command, passed
directly to 'execute'.
system_
that saves writing the file and executing it on the other side, and
also then assuming 'bash'.
* "Pass $HOME through in tox citest envs for lxc client use"
message doesn't make sense. where are we using the lxc client? just
for developer debug ?
* "Disable centos66 test"
why ?
tox.ini:
* multiple citest_* are fine for now, but i really do not want them
going forward.
Easily droppable commits / things that can be done separartely:
- "Use LOG.warning instead of deprectated LOG.warn"
why? This is just not necessary in this merge proposal.
- "Change behavior of upgrade in setup_image"
tests/cloud_
* 7cd4a3b5 source_archive: can we either
a.) just assume the deb-src lines are present
b.) hueristically grab one... i'd like to not use us.archive.com if
the image is otherwise using archive.ubuntu.com as that will
avoid any proxy in the path. this is kind of tough actually..
* creating the tarball..
I think what we want to do here is:
a.) make-tarball on "this side"
b.) modify bddeb to be able to just create a debian/ dir
either
i.) have it create the orig tarball and a debian.tar.gz into a
ii.) have it just create the debian.tar.gz and we create the tarball
c.) move the tarball and debian dir over, extract tarball, extract debian/
d.) mk-build-deps --install .... debian/control
e.) dpkg-buildpackage
doing this should mean we do not have to 'install additional build deps'
but rather just have mk-build-deps do the right thing.
* when building with dpkg-buildpackage or debuild, good to
allow setting DEB_BUILD_
nosetests)
* i think we can drop 'git' due to above, 'tar' is essential so you
do not have to declare it.
lets do the install of devscripts and equivs with --no-install-
and also eatmydata everywhere
* i think the above path means we do no...
> lots of comments...
> Thanks for all the work here, we need to try to do things more
> iteratively. I know we're still bringing this all up, but there
> is just so much here that its almost impossible to review.
Yeah this was definitely too large a merge, we just got stuck with
lots of changes that depend on each other. This replaces a pretty
huge chunk of the current codebase as most of that was just
temporary to get some tests running while the main part of this
got finished.
> General:
> * please lets try to do smaller merge proposals, doing one set
> of things at a time.
Definitely, this should be the last major merge, since all of the
pieces that had to be rewritten are finished. Next merge should just
be the new platform.
> * full chart of enabled/disabled at: http://
> this is a private url, not suitable for a commit message
> possibly we just move this into doc in cloud-init?
I'll go ahead and pull this link out of the commit message. I have
been meaning to do some doc updates, I'll add a link to the
spreadsheet and make it public when I put together the merge for
that.
> * where is the upstream bug that prevents 2.2 from working
> correctly?
It was at https:/
been pulled in now. I'm going to set up a ppa to build a version with
the fix for devel use. However, for now using 2.1.3 works for
jenkins, the only disadvantage there is that if anything goes wrong
during setup it will fail silently. This version of the code will run
with either 2.1.3 or 2.2, so when the fix is released all we have to
do to switch is change the version in the tox env.
> * i think we can live without 'alias: t'.
The 'alias' key is used to reference the image, on the ubuntu daily
sstream server the aliases I found were p, t, x, y, z. It's now been
switched over to ubuntu/
images.
> * apparently we should be using dnf rather than yum in any recent
> RH distro
Oh, I didn't know that dnf was officially supported outside Fedora.
It shouldn't take much to switch that back over, just gotta update
the setup image functions.
> * "switch to using images.
> daily images" why ?? we really need to be testing Ubuntu
> images, not linuxcontainers.org I like using them for other
> distros where we have nothing else, but ubuntu should test
> primarily on ubuntu images.
I had mainly done it for consistancy across the different distros and
because linuxcontainers images download way faster than ubuntu daily
images (at least for me) but its just a config change to switch back
to the ubuntu daily repo for ubuntu images. I can switch that back
tonight.
> * system_ready_script can be simplified by being a command,
> passed directly to 'execute'.
> system_
> '/run/cloud-
> that saves writing the file and executing it on the other
> side, and also then assuming 'bash'.
That makes sense, although it does limit how complex the scripts can
get. Its probably best to keep them simple though...
| Scott Moser (smoser) wrote : | # |
>> * full chart of enabled/disabled at: http://
>> this is a private url, not suitable for a commit message
>> possibly we just move this into doc in cloud-init?
>
>I'll go ahead and pull this link out of the commit message. I have
>been meaning to do some doc updates, I'll add a link to the
>spreadsheet and make it public when I put together the merge for
>that.
Great.
>> * where is the upstream bug that prevents 2.2 from working
>> correctly?
> It was at https:/
Thanks, please mention it in the code.
>> * i think we can live without 'alias: t'.
> The 'alias' key is used to reference the image, on the ubuntu daily
> sstream server the aliases I found were p, t, x, y, z. It's now been switched
> over to ubuntu/
> images.
The ubuntu/daily streams also have other more explicit alias:
lxc launch ubuntu-daily:xenial
lxc launch ubuntu-daily:16.04
>> * apparently we should be using dnf rather than yum in any recent
>> RH distro
> Oh, I didn't know that dnf was officially supported outside Fedora.
> It shouldn't take much to switch that back over, just gotta update
> the setup image functions.
I only know because cloud-init had a MP from a RH developer asking for it (see
commit log)
https:/
>> * system_ready_script can be simplified by being a command,
>> passed directly to 'execute'.
>> system_
>> '/run/cloud-
>> that saves writing the file and executing it on the other
>> side, and also then assuming 'bash'.
>That makes sense, although it does limit how complex the scripts can
>get. Its probably best to keep them simple though.
If we needed, we could allow the config to provide stdin also, but with
the above, the only limit on complexity that i can see is the command
line length limit, which shouldn't be too big of a problem
http://
xargs --show-limits:
POSIX upper limit on argument length (this system): 2091520
:)
>> * "Pass $HOME through in tox citest envs for lxc client use"
>> message doesn't make sense. where are we using the lxc client? just
>> for developer debug ?
> It is used for
Sentance fragment ^ ?
>> * "Disable centos66 test"
>> why ?
> Its such an old release that pretty much nothing is going to be
> backported there, testing on it doesn't really make sense IMO. I
> think the one time I tried running centos 6 tests they failed, but I
> don't recall why. I can get tests going there in the future if
> they're needed.
Well, if they were working, then it makes sense to run it, as a stable
platform likely means regression/
cloud-init, not the OS.
>> Easily droppable commits / things that can be done separartely:
>> - "Use LOG.warning instead of deprectated LOG.warn"
>> why? This is just not necessary in this merge proposal.
>> - "Change behavior of upgrade in setup_image"
>Th...
> >> * where is the upstream bug that prevents 2.2 from working
> >> correctly?
>
> > It was at https:/
>
> Thanks, please mention it in the code.
Sure, I'll add a comment in there.
> >> * i think we can live without 'alias: t'.
>
> > The 'alias' key is used to reference the image, on the ubuntu daily
> > sstream server the aliases I found were p, t, x, y, z. It's now been
> switched
> > over to ubuntu/
> > images.
>
> The ubuntu/daily streams also have other more explicit alias:
> lxc launch ubuntu-daily:xenial
> lxc launch ubuntu-daily:16.04
Oh, nice. I'll switch to using those.
> >> * apparently we should be using dnf rather than yum in any recent
> >> RH distro
>
> > Oh, I didn't know that dnf was officially supported outside Fedora.
> > It shouldn't take much to switch that back over, just gotta update
> > the setup image functions.
>
> I only know because cloud-init had a MP from a RH developer asking
> for it (see commit log)
> https:/
It may be good to keep both the yum path in place and add a config
directive to tell the platform which to use, just in case things
switch around more.
> >> * system_ready_script can be simplified by being a command,
> >> passed directly to 'execute'.
> >> system_
> >> '/run/cloud-
> >> that saves writing the file and executing it on the other
> >> side, and also then assuming 'bash'.
>
> >That makes sense, although it does limit how complex the scripts can
> >get. Its probably best to keep them simple though.
>
> If we needed, we could allow the config to provide stdin also, but with
> the above, the only limit on complexity that i can see is the command
> line length limit, which shouldn't be too big of a problem
> http://
> argument-
>
> xargs --show-limits:
> POSIX upper limit on argument length (this system): 2091520
>
> :)
Haha yeah, we're definitely not in danger of hitting that limit.
> >> * "Pass $HOME through in tox citest envs for lxc client use"
> >> message doesn't make sense. where are we using the lxc client? just
> >> for developer debug ?
>
> > It is used for
>
> Sentance fragment ^ ?
Whoops, I must have forgotten to finish that. It is used for exporting
lxd images that need to have their metadata modified to write the
cloud seed into the target on. Pylxd does have an export function for
images, but it does the export as a tmpfile with handle, not as a split
tarball in the filesystem like we need. The lxc client handles that
nicely, but it needs $HOME set so it can locate its config file.
> >> * "Disable centos66 test"
> >> why ?
>
> > Its such an old release that pretty much nothing is going to be
> > backported there, testing on it doesn't really make sense IMO. I
> > think the one time I tried running centos 6 tests they failed, but I
> > don't recall why. I can get tests going there in th...
This branch should mostly be ready to merge in now. There is additonal work to be done on getting the cc_apt_configure issue resolved in debian jessie, and the hostnamectl issue resolved on centos70, but that can wait, as these are most likely not issues caused by the test suite. Support for package management on centos still needs to be switched to dnf, but since centos is blocked on the hostnamectl issue, that can wait as well.
The cleanup to system_
Note: I commented on the wrong MP a couple days ago, this was meant to be here
PASSED: Continuous integration, rev:b7d483f8dce
https:/
Executed test runs:
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
Click here to trigger a rebuild:
https:/
PASSED: Continuous integration, rev:bc332ca6b4d
https:/
Executed test runs:
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
Click here to trigger a rebuild:
https:/
FAILED: Continuous integration, rev:c6e896b627e
https:/
Executed test runs:
FAILURE: https:/
FAILURE: https:/
FAILURE: https:/
FAILURE: https:/
FAILURE: https:/
Click here to trigger a rebuild:
https:/
PASSED: Continuous integration, rev:1c3b07a8634
https:/
Executed test runs:
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
Click here to trigger a rebuild:
https:/
PASSED: Continuous integration, rev:cffc558b6ef
https:/
Executed test runs:
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
Click here to trigger a rebuild:
https:/
- 0bb1f96... by Wesley Wiedenmeier on 2017-03-12
- 4ebe02d... by Wesley Wiedenmeier on 2017-03-12
- 5952da7... by Wesley Wiedenmeier on 2017-03-12
PASSED: Continuous integration, rev:bee10bde56a
https:/
Executed test runs:
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
Click here to trigger a rebuild:
https:/
PASSED: Continuous integration, rev:9e3596dd6f4
https:/
Executed test runs:
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
Click here to trigger a rebuild:
https:/
PASSED: Continuous integration, rev:f28a848f23a
https:/
Executed test runs:
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
Click here to trigger a rebuild:
https:/
I've squashed the commit history for this branch together into a patch series, so it should be easier to review commit by commit now.
- c931e06... by Wesley Wiedenmeier on 2017-03-15
PASSED: Continuous integration, rev:1c55c8bb2a9
https:/
Executed test runs:
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
Click here to trigger a rebuild:
https:/
- 7918b5c... by Wesley Wiedenmeier on 2017-03-15
PASSED: Continuous integration, rev:b9f673fcc73
https:/
Executed test runs:
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
Click here to trigger a rebuild:
https:/
PASSED: Continuous integration, rev:7918b5cf3fd
https:/
Executed test runs:
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
Click here to trigger a rebuild:
https:/
| Scott Moser (smoser) wrote : | # |
I'm going to mark this 'work in progress'. Josh has integrated this branch and others into his merge proposal at https:/
Thanks for your work, Wesley!
| Scott Moser (smoser) wrote : | # |
Hi.
I've marked this 'merged' as I think it is now in trunk under 76d58265e34851b
If you disagree, please feel free to re-open.


There are 2 outstanding issues with releases that are now enabled:
Debian Jessie encounteres errors with cc_apt_configure, even with no 'apt'
config keys presented. This does not cause cloud-init to completely fail;
however, the tests fail as they all check that cloud-init stages did not
encounter any errors. I suspect there may be a bug in cloud-init here, as
it is failing on a supported system even with default config. It's worth
noting that on daily builds of Stretch, everything works properly, and
everything works on Sid, so this only affects Wheezy.
Errors: config. cc_apt_ configure' from '/usr/lib/ python2. 7/dist- packages/ cloudinit/ config/ cc_apt_ configure. pyc'>) failed
2017-01-05 00:15:47,016 - util.py[WARNING]: Running apt-configure (<module 'cloudinit.
"('apt-configure', NotFound(u\"cannot find 'security'\",))"
All CentOS releases, Debian Wheezy and Ubuntu Precise fail while pylxd
attempts to create a websocket to communicate with the image in execute().
Since the failures are consistant and always on older releases (counting
centos 7 as an old release, as most versions in it are fairly old), I
suspect that this has something to do with library versions in target.
The actual error message is:
[Errno 24] Too many open files
I will mark this is 'needs review' once I have fixed these issues or found
better explainations.