Merge ~chad.smith/cloud-init:cyaml-loading into cloud-init:master
| Status: | Work in progress | ||||
|---|---|---|---|---|---|
| Proposed branch: | ~chad.smith/cloud-init:cyaml-loading | ||||
| Merge into: | cloud-init:master | ||||
| Diff against target: |
110 lines (+36/-6) 4 files modified
cloudinit/safeyaml.py (+7/-3) packages/debian/control.in (+1/-1) requirements.txt (+6/-2) tests/unittests/test_util.py (+22/-0) |
||||
| Related bugs: |
|
| Reviewer | Review Type | Date Requested | Status |
|---|---|---|---|
| Server Team CI bot | continuous-integration | Needs Fixing on 2017-04-25 | |
| cloud-init commiters | 2017-04-25 | Pending | |
|
Review via email:
|
|||
Description of the Change
[Leaving this branch as work in progress until we have more concrete timing analysis of the impact of loading yam's C-bindings instead of the python implementation]
Use CSafeLoader C-bindings instead of python for a faster yaml.load when processing yaml files.
For faster yaml processing import CSafeLoader in cloudinit/
LP: # 1685939
To test:
tox -r
tox
| Chad Smith (chad.smith) wrote : | # |
From Ryan:
Some thoughts on this:
1) we don't want users who apt install cloud-init to also pull down python-devel, gcc and have to compile the extension, so please don't change package deps here
2) this is currently an issue for the tox/venv environment, so let's focus on how to enable the SafeLoader library for that path
3) Once we make (2) an optional testing path (env or parameter can trigger the dep install prior to calling tox/venv which triggers the compile during pip install of the module IIUC), we can have two travis paths, one without the library (representing how things work today) and one *with* the library enabled. This allows us to compare results, unittests all pass, and time it takes to complete (some estimate of the improvement it may provide).
4) With results from (3), we may propose changes to the underlying images (server seed, cloud-image seed, etc) which could ensure that the library is present in the default cloud images.
- 35bb6bd... by Chad Smith on 2017-05-04
- b1833ad... by Chad Smith on 2017-05-04
| Chad Smith (chad.smith) wrote : | # |
1) We won't have libyaml-dev as a dependency just the shared lib of libyaml-0-2. I've added it as a Recommends and put in logic to handle ImportErrors on CSafeLoader which fall back to SafeLoader
2) I've got a branch up in github which installs tox-needed system debs that we can review as part of the migration to github https:/
3) I have added a skipUnles decorator on unit tests to ensure we exercise the CSafeLoader path only if that module is present
4) depending on what we want to do with a strict yaml C-binding dependency, we might end up tweaking the debian/control file to just expect this as a mandatory deb (libyaml-0-2).
We might need further team discussion on whether that's the approach we want
| Scott Moser (smoser) wrote : | # |
> From Ryan:
> Some thoughts on this:
>
> 1) we don't want users who apt install cloud-init to also pull down python-
> devel, gcc and have to compile the extension, so please don't change package
> deps here
I don't follow this. In ubuntu at least cloud-init does depend on python3-yaml and python3-yaml on libyaml-0-2.
$ apt-cache show python3-yaml | grep Dep
Depends: python3 (<< 3.6), python3 (>= 3.5~), libc6 (>= 2.14), libyaml-0-2
There is no reason to explicitly depend on libyaml-0-2, and we have a fallback
path in the code.
Why did you think we need python-devel ? This is where tox falls apart honestly, when you get into C python extensions. But we already have some of those in the test path (pylxd).
> 2) this is currently an issue for the tox/venv environment, so let's focus on
> how to enable the SafeLoader library for that path
if we do skipIf, then the package build environment will run though the CLoader path, and the tox path would not. We could put in some way of failing if the package build environment didn't take the CLoader path. That'd ensure that we were getting tests on both paths.
I think that might be over-engineering.
> 3) Once we make (2) an optional testing path (env or parameter can trigger the
> dep install prior to calling tox/venv which triggers the compile during pip
> install of the module IIUC), we can have two travis paths, one without the
> library (representing how things work today) and one *with* the library
> enabled. This allows us to compare results, unittests all pass, and time it
> takes to complete (some estimate of the improvement it may provide).
>
> 4) With results from (3), we may propose changes to the underlying images
> (server seed, cloud-image seed, etc) which could ensure that the library is
> present in the default cloud images.
I think over-engineering, especially for the size of the gain. Its entirely possible that there is overhead of loading c that makes this path slower, meaning we'd be doing this work for negative result.
Best case, i think the amount of yaml that cloud-init loads is probably small enough to look like noise here.
| Ryan Harper (raharper) wrote : | # |
On Fri, May 5, 2017 at 8:21 AM, Scott Moser <email address hidden> wrote:
> > From Ryan:
> > Some thoughts on this:
> >
> > 1) we don't want users who apt install cloud-init to also pull down
> python-
> > devel, gcc and have to compile the extension, so please don't change
> package
> > deps here
>
> I don't follow this. In ubuntu at least cloud-init does depend on
> python3-yaml and python3-yaml on libyaml-0-2.
>
> $ apt-cache show python3-yaml | grep Dep
> Depends: python3 (<< 3.6), python3 (>= 3.5~), libc6 (>= 2.14), libyaml-0-2
>
> There is no reason to explicitly depend on libyaml-0-2, and we have a
> fallback
> path in the code.
>
> Why did you think we need python-devel ? This is where tox falls apart
> honestly, when you get into C python extensions. But we already have some
> of those in the test path (pylxd).
>
tox compiling
>
> > 2) this is currently an issue for the tox/venv environment, so let's
> focus on
> > how to enable the SafeLoader library for that path
>
> if we do skipIf, then the package build environment will run though the
> CLoader path, and the tox path would not. We could put in some way of
> failing if the package build environment didn't take the CLoader path.
> That'd ensure that we were getting tests on both paths.
>
> I think that might be over-engineering.
>
> > 3) Once we make (2) an optional testing path (env or parameter can
> trigger the
> > dep install prior to calling tox/venv which triggers the compile during
> pip
> > install of the module IIUC), we can have two travis paths, one without
> the
> > library (representing how things work today) and one *with* the library
> > enabled. This allows us to compare results, unittests all pass, and time
> it
> > takes to complete (some estimate of the improvement it may provide).
> >
> > 4) With results from (3), we may propose changes to the underlying images
> > (server seed, cloud-image seed, etc) which could ensure that the library
> is
> > present in the default cloud images.
>
> I think over-engineering, especially for the size of the gain. Its
> entirely possible that there is overhead of loading c that makes this path
> slower, meaning we'd be doing this work for negative result.
>
> Best case, i think the amount of yaml that cloud-init loads is probably
> small enough to look like noise here.
>
Ideally we'll generate some actual numbers here; if it's not worth it, then
we can dump the whole thing. If it is, ideally we'd have
a path through both w.r.t tox/unittesting; I can't say that it's
over-engineering to want to exercise both paths we might expect
cloud-init users might execute.
>
>
> --
> https:/
> cloud-init/
> Your team cloud init development team is requested to review the proposed
> merge of ~chad.smith/
>
> _______
> Mailing list: https:/
> Post to : <email address hidden>
> Unsubscribe : https:/
> More help : https:/
>
| Scott Moser (smoser) wrote : | # |
On Fri, 5 May 2017, Ryan Harper wrote:
> > Best case, i think the amount of yaml that cloud-init loads is probably
> > small enough to look like noise here.
> >
>
> Ideally we'll generate some actual numbers here; if it's not worth it, then
> we can dump the whole thing. If it is, ideally we'd have
> a path through both w.r.t tox/unittesting; I can't say that it's
> over-engineering to want to exercise both paths we might expect
> cloud-init users might execute.
Maybe.
There are literally hundreds of different paths that software under
cloud-init might take based on some condition. We should have reasonable
code coverage on code we write, but we can't possibly excercise all
possible paths of code underneith us.
As an example, consider the affect on code path through common libraries
based on locale. Should cloud-init run unit tests with multiple different
locales? If that is a non-trivial operation in development time or in
runtime, should we still do that?
It is quite reasonable that pyyaml project should test their coverage of
CLoader and the python path. It is also quite reasonable that cloud-init
assume that path function as it is expected to when it is available.
- 7439b57... by Chad Smith on 2017-05-08
Unmerged commits
- 7439b57... by Chad Smith on 2017-05-08
- b1833ad... by Chad Smith on 2017-05-04
- 35bb6bd... by Chad Smith on 2017-05-04
- 4813c9f... by Chad Smith on 2017-04-25


FAILED: Continuous integration, rev:4813c9f9414 558116c003ce889 9c94c03408b74a /jenkins. ubuntu. com/server/ job/cloud- init-ci/ 275/ /jenkins. ubuntu. com/server/ job/cloud- init-ci/ nodes=metal- amd64/275/ console /jenkins. ubuntu. com/server/ job/cloud- init-ci/ nodes=metal- arm64/275/ console /jenkins. ubuntu. com/server/ job/cloud- init-ci/ nodes=metal- ppc64el/ 275/console /jenkins. ubuntu. com/server/ job/cloud- init-ci/ nodes=metal- s390x/275/ console /jenkins. ubuntu. com/server/ job/cloud- init-ci/ nodes=vm- i386/275/ console
https:/
Executed test runs:
FAILURE: https:/
FAILURE: https:/
FAILURE: https:/
FAILURE: https:/
FAILURE: https:/
Click here to trigger a rebuild: /jenkins. ubuntu. com/server/ job/cloud- init-ci/ 275/rebuild
https:/