Merge ~raharper/cloud-init:ds-ovf-use-util-find-devs-with into cloud-init:master
| Status: | Merged |
|---|---|
| Approved by: | Scott Moser on 2017-09-21 |
| Approved revision: | 6e2771370102fbed46fba7c077fa9c328d02ecaa |
| Merged at revision: | da6562e21d0b17a0957adc0c5a2c9da076e0d219 |
| Proposed branch: | ~raharper/cloud-init:ds-ovf-use-util-find-devs-with |
| Merge into: | cloud-init:master |
| Diff against target: |
309 lines (+221/-27) 3 files modified
cloudinit/sources/DataSourceOVF.py (+47/-27) cloudinit/tests/helpers.py (+10/-0) tests/unittests/test_datasource/test_ovf.py (+164/-0) |
| Related bugs: |
| Reviewer | Review Type | Date Requested | Status |
|---|---|---|---|
| Scott Moser | 2017-09-19 | Approve on 2017-09-21 | |
| Server Team CI bot | continuous-integration | Approve on 2017-09-20 | |
|
Review via email:
|
|||
Description of the Change
DataSourceOVF: use util.find_
DataSourceOVF attempts to find iso files via walking os.listdir('/dev/')
which is far too wide. This approach is too invasive and can sometimes
race with systemd attempting to fsck and mount devices.
Instead, utilize cloudinit.
criteria (which uses blkid under the covers). This should result in fewer
attempts to mount block devices which are not actual iso filesystems.
| Scott Moser (smoser) wrote : | # |
| Scott Moser (smoser) wrote : | # |
also seems quite possible that just changing the 'default_regex' to drop 'xvd.*' would probably fix ec2. that explains why we only see this on ec2 and not elsewhere (other than places that have 'hd' disks, which is unlikely at this point in the world).
you can/should/
| Ryan Harper (raharper) wrote : | # |
As it turns out, two parallel instances of /bin/mount pointing to the same device will cause one to fail; it appears that there is some sort of locking/
This sort of race is recreatable outside of cloud-init:
# rw mount loop
while true; do
umount -f ${MNT_POINT} |:
mount ${DISK} ${MNT_POINT} -t ext4 -o defaults || {
echo "`date +%s.%N`: mount failed on COUNT=$COUNT";
exit 1
}
echo "`date +%s.%N`: COUNT=$COUNT" > ${MNT_POINT}
COUNT=$(($COUNT + 1))
done
# ro mount loop
while true; do
dd if=${DISK} bs=512 count=1 of=/dev/null
# we expect this to fail, that's OK
mount -o ro,sync -t iso9660 ${DISK} ${MNT_POINT} |:
echo "`date +%s.%N`: READ COUNT=$COUNT"
COUNT=$(($COUNT + 1))
done
| Ryan Harper (raharper) wrote : | # |
I've kept the regex filter as suggested and added unittests.
PASSED: Continuous integration, rev:2699208b16a
https:/
Executed test runs:
SUCCESS: Checkout
SUCCESS: Unit & Style Tests
SUCCESS: Ubuntu LTS: Build
SUCCESS: Ubuntu LTS: Integration
SUCCESS: MAAS Compatability Testing
IN_PROGRESS: Declarative: Post Actions
Click here to trigger a rebuild:
https:/
| Ryan Harper (raharper) wrote : | # |
> As it turns out, two parallel instances of /bin/mount pointing
> to the same device will cause one to fail; it appears that there
> is some sort of locking/
> that results in one of the two mount processes getting EBUSY as a result.
In particular, when we open the block device, the kernel sets
FMODE_EXCL flag which prevents additional opens of the same device
where the filesystem is of a different type.
This is expected behavior by the kernel (and our setup).
| Scott Moser (smoser) wrote : | # |
some small comments.
| Ryan Harper (raharper) wrote : | # |
I'll add the os.path.basename(); should we do that in both places?
| Scott Moser (smoser) wrote : | # |
clean up the comment i guess. we can use os.path.basename in both places.
PASSED: Continuous integration, rev:6e277137010
https:/
Executed test runs:
SUCCESS: Checkout
SUCCESS: Unit & Style Tests
SUCCESS: Ubuntu LTS: Build
SUCCESS: Ubuntu LTS: Integration
SUCCESS: MAAS Compatability Testing
IN_PROGRESS: Declarative: Post Actions
Click here to trigger a rebuild:
https:/


I'm pretty sure that we end up trying to mount everything with:
mount -t iso966o <device> </tmp/mountpoint>
so i'd have thought that that would fail really early, i'm not sure how this would cause issues with other mounts, unless there was actually iso9660 filesystems on those devices. which would be odd.
also, please file a bug