Merge ~raharper/curtin:fix/lvm-over-raid into curtin:master

Proposed by Ryan Harper on 2018-07-26
Status: Merged
Approved by: Scott Moser on 2018-08-17
Approved revision: ed603a4e7eba552cb4345961c8dca5a71de8e343
Merge reported by: Server Team CI bot
Merged at revision: not available
Proposed branch: ~raharper/curtin:fix/lvm-over-raid
Merge into: curtin:master
Diff against target: 353 lines (+223/-24)
8 files modified
curtin/block/clear_holders.py (+3/-0)
curtin/block/lvm.py (+23/-5)
examples/tests/dirty_disks_config.yaml (+30/-3)
examples/tests/lvmoverraid.yaml (+98/-0)
tests/unittests/test_block_lvm.py (+13/-13)
tests/unittests/test_clear_holders.py (+5/-2)
tests/vmtests/test_lvm_raid.py (+50/-0)
tests/vmtests/test_lvm_root.py (+1/-1)
Reviewer Review Type Date Requested Status
Server Team CI bot continuous-integration Approve on 2018-08-17
Scott Moser 2018-07-26 Approve on 2018-08-17
Review via email: mp+351384@code.launchpad.net

Commit message

clear-holders: rescan for lvm devices after assembling raid arrays

LVM devices can only be found after assembling raid arrays. Add a call
to lvm_scan, then activate any discovered VGs and LVs.

LP: #1783413

Ryan Harper (raharper) wrote :

Fixing up trusty; its lvm2 is fickle without lvmetad.

Scott Moser (smoser) :
Ryan Harper (raharper) wrote :

OK, I think I've got trusty sorted out.

https://jenkins.ubuntu.com/server/job/curtin-vmtest-devel-debug/80/console

It's now passing, so I think we can review this.

Scott Moser (smoser) wrote :

I had some comments inline in my review above.
Please address or explain.

review: Needs Information
Ryan Harper (raharper) wrote :

> is there a reason not to have lvm_scan do the udevadm_settle ?
> it seems like all callers are probably interested in having that happen.
>
> maybe we even have race conditions elsewhere as I don't see anywhere else
> that does use settle after calling this.

In blockmeta, creating a volgroup does not create any device nodes, but creating the
logical volumes will. We don't currently do a devsync() (which does a settle using
--exit-if-exists=/path/to/device/expected to get a fast exit path once the
device node has been created) at the end of lvm logical volume creation, but
the subsequent user of the lvm device calls get_path_to_storage(), which
will do a devsync().

I don't think we have any race conditions due to the calls to scan in block-meta;
the devsync ensures we don't.

And in general, adding the settle to the scan can't use the more specific
--exit-if-exists parameter, since in clear-holders we don't know what device
to expect.

So I generally think we should leave the need for a settle up to the caller,
outside of the scan itself.
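For reference, the settle pattern described above can be sketched roughly like this (hypothetical helper name and injectable runner for illustration; curtin's real devsync() differs in detail):

```python
import subprocess

def settle(exists=None, runner=subprocess.check_call):
    """Build and run a 'udevadm settle' command.

    When we know which device node to expect, --exit-if-exists gives a
    fast exit path once that node appears; in clear-holders we do not
    know what to expect, so a plain settle is used.
    """
    cmd = ['udevadm', 'settle']
    if exists:
        cmd.append('--exit-if-exists=%s' % exists)
    runner(cmd)
    return cmd
```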

> can we make this '--activate=ay' instead of 2 parms?
> It's much clearer that way. I thought you'd typo'd
> your commit message with a rogue 'ay' in it.

Yes

> comments needed in dirty-disk yaml for shutting down lvm/raid

Yes, will add comments.

Scott Moser (smoser) wrote :

> > is there a reason not to have lvm_scan do the udevadm_settle ?
> > it seems like all callers are probably interested in having that happen.
> >
> > maybe we even have race conditions elsewhere as I don't see anywhere else
> > that does use settle after calling this.
>
> In blockmeta, creating a volgroup does not create any device nodes, but

But if that is the first time that 'lvm_scan' is run, I'm not sure
how that would not possibly create device nodes.

> creating the logical volumes will. We don't currently do a devsync()
> (which does a settle using the --exit-if-exists=/path/to/device/expected

I generally think that '--exit-if-exists=' is a bug.
consider this:
 a.) /dev/sda has 1 partition on it, so '/dev/sda1' exists (from previous
install).
 b.) we repartition /dev/sda, blockdev --rereadpt, and then call
   'udevadm settle --exit-if-exists=/dev/sda1'
 c.) udevadm checks to see if /dev/sda1 exists. It does (it always did)
 d.) udev events continue, and re-read the partition table, delete all
     the old partitions and re-create them.
 e.) while 'd' is happening our code thinks it is fine to use /dev/sda1

I think we have this general fallacy in logic at places.

/dev/sda1 above could just as well be /dev/vg0 or any other device.

> to have a fast exit path if the device node has been created) at the end
> of lvm logical volume creation, but the subsequent call of the user
> of the lvm device uses get_path_to_storage() which will do a devsync().

udevadm settle is *not* a slow operation if there are no events in the
queue. It is a very important one if there *are* events in the queue.

  $ time udevadm settle
  real 0m0.013s
  user 0m0.004s
  sys 0m0.001s

So... I'd much rather have any functions that may cause udevadm events
call settle by default and be safe than try to shave off an extra
1/100th of a second in an OS install.

Ryan Harper (raharper) wrote :

On Wed, Aug 1, 2018 at 8:13 AM Scott Moser <email address hidden> wrote:
>
> > > is there a reason not to have lvm_scan do the udevadm_settle ?
> > > it seems like all callers are probably interested in having that happen.
> > >
> > > maybe we even have race conditions elsewhere as I don't see anywhere else
> > > that does use settle after calling this.
> >
> > In blockmeta, creating a volgroup does not create any device nodes, but
>
> But if that is the first time that 'lvm_scan' is run, I'm not sure
> how that would not possibly create device nodes.

A volume group is not directly accessible as a block device. You have
to create an LV on a VG to get an accessible block device.

>
> > creating the logical volumes will. We don't currently do a devsync()
> > (which does a settle using the --exit-if-exists=/path/to/device/expected
>
> I generally think that '--exit-if-exists=' is a bug.
> consider this:
> a.) /dev/sda has 1 partition on it, so '/dev/sda1' exists (from previous
> install).
> b.) we repartition /dev/sda, blockdev --rereadpt, and then call
> 'udevadm settle --exit-if-exists=/dev/sda1'

We wipe the disk's partition table, and then force a reread of the
partition table, ensuring that the disk no longer has *any* partitions,
whatever their names or locations were.

The disk is clean before we create any new partitions.

> c.) udevadm checks to see if /dev/sda1 exists. It does (it always did)
> d.) udev events continue, and re-read the partition table, delete all
> the old partitions and re-create them.
> e.) while 'd' is happening our code thinks it is fine to use /dev/sda1
>
> I think we have this general fallacy in logic at places.

I disagree. The order of events in curtin is as follows:

1) clear-holders calls wipe_superblock on /dev/sda (which has a /dev/sda1)
2) if the device being wiped has partitions (it does), then we wipe the
   partition table, reread it, settle, and query the disk for any
   partitions, repeating until there are no partitions reported from the
   kernel; we fail if we cannot wipe the partitions
3) once all disks are clear (no partitions are left on any disk specified
   with wipe: superblock):
4) we create a partition table (empty) on /dev/sda
5) we create a first partition on /dev/sda, say /dev/sda1
6) we query the path to /dev/sda1 via get_path_to_storage(), which calls
   devsync(), which does a udevadm settle --exit-if-exists

There isn't a logic bug here because we know that *we* were the ones
to create the partition and we know what the expected /dev path is.
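The wipe-and-verify loop in step 2 can be sketched as follows (hypothetical helper names, with the wipe/reread/query steps injected so the retry logic is visible; curtin's real clear_holders code differs in detail):

```python
def wipe_until_clear(disk, wipe_table, reread_and_settle, get_partitions,
                     max_tries=5):
    # repeatedly wipe the partition table and re-check until the kernel
    # reports no partitions on the disk; fail if it never comes clean
    for _ in range(max_tries):
        wipe_table(disk)
        reread_and_settle(disk)
        if not get_partitions(disk):
            return True
    raise RuntimeError('could not clear partitions on %s' % disk)
```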

>
> /dev/sda1 above could just as well be /dev/vg0 or any other device.

It won't be a volume group, but yes, any other block device that we
create. The logic above holds for those as well. Where do you think
it fails after accounting for how we wipe devices and ensure they're
clear (including partitions)?

>
> > to have a fast exit path if the device node has been created) at the end
> > of lvm logical volume creation, but the subsequent call of the user
> > of the lvm device uses get_path_to_storage() which will do a devsync().
>
> udevadm settle is *not* a slow op...


Scott Moser (smoser) wrote :

> > > creating the logical volumes will. We don't currently do a devsync()
> > > (which does a settle using the --exit-if-exists=/path/to/device/expected
> >
> > I generally think that '--exit-if-exists=' is a bug.
> > consider this:
> > a.) /dev/sda has 1 partition on it, so '/dev/sda1' exists (from previous
> > install).
> > b.) we repartition /dev/sda, blockdev --rereadpt, and and then call
> > 'udevadm settle --exit-if-exists=/dev/sda1'
>
> We wipe the disk's partition table, and then force a reread of the
> partition table
> and ensure that the disk no longer has *any* partitions, whatever
> their name was or the location.
>
> The disk is clean before we create any new partitions.
>
> > c.) udevadm checks to see if /dev/sda1 exists. It does (it always did)
> > d.) udev events continue, and re-read the partition table, delete all
> > the old partitions and re-create them.
> > e.) while 'd' is happening our code thinks it is fine to use /dev/sda1
> >
> > I think we have this general fallacy in logic at places.
>
> I disagree. The order of events that happen in curtin are as follows:
>
> 1) clear-holders calls wipe_superblock on /dev/sda (which has a /dev/sda1)
> 2) if the device being wiped has partitions (it does), then we wipe
> the partition table
> and then reread the partition table, settle and query the disk
> for any partitions, repeat
> until there are no partitions reported from the kernel, we fail
> if we cannot wipe the partitions.
> 3) after all disks are clear (no partitions are left on any disk
> specified which has wipe superblock
> 4) we create a partition table (empty) on /dev/sda
> 5) we create a first partition on /dev/sda, say /dev/sda1
> 6) we query the path to /dev/sda1 via the get_path_to_storage() which
> calls devsync() which
> does a udevadm settle --exit-if-exists
>
> There isn't a logic bug here because we know that *we* were the ones
> to create the partition and we know what the expected /dev path is.
>
> >
> > /dev/sda1 above could just as well be /dev/vg0 or any other device.
>
> It won't be a volume group, but yes, any other block device that we
> create. The logic above holds for those as well. Where do you think
> it fails after accounting for how we wipe devices and ensure they're
> clear (including partitions)?
>
>
> >
> > > to have a fast exit path if the device node has been created) at the end
> > > of lvm logical volume creation, but the subsequent call of the user
> > > of the lvm device uses get_path_to_storage() which will do a devsync().
> >
> > udevadm settle is *not* a slow operation if there are no events in the
> > queue. It is a very important one if there *is* events in the queue.
> >
> > $ time udevadm settle
> > real 0m0.013s
> > user 0m0.004s
> > sys 0m0.001s
>
> At worst, the default timeout for udevadm settle is 120
> seconds; and without an --exit-if-exists, we wait on *all* events in
> the queue at the time of the settle call.

>
> >
> > So... I'd much rather have any functions that may cause udevadm events
> > to call settle by default and be safe than try to shave off an extra
> > 1/100th of a second in a OS...


Ryan Harper (raharper) wrote :

On Wed, Aug 8, 2018 at 9:58 AM Scott Moser <email address hidden> wrote:
>
> > > > creating the logical volumes will. We don't currently do a devsync()
> > > > (which does a settle using the --exit-if-exists=/path/to/device/expected
> > >
> > > I generally think that '--exit-if-exists=' is a bug.
> > > consider this:
> > > a.) /dev/sda has 1 partition on it, so '/dev/sda1' exists (from previous
> > > install).
> > > b.) we repartition /dev/sda, blockdev --rereadpt, and and then call
> > > 'udevadm settle --exit-if-exists=/dev/sda1'
> >
> > We wipe the disk's partition table, and then force a reread of the
> > partition table
> > and ensure that the disk no longer has *any* partitions, whatever
> > their name was or the location.
> >
> > The disk is clean before we create any new partitions.
> >
> > > c.) udevadm checks to see if /dev/sda1 exists. It does (it always did)
> > > d.) udev events continue, and re-read the partition table, delete all
> > > the old partitions and re-create them.
> > > e.) while 'd' is happening our code thinks it is fine to use /dev/sda1
> > >
> > > I think we have this general fallacy in logic at places.
> >
> > I disagree. The order of events that happen in curtin are as follows:
> >
> > 1) clear-holders calls wipe_superblock on /dev/sda (which has a /dev/sda1)
> > 2) if the device being wiped has partitions (it does), then we wipe
> > the partition table
> > and then reread the partition table, settle and query the disk
> > for any partitions, repeat
> > until there are no partitions reported from the kernel, we fail
> > if we cannot wipe the partitions.
> > 3) after all disks are clear (no partitions are left on any disk
> > specified which has wipe superblock
> > 4) we create a partition table (empty) on /dev/sda
> > 5) we create a first partition on /dev/sda, say /dev/sda1
> > 6) we query the path to /dev/sda1 via the get_path_to_storage() which
> > calls devsync() which
> > does a udevadm settle --exit-if-exists
> >
> > There isn't a logic bug here because we know that *we* were the ones
> > to create the partition and we know what the expected /dev path is.
> >
> > >
> > > /dev/sda1 above could just as well be /dev/vg0 or any other device.
> >
> > It won't be a volume group, but yes, any other block device that we
> > create. The logic above holds for those as well. Where do you think
> > it fails after accounting for how we wipe devices and ensure they're
> > clear (including partitions)?
> >
> >
> > >
> > > > to have a fast exit path if the device node has been created) at the end
> > > > of lvm logical volume creation, but the subsequent call of the user
> > > > of the lvm device uses get_path_to_storage() which will do a devsync().
> > >
> > > udevadm settle is *not* a slow operation if there are no events in the
> > > queue. It is a very important one if there *is* events in the queue.
> > >
> > > $ time udevadm settle
> > > real 0m0.013s
> > > user 0m0.004s
> > > sys 0m0.001s
> >
> > At worst, the default timeout for udevadm settle is 120
> > seconds; and without an --exit-if-exists, we wait on *all* events in
> > the queue at the time ...


Scott Moser (smoser) wrote :

> > Nonsense.
> > have you ever seen 'udevadm settle' take 120 seconds? Can you give
> > an example of when it *could* do that? The only thing I can reasonably
>
> https://ask.fedoraproject.org/en/question/81006/slow-boot-time-udev-settle/
> https://askubuntu.com/questions/888010/about-systemd-udev-settle-service-and-
> slow-booting
> https://bugzilla.redhat.com/show_bug.cgi?id=735866
> https://bbs.archlinux.org/viewtopic.php?id=189369
> https://forums.opensuse.org/showthread.php/506828-Udev-3-minute-delay-on-boot
> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=754987
> https://lists.debian.org/debian-user/2015/04/msg01592.html

Every single example link above is in regard to slow boot times.
Boot is a time when there is a flurry of udev events; curtin does
not run in that time frame, which is a *huge* outlier in the general
life of a system.

In order to get actual numbers on how long we spend time waiting for
udev events that I think are important, I put up:
 https://code.launchpad.net/~smoser/curtin/+git/curtin/+merge/352791

A full vmtest run produced 13098 calls to udevadm settle over 257
installs. Of those calls, 1555 took over 0.1 seconds, the longest
coming in during TrustyHWEXTestRaid5Bcache at 1.809 seconds.
 http://paste.ubuntu.com/p/t2k6tDw6x7/

Here is a log of total time spent in settle along with total time
spent in 'curtin install'.
 http://paste.ubuntu.com/p/rNsqfpmxkr/
Admittedly the total time we spend "waiting" can be 10% of the
time of our install, but I'd still argue that those waits are
almost certainly valid.

> Settle can block for up to 120 seconds whenever one or more events
> in the queue at the time settle is called do not get processed.
> The event in queue may be unrelated to storage; it can be any event
> in the system (net, block, usb); settle doesn't care.

And do you think that in the time frame when an installer is running
there is likely a flurry of net or usb events coming in?
Go ahead, find a busy system that is not booting and run 'udevadm monitor'.
See if you see a single event over the course of 10 minutes or so.

I gave that a try: I ran 'lxc launch' on a lxd host with zfs. I was honestly
floored at how many events came through (1163 UDEV, 1163 KERNEL).
I ran a loop of 'time udevadm settle' during that, and it never took more
than 1/100th of a second.
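The 'time udevadm settle' loop can be wrapped in a tiny measurement helper (illustrative only; the runner is injectable so the timing logic does not require udev):

```python
import time

def timed_settle(runner, timeout=120):
    # invoke 'udevadm settle' via the supplied runner and return how many
    # seconds the call spent waiting on the udev event queue
    cmd = ['udevadm', 'settle', '--timeout=%d' % timeout]
    start = time.monotonic()
    runner(cmd)
    return time.monotonic() - start
```

With an empty queue this returns in milliseconds; with queued, unprocessed events it can block up to the timeout (120 seconds by default), which is the behavior both sides are debating.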

> > come up with are disk based operations on large md or something.
> > Those are exactly the reasons we want to settle. Anything else that
> > was seemingly unrelated but is taking significant time honestly
> > probably something important to an installer. I'd much rather
> > see a log entry that showed a timeout failed on a settle than
> > try to diagnose some related failure.
>
> Recall this discussion:
> https://bugs.launchpad.net/cloud-init/+bug/1766287
> https://code.launchpad.net/~raharper/cloud-init/+git/cloud-
> init/+merge/344198

Both of our responses on that bug were that we'd much rather
have *stable* behavior than fast behavior. That is what I'm arguing for here,
so I'm confused by your pointing to it.

> My argument is that we should only call settle when we *need* it.
> Extra calls to settle just because at best waste a...


Ryan Harper (raharper) wrote :

jenkins vmtest run approves:

https://jenkins.ubuntu.com/server/job/curtin-vmtest-devel-debug/91/console

----------------------------------------------------------------------
Ran 3198 tests in 13805.907s

OK (SKIP=288)
Mon, 13 Aug 2018 18:33:50 +0000: vmtest end [0] in 13809s

Scott Moser (smoser) wrote :

Bah. A comment I had made a week ago inline didn't get saved.

This does look really good at this point, though.

Ryan Harper (raharper) wrote :

On Fri, Aug 17, 2018 at 7:51 AM Scott Moser <email address hidden> wrote:
>
> bah. a comment i had made a week ago inline that didn't get saved.
>
> this does look really good though at this point.
>
>
> Diff comments:
>
> > diff --git a/examples/tests/lvmoverraid.yaml b/examples/tests/lvmoverraid.yaml
>
> By adding the lvm stop and mdadm stop you are pushing us through a
> path that we'd previously not been through. But aren't you also ensuring
> we do not hit the previously tested (and more common) code path?
> At least for lvm, wouldn't the ephemeral environment normally have
> the pvscan running?

Thanks for raising this.

In the previous "common" case, the LVMs were never removed (the kernel
still knew of them, device nodes existed) since we never performed
the stop; and they were never "rediscovered" by a scan, since the lvm
tools examine /etc/lvm/* for config before looking at the disk. I
confirmed this: attempting to reproduce the failure only worked
once I added the above code to the dirty-disk mode.

I believe that what we are doing now matches more closely what
curtin would see on first boot of a system which has LVM on block
devices but where the OS (our ephemeral image) has no record of it, and
it can *only* be found via our lvm_scan() + vg_activate().
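Concretely, the rediscovery path in this branch reduces to the commands below (a sketch mirroring the diff's lvm_scan() and activate_volgroups(); the lvmetad check is simplified to a boolean):

```python
def lvm_rediscover_cmds(lvmetad_running):
    # mirror lvm_scan(): rescan PVs and VGs, adding --cache only when
    # lvmetad is running so its cached state is refreshed too
    cmds = [['pvscan'], ['vgscan', '--mknodes']]
    if lvmetad_running:
        cmds = [cmd + ['--cache'] for cmd in cmds]
    # then activate_volgroups(): bring discovered VGs/LVs online
    cmds.append(['vgchange', '--activate=y'])
    return cmds
```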

>
> > new file mode 100644
> > index 0000000..a1d41e9
>
>
> --
> https://code.launchpad.net/~raharper/curtin/+git/curtin/+merge/351384
> You are the owner of ~raharper/curtin:fix/lvm-over-raid.

Scott Moser (smoser) wrote :

Sounds reasonable.

On Fri, Aug 17, 2018 at 10:24 AM Ryan Harper <email address hidden>
wrote:

> On Fri, Aug 17, 2018 at 7:51 AM Scott Moser <email address hidden>
> wrote:
> >
> > bah. a comment i had made a week ago inline that didn't get saved.
> >
> > this does look really good though at this point.
> >
> >
> > Diff comments:
> >
> > > diff --git a/examples/tests/lvmoverraid.yaml
> b/examples/tests/lvmoverraid.yaml
> >
> > By adding the lvm stop and mdadm stop you are pushing us through a very
> close path that we'd previously not been through. But aren't you also
> ensuring we do not hit the previously tested (and more common) code path?
> At least for lvm, wouldn't the ephemeral environment normally have
> the pvscan running?
>
> Thanks for raising this.
>
> In the previous "common" case, the LVMs were never removed (kernel
> still knew of them, device nodes existed) since we've never performed
> the stop; and they were never "rediscovered" by a scan since the lvm
> tools examine /etc/lvm/* for config before looking at the disk. I
> confirmed this when attempting to reproduce the failure only worked
> once I added the above code to the dirty-disk mode.
>
> I believe that what we are doing now matches more closely to what
> curtin would see on first-boot of a system which has LVM on block
> devices but the OS (our ephemeral image) will have no record of it and
> it can *only* be found via our lvm_scan() + vg_activate().
>
>
>
> >
> > > new file mode 100644
> > > index 0000000..a1d41e9
> >
> >
> > --
> > https://code.launchpad.net/~raharper/curtin/+git/curtin/+merge/351384
> > You are the owner of ~raharper/curtin:fix/lvm-over-raid.
>
> --
> https://code.launchpad.net/~raharper/curtin/+git/curtin/+merge/351384
> You are reviewing the proposed merge of ~raharper/curtin:fix/lvm-over-raid
> into curtin:master.
>

Scott Moser (smoser) wrote :

sounds good.

review: Approve

FAILED: Autolanding.
More details in the following jenkins job:
https://jenkins.ubuntu.com/server/job/curtin-autoland-test/15/
Executed test runs:
    None: https://jenkins.ubuntu.com/server/job/admin-lp-git-autoland/52/console

review: Needs Fixing (continuous-integration)
review: Approve (continuous-integration)

Preview Diff

1diff --git a/curtin/block/clear_holders.py b/curtin/block/clear_holders.py
2index 9d73b28..a2042d5 100644
3--- a/curtin/block/clear_holders.py
4+++ b/curtin/block/clear_holders.py
5@@ -624,6 +624,9 @@ def start_clear_holders_deps():
6 # all disks and partitions should be sufficient to remove the mdadm
7 # metadata
8 mdadm.mdadm_assemble(scan=True, ignore_errors=True)
9+ # scan and activate for logical volumes
10+ lvm.lvm_scan()
11+ lvm.activate_volgroups()
12 # the bcache module needs to be present to properly detect bcache devs
13 # on some systems (precise without hwe kernel) it may not be possible to
14 # lad the bcache module bcause it is not present in the kernel. if this
15diff --git a/curtin/block/lvm.py b/curtin/block/lvm.py
16index 8643245..eca64f6 100644
17--- a/curtin/block/lvm.py
18+++ b/curtin/block/lvm.py
19@@ -57,14 +57,32 @@ def lvmetad_running():
20 '/run/lvmetad.pid'))
21
22
23-def lvm_scan():
24+def activate_volgroups():
25+ """
26+ Activate available volgroups and logical volumes within.
27+
28+ # found
29+ % vgchange -ay
30+ 1 logical volume(s) in volume group "vg1sdd" now active
31+
32+ # none found (no output)
33+ % vgchange -ay
34+ """
35+
36+ # vgchange handles syncing with udev by default
37+ # see man 8 vgchange and flag --noudevsync
38+ out, _ = util.subp(['vgchange', '--activate=y'], capture=True)
39+ if out:
40+ LOG.info(out)
41+
42+
43+def lvm_scan(activate=True):
44 """
45 run full scan for volgroups, logical volumes and physical volumes
46 """
47- # the lvm tools lvscan, vgscan and pvscan on ubuntu precise do not
48- # support the flag --cache. the flag is present for the tools in ubuntu
49- # trusty and later. since lvmetad is used in current releases of
50- # ubuntu, the --cache flag is needed to ensure that the data cached by
51+ # prior to xenial, lvmetad is not packaged, so even if a tool supports
52+ # flag --cache it has no effect. In Xenial and newer the --cache flag is
53+ # used (if lvmetad is running) to ensure that the data cached by
54 # lvmetad is updated.
55
56 # before appending the cache flag though, check if lvmetad is running. this
57diff --git a/examples/tests/dirty_disks_config.yaml b/examples/tests/dirty_disks_config.yaml
58index 75d44c3..fb9a0d6 100644
59--- a/examples/tests/dirty_disks_config.yaml
60+++ b/examples/tests/dirty_disks_config.yaml
61@@ -27,6 +27,31 @@ bucket:
62 # disable any rpools to trigger disks with zfs_member label but inactive
63 # pools
64 zpool export rpool ||:
65+ - &lvm_stop |
66+ #!/bin/sh
67+ # This function disables any existing lvm logical volumes that
68+ # have been created during the early storage config stage
69+ # and simulates the effect of booting into a system with existing
70+ # (but inactive) lvm configuration.
71+ for vg in `pvdisplay -C --separator = -o vg_name --noheadings`; do
72+ vgchange -an $vg ||:
73+ done
74+ # disable the automatic pvscan, we want to test that curtin
75+ # can find/enable logical volumes without this service
76+ command -v systemctl && systemctl mask lvm2-pvscan\@.service
77+ # remove any existing metadata written from early disk config
78+ rm -rf /etc/lvm/archive /etc/lvm/backup
79+ - &mdadm_stop |
80+ #!/bin/sh
81+ # This function disables any existing raid devices which may
82+ # have been created during the early storage config stage
83+ # and simulates the effect of booting into a system with existing
84+ # but inactive mdadm configuration.
85+ for md in /dev/md*; do
86+ mdadm --stop $md ||:
87+ done
88+ # remove any existing metadata written from early disk config
89+ rm -f /etc/mdadm/mdadm.conf
90
91 early_commands:
92 # running block-meta custom from the install environment
93@@ -34,9 +59,11 @@ early_commands:
94 # the disks exactly as in this config before the rest of the install
95 # will just blow it all away. We have clean out other environment
96 # that could unintentionally mess things up.
97- blockmeta: [env, -u, OUTPUT_FSTAB,
98+ 01-blockmeta: [env, -u, OUTPUT_FSTAB,
99 TARGET_MOUNT_POINT=/tmp/my.bdir/target,
100 WORKING_DIR=/tmp/my.bdir/work.d,
101 curtin, --showtrace, -v, block-meta, --umount, custom]
102- enable_swaps: [sh, -c, *swapon]
103- disable_rpool: [sh, -c, *zpool_export]
104+ 02-enable_swaps: [sh, -c, *swapon]
105+ 03-disable_rpool: [sh, -c, *zpool_export]
106+ 04-lvm_stop: [sh, -c, *lvm_stop]
107+ 05-mdadm_stop: [sh, -c, *mdadm_stop]
108diff --git a/examples/tests/lvmoverraid.yaml b/examples/tests/lvmoverraid.yaml
109new file mode 100644
110index 0000000..a1d41e9
111--- /dev/null
112+++ b/examples/tests/lvmoverraid.yaml
113@@ -0,0 +1,98 @@
114+storage:
115+ config:
116+ - grub_device: true
117+ id: disk-0
118+ model: QEMU_HARDDISK
119+ name: 'main_disk'
120+ serial: disk-a
121+ preserve: false
122+ ptable: gpt
123+ type: disk
124+ wipe: superblock
125+ - grub_device: false
126+ id: disk-2
127+ name: 'disk-2'
128+ serial: disk-b
129+ preserve: false
130+ type: disk
131+ wipe: superblock
132+ - grub_device: false
133+ id: disk-1
134+ name: 'disk-1'
135+ serial: disk-c
136+ preserve: false
137+ type: disk
138+ wipe: superblock
139+ - grub_device: false
140+ id: disk-3
141+ name: 'disk-3'
142+ serial: disk-d
143+ preserve: false
144+ type: disk
145+ wipe: superblock
146+ - grub_device: false
147+ id: disk-4
148+ name: 'disk-4'
149+ serial: disk-e
150+ preserve: false
151+ type: disk
152+ wipe: superblock
153+ - device: disk-0
154+ flag: bios_grub
155+ id: part-0
156+ preserve: false
157+ size: 1048576
158+ type: partition
159+ - device: disk-0
160+ flag: ''
161+ id: part-1
162+ preserve: false
163+ size: 4G
164+ type: partition
165+ - devices:
166+ - disk-2
167+ - disk-1
168+ id: raid-0
169+ name: md0
170+ raidlevel: 1
171+ spare_devices: []
172+ type: raid
173+ - devices:
174+ - disk-3
175+ - disk-4
176+ id: raid-1
177+ name: md1
178+ raidlevel: 1
179+ spare_devices: []
180+ type: raid
181+ - devices:
182+ - raid-0
183+ - raid-1
184+ id: vg-0
185+ name: vg0
186+ type: lvm_volgroup
187+ - id: lv-0
188+ name: lv-0
189+ size: 3G
190+ type: lvm_partition
191+ volgroup: vg-0
192+ - fstype: ext4
193+ id: fs-0
194+ preserve: false
195+ type: format
196+ volume: part-1
197+ - fstype: ext4
198+ id: fs-1
199+ preserve: false
200+ type: format
201+ volume: lv-0
202+ - device: fs-0
203+ id: mount-0
204+ path: /
205+ type: mount
206+ - device: fs-1
207+ id: mount-1
208+ path: /home
209+ type: mount
210+ version: 1
211+
212diff --git a/tests/unittests/test_block_lvm.py b/tests/unittests/test_block_lvm.py
213index 341f2fa..22fb064 100644
214--- a/tests/unittests/test_block_lvm.py
215+++ b/tests/unittests/test_block_lvm.py
216@@ -75,24 +75,24 @@ class TestBlockLvm(CiTestCase):
217 @mock.patch('curtin.block.lvm.util')
218 def test_lvm_scan(self, mock_util, mock_lvmetad):
219 """check that lvm_scan formats commands correctly for each release"""
220+ cmds = [['pvscan'], ['vgscan', '--mknodes']]
221 for (count, (codename, lvmetad_status, use_cache)) in enumerate(
222- [('precise', False, False), ('precise', True, False),
223- ('trusty', False, False), ('trusty', True, True),
224- ('vivid', False, False), ('vivid', True, True),
225- ('wily', False, False), ('wily', True, True),
226+ [('precise', False, False),
227+ ('trusty', False, False),
228 ('xenial', False, False), ('xenial', True, True),
229- ('yakkety', True, True), ('UNAVAILABLE', True, True),
230 (None, True, True), (None, False, False)]):
231 mock_util.lsb_release.return_value = {'codename': codename}
232 mock_lvmetad.return_value = lvmetad_status
233 lvm.lvm_scan()
234- self.assertEqual(
235- len(mock_util.subp.call_args_list), 2 * (count + 1))
236- for (expected, actual) in zip(
237- [['pvscan'], ['vgscan', '--mknodes']],
238- mock_util.subp.call_args_list[2 * count:2 * count + 2]):
239- if use_cache:
240- expected.append('--cache')
241- self.assertEqual(mock.call(expected, capture=True), actual)
242+ expected = [cmd for cmd in cmds]
243+ for cmd in expected:
244+ if lvmetad_status:
245+ cmd.append('--cache')
246+
247+ calls = [mock.call(cmd, capture=True) for cmd in expected]
248+ self.assertEqual(len(expected), len(mock_util.subp.call_args_list))
249+ mock_util.subp.has_calls(calls)
250+ mock_util.subp.reset_mock()
251+
252
253 # vi: ts=4 expandtab syntax=python
254diff --git a/tests/unittests/test_clear_holders.py b/tests/unittests/test_clear_holders.py
255index 6c29171..21f76be 100644
256--- a/tests/unittests/test_clear_holders.py
257+++ b/tests/unittests/test_clear_holders.py
258@@ -779,10 +779,12 @@ class TestClearHolders(CiTestCase):
259 mock_gen_holders_tree.return_value = self.example_holders_trees[1][1]
260 clear_holders.assert_clear(device)
261
262+ @mock.patch('curtin.block.clear_holders.lvm')
263 @mock.patch('curtin.block.clear_holders.zfs')
264 @mock.patch('curtin.block.clear_holders.mdadm')
265 @mock.patch('curtin.block.clear_holders.util')
266- def test_start_clear_holders_deps(self, mock_util, mock_mdadm, mock_zfs):
267+ def test_start_clear_holders_deps(self, mock_util, mock_mdadm, mock_zfs,
268+ mock_lvm):
269 mock_zfs.zfs_supported.return_value = True
270 clear_holders.start_clear_holders_deps()
271 mock_mdadm.mdadm_assemble.assert_called_with(
272@@ -790,11 +792,12 @@ class TestClearHolders(CiTestCase):
273 mock_util.load_kernel_module.assert_has_calls([
274 mock.call('bcache'), mock.call('zfs')])
275
276+ @mock.patch('curtin.block.clear_holders.lvm')
277 @mock.patch('curtin.block.clear_holders.zfs')
278 @mock.patch('curtin.block.clear_holders.mdadm')
279 @mock.patch('curtin.block.clear_holders.util')
280 def test_start_clear_holders_deps_nozfs(self, mock_util, mock_mdadm,
281- mock_zfs):
282+ mock_zfs, mock_lvm):
283 """test that we skip zfs modprobe on unsupported platforms"""
284 mock_zfs.zfs_supported.return_value = False
285 clear_holders.start_clear_holders_deps()
286diff --git a/tests/vmtests/test_lvm_raid.py b/tests/vmtests/test_lvm_raid.py
287new file mode 100644
288index 0000000..0c50941
289--- /dev/null
290+++ b/tests/vmtests/test_lvm_raid.py
291@@ -0,0 +1,50 @@
292+# This file is part of curtin. See LICENSE file for copyright and license info.
293+
294+from .releases import base_vm_classes as relbase
295+from .test_mdadm_bcache import TestMdadmAbs
296+from .test_lvm import TestLvmAbs
297+
298+import textwrap
299+
300+
301+class TestLvmOverRaidAbs(TestMdadmAbs, TestLvmAbs):
302+ conf_file = "examples/tests/lvmoverraid.yaml"
303+ active_mdadm = "2"
304+ nr_cpus = 2
305+ dirty_disks = True
306+ extra_disks = ['10G'] * 4
307+
308+ collect_scripts = TestLvmAbs.collect_scripts
309+ collect_scripts += TestMdadmAbs.collect_scripts + [textwrap.dedent("""
310+ cd OUTPUT_COLLECT_D
311+ ls -al /dev/md* > dev_md
312+ cp -a /etc/mdadm etc_mdadm
313+ cp -a /etc/lvm etc_lvm
314+ """)]
315+
316+ fstab_expected = {
317+ '/dev/vg1/lv1': '/srv/data',
318+ '/dev/vg1/lv2': '/srv/backup',
319+ }
320+ disk_to_check = [('main_disk', 1),
321+ ('md0', 0),
322+ ('md1', 0)]
323+
324+ def test_lvs(self):
325+ self.check_file_strippedline("lvs", "lv-0=vg0")
326+
327+ def test_pvs(self):
328+ self.check_file_strippedline("pvs", "vg0=/dev/md0")
329+ self.check_file_strippedline("pvs", "vg0=/dev/md1")
330+
331+
332+class CosmicTestLvmOverRaid(relbase.cosmic, TestLvmOverRaidAbs):
333+ __test__ = True
334+
335+
336+class BionicTestLvmOverRaid(relbase.bionic, TestLvmOverRaidAbs):
337+ __test__ = True
338+
339+
340+class XenialGATestLvmOverRaid(relbase.xenial_ga, TestLvmOverRaidAbs):
341+ __test__ = True
342diff --git a/tests/vmtests/test_lvm_root.py b/tests/vmtests/test_lvm_root.py
343index 8ca69d4..bc8b047 100644
344--- a/tests/vmtests/test_lvm_root.py
345+++ b/tests/vmtests/test_lvm_root.py
346@@ -113,7 +113,7 @@ class XenialTestUefiLvmRootXfs(relbase.xenial, TestUefiLvmRootAbs):
347 }
348
349
350-@VMBaseClass.skip_by_date("1652822", fixby="2019-06-01")
351+@VMBaseClass.skip_by_date("1652822", fixby="2019-06-01", install=False)
352 class XenialTestUefiLvmRootXfsBootXfs(relbase.xenial, TestUefiLvmRootAbs):
353 """This tests xfs root and xfs boot with uefi.
354
