Merge ~raharper/curtin:mwhudson-vmtest-reuse-half-a-raid into curtin:master

Proposed by Ryan Harper
Status: Merged
Approved by: Ryan Harper
Approved revision: fbc9696d4b4b0ff0aa9846e8bff8061e879cac9d
Merge reported by: Server Team CI bot
Merged at revision: not available
Proposed branch: ~raharper/curtin:mwhudson-vmtest-reuse-half-a-raid
Merge into: curtin:master
Diff against target: 345 lines (+250/-19)
6 files modified
curtin/block/clear_holders.py (+20/-5)
curtin/commands/block_meta.py (+16/-12)
examples/tests/reuse-raid-member-partition.yaml (+73/-0)
examples/tests/reuse-raid-member-wipe.yaml (+71/-0)
tests/unittests/test_clear_holders.py (+8/-2)
tests/vmtests/test_reuse_raid_member.py (+62/-0)
Reviewer                   Review Type               Date Requested    Status
Server Team CI bot         continuous-integration                      Approve
Dan Watkins (community)                                                Approve
Review via email: mp+370843@code.launchpad.net

Commit message

Handle partial raid on partitions

Consider both disks and partitions when checking for holders that
need removal before the devices can be used. Additionally, when
shutting down a RAID that is inactive or broken, writes to the
/dev/mdX device are expected to fail; rather than exiting, log a
non-fatal error and continue shutting down the raid.

With these fixes, the TestReuseRAIDMemberPartition vmtests now pass.

LP: #1835091
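
In essence, the shutdown change boils down to the pattern sketched below. This is a minimal illustration rather than curtin's actual code (the real change is in the diff further down), and the helper callables are hypothetical stand-ins for curtin's internal wipe/mdadm helpers:

    import logging

    LOG = logging.getLogger(__name__)

    def shutdown_partial_raid(blockdev, wipe_superblock, disassemble_members):
        """Sketch: wipe a raid device, tolerating writes that fail because
        the array is inactive or broken, then still release its members."""
        try:
            wipe_superblock(blockdev)
        except ValueError as exc:
            # An inactive or broken array rejects writes (short write),
            # which surfaces as ValueError.  That alone should not abort
            # the deployment; the member devices still need to be released.
            LOG.debug('Non-fatal error wiping %s, continuing shutdown: %s',
                      blockdev, exc)
        # IOError and OSError are deliberately not caught: those indicate
        # a real problem and should still stop the install.
        disassemble_members(blockdev)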

Revision history for this message
Ryan Harper (raharper) wrote :

This needs a full vmtest run to ensure we're not regressing things with the changes.

Revision history for this message
Server Team CI bot (server-team-bot) wrote :
review: Needs Fixing (continuous-integration)
Revision history for this message
Michael Hudson-Doyle (mwhudson) wrote :

LGTM assuming the vmtests are good.

20033d3... by Ryan Harper

Discover how to spell Discovering

Revision history for this message
Server Team CI bot (server-team-bot) wrote :
review: Needs Fixing (continuous-integration)
05c2aaf... by Ryan Harper

Refactor to handle devices that are not present

In some scenarios, the partitions in the config _may_ be present,
but we need to handle the case where they are not. Additionally, if
they were present when we initially started clearing devices, they
won't be present after the clear, so assert_clear needs to adapt to
allow that.
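
A minimal sketch of that adjustment follows (the helper callables are passed in as hypothetical stand-ins for curtin's clear_holders functions; the authoritative version is in the diff below): resolve sysfs paths non-strictly and only walk holders for paths that still exist, since a partition cleared earlier in the run no longer has a node to inspect.

    import os

    def assert_clear(base_paths, sys_block_path, gen_holders_tree,
                     get_holder_types):
        """Sketch: verify nothing beyond disks/partitions still holds the
        given devices, ignoring devices that have already vanished."""
        valid = ('disk', 'partition')
        if not isinstance(base_paths, (list, tuple)):
            base_paths = [base_paths]
        # strict=False: a device wiped earlier in the run may no longer
        # have a sysfs node, and that is not an error here.
        base_paths = [sys_block_path(path, strict=False)
                      for path in base_paths]
        for tree in [gen_holders_tree(p)
                     for p in base_paths if os.path.exists(p)]:
            if any(htype not in valid and path not in base_paths
                   for (htype, path) in get_holder_types(tree)):
                raise OSError('Storage not clear, remaining:\n%s' % tree)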

Revision history for this message
Server Team CI bot (server-team-bot) wrote :
review: Needs Fixing (continuous-integration)
22d9742... by Ryan Harper

Don't forget to return the paths

Revision history for this message
Server Team CI bot (server-team-bot) wrote :
review: Approve (continuous-integration)
Revision history for this message
Dan Watkins (oddbloke) wrote :

This looks really good, thanks! I have one inline comment about exception handling that I'd like resolved, but after that I'll be +1.

review: Needs Information
Revision history for this message
Ryan Harper (raharper) :
fbc9696... by Ryan Harper

Update comment around exception handling for partial raid shutdown

Revision history for this message
Dan Watkins (oddbloke) wrote :

LGTM, thanks for the comment!

review: Approve
Revision history for this message
Server Team CI bot (server-team-bot) wrote :
review: Approve (continuous-integration)
Revision history for this message
Ryan Harper (raharper) wrote :
Revision history for this message
Server Team CI bot (server-team-bot) wrote :
review: Approve (continuous-integration)
Revision history for this message
Ryan Harper (raharper) wrote :

This needed a rebase after the change to btrfs-tools package names. After the rebase, I'll resubmit a vmtest run.

Revision history for this message
Server Team CI bot (server-team-bot) wrote :
review: Approve (continuous-integration)

Preview Diff

1diff --git a/curtin/block/clear_holders.py b/curtin/block/clear_holders.py
2index 4a099cd..bcc16e6 100644
3--- a/curtin/block/clear_holders.py
4+++ b/curtin/block/clear_holders.py
5@@ -166,14 +166,27 @@ def shutdown_mdadm(device):
6
7 blockdev = block.sysfs_to_devpath(device)
8
9- LOG.info('Wiping superblock on raid device: %s', device)
10- _wipe_superblock(blockdev, exclusive=False)
11-
12+ LOG.info('Discovering raid devices and spares for %s', device)
13 md_devs = (
14 mdadm.md_get_devices_list(blockdev) +
15 mdadm.md_get_spares_list(blockdev))
16 mdadm.set_sync_action(blockdev, action="idle")
17 mdadm.set_sync_action(blockdev, action="frozen")
18+
19+ LOG.info('Wiping superblock on raid device: %s', device)
20+ try:
21+ _wipe_superblock(blockdev, exclusive=False)
22+ except ValueError as e:
23+ # if the array is not functional, writes to the device may fail
24+ # and _wipe_superblock will raise ValueError for short writes
25+ # which happens on inactive raid volumes. In that case we
26+ # shouldn't give up yet as we still want to disassemble
27+ # array and wipe members. Other errors such as IOError or OSError
28+ # are unwelcome and will stop deployment.
29+ LOG.debug('Non-fatal error writing to array device %s, '
30+ 'proceeding with shutdown: %s', blockdev, e)
31+
32+ LOG.info('Removing raid array members: %s', md_devs)
33 for mddev in md_devs:
34 try:
35 mdadm.fail_device(blockdev, mddev)
36@@ -534,8 +547,10 @@ def assert_clear(base_paths):
37 valid = ('disk', 'partition')
38 if not isinstance(base_paths, (list, tuple)):
39 base_paths = [base_paths]
40- base_paths = [block.sys_block_path(path) for path in base_paths]
41- for holders_tree in [gen_holders_tree(p) for p in base_paths]:
42+ base_paths = [block.sys_block_path(path, strict=False)
43+ for path in base_paths]
44+ for holders_tree in [gen_holders_tree(p)
45+ for p in base_paths if os.path.exists(p)]:
46 if any(holder_type not in valid and path not in base_paths
47 for (holder_type, path) in get_holder_types(holders_tree)):
48 raise OSError('Storage not clear, remaining:\n{}'
49diff --git a/curtin/commands/block_meta.py b/curtin/commands/block_meta.py
50index a9110a9..37b93dd 100644
51--- a/curtin/commands/block_meta.py
52+++ b/curtin/commands/block_meta.py
53@@ -67,7 +67,7 @@ def block_meta(args):
54 if devices is None:
55 devices = []
56 if 'storage' in cfg:
57- devices = get_disk_paths_from_storage_config(
58+ devices = get_device_paths_from_storage_config(
59 extract_storage_ordered_dict(cfg))
60 if len(devices) == 0:
61 devices = cfg.get('block-meta', {}).get('devices', [])
62@@ -1535,20 +1535,24 @@ def zfs_handler(info, storage_config):
63 util.write_file(state['fstab'], fstab_entry, omode='a')
64
65
66-def get_disk_paths_from_storage_config(storage_config):
67- """Returns a list of disk paths in a storage config filtering out
68- preserved or disks which do not have wipe configuration.
69+def get_device_paths_from_storage_config(storage_config):
70+ """Returns a list of device paths in a storage config filtering out
71+ preserved or devices which do not have wipe configuration.
72
73 :param: storage_config: Ordered dict of storage configation
74 """
75- def get_path_if_present(disk, config):
76- return get_path_to_storage_volume(disk, config)
77-
78- return [get_path_if_present(k, storage_config)
79- for (k, v) in storage_config.items()
80- if v.get('type') == 'disk' and
81- config.value_as_boolean(v.get('wipe')) and
82- not config.value_as_boolean(v.get('preserve'))]
83+ dpaths = []
84+ for (k, v) in storage_config.items():
85+ if v.get('type') in ['disk', 'partition']:
86+ if config.value_as_boolean(v.get('wipe')):
87+ if config.value_as_boolean(v.get('preserve')):
88+ continue
89+ try:
90+ dpaths.append(
91+ get_path_to_storage_volume(k, storage_config))
92+ except Exception:
93+ pass
94+ return dpaths
95
96
97 def zfsroot_update_storage_config(storage_config):
98diff --git a/examples/tests/reuse-raid-member-partition.yaml b/examples/tests/reuse-raid-member-partition.yaml
99new file mode 100644
100index 0000000..3fe2d83
101--- /dev/null
102+++ b/examples/tests/reuse-raid-member-partition.yaml
103@@ -0,0 +1,73 @@
104+showtrace: true
105+
106+# The point of this test is to test installing to a partition that used to
107+# be a RAID member where the other disks that used to be part of the
108+# RAID are not present (the scenario that the disk was just grabbed
109+# out of a pile of previously used disks and shoved into a server).
110+
111+# So what it does is to create a RAID0 out of two partition from two
112+# disks, stop the RAID, wipe one of the disks and then install to the
113+# other, reusing the partition that was part of the RAID.
114+
115+bucket:
116+ - &setup |
117+ parted /dev/disk/by-id/virtio-disk-a --script -- \
118+ mklabel gpt \
119+ mkpart primary 1GiB 2GiB \
120+ mkpart primary 2GiB 9GiB
121+ parted /dev/disk/by-id/virtio-disk-b --script -- \
122+ mklabel gpt \
123+ mkpart primary 2GiB 9GiB
124+ udevadm settle
125+ mdadm --create --metadata 1.2 --level 0 -n 2 /dev/md1 --assume-clean \
126+ /dev/disk/by-id/virtio-disk-a-part2 /dev/disk/by-id/virtio-disk-b-part1
127+ udevadm settle
128+ mdadm --stop /dev/md1
129+ udevadm settle
130+ mdadm --zero-superblock /dev/disk/by-id/virtio-disk-b-part1
131+ wipefs -a /dev/disk/by-id/virtio-disk-b
132+ udevadm settle
133+
134+early_commands:
135+ 00-setup-raid: [sh, -exuc, *setup]
136+
137+storage:
138+ config:
139+ - type: disk
140+ id: id_disk0
141+ serial: disk-a
142+ ptable: gpt
143+ preserve: true
144+ - type: disk
145+ id: id_disk1
146+ serial: disk-b
147+ - type: partition
148+ id: id_disk0_part1
149+ preserve: true
150+ device: id_disk0
151+ flag: boot
152+ number: 1
153+ size: 1G
154+ - type: partition
155+ id: id_disk0_part2
156+ preserve: true
157+ device: id_disk0
158+ number: 2
159+ size: 7G
160+ - type: format
161+ id: id_efi_format
162+ volume: id_disk0_part1
163+ fstype: fat32
164+ - type: format
165+ id: id_root_format
166+ volume: id_disk0_part2
167+ fstype: ext4
168+ - type: mount
169+ device: id_root_format
170+ id: id_root_mount
171+ path: /
172+ - type: mount
173+ id: id_efi_mount
174+ device: id_efi_format
175+ path: /boot/efi
176+ version: 1
177diff --git a/examples/tests/reuse-raid-member-wipe.yaml b/examples/tests/reuse-raid-member-wipe.yaml
178new file mode 100644
179index 0000000..84a2686
180--- /dev/null
181+++ b/examples/tests/reuse-raid-member-wipe.yaml
182@@ -0,0 +1,71 @@
183+showtrace: true
184+
185+# The point of this test is to test installing to a disk that contains
186+# a partition that used to be a RAID member where the other parts of
187+# the RAID are not present (the scenario is that the disk was just
188+# grabbed out of a pile of previously used disks and shoved into a
189+# server).
190+
191+# So what it does is to create a RAID0 out of partitions on two disks,
192+# stop the RAID, wipe the superblock on one of them and then install
193+# to the other using a standard partitioning scheme.
194+
195+bucket:
196+ - &setup |
197+ parted /dev/disk/by-id/virtio-disk-a --script -- \
198+ mklabel gpt \
199+ mkpart primary 1GiB 9GiB
200+ parted /dev/disk/by-id/virtio-disk-b --script -- \
201+ mklabel gpt \
202+ mkpart primary 1GiB 9GiB
203+ udevadm settle
204+ mdadm --create --metadata 1.2 --level 0 -n 2 /dev/md1 --assume-clean \
205+ /dev/disk/by-id/virtio-disk-a-part1 /dev/disk/by-id/virtio-disk-b-part1
206+ udevadm settle
207+ mdadm --stop /dev/md1
208+ udevadm settle
209+ mdadm --zero-superblock /dev/disk/by-id/virtio-disk-b-part1
210+ wipefs -a /dev/disk/by-id/virtio-disk-b
211+ udevadm settle
212+
213+early_commands:
214+ 00-setup-raid: [sh, -exuc, *setup]
215+
216+storage:
217+ config:
218+ - type: disk
219+ id: id_disk0
220+ serial: disk-a
221+ ptable: gpt
222+ wipe: superblock-recursive
223+ - type: disk
224+ id: id_disk1
225+ serial: disk-b
226+ - type: partition
227+ id: id_disk0_part1
228+ device: id_disk0
229+ flag: boot
230+ number: 1
231+ size: 512M
232+ - type: partition
233+ id: id_disk0_part2
234+ device: id_disk0
235+ number: 2
236+ size: 8G
237+ - type: format
238+ id: id_efi_format
239+ volume: id_disk0_part1
240+ fstype: fat32
241+ - type: format
242+ id: id_root_format
243+ volume: id_disk0_part2
244+ fstype: ext4
245+ - type: mount
246+ device: id_root_format
247+ id: id_root_mount
248+ path: /
249+ - type: mount
250+ id: id_efi_mount
251+ device: id_efi_format
252+ path: /boot/efi
253+ version: 1
254diff --git a/tests/unittests/test_clear_holders.py b/tests/unittests/test_clear_holders.py
255index e4d8f8d..906ae06 100644
256--- a/tests/unittests/test_clear_holders.py
257+++ b/tests/unittests/test_clear_holders.py
258@@ -566,11 +566,17 @@ class TestClearHolders(CiTestCase):
259 for tree, result in test_trees_and_results:
260 self.assertEqual(clear_holders.get_holder_types(tree), result)
261
262+ @mock.patch('curtin.block.clear_holders.os.path.exists')
263 @mock.patch('curtin.block.clear_holders.block.sys_block_path')
264 @mock.patch('curtin.block.clear_holders.gen_holders_tree')
265- def test_assert_clear(self, mock_gen_holders_tree, mock_syspath):
266+ def test_assert_clear(self, mock_gen_holders_tree, mock_syspath, m_ospe):
267+ def my_sysblock(p, strict=False):
268+ return '/sys/class/block/%s' % os.path.basename(p)
269+
270 mock_gen_holders_tree.return_value = self.example_holders_trees[0][0]
271- mock_syspath.side_effect = lambda x: x
272+ mock_syspath.side_effect = my_sysblock
273+ # os.path.exists set to True to include all devices in the call list
274+ m_ospe.return_value = True
275 device = self.test_blockdev
276 with self.assertRaises(OSError):
277 clear_holders.assert_clear(device)
278diff --git a/tests/vmtests/test_reuse_raid_member.py b/tests/vmtests/test_reuse_raid_member.py
279new file mode 100644
280index 0000000..3052c9c
281--- /dev/null
282+++ b/tests/vmtests/test_reuse_raid_member.py
283@@ -0,0 +1,62 @@
284+# This file is part of curtin. See LICENSE file for copyright and license info.
285+
286+from . import VMBaseClass
287+from .releases import base_vm_classes as relbase
288+
289+
290+class TestReuseRAIDMember(VMBaseClass):
291+ """ Curtin can install to a RAID member if other members are missing. """
292+ conf_file = "examples/tests/reuse-raid-member-wipe.yaml"
293+ extra_disks = ['10G', '10G']
294+ uefi = True
295+
296+ def test_simple(self):
297+ pass
298+
299+
300+class TestReuseRAIDMemberPartition(VMBaseClass):
301+ """ Curtin can install to a RAID member if other members are missing. """
302+ conf_file = "examples/tests/reuse-raid-member-wipe.yaml"
303+ extra_disks = ['10G', '10G']
304+ uefi = True
305+
306+ def test_simple(self):
307+ pass
308+
309+
310+class BionicTestReuseRAIDMember(relbase.bionic, TestReuseRAIDMember):
311+ __test__ = True
312+
313+
314+class CosmicTestReuseRAIDMember(relbase.cosmic, TestReuseRAIDMember):
315+ __test__ = True
316+
317+
318+class DiscoTestReuseRAIDMember(relbase.disco, TestReuseRAIDMember):
319+ __test__ = True
320+
321+
322+class EoanTestReuseRAIDMember(relbase.eoan, TestReuseRAIDMember):
323+ __test__ = True
324+
325+
326+class BionicTestReuseRAIDMemberPartition(
327+ relbase.bionic, TestReuseRAIDMemberPartition):
328+ __test__ = True
329+
330+
331+class CosmicTestReuseRAIDMemberPartition(
332+ relbase.cosmic, TestReuseRAIDMemberPartition):
333+ __test__ = True
334+
335+
336+class DiscoTestReuseRAIDMemberPartition(
337+ relbase.disco, TestReuseRAIDMemberPartition):
338+ __test__ = True
339+
340+
341+class EoanTestReuseRAIDMemberPartition(
342+ relbase.eoan, TestReuseRAIDMemberPartition):
343+ __test__ = True
344+
345+# vi: ts=4 expandtab syntax=python
