Merge lp:~raharper/curtin/trunk.fix-bcache-sysfs-write-failures-on-remove into lp:~curtin-dev/curtin/trunk

Proposed by Ryan Harper
Status: Merged
Merged at revision: 535
Proposed branch: lp:~raharper/curtin/trunk.fix-bcache-sysfs-write-failures-on-remove
Merge into: lp:~curtin-dev/curtin/trunk
Diff against target: 150 lines (+100/-4)
2 files modified
curtin/block/clear_holders.py (+18/-4)
tests/unittests/test_clear_holders.py (+82/-0)
To merge this branch: bzr merge lp:~raharper/curtin/trunk.fix-bcache-sysfs-write-failures-on-remove
Reviewer Review Type Date Requested Status
Server Team CI bot continuous-integration Approve
Chad Smith Approve
Scott Moser (community) Needs Information
Review via email: mp+330758@code.launchpad.net

Description of the change

bcache: accept sysfs write failure in shutdown handler if path missing

Bcache device shutdown is asynchronus. The kernel may remove the
device after we've tested to see if the device needs to be stopped but
before we issue a 'stop' command to the bcache sysfs file.

When that happens, curtin throws an exception; this patch catches
those and tests if the target path is missing; and if so, ignores the
error as it's expected due to device removal.

All other write failures where the path is still present are still
valid and reported.

LP: #1700564

To post a comment you must log in.
Revision history for this message
Server Team CI bot (server-team-bot) wrote :
review: Approve (continuous-integration)
Revision history for this message
Ryan Harper (raharper) wrote :

vmtest subset run:

+ ./tools/jenkins-runner -p 4 tests/vmtests/test_bcache_basic.py tests/vmtests/test_mdadm_bcache.py tests/vmtests/test_raid5_bcache.py
...
Ran 395 tests in 2160.668s

OK
Thu, 14 Sep 2017 16:42:19 +0000: vmtest end [0] in 2162s

Going to start up a full run; expect to see ArtfulNetwork failures, but otherwise should pass all other tests.

527. By Ryan Harper

merge from trunk

Revision history for this message
Server Team CI bot (server-team-bot) wrote :
review: Approve (continuous-integration)
Revision history for this message
Scott Moser (smoser) wrote :

My only comments are that I really think it doesn't hurt to see the value of the error at a debug level.

I know you've had dis-interest in doing that before. but I just think its useful information.

except (IOError, OSError) as e:
    LOG.debug('bcache stop file %s not present, device removed: %s', bcache_stop, e)

review: Approve
528. By Ryan Harper

merge from trunk

529. By Ryan Harper

Add exception for logging

Revision history for this message
Server Team CI bot (server-team-bot) wrote :
review: Approve (continuous-integration)
530. By Ryan Harper

Use errno.ENOENT to check for missing bcache stop file

Revision history for this message
Server Team CI bot (server-team-bot) wrote :
review: Needs Fixing (continuous-integration)
531. By Ryan Harper

Update unittest, using errno means we skip the os.path.exists check (which can race)

532. By Ryan Harper

merge from trunk

Revision history for this message
Server Team CI bot (server-team-bot) wrote :
review: Approve (continuous-integration)
Revision history for this message
Scott Moser (smoser) :
review: Needs Information
Revision history for this message
Ryan Harper (raharper) wrote :

On Thu, Oct 12, 2017 at 2:56 PM, Scott Moser <email address hidden> wrote:

> Review: Needs Information
>
>
>
> Diff comments:
>
> > === modified file 'curtin/block/clear_holders.py'
> > --- curtin/block/clear_holders.py 2017-05-24 18:42:40 +0000
> > +++ curtin/block/clear_holders.py 2017-10-12 16:29:11 +0000
> > @@ -132,8 +132,13 @@
> > os.path.basename(bcache_cache_sysfs))
> > else:
> > LOG.info('stopping bcache cacheset at: %s', bcache_cache_sysfs)
> > - util.write_file(os.path.join(bcache_cache_sysfs, 'stop'),
> > - '1', mode=None)
> > + bcache_stop = os.path.join(bcache_cache_sysfs, 'stop')
> > + try:
> > + util.write_file(bcache_stop, '1', mode=None)
> > + except (IOError, OSError) as e:
> > + if e.errno == errno.ENOENT:
> > + LOG.debug('bcache stop file %s missing, device removed:
> %s',
> > + bcache_stop, e)
>
> dont you want:
> else:
> raise e
> ?
> or did you intend to be completely silent on failure except in the ENOENT
> case.
>

This is the only case where we want to eat the exception; the write to the
file
failed due to bcache kernel bits stopping. I'll add a comment to make it
more clear.

>
> > try:
> > util.wait_for_removal(bcache_cache_sysfs,
> retries=removal_retries)
> > except OSError:
>
>
> --
> https://code.launchpad.net/~raharper/curtin/trunk.fix-
> bcache-sysfs-write-failures-on-remove/+merge/330758
> You are the owner of lp:~raharper/curtin/trunk.fix-
> bcache-sysfs-write-failures-on-remove.
>

533. By Ryan Harper

Ensure we don't swallow all exceptions, only ENOENT

Revision history for this message
Server Team CI bot (server-team-bot) wrote :
review: Approve (continuous-integration)
Revision history for this message
Chad Smith (chad.smith) wrote :

I like the fact that you've added unit tests here, but the depth of mocking on this makes for unit tests that know too much about the internal implementation of the shutdown_bcache function. which means unittests will break more frequently if we change the operations performed by shutdown_bcache. Since the try/except handling is boilerplate for both cases, let's just pull it into a helper function which can be easily tested. Here's a patch for that and the simple unit tests addditions. I believe you can also drop your two unit tests as it feels like the mock depth will make it harder to maintain.

patch: http://paste.ubuntu.com/25733308/

review: Approve
534. By Ryan Harper

Use common bcache_maybe_remove handler for simpler unittests

Revision history for this message
Server Team CI bot (server-team-bot) wrote :
review: Approve (continuous-integration)

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
=== modified file 'curtin/block/clear_holders.py'
--- curtin/block/clear_holders.py 2017-05-24 18:42:40 +0000
+++ curtin/block/clear_holders.py 2017-10-13 17:48:44 +0000
@@ -93,6 +93,22 @@
93 return path93 return path
9494
9595
96def maybe_stop_bcache_device(device):
97 """Attempt to stop the provided device_path or raise unexpected errors."""
98 bcache_stop = os.path.join(device, 'stop')
99 try:
100 util.write_file(bcache_stop, '1', mode=None)
101 except (IOError, OSError) as e:
102 # Note: if we get a write failure and the exception is that the
103 # file is not found; we log this expected behavior; any other
104 # exception is raised
105 if e.errno == errno.ENOENT:
106 LOG.debug('bcache stop file %s missing, device removed: %s',
107 bcache_stop, e)
108 else:
109 raise e
110
111
96def shutdown_bcache(device):112def shutdown_bcache(device):
97 """113 """
98 Shut down bcache for specified bcache device114 Shut down bcache for specified bcache device
@@ -132,8 +148,7 @@
132 os.path.basename(bcache_cache_sysfs))148 os.path.basename(bcache_cache_sysfs))
133 else:149 else:
134 LOG.info('stopping bcache cacheset at: %s', bcache_cache_sysfs)150 LOG.info('stopping bcache cacheset at: %s', bcache_cache_sysfs)
135 util.write_file(os.path.join(bcache_cache_sysfs, 'stop'),151 maybe_stop_bcache_device(bcache_cache_sysfs)
136 '1', mode=None)
137 try:152 try:
138 util.wait_for_removal(bcache_cache_sysfs, retries=removal_retries)153 util.wait_for_removal(bcache_cache_sysfs, retries=removal_retries)
139 except OSError:154 except OSError:
@@ -162,8 +177,7 @@
162 return177 return
163 else:178 else:
164 LOG.info('stopping bcache backing device at: %s', bcache_block_sysfs)179 LOG.info('stopping bcache backing device at: %s', bcache_block_sysfs)
165 util.write_file(os.path.join(bcache_block_sysfs, 'stop'),180 maybe_stop_bcache_device(bcache_block_sysfs)
166 '1', mode=None)
167 try:181 try:
168 # wait for them all to go away182 # wait for them all to go away
169 for dev in [device, bcache_block_sysfs] + slave_paths:183 for dev in [device, bcache_block_sysfs] + slave_paths:
170184
=== modified file 'tests/unittests/test_clear_holders.py'
--- tests/unittests/test_clear_holders.py 2017-08-02 20:55:03 +0000
+++ tests/unittests/test_clear_holders.py 2017-10-13 17:48:44 +0000
@@ -1,3 +1,4 @@
1import errno
1import mock2import mock
2import os3import os
3import textwrap4import textwrap
@@ -329,6 +330,52 @@
329 mock.call(cset, retries=self.remove_retries)330 mock.call(cset, retries=self.remove_retries)
330 ])331 ])
331332
333 # test bcache shutdown with 'stop' sysfs write failure
334 @mock.patch('curtin.block.clear_holders.udev.udevadm_settle')
335 @mock.patch('curtin.block.clear_holders.get_bcache_sys_path')
336 @mock.patch('curtin.block.clear_holders.util')
337 @mock.patch('curtin.block.clear_holders.os')
338 @mock.patch('curtin.block.clear_holders.LOG')
339 @mock.patch('curtin.block.clear_holders.get_bcache_using_dev')
340 def test_shutdown_bcache_stop_sysfs_write_fails(self, mock_get_bcache,
341 mock_log, mock_os,
342 mock_util,
343 mock_get_bcache_block,
344 mock_udevadm_settle):
345 """Test writes sysfs write failures pass if file not present"""
346 device = "/sys/class/block/null"
347 mock_os.path.exists.side_effect = iter([
348 True, # backing device exists
349 True, # cset device not present (already removed)
350 False, # backing device is removed with cset
351 False, # bcache/stop sysfs is missing (already removed)
352 ])
353 cset = '/sys/fs/bcache/fake'
354 mock_get_bcache.return_value = cset
355 mock_get_bcache_block.return_value = device + '/bcache'
356 mock_os.path.join.side_effect = os.path.join
357
358 # make writes to sysfs fail
359 mock_util.write_file.side_effect = IOError(errno.ENOENT,
360 "File not found")
361
362 clear_holders.shutdown_bcache(device)
363
364 self.assertEqual(2, len(mock_log.info.call_args_list))
365 self.assertEqual(3, len(mock_os.path.exists.call_args_list))
366 self.assertEqual(1, len(mock_get_bcache.call_args_list))
367 self.assertEqual(1, len(mock_get_bcache_block.call_args_list))
368 self.assertEqual(1, len(mock_util.write_file.call_args_list))
369 self.assertEqual(1, len(mock_util.wait_for_removal.call_args_list))
370
371 mock_get_bcache.assert_called_with(device, strict=False)
372 mock_util.write_file.assert_has_calls([
373 mock.call(cset + '/stop', '1', mode=None),
374 ])
375 mock_util.wait_for_removal.assert_has_calls([
376 mock.call(cset, retries=self.remove_retries)
377 ])
378
332 @mock.patch('curtin.block.clear_holders.LOG')379 @mock.patch('curtin.block.clear_holders.LOG')
333 @mock.patch('curtin.block.clear_holders.block.sys_block_path')380 @mock.patch('curtin.block.clear_holders.block.sys_block_path')
334 @mock.patch('curtin.block.clear_holders.lvm')381 @mock.patch('curtin.block.clear_holders.lvm')
@@ -531,6 +578,41 @@
531 for tree, result in test_trees_and_results:578 for tree, result in test_trees_and_results:
532 self.assertEqual(clear_holders.format_holders_tree(tree), result)579 self.assertEqual(clear_holders.format_holders_tree(tree), result)
533580
581 @mock.patch('curtin.block.clear_holders.util.write_file')
582 def test_maybe_stop_bcache_device_raises_errors(self, m_write_file):
583 """Unexpected exceptions are raised by maybe_stop_bcache_device."""
584 m_write_file.side_effect = OSError(errno.EAGAIN, 'Unexpected errno')
585 with self.assertRaises(OSError) as cm:
586 clear_holders.maybe_stop_bcache_device('does/not/matter')
587 self.assertEqual('[Errno 11] Unexpected errno', str(cm.exception))
588 self.assertEqual(
589 mock.call('does/not/matter/stop', '1', mode=None),
590 m_write_file.call_args)
591
592 @mock.patch('curtin.block.clear_holders.LOG')
593 @mock.patch('curtin.block.clear_holders.util.write_file')
594 def test_maybe_stop_bcache_device_handles_oserror(self, m_write_file,
595 m_log):
596 """When OSError.NOENT is raised, log the condition and move on."""
597 m_write_file.side_effect = OSError(errno.ENOENT, 'Expected oserror')
598 clear_holders.maybe_stop_bcache_device('does/not/matter')
599 self.assertEqual(
600 'bcache stop file %s missing, device removed: %s',
601 m_log.debug.call_args[0][0])
602 self.assertEqual('does/not/matter/stop', m_log.debug.call_args[0][1])
603
604 @mock.patch('curtin.block.clear_holders.LOG')
605 @mock.patch('curtin.block.clear_holders.util.write_file')
606 def test_maybe_stop_bcache_device_handles_ioerror(self, m_write_file,
607 m_log):
608 """When IOError.NOENT is raised, log the condition and move on."""
609 m_write_file.side_effect = IOError(errno.ENOENT, 'Expected ioerror')
610 clear_holders.maybe_stop_bcache_device('does/not/matter')
611 self.assertEqual(
612 'bcache stop file %s missing, device removed: %s',
613 m_log.debug.call_args[0][0])
614 self.assertEqual('does/not/matter/stop', m_log.debug.call_args[0][1])
615
534 def test_get_holder_types(self):616 def test_get_holder_types(self):
535 """test clear_holders.get_holder_types"""617 """test clear_holders.get_holder_types"""
536 test_trees_and_results = [618 test_trees_and_results = [

Subscribers

People subscribed via source and target branches