Merge lp:~raharper/curtin/trunk.fix-bcache-sysfs-write-failures-on-remove into lp:~curtin-dev/curtin/trunk

Proposed by Ryan Harper
Status: Merged
Merged at revision: 535
Proposed branch: lp:~raharper/curtin/trunk.fix-bcache-sysfs-write-failures-on-remove
Merge into: lp:~curtin-dev/curtin/trunk
Diff against target: 150 lines (+100/-4)
2 files modified
curtin/block/clear_holders.py (+18/-4)
tests/unittests/test_clear_holders.py (+82/-0)
To merge this branch: bzr merge lp:~raharper/curtin/trunk.fix-bcache-sysfs-write-failures-on-remove
Reviewer Review Type Date Requested Status
Server Team CI bot continuous-integration Approve
Chad Smith Approve
Scott Moser (community) Needs Information
Review via email: mp+330758@code.launchpad.net

Description of the change

bcache: accept sysfs write failure in shutdown handler if path missing

Bcache device shutdown is asynchronus. The kernel may remove the
device after we've tested to see if the device needs to be stopped but
before we issue a 'stop' command to the bcache sysfs file.

When that happens, curtin throws an exception; this patch catches
those and tests if the target path is missing; and if so, ignores the
error as it's expected due to device removal.

All other write failures where the path is still present are still
valid and reported.

LP: #1700564

To post a comment you must log in.
Revision history for this message
Server Team CI bot (server-team-bot) wrote :
review: Approve (continuous-integration)
Revision history for this message
Ryan Harper (raharper) wrote :

vmtest subset run:

+ ./tools/jenkins-runner -p 4 tests/vmtests/test_bcache_basic.py tests/vmtests/test_mdadm_bcache.py tests/vmtests/test_raid5_bcache.py
...
Ran 395 tests in 2160.668s

OK
Thu, 14 Sep 2017 16:42:19 +0000: vmtest end [0] in 2162s

Going to start up a full run; expect to see ArtfulNetwork failures, but otherwise should pass all other tests.

527. By Ryan Harper

merge from trunk

Revision history for this message
Server Team CI bot (server-team-bot) wrote :
review: Approve (continuous-integration)
Revision history for this message
Scott Moser (smoser) wrote :

My only comments are that I really think it doesn't hurt to see the value of the error at a debug level.

I know you've had dis-interest in doing that before. but I just think its useful information.

except (IOError, OSError) as e:
    LOG.debug('bcache stop file %s not present, device removed: %s', bcache_stop, e)

review: Approve
528. By Ryan Harper

merge from trunk

529. By Ryan Harper

Add exception for logging

Revision history for this message
Server Team CI bot (server-team-bot) wrote :
review: Approve (continuous-integration)
530. By Ryan Harper

Use errno.ENOENT to check for missing bcache stop file

Revision history for this message
Server Team CI bot (server-team-bot) wrote :
review: Needs Fixing (continuous-integration)
531. By Ryan Harper

Update unittest, using errno means we skip the os.path.exists check (which can race)

532. By Ryan Harper

merge from trunk

Revision history for this message
Server Team CI bot (server-team-bot) wrote :
review: Approve (continuous-integration)
Revision history for this message
Scott Moser (smoser) :
review: Needs Information
Revision history for this message
Ryan Harper (raharper) wrote :

On Thu, Oct 12, 2017 at 2:56 PM, Scott Moser <email address hidden> wrote:

> Review: Needs Information
>
>
>
> Diff comments:
>
> > === modified file 'curtin/block/clear_holders.py'
> > --- curtin/block/clear_holders.py 2017-05-24 18:42:40 +0000
> > +++ curtin/block/clear_holders.py 2017-10-12 16:29:11 +0000
> > @@ -132,8 +132,13 @@
> > os.path.basename(bcache_cache_sysfs))
> > else:
> > LOG.info('stopping bcache cacheset at: %s', bcache_cache_sysfs)
> > - util.write_file(os.path.join(bcache_cache_sysfs, 'stop'),
> > - '1', mode=None)
> > + bcache_stop = os.path.join(bcache_cache_sysfs, 'stop')
> > + try:
> > + util.write_file(bcache_stop, '1', mode=None)
> > + except (IOError, OSError) as e:
> > + if e.errno == errno.ENOENT:
> > + LOG.debug('bcache stop file %s missing, device removed:
> %s',
> > + bcache_stop, e)
>
> dont you want:
> else:
> raise e
> ?
> or did you intend to be completely silent on failure except in the ENOENT
> case.
>

This is the only case where we want to eat the exception; the write to the
file
failed due to bcache kernel bits stopping. I'll add a comment to make it
more clear.

>
> > try:
> > util.wait_for_removal(bcache_cache_sysfs,
> retries=removal_retries)
> > except OSError:
>
>
> --
> https://code.launchpad.net/~raharper/curtin/trunk.fix-
> bcache-sysfs-write-failures-on-remove/+merge/330758
> You are the owner of lp:~raharper/curtin/trunk.fix-
> bcache-sysfs-write-failures-on-remove.
>

533. By Ryan Harper

Ensure we don't swallow all exceptions, only ENOENT

Revision history for this message
Server Team CI bot (server-team-bot) wrote :
review: Approve (continuous-integration)
Revision history for this message
Chad Smith (chad.smith) wrote :

I like the fact that you've added unit tests here, but the depth of mocking on this makes for unit tests that know too much about the internal implementation of the shutdown_bcache function. which means unittests will break more frequently if we change the operations performed by shutdown_bcache. Since the try/except handling is boilerplate for both cases, let's just pull it into a helper function which can be easily tested. Here's a patch for that and the simple unit tests addditions. I believe you can also drop your two unit tests as it feels like the mock depth will make it harder to maintain.

patch: http://paste.ubuntu.com/25733308/

review: Approve
534. By Ryan Harper

Use common bcache_maybe_remove handler for simpler unittests

Revision history for this message
Server Team CI bot (server-team-bot) wrote :
review: Approve (continuous-integration)

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'curtin/block/clear_holders.py'
2--- curtin/block/clear_holders.py 2017-05-24 18:42:40 +0000
3+++ curtin/block/clear_holders.py 2017-10-13 17:48:44 +0000
4@@ -93,6 +93,22 @@
5 return path
6
7
8+def maybe_stop_bcache_device(device):
9+ """Attempt to stop the provided device_path or raise unexpected errors."""
10+ bcache_stop = os.path.join(device, 'stop')
11+ try:
12+ util.write_file(bcache_stop, '1', mode=None)
13+ except (IOError, OSError) as e:
14+ # Note: if we get a write failure and the exception is that the
15+ # file is not found; we log this expected behavior; any other
16+ # exception is raised
17+ if e.errno == errno.ENOENT:
18+ LOG.debug('bcache stop file %s missing, device removed: %s',
19+ bcache_stop, e)
20+ else:
21+ raise e
22+
23+
24 def shutdown_bcache(device):
25 """
26 Shut down bcache for specified bcache device
27@@ -132,8 +148,7 @@
28 os.path.basename(bcache_cache_sysfs))
29 else:
30 LOG.info('stopping bcache cacheset at: %s', bcache_cache_sysfs)
31- util.write_file(os.path.join(bcache_cache_sysfs, 'stop'),
32- '1', mode=None)
33+ maybe_stop_bcache_device(bcache_cache_sysfs)
34 try:
35 util.wait_for_removal(bcache_cache_sysfs, retries=removal_retries)
36 except OSError:
37@@ -162,8 +177,7 @@
38 return
39 else:
40 LOG.info('stopping bcache backing device at: %s', bcache_block_sysfs)
41- util.write_file(os.path.join(bcache_block_sysfs, 'stop'),
42- '1', mode=None)
43+ maybe_stop_bcache_device(bcache_block_sysfs)
44 try:
45 # wait for them all to go away
46 for dev in [device, bcache_block_sysfs] + slave_paths:
47
48=== modified file 'tests/unittests/test_clear_holders.py'
49--- tests/unittests/test_clear_holders.py 2017-08-02 20:55:03 +0000
50+++ tests/unittests/test_clear_holders.py 2017-10-13 17:48:44 +0000
51@@ -1,3 +1,4 @@
52+import errno
53 import mock
54 import os
55 import textwrap
56@@ -329,6 +330,52 @@
57 mock.call(cset, retries=self.remove_retries)
58 ])
59
60+ # test bcache shutdown with 'stop' sysfs write failure
61+ @mock.patch('curtin.block.clear_holders.udev.udevadm_settle')
62+ @mock.patch('curtin.block.clear_holders.get_bcache_sys_path')
63+ @mock.patch('curtin.block.clear_holders.util')
64+ @mock.patch('curtin.block.clear_holders.os')
65+ @mock.patch('curtin.block.clear_holders.LOG')
66+ @mock.patch('curtin.block.clear_holders.get_bcache_using_dev')
67+ def test_shutdown_bcache_stop_sysfs_write_fails(self, mock_get_bcache,
68+ mock_log, mock_os,
69+ mock_util,
70+ mock_get_bcache_block,
71+ mock_udevadm_settle):
72+ """Test writes sysfs write failures pass if file not present"""
73+ device = "/sys/class/block/null"
74+ mock_os.path.exists.side_effect = iter([
75+ True, # backing device exists
76+ True, # cset device not present (already removed)
77+ False, # backing device is removed with cset
78+ False, # bcache/stop sysfs is missing (already removed)
79+ ])
80+ cset = '/sys/fs/bcache/fake'
81+ mock_get_bcache.return_value = cset
82+ mock_get_bcache_block.return_value = device + '/bcache'
83+ mock_os.path.join.side_effect = os.path.join
84+
85+ # make writes to sysfs fail
86+ mock_util.write_file.side_effect = IOError(errno.ENOENT,
87+ "File not found")
88+
89+ clear_holders.shutdown_bcache(device)
90+
91+ self.assertEqual(2, len(mock_log.info.call_args_list))
92+ self.assertEqual(3, len(mock_os.path.exists.call_args_list))
93+ self.assertEqual(1, len(mock_get_bcache.call_args_list))
94+ self.assertEqual(1, len(mock_get_bcache_block.call_args_list))
95+ self.assertEqual(1, len(mock_util.write_file.call_args_list))
96+ self.assertEqual(1, len(mock_util.wait_for_removal.call_args_list))
97+
98+ mock_get_bcache.assert_called_with(device, strict=False)
99+ mock_util.write_file.assert_has_calls([
100+ mock.call(cset + '/stop', '1', mode=None),
101+ ])
102+ mock_util.wait_for_removal.assert_has_calls([
103+ mock.call(cset, retries=self.remove_retries)
104+ ])
105+
106 @mock.patch('curtin.block.clear_holders.LOG')
107 @mock.patch('curtin.block.clear_holders.block.sys_block_path')
108 @mock.patch('curtin.block.clear_holders.lvm')
109@@ -531,6 +578,41 @@
110 for tree, result in test_trees_and_results:
111 self.assertEqual(clear_holders.format_holders_tree(tree), result)
112
113+ @mock.patch('curtin.block.clear_holders.util.write_file')
114+ def test_maybe_stop_bcache_device_raises_errors(self, m_write_file):
115+ """Unexpected exceptions are raised by maybe_stop_bcache_device."""
116+ m_write_file.side_effect = OSError(errno.EAGAIN, 'Unexpected errno')
117+ with self.assertRaises(OSError) as cm:
118+ clear_holders.maybe_stop_bcache_device('does/not/matter')
119+ self.assertEqual('[Errno 11] Unexpected errno', str(cm.exception))
120+ self.assertEqual(
121+ mock.call('does/not/matter/stop', '1', mode=None),
122+ m_write_file.call_args)
123+
124+ @mock.patch('curtin.block.clear_holders.LOG')
125+ @mock.patch('curtin.block.clear_holders.util.write_file')
126+ def test_maybe_stop_bcache_device_handles_oserror(self, m_write_file,
127+ m_log):
128+ """When OSError.NOENT is raised, log the condition and move on."""
129+ m_write_file.side_effect = OSError(errno.ENOENT, 'Expected oserror')
130+ clear_holders.maybe_stop_bcache_device('does/not/matter')
131+ self.assertEqual(
132+ 'bcache stop file %s missing, device removed: %s',
133+ m_log.debug.call_args[0][0])
134+ self.assertEqual('does/not/matter/stop', m_log.debug.call_args[0][1])
135+
136+ @mock.patch('curtin.block.clear_holders.LOG')
137+ @mock.patch('curtin.block.clear_holders.util.write_file')
138+ def test_maybe_stop_bcache_device_handles_ioerror(self, m_write_file,
139+ m_log):
140+ """When IOError.NOENT is raised, log the condition and move on."""
141+ m_write_file.side_effect = IOError(errno.ENOENT, 'Expected ioerror')
142+ clear_holders.maybe_stop_bcache_device('does/not/matter')
143+ self.assertEqual(
144+ 'bcache stop file %s missing, device removed: %s',
145+ m_log.debug.call_args[0][0])
146+ self.assertEqual('does/not/matter/stop', m_log.debug.call_args[0][1])
147+
148 def test_get_holder_types(self):
149 """test clear_holders.get_holder_types"""
150 test_trees_and_results = [

Subscribers

People subscribed via source and target branches