Merge lp:~raharper/curtin/trunk.fix-bcache-sysfs-write-failures-on-remove into lp:~curtin-dev/curtin/trunk
- trunk.fix-bcache-sysfs-write-failures-on-remove
- Merge into trunk
Status: | Merged |
---|---|
Merged at revision: | 535 |
Proposed branch: | lp:~raharper/curtin/trunk.fix-bcache-sysfs-write-failures-on-remove |
Merge into: | lp:~curtin-dev/curtin/trunk |
Diff against target: |
150 lines (+100/-4) 2 files modified
curtin/block/clear_holders.py (+18/-4) tests/unittests/test_clear_holders.py (+82/-0) |
To merge this branch: | bzr merge lp:~raharper/curtin/trunk.fix-bcache-sysfs-write-failures-on-remove |
Related bugs: |
Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
Server Team CI bot | continuous-integration | Approve | |
Chad Smith | Approve | ||
Scott Moser (community) | Needs Information | ||
Review via email: mp+330758@code.launchpad.net |
Commit message
Description of the change
bcache: accept sysfs write failure in shutdown handler if path missing
Bcache device shutdown is asynchronus. The kernel may remove the
device after we've tested to see if the device needs to be stopped but
before we issue a 'stop' command to the bcache sysfs file.
When that happens, curtin throws an exception; this patch catches
those and tests if the target path is missing; and if so, ignores the
error as it's expected due to device removal.
All other write failures where the path is still present are still
valid and reported.
LP: #1700564
Server Team CI bot (server-team-bot) wrote : | # |
Ryan Harper (raharper) wrote : | # |
vmtest subset run:
+ ./tools/
...
Ran 395 tests in 2160.668s
OK
Thu, 14 Sep 2017 16:42:19 +0000: vmtest end [0] in 2162s
Going to start up a full run; expect to see ArtfulNetwork failures, but otherwise should pass all other tests.
- 527. By Ryan Harper
-
merge from trunk
Server Team CI bot (server-team-bot) wrote : | # |
PASSED: Continuous integration, rev:527
https:/
Executed test runs:
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
Click here to trigger a rebuild:
https:/
Scott Moser (smoser) wrote : | # |
My only comments are that I really think it doesn't hurt to see the value of the error at a debug level.
I know you've had dis-interest in doing that before. but I just think its useful information.
except (IOError, OSError) as e:
LOG.
- 528. By Ryan Harper
-
merge from trunk
- 529. By Ryan Harper
-
Add exception for logging
Server Team CI bot (server-team-bot) wrote : | # |
PASSED: Continuous integration, rev:529
https:/
Executed test runs:
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
Click here to trigger a rebuild:
https:/
- 530. By Ryan Harper
-
Use errno.ENOENT to check for missing bcache stop file
Server Team CI bot (server-team-bot) wrote : | # |
FAILED: Continuous integration, rev:530
https:/
Executed test runs:
FAILURE: https:/
FAILURE: https:/
FAILURE: https:/
FAILURE: https:/
Click here to trigger a rebuild:
https:/
- 531. By Ryan Harper
-
Update unittest, using errno means we skip the os.path.exists check (which can race)
- 532. By Ryan Harper
-
merge from trunk
Server Team CI bot (server-team-bot) wrote : | # |
PASSED: Continuous integration, rev:532
https:/
Executed test runs:
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
Click here to trigger a rebuild:
https:/
Scott Moser (smoser) : | # |
Ryan Harper (raharper) wrote : | # |
On Thu, Oct 12, 2017 at 2:56 PM, Scott Moser <email address hidden> wrote:
> Review: Needs Information
>
>
>
> Diff comments:
>
> > === modified file 'curtin/
> > --- curtin/
> > +++ curtin/
> > @@ -132,8 +132,13 @@
> > os.path.
> > else:
> > LOG.info('stopping bcache cacheset at: %s', bcache_cache_sysfs)
> > - util.write_
> > - '1', mode=None)
> > + bcache_stop = os.path.
> > + try:
> > + util.write_
> > + except (IOError, OSError) as e:
> > + if e.errno == errno.ENOENT:
> > + LOG.debug('bcache stop file %s missing, device removed:
> %s',
> > + bcache_stop, e)
>
> dont you want:
> else:
> raise e
> ?
> or did you intend to be completely silent on failure except in the ENOENT
> case.
>
This is the only case where we want to eat the exception; the write to the
file
failed due to bcache kernel bits stopping. I'll add a comment to make it
more clear.
>
> > try:
> > util.wait_
> retries=
> > except OSError:
>
>
> --
> https:/
> bcache-
> You are the owner of lp:~raharper/curtin/trunk.fix-
> bcache-
>
- 533. By Ryan Harper
-
Ensure we don't swallow all exceptions, only ENOENT
Server Team CI bot (server-team-bot) wrote : | # |
PASSED: Continuous integration, rev:533
https:/
Executed test runs:
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
Click here to trigger a rebuild:
https:/
Chad Smith (chad.smith) wrote : | # |
I like the fact that you've added unit tests here, but the depth of mocking on this makes for unit tests that know too much about the internal implementation of the shutdown_bcache function. which means unittests will break more frequently if we change the operations performed by shutdown_bcache. Since the try/except handling is boilerplate for both cases, let's just pull it into a helper function which can be easily tested. Here's a patch for that and the simple unit tests addditions. I believe you can also drop your two unit tests as it feels like the mock depth will make it harder to maintain.
- 534. By Ryan Harper
-
Use common bcache_maybe_remove handler for simpler unittests
Server Team CI bot (server-team-bot) wrote : | # |
PASSED: Continuous integration, rev:534
https:/
Executed test runs:
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
Click here to trigger a rebuild:
https:/
Preview Diff
1 | === modified file 'curtin/block/clear_holders.py' |
2 | --- curtin/block/clear_holders.py 2017-05-24 18:42:40 +0000 |
3 | +++ curtin/block/clear_holders.py 2017-10-13 17:48:44 +0000 |
4 | @@ -93,6 +93,22 @@ |
5 | return path |
6 | |
7 | |
8 | +def maybe_stop_bcache_device(device): |
9 | + """Attempt to stop the provided device_path or raise unexpected errors.""" |
10 | + bcache_stop = os.path.join(device, 'stop') |
11 | + try: |
12 | + util.write_file(bcache_stop, '1', mode=None) |
13 | + except (IOError, OSError) as e: |
14 | + # Note: if we get a write failure and the exception is that the |
15 | + # file is not found; we log this expected behavior; any other |
16 | + # exception is raised |
17 | + if e.errno == errno.ENOENT: |
18 | + LOG.debug('bcache stop file %s missing, device removed: %s', |
19 | + bcache_stop, e) |
20 | + else: |
21 | + raise e |
22 | + |
23 | + |
24 | def shutdown_bcache(device): |
25 | """ |
26 | Shut down bcache for specified bcache device |
27 | @@ -132,8 +148,7 @@ |
28 | os.path.basename(bcache_cache_sysfs)) |
29 | else: |
30 | LOG.info('stopping bcache cacheset at: %s', bcache_cache_sysfs) |
31 | - util.write_file(os.path.join(bcache_cache_sysfs, 'stop'), |
32 | - '1', mode=None) |
33 | + maybe_stop_bcache_device(bcache_cache_sysfs) |
34 | try: |
35 | util.wait_for_removal(bcache_cache_sysfs, retries=removal_retries) |
36 | except OSError: |
37 | @@ -162,8 +177,7 @@ |
38 | return |
39 | else: |
40 | LOG.info('stopping bcache backing device at: %s', bcache_block_sysfs) |
41 | - util.write_file(os.path.join(bcache_block_sysfs, 'stop'), |
42 | - '1', mode=None) |
43 | + maybe_stop_bcache_device(bcache_block_sysfs) |
44 | try: |
45 | # wait for them all to go away |
46 | for dev in [device, bcache_block_sysfs] + slave_paths: |
47 | |
48 | === modified file 'tests/unittests/test_clear_holders.py' |
49 | --- tests/unittests/test_clear_holders.py 2017-08-02 20:55:03 +0000 |
50 | +++ tests/unittests/test_clear_holders.py 2017-10-13 17:48:44 +0000 |
51 | @@ -1,3 +1,4 @@ |
52 | +import errno |
53 | import mock |
54 | import os |
55 | import textwrap |
56 | @@ -329,6 +330,52 @@ |
57 | mock.call(cset, retries=self.remove_retries) |
58 | ]) |
59 | |
60 | + # test bcache shutdown with 'stop' sysfs write failure |
61 | + @mock.patch('curtin.block.clear_holders.udev.udevadm_settle') |
62 | + @mock.patch('curtin.block.clear_holders.get_bcache_sys_path') |
63 | + @mock.patch('curtin.block.clear_holders.util') |
64 | + @mock.patch('curtin.block.clear_holders.os') |
65 | + @mock.patch('curtin.block.clear_holders.LOG') |
66 | + @mock.patch('curtin.block.clear_holders.get_bcache_using_dev') |
67 | + def test_shutdown_bcache_stop_sysfs_write_fails(self, mock_get_bcache, |
68 | + mock_log, mock_os, |
69 | + mock_util, |
70 | + mock_get_bcache_block, |
71 | + mock_udevadm_settle): |
72 | + """Test writes sysfs write failures pass if file not present""" |
73 | + device = "/sys/class/block/null" |
74 | + mock_os.path.exists.side_effect = iter([ |
75 | + True, # backing device exists |
76 | + True, # cset device not present (already removed) |
77 | + False, # backing device is removed with cset |
78 | + False, # bcache/stop sysfs is missing (already removed) |
79 | + ]) |
80 | + cset = '/sys/fs/bcache/fake' |
81 | + mock_get_bcache.return_value = cset |
82 | + mock_get_bcache_block.return_value = device + '/bcache' |
83 | + mock_os.path.join.side_effect = os.path.join |
84 | + |
85 | + # make writes to sysfs fail |
86 | + mock_util.write_file.side_effect = IOError(errno.ENOENT, |
87 | + "File not found") |
88 | + |
89 | + clear_holders.shutdown_bcache(device) |
90 | + |
91 | + self.assertEqual(2, len(mock_log.info.call_args_list)) |
92 | + self.assertEqual(3, len(mock_os.path.exists.call_args_list)) |
93 | + self.assertEqual(1, len(mock_get_bcache.call_args_list)) |
94 | + self.assertEqual(1, len(mock_get_bcache_block.call_args_list)) |
95 | + self.assertEqual(1, len(mock_util.write_file.call_args_list)) |
96 | + self.assertEqual(1, len(mock_util.wait_for_removal.call_args_list)) |
97 | + |
98 | + mock_get_bcache.assert_called_with(device, strict=False) |
99 | + mock_util.write_file.assert_has_calls([ |
100 | + mock.call(cset + '/stop', '1', mode=None), |
101 | + ]) |
102 | + mock_util.wait_for_removal.assert_has_calls([ |
103 | + mock.call(cset, retries=self.remove_retries) |
104 | + ]) |
105 | + |
106 | @mock.patch('curtin.block.clear_holders.LOG') |
107 | @mock.patch('curtin.block.clear_holders.block.sys_block_path') |
108 | @mock.patch('curtin.block.clear_holders.lvm') |
109 | @@ -531,6 +578,41 @@ |
110 | for tree, result in test_trees_and_results: |
111 | self.assertEqual(clear_holders.format_holders_tree(tree), result) |
112 | |
113 | + @mock.patch('curtin.block.clear_holders.util.write_file') |
114 | + def test_maybe_stop_bcache_device_raises_errors(self, m_write_file): |
115 | + """Unexpected exceptions are raised by maybe_stop_bcache_device.""" |
116 | + m_write_file.side_effect = OSError(errno.EAGAIN, 'Unexpected errno') |
117 | + with self.assertRaises(OSError) as cm: |
118 | + clear_holders.maybe_stop_bcache_device('does/not/matter') |
119 | + self.assertEqual('[Errno 11] Unexpected errno', str(cm.exception)) |
120 | + self.assertEqual( |
121 | + mock.call('does/not/matter/stop', '1', mode=None), |
122 | + m_write_file.call_args) |
123 | + |
124 | + @mock.patch('curtin.block.clear_holders.LOG') |
125 | + @mock.patch('curtin.block.clear_holders.util.write_file') |
126 | + def test_maybe_stop_bcache_device_handles_oserror(self, m_write_file, |
127 | + m_log): |
128 | + """When OSError.NOENT is raised, log the condition and move on.""" |
129 | + m_write_file.side_effect = OSError(errno.ENOENT, 'Expected oserror') |
130 | + clear_holders.maybe_stop_bcache_device('does/not/matter') |
131 | + self.assertEqual( |
132 | + 'bcache stop file %s missing, device removed: %s', |
133 | + m_log.debug.call_args[0][0]) |
134 | + self.assertEqual('does/not/matter/stop', m_log.debug.call_args[0][1]) |
135 | + |
136 | + @mock.patch('curtin.block.clear_holders.LOG') |
137 | + @mock.patch('curtin.block.clear_holders.util.write_file') |
138 | + def test_maybe_stop_bcache_device_handles_ioerror(self, m_write_file, |
139 | + m_log): |
140 | + """When IOError.NOENT is raised, log the condition and move on.""" |
141 | + m_write_file.side_effect = IOError(errno.ENOENT, 'Expected ioerror') |
142 | + clear_holders.maybe_stop_bcache_device('does/not/matter') |
143 | + self.assertEqual( |
144 | + 'bcache stop file %s missing, device removed: %s', |
145 | + m_log.debug.call_args[0][0]) |
146 | + self.assertEqual('does/not/matter/stop', m_log.debug.call_args[0][1]) |
147 | + |
148 | def test_get_holder_types(self): |
149 | """test clear_holders.get_holder_types""" |
150 | test_trees_and_results = [ |
PASSED: Continuous integration, rev:526 /jenkins. ubuntu. com/server/ job/curtin- ci/622/ /jenkins. ubuntu. com/server/ job/curtin- ci/nodes= metal-amd64/ 622 /jenkins. ubuntu. com/server/ job/curtin- ci/nodes= metal-arm64/ 622 /jenkins. ubuntu. com/server/ job/curtin- ci/nodes= metal-ppc64el/ 622 /jenkins. ubuntu. com/server/ job/curtin- ci/nodes= metal-s390x/ 622
https:/
Executed test runs:
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
Click here to trigger a rebuild: /jenkins. ubuntu. com/server/ job/curtin- ci/622/ rebuild
https:/