Merge ~rodrigo-barbieri2010/ubuntu/+source/nova:bug/1821755_stein into ~ubuntu-openstack-dev/ubuntu/+source/nova:stable/stein
Proposed by: Rodrigo Barbieri
Status: Merged
Merged at revision: 3a93008b5ce6b1cb0e40f8ac9d32295596110f60
Proposed branch: ~rodrigo-barbieri2010/ubuntu/+source/nova:bug/1821755_stein
Merge into: ~ubuntu-openstack-dev/ubuntu/+source/nova:stable/stein
Diff against target: 478 lines (+454/-1), 3 files modified: debian/changelog (+10/-1), debian/patches/lp1821755.patch (+443/-0), debian/patches/series (+1/-0)
Related bugs: (none listed)
| Reviewer | Review Type | Date Requested | Status |
|---|---|---|---|
| Chris MacNaughton | | | Pending |

Review via email: mp+409020@code.launchpad.net
Commit message
Description of the change
Revision history for this message
Chris MacNaughton (chris.macnaughton) wrote:

The change looks OK, but the related bug has no Stein targets. It would be good to get bug tasks raised for the cloud-archive (and for nova in Ubuntu as well) to track getting this patch into the distro.
Revision history for this message
Rodrigo Barbieri (rodrigo-barbieri2010) wrote:
Thanks Chris. I targeted the cloud-archive project and the stein series. I also linked the bug in this MP.
Preview Diff
1 | diff --git a/debian/changelog b/debian/changelog |
2 | index e72b114..9e15639 100644 |
3 | --- a/debian/changelog |
4 | +++ b/debian/changelog |
5 | @@ -1,5 +1,14 @@ |
6 | -nova (2:19.3.2-0ubuntu1~cloud1) UNRELEASED; urgency=medium |
7 | +nova (2:19.3.2-0ubuntu1~cloud2) UNRELEASED; urgency=medium |
8 | + |
9 | + [ Rodrigo Barbieri ] |
10 | + * d/p/lp1821755.patch: Error anti-affinity violation on |
11 | + migrations (LP #1821755). |
12 | + |
13 | + -- Rodrigo Barbieri <rodrigo.barbieri@canonical.com> Wed, 22 Sep 2021 21:14:10 +0000 |
14 | + |
15 | +nova (2:19.3.2-0ubuntu1~cloud1) bionic-stein; urgency=medium |
16 | |
17 | + [ Hemanth Nakkina ] |
18 | * d/p/0001-Update-pci-stat-pools-based-on-PCI-device-changes.patch: Update pci |
19 | stats pools based on PCI device changes (LP: #1892361). |
20 | |
21 | diff --git a/debian/patches/lp1821755.patch b/debian/patches/lp1821755.patch |
22 | new file mode 100644 |
23 | index 0000000..e0154e8 |
24 | --- /dev/null |
25 | +++ b/debian/patches/lp1821755.patch |
26 | @@ -0,0 +1,443 @@ |
27 | +From 5fa8718fe57e59b178d081e2068109151fdc3926 Mon Sep 17 00:00:00 2001 |
28 | +From: Rodrigo Barbieri <rodrigo.barbieri2010@gmail.com> |
29 | +Date: Wed, 31 Mar 2021 11:06:49 -0300 |
30 | +Subject: [PATCH] Error anti-affinity violation on migrations |
31 | + |
32 | +Error-out the migrations (cold and live) whenever the |
33 | +anti-affinity policy is violated. This addresses |
34 | +violations when multiple concurrent migrations are |
35 | +requested. |
36 | + |
37 | +Added detection on: |
38 | +- prep_resize |
39 | +- check_can_live_migration_destination |
40 | +- pre_live_migration |
41 | + |
42 | +The improved method of detection now locks based on group_id |
43 | +and considers other migrations in-progress as well. |
44 | + |
45 | +Conflicts: |
46 | + nova/tests/unit/compute/test_compute_mgr.py |
47 | + |
48 | +NOTE: Conflicts are because the following changes are not in Stein: |
49 | + |
50 | + * Ia00277ac8a68a635db85f9e0ce2c6d8df396e0d8 (Set migrate_data.vifs only when using multiple port bindings) |
51 | + * I3c917796cb30d11e7db1e235ac1625d2a743aaa2 (NUMA live migration support) |
52 | + * I2f3434f06489d8b6cb80933bcb1ea1e841049ba5 (Support migrating SRIOV port with bandwidth) |
53 | + * I292a0e2d840bbf657ba6d0932f9a3decbcb2778f ([FUP] Follow-up patch for SR-IOV live migration) |
54 | + * I734cc01dce13f9e75a16639faf890ddb1661b7eb (SR-IOV Live migration indirect port support) |
55 | + |
56 | +Summary of conflicts: |
57 | + |
58 | +Cherry-picked pulled many non-related newer unit tests. |
59 | +Cleaned those up and adjusted: |
60 | +- Removed 2 extra params of check_can_live_migrate_destination |
61 | + invocations. |
62 | +- Adjusted request_spec variable of |
63 | + unit test test_prep_resize_errors_migration. |
64 | +- Removed extra tab spacing on a unit test. |
65 | + |
66 | +Closes-bug: #1821755 |
67 | +Change-Id: I32e6214568bb57f7613ddeba2c2c46da0320fabc |
68 | +(cherry picked from commit 33c8af1f8c46c9c37fcc28fb3409fbd3a78ae39f) |
69 | +(cherry picked from commit 8b62a4ec9bf617dfb2da046c25a9f76b33516508) |
70 | +(cherry picked from commit 6ede6df7f41db809de19e124d3d4994180598f19) |
71 | +(cherry picked from commit bf90a1e06181f6b328b967124e538c6e2579b2e5) |
72 | +(cherry picked from commit a22d1b04de9e6ebc33b5ab9871b86f8e4022e7a9) |
73 | +--- |
74 | + nova/compute/manager.py | 110 ++++++++++++++--- |
75 | + nova/tests/unit/compute/test_compute_mgr.py | 112 ++++++++++++++++-- |
76 | + .../notes/bug-1821755-7bd03319e34b6b10.yaml | 11 ++ |
77 | + 3 files changed, 206 insertions(+), 27 deletions(-) |
78 | + create mode 100644 releasenotes/notes/bug-1821755-7bd03319e34b6b10.yaml |
79 | + |
80 | +diff --git a/nova/compute/manager.py b/nova/compute/manager.py |
81 | +index c7a0f37e92..1c29aa278f 100644 |
82 | +--- a/nova/compute/manager.py |
83 | ++++ b/nova/compute/manager.py |
84 | +@@ -1474,7 +1474,11 @@ class ComputeManager(manager.Manager): |
85 | + return [_decode(f) for f in injected_files] |
86 | + |
87 | + def _validate_instance_group_policy(self, context, instance, |
88 | +- scheduler_hints): |
89 | ++ scheduler_hints=None): |
90 | ++ |
91 | ++ if CONF.workarounds.disable_group_policy_check_upcall: |
92 | ++ return |
93 | ++ |
94 | + # NOTE(russellb) Instance group policy is enforced by the scheduler. |
95 | + # However, there is a race condition with the enforcement of |
96 | + # the policy. Since more than one instance may be scheduled at the |
97 | +@@ -1483,29 +1487,63 @@ class ComputeManager(manager.Manager): |
98 | + # multiple instances with an affinity policy could end up on different |
99 | + # hosts. This is a validation step to make sure that starting the |
100 | + # instance here doesn't violate the policy. |
101 | +- group_hint = scheduler_hints.get('group') |
102 | +- if not group_hint: |
103 | +- return |
104 | +- |
105 | +- # The RequestSpec stores scheduler_hints as key=list pairs so we need |
106 | +- # to check the type on the value and pull the single entry out. The |
107 | +- # API request schema validates that the 'group' hint is a single value. |
108 | +- if isinstance(group_hint, list): |
109 | +- group_hint = group_hint[0] |
110 | ++ if scheduler_hints is not None: |
111 | ++ # only go through here if scheduler_hints is provided, even if it |
112 | ++ # is empty. |
113 | ++ group_hint = scheduler_hints.get('group') |
114 | ++ if not group_hint: |
115 | ++ return |
116 | ++ else: |
117 | ++ # The RequestSpec stores scheduler_hints as key=list pairs so |
118 | ++ # we need to check the type on the value and pull the single |
119 | ++ # entry out. The API request schema validates that |
120 | ++ # the 'group' hint is a single value. |
121 | ++ if isinstance(group_hint, list): |
122 | ++ group_hint = group_hint[0] |
123 | ++ |
124 | ++ group = objects.InstanceGroup.get_by_hint(context, group_hint) |
125 | ++ else: |
126 | ++ # TODO(ganso): a call to DB can be saved by adding request_spec |
127 | ++ # to rpcapi payload of live_migration, pre_live_migration and |
128 | ++ # check_can_live_migrate_destination |
129 | ++ try: |
130 | ++ group = objects.InstanceGroup.get_by_instance_uuid( |
131 | ++ context, instance.uuid) |
132 | ++ except exception.InstanceGroupNotFound: |
133 | ++ return |
134 | + |
135 | +- @utils.synchronized(group_hint) |
136 | +- def _do_validation(context, instance, group_hint): |
137 | +- group = objects.InstanceGroup.get_by_hint(context, group_hint) |
138 | ++ @utils.synchronized(group['uuid']) |
139 | ++ def _do_validation(context, instance, group): |
140 | + if group.policy and 'anti-affinity' == group.policy: |
141 | ++ |
142 | ++ # instances on host |
143 | + instances_uuids = objects.InstanceList.get_uuids_by_host( |
144 | + context, self.host) |
145 | + ins_on_host = set(instances_uuids) |
146 | ++ |
147 | ++ # instance param is just for logging, the nodename obtained is |
148 | ++ # not actually related to the instance at all |
149 | ++ nodename = self._get_nodename(instance) |
150 | ++ |
151 | ++ # instances being migrated to host |
152 | ++ migrations = ( |
153 | ++ objects.MigrationList.get_in_progress_by_host_and_node( |
154 | ++ context, self.host, nodename)) |
155 | ++ migration_vm_uuids = set([mig['instance_uuid'] |
156 | ++ for mig in migrations]) |
157 | ++ |
158 | ++ total_instances = migration_vm_uuids | ins_on_host |
159 | ++ |
160 | ++ # refresh group to get updated members within locked block |
161 | ++ group = objects.InstanceGroup.get_by_uuid(context, |
162 | ++ group['uuid']) |
163 | + members = set(group.members) |
164 | + # Determine the set of instance group members on this host |
165 | + # which are not the instance in question. This is used to |
166 | + # determine how many other members from the same anti-affinity |
167 | + # group can be on this host. |
168 | +- members_on_host = ins_on_host & members - set([instance.uuid]) |
169 | ++ members_on_host = (total_instances & members - |
170 | ++ set([instance.uuid])) |
171 | + rules = group.rules |
172 | + if rules and 'max_server_per_host' in rules: |
173 | + max_server = rules['max_server_per_host'] |
174 | +@@ -1517,6 +1555,12 @@ class ComputeManager(manager.Manager): |
175 | + raise exception.RescheduledException( |
176 | + instance_uuid=instance.uuid, |
177 | + reason=msg) |
178 | ++ |
179 | ++ # NOTE(ganso): The check for affinity below does not work and it |
180 | ++ # can easily be violated because the lock happens in different |
181 | ++ # compute hosts. |
182 | ++ # The only fix seems to be a DB lock to perform the check whenever |
183 | ++ # setting the host field to an instance. |
184 | + elif group.policy and 'affinity' == group.policy: |
185 | + group_hosts = group.get_hosts(exclude=[instance.uuid]) |
186 | + if group_hosts and self.host not in group_hosts: |
187 | +@@ -1525,8 +1569,7 @@ class ComputeManager(manager.Manager): |
188 | + instance_uuid=instance.uuid, |
189 | + reason=msg) |
190 | + |
191 | +- if not CONF.workarounds.disable_group_policy_check_upcall: |
192 | +- _do_validation(context, instance, group_hint) |
193 | ++ _do_validation(context, instance, group) |
194 | + |
195 | + def _log_original_error(self, exc_info, instance_uuid): |
196 | + LOG.error('Error: %s', exc_info[1], instance_uuid=instance_uuid, |
197 | +@@ -4646,10 +4689,24 @@ class ComputeManager(manager.Manager): |
198 | + with self._error_out_instance_on_exception( |
199 | + context, instance, instance_state=instance_state),\ |
200 | + errors_out_migration_ctxt(migration): |
201 | ++ |
202 | + self._send_prep_resize_notifications( |
203 | + context, instance, fields.NotificationPhase.START, |
204 | + instance_type) |
205 | + try: |
206 | ++ scheduler_hints = self._get_scheduler_hints(filter_properties, |
207 | ++ request_spec) |
208 | ++ # Error out if this host cannot accept the new instance due |
209 | ++ # to anti-affinity. At this point the migration is already |
210 | ++ # in-progress, so this is the definitive moment to abort due to |
211 | ++ # the policy violation. Also, exploding here is covered by the |
212 | ++ # cleanup methods in except block. |
213 | ++ try: |
214 | ++ self._validate_instance_group_policy(context, instance, |
215 | ++ scheduler_hints) |
216 | ++ except exception.RescheduledException as e: |
217 | ++ raise exception.InstanceFaultRollback(inner_exception=e) |
218 | ++ |
219 | + self._prep_resize(context, image, instance, |
220 | + instance_type, filter_properties, |
221 | + node, migration, clean_shutdown) |
222 | +@@ -6505,6 +6562,20 @@ class ComputeManager(manager.Manager): |
223 | + if None, ignore disk usage checking |
224 | + :returns: a dict containing migration info |
225 | + """ |
226 | ++ |
227 | ++ # Error out if this host cannot accept the new instance due |
228 | ++ # to anti-affinity. This check at this moment is not very accurate, as |
229 | ++ # multiple requests may be happening concurrently and miss the lock, |
230 | ++ # but when it works it provides a better user experience by failing |
231 | ++ # earlier. Also, it should be safe to explode here, error becomes |
232 | ++ # NoValidHost and instance status remains ACTIVE. |
233 | ++ try: |
234 | ++ self._validate_instance_group_policy(ctxt, instance) |
235 | ++ except exception.RescheduledException as e: |
236 | ++ msg = ("Failed to validate instance group policy " |
237 | ++ "due to: {}".format(e)) |
238 | ++ raise exception.MigrationPreCheckError(reason=msg) |
239 | ++ |
240 | + src_compute_info = obj_base.obj_to_primitive( |
241 | + self._get_compute_info(ctxt, instance.host)) |
242 | + dst_compute_info = obj_base.obj_to_primitive( |
243 | +@@ -6567,6 +6638,13 @@ class ComputeManager(manager.Manager): |
244 | + """ |
245 | + LOG.debug('pre_live_migration data is %s', migrate_data) |
246 | + |
247 | ++ # Error out if this host cannot accept the new instance due |
248 | ++ # to anti-affinity. At this point the migration is already in-progress, |
249 | ++ # so this is the definitive moment to abort due to the policy |
250 | ++ # violation. Also, it should be safe to explode here. The instance |
251 | ++ # status remains ACTIVE, migration status failed. |
252 | ++ self._validate_instance_group_policy(context, instance) |
253 | ++ |
254 | + migrate_data.old_vol_attachment_ids = {} |
255 | + bdms = objects.BlockDeviceMappingList.get_by_instance_uuid( |
256 | + context, instance.uuid) |
257 | +diff --git a/nova/tests/unit/compute/test_compute_mgr.py b/nova/tests/unit/compute/test_compute_mgr.py |
258 | +index 1c7b5de37a..ef048e1d68 100644 |
259 | +--- a/nova/tests/unit/compute/test_compute_mgr.py |
260 | ++++ b/nova/tests/unit/compute/test_compute_mgr.py |
261 | +@@ -2915,14 +2915,36 @@ class ComputeManagerUnitTestCase(test.NoDBTestCase, |
262 | + self.context, 'compute_check_can_live_migrate_destination', |
263 | + CONF.host, instance.uuid) |
264 | + |
265 | ++ @mock.patch('nova.objects.InstanceGroup.get_by_instance_uuid', mock.Mock( |
266 | ++ side_effect=exception.InstanceGroupNotFound(group_uuid=''))) |
267 | + def test_check_can_live_migrate_destination_success(self): |
268 | + self._test_check_can_live_migrate_destination() |
269 | + |
270 | ++ @mock.patch('nova.objects.InstanceGroup.get_by_instance_uuid', mock.Mock( |
271 | ++ side_effect=exception.InstanceGroupNotFound(group_uuid=''))) |
272 | + def test_check_can_live_migrate_destination_fail(self): |
273 | + self.assertRaises( |
274 | +- test.TestingException, |
275 | +- self._test_check_can_live_migrate_destination, |
276 | +- do_raise=True) |
277 | ++ test.TestingException, |
278 | ++ self._test_check_can_live_migrate_destination, |
279 | ++ do_raise=True) |
280 | ++ |
281 | ++ @mock.patch.object(compute_utils, 'add_instance_fault_from_exc') |
282 | ++ def test_check_can_live_migrate_destination_fail_group_policy( |
283 | ++ self, mock_fail_db): |
284 | ++ |
285 | ++ instance = fake_instance.fake_instance_obj( |
286 | ++ self.context, host=self.compute.host, vm_state=vm_states.ACTIVE, |
287 | ++ node='fake-node') |
288 | ++ |
289 | ++ ex = exception.RescheduledException( |
290 | ++ instance_uuid=instance.uuid, reason="policy violated") |
291 | ++ |
292 | ++ with mock.patch.object(self.compute, '_validate_instance_group_policy', |
293 | ++ side_effect=ex): |
294 | ++ self.assertRaises( |
295 | ++ exception.MigrationPreCheckError, |
296 | ++ self.compute.check_can_live_migrate_destination, |
297 | ++ self.context, instance, None, None) |
298 | + |
299 | + @mock.patch('nova.compute.manager.InstanceEvents._lock_name') |
300 | + def test_prepare_for_instance_event(self, lock_name_mock): |
301 | +@@ -6447,7 +6469,8 @@ class ComputeManagerBuildInstanceTestCase(test.NoDBTestCase): |
302 | + def test_validate_policy_honors_workaround_disabled(self, mock_get): |
303 | + instance = objects.Instance(uuid=uuids.instance) |
304 | + hints = {'group': 'foo'} |
305 | +- mock_get.return_value = objects.InstanceGroup(policy=None) |
306 | ++ mock_get.return_value = objects.InstanceGroup(policy=None, |
307 | ++ uuid=uuids.group) |
308 | + self.compute._validate_instance_group_policy(self.context, |
309 | + instance, hints) |
310 | + mock_get.assert_called_once_with(self.context, 'foo') |
311 | +@@ -6473,10 +6496,14 @@ class ComputeManagerBuildInstanceTestCase(test.NoDBTestCase): |
312 | + instance, hints) |
313 | + mock_get.assert_called_once_with(self.context, uuids.group_hint) |
314 | + |
315 | ++ @mock.patch('nova.objects.InstanceGroup.get_by_uuid') |
316 | + @mock.patch('nova.objects.InstanceList.get_uuids_by_host') |
317 | + @mock.patch('nova.objects.InstanceGroup.get_by_hint') |
318 | +- def test_validate_instance_group_policy_with_rules(self, mock_get_by_hint, |
319 | +- mock_get_by_host): |
320 | ++ @mock.patch.object(fake_driver.FakeDriver, 'get_available_nodes') |
321 | ++ @mock.patch('nova.objects.MigrationList.get_in_progress_by_host_and_node') |
322 | ++ def test_validate_instance_group_policy_with_rules( |
323 | ++ self, migration_list, nodes, mock_get_by_hint, mock_get_by_host, |
324 | ++ mock_get_by_uuid): |
325 | + # Create 2 instance in same host, inst2 created before inst1 |
326 | + instance = objects.Instance(uuid=uuids.inst1) |
327 | + hints = {'group': [uuids.group_hint]} |
328 | +@@ -6485,17 +6512,26 @@ class ComputeManagerBuildInstanceTestCase(test.NoDBTestCase): |
329 | + mock_get_by_host.return_value = existing_insts |
330 | + |
331 | + # if group policy rules limit to 1, raise RescheduledException |
332 | +- mock_get_by_hint.return_value = objects.InstanceGroup( |
333 | ++ group = objects.InstanceGroup( |
334 | + policy='anti-affinity', rules={'max_server_per_host': '1'}, |
335 | +- hosts=['host1'], members=members_uuids) |
336 | ++ hosts=['host1'], members=members_uuids, |
337 | ++ uuid=uuids.group) |
338 | ++ mock_get_by_hint.return_value = group |
339 | ++ mock_get_by_uuid.return_value = group |
340 | ++ nodes.return_value = ['nodename'] |
341 | ++ migration_list.return_value = [objects.Migration( |
342 | ++ uuid=uuids.migration, instance_uuid=uuids.instance)] |
343 | + self.assertRaises(exception.RescheduledException, |
344 | + self.compute._validate_instance_group_policy, |
345 | + self.context, instance, hints) |
346 | + |
347 | + # if group policy rules limit change to 2, validate OK |
348 | +- mock_get_by_hint.return_value = objects.InstanceGroup( |
349 | ++ group2 = objects.InstanceGroup( |
350 | + policy='anti-affinity', rules={'max_server_per_host': 2}, |
351 | +- hosts=['host1'], members=members_uuids) |
352 | ++ hosts=['host1'], members=members_uuids, |
353 | ++ uuid=uuids.group) |
354 | ++ mock_get_by_hint.return_value = group2 |
355 | ++ mock_get_by_uuid.return_value = group2 |
356 | + self.compute._validate_instance_group_policy(self.context, |
357 | + instance, hints) |
358 | + |
359 | +@@ -7966,6 +8002,8 @@ class ComputeManagerMigrationTestCase(test.NoDBTestCase, |
360 | + else: |
361 | + mock_log.warning.assert_not_called() |
362 | + |
363 | ++ @mock.patch('nova.objects.InstanceGroup.get_by_instance_uuid', mock.Mock( |
364 | ++ side_effect=exception.InstanceGroupNotFound(group_uuid=''))) |
365 | + def test_pre_live_migration_cinder_v3_api(self): |
366 | + # This tests that pre_live_migration with a bdm with an |
367 | + # attachment_id, will create a new attachment and update |
368 | +@@ -8043,6 +8081,8 @@ class ComputeManagerMigrationTestCase(test.NoDBTestCase, |
369 | + |
370 | + _test() |
371 | + |
372 | ++ @mock.patch('nova.objects.InstanceGroup.get_by_instance_uuid', mock.Mock( |
373 | ++ side_effect=exception.InstanceGroupNotFound(group_uuid=''))) |
374 | + def test_pre_live_migration_exception_cinder_v3_api(self): |
375 | + # The instance in this test has 2 attachments. The second attach_create |
376 | + # will throw an exception. This will test that the first attachment |
377 | +@@ -8112,6 +8152,8 @@ class ComputeManagerMigrationTestCase(test.NoDBTestCase, |
378 | + self.assertGreater(len(m.mock_calls), 0) |
379 | + _test() |
380 | + |
381 | ++ @mock.patch('nova.objects.InstanceGroup.get_by_instance_uuid', mock.Mock( |
382 | ++ side_effect=exception.InstanceGroupNotFound(group_uuid=''))) |
383 | + def test_pre_live_migration_exceptions_delete_attachments(self): |
384 | + # The instance in this test has 2 attachments. The call to |
385 | + # driver.pre_live_migration will raise an exception. This will test |
386 | +@@ -9297,7 +9339,7 @@ class ComputeManagerMigrationTestCase(test.NoDBTestCase, |
387 | + self.compute.prep_resize, |
388 | + self.context, mock.sentinel.image, |
389 | + instance, flavor, |
390 | +- mock.sentinel.request_spec, |
391 | ++ objects.RequestSpec(), |
392 | + {}, 'node', False, |
393 | + migration, []) |
394 | + |
395 | +@@ -9375,6 +9417,54 @@ class ComputeManagerMigrationTestCase(test.NoDBTestCase, |
396 | + # (_error_out_instance_on_exception will set to ACTIVE by default). |
397 | + self.assertEqual(vm_states.STOPPED, instance.vm_state) |
398 | + |
399 | ++ @mock.patch('nova.compute.utils.notify_usage_exists') |
400 | ++ @mock.patch('nova.compute.manager.ComputeManager.' |
401 | ++ '_notify_about_instance_usage') |
402 | ++ @mock.patch('nova.compute.utils.notify_about_resize_prep_instance') |
403 | ++ @mock.patch('nova.objects.Instance.save') |
404 | ++ @mock.patch('nova.compute.manager.ComputeManager._revert_allocation') |
405 | ++ @mock.patch('nova.compute.manager.ComputeManager.' |
406 | ++ '_reschedule_resize_or_reraise') |
407 | ++ @mock.patch('nova.compute.utils.add_instance_fault_from_exc') |
408 | ++ # this is almost copy-paste from test_prep_resize_fails_rollback |
409 | ++ def test_prep_resize_fails_group_validation( |
410 | ++ self, add_instance_fault_from_exc, _reschedule_resize_or_reraise, |
411 | ++ _revert_allocation, mock_instance_save, |
412 | ++ notify_about_resize_prep_instance, _notify_about_instance_usage, |
413 | ++ notify_usage_exists): |
414 | ++ """Tests that if _validate_instance_group_policy raises |
415 | ++ InstanceFaultRollback, the instance.vm_state is reset properly in |
416 | ++ _error_out_instance_on_exception |
417 | ++ """ |
418 | ++ instance = fake_instance.fake_instance_obj( |
419 | ++ self.context, host=self.compute.host, vm_state=vm_states.STOPPED, |
420 | ++ node='fake-node', expected_attrs=['system_metadata', 'flavor']) |
421 | ++ migration = mock.MagicMock(spec='nova.objects.Migration') |
422 | ++ request_spec = mock.MagicMock(spec='nova.objects.RequestSpec') |
423 | ++ ex = exception.RescheduledException( |
424 | ++ instance_uuid=instance.uuid, reason="policy violated") |
425 | ++ ex2 = exception.InstanceFaultRollback( |
426 | ++ inner_exception=ex) |
427 | ++ |
428 | ++ def fake_reschedule_resize_or_reraise(*args, **kwargs): |
429 | ++ raise ex2 |
430 | ++ |
431 | ++ _reschedule_resize_or_reraise.side_effect = ( |
432 | ++ fake_reschedule_resize_or_reraise) |
433 | ++ |
434 | ++ with mock.patch.object(self.compute, '_validate_instance_group_policy', |
435 | ++ side_effect=ex): |
436 | ++ self.assertRaises( |
437 | ++ # _error_out_instance_on_exception should reraise the |
438 | ++ # RescheduledException inside InstanceFaultRollback. |
439 | ++ exception.RescheduledException, self.compute.prep_resize, |
440 | ++ self.context, instance.image_meta, instance, instance.flavor, |
441 | ++ request_spec, filter_properties={}, node=instance.node, |
442 | ++ clean_shutdown=True, migration=migration, host_list=[]) |
443 | ++ # The instance.vm_state should remain unchanged |
444 | ++ # (_error_out_instance_on_exception will set to ACTIVE by default). |
445 | ++ self.assertEqual(vm_states.STOPPED, instance.vm_state) |
446 | ++ |
447 | + def test_get_updated_nw_info_with_pci_mapping(self): |
448 | + old_dev = objects.PciDevice(address='0000:04:00.2') |
449 | + new_dev = objects.PciDevice(address='0000:05:00.3') |
450 | +diff --git a/releasenotes/notes/bug-1821755-7bd03319e34b6b10.yaml b/releasenotes/notes/bug-1821755-7bd03319e34b6b10.yaml |
451 | +new file mode 100644 |
452 | +index 0000000000..4c6135311b |
453 | +--- /dev/null |
454 | ++++ b/releasenotes/notes/bug-1821755-7bd03319e34b6b10.yaml |
455 | +@@ -0,0 +1,11 @@ |
456 | ++--- |
457 | ++fixes: |
458 | ++ - | |
459 | ++ Improved detection of anti-affinity policy violation when performing live |
460 | ++ and cold migrations. Most of the violations caused by race conditions due |
461 | ++ to performing concurrent live or cold migrations should now be addressed |
462 | ++ by extra checks in the compute service. Upon detection, cold migration |
463 | ++ operations are automatically rescheduled, while live migrations have two |
464 | ++ checks and will be rescheduled if detected by the first one, otherwise the |
465 | ++ live migration will fail cleanly and revert the instance state back to its |
466 | ++ previous value. |
467 | +-- |
468 | +2.17.1 |
469 | + |
470 | diff --git a/debian/patches/series b/debian/patches/series |
471 | index d47ac8b..dac98fd 100644 |
472 | --- a/debian/patches/series |
473 | +++ b/debian/patches/series |
474 | @@ -4,3 +4,4 @@ skip-ssl-tests.patch |
475 | arm-console-patch.patch |
476 | revert-generalize-db-conf-group-copying.patch |
477 | 0001-Update-pci-stat-pools-based-on-PCI-device-changes.patch |
478 | +lp1821755.patch |
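
For readers skimming the diff above: the core of the change is the reworked _validate_instance_group_policy, which now serializes on the group uuid and counts in-progress migrations towards the host as well as instances already on it before deciding whether anti-affinity is violated. Below is a minimal standalone sketch of that idea; it is not nova code, and every name in it is a simplified assumption used only for illustration.

```python
# Standalone sketch (NOT nova code): illustrates the check described in the
# patch's commit message: serialize per group uuid, then count anti-affinity
# group members that are either already on the host or migrating towards it.
import threading
from collections import defaultdict

_group_locks = defaultdict(threading.Lock)  # one lock per group uuid


def validate_anti_affinity(group_uuid, group_members, instance_uuid,
                           instances_on_host, migrating_to_host,
                           max_server_per_host=1):
    """Raise if placing instance_uuid on this host violates anti-affinity."""
    with _group_locks[group_uuid]:
        # Instances already on the host plus in-progress migrations targeting
        # it, excluding the instance being validated itself.
        total = set(instances_on_host) | set(migrating_to_host)
        members_on_host = (total & set(group_members)) - {instance_uuid}
        if len(members_on_host) >= max_server_per_host:
            raise RuntimeError(
                "anti-affinity group %s already has %d member(s) on this "
                "host (limit %d)" % (group_uuid, len(members_on_host),
                                     max_server_per_host))


# One group member is already on the host and another is migrating in, so
# with the default limit of 1 the check rejects a third member.
try:
    validate_anti_affinity(group_uuid="g1", group_members={"a", "b", "c"},
                           instance_uuid="c", instances_on_host={"a"},
                           migrating_to_host={"b"})
except RuntimeError as exc:
    print(exc)
```

In the actual patch the same effect is achieved with utils.synchronized on the group uuid, InstanceList.get_uuids_by_host, and MigrationList.get_in_progress_by_host_and_node, and the violation is surfaced as RescheduledException (wrapped in InstanceFaultRollback or MigrationPreCheckError depending on the call site).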