Merge lp:~salvatore-orlando/nova/bug723301 into lp:~hudson-openstack/nova/trunk
Status: Merged
Approved by: Brian Waldon
Approved revision: 798
Merged at revision: 1250
Proposed branch: lp:~salvatore-orlando/nova/bug723301
Merge into: lp:~hudson-openstack/nova/trunk
Diff against target: 621 lines (+320/-95), 5 files modified
- nova/tests/test_xenapi.py (+42/-0)
- nova/tests/xenapi/stubs.py (+39/-0)
- nova/virt/xenapi/vm_utils.py (+116/-49)
- nova/virt/xenapi/vmops.py (+121/-44)
- plugins/xenserver/xenapi/etc/xapi.d/plugins/glance (+2/-2)
To merge this branch: bzr merge lp:~salvatore-orlando/nova/bug723301
Related bugs: none
Reviewer | Review Type | Date Requested | Status
---|---|---|---
Brian Waldon | community | | Approve
Soren Hansen | community | | Abstain
Brian Lamar | community | | Approve
Sandy Walsh | community | | Approve
termie | community | | Needs Fixing
Matt Dietz | community | | Approve
Rick Harris | community | | Approve

Review via email: mp+53482@code.launchpad.net
Commit message
implemented clean-up logic when VM fails to spawn for xenapi back-end
Description of the change
The spawn method in VMOps has been modified to handle exceptions occurring before a VM record is created; the exception-handling procedure will destroy any VDI created so far and remove kernel/ramdisk files from /boot/guest in the dom0 file system.
Exceptions generated by the methods called by spawn that deal with resources on the hypervisor are caught and re-raised with the identifier of any already-created VDI attached.
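The flow described above can be sketched in a simplified form; the names used here (create_disks, create_vm_record, destroy_vdi, SpawnError) are illustrative stand-ins, not nova's actual API surface:

```python
class SpawnError(Exception):
    """Stand-in for the hypervisor failures the real code catches."""


def spawn(hypervisor, instance):
    """Create disks, then the VM record; roll back disks on failure."""
    vdis = []
    try:
        vdis = hypervisor.create_disks(instance)
        vm_ref = hypervisor.create_vm_record(instance, vdis)
        return vm_ref
    except SpawnError:
        # Before the VM record exists, nothing references these VDIs,
        # so they must be destroyed here; after the record is created,
        # normal instance termination cleans them up instead.
        for vdi in vdis:
            hypervisor.destroy_vdi(vdi)
        raise
```

The key point is that the rollback only runs for failures occurring before the VM record exists, and the original exception is re-raised afterwards.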
Salvatore Orlando (salvatore-orlando) wrote:
Hi Brian,
thanks for your prompt review.
Please see my comments inline.
> I'm not familiar with a lot of the terminology here but hopefully I can be of
> some assistance.
>
> 33: self.assertRais
> 34: + self._test_spawn, 1, 2, 3)
>
> Can this look for a particular exception instead of `Exception`?
The reason I'm looking for a generic exception is that the spawn process might fail in several ways. However, since I'm forcing it to fail in a particular way by stubbing out a method in the test case, I guess I can easily use a specific exception type in assertRaises.
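A minimal illustration of the change being discussed, with a stand-in failure class (the merged tests use xenapi_fake.Failure, as the preview diff shows):

```python
import unittest


class FakeFailure(Exception):
    """Stand-in for xenapi_fake.Failure."""


class SpawnCleanupTest(unittest.TestCase):
    def _test_spawn(self, image_ref, kernel_id, ramdisk_id):
        # In the real test this path is stubbed to fail while
        # downloading the image; here we fail directly.
        raise FakeFailure('image download failed')

    def test_spawn_fail(self):
        # Asserting the specific exception type means an unrelated
        # bug raising, say, TypeError cannot silently pass the test.
        self.assertRaises(FakeFailure, self._test_spawn, 1, 2, 3)
```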
>
> 139 + task = session.
> 140 + session.
>
> Do both line 139 and 140 raise the cls.XenAPI.Failure exception? If not, only
> the calls that can raise the exception in question should be encapsulated in
> the try/except.
Yes, they can both raise a XenAPI Failure. Line 139 might fail, for instance, if we lose connection to the server, whereas line 140 raises a Failure if the async operation started at line 139 fails for some reason.
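This pattern mirrors the destroy_vdi helper that appears later in the preview diff: both the asynchronous call and the wait on its task can raise a Failure, so both belong in a single try block. A simplified sketch with stand-in classes:

```python
class Failure(Exception):
    """Stand-in for XenAPI.Failure."""


class StorageError(Exception):
    """Stand-in for nova's StorageError."""


def destroy_vdi(session, vdi_ref):
    try:
        # May raise Failure if, e.g., the connection to the server
        # is lost while issuing the call.
        task = session.call_xenapi('Async.VDI.destroy', vdi_ref)
        # May raise Failure if the async operation itself fails.
        session.wait_for_task(task)
    except Failure as exc:
        raise StorageError('Unable to destroy VDI %s' % vdi_ref) from exc
```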
>
> I have the same question for lines 181 through 209. Which exact calls are you
> looking to catch an exception from? I don't like "except BaseException" which
> will catch all sort of things (SystemExit, etc.).
>
I used BaseException in order to be able to handle both system and user exceptions. However, as concerns lines 181 to 209, I'm not interested in anything other than XenAPI.Failure. I will change the code to catch only this kind of failure.
> Same for line 216, is it possible to be more explicit than "except:"?
>
Definitely.
> Same for 262 through 320. Error handling is best handled at a more granular
> level.
In this code fragment we can have several kinds of failures (Glance failures, OS failures during partition preparation, XenAPI failures). I used a single try/except block with BaseException to have a single point for handling all those exceptions. I understand, however, that this code will not deal properly with unexpected exceptions that I'm not prepared to handle in the catch block.
I can either replace BaseException with the exact exception types I'm ready to handle, or use smaller try/except blocks only for the bits I'm interested in.
Which approach would you suggest is better (or should I perhaps use smaller try/except blocks catching only specific exceptions)?
> Same for 407 through 452. That is a lot of logic to handle for the error case.
>
The except block in this code fragment deals with exceptions raised in the spawn method and exceptions thrown by other methods. I agree that it is quite a lot of code. I'll try to find a way to reduce the error-handling logic.
>
> Overall I'd like to see more concise error handling using more specific errors
> if/where possible.
Agreed.
Brian Lamar (blamar) wrote:
> I can either replace BaseException with the exact exception types I'm ready to handle, or
> use smaller try/except blocks only for the bits I'm interested in.
> Which approach would you suggest is better (or should I perhaps use smaller try/except
> blocks catching only specific exceptions)?
I'd love to see several smaller try/except blocks encapsulating the relevant calls. If/when this isn't possible, you can catch multiple types of exceptions. If code can be shared between the except blocks, then break it out into functions, etc.
Thanks for the code!
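One possible shape of what Brian is suggesting, sketched with hypothetical helpers (the real code deals with Glance downloads and disk partitioning; nothing here is nova's actual API):

```python
def _cleanup(created_vdis):
    # Shared rollback path, factored out so every small try/except
    # block can reuse it instead of duplicating the logic.
    for vdi in created_vdis:
        vdi.destroy()


def prepare_disks(fetch_image, partition_disk):
    created = []
    try:
        created.append(fetch_image())  # e.g. IOError on backend failure
    except IOError:
        _cleanup(created)
        raise
    try:
        partition_disk(created[0])  # e.g. OSError if parted fails
    except (OSError, IOError):
        _cleanup(created)
        raise
    return created
```

Each block catches only the exceptions its own call can realistically raise, and the shared helper keeps the rollback in one place.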
Salvatore Orlando (salvatore-orlando) wrote:
Some comments to help the reviewers:
During the spawn process, we create several resources before the VM record is created (VDIs and files in dom0's filesystem). The aim of this bug fix is to make sure all these resources are removed if the spawn process fails *before* the VM record is created; if the spawn process fails after that, all the resources will be removed upon instance termination, as the VM record has a reference to all of them.
Resources to be removed might include:
- disk VDI
- kernel/ramdisk files on dom0's filesystem (in /boot/guest)
- temporary VDIs for kernel and ramdisk
The resources to be removed will be stored in a dict. Keys of the dict are Image Types (KERNEL, RAMDISK, VHD, DISK), whereas values are 2-element tuples in which the first element is the VDI uuid and the second the filename on dom0.
We look for the following exceptions:
- XenAPI.Failure, which can occur in several places
- OSError, which can occur when operations like parted or pygrub fail
- IOError, which can occur when _stream_disk fails (e.g.: backend failures, glance failures)
When an exception occurs, the exception-handling blocks in the affected functions add information about the resources created so far to the dict, and then re-raise the error. Resources are then finally removed in _handle_
As an example:
{
ImageType.KERNEL: (None, /boot/guest/
ImageType.
ImageType.DISK: (disk-vdi-uuid,)
}
Represents the dict that would have been built if an exception occurred while copying the temporary ramdisk VDI to dom0. The 'main' exception handling function, VMOps._
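A hedged reconstruction of that dict-building step (field values here, such as the kernel filename, are invented for illustration; the truncated example above elides them):

```python
KERNEL, RAMDISK, VHD, DISK = range(4)  # mirrors the ImageType values


def record_resource(resources, image_type, vdi_uuid=None, filename=None):
    # First tuple element: the VDI uuid (None for resources that only
    # exist as files on dom0); second: the filename under /boot/guest
    # (None for plain VDIs).
    resources[image_type] = (vdi_uuid, filename)


resources = {}
record_resource(resources, KERNEL, filename='/boot/guest/fake-kernel')
record_resource(resources, DISK, vdi_uuid='disk-vdi-uuid')
```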
Matt Dietz (cerberus) wrote:
Thanks for this, Salvatore! We've needed this for a while now.
Could only find a few things to pick on ;-)
PEP8 / Style nit-picks:
96 import os
97 +import sys
98 import pickle
pep8: sys should come after re in this list
351 import subprocess
352 import tempfile
353 import uuid
354 +import sys
Same thing
Sandy Walsh (sandy-walsh) wrote:
Thanks for the clarification of what's going on here. That makes my life much easier. Perhaps consider putting that in a comment somewhere? :)
Great branch by the way! Much needed.
Minor style point: I think there're supposed to be a space after the hash in:
# Comment ...
Easy changes:
433 + if len(vm_
434 + last_arg = vm_create_
if vm_create_
last_arg = vm_create_
same thing in 455-456, 475, 479
465 + if vdi_to_remove != None:
if vdi_to_remove:
Set WIP so I can approve afterwards ...
Sandy Walsh (sandy-walsh) wrote:
easy to tell I'm fried today ... s/there're/there is/
(there're ... wtf?)
Salvatore Orlando (salvatore-orlando) wrote:
Style comments have been addressed...
all but one:
if len(items_
476 + # there is also a file to remove
477 + files_to_
as we are actually checking that there are at least two elements in the list.
Thanks for spending time reviewing this branch, I hope you can find some time to give another round of review!
termie (termie) wrote:
- line 251: please put one arg per line, or all on the same line on the next line, double indented
- please format all docstrings according to guide in HACKING
- line 258: perhaps assign the lambda on the line before and pass it in to this function instead of defining it inline here as it gets a bit messy
- line 526: dictionary contents on the same line as the opening parens; you can remove the comment, and doing foo[:255] is the normal way of saying foo[0:255]
- line 551: you actually need that extra line
Salvatore Orlando (salvatore-orlando) wrote:
Guys, I have addressed the last review's comments, ball in your court now :)
Cheers!
Brian Lamar (blamar):
termie (termie) wrote:
Almost there; it just looks like the couple of docstrings in nova/tests/
Soren Hansen (soren) wrote:
> === modified file 'nova/tests/
> --- nova/tests/
> +++ nova/tests/
> @@ -327,6 +327,19 @@
> self.assertEqua
> self.assertEqua
>
> + def _check_
> + url = FLAGS.xenapi_
> + username = FLAGS.xenapi_
> + password = FLAGS.xenapi_
> + session = xenapi_
> + vdi_refs = session.
> + for vdi_ref in vdi_refs:
> + vdi_rec = session.
> + if 'VBDs' in vdi_rec:
> + self.assertEqua
> + else:
> + self.fail('Found unexpected unbound VDI:%s' % vdi_rec['uuid'])
I'm having trouble following the data model.
If vdi_rec has a 'VBDs' key, it has to be empty. If there's no 'VBDs' key, that
means unexpected unbound VDIs were found? Is that accurate?
Salvatore Orlando (salvatore-orlando) wrote:
@termie:
docstrings are now formatted according to PEP-257
@Soren:
I agree the code is at least cryptic. Basically the test fails if a VDI without a VBD was found (hence 'unbound'). The VBD is the 'Virtual Block Device' that binds a VDI to a VM.
The other check:
if 'VBDs' in vdi_rec:
was in place because we are dealing with the fake xenapi driver, but was not strictly required.
However, I have improved the unit tests to make them clearer: we now extract the whole list of VDIs before and after the spawn process and verify that no new VDI has been added.
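The improved approach can be sketched independently of XenAPI (the real test's _list_vdis and _check_vdis helpers appear in the preview diff; the function name here is a stand-in):

```python
def find_leaked_vdis(list_vdis, run_spawn):
    """Snapshot the VDI list, run a spawn expected to fail, and
    report any VDI that appeared and was not cleaned up."""
    before = set(list_vdis())
    try:
        run_spawn()
    except Exception:
        pass  # the spawn is expected to fail; we only check for leaks
    return [vdi for vdi in list_vdis() if vdi not in before]
```

A test then simply asserts that this returns an empty list after a deliberately failed spawn.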
Rick Harris (rconradharris) wrote:
Really nice work here, Salvatore. Marking as approved with a few optional cleanup suggestions:
> 401 + if not image_type in (ImageType.KERNEL, ImageType.RAMDISK):
`X not in Y` seems a bit more readable:
if image_type not in (ImageType.KERNEL, ImageType.RAMDISK):
> 205 + def destroy_vdi(cls, session, vdi_ref):
Now that we have a destroy_vdi helper, can we modify vmops:_destroy_vdis and
vmops:_
May have to add a `raise_on_error` kwarg since _destroy_
reason) just `continue`s.
> + except (self.XenAPI.
For me, the code which propagates the `rollback-context` to the handler is a
little hard to follow. I think it may be due to trying to re-use the original
exception and attaching the rollback-context as a positional arg.
As a point of discussion, another route might be to create a wrapper exception called
`ExceptionWithR
original exception is triggered is stored within a `rollback_context` dict.
Then, after handling the `ExceptionWithR
re-raise the inner exception, something like:
raise rollback_
The advantage of this approach is that it clearly delineates the original
exception from its rollback-context; we don't have to fool around with
last-positional-arg code, and we can potentially re-use this concept/pattern
in other portions of the XenAPI virt layer.
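Rick's suggestion might look roughly like this (the class and attribute names are guesses based on his truncated description, not code from the branch):

```python
class ExceptionWithRollback(Exception):
    """Wraps the original exception together with a dict describing
    the resources that must be rolled back."""

    def __init__(self, inner, rollback_context):
        super().__init__(str(inner))
        self.inner = inner
        self.rollback_context = rollback_context


def fetch_image():
    try:
        raise IOError('glance download failed')
    except IOError as exc:
        raise ExceptionWithRollback(exc, {'vdi_uuid': 'vdi-123'}) from exc


def spawn(rollback):
    try:
        fetch_image()
    except ExceptionWithRollback as exc:
        rollback(exc.rollback_context)
        # Re-raise the inner exception, cleanly separated from its
        # rollback context.
        raise exc.inner
```

This keeps the rollback data out of the original exception's args, which is exactly the separation Rick is arguing for.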
Matt Dietz (cerberus) wrote:
lgtm. We'll see what Termie has to say
termie (termie) wrote:
These two docstrings should have a one line summary, followed by a newline and then the additional information.
37 + """Simulates an error while downloading an image.
38 + Verifies that VDIs created are properly cleaned up.
39 +
40 + """
For example, would become:
"""Simulates an error while downloading an image.
Verifies that VDIs created are properly cleaned up.
"""
50 + def test_spawn_
51 + """Simulates an error while creating VM record. It
52 + verifies that VDIs created are properly cleaned up.
53 +
54 + """
Dan Prince (dan-prince) wrote:
Hi Salvatore,
I get conflicts when trying to merge your latest code with lp:nova. Can you have a look?
Dan
Salvatore Orlando (salvatore-orlando) wrote:
Sorry for the delay in fixing this code, I was on vacation.
Salvatore Orlando (salvatore-orlando) wrote:
Hi,
I am reproposing this branch for merging.
The code has been adapted to the latest changes in the routines for creating VDIs and fetching images from glance and objectstore.
However, I think the code can be improved: we are now using two distinct sets of values to describe image types, the ImageType enumeration and the vdi_type key returned by the methods that fetch images.
I was thinking of using just one enumeration, thus simplifying future code maintenance.
I submitted a question on this topic: https:/
Thanks,
Salvatore
Sandy Walsh (sandy-walsh) wrote:
+252/259 commented out?
+295/303 ... remove?
+436 necessary?
Needs a trunk merge ... conflicts.
Nit-picky: comments should be proper sentences, capitalized and punctuated, with a space after the #.
Salvatore Orlando (salvatore-orlando) wrote:
Hi Sandy,
I've addressed your comments and resolved the conflicts.
Thanks,
Salvatore
Sandy Walsh (sandy-walsh) wrote:
Thanks ... works for me!
Vish Ishaya (vishvananda) wrote:
looks good to go as soon as termie's docstring concerns are addressed.
Brian Waldon (bcwaldon) wrote:
Setting to WIP until test_xenapi.py docstrings are formatted correctly.
Salvatore Orlando (salvatore-orlando) wrote:
docstrings fixed
Salvatore Orlando (salvatore-orlando) wrote:
Not setting back to needs review yet though, as I get the following exception while running tests:
=======
ERROR: <nose.suite.
-------
Traceback (most recent call last):
File "/usr/lib/
self.setUp()
File "/usr/lib/
self.
File "/usr/lib/
try_
File "/usr/lib/
return func()
File "/home/
cleandb = os.path.
File "/home/
val = gflags.
File "/usr/lib/
raise AttributeError(
AttributeError: sqlite_clean_db
It is the first time I have seen this; I don't think it depends on my code, but I prefer to fix it before proposing the merge again.
Salvatore Orlando (salvatore-orlando) wrote:
Previous issue with tests has now been solved.
Setting back to needs review.
Brian Lamar (blamar):
Brian Waldon (bcwaldon) wrote:
I think termie's concerns have been addressed. Waiting to hear back from Soren.
Soren Hansen (soren) wrote:
My question has been answered, and this MP has more than enough +1's. Just ignore me :)
Brian Waldon (bcwaldon) wrote:
Looks good to me
Preview Diff
1 | === modified file 'nova/tests/test_xenapi.py' |
2 | --- nova/tests/test_xenapi.py 2011-06-27 21:48:03 +0000 |
3 | +++ nova/tests/test_xenapi.py 2011-07-04 16:24:47 +0000 |
4 | @@ -381,6 +381,18 @@ |
5 | self.assertEquals(self.vm['HVM_boot_params'], {}) |
6 | self.assertEquals(self.vm['HVM_boot_policy'], '') |
7 | |
8 | + def _list_vdis(self): |
9 | + url = FLAGS.xenapi_connection_url |
10 | + username = FLAGS.xenapi_connection_username |
11 | + password = FLAGS.xenapi_connection_password |
12 | + session = xenapi_conn.XenAPISession(url, username, password) |
13 | + return session.call_xenapi('VDI.get_all') |
14 | + |
15 | + def _check_vdis(self, start_list, end_list): |
16 | + for vdi_ref in end_list: |
17 | + if not vdi_ref in start_list: |
18 | + self.fail('Found unexpected VDI:%s' % vdi_ref) |
19 | + |
20 | def _test_spawn(self, image_ref, kernel_id, ramdisk_id, |
21 | instance_type_id="3", os_type="linux", |
22 | architecture="x86-64", instance_id=1, |
23 | @@ -422,6 +434,36 @@ |
24 | self._test_spawn, |
25 | 1, 2, 3, "4") # m1.xlarge |
26 | |
27 | + def test_spawn_fail_cleanup_1(self): |
28 | + """Simulates an error while downloading an image. |
29 | + |
30 | + Verifies that VDIs created are properly cleaned up. |
31 | + |
32 | + """ |
33 | + vdi_recs_start = self._list_vdis() |
34 | + FLAGS.xenapi_image_service = 'glance' |
35 | + stubs.stubout_fetch_image_glance_disk(self.stubs) |
36 | + self.assertRaises(xenapi_fake.Failure, |
37 | + self._test_spawn, 1, 2, 3) |
38 | + # No additional VDI should be found. |
39 | + vdi_recs_end = self._list_vdis() |
40 | + self._check_vdis(vdi_recs_start, vdi_recs_end) |
41 | + |
42 | + def test_spawn_fail_cleanup_2(self): |
43 | + """Simulates an error while creating VM record. |
44 | + |
45 | + It verifies that VDIs created are properly cleaned up. |
46 | + |
47 | + """ |
48 | + vdi_recs_start = self._list_vdis() |
49 | + FLAGS.xenapi_image_service = 'glance' |
50 | + stubs.stubout_create_vm(self.stubs) |
51 | + self.assertRaises(xenapi_fake.Failure, |
52 | + self._test_spawn, 1, 2, 3) |
53 | + # No additional VDI should be found. |
54 | + vdi_recs_end = self._list_vdis() |
55 | + self._check_vdis(vdi_recs_start, vdi_recs_end) |
56 | + |
57 | def test_spawn_raw_objectstore(self): |
58 | FLAGS.xenapi_image_service = 'objectstore' |
59 | self._test_spawn(1, None, None) |
60 | |
61 | === modified file 'nova/tests/xenapi/stubs.py' |
62 | --- nova/tests/xenapi/stubs.py 2011-06-06 15:00:51 +0000 |
63 | +++ nova/tests/xenapi/stubs.py 2011-07-04 16:24:47 +0000 |
64 | @@ -98,6 +98,42 @@ |
65 | stubs.Set(vm_utils, '_is_vdi_pv', f) |
66 | |
67 | |
68 | +def stubout_determine_is_pv_objectstore(stubs): |
69 | + """Assumes VMs never have PV kernels""" |
70 | + |
71 | + @classmethod |
72 | + def f(cls, *args): |
73 | + return False |
74 | + stubs.Set(vm_utils.VMHelper, '_determine_is_pv_objectstore', f) |
75 | + |
76 | + |
77 | +def stubout_lookup_image(stubs): |
78 | + """Simulates a failure in lookup image.""" |
79 | + def f(_1, _2, _3, _4): |
80 | + raise Exception("Test Exception raised by fake lookup_image") |
81 | + stubs.Set(vm_utils, 'lookup_image', f) |
82 | + |
83 | + |
84 | +def stubout_fetch_image_glance_disk(stubs): |
85 | + """Simulates a failure in fetch image_glance_disk.""" |
86 | + |
87 | + @classmethod |
88 | + def f(cls, *args): |
89 | + raise fake.Failure("Test Exception raised by " + |
90 | + "fake fetch_image_glance_disk") |
91 | + stubs.Set(vm_utils.VMHelper, '_fetch_image_glance_disk', f) |
92 | + |
93 | + |
94 | +def stubout_create_vm(stubs): |
95 | + """Simulates a failure in create_vm.""" |
96 | + |
97 | + @classmethod |
98 | + def f(cls, *args): |
99 | + raise fake.Failure("Test Exception raised by " + |
100 | + "fake create_vm") |
101 | + stubs.Set(vm_utils.VMHelper, 'create_vm', f) |
102 | + |
103 | + |
104 | def stubout_loopingcall_start(stubs): |
105 | def fake_start(self, interval, now=True): |
106 | self.f(*self.args, **self.kw) |
107 | @@ -120,6 +156,9 @@ |
108 | super(FakeSessionForVMTests, self).__init__(uri) |
109 | |
110 | def host_call_plugin(self, _1, _2, plugin, method, _5): |
111 | + # If the call is for 'copy_kernel_vdi' return None. |
112 | + if method == 'copy_kernel_vdi': |
113 | + return |
114 | sr_ref = fake.get_all('SR')[0] |
115 | vdi_ref = fake.create_vdi('', False, sr_ref, False) |
116 | vdi_rec = fake.get_record('VDI', vdi_ref) |
117 | |
118 | === modified file 'nova/virt/xenapi/vm_utils.py' |
119 | --- nova/virt/xenapi/vm_utils.py 2011-06-24 12:01:51 +0000 |
120 | +++ nova/virt/xenapi/vm_utils.py 2011-07-04 16:24:47 +0000 |
121 | @@ -23,6 +23,7 @@ |
122 | import os |
123 | import pickle |
124 | import re |
125 | +import sys |
126 | import tempfile |
127 | import time |
128 | import urllib |
129 | @@ -71,17 +72,51 @@ |
130 | class ImageType: |
131 | """ |
132 | Enumeration class for distinguishing different image types |
133 | - 0 - kernel/ramdisk image (goes on dom0's filesystem) |
134 | - 1 - disk image (local SR, partitioned by objectstore plugin) |
135 | - 2 - raw disk image (local SR, NOT partitioned by plugin) |
136 | - 3 - vhd disk image (local SR, NOT inspected by XS, PV assumed for |
137 | + 0 - kernel image (goes on dom0's filesystem) |
138 | + 1 - ramdisk image (goes on dom0's filesystem) |
139 | + 2 - disk image (local SR, partitioned by objectstore plugin) |
140 | + 3 - raw disk image (local SR, NOT partitioned by plugin) |
141 | + 4 - vhd disk image (local SR, NOT inspected by XS, PV assumed for |
142 | linux, HVM assumed for Windows) |
143 | """ |
144 | |
145 | - KERNEL_RAMDISK = 0 |
146 | - DISK = 1 |
147 | - DISK_RAW = 2 |
148 | - DISK_VHD = 3 |
149 | + KERNEL = 0 |
150 | + RAMDISK = 1 |
151 | + DISK = 2 |
152 | + DISK_RAW = 3 |
153 | + DISK_VHD = 4 |
154 | + |
155 | + KERNEL_STR = "kernel" |
156 | + RAMDISK_STR = "ramdisk" |
157 | + DISK_STR = "os" |
158 | + DISK_RAW_STR = "os_raw" |
159 | + DISK_VHD_STR = "vhd" |
160 | + |
161 | + @classmethod |
162 | + def to_string(cls, image_type): |
163 | + if image_type == ImageType.KERNEL: |
164 | + return ImageType.KERNEL_STR |
165 | + elif image_type == ImageType.RAMDISK: |
166 | + return ImageType.RAMDISK_STR |
167 | + elif image_type == ImageType.DISK: |
168 | + return ImageType.DISK_STR |
169 | + elif image_type == ImageType.DISK_RAW: |
170 | + return ImageType.DISK_RAW_STR |
171 | + elif image_type == ImageType.DISK_VHD: |
172 | + return ImageType.VHD_STR |
173 | + |
174 | + @classmethod |
175 | + def from_string(cls, image_type_str): |
176 | + if image_type_str == ImageType.KERNEL_STR: |
177 | + return ImageType.KERNEL |
178 | + elif image_type == ImageType.RAMDISK_STR: |
179 | + return ImageType.RAMDISK |
180 | + elif image_type == ImageType.DISK_STR: |
181 | + return ImageType.DISK |
182 | + elif image_type == ImageType.DISK_RAW_STR: |
183 | + return ImageType.DISK_RAW |
184 | + elif image_type == ImageType.DISK_VHD_STR: |
185 | + return ImageType.VHD |
186 | |
187 | |
188 | class VMHelper(HelperBase): |
189 | @@ -145,7 +180,6 @@ |
190 | 'VCPUs_max': vcpus, |
191 | 'VCPUs_params': {}, |
192 | 'xenstore_data': {}} |
193 | - |
194 | # Complete VM configuration record according to the image type |
195 | # non-raw/raw with PV kernel/raw in HVM mode |
196 | if use_pv_kernel: |
197 | @@ -240,6 +274,15 @@ |
198 | raise StorageError(_('Unable to destroy VBD %s') % vbd_ref) |
199 | |
200 | @classmethod |
201 | + def destroy_vdi(cls, session, vdi_ref): |
202 | + try: |
203 | + task = session.call_xenapi('Async.VDI.destroy', vdi_ref) |
204 | + session.wait_for_task(task) |
205 | + except cls.XenAPI.Failure, exc: |
206 | + LOG.exception(exc) |
207 | + raise StorageError(_('Unable to destroy VDI %s') % vdi_ref) |
208 | + |
209 | + @classmethod |
210 | def create_vif(cls, session, vm_ref, network_ref, mac_address, |
211 | dev, rxtx_cap=0): |
212 | """Create a VIF record. Returns a Deferred that gives the new |
213 | @@ -394,12 +437,12 @@ |
214 | """ |
215 | LOG.debug(_("Asking xapi to fetch vhd image %(image)s") |
216 | % locals()) |
217 | - |
218 | sr_ref = safe_find_sr(session) |
219 | |
220 | - # NOTE(sirp): The Glance plugin runs under Python 2.4 which does not |
221 | - # have the `uuid` module. To work around this, we generate the uuids |
222 | - # here (under Python 2.6+) and pass them as arguments |
223 | + # NOTE(sirp): The Glance plugin runs under Python 2.4 |
224 | + # which does not have the `uuid` module. To work around this, |
225 | + # we generate the uuids here (under Python 2.6+) and |
226 | + # pass them as arguments |
227 | uuid_stack = [str(uuid.uuid4()) for i in xrange(2)] |
228 | |
229 | glance_host, glance_port = \ |
230 | @@ -449,18 +492,20 @@ |
231 | # FIXME(sirp): Since the Glance plugin seems to be required for the |
232 | # VHD disk, it may be worth using the plugin for both VHD and RAW and |
233 | # DISK restores |
234 | + LOG.debug(_("Fetching image %(image)s") % locals()) |
235 | + LOG.debug(_("Image Type: %s"), ImageType.to_string(image_type)) |
236 | sr_ref = safe_find_sr(session) |
237 | |
238 | glance_client, image_id = nova.image.get_glance_client(image) |
239 | meta, image_file = glance_client.get_image(image_id) |
240 | virtual_size = int(meta['size']) |
241 | vdi_size = virtual_size |
242 | - LOG.debug(_("Size for image %(image)s:%(virtual_size)d") % locals()) |
243 | - |
244 | + LOG.debug(_("Size for image %(image)s:" + |
245 | + "%(virtual_size)d") % locals()) |
246 | if image_type == ImageType.DISK: |
247 | # Make room for MBR. |
248 | vdi_size += MBR_SIZE_BYTES |
249 | - elif image_type == ImageType.KERNEL_RAMDISK and \ |
250 | + elif image_type in (ImageType.KERNEL, ImageType.RAMDISK) and \ |
251 | vdi_size > FLAGS.max_kernel_ramdisk_size: |
252 | max_size = FLAGS.max_kernel_ramdisk_size |
253 | raise exception.Error( |
254 | @@ -469,29 +514,45 @@ |
255 | |
256 | name_label = get_name_label_for_image(image) |
257 | vdi_ref = cls.create_vdi(session, sr_ref, name_label, vdi_size, False) |
258 | - |
259 | - with_vdi_attached_here(session, vdi_ref, False, |
260 | - lambda dev: |
261 | - _stream_disk(dev, image_type, |
262 | - virtual_size, image_file)) |
263 | - if image_type == ImageType.KERNEL_RAMDISK: |
264 | - #we need to invoke a plugin for copying VDI's |
265 | - #content into proper path |
266 | - LOG.debug(_("Copying VDI %s to /boot/guest on dom0"), vdi_ref) |
267 | - fn = "copy_kernel_vdi" |
268 | - args = {} |
269 | - args['vdi-ref'] = vdi_ref |
270 | - #let the plugin copy the correct number of bytes |
271 | - args['image-size'] = str(vdi_size) |
272 | - task = session.async_call_plugin('glance', fn, args) |
273 | - filename = session.wait_for_task(task, instance_id) |
274 | - #remove the VDI as it is not needed anymore |
275 | - session.get_xenapi().VDI.destroy(vdi_ref) |
276 | - LOG.debug(_("Kernel/Ramdisk VDI %s destroyed"), vdi_ref) |
277 | - return filename |
278 | - else: |
279 | + # From this point we have a VDI on Xen host; |
280 | + # If anything goes wrong, we need to remember its uuid. |
281 | + try: |
282 | + filename = None |
283 | vdi_uuid = session.get_xenapi().VDI.get_uuid(vdi_ref) |
284 | - return [dict(vdi_type='os', vdi_uuid=vdi_uuid)] |
285 | + with_vdi_attached_here(session, vdi_ref, False, |
286 | + lambda dev: |
287 | + _stream_disk(dev, image_type, |
288 | + virtual_size, image_file)) |
289 | + if image_type in (ImageType.KERNEL, ImageType.RAMDISK): |
290 | + # We need to invoke a plugin for copying the |
291 | + # content of the VDI into the proper path. |
292 | + LOG.debug(_("Copying VDI %s to /boot/guest on dom0"), vdi_ref) |
293 | + fn = "copy_kernel_vdi" |
294 | + args = {} |
295 | + args['vdi-ref'] = vdi_ref |
296 | + # Let the plugin copy the correct number of bytes. |
297 | + args['image-size'] = str(vdi_size) |
298 | + task = session.async_call_plugin('glance', fn, args) |
299 | + filename = session.wait_for_task(task, instance_id) |
300 | + # Remove the VDI as it is not needed anymore. |
301 | + session.get_xenapi().VDI.destroy(vdi_ref) |
302 | + LOG.debug(_("Kernel/Ramdisk VDI %s destroyed"), vdi_ref) |
303 | + return [dict(vdi_type=ImageType.to_string(image_type), |
304 | + vdi_uuid=None, |
305 | + file=filename)] |
306 | + else: |
307 | + return [dict(vdi_type=ImageType.to_string(image_type), |
308 | + vdi_uuid=vdi_uuid, |
309 | + file=None)] |
310 | + except (cls.XenAPI.Failure, IOError, OSError) as e: |
311 | + # We look for XenAPI and OS failures. |
312 | + LOG.exception(_("instance %s: Failed to fetch glance image"), |
313 | + instance_id, exc_info=sys.exc_info()) |
314 | + e.args = e.args + ([dict(vdi_type=ImageType. |
315 | + to_string(image_type), |
316 | + vdi_uuid=vdi_uuid, |
317 | + file=filename)],) |
318 | + raise e |
319 | |
320 | @classmethod |
321 | def determine_disk_image_type(cls, instance): |
322 | @@ -506,7 +567,8 @@ |
323 | whether a kernel_id is specified. |
324 | """ |
325 | def log_disk_format(image_type): |
326 | - pretty_format = {ImageType.KERNEL_RAMDISK: 'KERNEL_RAMDISK', |
327 | + pretty_format = {ImageType.KERNEL: 'KERNEL', |
328 | + ImageType.RAMDISK: 'RAMDISK', |
329 | ImageType.DISK: 'DISK', |
330 | ImageType.DISK_RAW: 'DISK_RAW', |
331 | ImageType.DISK_VHD: 'DISK_VHD'} |
332 | @@ -519,8 +581,8 @@ |
333 | def determine_from_glance(): |
334 | glance_disk_format2nova_type = { |
335 | 'ami': ImageType.DISK, |
336 | - 'aki': ImageType.KERNEL_RAMDISK, |
337 | - 'ari': ImageType.KERNEL_RAMDISK, |
338 | + 'aki': ImageType.KERNEL, |
339 | + 'ari': ImageType.RAMDISK, |
340 | 'raw': ImageType.DISK_RAW, |
341 | 'vhd': ImageType.DISK_VHD} |
342 | image_ref = instance.image_ref |
343 | @@ -553,7 +615,7 @@ |
344 | image_type): |
345 | """Fetch image from glance based on image type. |
346 | |
347 | - Returns: A single filename if image_type is KERNEL_RAMDISK |
348 | + Returns: A single filename if image_type is KERNEL or RAMDISK |
349 | A list of dictionaries that describe VDIs, otherwise |
350 | """ |
351 | if image_type == ImageType.DISK_VHD: |
352 | @@ -568,13 +630,13 @@ |
353 | secret, image_type): |
354 | """Fetch an image from objectstore. |
355 | |
356 | - Returns: A single filename if image_type is KERNEL_RAMDISK |
357 | + Returns: A single filename if image_type is KERNEL or RAMDISK |
358 | A list of dictionaries that describe VDIs, otherwise |
359 | """ |
360 | url = "http://%s:%s/_images/%s/image" % (FLAGS.s3_host, FLAGS.s3_port, |
361 | image) |
362 | LOG.debug(_("Asking xapi to fetch %(url)s as %(access)s") % locals()) |
363 | - if image_type == ImageType.KERNEL_RAMDISK: |
364 | + if image_type in (ImageType.KERNEL, ImageType.RAMDISK): |
365 | fn = 'get_kernel' |
366 | else: |
367 | fn = 'get_vdi' |
368 | @@ -584,15 +646,20 @@ |
369 | args['password'] = secret |
370 | args['add_partition'] = 'false' |
371 | args['raw'] = 'false' |
372 | - if image_type != ImageType.KERNEL_RAMDISK: |
373 | + if not image_type in (ImageType.KERNEL, ImageType.RAMDISK): |
374 | args['add_partition'] = 'true' |
375 | if image_type == ImageType.DISK_RAW: |
376 | args['raw'] = 'true' |
377 | task = session.async_call_plugin('objectstore', fn, args) |
378 | - uuid_or_fn = session.wait_for_task(task, instance_id) |
379 | - if image_type != ImageType.KERNEL_RAMDISK: |
380 | - return [dict(vdi_type='os', vdi_uuid=uuid_or_fn)] |
381 | - return uuid_or_fn |
382 | + vdi_uuid = None |
383 | + filename = None |
384 | + if image_type in (ImageType.KERNEL, ImageType.RAMDISK): |
385 | + filename = session.wait_for_task(task, instance_id) |
386 | + else: |
387 | + vdi_uuid = session.wait_for_task(task, instance_id) |
388 | + return [dict(vdi_type=ImageType.to_string(image_type), |
389 | + vdi_uuid=vdi_uuid, |
390 | + file=filename)] |
391 | |
392 | @classmethod |
393 | def determine_is_pv(cls, session, instance_id, vdi_ref, disk_image_type, |
394 | |
395 | === modified file 'nova/virt/xenapi/vmops.py' |
396 | --- nova/virt/xenapi/vmops.py 2011-06-30 19:20:59 +0000 |
397 | +++ nova/virt/xenapi/vmops.py 2011-07-04 16:24:47 +0000 |
398 | @@ -25,6 +25,7 @@ |
399 | import os |
400 | import pickle |
401 | import subprocess |
402 | +import sys |
403 | import time |
404 | import uuid |
405 | |
406 | @@ -137,9 +138,18 @@ |
407 | return vdis |
408 | |
409 | def spawn(self, instance, network_info): |
410 | - vdis = self._create_disks(instance) |
411 | - vm_ref = self._create_vm(instance, vdis, network_info) |
412 | - self._spawn(instance, vm_ref) |
413 | + vdis = None |
414 | + try: |
415 | + vdis = self._create_disks(instance) |
416 | + vm_ref = self._create_vm(instance, vdis, network_info) |
417 | + self._spawn(instance, vm_ref) |
418 | + except (self.XenAPI.Failure, OSError, IOError) as spawn_error: |
419 | + LOG.exception(_("instance %s: Failed to spawn"), |
420 | + instance.id, exc_info=sys.exc_info()) |
421 | + LOG.debug(_('Instance %s failed to spawn - performing clean-up'), |
422 | + instance.id) |
423 | + self._handle_spawn_error(vdis, spawn_error) |
424 | + raise spawn_error |
425 | |
426 | def spawn_rescue(self, instance): |
427 | """Spawn a rescue instance.""" |
428 | @@ -165,42 +175,64 @@ |
429 | project = AuthManager().get_project(instance.project_id) |
430 | |
431 | disk_image_type = VMHelper.determine_disk_image_type(instance) |
432 | - |
433 | kernel = None |
434 | - if instance.kernel_id: |
435 | - kernel = VMHelper.fetch_image(self._session, instance.id, |
436 | - instance.kernel_id, user, project, |
437 | - ImageType.KERNEL_RAMDISK) |
438 | - |
439 | ramdisk = None |
440 | - if instance.ramdisk_id: |
441 | - ramdisk = VMHelper.fetch_image(self._session, instance.id, |
442 | - instance.ramdisk_id, user, project, |
443 | - ImageType.KERNEL_RAMDISK) |
444 | - |
445 | - # Create the VM ref and attach the first disk |
446 | - first_vdi_ref = self._session.call_xenapi('VDI.get_by_uuid', |
447 | - vdis[0]['vdi_uuid']) |
448 | - |
449 | - vm_mode = instance.vm_mode and instance.vm_mode.lower() |
450 | - if vm_mode == 'pv': |
451 | - use_pv_kernel = True |
452 | - elif vm_mode in ('hv', 'hvm'): |
453 | - use_pv_kernel = False |
454 | - vm_mode = 'hvm' # Normalize |
455 | - else: |
456 | - use_pv_kernel = VMHelper.determine_is_pv(self._session, |
457 | - instance.id, first_vdi_ref, disk_image_type, |
458 | - instance.os_type) |
459 | - vm_mode = use_pv_kernel and 'pv' or 'hvm' |
460 | - |
461 | - if instance.vm_mode != vm_mode: |
462 | - # Update database with normalized (or determined) value |
463 | - db.instance_update(context.get_admin_context(), |
464 | - instance['id'], {'vm_mode': vm_mode}) |
465 | - |
466 | - vm_ref = VMHelper.create_vm(self._session, instance, |
467 | - kernel, ramdisk, use_pv_kernel) |
468 | + try: |
469 | + if instance.kernel_id: |
470 | + kernel = VMHelper.fetch_image(self._session, instance.id, |
471 | + instance.kernel_id, user, project, |
472 | + ImageType.KERNEL)[0] |
473 | + if instance.ramdisk_id: |
474 | + ramdisk = VMHelper.fetch_image(self._session, instance.id, |
475 | + instance.ramdisk_id, user, project, |
476 | + ImageType.RAMDISK)[0] |
477 | + # Create the VM ref and attach the first disk |
478 | + first_vdi_ref = self._session.call_xenapi('VDI.get_by_uuid', |
479 | + vdis[0]['vdi_uuid']) |
480 | + |
481 | + vm_mode = instance.vm_mode and instance.vm_mode.lower() |
482 | + if vm_mode == 'pv': |
483 | + use_pv_kernel = True |
484 | + elif vm_mode in ('hv', 'hvm'): |
485 | + use_pv_kernel = False |
486 | + vm_mode = 'hvm' # Normalize |
487 | + else: |
488 | + use_pv_kernel = VMHelper.determine_is_pv(self._session, |
489 | + instance.id, first_vdi_ref, disk_image_type, |
490 | + instance.os_type) |
491 | + vm_mode = use_pv_kernel and 'pv' or 'hvm' |
492 | + |
493 | + if instance.vm_mode != vm_mode: |
494 | + # Update database with normalized (or determined) value |
495 | + db.instance_update(context.get_admin_context(), |
496 | + instance['id'], {'vm_mode': vm_mode}) |
497 | + vm_ref = VMHelper.create_vm(self._session, instance, |
498 | + kernel and kernel.get('file', None) or None, |
499 | + ramdisk and ramdisk.get('file', None) or None, |
500 | + use_pv_kernel) |
501 | + except (self.XenAPI.Failure, OSError, IOError) as vm_create_error: |
502 | + # Collect VDI/file resources to clean up; |
503 | + # These resources will be removed by _handle_spawn_error. |
504 | + LOG.exception(_("instance %s: Failed to spawn - " + |
505 | + "Unable to create VM"), |
506 | + instance.id, exc_info=sys.exc_info()) |
507 | + last_arg = None |
508 | + resources = [] |
509 | + |
510 | + if vm_create_error.args: |
511 | + last_arg = vm_create_error.args[-1] |
512 | + if isinstance(last_arg, list): |
513 | + resources = last_arg |
514 | + else: |
515 | + vm_create_error.args = vm_create_error.args + (resources,) |
516 | + |
517 | + if kernel: |
518 | + resources.append(kernel) |
519 | + if ramdisk: |
520 | + resources.append(ramdisk) |
521 | + |
522 | + raise vm_create_error |
523 | + |
524 | VMHelper.create_vbd(session=self._session, vm_ref=vm_ref, |
525 | vdi_ref=first_vdi_ref, userdevice=0, bootable=True) |
526 | |
527 | @@ -321,6 +353,47 @@ |
528 | |
529 | return timer.start(interval=0.5, now=True) |
530 | |
531 | + def _handle_spawn_error(self, vdis, spawn_error): |
532 | + # Extract resource list from spawn_error. |
533 | + resources = [] |
534 | + if spawn_error.args: |
535 | + last_arg = spawn_error.args[-1] |
536 | + resources = last_arg |
537 | + if vdis: |
538 | + for vdi in vdis: |
539 | + resources.append(dict(vdi_type=vdi['vdi_type'], |
540 | + vdi_uuid=vdi['vdi_uuid'], |
541 | + file=None)) |
542 | + |
543 | + LOG.debug(_("Resources to remove:%s"), resources) |
544 | + kernel_file = None |
545 | + ramdisk_file = None |
546 | + |
547 | + for item in resources: |
548 | + vdi_type = item['vdi_type'] |
549 | + vdi_to_remove = item['vdi_uuid'] |
550 | + if vdi_to_remove: |
551 | + try: |
552 | + vdi_ref = self._session.call_xenapi('VDI.get_by_uuid', |
553 | + vdi_to_remove) |
554 | + LOG.debug(_('Removing VDI %(vdi_ref)s' + |
555 | + '(uuid:%(vdi_to_remove)s)'), locals()) |
556 | + VMHelper.destroy_vdi(self._session, vdi_ref) |
557 | + except self.XenAPI.Failure: |
558 | + # Vdi has already been deleted |
559 | + LOG.debug(_("Skipping VDI destroy for %s"), vdi_to_remove) |
560 | + if item['file']: |
561 | + # There is also a file to remove. |
562 | + if vdi_type == ImageType.KERNEL_STR: |
563 | + kernel_file = item['file'] |
564 | + elif vdi_type == ImageType.RAMDISK_STR: |
565 | + ramdisk_file = item['file'] |
566 | + |
567 | + if kernel_file or ramdisk_file: |
568 | + LOG.debug(_("Removing kernel/ramdisk files from dom0")) |
569 | + self._destroy_kernel_ramdisk_plugin_call(kernel_file, |
570 | + ramdisk_file) |
571 | + |
572 | def _get_vm_opaque_ref(self, instance_or_vm): |
573 | """ |
574 | Refactored out the common code of many methods that receive either |
575 | @@ -698,6 +771,16 @@ |
576 | VMHelper.unplug_vbd(self._session, vbd_ref) |
577 | VMHelper.destroy_vbd(self._session, vbd_ref) |
578 | |
579 | + def _destroy_kernel_ramdisk_plugin_call(self, kernel, ramdisk): |
580 | + args = {} |
581 | + if kernel: |
582 | + args['kernel-file'] = kernel |
583 | + if ramdisk: |
584 | + args['ramdisk-file'] = ramdisk |
585 | + task = self._session.async_call_plugin( |
586 | + 'glance', 'remove_kernel_ramdisk', args) |
587 | + self._session.wait_for_task(task) |
588 | + |
589 | def _destroy_kernel_ramdisk(self, instance, vm_ref): |
590 | """Three situations can occur: |
591 | |
592 | @@ -727,13 +810,7 @@ |
593 | (kernel, ramdisk) = VMHelper.lookup_kernel_ramdisk(self._session, |
594 | vm_ref) |
595 | |
596 | - LOG.debug(_("Removing kernel/ramdisk files")) |
597 | - |
598 | - args = {'kernel-file': kernel, 'ramdisk-file': ramdisk} |
599 | - task = self._session.async_call_plugin( |
600 | - 'glance', 'remove_kernel_ramdisk', args) |
601 | - self._session.wait_for_task(task, instance.id) |
602 | - |
603 | + self._destroy_kernel_ramdisk_plugin_call(kernel, ramdisk) |
604 | LOG.debug(_("kernel/ramdisk files removed")) |
605 | |
606 | def _destroy_vm(self, instance, vm_ref): |
607 | |
608 | === modified file 'plugins/xenserver/xenapi/etc/xapi.d/plugins/glance' |
609 | --- plugins/xenserver/xenapi/etc/xapi.d/plugins/glance 2011-06-03 18:14:28 +0000 |
610 | +++ plugins/xenserver/xenapi/etc/xapi.d/plugins/glance 2011-07-04 16:24:47 +0000 |
611 | @@ -412,8 +412,8 @@ |
612 | |
613 | def remove_kernel_ramdisk(session, args): |
614 | """Removes kernel and/or ramdisk from dom0's file system""" |
615 | - kernel_file = exists(args, 'kernel-file') |
616 | - ramdisk_file = exists(args, 'ramdisk-file') |
617 | + kernel_file = optional(args, 'kernel-file') |
618 | + ramdisk_file = optional(args, 'ramdisk-file') |
619 | if kernel_file: |
620 | os.remove(kernel_file) |
621 | if ramdisk_file: |
I'm not familiar with a lot of the terminology here but hopefully I can be of some assistance.
33: self.assertRaises(Exception,
34: + self._test_spawn, 1, 2, 3)
Can this look for a particular exception instead of `Exception`?
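To illustrate the suggestion, a test pinned to a specific exception type still passes when that error is raised, but fails if an unrelated error type slips through. This is only a sketch: `SpawnError` and the `_test_spawn` signature are hypothetical stand-ins, not names from this branch.

```python
import unittest


class SpawnError(Exception):
    """Hypothetical stand-in for a specific spawn failure."""


class SpawnCleanupTestCase(unittest.TestCase):
    def _test_spawn(self, image_id, kernel_id, ramdisk_id):
        # Simulate a spawn that fails with a well-defined exception type.
        raise SpawnError("VM failed to spawn")

    def test_spawn_raises_specific_error(self):
        # Asserting on SpawnError rather than bare Exception means the
        # test also fails if some unrelated error type is raised instead.
        self.assertRaises(SpawnError, self._test_spawn, 1, 2, 3)


result = unittest.TextTestRunner(verbosity=0).run(
    unittest.defaultTestLoader.loadTestsFromTestCase(SpawnCleanupTestCase))
```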
139 + task = session.call_xenapi('Async.VDI.destroy', vdi_ref)
140 + session.wait_for_task(task)
Do both lines 139 and 140 raise the cls.XenAPI.Failure exception? If not, only the calls that can raise the exception in question should be encapsulated in the try/except.
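A minimal sketch of the narrower scoping being suggested: only the calls that can actually raise the exception sit inside the try block. The `call_xenapi`/`wait_for_task` names follow the diff, but `Failure` and `FakeSession` here are toy stand-ins, not the real XenAPI classes.

```python
class Failure(Exception):
    """Toy stand-in for the cls.XenAPI.Failure exception class."""


class FakeSession:
    """Toy session whose async call fails, as a real xapi call might."""

    def call_xenapi(self, method, ref):
        raise Failure("VDI %s is still attached" % ref)

    def wait_for_task(self, task):
        return "success"


session = FakeSession()
try:
    # Only the calls that can raise Failure are inside the try block;
    # unrelated code stays outside so its errors are not masked.
    task = session.call_xenapi('Async.VDI.destroy', 'vdi-ref-1')
    session.wait_for_task(task)
except Failure as exc:
    outcome = "cleanup skipped: %s" % exc
```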
I have the same question for lines 181 through 209. Which exact calls are you looking to catch an exception from? I don't like "except BaseException", which will catch all sorts of things (SystemExit, etc.).
Same for line 216: is it possible to be more explicit than a bare "except:"?
Same for lines 262 through 320. Error handling is best handled at a more granular level.
Same for 407 through 452. That is a lot of logic to handle for the error case.
Overall I'd like to see more concise error handling using more specific errors if/where possible.
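For the record, the difference matters because `except BaseException:` (and a bare `except:`) swallows control-flow exceptions such as `SystemExit` and `KeyboardInterrupt`, whereas naming the anticipated errors lets everything else propagate. A small demonstration, not code from this branch:

```python
def risky():
    # Stand-in for code that raises something unanticipated.
    raise SystemExit(1)


# Broad handler: SystemExit is caught too, silently preventing process
# shutdown -- usually not what the author intended.
caught_by_broad = False
try:
    risky()
except BaseException:
    caught_by_broad = True

# Narrow handler: only the anticipated errors are caught; SystemExit
# escapes the inner block (re-caught here purely to observe it).
escaped_narrow = False
try:
    try:
        risky()
    except (OSError, IOError):
        pass
except SystemExit:
    escaped_narrow = True
```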