Merge lp:~tr3buchet/nova/lock into lp:~hudson-openstack/nova/trunk

Proposed by Trey Morris
Status: Merged
Approved by: Eric Day
Approved revision: 529
Merged at revision: 524
Proposed branch: lp:~tr3buchet/nova/lock
Merge into: lp:~hudson-openstack/nova/trunk
Diff against target: 330 lines (+184/-2)
5 files modified
nova/api/openstack/servers.py (+44/-0)
nova/compute/api.py (+33/-0)
nova/compute/manager.py (+86/-2)
nova/db/sqlalchemy/models.py (+2/-0)
nova/tests/test_compute.py (+19/-0)
To merge this branch: bzr merge lp:~tr3buchet/nova/lock
Reviewer Review Type Date Requested Status
Eric Day (community) Approve
Devin Carlen (community) Approve
Matt Dietz (community) Needs Fixing
Review via email: mp+44874@code.launchpad.net

Description of the change

This branch implements lock functionality. The lock is stored in the compute worker's database. Decorators have been added to the OpenStack API actions that alter instances in any way.
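
For context, the guard is a plain function decorator: it reads the instance's locked flag and the caller's admin bit before delegating (in the merged revision it ends up on the compute manager; see the diff below). A minimal self-contained sketch of the behaviour, where FakeContext and FakeManager are illustrative stand-ins, not names from the branch:

import functools
import logging


def checks_instance_lock(function):
    """Skip the decorated action when the instance is locked,
    unless the calling context is admin (mirrors the diff below)."""
    @functools.wraps(function)
    def decorated_function(self, context, instance_id, *args, **kwargs):
        locked = self.get_lock(context, instance_id)
        if context.is_admin or not locked:
            return function(self, context, instance_id, *args, **kwargs)
        logging.error("Instance |%s| is locked, cannot execute |%s|",
                      instance_id, function)
        return False
    return decorated_function


class FakeContext(object):
    def __init__(self, is_admin):
        self.is_admin = is_admin


class FakeManager(object):
    def __init__(self):
        self._locked = {}

    def get_lock(self, context, instance_id):
        return self._locked.get(instance_id, False)

    def lock_instance(self, context, instance_id):
        self._locked[instance_id] = True

    @checks_instance_lock
    def reboot_instance(self, context, instance_id):
        return "rebooted %s" % instance_id


manager = FakeManager()
manager.lock_instance(FakeContext(is_admin=True), 1)
assert manager.reboot_instance(FakeContext(is_admin=False), 1) is False
assert manager.reboot_instance(FakeContext(is_admin=True), 1) == "rebooted 1"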

Revision history for this message
Trey Morris (tr3buchet) wrote :

Also, not shown: in another window I was modifying the users and instances tables to set the is_admin and locked attributes commented about above.

lp:~tr3buchet/nova/lock updated
507. By Trey Morris

removed some code i didn't end up using

Revision history for this message
Trey Morris (tr3buchet) wrote :

Also, I apologize for Launchpad mangling my commit message.

Revision history for this message
Rick Harris (rconradharris) wrote :

Overall lgtm. A few small comments:

> + if(admin or not locked):

No parens needed :-)

> + @checks_lock

I'm wondering if we should give this particular lock a memorable/unambiguous name: there will be various kinds of locks, mutexes, etc. throughout the system, so down the road, check_lock might not be as clear as it is today.

Maybe, @check_instance_lock, or @check_state_lock. Just a thought.

Revision history for this message
Trey Morris (tr3buchet) wrote :

I like that.
I'll fix up both.

lp:~tr3buchet/nova/lock updated
508. By Trey Morris

removed () from if (can't believe I did that) and renamed checks_lock decorator

Revision history for this message
Trey Morris (tr3buchet) wrote :

fixed.

Also moved my testing notes from above into this comment so they aren't in the commit message.

Notes: unittests passed locally. pep8 returned no errors locally. My integration test with output is as follows (remember that the OpenStack API maps the "pause" state to "error"):

(trey|3nova)~/openstack/n3vascript> n3va.sh teardown
(trey|3nova)~/openstack/n3vascript> n3va.sh run

(screen window)
(trey|3nova)~/openstack/n3vascript> . /home/trey/openstack/n3vascript/novarc
(trey|3nova)~/openstack/n3vascript> euca-add-keypair test > test.pem
(trey|3nova)~/openstack/n3vascript> export CLOUD_SERVERS_USERNAME=admin
(trey|3nova)~/openstack/n3vascript> export CLOUD_SERVERS_API_KEY=admin
(trey|3nova)~/openstack/n3vascript> export CLOUD_SERVERS_URL=http://localhost:8774/v1.0/
(trey|3nova)~/openstack/n3vascript> euca-run-instances -k test -t m1.tiny ami-tiny
RESERVATION r-27hxqjra admin
INSTANCE i-qssosb ami-tiny scheduling test (admin, None) 0 m1.tiny 2010-12-29 20:54:38.391163
(trey|3nova)~/openstack/n3vascript>
(trey|3nova)~/openstack/n3vascript> # unlocked + admin
(trey|3nova)~/openstack/n3vascript> cloudservers list
+------------+-------------------+--------+-----------+------------+
| ID         | Name              | Status | Public IP | Private IP |
+------------+-------------------+--------+-----------+------------+
| 1620488315 | Server 1620488315 | active |           |            |
+------------+-------------------+--------+-----------+------------+
(trey|3nova)~/openstack/n3vascript> cloudservers pause 1620488315
(trey|3nova)~/openstack/n3vascript> cloudservers list
+------------+-------------------+--------+-----------+------------+
| ID         | Name              | Status | Public IP | Private IP |
+------------+-------------------+--------+-----------+------------+
| 1620488315 | Server 1620488315 | error  |           |            |
+------------+-------------------+--------+-----------+------------+
(trey|3nova)~/openstack/n3vascript> cloudservers unpause 1620488315
(trey|3nova)~/openstack/n3vascript> cloudservers list
+------------+-------------------+--------+-----------+------------+
| ID         | Name              | Status | Public IP | Private IP |
+------------+-------------------+--------+-----------+------------+
| 1620488315 | Server 1620488315 | active |           |            |
+------------+-------------------+--------+-----------+------------+
(trey|3nova)~/openstack/n3vascript>
(trey|3nova)~/openstack/n3vascript> # locked + admin
(trey|3nova)~/openstack/n3vascript> cloudservers list
+------------+-------------------+--------+-----------+------------+
| ID         | Name              | Status | Public IP | Private IP |
+------------+-------------------+--------+-----------+------------+
| 1620488315 | Server 1620488315 | active |           |            |
+------------+-------------------+--------+-----------+------------+
(trey|3nova)~/openstack/n3vascript> cloudservers pause 1620488315
(trey|3nova)~/openstack/n3vascript> cloudservers list
+------------+-------------------+--------+-----------+------------+
| ID         | Name              | Status | Public ...


Revision history for this message
Trey Morris (tr3buchet) wrote :

oooh less mangling this way too :D

Revision history for this message
Eric Day (eday) wrote :

I think the lock check should be happening on the compute worker node, not within the API server (and if it were happening in the API server, it should be in compute.api, not in the openstack API specific code). We need to start planning for race conditions in a distributed system, and this means handling operations at the canonical source.

review: Needs Fixing
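
To make the race concrete: with the check in the API server, the lock read and the action are separated by an RPC hop. A hypothetical interleaving (an illustrative trace, not output from this branch):

# Hypothetical interleaving with the lock check in the API server:
#
#   request A (api-1): get_lock(ctx, 7)             -> False
#   request B (api-2): lock(ctx, 7)                 -> worker sets locked=True
#   request A (api-1): rpc.cast("reboot_instance")  -> reboots a now-locked
#                                                      instance
#
# Checking on the compute worker keeps the read of `locked` and the action
# on the node that owns the instance record, closing that window.
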
Revision history for this message
Matt Dietz (cerberus) wrote :

I'm not sure I agree with the idea of a locked instance returning not found. I think it delivers an idea inconsistent with the intent of the locking mechanism. I would say HTTP 405 (MethodNotAllowed) or something like that instead, as a 404 wouldn't let the end user know that it needs to be unlocked. Also, I don't think we want to lock the show action, as the lock is supposed to represent immutability, not visibility.

review: Needs Fixing
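
Before the check later moved down to the compute layer, the change Matt asked for would have looked roughly like this at the API level (a hypothetical reconstruction; this intermediate revision is not in the final diff, and the decorator name follows Rick's suggestion):

from webob import exc

from nova.api.openstack import faults


def check_instance_lock(function):
    """Reject mutating actions on a locked instance with 405, not 404."""
    def decorated(self, req, id, *args, **kwargs):
        context = req.environ['nova.context']
        if self.compute_api.get_lock(context, id) and not context.is_admin:
            # 405 tells the caller the server exists but will not accept
            # this operation right now; 404 would wrongly hide it.
            return faults.Fault(exc.HTTPMethodNotAllowed())
        return function(self, req, id, *args, **kwargs)
    return decorated
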
lp:~tr3buchet/nova/lock updated
509. By Trey Morris

removed lock check from show and changed returning 404 to 405

Revision history for this message
Trey Morris (tr3buchet) wrote :

re matthew: fixed.
re eric: yikes, good point.

Revision history for this message
Trey Morris (tr3buchet) wrote :

also re eric: you devil...

lp:~tr3buchet/nova/lock updated
510. By Trey Morris

moved check lock decorator to compute api level. altered openstack.test_servers accordingly and wrote test for lock in tests.test_compute

511. By Trey Morris

fixed variables being out of scope in lock decorator

512. By Trey Morris

altered error exception/logging

513. By Trey Morris

altered error exception/logging

514. By Trey Morris

typo, trying to hurry.. look where that got me

515. By Trey Morris

added some logging

516. By Trey Morris

removed db.set_lock, using update_instance instead

517. By Trey Morris

moved check lock decorator from the compute api to the compute manager... when it rains it pours

518. By Trey Morris

syntax error

519. By Trey Morris

fixed up the compute lock test, was failing because the context was always admin

520. By Trey Morris

removed tests.api.openstack.test_servers test_lock, to hell with it. I'm not even sure testing lock needs to be at this level

Revision history for this message
Trey Morris (tr3buchet) wrote :

re eric:
OK, moved the lock check business to the compute api. Spent forever fixing the unittests, gave up. Moved the lock check business to the compute manager. Tests are passing. Testing shows it works. Attempting to perform a lock-checked action on a locked instance when not an admin will throw an exception, which is currently uncaught. Also, I removed the test for lock in the openstack api tests because I'm not sure a lock test should go there.

lp:~tr3buchet/nova/lock updated
521. By Trey Morris

altered the compute lock test

522. By Trey Morris

fixed the compute lock test

Revision history for this message
Trey Morris (tr3buchet) wrote :

Alright, here we go. Now it works and passes unittests, and when you attempt to do something to a locked instance and are not an admin, nothing happens and this shows up in the logs (for example):

ERROR:root:Instance |1| is locked, cannot execute |<function pause_instance at 0x2d910c8>|

Revision history for this message
Devin Carlen (devcamcar) wrote :

Good work, lock name is clear and code is in compute api now.

review: Approve
Revision history for this message
Eric Day (eday) wrote :

140: I would simplify this and assume (self, context, instance_id) for args[0/1/2] and not look in kwargs. In fact, on 129, I would instead do:

+ def decorated_function(self, context, instance_id, *args, **kwargs):

and remove the try block altogether.
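
The shape of that suggestion, side by side (the "before" is reconstructed from Eric's description; that intermediate revision is not shown here):

# Before (reconstructed): fish the arguments out of *args/**kwargs
# inside a try block.
def decorated_function_before(*args, **kwargs):
    try:
        self, context = args[0], args[1]
        instance_id = kwargs.get('instance_id') or args[2]
    except IndexError:
        raise TypeError("instance_id is required")
    return self, context, instance_id  # ...lock check and dispatch...


# After: let Python bind the required arguments; the try block disappears.
def decorated_function_after(self, context, instance_id, *args, **kwargs):
    return self, context, instance_id  # ...lock check and dispatch...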

lp:~tr3buchet/nova/lock updated
523. By Trey Morris

altered argument handling

524. By Trey Morris

pep8

525. By Trey Morris

merged trunk

526. By Trey Morris

accidentally left unlocked in there, it should have been locked

527. By Trey Morris

passing the correct parameters to decorated function

528. By Trey Morris

refers to instance_id instead of instance_ref[instance_id]

529. By Trey Morris

typo

Revision history for this message
Trey Morris (tr3buchet) wrote :

Alright Eric Day, I've got your changes implemented. Locally, unittests pass, pep8 returns nothing, and the lock works.

Revision history for this message
Eric Day (eday) wrote :

lgtm

review: Approve

Preview Diff

=== modified file 'nova/api/openstack/servers.py'
--- nova/api/openstack/servers.py	2011-01-05 17:50:19 +0000
+++ nova/api/openstack/servers.py	2011-01-07 00:00:12 +0000
@@ -170,6 +170,50 @@
             return faults.Fault(exc.HTTPUnprocessableEntity())
         return exc.HTTPAccepted()
 
+    def lock(self, req, id):
+        """
+        lock the instance with id
+        admin only operation
+
+        """
+        context = req.environ['nova.context']
+        try:
+            self.compute_api.lock(context, id)
+        except:
+            readable = traceback.format_exc()
+            logging.error(_("Compute.api::lock %s"), readable)
+            return faults.Fault(exc.HTTPUnprocessableEntity())
+        return exc.HTTPAccepted()
+
+    def unlock(self, req, id):
+        """
+        unlock the instance with id
+        admin only operation
+
+        """
+        context = req.environ['nova.context']
+        try:
+            self.compute_api.unlock(context, id)
+        except:
+            readable = traceback.format_exc()
+            logging.error(_("Compute.api::unlock %s"), readable)
+            return faults.Fault(exc.HTTPUnprocessableEntity())
+        return exc.HTTPAccepted()
+
+    def get_lock(self, req, id):
+        """
+        return the boolean state of (instance with id)'s lock
+
+        """
+        context = req.environ['nova.context']
+        try:
+            self.compute_api.get_lock(context, id)
+        except:
+            readable = traceback.format_exc()
+            logging.error(_("Compute.api::get_lock %s"), readable)
+            return faults.Fault(exc.HTTPUnprocessableEntity())
+        return exc.HTTPAccepted()
+
     def pause(self, req, id):
         """ Permit Admins to Pause the server. """
         ctxt = req.environ['nova.context']
 
=== modified file 'nova/compute/api.py'
--- nova/compute/api.py	2011-01-05 17:50:19 +0000
+++ nova/compute/api.py	2011-01-07 00:00:12 +0000
@@ -147,6 +147,7 @@
             'user_data': user_data or '',
             'key_name': key_name,
             'key_data': key_data,
+            'locked': False,
             'availability_zone': availability_zone}
 
         elevated = context.elevated()
@@ -354,6 +355,38 @@
                     {"method": "unrescue_instance",
                      "args": {"instance_id": instance_id}})
 
+    def lock(self, context, instance_id):
+        """
+        lock the instance with instance_id
+
+        """
+        instance = self.get_instance(context, instance_id)
+        host = instance['host']
+        rpc.cast(context,
+                 self.db.queue_get_for(context, FLAGS.compute_topic, host),
+                 {"method": "lock_instance",
+                  "args": {"instance_id": instance['id']}})
+
+    def unlock(self, context, instance_id):
+        """
+        unlock the instance with instance_id
+
+        """
+        instance = self.get_instance(context, instance_id)
+        host = instance['host']
+        rpc.cast(context,
+                 self.db.queue_get_for(context, FLAGS.compute_topic, host),
+                 {"method": "unlock_instance",
+                  "args": {"instance_id": instance['id']}})
+
+    def get_lock(self, context, instance_id):
+        """
+        return the boolean state of (instance with instance_id)'s lock
+
+        """
+        instance = self.get_instance(context, instance_id)
+        return instance['locked']
+
     def attach_volume(self, context, instance_id, volume_id, device):
         if not re.match("^/dev/[a-z]d[a-z]+$", device):
             raise exception.ApiError(_("Invalid device specified: %s. "
 
=== modified file 'nova/compute/manager.py'
--- nova/compute/manager.py	2011-01-06 13:12:57 +0000
+++ nova/compute/manager.py	2011-01-07 00:00:12 +0000
@@ -36,6 +36,7 @@
 
 import datetime
 import logging
+import functools
 
 from nova import exception
 from nova import flags
@@ -53,6 +54,38 @@
                     'Stub network related code')
 
 
+def checks_instance_lock(function):
+    """
+    decorator used for preventing action against locked instances
+    unless, of course, you happen to be admin
+
+    """
+
+    @functools.wraps(function)
+    def decorated_function(self, context, instance_id, *args, **kwargs):
+
+        logging.info(_("check_instance_lock: decorating: |%s|"), function)
+        logging.info(_("check_instance_lock: arguments: |%s| |%s| |%s|"),
+                     self,
+                     context,
+                     instance_id)
+        locked = self.get_lock(context, instance_id)
+        admin = context.is_admin
+        logging.info(_("check_instance_lock: locked: |%s|"), locked)
+        logging.info(_("check_instance_lock: admin: |%s|"), admin)
+
+        # if admin or unlocked call function otherwise log error
+        if admin or not locked:
+            logging.info(_("check_instance_lock: executing: |%s|"), function)
+            function(self, context, instance_id, *args, **kwargs)
+        else:
+            logging.error(_("check_instance_lock: not executing |%s|"),
+                          function)
+            return False
+
+    return decorated_function
+
+
 class ComputeManager(manager.Manager):
 
     """Manages the running instances from creation to destruction."""
@@ -158,6 +191,7 @@
         self._update_state(context, instance_id)
 
     @exception.wrap_exception
+    @checks_instance_lock
     def terminate_instance(self, context, instance_id):
         """Terminate an instance on this machine."""
         context = context.elevated()
@@ -202,6 +236,7 @@
         self.db.instance_destroy(context, instance_id)
 
     @exception.wrap_exception
+    @checks_instance_lock
    def reboot_instance(self, context, instance_id):
         """Reboot an instance on this server."""
         context = context.elevated()
@@ -246,6 +281,7 @@
         self.driver.snapshot(instance_ref, name)
 
     @exception.wrap_exception
+    @checks_instance_lock
     def rescue_instance(self, context, instance_id):
         """Rescue an instance on this server."""
         context = context.elevated()
@@ -261,6 +297,7 @@
         self._update_state(context, instance_id)
 
     @exception.wrap_exception
+    @checks_instance_lock
     def unrescue_instance(self, context, instance_id):
         """Rescue an instance on this server."""
         context = context.elevated()
@@ -280,6 +317,7 @@
         self._update_state(context, instance_id)
 
     @exception.wrap_exception
+    @checks_instance_lock
     def pause_instance(self, context, instance_id):
         """Pause an instance on this server."""
         context = context.elevated()
@@ -297,6 +335,7 @@
                                              result))
 
     @exception.wrap_exception
+    @checks_instance_lock
     def unpause_instance(self, context, instance_id):
         """Unpause a paused instance on this server."""
         context = context.elevated()
@@ -324,8 +363,12 @@
         return self.driver.get_diagnostics(instance_ref)
 
     @exception.wrap_exception
+    @checks_instance_lock
     def suspend_instance(self, context, instance_id):
-        """suspend the instance with instance_id"""
+        """
+        suspend the instance with instance_id
+
+        """
         context = context.elevated()
         instance_ref = self.db.instance_get(context, instance_id)
 
@@ -340,8 +383,12 @@
                                              result))
 
     @exception.wrap_exception
+    @checks_instance_lock
     def resume_instance(self, context, instance_id):
-        """resume the suspended instance with instance_id"""
+        """
+        resume the suspended instance with instance_id
+
+        """
         context = context.elevated()
         instance_ref = self.db.instance_get(context, instance_id)
 
@@ -356,6 +403,41 @@
                                              result))
 
     @exception.wrap_exception
+    def lock_instance(self, context, instance_id):
+        """
+        lock the instance with instance_id
+
+        """
+        context = context.elevated()
+        instance_ref = self.db.instance_get(context, instance_id)
+
+        logging.debug(_('instance %s: locking'), instance_id)
+        self.db.instance_update(context, instance_id, {'locked': True})
+
+    @exception.wrap_exception
+    def unlock_instance(self, context, instance_id):
+        """
+        unlock the instance with instance_id
+
+        """
+        context = context.elevated()
+        instance_ref = self.db.instance_get(context, instance_id)
+
+        logging.debug(_('instance %s: unlocking'), instance_id)
+        self.db.instance_update(context, instance_id, {'locked': False})
+
+    @exception.wrap_exception
+    def get_lock(self, context, instance_id):
+        """
+        return the boolean state of (instance with instance_id)'s lock
+
+        """
+        context = context.elevated()
+        logging.debug(_('instance %s: getting locked state'), instance_id)
+        instance_ref = self.db.instance_get(context, instance_id)
+        return instance_ref['locked']
+
+    @exception.wrap_exception
     def get_console_output(self, context, instance_id):
         """Send the console output for an instance."""
         context = context.elevated()
@@ -365,6 +447,7 @@
         return self.driver.get_console_output(instance_ref)
 
     @exception.wrap_exception
+    @checks_instance_lock
     def attach_volume(self, context, instance_id, volume_id, mountpoint):
         """Attach a volume to an instance."""
         context = context.elevated()
@@ -394,6 +477,7 @@
         return True
 
     @exception.wrap_exception
+    @checks_instance_lock
     def detach_volume(self, context, instance_id, volume_id):
         """Detach a volume from an instance."""
         context = context.elevated()
 
=== modified file 'nova/db/sqlalchemy/models.py'
--- nova/db/sqlalchemy/models.py	2011-01-05 22:59:41 +0000
+++ nova/db/sqlalchemy/models.py	2011-01-07 00:00:12 +0000
@@ -224,6 +224,8 @@
     display_name = Column(String(255))
     display_description = Column(String(255))
 
+    locked = Column(Boolean)
+
     # TODO(vish): see Ewan's email about state improvements, probably
     # should be in a driver base class or some such
     # vmstate_state = running, halted, suspended, paused
 
=== modified file 'nova/tests/test_compute.py'
--- nova/tests/test_compute.py	2010-12-31 03:55:00 +0000
+++ nova/tests/test_compute.py	2011-01-07 00:00:12 +0000
@@ -178,3 +178,22 @@
                           self.context,
                           instance_id)
         self.compute.terminate_instance(self.context, instance_id)
+
+    def test_lock(self):
+        """ensure locked instance cannot be changed"""
+        instance_id = self._create_instance()
+        self.compute.run_instance(self.context, instance_id)
+
+        non_admin_context = context.RequestContext(None, None, False, False)
+
+        # decorator should return False (fail) with locked nonadmin context
+        self.compute.lock_instance(self.context, instance_id)
+        ret_val = self.compute.reboot_instance(non_admin_context, instance_id)
+        self.assertEqual(ret_val, False)
+
+        # decorator should return None (success) with unlocked nonadmin context
+        self.compute.unlock_instance(self.context, instance_id)
+        ret_val = self.compute.reboot_instance(non_admin_context, instance_id)
+        self.assertEqual(ret_val, None)
+
+        self.compute.terminate_instance(self.context, instance_id)