Merge lp:~jk0/nova/diagnostics-per-instance into lp:~hudson-openstack/nova/trunk
- diagnostics-per-instance
- Merge into trunk
Status: | Merged |
---|---|
Approved by: | Eric Day |
Approved revision: | 503 |
Merged at revision: | 507 |
Proposed branch: | lp:~jk0/nova/diagnostics-per-instance |
Merge into: | lp:~hudson-openstack/nova/trunk |
Diff against target: |
304 lines (+92/-24) 14 files modified
nova/api/openstack/__init__.py (+2/-0) nova/api/openstack/servers.py (+10/-0) nova/compute/api.py (+14/-0) nova/compute/manager.py (+9/-0) nova/db/api.py (+5/-0) nova/db/sqlalchemy/api.py (+12/-0) nova/db/sqlalchemy/models.py (+2/-17) nova/tests/api/openstack/test_servers.py (+16/-0) nova/tests/test_xenapi.py (+5/-1) nova/virt/fake.py (+3/-0) nova/virt/libvirt_conn.py (+3/-0) nova/virt/xenapi/vm_utils.py (+4/-0) nova/virt/xenapi/vmops.py (+4/-3) nova/virt/xenapi_conn.py (+3/-3) |
To merge this branch: | bzr merge lp:~jk0/nova/diagnostics-per-instance |
Related bugs: | |
Related blueprints: |
Run Diagnostics on Instance
(Medium)
|
Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
Eric Day (community) | Approve | ||
Matt Dietz (community) | Approve | ||
Review via email: mp+44872@code.launchpad.net |
Commit message
Make InstanceActions and live diagnostics available through the Admin API.
Description of the change
The InstanceDiagnostics table was removed since storing diagnostics is beyond the scope of this blueprint and is a big enough project to have its own blueprint.
Matt Dietz (cerberus) wrote : | # |
Josh corrected me offline. I'm fine with the rest
Eric Day (eday) wrote : | # |
70, 81: Does this need to be elevated?
For get_actions, why are you diving all the way down into the virt worker when you could just read the table from compute.api? I know we discussed storing this data elsewhere eventually since it's not critical for operation, but for now we are still central (and this call will probably happen in compuite.api. I imagine when libvirt support is added it will use the same reporting mechanism (which is currently the actions table), so I wouldn't worry about different implementations yet either.
Cory Wright (corywright) wrote : | # |
I don't understand the changes to test_run_terminate. Why change
self.
to
self.
? The former is more specific and is a tighter test.
Also, the assertion should probably compare against a constant value that you are expecting, 0 or 1 in this case, rather than len(instances) + 1.
You may have a good reason for these changes and I'm just not aware of it. If so, let me know. :)
Josh Kearney (jk0) wrote : | # |
> 70, 81: Does this need to be elevated?
I don't believe so -- good catch. I'll remove that and test again without it.
> For get_actions, why are you diving all the way down into the virt worker when
> you could just read the table from compute.api? I know we discussed storing
> this data elsewhere eventually since it's not critical for operation, but for
> now we are still central (and this call will probably happen in compuite.api.
> I imagine when libvirt support is added it will use the same reporting
> mechanism (which is currently the actions table), so I wouldn't worry about
> different implementations yet either.
Sure thing. I'll clean that up.
Josh Kearney (jk0) wrote : | # |
> I don't understand the changes to test_run_terminate. Why change
>
> self.assertEqua
>
> to
>
> self.assertNotE
>
> ? The former is more specific and is a tighter test.
I had to change that because the test was failing when more than one instance was returned. By doing it this way, as long as the instance count is >0, we know there is one or more running. The restriction comes from Python 2.6's unittest. There just isn't much to work with - it's either Equal or Not Equal.
> Also, the assertion should probably compare against a constant value that you
> are expecting, 0 or 1 in this case, rather than len(instances) + 1.
>
> You may have a good reason for these changes and I'm just not aware of it. If
> so, let me know. :)
The thing is we don't know that it will *always* be 1 or 0. We could start off with 15 running instances. Terminate one. Assert that the original instance_count is equal to the current instance count + 1. If we were running Python 2.7's unittest, there is assertGreater, assertGreaterEqual, etc, but unfortunately those aren't available in 2.6.
Eric Day (eday) wrote : | # |
But what actually changed in this patch that would have broken the tests? Is this unrelated to the changes you propose? I've not seen any issues with this test case failing as it was.
Josh Kearney (jk0) wrote : | # |
> But what actually changed in this patch that would have broken the tests? Is
> this unrelated to the changes you propose? I've not seen any issues with this
> test case failing as it was.
Nothing changed in this branch that would break the tests -- I just merged trunk yesterday and ran the tests and had that problem.
Eric Day (eday) wrote : | # |
get_actions is still going over an RPC call to the compute manager, where it could be answered directly in compute.api. Push it one more layer up :)
Cory Wright (corywright) wrote : | # |
> > Also, the assertion should probably compare against a constant value that you
> > are expecting, 0 or 1 in this case, rather than len(instances) + 1.
> >
> The thing is we don't know that it will *always* be 1 or 0. We could start off
> with 15 running instances. Terminate one. Assert that the original
> instance_count is equal to the current instance count + 1.
Shouldn't those instances be torn down between each test case?
> If we were running
> Python 2.7's unittest, there is assertGreater, assertGreaterEqual, etc, but
> unfortunately those aren't available in 2.6.
Don't forget that you can always just use Python's built-in assert:
assert len(instances) > 0
Eric Day (eday) wrote : | # |
Sorry to be a bother, but compute.api methods should include a verb, such as get_actions, not just 'actions'. In the future I could see having other operations on actions. :)
OpenStack Infra (hudson-openstack) wrote : | # |
The attempt to merge lp:~jk0/nova/diagnostics-per-instance into lp:nova failed. Below is the output from the failed tests.
TrialTestCase
runTest ok
AdminAPITest
test_
test_
APITest
test_
Test
test_
test_
test_bad_token ok
test_bad_user ok
test_no_user ok
test_
TestLimiter
test_
TestFaults
test_
test_raise ok
test_
FlavorsTest
test_
test_
GlanceImageServ
test_create ok
test_
test_delete ok
test_update ok
ImageController
test_
test_
LocalImageServi
test_create ok
test_
test_delete ok
test_update ok
LimiterTest
test_minute ok
test_
test_second ok
test_
test_
WSGIAppProxyTest
test_200 ok
test_403 ok
test_failure ok
WSGIAppTest
test_escaping ok
test_good_urls ok
test_
test_
test_
ServersTest
test_
test_
test_
test_
OpenStack Infra (hudson-openstack) wrote : | # |
The attempt to merge lp:~jk0/nova/diagnostics-per-instance into lp:nova failed. Below is the output from the failed tests.
TrialTestCase
runTest ok
AdminAPITest
test_
test_
APITest
test_
Test
test_
test_
test_bad_token ok
test_bad_user ok
test_no_user ok
test_
TestLimiter
test_
TestFaults
test_
test_raise ok
test_
FlavorsTest
test_
test_
GlanceImageServ
test_create ok
test_
test_delete ok
test_update ok
ImageController
test_
test_
LocalImageServi
test_create ok
test_
test_delete ok
test_update ok
LimiterTest
test_minute ok
test_
test_second ok
test_
test_
WSGIAppProxyTest
test_200 ok
test_403 ok
test_failure ok
WSGIAppTest
test_escaping ok
test_good_urls ok
test_
test_
test_
ServersTest
test_
test_
test_
test_
OpenStack Infra (hudson-openstack) wrote : | # |
The attempt to merge lp:~jk0/nova/diagnostics-per-instance into lp:nova failed. Below is the output from the failed tests.
TrialTestCase
runTest ok
AdminAPITest
test_
test_
APITest
test_
Test
test_
test_
test_bad_token ok
test_bad_user ok
test_no_user ok
test_
TestLimiter
test_
TestFaults
test_
test_raise ok
test_
FlavorsTest
test_
test_
GlanceImageServ
test_create ok
test_
test_delete ok
test_update ok
ImageController
test_
test_
LocalImageServi
test_create ok
test_
test_delete ok
test_update ok
LimiterTest
test_minute ok
test_
test_second ok
test_
test_
WSGIAppProxyTest
test_200 ok
test_403 ok
test_failure ok
WSGIAppTest
test_escaping ok
test_good_urls ok
test_
test_
test_
ServersTest
test_
test_
test_
test_
- 501. By Josh Kearney
-
Removed problematic test
- 502. By Josh Kearney
-
Improved test
- 503. By Josh Kearney
-
Cleanup
Preview Diff
1 | === modified file 'nova/api/openstack/__init__.py' |
2 | --- nova/api/openstack/__init__.py 2010-12-30 17:26:21 +0000 |
3 | +++ nova/api/openstack/__init__.py 2010-12-31 01:09:13 +0000 |
4 | @@ -93,6 +93,8 @@ |
5 | logging.debug("Including admin operations in API.") |
6 | server_members['pause'] = 'POST' |
7 | server_members['unpause'] = 'POST' |
8 | + server_members["diagnostics"] = "GET" |
9 | + server_members["actions"] = "GET" |
10 | server_members['suspend'] = 'POST' |
11 | server_members['resume'] = 'POST' |
12 | |
13 | |
14 | === modified file 'nova/api/openstack/servers.py' |
15 | --- nova/api/openstack/servers.py 2010-12-28 23:34:51 +0000 |
16 | +++ nova/api/openstack/servers.py 2010-12-31 01:09:13 +0000 |
17 | @@ -218,3 +218,13 @@ |
18 | logging.error(_("compute.api::resume %s"), readable) |
19 | return faults.Fault(exc.HTTPUnprocessableEntity()) |
20 | return exc.HTTPAccepted() |
21 | + |
22 | + def diagnostics(self, req, id): |
23 | + """Permit Admins to retrieve server diagnostics.""" |
24 | + ctxt = req.environ["nova.context"] |
25 | + return self.compute_api.get_diagnostics(ctxt, id) |
26 | + |
27 | + def actions(self, req, id): |
28 | + """Permit Admins to retrieve server actions.""" |
29 | + ctxt = req.environ["nova.context"] |
30 | + return self.compute_api.get_actions(ctxt, id) |
31 | |
32 | === modified file 'nova/compute/api.py' |
33 | --- nova/compute/api.py 2010-12-28 19:01:54 +0000 |
34 | +++ nova/compute/api.py 2010-12-31 01:09:13 +0000 |
35 | @@ -293,6 +293,20 @@ |
36 | {"method": "unpause_instance", |
37 | "args": {"instance_id": instance['id']}}) |
38 | |
39 | + def get_diagnostics(self, context, instance_id): |
40 | + """Retrieve diagnostics for the given instance.""" |
41 | + instance = self.db.instance_get_by_internal_id(context, instance_id) |
42 | + host = instance["host"] |
43 | + return rpc.call(context, |
44 | + self.db.queue_get_for(context, FLAGS.compute_topic, host), |
45 | + {"method": "get_diagnostics", |
46 | + "args": {"instance_id": instance["id"]}}) |
47 | + |
48 | + def get_actions(self, context, instance_id): |
49 | + """Retrieve actions for the given instance.""" |
50 | + instance = self.db.instance_get_by_internal_id(context, instance_id) |
51 | + return self.db.instance_get_actions(context, instance["id"]) |
52 | + |
53 | def suspend(self, context, instance_id): |
54 | """suspend the instance with instance_id""" |
55 | instance = self.db.instance_get_by_internal_id(context, instance_id) |
56 | |
57 | === modified file 'nova/compute/manager.py' |
58 | --- nova/compute/manager.py 2010-12-28 19:01:54 +0000 |
59 | +++ nova/compute/manager.py 2010-12-31 01:09:13 +0000 |
60 | @@ -318,6 +318,15 @@ |
61 | result)) |
62 | |
63 | @exception.wrap_exception |
64 | + def get_diagnostics(self, context, instance_id): |
65 | + """Retrieve diagnostics for an instance on this server.""" |
66 | + instance_ref = self.db.instance_get(context, instance_id) |
67 | + |
68 | + if instance_ref["state"] == power_state.RUNNING: |
69 | + logging.debug(_("instance %s: retrieving diagnostics"), |
70 | + instance_ref["internal_id"]) |
71 | + return self.driver.get_diagnostics(instance_ref) |
72 | + |
73 | def suspend_instance(self, context, instance_id): |
74 | """suspend the instance with instance_id""" |
75 | context = context.elevated() |
76 | |
77 | === modified file 'nova/db/api.py' |
78 | --- nova/db/api.py 2010-12-22 21:17:30 +0000 |
79 | +++ nova/db/api.py 2010-12-31 01:09:13 +0000 |
80 | @@ -383,6 +383,11 @@ |
81 | return IMPL.instance_action_create(context, values) |
82 | |
83 | |
84 | +def instance_get_actions(context, instance_id): |
85 | + """Get instance actions by instance id.""" |
86 | + return IMPL.instance_get_actions(context, instance_id) |
87 | + |
88 | + |
89 | ################### |
90 | |
91 | |
92 | |
93 | === modified file 'nova/db/sqlalchemy/api.py' |
94 | --- nova/db/sqlalchemy/api.py 2010-12-23 21:41:54 +0000 |
95 | +++ nova/db/sqlalchemy/api.py 2010-12-31 01:09:13 +0000 |
96 | @@ -856,6 +856,18 @@ |
97 | return action_ref |
98 | |
99 | |
100 | +@require_admin_context |
101 | +def instance_get_actions(context, instance_id): |
102 | + """Return the actions associated to the given instance id""" |
103 | + session = get_session() |
104 | + actions = {} |
105 | + for action in session.query(models.InstanceActions).\ |
106 | + filter_by(instance_id=instance_id).\ |
107 | + all(): |
108 | + actions[action.action] = action.error |
109 | + return actions |
110 | + |
111 | + |
112 | ################### |
113 | |
114 | |
115 | |
116 | === modified file 'nova/db/sqlalchemy/models.py' |
117 | --- nova/db/sqlalchemy/models.py 2010-12-22 21:17:30 +0000 |
118 | +++ nova/db/sqlalchemy/models.py 2010-12-31 01:09:13 +0000 |
119 | @@ -22,7 +22,7 @@ |
120 | import datetime |
121 | |
122 | from sqlalchemy.orm import relationship, backref, object_mapper |
123 | -from sqlalchemy import Column, Integer, Float, String, schema |
124 | +from sqlalchemy import Column, Integer, String, schema |
125 | from sqlalchemy import ForeignKey, DateTime, Boolean, Text |
126 | from sqlalchemy.exc import IntegrityError |
127 | from sqlalchemy.ext.declarative import declarative_base |
128 | @@ -236,21 +236,6 @@ |
129 | # 'shutdown', 'shutoff', 'crashed']) |
130 | |
131 | |
132 | -class InstanceDiagnostics(BASE, NovaBase): |
133 | - """Represents a guest VM's diagnostics""" |
134 | - __tablename__ = "instance_diagnostics" |
135 | - id = Column(Integer, primary_key=True) |
136 | - instance_id = Column(Integer, ForeignKey('instances.id')) |
137 | - |
138 | - memory_available = Column(Float) |
139 | - memory_free = Column(Float) |
140 | - cpu_load = Column(Float) |
141 | - disk_read = Column(Float) |
142 | - disk_write = Column(Float) |
143 | - net_tx = Column(Float) |
144 | - net_rx = Column(Float) |
145 | - |
146 | - |
147 | class InstanceActions(BASE, NovaBase): |
148 | """Represents a guest VM's actions and results""" |
149 | __tablename__ = "instance_actions" |
150 | @@ -561,7 +546,7 @@ |
151 | it will never need to be called explicitly elsewhere. |
152 | """ |
153 | from sqlalchemy import create_engine |
154 | - models = (Service, Instance, InstanceDiagnostics, InstanceActions, |
155 | + models = (Service, Instance, InstanceActions, |
156 | Volume, ExportDevice, IscsiTarget, FixedIp, FloatingIp, |
157 | Network, SecurityGroup, SecurityGroupIngressRule, |
158 | SecurityGroupInstanceAssociation, AuthToken, User, |
159 | |
160 | === modified file 'nova/tests/api/openstack/test_servers.py' |
161 | --- nova/tests/api/openstack/test_servers.py 2010-12-23 21:33:16 +0000 |
162 | +++ nova/tests/api/openstack/test_servers.py 2010-12-31 01:09:13 +0000 |
163 | @@ -95,6 +95,10 @@ |
164 | fake_compute_api) |
165 | self.stubs.Set(nova.compute.api.ComputeAPI, 'resume', |
166 | fake_compute_api) |
167 | + self.stubs.Set(nova.compute.api.ComputeAPI, "get_diagnostics", |
168 | + fake_compute_api) |
169 | + self.stubs.Set(nova.compute.api.ComputeAPI, "get_actions", |
170 | + fake_compute_api) |
171 | self.allow_admin = FLAGS.allow_admin_api |
172 | |
173 | def tearDown(self): |
174 | @@ -274,6 +278,18 @@ |
175 | res = req.get_response(nova.api.API('os')) |
176 | self.assertEqual(res.status_int, 202) |
177 | |
178 | + def test_server_diagnostics(self): |
179 | + req = webob.Request.blank("/v1.0/servers/1/diagnostics") |
180 | + req.method = "GET" |
181 | + res = req.get_response(nova.api.API("os")) |
182 | + self.assertEqual(res.status_int, 404) |
183 | + |
184 | + def test_server_actions(self): |
185 | + req = webob.Request.blank("/v1.0/servers/1/actions") |
186 | + req.method = "GET" |
187 | + res = req.get_response(nova.api.API("os")) |
188 | + self.assertEqual(res.status_int, 404) |
189 | + |
190 | def test_server_reboot(self): |
191 | body = dict(server=dict( |
192 | name='server_test', imageId=2, flavorId=2, metadata={}, |
193 | |
194 | === modified file 'nova/tests/test_xenapi.py' |
195 | --- nova/tests/test_xenapi.py 2010-12-28 18:53:32 +0000 |
196 | +++ nova/tests/test_xenapi.py 2010-12-31 01:09:13 +0000 |
197 | @@ -166,6 +166,10 @@ |
198 | instances = self.conn.list_instances() |
199 | self.assertEquals(instances, []) |
200 | |
201 | + def test_get_diagnostics(self): |
202 | + instance = self._create_instance() |
203 | + self.conn.get_diagnostics(instance) |
204 | + |
205 | def test_instance_snapshot(self): |
206 | stubs.stubout_instance_snapshot(self.stubs) |
207 | instance = self._create_instance() |
208 | @@ -253,7 +257,7 @@ |
209 | 'kernel_id': 2, |
210 | 'ramdisk_id': 3, |
211 | 'instance_type': 'm1.large', |
212 | - 'mac_address': 'aa:bb:cc:dd:ee:ff', |
213 | + 'mac_address': 'aa:bb:cc:dd:ee:ff' |
214 | } |
215 | instance = db.instance_create(values) |
216 | self.conn.spawn(instance) |
217 | |
218 | === modified file 'nova/virt/fake.py' |
219 | --- nova/virt/fake.py 2010-12-28 19:01:54 +0000 |
220 | +++ nova/virt/fake.py 2010-12-31 01:09:13 +0000 |
221 | @@ -216,6 +216,9 @@ |
222 | 'num_cpu': 2, |
223 | 'cpu_time': 0} |
224 | |
225 | + def get_diagnostics(self, instance_name): |
226 | + pass |
227 | + |
228 | def list_disks(self, instance_name): |
229 | """ |
230 | Return the IDs of all the virtual disks attached to the specified |
231 | |
232 | === modified file 'nova/virt/libvirt_conn.py' |
233 | --- nova/virt/libvirt_conn.py 2010-12-30 17:26:21 +0000 |
234 | +++ nova/virt/libvirt_conn.py 2010-12-31 01:09:13 +0000 |
235 | @@ -587,6 +587,9 @@ |
236 | 'num_cpu': num_cpu, |
237 | 'cpu_time': cpu_time} |
238 | |
239 | + def get_diagnostics(self, instance_name): |
240 | + raise exception.APIError("diagnostics are not supported for libvirt") |
241 | + |
242 | def get_disks(self, instance_name): |
243 | """ |
244 | Note that this function takes an instance name, not an Instance, so |
245 | |
246 | === modified file 'nova/virt/xenapi/vm_utils.py' |
247 | --- nova/virt/xenapi/vm_utils.py 2010-12-28 19:01:54 +0000 |
248 | +++ nova/virt/xenapi/vm_utils.py 2010-12-31 01:09:13 +0000 |
249 | @@ -347,6 +347,10 @@ |
250 | try: |
251 | host = session.get_xenapi_host() |
252 | host_ip = session.get_xenapi().host.get_record(host)["address"] |
253 | + except (cls.XenAPI.Failure, KeyError) as e: |
254 | + return {"Unable to retrieve diagnostics": e} |
255 | + |
256 | + try: |
257 | diags = {} |
258 | xml = get_rrd(host_ip, record["uuid"]) |
259 | if xml: |
260 | |
261 | === modified file 'nova/virt/xenapi/vmops.py' |
262 | --- nova/virt/xenapi/vmops.py 2010-12-28 19:01:54 +0000 |
263 | +++ nova/virt/xenapi/vmops.py 2010-12-31 01:09:13 +0000 |
264 | @@ -268,11 +268,12 @@ |
265 | rec = self._session.get_xenapi().VM.get_record(vm) |
266 | return VMHelper.compile_info(rec) |
267 | |
268 | - def get_diagnostics(self, instance_id): |
269 | + def get_diagnostics(self, instance): |
270 | """Return data about VM diagnostics""" |
271 | - vm = VMHelper.lookup(self._session, instance_id) |
272 | + vm = VMHelper.lookup(self._session, instance.name) |
273 | if vm is None: |
274 | - raise exception.NotFound(_("Instance not found %s") % instance_id) |
275 | + raise exception.NotFound(_("Instance not found %s") % |
276 | + instance.name) |
277 | rec = self._session.get_xenapi().VM.get_record(vm) |
278 | return VMHelper.compile_diagnostics(self._session, rec) |
279 | |
280 | |
281 | === modified file 'nova/virt/xenapi_conn.py' |
282 | --- nova/virt/xenapi_conn.py 2010-12-28 19:01:54 +0000 |
283 | +++ nova/virt/xenapi_conn.py 2010-12-31 01:09:13 +0000 |
284 | @@ -167,9 +167,9 @@ |
285 | """Return data about VM instance""" |
286 | return self._vmops.get_info(instance_id) |
287 | |
288 | - def get_diagnostics(self, instance_id): |
289 | + def get_diagnostics(self, instance): |
290 | """Return data about VM diagnostics""" |
291 | - return self._vmops.get_diagnostics(instance_id) |
292 | + return self._vmops.get_diagnostics(instance) |
293 | |
294 | def get_console_output(self, instance): |
295 | """Return snapshot of console""" |
296 | @@ -242,7 +242,7 @@ |
297 | status = self._session.xenapi.task.get_status(task) |
298 | action = dict( |
299 | instance_id=int(id), |
300 | - action=name, |
301 | + action=name[0:255], # Ensure action is never > 255 |
302 | error=None) |
303 | if status == "pending": |
304 | return |
I'm not going to make you do this, but we need to more explicitly establish what "admin" API means, because right now it's simply whether or not a whole slew of extra routes are available. No vetting of any kind is being done. I know that the DB layer will ultimately reject the request, but I think that's a silly waste of time if the API 1) already knows what you are and 2) can do something about it
Also, all controller verbs returning data are expected to format it into a dictionary for proper formatting into the Accept'd type. It doesn't look like this would return anything we would want to the client. We should be consistent