Merge ~fwieffering/maas:tgt-redfish-bugfix into maas:master

Proposed by Freddy
Status: Merged
Approved by: Blake Rouse
Approved revision: 1c395302c2e11011fd5f66702a7d122a3e2a6e5c
Merge reported by: MAAS Lander
Merged at revision: not available
Proposed branch: ~fwieffering/maas:tgt-redfish-bugfix
Merge into: maas:master
Diff against target: 252 lines (+114/-15)
4 files modified
src/maasserver/api/nodes.py (+1/-1)
src/maasserver/clusterrpc/power.py (+1/-1)
src/provisioningserver/drivers/power/redfish.py (+34/-12)
src/provisioningserver/drivers/power/tests/test_redfish.py (+78/-1)
Reviewer Review Type Date Requested Status
Blake Rouse (community) Approve
MAAS Lander Approve
Review via email: mp+373131@code.launchpad.net

Commit message

Fixes LP: #1845032 - Fixes issues that causes MAAS Redfish power driver to not work on some BMC's.

* removed trailing slashes from redfish power driver. This caused errors for some manufacturers implementations that we use
* catch json parsing errors and return a sensible error message if it fails to parse
* increase timeout on power query calls
* check state in redfish power driver power_off prior to submitting power_off request. Some redfish implementations we encountered errors if the state was already off and a power off request was submitted

Description of the change

We have switched to the redfish power driver for MaaS and encountered a few bugs that we needed to fix.

- removed trailing slashes from redfish power driver. This caused errors for some manufacturers implementations that we use
- catch json parsing errors and return a sensible error message if it fails to parse
- increase timeout on power query calls
- check state in redfish power driver power_off prior to submitting power_off request. Some redfish implementations we encountered errors if the state was already off and a power off request was submitted

Will comment with branch once bug is created

To post a comment you must log in.
Revision history for this message
MAAS Lander (maas-lander) wrote :

UNIT TESTS
-b tgt-redfish-bugfix lp:~fwieffering/maas/+git/maas into -b master lp:~maas-committers/maas

STATUS: FAILED
LOG: http://maas-ci-jenkins.internal:8080/job/maas/job/branch-tester/6476/console
COMMIT: 18bdccb5235e69f27e6a387aa010f489c469dac0

review: Needs Fixing
Revision history for this message
Blake Rouse (blake-rouse) wrote :

I like the fixes, overall looks acceptable. But tests need to be updated and more need to be added to cover the changes.

review: Needs Fixing
Revision history for this message
Freddy (fwieffering) wrote :

I will take care of that.

Revision history for this message
MAAS Lander (maas-lander) wrote :

UNIT TESTS
-b tgt-redfish-bugfix lp:~fwieffering/maas/+git/maas into -b master lp:~maas-committers/maas

STATUS: FAILED
LOG: http://maas-ci-jenkins.internal:8080/job/maas/job/branch-tester/6478/console
COMMIT: ecdda8fc9aa154861e2b056c683170d4e566efba

review: Needs Fixing
Revision history for this message
Freddy (fwieffering) wrote :

The tests should be all set. I don't know if this jenkins job is going to run again or not, but I have them passing.

Revision history for this message
Newell Jensen (newell-jensen) wrote :

Some manufacturers require the trailing slash. We should probably tailor the driver to attempt both cases.

Revision history for this message
Blake Rouse (blake-rouse) wrote :

Code change looks good and the tests look good. Only issue I see is that you have not signed the Canonical contributors agreement. Already having a launchpad account makes that really easy.

https://ubuntu.com/legal/contributors/agreement

Revision history for this message
MAAS Lander (maas-lander) wrote :

UNIT TESTS
-b tgt-redfish-bugfix lp:~fwieffering/maas/+git/maas into -b master lp:~maas-committers/maas

STATUS: FAILED
LOG: http://maas-ci-jenkins.internal:8080/job/maas/job/branch-tester/6485/console
COMMIT: d0470f99ac1de4f6cac3fcc814b9751982eabc88

review: Needs Fixing
Revision history for this message
Freddy (fwieffering) wrote :

I have signed the canonical contributors agreement

Revision history for this message
MAAS Lander (maas-lander) wrote :

UNIT TESTS
-b tgt-redfish-bugfix lp:~fwieffering/maas/+git/maas into -b master lp:~maas-committers/maas

STATUS: SUCCESS
COMMIT: ae549d3b087f66e0ccfff7ae2b72334cfe99f517

review: Approve
Revision history for this message
Blake Rouse (blake-rouse) wrote :

Seems that @newell-jensen, our resident power driver expert and original author of the Redfish driver says that some BMC's require the '/' at the end. We need to confirm this is the case before landing this.

review: Needs Information
Revision history for this message
Freddy (fwieffering) wrote :

I can add a retry to add a trailing slash if the power driver call fails for any reason. Would you like that approach or is this something that can be investigated?

Revision history for this message
Blake Rouse (blake-rouse) wrote :

I would hope in the case that a ending slash is required the BMC would return a 301 redirect to the URL with a slash append, we should ensure that we handle a 301 or 302 redirect.

If it returns a 404 we could retry with a pending slash, just to be sure. Seems like its better to be defensive, sense this is not documented in the specification on the proper way.

Revision history for this message
Freddy (fwieffering) wrote :

@blake-rouse @newell-jensen I've added a retry with a trailing slash if the request fails with a 404 and the uri does not include a trailing slash

Revision history for this message
MAAS Lander (maas-lander) wrote :

UNIT TESTS
-b tgt-redfish-bugfix lp:~fwieffering/maas/+git/maas into -b master lp:~maas-committers/maas

STATUS: SUCCESS
COMMIT: 1c395302c2e11011fd5f66702a7d122a3e2a6e5c

review: Approve
Revision history for this message
Freddy (fwieffering) wrote :

Just bumping this in case it fell off the radar

Revision history for this message
Blake Rouse (blake-rouse) wrote :

Looks good, thanks for the fixes and the tests!

review: Approve
Revision history for this message
MAAS Lander (maas-lander) wrote :
Revision history for this message
Freddy (fwieffering) wrote :

Is there anything more I need to do here?

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
diff --git a/src/maasserver/api/nodes.py b/src/maasserver/api/nodes.py
index 878c418..f228e1a 100644
--- a/src/maasserver/api/nodes.py
+++ b/src/maasserver/api/nodes.py
@@ -930,7 +930,7 @@ class PowerMixin:
930 system_id=system_id, user=request.user,930 system_id=system_id, user=request.user,
931 perm=NodePermission.view)931 perm=NodePermission.view)
932 return {932 return {
933 "state": node.power_query().wait(45),933 "state": node.power_query().wait(60),
934 }934 }
935935
936 @operation(idempotent=False)936 @operation(idempotent=False)
diff --git a/src/maasserver/clusterrpc/power.py b/src/maasserver/clusterrpc/power.py
index c117386..7eafab6 100644
--- a/src/maasserver/clusterrpc/power.py
+++ b/src/maasserver/clusterrpc/power.py
@@ -215,7 +215,7 @@ def pick_best_power_state(power_states):
215215
216216
217@asynchronous(timeout=FOREVER)217@asynchronous(timeout=FOREVER)
218def power_query_all(system_id, hostname, power_info, timeout=30):218def power_query_all(system_id, hostname, power_info, timeout=60):
219 """Query every connected rack controller and get the power status from all219 """Query every connected rack controller and get the power status from all
220 rack controllers.220 rack controllers.
221221
diff --git a/src/provisioningserver/drivers/power/redfish.py b/src/provisioningserver/drivers/power/redfish.py
index ef37367..1bd31b2 100644
--- a/src/provisioningserver/drivers/power/redfish.py
+++ b/src/provisioningserver/drivers/power/redfish.py
@@ -34,14 +34,15 @@ from twisted.web.client import (
34 FileBodyProducer,34 FileBodyProducer,
35 PartialDownloadError,35 PartialDownloadError,
36 readBody,36 readBody,
37 RedirectAgent,
37)38)
38from twisted.web.http_headers import Headers39from twisted.web.http_headers import Headers
3940
4041# no trailing slashes
41REDFISH_POWER_CONTROL_ENDPOINT = (42REDFISH_POWER_CONTROL_ENDPOINT = (
42 b"redfish/v1/Systems/%s/Actions/ComputerSystem.Reset/")43 b"redfish/v1/Systems/%s/Actions/ComputerSystem.Reset")
4344
44REDFISH_SYSTEMS_ENDPOINT = b"redfish/v1/Systems/"45REDFISH_SYSTEMS_ENDPOINT = b"redfish/v1/Systems"
4546
4647
47class RedfishPowerDriverBase(PowerDriver):48class RedfishPowerDriverBase(PowerDriver):
@@ -61,6 +62,7 @@ class RedfishPowerDriverBase(PowerDriver):
61 return Headers(62 return Headers(
62 {63 {
63 b"User-Agent": [b"MAAS"],64 b"User-Agent": [b"MAAS"],
65 b"Accept": [b"application/json"],
64 b"Authorization": [b"Basic " + authorization],66 b"Authorization": [b"Basic " + authorization],
65 b"Content-Type": [b"application/json; charset=utf-8"],67 b"Content-Type": [b"application/json; charset=utf-8"],
66 }68 }
@@ -69,7 +71,8 @@ class RedfishPowerDriverBase(PowerDriver):
69 @asynchronous71 @asynchronous
70 def redfish_request(self, method, uri, headers=None, bodyProducer=None):72 def redfish_request(self, method, uri, headers=None, bodyProducer=None):
71 """Send the redfish request and return the response."""73 """Send the redfish request and return the response."""
72 agent = Agent(reactor, contextFactory=WebClientContextFactory())74 agent = RedirectAgent(
75 Agent(reactor, contextFactory=WebClientContextFactory()))
73 d = agent.request(76 d = agent.request(
74 method, uri, headers=headers, bodyProducer=bodyProducer)77 method, uri, headers=headers, bodyProducer=bodyProducer)
7578
@@ -90,16 +93,33 @@ class RedfishPowerDriverBase(PowerDriver):
90 data = data.decode('utf-8')93 data = data.decode('utf-8')
91 # Only decode non-empty response bodies.94 # Only decode non-empty response bodies.
92 if data:95 if data:
93 return json.loads(data)96 # occasionally invalid json is returned. provide a clear
97 # error in that case
98 try:
99 return json.loads(data)
100 except ValueError as error:
101 raise PowerActionError(
102 "Redfish request failed from a JSON parse error:"
103 " %s." % error)
94104
95 def cb_attach_headers(data, headers):105 def cb_attach_headers(data, headers):
96 return data, headers106 return data, headers
97107
98 # Error out if the response has a status code of 400 or above.108 # Error out if the response has a status code of 400 or above.
99 if response.code >= int(HTTPStatus.BAD_REQUEST):109 if response.code >= int(HTTPStatus.BAD_REQUEST):
100 raise PowerActionError(110 # if there was no trailing slash, retry with a trailing slash
101 "Redfish request failed with response status code:"111 # because of varying requirements of BMC manufacturers
102 " %s." % response.code)112 if (response.code == HTTPStatus.NOT_FOUND and
113 uri.decode('utf-8')[-1] != "/"):
114 d = agent.request(
115 method,
116 uri + "/".encode('utf-8'),
117 headers=headers,
118 bodyProducer=bodyProducer)
119 else:
120 raise PowerActionError(
121 "Redfish request failed with response status code:"
122 " %s." % response.code)
103123
104 d = readBody(response)124 d = readBody(response)
105 d.addErrback(eb_catch_partial)125 d.addErrback(eb_catch_partial)
@@ -165,7 +185,7 @@ class RedfishPowerDriver(RedfishPowerDriverBase):
165 @inlineCallbacks185 @inlineCallbacks
166 def set_pxe_boot(self, url, node_id, headers):186 def set_pxe_boot(self, url, node_id, headers):
167 """Set the machine with node_id to PXE boot."""187 """Set the machine with node_id to PXE boot."""
168 endpoint = REDFISH_SYSTEMS_ENDPOINT + b'%s/' % node_id188 endpoint = join(REDFISH_SYSTEMS_ENDPOINT, b'%s' % node_id)
169 payload = FileBodyProducer(189 payload = FileBodyProducer(
170 BytesIO(190 BytesIO(
171 json.dumps(191 json.dumps(
@@ -213,14 +233,16 @@ class RedfishPowerDriver(RedfishPowerDriverBase):
213 url, node_id, headers = yield self.process_redfish_context(context)233 url, node_id, headers = yield self.process_redfish_context(context)
214 # Set to PXE boot.234 # Set to PXE boot.
215 yield self.set_pxe_boot(url, node_id, headers)235 yield self.set_pxe_boot(url, node_id, headers)
216 # Power off the machine.236 # Power off the machine if it is not already off
217 yield self.power("ForceOff", url, node_id, headers)237 power_state = yield self.power_query(node_id, context)
238 if power_state != 'off':
239 yield self.power("ForceOff", url, node_id, headers)
218240
219 @asynchronous241 @asynchronous
220 @inlineCallbacks242 @inlineCallbacks
221 def power_query(self, node_id, context):243 def power_query(self, node_id, context):
222 """Power query machine."""244 """Power query machine."""
223 url, node_id, headers = yield self.process_redfish_context(context)245 url, node_id, headers = yield self.process_redfish_context(context)
224 uri = join(url, REDFISH_SYSTEMS_ENDPOINT + b'%s/' % node_id)246 uri = join(url, REDFISH_SYSTEMS_ENDPOINT, b'%s' % node_id)
225 node_data, _ = yield self.redfish_request(b"GET", uri, headers)247 node_data, _ = yield self.redfish_request(b"GET", uri, headers)
226 return node_data.get('PowerState').lower()248 return node_data.get('PowerState').lower()
diff --git a/src/provisioningserver/drivers/power/tests/test_redfish.py b/src/provisioningserver/drivers/power/tests/test_redfish.py
index ca2a771..a937190 100644
--- a/src/provisioningserver/drivers/power/tests/test_redfish.py
+++ b/src/provisioningserver/drivers/power/tests/test_redfish.py
@@ -224,6 +224,7 @@ class TestRedfishPowerDriver(MAASTestCase):
224 authorization = b64encode(creds.encode('utf-8'))224 authorization = b64encode(creds.encode('utf-8'))
225 attributes = {225 attributes = {
226 b"User-Agent": [b"MAAS"],226 b"User-Agent": [b"MAAS"],
227 b"Accept": [b"application/json"],
227 b"Authorization": [b"Basic " + authorization],228 b"Authorization": [b"Basic " + authorization],
228 b"Content-Type": [b"application/json; charset=utf-8"],229 b"Content-Type": [b"application/json; charset=utf-8"],
229 }230 }
@@ -255,6 +256,60 @@ class TestRedfishPowerDriver(MAASTestCase):
255 self.assertEquals(expected_headers.headers, headers)256 self.assertEquals(expected_headers.headers, headers)
256257
257 @inlineCallbacks258 @inlineCallbacks
259 def test_wrap_redfish_request_retries_404s_trailing_slash(self):
260 driver = RedfishPowerDriver()
261 context = make_context()
262 url = driver.get_url(context)
263 uri = join(url, b"redfish/v1/Systems")
264 headers = driver.make_auth_headers(**context)
265 mock_agent = self.patch(redfish_module, 'Agent')
266 mock_agent.return_value.request = Mock()
267 expected_headers = Mock()
268 expected_headers.code = HTTPStatus.NOT_FOUND
269 expected_headers.headers = "Testing Headers"
270 happy_headers = Mock()
271 happy_headers.code = HTTPStatus.OK
272 happy_headers.headers = "Testing Headers"
273 mock_agent.return_value.request.side_effect = [
274 succeed(expected_headers),
275 succeed(happy_headers)
276 ]
277 mock_readBody = self.patch(redfish_module, 'readBody')
278 mock_readBody.return_value = succeed(
279 json.dumps(SAMPLE_JSON_SYSTEMS).encode('utf-8'))
280 expected_response = SAMPLE_JSON_SYSTEMS
281
282 response, return_headers = yield driver.redfish_request(
283 b"GET", uri, headers)
284 self.assertThat(
285 mock_agent.return_value.request,
286 MockCallsMatch(
287 call(b"GET", uri, headers, None),
288 call(b"GET", uri + "/".encode('utf-8'), headers, None)))
289 self.assertEquals(expected_response, response)
290 self.assertEquals(expected_headers.headers, return_headers)
291
292 @inlineCallbacks
293 def test_redfish_request_raises_invalid_json_error(self):
294 driver = RedfishPowerDriver()
295 context = make_context()
296 url = driver.get_url(context)
297 uri = join(url, b"redfish/v1/Systems")
298 headers = driver.make_auth_headers(**context)
299 mock_agent = self.patch(redfish_module, 'Agent')
300 mock_agent.return_value.request = Mock()
301 expected_headers = Mock()
302 expected_headers.code = HTTPStatus.OK
303 expected_headers.headers = "Testing Headers"
304 mock_agent.return_value.request.return_value = succeed(
305 expected_headers)
306 mock_readBody = self.patch(redfish_module, 'readBody')
307 mock_readBody.return_value = succeed(
308 '{"invalid": "json"'.encode('utf-8'))
309 with ExpectedException(PowerActionError):
310 yield driver.redfish_request(b"GET", uri, headers)
311
312 @inlineCallbacks
258 def test_redfish_request_continues_partial_download_error(self):313 def test_redfish_request_continues_partial_download_error(self):
259 driver = RedfishPowerDriver()314 driver = RedfishPowerDriver()
260 context = make_context()315 context = make_context()
@@ -371,7 +426,7 @@ class TestRedfishPowerDriver(MAASTestCase):
371426
372 yield driver.set_pxe_boot(url, node_id, headers)427 yield driver.set_pxe_boot(url, node_id, headers)
373 self.assertThat(mock_redfish_request, MockCalledOnceWith(428 self.assertThat(mock_redfish_request, MockCalledOnceWith(
374 b"PATCH", join(url, b"redfish/v1/Systems/%s/" % node_id),429 b"PATCH", join(url, b"redfish/v1/Systems/%s" % node_id),
375 headers, payload))430 headers, payload))
376431
377 @inlineCallbacks432 @inlineCallbacks
@@ -409,6 +464,8 @@ class TestRedfishPowerDriver(MAASTestCase):
409 mock_redfish_request.return_value = (464 mock_redfish_request.return_value = (
410 SAMPLE_JSON_SYSTEMS, None)465 SAMPLE_JSON_SYSTEMS, None)
411 mock_set_pxe_boot = self.patch(driver, 'set_pxe_boot')466 mock_set_pxe_boot = self.patch(driver, 'set_pxe_boot')
467 mock_power_query = self.patch(driver, 'power_query')
468 mock_power_query.return_value = "on"
412 mock_power = self.patch(driver, 'power')469 mock_power = self.patch(driver, 'power')
413470
414 yield driver.power_off(node_id, context)471 yield driver.power_off(node_id, context)
@@ -418,6 +475,26 @@ class TestRedfishPowerDriver(MAASTestCase):
418 "ForceOff", url, node_id, headers))475 "ForceOff", url, node_id, headers))
419476
420 @inlineCallbacks477 @inlineCallbacks
478 def test__power_off_already_off(self):
479 driver = RedfishPowerDriver()
480 context = make_context()
481 url = driver.get_url(context)
482 headers = driver.make_auth_headers(**context)
483 node_id = b'1'
484 mock_redfish_request = self.patch(driver, 'redfish_request')
485 mock_redfish_request.return_value = (
486 SAMPLE_JSON_SYSTEMS, None)
487 mock_set_pxe_boot = self.patch(driver, 'set_pxe_boot')
488 mock_power_query = self.patch(driver, 'power_query')
489 mock_power_query.return_value = "off"
490 mock_power = self.patch(driver, 'power')
491
492 yield driver.power_off(node_id, context)
493 self.assertThat(mock_set_pxe_boot, MockCalledOnceWith(
494 url, node_id, headers))
495 self.assertThat(mock_power, MockNotCalled())
496
497 @inlineCallbacks
421 def test_power_query_queries_on(self):498 def test_power_query_queries_on(self):
422 driver = RedfishPowerDriver()499 driver = RedfishPowerDriver()
423 power_change = "On"500 power_change = "On"

Subscribers

People subscribed via source and target branches