Merge ~fwieffering/maas:tgt-redfish-bugfix into maas:master

Proposed by Freddy
Status: Merged
Approved by: Blake Rouse
Approved revision: 1c395302c2e11011fd5f66702a7d122a3e2a6e5c
Merge reported by: MAAS Lander
Merged at revision: not available
Proposed branch: ~fwieffering/maas:tgt-redfish-bugfix
Merge into: maas:master
Diff against target: 252 lines (+114/-15)
4 files modified
src/maasserver/api/nodes.py (+1/-1)
src/maasserver/clusterrpc/power.py (+1/-1)
src/provisioningserver/drivers/power/redfish.py (+34/-12)
src/provisioningserver/drivers/power/tests/test_redfish.py (+78/-1)
Reviewer Review Type Date Requested Status
Blake Rouse (community) Approve
MAAS Lander Approve
Review via email: mp+373131@code.launchpad.net

Commit message

Fixes LP: #1845032 - Fixes issues that causes MAAS Redfish power driver to not work on some BMC's.

* removed trailing slashes from redfish power driver. This caused errors for some manufacturers implementations that we use
* catch json parsing errors and return a sensible error message if it fails to parse
* increase timeout on power query calls
* check state in redfish power driver power_off prior to submitting power_off request. Some redfish implementations we encountered errors if the state was already off and a power off request was submitted

Description of the change

We have switched to the redfish power driver for MaaS and encountered a few bugs that we needed to fix.

- removed trailing slashes from redfish power driver. This caused errors for some manufacturers implementations that we use
- catch json parsing errors and return a sensible error message if it fails to parse
- increase timeout on power query calls
- check state in redfish power driver power_off prior to submitting power_off request. Some redfish implementations we encountered errors if the state was already off and a power off request was submitted

Will comment with branch once bug is created

To post a comment you must log in.
Revision history for this message
MAAS Lander (maas-lander) wrote :

UNIT TESTS
-b tgt-redfish-bugfix lp:~fwieffering/maas/+git/maas into -b master lp:~maas-committers/maas

STATUS: FAILED
LOG: http://maas-ci-jenkins.internal:8080/job/maas/job/branch-tester/6476/console
COMMIT: 18bdccb5235e69f27e6a387aa010f489c469dac0

review: Needs Fixing
Revision history for this message
Blake Rouse (blake-rouse) wrote :

I like the fixes, overall looks acceptable. But tests need to be updated and more need to be added to cover the changes.

review: Needs Fixing
Revision history for this message
Freddy (fwieffering) wrote :

I will take care of that.

Revision history for this message
MAAS Lander (maas-lander) wrote :

UNIT TESTS
-b tgt-redfish-bugfix lp:~fwieffering/maas/+git/maas into -b master lp:~maas-committers/maas

STATUS: FAILED
LOG: http://maas-ci-jenkins.internal:8080/job/maas/job/branch-tester/6478/console
COMMIT: ecdda8fc9aa154861e2b056c683170d4e566efba

review: Needs Fixing
Revision history for this message
Freddy (fwieffering) wrote :

The tests should be all set. I don't know if this jenkins job is going to run again or not, but I have them passing.

Revision history for this message
Newell Jensen (newell-jensen) wrote :

Some manufacturers require the trailing slash. We should probably tailor the driver to attempt both cases.

Revision history for this message
Blake Rouse (blake-rouse) wrote :

Code change looks good and the tests look good. Only issue I see is that you have not signed the Canonical contributors agreement. Already having a launchpad account makes that really easy.

https://ubuntu.com/legal/contributors/agreement

Revision history for this message
MAAS Lander (maas-lander) wrote :

UNIT TESTS
-b tgt-redfish-bugfix lp:~fwieffering/maas/+git/maas into -b master lp:~maas-committers/maas

STATUS: FAILED
LOG: http://maas-ci-jenkins.internal:8080/job/maas/job/branch-tester/6485/console
COMMIT: d0470f99ac1de4f6cac3fcc814b9751982eabc88

review: Needs Fixing
Revision history for this message
Freddy (fwieffering) wrote :

I have signed the canonical contributors agreement

Revision history for this message
MAAS Lander (maas-lander) wrote :

UNIT TESTS
-b tgt-redfish-bugfix lp:~fwieffering/maas/+git/maas into -b master lp:~maas-committers/maas

STATUS: SUCCESS
COMMIT: ae549d3b087f66e0ccfff7ae2b72334cfe99f517

review: Approve
Revision history for this message
Blake Rouse (blake-rouse) wrote :

Seems that @newell-jensen, our resident power driver expert and original author of the Redfish driver says that some BMC's require the '/' at the end. We need to confirm this is the case before landing this.

review: Needs Information
Revision history for this message
Freddy (fwieffering) wrote :

I can add a retry to add a trailing slash if the power driver call fails for any reason. Would you like that approach or is this something that can be investigated?

Revision history for this message
Blake Rouse (blake-rouse) wrote :

I would hope in the case that a ending slash is required the BMC would return a 301 redirect to the URL with a slash append, we should ensure that we handle a 301 or 302 redirect.

If it returns a 404 we could retry with a pending slash, just to be sure. Seems like its better to be defensive, sense this is not documented in the specification on the proper way.

Revision history for this message
Freddy (fwieffering) wrote :

@blake-rouse @newell-jensen I've added a retry with a trailing slash if the request fails with a 404 and the uri does not include a trailing slash

Revision history for this message
MAAS Lander (maas-lander) wrote :

UNIT TESTS
-b tgt-redfish-bugfix lp:~fwieffering/maas/+git/maas into -b master lp:~maas-committers/maas

STATUS: SUCCESS
COMMIT: 1c395302c2e11011fd5f66702a7d122a3e2a6e5c

review: Approve
Revision history for this message
Freddy (fwieffering) wrote :

Just bumping this in case it fell off the radar

Revision history for this message
Blake Rouse (blake-rouse) wrote :

Looks good, thanks for the fixes and the tests!

review: Approve
Revision history for this message
MAAS Lander (maas-lander) wrote :
Revision history for this message
Freddy (fwieffering) wrote :

Is there anything more I need to do here?

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1diff --git a/src/maasserver/api/nodes.py b/src/maasserver/api/nodes.py
2index 878c418..f228e1a 100644
3--- a/src/maasserver/api/nodes.py
4+++ b/src/maasserver/api/nodes.py
5@@ -930,7 +930,7 @@ class PowerMixin:
6 system_id=system_id, user=request.user,
7 perm=NodePermission.view)
8 return {
9- "state": node.power_query().wait(45),
10+ "state": node.power_query().wait(60),
11 }
12
13 @operation(idempotent=False)
14diff --git a/src/maasserver/clusterrpc/power.py b/src/maasserver/clusterrpc/power.py
15index c117386..7eafab6 100644
16--- a/src/maasserver/clusterrpc/power.py
17+++ b/src/maasserver/clusterrpc/power.py
18@@ -215,7 +215,7 @@ def pick_best_power_state(power_states):
19
20
21 @asynchronous(timeout=FOREVER)
22-def power_query_all(system_id, hostname, power_info, timeout=30):
23+def power_query_all(system_id, hostname, power_info, timeout=60):
24 """Query every connected rack controller and get the power status from all
25 rack controllers.
26
27diff --git a/src/provisioningserver/drivers/power/redfish.py b/src/provisioningserver/drivers/power/redfish.py
28index ef37367..1bd31b2 100644
29--- a/src/provisioningserver/drivers/power/redfish.py
30+++ b/src/provisioningserver/drivers/power/redfish.py
31@@ -34,14 +34,15 @@ from twisted.web.client import (
32 FileBodyProducer,
33 PartialDownloadError,
34 readBody,
35+ RedirectAgent,
36 )
37 from twisted.web.http_headers import Headers
38
39-
40+# no trailing slashes
41 REDFISH_POWER_CONTROL_ENDPOINT = (
42- b"redfish/v1/Systems/%s/Actions/ComputerSystem.Reset/")
43+ b"redfish/v1/Systems/%s/Actions/ComputerSystem.Reset")
44
45-REDFISH_SYSTEMS_ENDPOINT = b"redfish/v1/Systems/"
46+REDFISH_SYSTEMS_ENDPOINT = b"redfish/v1/Systems"
47
48
49 class RedfishPowerDriverBase(PowerDriver):
50@@ -61,6 +62,7 @@ class RedfishPowerDriverBase(PowerDriver):
51 return Headers(
52 {
53 b"User-Agent": [b"MAAS"],
54+ b"Accept": [b"application/json"],
55 b"Authorization": [b"Basic " + authorization],
56 b"Content-Type": [b"application/json; charset=utf-8"],
57 }
58@@ -69,7 +71,8 @@ class RedfishPowerDriverBase(PowerDriver):
59 @asynchronous
60 def redfish_request(self, method, uri, headers=None, bodyProducer=None):
61 """Send the redfish request and return the response."""
62- agent = Agent(reactor, contextFactory=WebClientContextFactory())
63+ agent = RedirectAgent(
64+ Agent(reactor, contextFactory=WebClientContextFactory()))
65 d = agent.request(
66 method, uri, headers=headers, bodyProducer=bodyProducer)
67
68@@ -90,16 +93,33 @@ class RedfishPowerDriverBase(PowerDriver):
69 data = data.decode('utf-8')
70 # Only decode non-empty response bodies.
71 if data:
72- return json.loads(data)
73+ # occasionally invalid json is returned. provide a clear
74+ # error in that case
75+ try:
76+ return json.loads(data)
77+ except ValueError as error:
78+ raise PowerActionError(
79+ "Redfish request failed from a JSON parse error:"
80+ " %s." % error)
81
82 def cb_attach_headers(data, headers):
83 return data, headers
84
85 # Error out if the response has a status code of 400 or above.
86 if response.code >= int(HTTPStatus.BAD_REQUEST):
87- raise PowerActionError(
88- "Redfish request failed with response status code:"
89- " %s." % response.code)
90+ # if there was no trailing slash, retry with a trailing slash
91+ # because of varying requirements of BMC manufacturers
92+ if (response.code == HTTPStatus.NOT_FOUND and
93+ uri.decode('utf-8')[-1] != "/"):
94+ d = agent.request(
95+ method,
96+ uri + "/".encode('utf-8'),
97+ headers=headers,
98+ bodyProducer=bodyProducer)
99+ else:
100+ raise PowerActionError(
101+ "Redfish request failed with response status code:"
102+ " %s." % response.code)
103
104 d = readBody(response)
105 d.addErrback(eb_catch_partial)
106@@ -165,7 +185,7 @@ class RedfishPowerDriver(RedfishPowerDriverBase):
107 @inlineCallbacks
108 def set_pxe_boot(self, url, node_id, headers):
109 """Set the machine with node_id to PXE boot."""
110- endpoint = REDFISH_SYSTEMS_ENDPOINT + b'%s/' % node_id
111+ endpoint = join(REDFISH_SYSTEMS_ENDPOINT, b'%s' % node_id)
112 payload = FileBodyProducer(
113 BytesIO(
114 json.dumps(
115@@ -213,14 +233,16 @@ class RedfishPowerDriver(RedfishPowerDriverBase):
116 url, node_id, headers = yield self.process_redfish_context(context)
117 # Set to PXE boot.
118 yield self.set_pxe_boot(url, node_id, headers)
119- # Power off the machine.
120- yield self.power("ForceOff", url, node_id, headers)
121+ # Power off the machine if it is not already off
122+ power_state = yield self.power_query(node_id, context)
123+ if power_state != 'off':
124+ yield self.power("ForceOff", url, node_id, headers)
125
126 @asynchronous
127 @inlineCallbacks
128 def power_query(self, node_id, context):
129 """Power query machine."""
130 url, node_id, headers = yield self.process_redfish_context(context)
131- uri = join(url, REDFISH_SYSTEMS_ENDPOINT + b'%s/' % node_id)
132+ uri = join(url, REDFISH_SYSTEMS_ENDPOINT, b'%s' % node_id)
133 node_data, _ = yield self.redfish_request(b"GET", uri, headers)
134 return node_data.get('PowerState').lower()
135diff --git a/src/provisioningserver/drivers/power/tests/test_redfish.py b/src/provisioningserver/drivers/power/tests/test_redfish.py
136index ca2a771..a937190 100644
137--- a/src/provisioningserver/drivers/power/tests/test_redfish.py
138+++ b/src/provisioningserver/drivers/power/tests/test_redfish.py
139@@ -224,6 +224,7 @@ class TestRedfishPowerDriver(MAASTestCase):
140 authorization = b64encode(creds.encode('utf-8'))
141 attributes = {
142 b"User-Agent": [b"MAAS"],
143+ b"Accept": [b"application/json"],
144 b"Authorization": [b"Basic " + authorization],
145 b"Content-Type": [b"application/json; charset=utf-8"],
146 }
147@@ -255,6 +256,60 @@ class TestRedfishPowerDriver(MAASTestCase):
148 self.assertEquals(expected_headers.headers, headers)
149
150 @inlineCallbacks
151+ def test_wrap_redfish_request_retries_404s_trailing_slash(self):
152+ driver = RedfishPowerDriver()
153+ context = make_context()
154+ url = driver.get_url(context)
155+ uri = join(url, b"redfish/v1/Systems")
156+ headers = driver.make_auth_headers(**context)
157+ mock_agent = self.patch(redfish_module, 'Agent')
158+ mock_agent.return_value.request = Mock()
159+ expected_headers = Mock()
160+ expected_headers.code = HTTPStatus.NOT_FOUND
161+ expected_headers.headers = "Testing Headers"
162+ happy_headers = Mock()
163+ happy_headers.code = HTTPStatus.OK
164+ happy_headers.headers = "Testing Headers"
165+ mock_agent.return_value.request.side_effect = [
166+ succeed(expected_headers),
167+ succeed(happy_headers)
168+ ]
169+ mock_readBody = self.patch(redfish_module, 'readBody')
170+ mock_readBody.return_value = succeed(
171+ json.dumps(SAMPLE_JSON_SYSTEMS).encode('utf-8'))
172+ expected_response = SAMPLE_JSON_SYSTEMS
173+
174+ response, return_headers = yield driver.redfish_request(
175+ b"GET", uri, headers)
176+ self.assertThat(
177+ mock_agent.return_value.request,
178+ MockCallsMatch(
179+ call(b"GET", uri, headers, None),
180+ call(b"GET", uri + "/".encode('utf-8'), headers, None)))
181+ self.assertEquals(expected_response, response)
182+ self.assertEquals(expected_headers.headers, return_headers)
183+
184+ @inlineCallbacks
185+ def test_redfish_request_raises_invalid_json_error(self):
186+ driver = RedfishPowerDriver()
187+ context = make_context()
188+ url = driver.get_url(context)
189+ uri = join(url, b"redfish/v1/Systems")
190+ headers = driver.make_auth_headers(**context)
191+ mock_agent = self.patch(redfish_module, 'Agent')
192+ mock_agent.return_value.request = Mock()
193+ expected_headers = Mock()
194+ expected_headers.code = HTTPStatus.OK
195+ expected_headers.headers = "Testing Headers"
196+ mock_agent.return_value.request.return_value = succeed(
197+ expected_headers)
198+ mock_readBody = self.patch(redfish_module, 'readBody')
199+ mock_readBody.return_value = succeed(
200+ '{"invalid": "json"'.encode('utf-8'))
201+ with ExpectedException(PowerActionError):
202+ yield driver.redfish_request(b"GET", uri, headers)
203+
204+ @inlineCallbacks
205 def test_redfish_request_continues_partial_download_error(self):
206 driver = RedfishPowerDriver()
207 context = make_context()
208@@ -371,7 +426,7 @@ class TestRedfishPowerDriver(MAASTestCase):
209
210 yield driver.set_pxe_boot(url, node_id, headers)
211 self.assertThat(mock_redfish_request, MockCalledOnceWith(
212- b"PATCH", join(url, b"redfish/v1/Systems/%s/" % node_id),
213+ b"PATCH", join(url, b"redfish/v1/Systems/%s" % node_id),
214 headers, payload))
215
216 @inlineCallbacks
217@@ -409,6 +464,8 @@ class TestRedfishPowerDriver(MAASTestCase):
218 mock_redfish_request.return_value = (
219 SAMPLE_JSON_SYSTEMS, None)
220 mock_set_pxe_boot = self.patch(driver, 'set_pxe_boot')
221+ mock_power_query = self.patch(driver, 'power_query')
222+ mock_power_query.return_value = "on"
223 mock_power = self.patch(driver, 'power')
224
225 yield driver.power_off(node_id, context)
226@@ -418,6 +475,26 @@ class TestRedfishPowerDriver(MAASTestCase):
227 "ForceOff", url, node_id, headers))
228
229 @inlineCallbacks
230+ def test__power_off_already_off(self):
231+ driver = RedfishPowerDriver()
232+ context = make_context()
233+ url = driver.get_url(context)
234+ headers = driver.make_auth_headers(**context)
235+ node_id = b'1'
236+ mock_redfish_request = self.patch(driver, 'redfish_request')
237+ mock_redfish_request.return_value = (
238+ SAMPLE_JSON_SYSTEMS, None)
239+ mock_set_pxe_boot = self.patch(driver, 'set_pxe_boot')
240+ mock_power_query = self.patch(driver, 'power_query')
241+ mock_power_query.return_value = "off"
242+ mock_power = self.patch(driver, 'power')
243+
244+ yield driver.power_off(node_id, context)
245+ self.assertThat(mock_set_pxe_boot, MockCalledOnceWith(
246+ url, node_id, headers))
247+ self.assertThat(mock_power, MockNotCalled())
248+
249+ @inlineCallbacks
250 def test_power_query_queries_on(self):
251 driver = RedfishPowerDriver()
252 power_change = "On"

Subscribers

People subscribed via source and target branches