Merge lp:~newell-jensen/maas/amt_wsman into lp:~maas-committers/maas/trunk
- amt_wsman
- Merge into trunk
Status: | Merged | ||||
---|---|---|---|---|---|
Approved by: | Blake Rouse | ||||
Approved revision: | no longer in the source branch. | ||||
Merged at revision: | 3567 | ||||
Proposed branch: | lp:~newell-jensen/maas/amt_wsman | ||||
Merge into: | lp:~maas-committers/maas/trunk | ||||
Diff against target: |
278 lines (+180/-19) 1 file modified
etc/maas/templates/power/amt.template (+180/-19) |
||||
To merge this branch: | bzr merge lp:~newell-jensen/maas/amt_wsman | ||||
Related bugs: |
|
Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
Blake Rouse (community) | Approve | ||
Andres Rodriguez (community) | Abstain | ||
Jeff Lane (community) | Needs Resubmitting | ||
Christian Reis (community) | Needs Fixing | ||
Review via email: mp+248174@code.launchpad.net |
Commit message
AMT power template changes to use `wsman` in place of `amttool` for any AMT versions > 8.
Description of the change
Christian Reis (kiko) wrote : | # |
Christian Reis (kiko) : | # |
Christian Reis (kiko) wrote : | # |
See http://
Christian Reis (kiko) wrote : | # |
Do we need to provide special schemas for the different machine types we are to control?
Christian Reis (kiko) wrote : | # |
Do we need a packaging update for wsmancli?
Christian Reis (kiko) wrote : | # |
Here's the output for the identify on AMT 9 systems: http://
Christian Reis (kiko) : | # |
Jeff Lane (bladernr) wrote : | # |
> Comments below. Note that there is one thing I'm not sure of -- is the AMT
> port always 16992, even on these new AMT 9 machines?
For wsman connections to AMT, 16992 is the HTTP port. 16993 is available for HTTPS connections but requires valid certificates before you can make it work, AIUI.
Here's a listing of the available ports and their usage for AMT:
Newell Jensen (newell-jensen) wrote : | # |
This is ready to be reviewed.
Christian Reis (kiko) wrote : | # |
No, you haven't addressed any of my comments as far as I can see.
Christian Reis (kiko) wrote : | # |
And I have more comments.
Newell Jensen (newell-jensen) wrote : | # |
> Comments below. Note that there is one thing I'm not sure of -- is the AMT
> port always 16992, even on these new AMT 9 machines?
See Jeff's comment below:
https:/
Newell Jensen (newell-jensen) wrote : | # |
> Do we need a packaging update for wsmancli?
Since it worked and there isn't an update candidate for the package I dont' believe so:
$ apt-cache policy wsmancli
wsmancli:
Installed: 2.3.0-0ubuntu4
Candidate: 2.3.0-0ubuntu4
Version table:
*** 2.3.0-0ubuntu4 0
500 http://
100 /var/lib/
Christian Reis (kiko) wrote : | # |
On Wed, Feb 11, 2015 at 05:55:04PM -0000, Newell Jensen wrote:
> > Do we need a packaging update for wsmancli?
>
> Since it worked and there isn't an update candidate for the package I dont' believe so:
I meant do we need to add it to the recommends line?
Jeff Lane (bladernr) wrote : | # |
I've made several comments to the diff comments... some of which Newell may have also addressed in the time I've been working on cleaning the script up.
Jeff Lane (bladernr) wrote : | # |
> On Wed, Feb 11, 2015 at 05:55:04PM -0000, Newell Jensen wrote:
> > > Do we need a packaging update for wsmancli?
> >
> > Since it worked and there isn't an update candidate for the package I dont'
> believe so:
>
> I meant do we need to add it to the recommends line?
Good point, I say so since it's a universe package. I'll make that change to the tree as well.
Jeff Lane (bladernr) wrote : | # |
> Good point, I say so since it's a universe package. I'll make that change to
> the tree as well.
I take that back, I can't because the control file isn't included in this tree... so Yes, I would suggest that we include wsmancli as a recommends.
Jeff Lane (bladernr) wrote : | # |
> Do we need to provide special schemas for the different machine types we are
> to control?
AFAIK, we do not. I've used the same schemas/xml on Lenovo hardware that is used on the NUCs in the Orange Boxes and Newell has also tested them on a Dell. So three different makes using the same schemas, we should be safe.
Jeff Lane (bladernr) wrote : | # |
More replies, I think I've replied to everything now. Next step is I will upload my changes to lp and do a merge request from my branch into Newell's... then Newell can merge that into his, and update this Merge Proposal.
Christian Reis (kiko) : | # |
Christian Reis (kiko) : | # |
Christian Reis (kiko) wrote : | # |
I've consolidated my comments now.
Christian Reis (kiko) : | # |
Newell Jensen (newell-jensen) wrote : | # |
Latest changes from bladernr have been merged in.
Jeff Lane (bladernr) wrote : | # |
See inline... I KNOW now why the formatting looks wonky in that one place... here's a tab in there instead of all spaces. doesn't matter to bash, python would have told me immediately on running.
Newell Jensen (newell-jensen) wrote : | # |
Jeff,
Glad you found out the real reason for the comment I had. Since this hasn't caused in issue in executing the code and it is only showing up in the LP diff, I don't see the harm in it.
Jeff Lane (bladernr) wrote : | # |
Kiko, any chance you can take another look at the latest changes and comment replies?
Andres Rodriguez (andreserl) wrote : | # |
I just have a few comments:
1. I still believe that in the WebUI we should display the AMT Version and allow the user to chose what AMT version to use. This is actually a mechanism to make aware the user that there's 2 different AMT versions being supported.
2. See inline.
3. Consider making this a python power driver instead of continuing using shell.
Jeff Lane (bladernr) wrote : | # |
Andres,
First, I do agree but the I really need the template changes in maas packages as soon as possible, as this is blocking certification currently, and will definitely be blocking even more systems in March.
In any case, I DO agree with you, and I would like to propose this:
0: We KNOW this current implementation works on old and new AMT versions, I have tested it repeatedly with AMT9 and Mark Wenning has tested it at Dell with AMT4, 7, and 8 at the least. Because of that I would like to get this merged as soon as we can, and get the fix SRU'd into MAAS 1.7.1 so I can continue on with certification of AMT systems which currently are blocked by MAAS.
1: Regarding Comment #1 above, the nice thing about what it does now is that the user doesn't NEED to know or care what AMT version the system is using. That said, which versions should we care about? Realistically, it's a matter of "8 or older" vs "9 or newer". I have asked Mark to check the wsman commands manually on his array of older AMT hardware. If wsman-cli works on all AMT versions, I'd be happy with removing the amttool parts completely and just settling on wsman for all AMT systems. That much I think, assuming mark can test them quickly, I can get into this merge. It should also address your Commend #2 and negate the need for a UI change and making the user care about what version AMT they're running.
2: I agree with your inline comment (#2) about checking each time. See 1 Above which should address this one.
3: I talked with Andrew and converting all the power templates to Python (Comment #3) that need to be is something we'd like to see done, and I think the Enablement Team can likely help convert those.
SO I'll see if I can help Mark out with wsman testing on the older AMT versions and if that works, I'll make another update to this MP. And then we can consider the Python rewrites later.
Christian Reis (kiko) wrote : | # |
I don't think the UI needs to indicate what version of AMT for the moment; this template change should be transparent to the end user -- if they have wsman installed, and it's AMT9, it works. I had intended to keep amttool support in to minimize change, so I'm not sure why this being being rediscussed.
I don't think we should convert this template to Python in the same step as making this change.
Christian Reis (kiko) wrote : | # |
Read Andres' inline comment -- I see what he is saying. But is running `amttool info` that expensive, enough we should worry about it?
Jeff Lane (bladernr) wrote : | # |
> I don't think the UI needs to indicate what version of AMT for the moment;
> this template change should be transparent to the end user -- if they have
> wsman installed, and it's AMT9, it works. I had intended to keep amttool
> support in to minimize change, so I'm not sure why this being being
> rediscussed.
I agree, UI changes should come at a later date if necessary. As for amttool support, I understand what you're saying, my only point was that if wsman works for both older and new/current AMT versions, is there a benefit to keeping two tools around that do essentially the same job if one tool works for all?
In any case, I also think that is something to revisit later, not as part of this proposal.
> I don't think we should convert this template to Python in the same step as
> making this change.
Also agreed, I meant to imply that too would be revisited at a later date, that's up to You and Andres and Andy to mull over as we now have two teams responsible for MAAS upkeep, the MAAS team and Server Enablement.
Jeff Lane (bladernr) wrote : | # |
> Read Andres' inline comment -- I see what he is saying. But is running
> `amttool info` that expensive, enough we should worry about it?
I wonder about that too... it's not THAT expensive but I think there is the possibility that it could cause problems at scale, if we ever run into a scenario where someone actually outfitted a large DC with AMT based machines... I honestly don't expect we'll see many 10000 unit AMT datacenters, BUT stranger things have happened.
Having to check the version before choosing which tool does mean that every action requires two queries and two responses, so you're effectively doubling the traffic for each function (Power up, off, recycle and status)
So I asked Mark to check out wsman on older AMT versions for due diligence. I am certainly willing to address that later on as a different project as well, that would make perfect sense.
Or I guess maybe I'm throwing Newell under the bus there and implying HE will address it ;-)
Blake Rouse (blake-rouse) wrote : | # |
Overall I am happy with the code, I only have a few questions. If those can be answered then I will approve this branch.
Jeff Lane (bladernr) wrote : | # |
I've made a couple changes per Blake's comments. I'll upload them and get Newell to merge them here for a final review.
Blake Rouse (blake-rouse) wrote : | # |
Cool thanks for the information. Looks good to me.
MAAS Lander (maas-lander) wrote : | # |
The attempt to merge lp:~newell-jensen/maas/amt_wsman into lp:maas failed. Below is the output from the failed tests.
Ign http://
Get:1 http://
Get:2 http://
Ign http://
Ign http://
Hit http://
Get:3 http://
Get:4 http://
Hit http://
Get:5 http://
Get:6 http://
Hit http://
Hit http://
Hit http://
Hit http://
Hit http://
Hit http://
Get:7 http://
Get:8 http://
Get:9 http://
Hit http://
Hit http://
Get:10 http://
Get:11 http://
Get:12 http://
Hit http://
Hit http://
Ign http://
Ign http://
Fetched 1,509 kB in 2s (504 kB/s)
Reading package lists...
sudo DEBIAN_
--
Preview Diff
1 | === modified file 'etc/maas/templates/power/amt.template' |
2 | --- etc/maas/templates/power/amt.template 2015-01-29 16:03:31 +0000 |
3 | +++ etc/maas/templates/power/amt.template 2015-02-26 18:05:27 +0000 |
4 | @@ -38,17 +38,27 @@ |
5 | exit $1 |
6 | } |
7 | |
8 | +# XXX: bladernr 2015-02-04 bug=1331214: AMT ver > 8 needs wsman not amttool |
9 | +# Check the version of AMT on node. |
10 | +check_amt_version() { |
11 | + amt_version=$(issue_amt_command info 2>&1 | grep version | awk '{print $NF}') |
12 | + major_version=$(echo $amt_version | awk -F'.' '{print $1}') |
13 | + return $major_version |
14 | +} |
15 | + |
16 | |
17 | # Perform a command using amttool. |
18 | issue_amt_command() { |
19 | LC_ALL='C' AMT_PASSWORD="$power_pass" amttool "$ip_address" "$@" |
20 | } |
21 | |
22 | + |
23 | # Check that amttool is present. |
24 | check_amt_command() { |
25 | command -v amttool >/dev/null 2>&1 || { fail 2 "Missing amttool (amtterm package)"; } |
26 | } |
27 | |
28 | + |
29 | # Check that a node host has been passed. |
30 | check_bmc_host() { |
31 | if [ -z "$ip_address" ]; then |
32 | @@ -59,11 +69,11 @@ |
33 | fi |
34 | } |
35 | |
36 | -# Ask for node's power state: 'on' or 'off'. |
37 | -query_state() { |
38 | + |
39 | +# Ask for node's power state: 'on' or 'off', via amttool. |
40 | +amt_query_state() { |
41 | # Retry the state if it fails because it often fails the first time. |
42 | local state= |
43 | - local count= |
44 | for attempts in $(seq 0 10) |
45 | do |
46 | state=$(issue_amt_command info | grep '^Powerstate:' | awk '{print $2}') |
47 | @@ -91,21 +101,20 @@ |
48 | } |
49 | |
50 | |
51 | -# Power-cycle the machine, and boot it using the requested boot mode. |
52 | -restart() { |
53 | +# Power-cycle the machine via amttool, and boot it using the requested boot mode. |
54 | +amt_restart() { |
55 | yes | issue_amt_command powercycle "$amt_boot_mode" |
56 | } |
57 | |
58 | |
59 | -# Power the machine on, and boot it using PXE. |
60 | -power_on() { |
61 | - local attempts= |
62 | +# Power the machine on via amttool, and boot it using PXE. |
63 | +amt_power_on() { |
64 | # Try several times. Power commands often fail the first time. |
65 | for attempts in $(seq 0 10) |
66 | do |
67 | # Issue the AMT command; amttool will prompt for confirmation. |
68 | yes | issue_amt_command powerup "$amt_boot_mode" |
69 | - if [ "$(query_state)" = 'on' ] |
70 | + if [ "$(amt_query_state)" = 'on' ] |
71 | then |
72 | # Success. Machine is on. |
73 | return 0 |
74 | @@ -116,13 +125,12 @@ |
75 | } |
76 | |
77 | |
78 | -# Power the machine off. |
79 | -power_off() { |
80 | - local attempts= |
81 | +# Power the machine off amttool. |
82 | +amt_power_off() { |
83 | # Try several times. Power commands often fail the first time. |
84 | for attempts in $(seq 0 10) |
85 | do |
86 | - if [ "$(query_state)" = 'off' ] |
87 | + if [ "$(amt_query_state)" = 'off' ] |
88 | then |
89 | # Success. Machine is off. |
90 | return 0 |
91 | @@ -135,28 +143,181 @@ |
92 | } |
93 | |
94 | |
95 | +# Check that wsman is present. |
96 | +check_wsman_command() { |
97 | + command -v wsman >/dev/null 2>&1 || { fail 2 "Missing wsman (wsmancli package)"; } |
98 | +} |
99 | + |
100 | +# Issue the wsman command. |
101 | +issue_wsman_command() { |
102 | + POWER_TEMPLATE_HEAD=' |
103 | +<p:RequestPowerStateChange_INPUT xmlns:p="http://schemas.dmtf.org/wbem/wscim/1/cim-schema/2/CIM_PowerManagementService"> |
104 | + <p:PowerState>' |
105 | + POWER_TEMPLATE_TAIL='</p:PowerState> |
106 | + <p:ManagedElement xmlns:wsa="http://schemas.xmlsoap.org/ws/2004/08/addressing" |
107 | + xmlns:wsman="http://schemas.dmtf.org/wbem/wsman/1/wsman.xsd"> |
108 | + <wsa:Address>http://schemas.xmlsoap.org/ws/2004/08/addressing/role/anonymous</wsa:Address> |
109 | + <wsa:ReferenceParameters> |
110 | + <wsman:ResourceURI>http://schemas.dmtf.org/wbem/wscim/1/cim-schema/2/CIM_ComputerSystem</wsman:ResourceURI> |
111 | + <wsman:SelectorSet> |
112 | + <wsman:Selector Name="CreationClassName">CIM_ComputerSystem</wsman:Selector> |
113 | + <wsman:Selector Name="Name">ManagedSystem</wsman:Selector> |
114 | + </wsman:SelectorSet> |
115 | + </wsa:ReferenceParameters> |
116 | + </p:ManagedElement> |
117 | +</p:RequestPowerStateChange_INPUT> |
118 | +' |
119 | + # Usage: issue_wsman_command on|off|restart|query |
120 | + ACTION="invoke -a RequestPowerStateChange" |
121 | + SCHEMA_URI='http://schemas.dmtf.org/wbem/wscim/1/cim-schema/2/CIM_PowerManagementService?SystemCreationClassName="CIM_ComputerSystem",SystemName="Intel(r) AMT",CreationClassName="CIM_PowerManagementService",Name="Intel(r) AMT Power Management Service"' |
122 | + OPTS="--port 16992 --hostname $ip_address --username admin -p $power_pass -V -v" |
123 | + case $1 in |
124 | + on) |
125 | + template=${POWER_TEMPLATE_HEAD}2${POWER_TEMPLATE_TAIL} |
126 | + echo ${template} | wsman ${ACTION} "${SCHEMA_URI}" ${OPTS} -J - |
127 | + ;; |
128 | + off) |
129 | + template=${POWER_TEMPLATE_HEAD}8${POWER_TEMPLATE_TAIL} |
130 | + echo ${template} | wsman ${ACTION} "${SCHEMA_URI}" ${OPTS} -J - |
131 | + ;; |
132 | + restart) |
133 | + template=${POWER_TEMPLATE_HEAD}10${POWER_TEMPLATE_TAIL} |
134 | + echo ${template} | wsman ${ACTION} "${SCHEMA_URI}" ${OPTS} -J - |
135 | + ;; |
136 | + query) |
137 | + ACTION="enumerate" |
138 | + SCHEMA_URI='http://schemas.dmtf.org/wbem/wscim/1/cim-schema/2/CIM_AssociatedPowerManagementService' |
139 | + OPTS="${OPTS} -o -j utf-8" |
140 | + wsman ${ACTION} "${SCHEMA_URI}" ${OPTS} |
141 | + ;; |
142 | + *) |
143 | + fail 2 "Unknown command issued to wsman" |
144 | + ;; |
145 | + esac |
146 | +} |
147 | + |
148 | + |
149 | +# Ask for node's power state: 'on' or 'off', via wsman. |
150 | +wsman_query_state() { |
151 | + # Retry the state if it fails because it often fails the first time. |
152 | + local state= |
153 | + for attempts in $(seq 0 10) |
154 | + do |
155 | + state=$(issue_wsman_command query | grep "h:PowerState" | awk -F [\<\>] '{print $3}') |
156 | + if [ -n "$state" ] |
157 | + then |
158 | + break |
159 | + fi |
160 | + # Wait 1 second between retries. AMT controllers are generally very |
161 | + # light and may not be comfortable with more frequent queries. |
162 | + sleep 1 |
163 | + done |
164 | + case "$state" in |
165 | + # There are a LOT of possible power states |
166 | + # 1: Other 9: Power Cycle (Off - Hard) |
167 | + # 2: On 10: Master Bus Reset |
168 | + # 3: Sleep - Light 11: Diagnostic Interrupt (NMI) |
169 | + # 4: Sleep - Deep 12: Off - Soft Graceful |
170 | + # 5: Power Cycle (Off - Soft) 13: Off - Hard Graceful |
171 | + # 6: Off - Hard 14: Master Bus Reset Graceful |
172 | + # 7: Hibernate (Off - Soft) 15: Power Cycle (Off - Soft Graceful) |
173 | + # 8: Off - Soft 16: Power Cycle (Off - Hard Graceful |
174 | + # 17: Diagnostic Interrupt (INIT) |
175 | + |
176 | + # These are all power states that indicate that the system is either ON or |
177 | + # will resume function in an ON or Powered Up state (e.g. being power |
178 | + # cycled currently) |
179 | + [2,3,4,5,7,9,10,14,15,16]) |
180 | + echo 'on' |
181 | + ;; |
182 | + # These 4 states should leave system completely powered down |
183 | + [6,8,12,13]) |
184 | + echo 'off' |
185 | + ;; |
186 | + *) |
187 | + fail 1 "Got unknown power state from node: '$state'" |
188 | + ;; |
189 | + esac |
190 | +} |
191 | + |
192 | + |
193 | +# Power-cycle the machine via wsman, and boot it using the requested boot mode. |
194 | +wsman_restart() { |
195 | + issue_wsman_command restart |
196 | +} |
197 | + |
198 | + |
199 | +# Power the machine on via wsman, and boot it using PXE. |
200 | +wsman_power_on() { |
201 | + # Issue the wsman command to change power state |
202 | + issue_wsman_command on |
203 | + # Check power state several times. It usually takes a second or two to get |
204 | + # the correct state. |
205 | + for attempts in $(seq 0 10) |
206 | + do |
207 | + if [ "$(wsman_query_state)" = 'on' ] |
208 | + then |
209 | + # Success. Machine is on. |
210 | + return 0 |
211 | + fi |
212 | + sleep 1 |
213 | + done |
214 | + fail 1 "Machine is not powering on. Giving up." |
215 | +} |
216 | + |
217 | +# Power the machine off via wsman. |
218 | +# Currently commented out because we need a way to power query the machines |
219 | +wsman_power_off() { |
220 | + # Issue the wsman command to change power state |
221 | + issue_wsman_command off |
222 | + # Check power state several times. It usually takes a second or two to get |
223 | + # the correct state |
224 | + for attempts in $(seq 0 10) |
225 | + do |
226 | + if [ "$(wsman_query_state)" = 'off' ] |
227 | + then |
228 | + # Success. Machine is off. |
229 | + return 0 |
230 | + fi |
231 | + sleep 1 |
232 | + done |
233 | + fail 1 "Machine is not powering off. Giving up." |
234 | +} |
235 | + |
236 | main() { |
237 | check_bmc_host |
238 | check_amt_command |
239 | + # Check AMT version |
240 | + check_amt_version |
241 | + amt_version=$? |
242 | + if [ ${amt_version} -gt 8 ]; then |
243 | + # version > 8, use wsman |
244 | + check_wsman_command |
245 | + tool="wsman" |
246 | + else |
247 | + # Use amttool |
248 | + tool="amt" |
249 | + fi |
250 | + |
251 | case "${power_change}" in |
252 | 'on') |
253 | - if [ "$(query_state)" = 'on' ] |
254 | + if [ "${tool}_query_state" = 'on' ] |
255 | then |
256 | # The machine is already up, so powering it up will do nothing. |
257 | # Cycle the power instead. |
258 | - restart |
259 | + ${tool}_restart |
260 | else |
261 | - power_on |
262 | + ${tool}_power_on |
263 | fi |
264 | ;; |
265 | 'off') |
266 | - if [ "$(query_state)" != 'off' ] |
267 | + if [ "${tool}_query_state" != 'off' ] |
268 | then |
269 | - power_off |
270 | + ${tool}_power_off |
271 | fi |
272 | ;; |
273 | 'query') |
274 | - query_state |
275 | + ${tool}_query_state |
276 | ;; |
277 | *) |
278 | fail 1 "Unknown power command: '$1'" |
Comments below. Note that there is one thing I'm not sure of -- is the AMT port always 16992, even on these new AMT 9 machines?