Merge lp:~nelson-chu/opencompute/add-ocp-cpu-memory-job into lp:opencompute/checkbox

Proposed by Nelson Chu
Status: Superseded
Proposed branch: lp:~nelson-chu/opencompute/add-ocp-cpu-memory-job
Merge into: lp:opencompute/checkbox
Diff against target: 500 lines (+361/-43)
7 files modified
data/whitelists/opencompute-certify-local.whitelist (+0/-43)
debian/changelog (+19/-0)
jobs/TC-001-0001-CPU_Memory.txt.in (+31/-0)
jobs/local.txt.in (+7/-0)
scripts/cpu_info (+97/-0)
scripts/memory_info (+77/-0)
scripts/processor_topology (+130/-0)
To merge this branch: bzr merge lp:~nelson-chu/opencompute/add-ocp-cpu-memory-job
Reviewer     Review Type   Date Requested   Status
Jeff Lane                                   Needs Fixing
Nelson Chu                                  Pending
Review via email: mp+205884@code.launchpad.net

This proposal supersedes a proposal from 2013-11-26.

This proposal has been superseded by a proposal from 2014-02-13.

Description of the change

Scripts and annotations have been modified.

Jeff Lane  (bladernr) wrote : Posted in a previous version of this proposal

Hi Nelson,

Thank you for breaking this into smaller pieces... it makes review a LOT easier. Now, there are some things that need fixing.

I'll take them one file at a time:
data/whitelists/opencompute-certify-local.whitelist looks fine.

jobs/TC-001-0001-CPU_Memory.txt.in:
1: Any job that calls a script that needs root permissions must include the 'user: root' definition. See the file 'jobs/cpu.txt.in' for some examples where this is necessary. When I run the script cpu_info without root, the cache data is not returned and the test fails. So this job needs 'user: root'.
2: TC-001-0001-003-Memory_Information also needs 'user: root' added to properly run.
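
A script can also guard against being started without root on its own, so the failure is obvious when it is run by hand. The following is only a minimal sketch and is not part of the proposed branch; the helper name `require_root`, its `euid` parameter and the message text are assumptions made for illustration.

```python
import os


def require_root(euid=None):
    """Return 0 when running as root, otherwise print a FAIL line and return 1.

    `euid` is a hypothetical parameter so the check can be exercised
    without actually switching users; by default the real effective
    user ID of the process is used.
    """
    if euid is None:
        euid = os.geteuid()
    if euid != 0:
        print("FAIL: this script must be run as root (add 'user: root' to the job).")
        return 1
    return 0
```

A script's main() could call `require_root()` first and return its result immediately if it is non-zero.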

jobs/local.txt.in
1: The word "Verified" should be "Verify" in the description field.

scripts/cpu_info:
1: When I manually run the cpu_info script for the test TC-001-0001-001-CPU_information with and without root permissions, I get the following different outputs:
bladernr@klaatu:~/development/ocp-nelson-memory-cpu-test$ ./scripts/cpu_info
Intel(R) Core(TM) i7 CPU Q 720 @ 1.60GHz
Can not found any CPU cache.
bladernr@klaatu:~/development/ocp-nelson-memory-cpu-test$ echo $?
30
bladernr@klaatu:~/development/ocp-nelson-memory-cpu-test$ sudo ./scripts/cpu_info
Intel(R) Core(TM) i7 CPU Q 720 @ 1.60GHz
L1 Cache 32 KB
L2 Cache 256 KB
L3 Cache 6 MB
L3 cache size less than 20MB.
bladernr@klaatu:~/development/ocp-nelson-memory-cpu-test$ echo $?
50

I don't have a system that meets the test criteria, so it will always fail for me. First, the explanation for failure should be more explicit. Example: 'FAIL: Can not find any CPU cache.' for the first example. This is more important in the second example where 'L3 cache size less than 20MB' looks like part of the lshw output. It would be better more explicitly stated as 'FAIL: L3 cache size less than 20MB.'
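
The explicit 'FAIL:' prefix can come from one small helper so that every failure line looks the same. A minimal sketch, assuming the hypothetical helper names `fail` and `check_l3_cache`; the 20MB threshold and exit code 50 are taken from the output shown above, everything else is illustrative.

```python
def fail(message, code=1):
    """Print an explicit failure line and return a non-zero exit code."""
    print("FAIL: %s" % message)
    return code


def check_l3_cache(size_mb, minimum_mb=20):
    """Return 0 if the L3 cache meets the minimum size, else a FAIL result."""
    if size_mb < minimum_mb:
        return fail("L3 cache size less than %dMB." % minimum_mb, 50)
    return 0
```

With the 6 MB L3 cache from the run above, `check_l3_cache(6)` prints the FAIL line and returns 50.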

2: You don't need all those explicit error codes. Checkbox only distinguishes between zero and non-zero exit codes. A 0 exit code indicates a test passed. A non-zero exit code indicates a failure. Checkbox does not store the actual exit codes. This is not necessarily something you need to change, but you DO need to be aware of the behaviour in case a script behaves differently than expected when you run it via checkbox.
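
The pass/fail reduction described here can be pictured as follows. This is only a sketch of the behaviour, not Checkbox's actual code; `run_job` is a hypothetical name.

```python
import subprocess


def run_job(command):
    """Run a command (given as an argument list) and reduce its exit
    code to 'pass' or 'fail'. The specific non-zero value is discarded."""
    returncode = subprocess.call(command)
    return "pass" if returncode == 0 else "fail"
```

Under this model, exit codes 30 and 50 from cpu_info are indistinguishable; the printed message is the only place the reason for the failure survives.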

3: The description and spec for this test says: "CPU model should belong to Intel Xeon processor E5-2600 family..." but your test does not fail on non-Xeon processors. For example, when I comment out the error code return for my cache limit on my laptop, I get this output:
bladernr@klaatu:~/development/ocp-nelson-memory-cpu-test$ sudo ./scripts/cpu_info; echo $?
Intel(R) Core(TM) i7 CPU Q 720 @ 1.60GHz
L1 Cache 32 KB
L2 Cache 256 KB
L3 Cache 6 MB
L3 cache size less than 20MB.
0

but my laptop should clearly fail the test case since it's not up to OCP spec.
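
One way to make the test fail on non-Xeon processors is to match the product string reported by lshw against the expected model. A minimal sketch, assuming the hypothetical names `model_matches`, `product` and `family`; it is not the script's own code.

```python
import re


def model_matches(product_text, product="Xeon", family="E5"):
    """Return True if the product string names the expected CPU model.

    The product name must appear before the family name; re.escape keeps
    characters such as '-' or '(' in the arguments from being read as
    regular expression syntax.
    """
    pattern = re.escape(product) + r".*" + re.escape(family)
    return re.search(pattern, product_text) is not None
```

With the laptop string above, `model_matches("Intel(R) Core(TM) i7 CPU Q 720 @ 1.60GHz")` returns False, so the job would fail as the spec requires.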

4: The output needs to be cleaned up a bit. Sorry, I know English is a second language for you, so I'll try to help as much as I can.
    "Can not parser" should be "Can not parse"
    "Can not found" should be "Can not find"
    "# Parser lshw XML for gather" should be "# Parse lshw XML for gathering"

scripts/memory_info:
1: Needs to be run as ...


review: Needs Fixing
Nelson Chu (nelson-chu) wrote : Posted in a previous version of this proposal

OK, Thank you for your suggestion.
I will revise it accordingly.

Nelson Chu (nelson-chu) wrote : Posted in a previous version of this proposal

Hi Jeff,

I have modified the scripts. Please review them.
Any suggestions would be appreciated.

Thanks,
Nelson

review: Needs Resubmitting
Jeff Lane  (bladernr) wrote :

Before I can go any further, you have a conflict in data/whitelists/opencompute-certify-local.whitelist

To see this, you should do the following:

bzr branch lp:opencompute/checkbox ocp-checkbox
cd ocp-checkbox
bzr merge lp:~nelson-chu/opencompute/add-ocp-cpu-memory-job

That MAY actually just clean it up... but there's a file conflict in there, so please resolve that and I'll review the rest at that time.

review: Needs Fixing
2170. By Nelson Chu

Merge add-ocp-cpu-memory-job to lp:opencompute/checkbox and fixed some conflicts

2171. By Nelson Chu

Modify cpu_info and memory_info scripts. Revise debian/changelog file.

2172. By Nelson Chu

debian/changelog file and memory_info script

Preview Diff

=== added file 'data/whitelists/opencompute-certify-local.whitelist'
--- data/whitelists/opencompute-certify-local.whitelist 1970-01-01 00:00:00 +0000
+++ data/whitelists/opencompute-certify-local.whitelist 2014-02-13 07:27:19 +0000
@@ -0,0 +1,47 @@
+# Resource Jobs
+block_device
+cdimage
+cpuinfo
+device
+dmi
+dpkg
+efi
+environment
+gconf
+lsb
+meminfo
+module
+optical_drive
+package
+sleep
+uname
+#Info attachment jobs
+__info__
+cpuinfo_attachment
+dmesg_attachment
+dmi_attachment
+dmidecode_attachment
+efi_attachment
+lspci_attachment
+lshw_attachment
+mcelog_attachment
+meminfo_attachment
+modprobe_attachment
+modules_attachment
+sysctl_attachment
+sysfs_attachment
+udev_attachment
+lsmod_attachment
+acpi_sleep_attachment
+info/hdparm
+info/hdparm_.*.txt
+installer_debug.gz
+info/disk_partitions
+# Actual test cases
+__TC-001-0001-CPU_Memory__
+TC-001-0001-001-CPU_Information
+TC-001-0001-002-Processor_Topology
+TC-001-0001-003-Memory_Information
+__TC-001-0002-Platform_Controller_Hub__
+TC-001-0002-001-SATA_port
+TC-001-0002-002-USB_2.0

=== removed file 'data/whitelists/opencompute-certify-local.whitelist'
--- data/whitelists/opencompute-certify-local.whitelist 2014-02-11 12:16:41 +0000
+++ data/whitelists/opencompute-certify-local.whitelist 1970-01-01 00:00:00 +0000
@@ -1,43 +0,0 @@
-# Resource Jobs
-block_device
-cdimage
-cpuinfo
-device
-dmi
-dpkg
-efi
-environment
-gconf
-lsb
-meminfo
-module
-optical_drive
-package
-sleep
-uname
-#Info attachment jobs
-__info__
-cpuinfo_attachment
-dmesg_attachment
-dmi_attachment
-dmidecode_attachment
-efi_attachment
-lspci_attachment
-lshw_attachment
-mcelog_attachment
-meminfo_attachment
-modprobe_attachment
-modules_attachment
-sysctl_attachment
-sysfs_attachment
-udev_attachment
-lsmod_attachment
-acpi_sleep_attachment
-info/hdparm
-info/hdparm_.*.txt
-installer_debug.gz
-info/disk_partitions
-# Actual test cases
-__TC-001-0002-Platform_Controller_Hub__
-TC-001-0002-001-SATA_port
-TC-001-0002-002-USB_2.0

=== modified file 'debian/changelog'
--- debian/changelog 2013-11-05 20:04:15 +0000
+++ debian/changelog 2014-02-13 07:27:19 +0000
@@ -1,3 +1,22 @@
+  [ Nelson Chu ]
+  * data/whitelists/opencompute-certify-local.whitelist - Added new jobs to
+    certification whitelist
+  * jobs/TC-001-0002-Platform_Controller_Hub.txt - Added new jobs for PCH
+    test cases
+  * jobs/local.txt.in - Added job to parse new PCH job file
+  * scripts/check_sata_port - new script to verify SATA port speed
+  * scripts/check_usb_port - new script to verify USB version
+
+  [ Nelson Chu ]
+  * data/whitelists/opencompute-certify-local.whitelist - Added new jobs to
+    certification whitelist
+  * jobs/TC-001-0001-CPU_Memory.txt.in - Added new jobs for CPU and Memory
+    test cases
+  * jobs/local.txt.in - Added job to parse new CPU and Memory job file
+  * scripts/cpu_info - new script to gather CPU information
+  * scripts/memory_info - new script to gather memory information
+  * scripts/processor_topology - Revised script to match certification criteria
+
 checkbox (1.16.13~OCP) UNRELEASED; urgency=low

   [ Jeff Marcom ]

=== added file 'jobs/TC-001-0001-CPU_Memory.txt.in'
--- jobs/TC-001-0001-CPU_Memory.txt.in 1970-01-01 00:00:00 +0000
+++ jobs/TC-001-0001-CPU_Memory.txt.in 2014-02-13 07:27:19 +0000
@@ -0,0 +1,31 @@
+plugin: shell
+name: TC-001-0001-001-CPU_Information
+requires: package.name == 'lshw'
+user: root
+command: cpu_info -p Xeon -f E5
+description:
+ 1. Use lshw command to gather CPU information.
+ 2. The program will output CPU model and L1, L2, L3 cache size.
+ 3. Criteria: CPU model must be Intel Xeon processor E5-2600 product family and L3 cache size must be up to 20MB.
+
+plugin: shell
+name: TC-001-0001-002-Processor_Topology
+command: processor_topology
+description:
+ 1. This test checks CPU topology for accuracy.
+ 2. Use lscpu command to gather CPU information.
+ 3. The program will output the total number of CPUs, the number of threads per core, the number of cores per socket, and the number of sockets.
+ 4. Criteria: It should be 8-12 cores per CPU and 2 threads per core.
+
+plugin: shell
+name: TC-001-0001-003-Memory_Information
+requires: package.name == 'lshw'
+user: root
+command: memory_info
+description:
+ 1. Use lshw command to gather memory information.
+ 2. Testing prerequisites:
+    4 channels DDR3 registered memory interface on each processor 0 and processor 1.
+    2 DDR3 slots per channel per processor. (total of 16 DIMMs on the motherboard)
+ 3. The program will output memory module, vendor, size and slot.
+ 4. Criteria: Total of 16 DIMMs on the motherboard.

=== renamed file 'jobs/TC-001-0002-Platform_Controller_Hub.txt' => 'jobs/TC-001-0002-Platform_Controller_Hub.txt.in'
=== modified file 'jobs/local.txt.in'
--- jobs/local.txt.in 2014-02-11 12:16:41 +0000
+++ jobs/local.txt.in 2014-02-13 07:27:19 +0000
@@ -110,6 +110,13 @@
  shopt -s extglob
  cat $CHECKBOX_SHARE/jobs/sniff.txt?(.in)

+name: __TC-001-0001-CPU_Memory__
+plugin: local
+_description: Verify CPU and memory
+command:
+ shopt -s extglob
+ cat $CHECKBOX_SHARE/jobs/TC-001-0001-CPU_Memory.txt?(.in)
+
 name: __TC-001-0002-Platform_Controller_Hub__
 plugin: local
 _description: Verify platform controller hub functionality

=== added file 'scripts/cpu_info'
--- scripts/cpu_info 1970-01-01 00:00:00 +0000
+++ scripts/cpu_info 2014-02-13 07:27:19 +0000
@@ -0,0 +1,97 @@
+#!/usr/bin/env python3
+"""
+Copyright (C) 2010-2013 by Cloud Computing Center for Mobile Applications
+Industrial Technology Research Institute
+
+cpu_info
+  Use lshw command to gather CPU information.
+  The program will output CPU model and L1, L2, L3 cache size.
+  Criteria: CPU model and product family must match user's input
+  and L3 cache should be larger than 20MB.
+
+Authors
+  Nelson Chu <Nelson.Chu@itri.org.tw>
+
+This program is free software: you can redistribute it and/or modify
+it under the terms of the GNU General Public License version 3,
+as published by the Free Software Foundation.
+
+This program is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with this program. If not, see <http://www.gnu.org/licenses/>.
+
+"""
+
+import os
+import re
+import sys
+import xml.etree.ElementTree as ET
+from subprocess import check_output
+from argparse import ArgumentParser, RawTextHelpFormatter
+
+def run(product, family):
+    command = "lshw -xml"
+    with open(os.devnull, 'w') as NULL:
+        hwinfo_xml = check_output(command, stderr=NULL, shell=True)
+    root = ET.fromstring(hwinfo_xml)
+
+    # Parse lshw XML for gathering processor information.
+    processor = root.findall(".//product/..[@class='processor']")
+
+    if not processor:
+        print("Fail: Cannot parse any processor information.")
+        return 10
+
+    for cpu in processor:
+        if cpu.find('product') is None:
+            print("Fail: Cannot find processor product.")
+            return 20
+        print(cpu.find('product').text)
+        match = re.search(product + '.*' + family, cpu.find('product').text)
+        if not match:
+            print("Fail: Cannot match CPU %s %s family." %(product, family))
+            return 25
+
+        cache_list = cpu.findall(".//size/..[@class='memory']")
+        if not cache_list:
+            print("Fail: Cannot find any CPU cache.")
+            return 30
+
+        for cache in cache_list:
+            if cache.find('size') is None or cache.find('slot') is None:
+                print("Fail: Cannot access Last Level Cache (LLC).")
+                return 40
+
+            cache_size = int(cache.find('size').text) / 1024
+            if cache_size > 1024:
+                cache_size = cache_size / 1024
+                print(('%s %d MB') %(cache.find('slot').text, cache_size))
+                if re.search('L3', cache.find('slot').text):
+                    if cache_size < 20:
+                        print('Fail: L3 cache size less than 20MB.')
+                        return 50
+            else:
+                print(('%s %d KB') %(cache.find('slot').text, cache_size))
+
+    return 0
+
+def main():
+    parser = ArgumentParser(formatter_class=RawTextHelpFormatter)
+
+    parser.add_argument('-p', '--product', type=str, required=True,
+                        help=("The CPU product name. [Example: Xeon]"))
+    parser.add_argument('-f', '--family', type=str, required=True,
+                        help=("Processor family. [Example: E5]"))
+
+    args = parser.parse_args()
+
+    product = args.product.title()
+    family = args.family.title()
+    return run(product, family)
+
+if __name__ == '__main__':
+    sys.exit(main())

=== added file 'scripts/memory_info'
--- scripts/memory_info 1970-01-01 00:00:00 +0000
+++ scripts/memory_info 2014-02-13 07:27:19 +0000
@@ -0,0 +1,77 @@
+#!/usr/bin/env python3
+"""
+Copyright (C) 2010-2013 by Cloud Computing Center for Mobile Applications
+Industrial Technology Research Institute
+
+memory_info
+  1. Use lshw command to gather memory information.
+  2. Testing prerequisites:
+     4 channels DDR3 registered memory interface on each processor 0
+     and processor 1.
+     2 DDR3 slots per channel per processor. (total of 16 DIMMs
+     on the motherboard)
+  3. The program will output memory module, vendor, size and slot.
+  4. Criteria: Total of 16 DIMMs on the motherboard.
+
+Authors
+  Nelson Chu <Nelson.Chu@itri.org.tw>
+
+This program is free software: you can redistribute it and/or modify
+it under the terms of the GNU General Public License version 3,
+as published by the Free Software Foundation.
+
+This program is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with this program. If not, see <http://www.gnu.org/licenses/>.
+
+"""
+
+import sys
+import xml.etree.ElementTree as ET
+from subprocess import Popen, PIPE
+
+def main():
+    attribute = ['description', 'vendor', 'slot', 'size']
+    command = 'lshw -xml'
+    hwinfo_xml = Popen(command, stdout=PIPE, stderr=PIPE,
+                       shell=True).communicate()[0]
+    root = ET.fromstring(hwinfo_xml)
+
+    # Parse lshw XML for gathering memory information.
+    memory_list = root.findall(".//clock/..[@class='memory']")
+
+    if not memory_list:
+        print("Fail: Cannot parse any memory information.")
+        return 10
+
+    count = 0
+    for dimm in memory_list:
+        count = count +1
+        for attr in attribute:
+            if dimm.find(attr) is None:
+                print(("Fail: Cannot find memory %s") %attr)
+                return 20
+
+        memory_size = int(dimm.find('size').text) / (1024**3)
+        for attr in attribute:
+            if attr == 'description':
+                print('%s' %(dimm.find(attr).text), end=" ")
+                continue
+            elif attr == 'size':
+                print('%s=%dGB.' %(attr, memory_size))
+                continue
+            print('%s=%s' %(attr, dimm.find(attr).text), end=" ")
+
+    print("Total number of DIMMs is %s." %(count))
+    if count != 16:
+        print("Fail: Memory DIMM number is not meet the requirement.")
+        return 30
+
+    return 0
+
+if __name__ == '__main__':
+    sys.exit(main())

=== added file 'scripts/processor_topology'
--- scripts/processor_topology 1970-01-01 00:00:00 +0000
+++ scripts/processor_topology 2014-02-13 07:27:19 +0000
@@ -0,0 +1,130 @@
+#!/usr/bin/env python3
+'''
+cpu_topology
+Written by Jeffrey Lane <jeffrey.lane@canonical.com>
+'''
+import sys
+import os
+from subprocess import check_output
+
+class proc_cpuinfo():
+    '''
+    Class to get and handle information from /proc/cpuinfo
+    Creates a dictionary of data gleaned from that file.
+    '''
+    def __init__(self):
+        self.cpuinfo = {}
+        cpu_fh = open('/proc/cpuinfo', 'r')
+        try:
+            temp = cpu_fh.readlines()
+        finally:
+            cpu_fh.close()
+
+        for i in temp:
+            if i.startswith('processor'):
+                key = 'cpu' + (i.split(':')[1].strip())
+                self.cpuinfo[key] = {'core_id':'', 'physical_package_id':''}
+            elif i.startswith('core id'):
+                self.cpuinfo[key].update({'core_id': i.split(':')[1].strip()})
+            elif i.startswith('physical id'):
+                self.cpuinfo[key].update({'physical_package_id':
+                                          i.split(':')[1].strip()})
+            else:
+                continue
+
+
+class sysfs_cpu():
+    '''
+    Class to get and handle information from sysfs as relates to CPU topology
+    Creates an informational class to present information on various CPUs
+    '''
+
+    def __init__(self, proc):
+        self.syscpu = {}
+        self.path = '/sys/devices/system/cpu/' + proc + '/topology'
+        items = ['core_id', 'physical_package_id']
+        for i in items:
+            syscpu_fh = open(os.path.join(self.path, i), 'r')
+            try:
+                self.syscpu[i] = syscpu_fh.readline().strip()
+            finally:
+                syscpu_fh.close()
+
+
+def compare(proc_cpu, sys_cpu):
+    cpu_map = {}
+    '''
+    If there is only 1 CPU the test don't look for core_id
+    and physical_package_id because those information are absent in
+    /proc/cpuinfo on singlecore system
+    '''
+    for key in proc_cpu.keys():
+        if 'cpu1' not in proc_cpu:
+            cpu_map[key] = True
+        else:
+            for subkey in proc_cpu[key].keys():
+                if proc_cpu[key][subkey] == sys_cpu[key][subkey]:
+                    cpu_map[key] = True
+                else:
+                    cpu_map[key] = False
+    return cpu_map
+
+
+def main():
+    cpuinfo = proc_cpuinfo()
+    sys_cpu = {}
+    keys = cpuinfo.cpuinfo.keys()
+    for k in keys:
+        sys_cpu[k] = sysfs_cpu(k).syscpu
+    cpu_map = compare(cpuinfo.cpuinfo, sys_cpu)
+    if False in cpu_map.values() or len(cpu_map) < 1:
+        print("FAIL: CPU Topology is incorrect", file=sys.stderr)
+        print("-" * 52, file=sys.stderr)
+        print("{0}{1}".format("/proc/cpuinfo".center(30), "sysfs".center(25)),
+              file=sys.stderr)
+        print("{0}{1}{2}{3}{1}{2}".format(
+            "CPU".center(6),
+            "Physical ID".center(13),
+            "Core ID".center(9),
+            "|".center(3)), file=sys.stderr)
+        for key in sorted(sys_cpu.keys()):
+            print("{0}{1}{2}{3}{4}{5}".format(
+                key.center(6),
+                cpuinfo.cpuinfo[key]['physical_package_id'].center(13),
+                cpuinfo.cpuinfo[key]['core_id'].center(9),
+                "|".center(3),
+                sys_cpu[key]['physical_package_id'].center(13),
+                sys_cpu[key]['core_id'].center(9)), file=sys.stderr)
+        return 1
+    else:
+        # Use lscpu command to gather CPU information.
+        # Output the total number of CPUs, the number of threads per core,
+        # the number of cores per socket, and the number of sockets.
+        # Criteria: It must be 8-12 cores per CPU and 2 threads per core.
+        # Revised by Nelson Chu <nelson.chu@itri.org.tw>
+        command = 'lscpu'
+        return_code = 0
+
+        with open(os.devnull, "w") as NULL:
+            cpu_info = check_output(command, stderr=NULL, shell=True)
+
+        cpu_info = cpu_info.decode('utf-8')
+
+        for cpu in cpu_info.split('\n'):
+            if cpu.startswith("CPU(s)"):
+                print(cpu)
+            if cpu.startswith("Thread(s) per core"):
+                print(cpu)
+            if cpu.startswith("Core(s) per socket"):
+                cores = int(cpu.split(":")[1])
+                print(cpu)
+                # It should be 8-12 cores per CPU.
+                if cores < 8 or cores > 12:
+                    return_code = 1
+            if cpu.startswith("Socket(s)"):
+                print(cpu)
+
+        return return_code
+
+if __name__ == '__main__':
+    sys.exit(main())
