Merge lp:~ltrager/maas/parse_proc into lp:~maas-committers/maas/trunk

Proposed by Lee Trager
Status: Merged
Approved by: Lee Trager
Approved revision: no longer in the source branch.
Merged at revision: 5016
Proposed branch: lp:~ltrager/maas/parse_proc
Merge into: lp:~maas-committers/maas/trunk
Diff against target: 120 lines (+51/-2)
3 files modified
src/metadataserver/models/commissioningscript.py (+23/-2)
src/metadataserver/models/tests/test_commissioningscript.py (+18/-0)
src/provisioningserver/refresh/node_info_scripts.py (+10/-0)
To merge this branch: bzr merge lp:~ltrager/maas/parse_proc
Reviewer                      Review Type  Date Requested  Status
Blake Rouse (community)                                    Approve
Andres Rodriguez (community)                               Needs Fixing
Review via email: mp+293844@code.launchpad.net

Commit message

Get cpu_count from /proc/cpuinfo and memory from dmesg

Description of the change

lshw isn't giving us correct data. With this MP, MAAS no longer processes the data from lshw; however, it will still be captured.

With some CPUs, such as the ones in the NEC lab, lshw displays only the number of physical CPUs and says nothing about how many cores they have. Because /proc/cpuinfo shows all the cores the kernel has access to, MAAS now uploads and processes it to get the correct number of CPU cores.
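
As a rough illustration of the counting idea, the sketch below counts "processor" stanzas in /proc/cpuinfo text. The helper name `count_logical_cpus` and the sample text are hypothetical and not part of this branch, and the pattern assumes the x86-style "processor<TAB>: N" layout; other architectures may format the file differently.

```python
import re

# Hypothetical helper, not the code in this branch: match one
# "processor<TAB>: N" line per logical CPU, anchored to whole lines so
# that other fields mentioning "processor" are not counted.
_PROCESSOR_LINE = re.compile(r"^processor[ \t]*:[ \t]*\d+[ \t]*$", re.MULTILINE)


def count_logical_cpus(cpuinfo_text):
    """Return the number of logical CPUs listed in /proc/cpuinfo text."""
    return len(_PROCESSOR_LINE.findall(cpuinfo_text))


# Made-up, heavily shortened sample in the x86 layout.
SAMPLE_CPUINFO = (
    "processor\t: 0\n"
    "vendor_id\t: GenuineIntel\n"
    "\n"
    "processor\t: 1\n"
    "vendor_id\t: GenuineIntel\n"
)
```

Anchoring the pattern to whole lines is a design choice of this sketch; the branch itself searches for the literal string 'processor\t:'.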

lshw is also no longer reporting the correct amount of RAM. This isn't entirely lshw's fault, as newer kernels only report the amount of available RAM. RAM can be reserved by the BIOS (e.g. for video cards) or by the kernel itself. I searched around a bit to find the closest number I could get. On boot, the kernel itself prints the number closest to the amount of actual physical RAM.
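
A minimal sketch of reading that boot-time figure, assuming the line in question is the kernel's "Memory: <available>K/<total>K available" message. That assumption, the helper name `parse_dmesg_total_kib` and the sample line are illustrative only; the diff in this proposal does not contain a dmesg parser.

```python
import re

# Assumed message shape: "Memory: 16283420K/16686988K available (...)".
# Some older kernels print a lowercase "k", so both are accepted.
_DMESG_MEMORY = re.compile(r"Memory: (\d+)[Kk]/(\d+)[Kk] available")


def parse_dmesg_total_kib(dmesg_text):
    """Return the total from the boot-time Memory line in KiB, or None."""
    match = _DMESG_MEMORY.search(dmesg_text)
    if match is None:
        return None
    # group(1) is the available figure, group(2) is the total.
    return int(match.group(2))


# Made-up sample line for illustration.
SAMPLE_DMESG = (
    "[    0.000000] Memory: 16283420K/16686988K available "
    "(8429K kernel code, 1285K rwdata)\n"
)
```

Returning None when the line is absent lets a caller fall back to another source instead of storing zero.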

As per our discussion this morning, I modified Node.display_memory() to show the amount of RAM in GiB with one decimal place. Because the value is rounded to one decimal place, and the memory value is so close to the actual value, the UI now displays the correct amount of RAM in GiB.
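
The rounding described above can be sketched as follows. This assumes the stored memory value is in MiB; the function name `display_memory_gib` is hypothetical and is not the actual Node.display_memory() implementation.

```python
def display_memory_gib(memory_mib):
    """Convert MiB to GiB, rounded to one decimal place.

    Hypothetical stand-in for the display logic described in the text;
    assumes the stored value is in MiB.
    """
    return round(memory_mib / 1024, 1)


# A value slightly below 8 GiB still displays as 8.0 after rounding.
nearly_8_gib = display_memory_gib(8150)
exactly_8_gib = display_memory_gib(8192)
```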

Andres Rodriguez (andreserl) wrote :

Can you please be more specific when you say that lshw is not providing the correct information? The big problem here is that MAAS's definition-based tagging feature depends on lshw output. I'd think that instead of adding a workaround because lshw is broken, we should actually fix lshw.

Can you please provide examples of lshw being broken?

Also, for example, you can create a tag with a definition that has X amount of cores. This definition will scrape lshw output on all machines to tag all machines with that definition. As such, we can't just not rely on lshw without completely affecting that feature. So we really need to fix lshw.

review: Needs Information
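
To make the tagging concern in the comment above concrete, here is a small sketch of evaluating a processor-count definition against stored lshw XML. The XML sample is simplified and made up, the function names are hypothetical, and the sketch uses the standard library's limited XPath subset; it is not MAAS's tag evaluation code.

```python
import xml.etree.ElementTree as ET

# Simplified, made-up lshw-style XML; real `lshw -xml` output is much larger.
SAMPLE_LSHW_XML = """\
<list>
  <node id="example-host" class="system">
    <node id="core" class="bus">
      <node id="cpu:0" class="processor"/>
      <node id="cpu:1" class="processor"/>
      <node id="memory" class="memory"/>
    </node>
  </node>
</list>
"""


def count_processor_nodes(lshw_xml):
    """Count <node class="processor"> elements anywhere in the document."""
    root = ET.fromstring(lshw_xml)
    return len(root.findall(".//node[@class='processor']"))


def matches_min_processors(lshw_xml, minimum):
    """Return True if the stored XML lists at least `minimum` processors."""
    return count_processor_nodes(lshw_xml) >= minimum
```

A definition of this kind reads only the stored lshw document, so its result depends on what lshw reported rather than on the node's cpu_count field.
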
Andres Rodriguez (andreserl) wrote :

actually! Gonna reject this branch because this would completely break definition based tagging.

review: Disapprove
Lee Trager (ltrager) wrote :

The reason MAAS isn't showing the correct number of CPU cores is that lshw isn't detecting all of them. http://pastebin.com/1Fwd8U8U is the output of lshw -xml from antileague-adolfo in the NEC lab. As you can see, it's only showing 1 CPU and lists nothing about the number of cores. http://paste.ubuntu.com/16232408/ is /proc/cpuinfo, which lists all usable CPU cores.

The amount of memory lshw reports is the same as what the MemTotal field from /proc/meminfo shows. I found that the kernel lists a bit more in dmesg on boot. I assume the amount listed in dmesg is the total amount of memory available to the kernel, while /proc/meminfo shows the total amount of memory available to the user. I'm trying to capture the most correct answer without guessing.

lshw is still captured and stored; it is NOT removed. What this MP changes is where commissioning pulls the cpu_count and memory from. Tagging looks at the stored lshw data and is unaffected by this change.

Blake Rouse (blake-rouse) :
review: Needs Information
Blake Rouse (blake-rouse) wrote :

Oh, and as Lee says, it does not affect tagging or anything that depends on lshw, as that information is still returned. We just use /proc to get the other information.

Lee Trager (ltrager) wrote :

lshw-B.02.18 fixes both bugs: it shows the correct number of cores and the correct amount of physical RAM. As per the discussion I had with Andres, I will see whether we can SRU lshw-B.02.18 into Xenial or backport the patch which fixes both bugs. Until then we will parse lshw as well as /proc/cpuinfo and choose the higher number for the CPU core count.
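
The "choose the higher number" rule can be sketched in isolation as below. The function name `merge_cpu_count` is hypothetical; the condition mirrors the one used in both hooks in the diff.

```python
def merge_cpu_count(current, reported):
    """Keep the larger of the stored count and a newly reported count.

    `current` may be None when no count has been stored yet.
    """
    if current is None or reported > current:
        return reported
    return current


# The result does not depend on which script's hook runs first.
lshw_then_cpuinfo = merge_cpu_count(merge_cpu_count(None, 1), 8)
cpuinfo_then_lshw = merge_cpu_count(merge_cpu_count(None, 8), 1)
```

One consequence of this rule, at least as sketched here, is that a previously stored higher count is never lowered by a later, smaller report.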

Andres Rodriguez (andreserl) :
review: Needs Fixing
Lee Trager (ltrager) wrote :

I've reordered the commissioning scripts so that lshw and /proc/cpuinfo are both 01. After uploading this I noticed that trunk skips 05 while this branch doesn't. Let me know if you want me to restore the original order or if closing the gap is alright.

Andres Rodriguez (andreserl) :
review: Needs Fixing
Blake Rouse (blake-rouse) wrote :

Looks good.

review: Approve

Preview Diff

=== modified file 'src/metadataserver/models/commissioningscript.py'
--- src/metadataserver/models/commissioningscript.py 2016-05-10 16:58:17 +0000
+++ src/metadataserver/models/commissioningscript.py 2016-05-11 21:32:07 +0000
@@ -19,6 +19,7 @@
 import logging
 import math
 import os.path
+import re
 import tarfile
 from time import time as now

@@ -175,15 +176,34 @@
     else:
         # Same document, many queries: use XPathEvaluator.
         evaluator = etree.XPathEvaluator(doc)
-        cpu_count = evaluator(_xpath_processor_count)
+        cpu_count = evaluator(_xpath_processor_count) or 0
         memory = evaluator(_xpath_memory_bytes)
         if not memory or math.isnan(memory):
             memory = 0
-        node.cpu_count = cpu_count or 0
+        # XXX ltrager 2016-05-09 - Work around for LP:1579996. On some
+        # CPU's lshw doesn't detect all CPU cores. MAAS captures and
+        # processes /proc/cpuinfo so MAAS chooses the highest number.
+        if node.cpu_count is None or cpu_count > node.cpu_count:
+            node.cpu_count = cpu_count
         node.memory = memory
         node.save()


+def parse_cpuinfo(node, output, exit_status):
+    """Parse the output of /proc/cpuinfo."""
+    assert isinstance(output, bytes)
+    if exit_status != 0:
+        return
+    decoded_output = output.decode('ascii')
+    cpu_count = len([
+        m.start()
+        for m in re.finditer('processor\t:', decoded_output)
+    ])
+    if node.cpu_count is None or cpu_count > node.cpu_count:
+        node.cpu_count = cpu_count
+        node.save()
+
+
 def set_virtual_tag(node, output, exit_status):
     """Process the results of `VIRTUALITY_SCRIPT`.

@@ -354,6 +374,7 @@

 # Register the post processing hooks.
 NODE_INFO_SCRIPTS[LSHW_OUTPUT_NAME]['hook'] = update_hardware_details
+NODE_INFO_SCRIPTS['00-maas-01-cpuinfo.out']['hook'] = parse_cpuinfo
 NODE_INFO_SCRIPTS['00-maas-02-virtuality.out']['hook'] = set_virtual_tag
 NODE_INFO_SCRIPTS['00-maas-07-block-devices.out']['hook'] = (
     update_node_physical_block_devices)

=== modified file 'src/metadataserver/models/tests/test_commissioningscript.py'
--- src/metadataserver/models/tests/test_commissioningscript.py 2016-05-10 20:10:34 +0000
+++ src/metadataserver/models/tests/test_commissioningscript.py 2016-05-11 21:32:07 +0000
@@ -52,6 +52,7 @@
     inject_lldp_result,
     inject_lshw_result,
     inject_result,
+    parse_cpuinfo,
     set_virtual_tag,
     update_hardware_details,
     update_node_network_information,
@@ -414,6 +415,23 @@
         self.assertEqual("", logger.output)


+class TestParseCPUInfo(MAASServerTestCase):
+
+    doctest_flags = doctest.ELLIPSIS | doctest.NORMALIZE_WHITESPACE
+
+    def test_parse_cpuinfo(self):
+        node = factory.make_Node()
+        cpuinfo = dedent("""\
+        processor\t: 0
+        vendor_id\t: GenuineIntel
+
+        processor\t: 1
+        vendor_id\t: GenuineIntel
+        """).encode('utf-8')
+        parse_cpuinfo(node, cpuinfo, 0)
+        self.assertEqual(2, reload_object(node).cpu_count)
+
+
 class TestUpdateNodePhysicalBlockDevices(MAASServerTestCase):

     def make_block_device(

=== modified file 'src/provisioningserver/refresh/node_info_scripts.py'
--- src/provisioningserver/refresh/node_info_scripts.py 2016-03-28 13:54:47 +0000
+++ src/provisioningserver/refresh/node_info_scripts.py 2016-05-11 21:32:07 +0000
@@ -83,6 +83,11 @@
     fi
     """)

+CPUINFO_SCRIPT = dedent("""\
+    #!/bin/sh
+    cat /proc/cpuinfo
+    """)
+

 # Run `dhclient` on all the unconfigured interfaces.
 # This is done to create records in the leases file for the
@@ -336,6 +341,11 @@
         'hook': null_hook,
         'run_on_controller': True,
     }),
+    ('00-maas-01-cpuinfo.out', {
+        'content': CPUINFO_SCRIPT.encode('ascii'),
+        'hook': null_hook,
+        'run_on_controller': True,
+    }),
     ('00-maas-02-virtuality.out', {
         'content': VIRTUALITY_SCRIPT.encode('ascii'),
         'hook': null_hook,