Merge ~mpontillo/maas:ppc64-disable-smt-for-pod-deploy--bug-1797484 into maas:master

Proposed by Mike Pontillo on 2018-10-12
Status: Merged
Approved by: Mike Pontillo on 2018-10-16
Approved revision: 2be7e2f97facf6c257ae8f35b45904e9cac0f681
Merge reported by: MAAS Lander
Merged at revision: not available
Proposed branch: ~mpontillo/maas:ppc64-disable-smt-for-pod-deploy--bug-1797484
Merge into: maas:master
Diff against target: 123 lines (+50/-10)
4 files modified
src/maasserver/node_action.py (+3/-1)
src/maasserver/tests/test_node_action.py (+6/-2)
src/metadataserver/tests/test_vendor_data.py (+18/-0)
src/metadataserver/vendor_data.py (+23/-7)
Reviewer Review Type Date Requested Status
Andres Rodriguez (community) Approve on 2018-10-16
MAAS Lander Approve on 2018-10-16
Lee Trager 2018-10-12 Needs Information on 2018-10-15
Review via email: mp+356667@code.launchpad.net

Commit message

LP: #1797484 - Support for PPC64 pod deployment

 * Add rc.local to disable SMT for ppc64el KVM pod deployments.
 * Fix UI pod deployment path to bridge all interfaces.

Description of the change

It's not really clear if this is the most correct change; I wonder about race conditions. We should investigate to see if adding a tag with `nosmt=force` is a better way to do this.

It's also not really clear if we need to restart libvirt in the rc file; we might want to drop that, too. Thoughts?

To post a comment you must log in.
MAAS Lander (maas-lander) wrote :

UNIT TESTS
-b ppc64-disable-smt-for-pod-deploy--bug-1797484 lp:~mpontillo/maas/+git/maas into -b master lp:~maas-committers/maas

STATUS: SUCCESS
COMMIT: 8965d5b463c71522dbc8e954726e3cc228b6a251

review: Approve
Lee Trager (ltrager) wrote :

Do you know why SMT needs to be disabled? I found this blog post from IBM saying it should be supported. [1]

Since this is for systemd based systems I think we should create a systemd unit in /etc/systemd/user. Something like this should work and will ensure smt is turned off before libvirtd starts so no restart is needed.

[Unit]
Description=Ensures SMT is disabled for KVM

[Service]
ExecStart=ppc64_cpu --smt=off
Before=libvirtd

[Install]
WantedBy=multi-user.target

[1] https://www.ibm.com/developerworks/community/blogs/fe313521-2e95-46f2-817d-44a4f27eba32/entry/enabling_smt_on_powerkvm_guests?lang=en_us

review: Needs Information
Mike Pontillo (mpontillo) wrote :

The article you linked to states that SMT can be enabled on /guests/, if I understand it correctly. This is for deploying the hypervisor. That said, I don't know specifically why SMT needs to be disabled on the hypervisor. I was asking about this on IRC last week, and heard from @rharper: "on certain levels of the power hardware, the kvm hypervisor cannot run the threads one the host which may be given to guests; (rather it ends up running poorly, so disabling the threads is needed)".

I like your systemd unit idea; thanks. I feel like the kernel parameter still might be better, but I don't like the idea of the manual tag management that would come with implementing it that way.

Mike Pontillo (mpontillo) wrote :

In other words, if I parsed what @rharper said correctly, by disabling SMT on the host, you ensure it's fully available on the guests?

ac39c3e... by Mike Pontillo on 2018-10-15

Fix end quote in cloud-config for PPC64.

MAAS Lander (maas-lander) wrote :

UNIT TESTS
-b ppc64-disable-smt-for-pod-deploy--bug-1797484 lp:~mpontillo/maas/+git/maas into -b master lp:~maas-committers/maas

STATUS: FAILED
LOG: http://maas-ci-jenkins.internal:8080/job/maas/job/branch-tester/4199/console
COMMIT: 176915594810294556db88020ac1f83aa870962c

review: Needs Fixing
72c4d09... by Mike Pontillo on 2018-10-16

Fix test.

MAAS Lander (maas-lander) wrote :

UNIT TESTS
-b ppc64-disable-smt-for-pod-deploy--bug-1797484 lp:~mpontillo/maas/+git/maas into -b master lp:~maas-committers/maas

STATUS: SUCCESS
COMMIT: e90dab3294e16885424b94bfe9d1c3f540caa949

review: Approve
Andres Rodriguez (andreserl) wrote :

lgtm!

review: Approve
2be7e2f... by Mike Pontillo on 2018-10-16

Use bridge_all when deploying from the UI for better UX.

Andres Rodriguez (andreserl) wrote :

lgtm!

review: Approve

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1diff --git a/src/maasserver/node_action.py b/src/maasserver/node_action.py
2index f299700..edfdfe7 100644
3--- a/src/maasserver/node_action.py
4+++ b/src/maasserver/node_action.py
5@@ -468,7 +468,9 @@ class Deploy(NodeAction):
6 if self.node.owner is None:
7 with locks.node_acquire:
8 try:
9- self.node.acquire(self.user, token=None)
10+ bridge_all = True if install_kvm else False
11+ self.node.acquire(
12+ self.user, token=None, bridge_all=bridge_all)
13 except ValidationError as e:
14 raise NodeActionError(e)
15 if install_kvm:
16diff --git a/src/maasserver/tests/test_node_action.py b/src/maasserver/tests/test_node_action.py
17index 4cf029e..ca37761 100644
18--- a/src/maasserver/tests/test_node_action.py
19+++ b/src/maasserver/tests/test_node_action.py
20@@ -699,11 +699,12 @@ class TestDeployAction(MAASServerTestCase):
21 request = factory.make_fake_request('/')
22 request.user = user
23 node = factory.make_Node(
24- interface=True, status=NODE_STATUS.ALLOCATED,
25- power_type='manual', owner=user)
26+ interface=True, status=NODE_STATUS.READY,
27+ power_type='manual', owner=None)
28 mock_get_curtin_config = self.patch(
29 node_action_module, 'get_curtin_config')
30 mock_node_start = self.patch(node, 'start')
31+ mock_node_acquire = self.patch(node, 'acquire')
32 mock_validate_os = self.patch(
33 node_action_module, 'validate_osystem_and_distro_series')
34 mock_validate_os.side_effect = lambda os, release: (os, release)
35@@ -715,6 +716,9 @@ class TestDeployAction(MAASServerTestCase):
36 }
37 Deploy(node, user, request).execute(**extra)
38 self.expectThat(mock_get_curtin_config, MockCalledOnceWith(ANY, node))
39+ # Make sure we set bridge_all when we acquire the Node.
40+ self.expectThat(mock_node_acquire, MockCalledOnceWith(
41+ user, token=None, bridge_all=True))
42 self.expectThat(mock_node_start, MockCalledOnceWith(user))
43 # Make sure ubuntu/bionic is selected if KVM pod deployment has been
44 # selected.
45diff --git a/src/metadataserver/tests/test_vendor_data.py b/src/metadataserver/tests/test_vendor_data.py
46index 341717a..6f85f86 100644
47--- a/src/metadataserver/tests/test_vendor_data.py
48+++ b/src/metadataserver/tests/test_vendor_data.py
49@@ -250,3 +250,21 @@ class TestGenerateRackControllerConfiguration(MAASServerTestCase):
50 configuration = get_vendor_data(node)
51 config = dict(configuration)
52 self.assertThat(config['packages'], Contains("qemu-efi"))
53+
54+ def test_includes_smt_off_for_install_kvm_on_ppc64(self):
55+ node = factory.make_Node(
56+ osystem='ubuntu', netboot=False, architecture='ppc64el/generic')
57+ node.install_kvm = True
58+ configuration = get_vendor_data(node)
59+ config = dict(configuration)
60+ self.assertThat(
61+ config['runcmd'], Contains([
62+ 'sh', '-c', 'printf "'
63+ '#!/bin/sh\\n'
64+ 'ppc64_cpu --smt=off\\n'
65+ 'exit 0\\n'
66+ '" >> /etc/rc.local'
67+ ]))
68+ self.assertThat(
69+ config['runcmd'], Contains(['chmod', '+x', '/etc/rc.local']))
70+ self.assertThat(config['runcmd'], Contains(['/etc/rc.local']))
71diff --git a/src/metadataserver/vendor_data.py b/src/metadataserver/vendor_data.py
72index a7d942e..26810d3 100644
73--- a/src/metadataserver/vendor_data.py
74+++ b/src/metadataserver/vendor_data.py
75@@ -103,7 +103,12 @@ def generate_rack_controller_configuration(node):
76 def generate_kvm_pod_configuration(node):
77 """Generate cloud-init configuration to install the node as a KVM pod."""
78 if node.netboot is False and node.install_kvm is True:
79- yield "runcmd", [
80+ architecture = None
81+ if node.architecture is not None:
82+ architecture = node.architecture
83+ if '/' in architecture:
84+ architecture = architecture.split('/')[0]
85+ runcmd = [
86 # Restrict the $PATH so that rbash can be used to limit what the
87 # virsh user can do if they manage to get a shell.
88 ['mkdir', '-p', '/home/virsh/bin'],
89@@ -127,6 +132,21 @@ def generate_kvm_pod_configuration(node):
90 # Ensure services are ready before cloud-init finishes.
91 ['/bin/sleep', '10'],
92 ]
93+ if architecture == 'ppc64el':
94+ # XXX mpontillo 2018-10-12 - we should investigate if it might be
95+ # better to add a tag to the node that includes a kernel parameter
96+ # such as nosmt=force. (The only problem being that we should
97+ # probably also remove it after the machine is released.)
98+ runcmd.append([
99+ 'sh', '-c', 'printf "'
100+ '#!/bin/sh\\n'
101+ 'ppc64_cpu --smt=off\\n'
102+ 'exit 0\\n'
103+ '" >> /etc/rc.local'
104+ ])
105+ runcmd.append(['chmod', '+x', '/etc/rc.local'])
106+ runcmd.append(['/etc/rc.local'])
107+ yield "runcmd", runcmd
108 # Generate a 32-character password by encoding 24 bytes as base64.
109 virsh_password = b64encode(
110 urandom(24), altchars=b'.!').decode('ascii')
111@@ -150,10 +170,6 @@ def generate_kvm_pod_configuration(node):
112 }
113 ]
114 packages = ["qemu-kvm", "libvirt-bin"]
115- if node.architecture is not None:
116- architecture = node.architecture
117- if '/' in architecture:
118- architecture = architecture.split('/')[0]
119- if architecture == 'amd64':
120- packages.append('qemu-efi')
121+ if architecture == 'amd64':
122+ packages.append('qemu-efi')
123 yield "packages", packages

Subscribers

People subscribed via source and target branches

to all changes: