Merge ~mpontillo/maas:ppc64-disable-smt-for-pod-deploy--bug-1797484 into maas:master

Proposed by Mike Pontillo
Status: Merged
Approved by: Mike Pontillo
Approved revision: 2be7e2f97facf6c257ae8f35b45904e9cac0f681
Merge reported by: MAAS Lander
Merged at revision: not available
Proposed branch: ~mpontillo/maas:ppc64-disable-smt-for-pod-deploy--bug-1797484
Merge into: maas:master
Diff against target: 123 lines (+50/-10)
4 files modified
src/maasserver/node_action.py (+3/-1)
src/maasserver/tests/test_node_action.py (+6/-2)
src/metadataserver/tests/test_vendor_data.py (+18/-0)
src/metadataserver/vendor_data.py (+23/-7)
Reviewer Review Type Date Requested Status
Andres Rodriguez (community) Approve
MAAS Lander Approve
Lee Trager (community) Needs Information
Review via email: mp+356667@code.launchpad.net

Commit message

LP: #1797484 - Support for PPC64 pod deployment

 * Add rc.local to disable SMT for ppc64el KVM pod deployments.
 * Fix UI pod deployment path to bridge all interfaces.

Description of the change

It's not really clear if this is the most correct change; I wonder about race conditions. We should investigate to see if adding a tag with `nosmt=force` is a better way to do this.

It's also not really clear if we need to restart libvirt in the rc file; we might want to drop that, too. Thoughts?

To post a comment you must log in.
Revision history for this message
MAAS Lander (maas-lander) wrote :

UNIT TESTS
-b ppc64-disable-smt-for-pod-deploy--bug-1797484 lp:~mpontillo/maas/+git/maas into -b master lp:~maas-committers/maas

STATUS: SUCCESS
COMMIT: 8965d5b463c71522dbc8e954726e3cc228b6a251

review: Approve
Revision history for this message
Lee Trager (ltrager) wrote :

Do you know why SMT needs to be disabled? I found this blog post from IBM saying it should be supported. [1]

Since this is for systemd based systems I think we should create a systemd unit in /etc/systemd/user. Something like this should work and will ensure smt is turned off before libvirtd starts so no restart is needed.

[Unit]
Description=Ensures SMT is disabled for KVM

[Service]
ExecStart=ppc64_cpu --smt=off
Before=libvirtd

[Install]
WantedBy=multi-user.target

[1] https://www.ibm.com/developerworks/community/blogs/fe313521-2e95-46f2-817d-44a4f27eba32/entry/enabling_smt_on_powerkvm_guests?lang=en_us

review: Needs Information
Revision history for this message
Mike Pontillo (mpontillo) wrote :

The article you linked to states that SMT can be enabled on /guests/, if I understand it correctly. This is for deploying the hypervisor. That said, I don't know specifically why SMT needs to be disabled on the hypervisor. I was asking about this on IRC last week, and heard from @rharper: "on certain levels of the power hardware, the kvm hypervisor cannot run the threads one the host which may be given to guests; (rather it ends up running poorly, so disabling the threads is needed)".

I like your systemd unit idea; thanks. I feel like the kernel parameter still might be better, but I don't like the idea of the manual tag management that would come with implementing it that way.

Revision history for this message
Mike Pontillo (mpontillo) wrote :

In other words, if I parsed what @rharper said correctly, by disabling SMT on the host, you ensure it's fully available on the guests?

ac39c3e... by Mike Pontillo

Fix end quote in cloud-config for PPC64.

Revision history for this message
MAAS Lander (maas-lander) wrote :

UNIT TESTS
-b ppc64-disable-smt-for-pod-deploy--bug-1797484 lp:~mpontillo/maas/+git/maas into -b master lp:~maas-committers/maas

STATUS: FAILED
LOG: http://maas-ci-jenkins.internal:8080/job/maas/job/branch-tester/4199/console
COMMIT: 176915594810294556db88020ac1f83aa870962c

review: Needs Fixing
72c4d09... by Mike Pontillo

Fix test.

Revision history for this message
MAAS Lander (maas-lander) wrote :

UNIT TESTS
-b ppc64-disable-smt-for-pod-deploy--bug-1797484 lp:~mpontillo/maas/+git/maas into -b master lp:~maas-committers/maas

STATUS: SUCCESS
COMMIT: e90dab3294e16885424b94bfe9d1c3f540caa949

review: Approve
Revision history for this message
Andres Rodriguez (andreserl) wrote :

lgtm!

review: Approve
2be7e2f... by Mike Pontillo

Use bridge_all when deploying from the UI for better UX.

Revision history for this message
Andres Rodriguez (andreserl) wrote :

lgtm!

review: Approve

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
diff --git a/src/maasserver/node_action.py b/src/maasserver/node_action.py
index f299700..edfdfe7 100644
--- a/src/maasserver/node_action.py
+++ b/src/maasserver/node_action.py
@@ -468,7 +468,9 @@ class Deploy(NodeAction):
468 if self.node.owner is None:468 if self.node.owner is None:
469 with locks.node_acquire:469 with locks.node_acquire:
470 try:470 try:
471 self.node.acquire(self.user, token=None)471 bridge_all = True if install_kvm else False
472 self.node.acquire(
473 self.user, token=None, bridge_all=bridge_all)
472 except ValidationError as e:474 except ValidationError as e:
473 raise NodeActionError(e)475 raise NodeActionError(e)
474 if install_kvm:476 if install_kvm:
diff --git a/src/maasserver/tests/test_node_action.py b/src/maasserver/tests/test_node_action.py
index 4cf029e..ca37761 100644
--- a/src/maasserver/tests/test_node_action.py
+++ b/src/maasserver/tests/test_node_action.py
@@ -699,11 +699,12 @@ class TestDeployAction(MAASServerTestCase):
699 request = factory.make_fake_request('/')699 request = factory.make_fake_request('/')
700 request.user = user700 request.user = user
701 node = factory.make_Node(701 node = factory.make_Node(
702 interface=True, status=NODE_STATUS.ALLOCATED,702 interface=True, status=NODE_STATUS.READY,
703 power_type='manual', owner=user)703 power_type='manual', owner=None)
704 mock_get_curtin_config = self.patch(704 mock_get_curtin_config = self.patch(
705 node_action_module, 'get_curtin_config')705 node_action_module, 'get_curtin_config')
706 mock_node_start = self.patch(node, 'start')706 mock_node_start = self.patch(node, 'start')
707 mock_node_acquire = self.patch(node, 'acquire')
707 mock_validate_os = self.patch(708 mock_validate_os = self.patch(
708 node_action_module, 'validate_osystem_and_distro_series')709 node_action_module, 'validate_osystem_and_distro_series')
709 mock_validate_os.side_effect = lambda os, release: (os, release)710 mock_validate_os.side_effect = lambda os, release: (os, release)
@@ -715,6 +716,9 @@ class TestDeployAction(MAASServerTestCase):
715 }716 }
716 Deploy(node, user, request).execute(**extra)717 Deploy(node, user, request).execute(**extra)
717 self.expectThat(mock_get_curtin_config, MockCalledOnceWith(ANY, node))718 self.expectThat(mock_get_curtin_config, MockCalledOnceWith(ANY, node))
719 # Make sure we set bridge_all when we acquire the Node.
720 self.expectThat(mock_node_acquire, MockCalledOnceWith(
721 user, token=None, bridge_all=True))
718 self.expectThat(mock_node_start, MockCalledOnceWith(user))722 self.expectThat(mock_node_start, MockCalledOnceWith(user))
719 # Make sure ubuntu/bionic is selected if KVM pod deployment has been723 # Make sure ubuntu/bionic is selected if KVM pod deployment has been
720 # selected.724 # selected.
diff --git a/src/metadataserver/tests/test_vendor_data.py b/src/metadataserver/tests/test_vendor_data.py
index 341717a..6f85f86 100644
--- a/src/metadataserver/tests/test_vendor_data.py
+++ b/src/metadataserver/tests/test_vendor_data.py
@@ -250,3 +250,21 @@ class TestGenerateRackControllerConfiguration(MAASServerTestCase):
250 configuration = get_vendor_data(node)250 configuration = get_vendor_data(node)
251 config = dict(configuration)251 config = dict(configuration)
252 self.assertThat(config['packages'], Contains("qemu-efi"))252 self.assertThat(config['packages'], Contains("qemu-efi"))
253
254 def test_includes_smt_off_for_install_kvm_on_ppc64(self):
255 node = factory.make_Node(
256 osystem='ubuntu', netboot=False, architecture='ppc64el/generic')
257 node.install_kvm = True
258 configuration = get_vendor_data(node)
259 config = dict(configuration)
260 self.assertThat(
261 config['runcmd'], Contains([
262 'sh', '-c', 'printf "'
263 '#!/bin/sh\\n'
264 'ppc64_cpu --smt=off\\n'
265 'exit 0\\n'
266 '" >> /etc/rc.local'
267 ]))
268 self.assertThat(
269 config['runcmd'], Contains(['chmod', '+x', '/etc/rc.local']))
270 self.assertThat(config['runcmd'], Contains(['/etc/rc.local']))
diff --git a/src/metadataserver/vendor_data.py b/src/metadataserver/vendor_data.py
index a7d942e..26810d3 100644
--- a/src/metadataserver/vendor_data.py
+++ b/src/metadataserver/vendor_data.py
@@ -103,7 +103,12 @@ def generate_rack_controller_configuration(node):
103def generate_kvm_pod_configuration(node):103def generate_kvm_pod_configuration(node):
104 """Generate cloud-init configuration to install the node as a KVM pod."""104 """Generate cloud-init configuration to install the node as a KVM pod."""
105 if node.netboot is False and node.install_kvm is True:105 if node.netboot is False and node.install_kvm is True:
106 yield "runcmd", [106 architecture = None
107 if node.architecture is not None:
108 architecture = node.architecture
109 if '/' in architecture:
110 architecture = architecture.split('/')[0]
111 runcmd = [
107 # Restrict the $PATH so that rbash can be used to limit what the112 # Restrict the $PATH so that rbash can be used to limit what the
108 # virsh user can do if they manage to get a shell.113 # virsh user can do if they manage to get a shell.
109 ['mkdir', '-p', '/home/virsh/bin'],114 ['mkdir', '-p', '/home/virsh/bin'],
@@ -127,6 +132,21 @@ def generate_kvm_pod_configuration(node):
127 # Ensure services are ready before cloud-init finishes.132 # Ensure services are ready before cloud-init finishes.
128 ['/bin/sleep', '10'],133 ['/bin/sleep', '10'],
129 ]134 ]
135 if architecture == 'ppc64el':
136 # XXX mpontillo 2018-10-12 - we should investigate if it might be
137 # better to add a tag to the node that includes a kernel parameter
138 # such as nosmt=force. (The only problem being that we should
139 # probably also remove it after the machine is released.)
140 runcmd.append([
141 'sh', '-c', 'printf "'
142 '#!/bin/sh\\n'
143 'ppc64_cpu --smt=off\\n'
144 'exit 0\\n'
145 '" >> /etc/rc.local'
146 ])
147 runcmd.append(['chmod', '+x', '/etc/rc.local'])
148 runcmd.append(['/etc/rc.local'])
149 yield "runcmd", runcmd
130 # Generate a 32-character password by encoding 24 bytes as base64.150 # Generate a 32-character password by encoding 24 bytes as base64.
131 virsh_password = b64encode(151 virsh_password = b64encode(
132 urandom(24), altchars=b'.!').decode('ascii')152 urandom(24), altchars=b'.!').decode('ascii')
@@ -150,10 +170,6 @@ def generate_kvm_pod_configuration(node):
150 }170 }
151 ]171 ]
152 packages = ["qemu-kvm", "libvirt-bin"]172 packages = ["qemu-kvm", "libvirt-bin"]
153 if node.architecture is not None:173 if architecture == 'amd64':
154 architecture = node.architecture174 packages.append('qemu-efi')
155 if '/' in architecture:
156 architecture = architecture.split('/')[0]
157 if architecture == 'amd64':
158 packages.append('qemu-efi')
159 yield "packages", packages175 yield "packages", packages

Subscribers

People subscribed via source and target branches