Merge ~davecore/charm-sysconfig:master-charm-sysconfig-mellanox-ringsize into charm-sysconfig:master

Proposed by David Coronel
Status: Rejected
Rejected by: Haw Loeung
Proposed branch: ~davecore/charm-sysconfig:master-charm-sysconfig-mellanox-ringsize
Merge into: charm-sysconfig:master
Diff against target: 138 lines (+66/-0)
4 files modified
src/config.yaml (+6/-0)
src/lib/lib_sysconfig.py (+46/-0)
src/reactive/sysconfig.py (+6/-0)
src/templates/set-ringsize.sh.j2 (+8/-0)
Reviewer Review Type Date Requested Status
Alvaro Uria (community) Needs Fixing
BootStack Reviewers mr tracking; do not claim Pending
BootStack Reviewers Pending
Canonical IS Reviewers Pending
Review via email: mp+378638@code.launchpad.net

Commit message

Added Mellanox ConnectX ringsize support

To post a comment you must log in.
Revision history for this message
🤖 Canonical IS Merge Bot (canonical-is-mergebot) wrote :

This merge proposal is being monitored by mergebot. Change the status to Approved to merge.

Revision history for this message
Alvaro Uria (aluria) wrote :

Hi David. I've added several comments inline. Could you please also add unit tests for the changes applied as well as file a bug this MP can be linked to?

Thank you.

review: Needs Fixing
Revision history for this message
David Coronel (davecore) wrote :

Hi Alvaro. Sorry for taking so long to get back to this. I added comments inline as well.

I filed this LP bug https://bugs.launchpad.net/charm-sysconfig/+bug/1869233

Revision history for this message
Alvaro Uria (aluria) wrote :

Hi David. I have added some replies inline. Thank you.

Revision history for this message
Drew Freiberger (afreiberger) wrote :

I'm really not sold on determining the nics in the charm. If this is part of networkd-dispatcher service, shouldn't the startup script probe current connectx nics at boot time and configure them all based on their current names. what happens if you add a new nic, or reseat a nic in the wrong slot.

I'd assume the charm wouldn't update the startup script until well after startup, which both triggers issues of timing of network card changes to when there are applications running, and potentially not having the changes take effect until a second reboot of the host taking place.

I might suggest moving all the syshelper hook logic into the script and only templatize the size to set, or even just drop the size as another file on disk rather than having a python/jinja generated shell script.

Also, +1 Alvaro's suggestion to have an action for "tune-connectx-rings" as an operator timed action to happen after the startup script and ring size configuration have landed.

Unmerged commits

a8e89de... by David Coronel

added removal of mellanox changes and missing install step

496af1c... by David Coronel

added exit 0 to script in case ringsize is already same size

7fbf73f... by David Coronel

Added Mellanox ConnectX ringsize support

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1diff --git a/src/config.yaml b/src/config.yaml
2index bd60238..d30aee2 100644
3--- a/src/config.yaml
4+++ b/src/config.yaml
5@@ -122,3 +122,9 @@ options:
6 Recommended option when Bios control power is set to the OS.
7 - 'performance'
8 - 'powersave'
9+ mellanox-connectx-ringsize:
10+ type: string
11+ default: ""
12+ description: |
13+ Configure the value of RX/TX ring size for Mellanox Connect-X cards
14+ to this value.
15diff --git a/src/lib/lib_sysconfig.py b/src/lib/lib_sysconfig.py
16index 51bf049..a58cfa7 100644
17--- a/src/lib/lib_sysconfig.py
18+++ b/src/lib/lib_sysconfig.py
19@@ -16,11 +16,13 @@ GRUB_DEFAULT = 'Advanced options for Ubuntu>Ubuntu, with Linux {}'
20 CPUFREQUTILS_TMPL = 'cpufrequtils.j2'
21 GRUB_CONF_TMPL = 'grub.j2'
22 SYSTEMD_SYSTEM_TMPL = 'etc.systemd.system.conf.j2'
23+SET_RINGSIZE_SCRIPT_TMPL = 'set-ringsize.sh.j2'
24
25 CPUFREQUTILS = '/etc/default/cpufrequtils'
26 GRUB_CONF = '/etc/default/grub.d/90-sysconfig.cfg'
27 SYSTEMD_SYSTEM = '/etc/systemd/system.conf'
28 KERNEL = 'kernel'
29+SET_RINGSIZE_SCRIPT = '/usr/lib/networkd-dispatcher/routable.d/set-ringsize.sh'
30
31
32 def parse_config_flags(config_flags):
33@@ -185,6 +187,27 @@ class SysConfigHelper:
34 flags = config_flags_parser(self.charm_config['config-flags'])
35 return flags
36
37+ @property
38+ def mellanox_connectx_ringsize(self):
39+ """Return mellanox-connectx-ringsize config option."""
40+ return self.charm_config['mellanox-connectx-ringsize']
41+
42+ @property
43+ def mellanox_connectx_nics(self):
44+ """Return Mellanoc ConnectX NICs."""
45+ if not self.charm_config.get('mellanox-connectx-ringsize'):
46+ return []
47+ mellanoxdevices = []
48+ for filename in os.listdir('/sys/class/net/'):
49+ fullpath = '/sys/class/net/' + filename + '/device/uevent'
50+ if os.path.isfile(fullpath):
51+ with open(fullpath) as f:
52+ for line in f:
53+ if 'DRIVER=mlx5_core' in line:
54+ mellanoxdevices.append(filename)
55+ break
56+ return mellanoxdevices
57+
58 def _render_boot_resource(self, source, target, context):
59 """Render the template and set the resource as changed."""
60 render(source=source, templates_dir='templates', target=target, context=context)
61@@ -313,6 +336,21 @@ class SysConfigHelper:
62
63 host.service_restart('cpufrequtils')
64
65+ def configure_mellanox_connectx_ringsize(self):
66+ """Configure Mellanox ringsize as given by the mellanox-connectx-ringsize option.
67+
68+ Will generate networkd-dispatcher script and run it as well
69+ """
70+ context = {}
71+ if self.mellanox_connectx_nics:
72+ context['mellanox_connectx_nics'] = self.mellanox_connectx_nics
73+ if self.mellanox_connectx_ringsize:
74+ context['mellanox_connectx_ringsize'] = self.mellanox_connectx_ringsize
75+ self._render_boot_resource(SET_RINGSIZE_SCRIPT_TMPL, SET_RINGSIZE_SCRIPT, context)
76+ hookenv.log('Mellanox ConnectX ringsize script updated via networkd-dispatcher')
77+ subprocess.check_call(['/bin/chmod', '755', SET_RINGSIZE_SCRIPT])
78+ subprocess.check_call([SET_RINGSIZE_SCRIPT])
79+
80 def remove_grub_configuration(self):
81 """Remove /etc/default/grub.d/90-sysconfig.cfg if exists.
82
83@@ -361,3 +399,11 @@ class SysConfigHelper:
84 hookenv.DEBUG
85 )
86 host.service_restart('cpufrequtils')
87+
88+ def remove_mellanox_connectx_ringsize(self):
89+ """Remove Mellanox ringsize configuration.
90+
91+ Will remove networkd-dispatcher script.
92+ """
93+ os.remove(SET_RINGSIZE_SCRIPT)
94+ hookenv.log('Mellanox ConnectX ringsize script removed')
95diff --git a/src/reactive/sysconfig.py b/src/reactive/sysconfig.py
96index eba0916..501ff0f 100644
97--- a/src/reactive/sysconfig.py
98+++ b/src/reactive/sysconfig.py
99@@ -54,6 +54,7 @@ def install_sysconfig():
100 syshelper.update_cpufreq()
101 syshelper.update_grub_file()
102 syshelper.update_systemd_system_file()
103+ syshelper.configure_mellanox_connectx_ringsize()
104 set_flag('sysconfig.installed')
105 update_status()
106
107@@ -103,6 +104,10 @@ def config_changed():
108 )) or helpers.any_file_changed([SYSTEMD_SYSTEM]):
109 syshelper.update_systemd_system_file()
110
111+ # Mellanox ConnectX ringsize
112+ if syshelper.charm_config.changed('mellanox-connectx-ringsize'):
113+ syshelper.configure_mellanox_connectx_ringsize()
114+
115 update_status()
116
117
118@@ -142,5 +147,6 @@ def remove_configuration():
119 syshelper.remove_cpufreq_configuration()
120 syshelper.remove_grub_configuration()
121 syshelper.remove_systemd_configuration()
122+ syshelper.remove_mellanox_connectx_ringsize()
123 clear_flag('sysconfig.installed')
124 clear_flag('sysconfig.unsupported')
125diff --git a/src/templates/set-ringsize.sh.j2 b/src/templates/set-ringsize.sh.j2
126new file mode 100644
127index 0000000..024dfda
128--- /dev/null
129+++ b/src/templates/set-ringsize.sh.j2
130@@ -0,0 +1,8 @@
131+#!/bin/bash
132+# Set rx-ringsize and tx-ringsize to card max
133+{% if mellanox_connectx_nics is defined and mellanox_connectx_nics -%}
134+{% for nic in mellanox_connectx_nics -%}
135+/sbin/ethtool -G {{ nic }} rx {{ mellanox_connectx_ringsize }} tx {{ mellanox_connectx_ringsize }}
136+{% endfor -%}
137+{% endif -%}
138+exit 0

Subscribers

People subscribed via source and target branches

to all changes: