Merge ~jfguedez/charm-telegraf:feature/intel-cmt-cat into charm-telegraf:master
Status: | Merged |
---|---|
Approved by: | Xav Paice |
Approved revision: | ed52df526595684011a7df9946549e34519ce560 |
Merged at revision: | fb9446ae1a98b8b25478dfb0048a07ed0ec59bda |
Proposed branch: | ~jfguedez/charm-telegraf:feature/intel-cmt-cat |
Merge into: | charm-telegraf:master |
Diff against target: | 1061 lines (+854/-28), 6 files modified: src/config.yaml (+18/-1), src/reactive/telegraf.py (+199/-23), src/templates/base_inputs.conf (+7/-0), src/templates/dashboards/grafana/IntelRDT.json.j2 (+508/-0), src/templates/sudoers/telegraf_intel_rdt.tmpl (+4/-0), src/tests/unit/test_telegraf.py (+118/-4) |
Related bugs: |
Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
🤖 prod-jenkaas-bootstack | continuous-integration | | Approve |
Celia Wang | | | Approve |
Joe Guo (community) | | | Needs Fixing |
Junien F | | | Approve |
Edin S (community) | | | Approve |
Canonical IS Reviewers | | | Pending |
Review via email: mp+405064@code.launchpad.net |
Commit message
Add support for Memory Bandwidth Monitoring (Intel RDT)
Description of the change
🤖 Canonical IS Merge Bot (canonical-is-mergebot) wrote : | # |
🤖 prod-jenkaas-bootstack (prod-jenkaas-bootstack) wrote : | # |
A CI job is currently in progress. A follow up comment will be added when it completes.
🤖 prod-jenkaas-bootstack (prod-jenkaas-bootstack) wrote : | # |
FAILED: Continuous integration, rev:e9451b441b2
https:/
Executed test runs:
FAILURE: https:/
None: https:/
Click here to trigger a rebuild:
https:/
Jose Guedez (jfguedez) wrote (last edit ): | # |
It seems that the CI might have some issues. I don't think it has had a successful build yet. The failing tests are unrelated to the changes afaict.
When I run the unit tests locally there are no failures before/after the change fwiw - https:/
James Troup (elmo) wrote : | # |
LGTM, one minor comment inline.
🤖 prod-jenkaas-bootstack (prod-jenkaas-bootstack) wrote : | # |
A CI job is currently in progress. A follow up comment will be added when it completes.
Jose Guedez (jfguedez) wrote : | # |
@James Troup. Thanks, I replied inline and will be adding the comment back
🤖 prod-jenkaas-bootstack (prod-jenkaas-bootstack) wrote : | # |
FAILED: Continuous integration, rev:913ac81e423
https:/
Executed test runs:
FAILURE: https:/
None: https:/
Click here to trigger a rebuild:
https:/
🤖 prod-jenkaas-bootstack (prod-jenkaas-bootstack) wrote : | # |
A CI job is currently in progress. A follow up comment will be added when it completes.
Jose Guedez (jfguedez) wrote : | # |
All unit tests pass - https:/
There's an issue with the CI that is being addressed in https:/
🤖 prod-jenkaas-bootstack (prod-jenkaas-bootstack) wrote : | # |
FAILED: Continuous integration, rev:f549208353e
https:/
Executed test runs:
FAILURE: https:/
None: https:/
Click here to trigger a rebuild:
https:/
Junien F (axino) wrote : | # |
See below about the sudoers file - thanks !
Joe Guo (guoqiao) wrote : | # |
for function `check_
1) to report issues, it uses both an exception and a string message; maybe just use one way. I would prefer exceptions.
2) it returns an empty str (falsy) as ok, which may be misleading or be misused.
3) for the kernel version comparison, I noticed[0] there are versions like `5.13`?
Can we also use the `fetch.
[0]: https:/
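The exception-only approach suggested in points 1) and 2) could look roughly like the sketch below. The names `InvalidConfigurationError`, `check_requirements` and `describe` are hypothetical and chosen for illustration; they are not the charm's actual function names.

```python
class InvalidConfigurationError(Exception):
    """Hypothetical error type raised when a requirement is not met."""


def check_requirements(kernel_version, minimum=(5, 4)):
    """Raise on the first unmet requirement; return None when all pass."""
    if kernel_version < minimum:
        raise InvalidConfigurationError(
            "unsupported kernel version: {}, need at least {}".format(
                kernel_version, minimum
            )
        )


def describe(kernel_version):
    """Caller side: one try/except instead of testing a returned string."""
    try:
        check_requirements(kernel_version)
    except InvalidConfigurationError as error:
        return "blocked: {}".format(error)
    return "ok"
```

With a single signalling mechanism the caller never has to interpret an empty string as success, which is the misuse risk raised in point 2).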
🤖 prod-jenkaas-bootstack (prod-jenkaas-bootstack) wrote : | # |
A CI job is currently in progress. A follow up comment will be added when it completes.
🤖 prod-jenkaas-bootstack (prod-jenkaas-bootstack) wrote : | # |
FAILED: Continuous integration, rev:c3b6d8aaf37
https:/
Executed test runs:
FAILURE: https:/
None: https:/
Click here to trigger a rebuild:
https:/
Jose Guedez (jfguedez) wrote : | # |
@axino:
Thanks, in this case the plugin executes the sudo command only once when the telegraf service starts. However, I did add the extra commands to the sudoers file to avoid logging the command. Please take a look again.
🤖 prod-jenkaas-bootstack (prod-jenkaas-bootstack) wrote : | # |
A CI job is currently in progress. A follow up comment will be added when it completes.
🤖 prod-jenkaas-bootstack (prod-jenkaas-bootstack) wrote : | # |
FAILED: Continuous integration, rev:6b146fc34fa
https:/
Executed test runs:
FAILURE: https:/
None: https:/
Click here to trigger a rebuild:
https:/
Jose Guedez (jfguedez) wrote (last edit ): | # |
@guoqiao
Thanks, please see comments inline. Addressed in the latest push.
> for function `check_
>
> 1) to report issues, it used both exception and string message, maybe just use
> one way. I will prefer exception.
>
> 2) it returns empty str (false) as ok, which maybe misleading or be misused.
>
I had originally wanted to use the exception as a separate mechanism, but I can see how it would be confusing. I definitely agree about the empty string, so I switched it to use exceptions.
> 3) for kernel version compare, I noticed[0] there is version like `5.13`?
>
> Can we also use the `fetch.
>
> [0]: https:/
According to [0], there is always a third number. However, it seems that at least in Ubuntu it is always zero, so it has no meaning, as it doesn't match the third number from upstream. You can see the full table of Ubuntu/upstream versions here [1]; they all seem to have three numbers. The first two numbers (major, minor) always match the kernel version, so I changed the validation to use only those (e.g. 5.4), which should be enough for our purposes here.
As to using the apt_pkg version for this, you could have multiple kernels installed, some good, some bad (for example in bionic you need the HWE kernel) so it is more reliable to use the version of the running kernel.
I believe the comments/changes address the issues you brought up. Please take a look again, thanks.
[0] https:/
[1] https:/
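As a rough illustration of the (major, minor) comparison described above, the following sketch parses a release string and compares it as a tuple of integers. The helper names `parse_kernel_version` and `kernel_is_supported` are assumptions made for this example; the regular expression and the `(5, 4)` minimum mirror the ones in the proposed diff.

```python
import re

# Minimum (major, minor) kernel version, as used in the proposed change.
MINIMUM_KERNEL_VERSION = (5, 4)


def parse_kernel_version(release):
    """Extract (major, minor) from a string such as '5.4.0-73-generic'."""
    match = re.match(r"^(\d+)\.(\d+)", release)
    if not match:
        raise ValueError("unrecognised kernel release: {}".format(release))
    return tuple(int(part) for part in match.groups())


def kernel_is_supported(release, minimum=MINIMUM_KERNEL_VERSION):
    """Compare as integer tuples, so that 5.13 sorts after 5.4."""
    return parse_kernel_version(release) >= minimum
```

In the charm the input would come from `platform.release()`, i.e. the running kernel rather than an installed package. Comparing integer tuples also handles a two-part release such as `5.13`, which a plain string comparison would wrongly rank below `5.4`.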
Junien F (axino) wrote : | # |
Thanks for the sudoers change !
Joe Guo (guoqiao) wrote : | # |
Hi Jose,
Thanks for the quick change, another small question:
In doc[0], it mentioned the required minimal pqos version is `4.0.0`.
But here in code we are using `RDT_MINIMUM_
I understand we have to use a PPA to backport for bionic, but in my understanding, it would be more generic and reliable to use `4.0.0` here?
[0]: https:/
Joe Guo (guoqiao) wrote : | # |
Jose has explained the version issue in chat. +1.
Joe Guo (guoqiao) wrote : | # |
+1, but worth noting:
according to doc[0], so far telegraf cannot stop the rdt plugin with sudo=true.
2 potential solutions are suggested there.
Before the final solution on the telegraf side is released, we may need to provide a workaround for the charm to work.
[0]: https:/
Joe Guo (guoqiao) wrote : | # |
Hi Jose:
I am doing some testing with this patch in an LXD container (for the conditions-unmet case), and I noticed that `kernel.modprobe` will raise an exception: https:/
Instead of creating a new patch, I am wondering whether you could apply the following change to your code and re-push, so we can keep the review history here, please?
diff --git a/src/reactive/
index 0b1e71d..8bfd801 100644
--- a/src/reactive/
+++ b/src/reactive/
@@ -807,7 +807,14 @@ def configure_
if config[
# load and persist the required module
- kernel.
+ try:
+ kernel.
+ except subprocess.
+ error_msg = "modprobe {} failed"
+ hookenv.
+ hookenv.
+ return
+
try:
except InvalidIntelRDT
🤖 prod-jenkaas-bootstack (prod-jenkaas-bootstack) wrote : | # |
A CI job is currently in progress. A follow up comment will be added when it completes.
🤖 prod-jenkaas-bootstack (prod-jenkaas-bootstack) wrote : | # |
FAILED: Continuous integration, rev:38b6c858a18
https:/
Executed test runs:
FAILURE: https:/
None: https:/
Click here to trigger a rebuild:
https:/
Joe Guo (guoqiao) wrote : | # |
@jfguedez CI failed and there is an unresolved merge conflict in the code.
Joe Guo (guoqiao) wrote : | # |
Re: the `modprobe msr` failure in lxc/lxd, I am able to reproduce it with:
lxc launch ubuntu:20.04 ubuntu
lxc exec ubuntu -- bash
root@ubuntu:~# modprobe msr
modprobe: FATAL: Module msr not found in directory /lib/modules/
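A minimal sketch of guarding against this failure is shown below. It assumes the module is loaded by calling `modprobe` directly through `subprocess`, whereas the charm itself goes through `charmhelpers.core.kernel.modprobe`. The `load_module` helper and its injectable `runner` parameter are hypothetical and exist only so the failure path can be exercised without root access.

```python
import subprocess


def load_module(module, runner=subprocess.check_call):
    """Try to load a kernel module; return (ok, message) instead of raising."""
    try:
        runner(["modprobe", module])
    except (subprocess.CalledProcessError, FileNotFoundError) as error:
        # In a container the module tree is usually absent, so modprobe
        # exits non-zero; report it rather than crash the hook.
        return False, "modprobe {} failed: {}".format(module, error)
    return True, ""


def failing_runner(command):
    """Stand-in for modprobe inside a container: always exits with status 1."""
    raise subprocess.CalledProcessError(1, command)


ok, message = load_module("msr", runner=failing_runner)
```

A caller could then log `message` and set a blocked status instead of letting the unhandled exception abort the hook.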
Celia Wang (ziyiwang) : | # |
🤖 prod-jenkaas-bootstack (prod-jenkaas-bootstack) wrote : | # |
A CI job is currently in progress. A follow up comment will be added when it completes.
🤖 prod-jenkaas-bootstack (prod-jenkaas-bootstack) wrote : | # |
FAILED: Continuous integration, rev:66114d61675
https:/
Executed test runs:
FAILURE: https:/
None: https:/
Click here to trigger a rebuild:
https:/
Joe Guo (guoqiao) wrote : | # |
new changes pushed:
1) rebased against master to resolve conflicts.
2) block charm if collect_
3) rename rdt option `sudo` to `use_sudo`
for 3), the purpose is to be consistent with the existing plugins.
upstream patch: https:/
new ppa built with above patch:
ppa:guoqiao/
to use ppa:
juju config telegraf install_
New review appreciated !
🤖 prod-jenkaas-bootstack (prod-jenkaas-bootstack) wrote : | # |
A CI job is currently in progress. A follow up comment will be added when it completes.
🤖 prod-jenkaas-bootstack (prod-jenkaas-bootstack) wrote : | # |
FAILED: Continuous integration, rev:0e452420071
https:/
Executed test runs:
FAILURE: https:/
None: https:/
Click here to trigger a rebuild:
https:/
Joe Guo (guoqiao) wrote : | # |
Unit tests work on my local machine but fail on CI.
I have triggered another CI job on master to see how it works:
https:/
🤖 prod-jenkaas-bootstack (prod-jenkaas-bootstack) wrote : | # |
A CI job is currently in progress. A follow up comment will be added when it completes.
🤖 prod-jenkaas-bootstack (prod-jenkaas-bootstack) wrote : | # |
FAILED: Continuous integration, rev:1f283785475
https:/
Executed test runs:
FAILURE: https:/
None: https:/
Click here to trigger a rebuild:
https:/
🤖 prod-jenkaas-bootstack (prod-jenkaas-bootstack) wrote : | # |
A CI job is currently in progress. A follow up comment will be added when it completes.
🤖 prod-jenkaas-bootstack (prod-jenkaas-bootstack) wrote : | # |
FAILED: Continuous integration, rev:1f283785475
https:/
Executed test runs:
FAILURE: https:/
None: https:/
Click here to trigger a rebuild:
https:/
🤖 prod-jenkaas-bootstack (prod-jenkaas-bootstack) wrote : | # |
A CI job is currently in progress. A follow up comment will be added when it completes.
🤖 prod-jenkaas-bootstack (prod-jenkaas-bootstack) wrote : | # |
FAILED: Continuous integration, rev:ed52df52659
https:/
Executed test runs:
FAILURE: https:/
None: https:/
Click here to trigger a rebuild:
https:/
🤖 prod-jenkaas-bootstack (prod-jenkaas-bootstack) wrote : | # |
PASSED: Continuous integration, rev:ed52df52659
https:/
Executed test runs:
SUCCESS: https:/
None: https:/
Click here to trigger a rebuild:
https:/
🤖 Canonical IS Merge Bot (canonical-is-mergebot) wrote : | # |
Change successfully merged at revision fb9446ae1a98b8b
Preview Diff
1 | diff --git a/src/config.yaml b/src/config.yaml |
2 | index e4f6f04..eee3fbe 100644 |
3 | --- a/src/config.yaml |
4 | +++ b/src/config.yaml |
5 | @@ -233,4 +233,21 @@ options: |
6 | description: > |
7 | Enable the collection of IPMI sensor metrics, using the ipmi sensor telegraf |
8 | input plugin. Collecting these metrics requires sudo access - enabling |
9 | - this option will install an appropriate, locked-down sudoers file. |
10 | \ No newline at end of file |
11 | + this option will install an appropriate, locked-down sudoers file. |
12 | + collect_intel_rdt_metrics: |
13 | + default: false |
14 | + type: boolean |
15 | + description: > |
16 | + Enable the collection of Intel memory bandwidth metrics, using the |
17 | + telegraf intel_rdt input plugin. Collecting these metrics requires sudo |
18 | + access - enabling this option will install an appropriate, locked-down |
19 | + sudoers file. |
20 | + . |
21 | + There are certain requisites to run this plugin, including having a |
22 | + kernel >= v5.4, Intel RDT tools >= v4.1 available in a repository, and |
23 | + a supported CPU (as reported by the Intel utility `pqos`) |
24 | + . |
25 | + Currently the charm will configure monitoring of all detected cores. |
26 | + . |
27 | + See https://github.com/influxdata/telegraf/blob/master/plugins/inputs/intel_rdt/README.md |
28 | + for info on the telegraf intel_rdt plugin. |
29 | diff --git a/src/reactive/telegraf.py b/src/reactive/telegraf.py |
30 | index ef89a47..7a19d16 100644 |
31 | --- a/src/reactive/telegraf.py |
32 | +++ b/src/reactive/telegraf.py |
33 | @@ -22,6 +22,7 @@ import io |
34 | import ipaddress |
35 | import json |
36 | import os |
37 | +import platform |
38 | import re |
39 | import socket |
40 | import subprocess |
41 | @@ -29,9 +30,9 @@ import sys |
42 | import time |
43 | from distutils.version import LooseVersion |
44 | |
45 | -from charmhelpers import context |
46 | +from charmhelpers import context, fetch |
47 | from charmhelpers.contrib.charmsupport import nrpe |
48 | -from charmhelpers.core import hookenv, host, unitdata |
49 | +from charmhelpers.core import hookenv, host, kernel, unitdata |
50 | from charmhelpers.core.host import is_container |
51 | from charmhelpers.core.templating import render |
52 | |
53 | @@ -67,9 +68,18 @@ CONFIG_FILE = "telegraf.conf" |
54 | |
55 | CONFIG_DIR = "telegraf.d" |
56 | |
57 | -GRAFANA_DASHBOARD_TELEGRAF_FILE_NAME = "Telegraf.json.j2" |
58 | - |
59 | -GRAFANA_DASHBOARD_NAME = "telegraf" |
60 | +GRAFANA_DASHBOARD_CONFIG = { |
61 | + "telegraf": { |
62 | + "template_file": "Telegraf.json.j2", |
63 | + "context_vars": { |
64 | + # TODO: Figure out if metrics exist and then set bools accordingly. |
65 | + # For now, setting bools to true. |
66 | + "bonds_enabled": True, |
67 | + "bcache_enabled": True, |
68 | + "conntrack_enabled": True, |
69 | + }, |
70 | + }, |
71 | +} |
72 | |
73 | SNAP_SERVICE = "snap.telegraf.telegraf" |
74 | DEB_SERVICE = "telegraf" |
75 | @@ -83,6 +93,11 @@ DEB_USER = "telegraf" |
76 | |
77 | # Utilities # |
78 | |
79 | +# constants related to RDT metrics support |
80 | +RDT_MINIMUM_KERNEL_VERSION = (5, 4) |
81 | +RDT_MINIMUM_PKG_VERSION = "4.1-1ppa3" |
82 | +RDT_KERNEL_MODULE_NAME = "msr" |
83 | + |
84 | |
85 | class InvalidInstallMethodError(Exception): |
86 | pass |
87 | @@ -92,6 +107,10 @@ class InvalidPrometheusIPRangeError(Exception): |
88 | pass |
89 | |
90 | |
91 | +class InvalidIntelRDTConfigurationError(Exception): |
92 | + pass |
93 | + |
94 | + |
95 | def write_telegraf_file(path, content): |
96 | return host.write_file( |
97 | path, |
98 | @@ -283,6 +302,96 @@ def get_remote_unit_name(): |
99 | return rel["__unit__"] |
100 | |
101 | |
102 | +def check_valid_intel_rdt_configuration(): |
103 | + """ |
104 | + Check that the requirements for RDT are met. |
105 | + |
106 | + Will raise a InvalidIntelRDTConfigurationError exception when a validation |
107 | + issue is encountered, otherwise None |
108 | + """ |
109 | + # check that we meet the minimum kernel version |
110 | + linux_release = platform.release() # format is like '5.4.0-73-generic' |
111 | + re_kernel_version = r"^(\d+)\.(\d+)" |
112 | + match = re.match(re_kernel_version, linux_release) |
113 | + |
114 | + if match: |
115 | + current_kernel_version = tuple(int(d) for d in match.groups()) |
116 | + if current_kernel_version < RDT_MINIMUM_KERNEL_VERSION: |
117 | + raise InvalidIntelRDTConfigurationError( |
118 | + "unsupported kernel version: {}, need version higher than {}".format( |
119 | + current_kernel_version, RDT_MINIMUM_KERNEL_VERSION |
120 | + ) |
121 | + ) |
122 | + else: |
123 | + raise InvalidIntelRDTConfigurationError( |
124 | + "Incompatible platform.release output: {}".format(linux_release) |
125 | + ) |
126 | + |
127 | + # check that package `intel-cmt-cat` is installed |
128 | + current_pkg_version = fetch.get_installed_version("intel-cmt-cat") |
129 | + if not current_pkg_version: |
130 | + raise InvalidIntelRDTConfigurationError( |
131 | + "package 'intel-cmt-cat' is not installed yet" |
132 | + ) |
133 | + |
134 | + current_pkg_version_str = current_pkg_version["ver_str"] |
135 | + |
136 | + # check that package `intel-cmt-cat` is recent enough |
137 | + if ( |
138 | + fetch.apt_pkg.version_compare(current_pkg_version_str, RDT_MINIMUM_PKG_VERSION) |
139 | + < 0 # noqa: W503 |
140 | + ): |
141 | + base_error_msg = "package 'intel-cmt-cat' is older than required" |
142 | + raise InvalidIntelRDTConfigurationError( |
143 | + "{}: '{}' (installed '{}')".format( |
144 | + base_error_msg, RDT_MINIMUM_PKG_VERSION, current_pkg_version_str |
145 | + ) |
146 | + ) |
147 | + |
148 | + # check that the required module is loaded |
149 | + if not kernel.is_module_loaded(RDT_KERNEL_MODULE_NAME): |
150 | + raise InvalidIntelRDTConfigurationError( |
151 | + "required module '{}' is not loaded".format(RDT_KERNEL_MODULE_NAME) |
152 | + ) |
153 | + |
154 | + # check that the `pqos` utility reports no issues |
155 | + # this performs a sanity check on the RDT utility configuration |
156 | + command = ["sudo", "pqos", "-d"] |
157 | + try: |
158 | + subprocess.check_call(command) |
159 | + # this performs a sanity check on the RDT utility configuration |
160 | + except subprocess.CalledProcessError as error: |
161 | + hookenv.log( |
162 | + "pqos -d call failed:\n{}".format(error.output.decode("utf8")), |
163 | + level=hookenv.ERROR, |
164 | + ) |
165 | + raise InvalidIntelRDTConfigurationError("pqos -d failed, see logs for details") |
166 | + |
167 | + return None |
168 | + |
169 | + |
170 | +def get_cpu_cores(): |
171 | + """Get the list of available cores for the cpu(s).""" |
172 | + # should return something like ["0-23"] |
173 | + command = ["lscpu", "--json"] |
174 | + try: |
175 | + lscpu_output = subprocess.check_output(command).decode("utf8") |
176 | + except subprocess.CalledProcessError as error: |
177 | + hookenv.log( |
178 | + "lscpu call failed:\n{}".format(error.output.decode("utf8")), |
179 | + level=hookenv.ERROR, |
180 | + ) |
181 | + raise error |
182 | + |
183 | + lscpu_json = json.loads(lscpu_output) |
184 | + |
185 | + for data_pair in lscpu_json["lscpu"]: |
186 | + if data_pair["field"] == "On-line CPU(s) list:": |
187 | + return '["{}"]'.format(data_pair["data"]) |
188 | + |
189 | + raise Exception("Incompatible lscpu output: {}".format(lscpu_output)) |
190 | + |
191 | + |
192 | def get_disabled_plugins(): |
193 | """Return consolidated list of all plugins to be disabled.""" |
194 | config = hookenv.config() |
195 | @@ -324,6 +433,13 @@ def get_base_inputs(): |
196 | ipmi_sensor = config["collect_ipmi_sensor_metrics"] |
197 | disabled_plugins = get_disabled_plugins() |
198 | |
199 | + # handle the Intel RDT collection parameters |
200 | + intel_rdt = config["collect_intel_rdt_metrics"] |
201 | + if intel_rdt: |
202 | + intel_rdt_cores = get_cpu_cores() |
203 | + else: |
204 | + intel_rdt_cores = None |
205 | + |
206 | return { |
207 | "extra_options": extra_options["inputs"], |
208 | "bcache": is_bcache(), |
209 | @@ -334,6 +450,8 @@ def get_base_inputs(): |
210 | "iptables": iptables, |
211 | "smart": smart, |
212 | "ipmi_sensor": ipmi_sensor, |
213 | + "intel_rdt": intel_rdt, |
214 | + "intel_rdt_cores": intel_rdt_cores, |
215 | } |
216 | |
217 | |
218 | @@ -694,6 +812,38 @@ def configure_telegraf(): # noqa: C901 |
219 | else: |
220 | remove_sudoers_file(sudoers_filename) |
221 | |
222 | + # handle the configuration of intel_rdt |
223 | + sudoers_filename = "telegraf_intel_rdt" |
224 | + if config["collect_intel_rdt_metrics"]: |
225 | + hookenv.log("Intel RDT enabled, enabling module and running checks") |
226 | + |
227 | + if is_container(): |
228 | + error_msg = "Intel RDT can not be enabled in container" |
229 | + hookenv.log(error_msg, level=hookenv.WARNING) |
230 | + hookenv.status_set("blocked", error_msg) |
231 | + return |
232 | + |
233 | + # load and persist the required module |
234 | + try: |
235 | + kernel.modprobe(RDT_KERNEL_MODULE_NAME, persist=True) |
236 | + except subprocess.CalledProcessError: |
237 | + error_msg = "modprobe {} failed".format(RDT_KERNEL_MODULE_NAME) |
238 | + hookenv.log(error_msg, level=hookenv.ERROR) |
239 | + hookenv.status_set("blocked", error_msg) |
240 | + return |
241 | + |
242 | + try: |
243 | + check_valid_intel_rdt_configuration() |
244 | + except InvalidIntelRDTConfigurationError as e: |
245 | + # on error we abort configuration and block the charm |
246 | + error_msg = "Cannot configure Intel RDT: {}".format(e) |
247 | + hookenv.log(error_msg, level=hookenv.ERROR) |
248 | + hookenv.status_set("blocked", error_msg) |
249 | + return |
250 | + render_sudoers_file(sudoers_filename) |
251 | + else: |
252 | + remove_sudoers_file(sudoers_filename) |
253 | + |
254 | telegraf_exec_metrics = os.path.join(get_files_dir(), "telegraf_exec_metrics.py") |
255 | cmd = [ |
256 | telegraf_exec_metrics, |
257 | @@ -720,7 +870,12 @@ def configure_telegraf(): # noqa: C901 |
258 | for service in [DEB_SERVICE, SNAP_SERVICE]: |
259 | if service == get_service(): |
260 | host.service_resume(service) |
261 | - host.service_reload(service) |
262 | + # skip reload when Intel RDT is enabled, as it stops the plugin from |
263 | + # publishing data. The service will be restarted via the flag |
264 | + # "telegraf.needs_reload" on changes later |
265 | + if not config["collect_intel_rdt_metrics"]: |
266 | + hookenv.log("reloading service: {}".format(service), level="DEBUG") |
267 | + host.service_reload(service) |
268 | else: |
269 | try: |
270 | host.service_pause(service) |
271 | @@ -846,6 +1001,13 @@ def handle_config_changes(): |
272 | ): |
273 | clear_flag("plugins.prometheus-client.configured") |
274 | clear_flag("prometheus-client.relation.configured") |
275 | + |
276 | + # handle the Intel RDT/MBM metrics collection |
277 | + if config.get("collect_intel_rdt_metrics"): |
278 | + set_flag("telegraf.intel_rdt.enabled") |
279 | + else: |
280 | + clear_flag("telegraf.intel_rdt.enabled") |
281 | + |
282 | clear_flag("telegraf.configured") |
283 | clear_flag("telegraf.apt.configured") |
284 | clear_flag("telegraf.snap.configured") |
285 | @@ -1556,32 +1718,40 @@ def prometheus_client_departed(): |
286 | ) |
287 | @when_not("grafana.configured") |
288 | def register_grafana_dashboard(): |
289 | + config = hookenv.config() |
290 | grafana = endpoint_from_flag("endpoint.dashboards.joined") |
291 | - hookenv.log("Loading grafana dashboard", level=hookenv.DEBUG) |
292 | - dashboard = _load_grafana_dashboard() |
293 | - digest = hashlib.md5(dashboard.encode("utf8")).hexdigest() |
294 | - dashboard_dict = json.loads(dashboard) |
295 | - dashboard_dict["digest"] = digest |
296 | - hookenv.log( |
297 | - "Rendered dashboard dict:\n{}".format(dashboard_dict), level=hookenv.DEBUG |
298 | - ) |
299 | - grafana.register_dashboard(name=GRAFANA_DASHBOARD_NAME, dashboard=dashboard_dict) |
300 | - hookenv.log('Grafana dashboard "{}" registered.'.format(GRAFANA_DASHBOARD_NAME)) |
301 | + grafana_dashboard_config = GRAFANA_DASHBOARD_CONFIG.copy() |
302 | + |
303 | + # if RDT is enabled inject the relevant dashboard config |
304 | + if config["collect_intel_rdt_metrics"]: |
305 | + grafana_dashboard_config["Intel RDT"] = {"template_file": "IntelRDT.json.j2"} |
306 | + |
307 | + # process all the configured dashboards |
308 | + for dashboard_name, dashboard_data in grafana_dashboard_config.items(): |
309 | + hookenv.log( |
310 | + "Loading grafana dashboard: {}".format(dashboard_name), level=hookenv.DEBUG |
311 | + ) |
312 | + dashboard = _load_grafana_dashboard(dashboard_data) |
313 | + digest = hashlib.md5(dashboard.encode("utf8")).hexdigest() |
314 | + dashboard_dict = json.loads(dashboard) |
315 | + dashboard_dict["digest"] = digest |
316 | + hookenv.log( |
317 | + "Rendered dashboard dict:\n{}".format(dashboard_dict), level=hookenv.DEBUG |
318 | + ) |
319 | + grafana.register_dashboard(name=dashboard_name, dashboard=dashboard_dict) |
320 | + hookenv.log('Grafana dashboard "{}" registered.'.format(dashboard_name)) |
321 | + |
322 | set_flag("grafana.configured") |
323 | |
324 | |
325 | -def _load_grafana_dashboard(): |
326 | +def _load_grafana_dashboard(dashboard_data): |
327 | prometheus_datasource = "{} - Juju generated source".format( |
328 | hookenv.config().get("prometheus_datasource", "prometheus") |
329 | ) |
330 | dashboard_context = dict(datasource=prometheus_datasource) |
331 | - # TODO: Figure out if metrics exist and then set bools accordingly. |
332 | - # For now, setting bools to true. |
333 | - dashboard_context["bonds_enabled"] = True |
334 | - dashboard_context["bcache_enabled"] = True |
335 | - dashboard_context["conntrack_enabled"] = True |
336 | + dashboard_context.update(dashboard_data.get("context_vars", {})) |
337 | return render_custom( |
338 | - source=GRAFANA_DASHBOARD_TELEGRAF_FILE_NAME, |
339 | + source=dashboard_data["template_file"], |
340 | render_context=dashboard_context, |
341 | variable_start_string="<<", |
342 | variable_end_string=">>", |
343 | @@ -1731,3 +1901,9 @@ def configure_nagios(nagios): |
344 | @when_not("apt.nvme-cli.installed") |
345 | def install_smart_metrics_packages(): |
346 | apt.queue_install(["smartmontools", "nvme-cli"]) |
347 | + |
348 | + |
349 | +@when("telegraf.intel_rdt.enabled") |
350 | +@when_not("apt.installed.intel-cmt-cat") |
351 | +def install_intel_rdt_packages(): |
352 | + apt.queue_install(["intel-cmt-cat"]) |
353 | diff --git a/src/templates/base_inputs.conf b/src/templates/base_inputs.conf |
354 | index 7f42543..8feea45 100644 |
355 | --- a/src/templates/base_inputs.conf |
356 | +++ b/src/templates/base_inputs.conf |
357 | @@ -170,6 +170,13 @@ use_sudo = true |
358 | {%- endif %} |
359 | {% endif %} |
360 | |
361 | +{% if "intel_rdt" not in disabled_plugins %} |
362 | +{% if intel_rdt -%} |
363 | +[[inputs.intel_rdt]] |
364 | +cores = {{ intel_rdt_cores }} |
365 | +use_sudo = true |
366 | +{%- endif %} |
367 | +{% endif %} |
368 | |
369 | [[inputs.exec]] |
370 | commands = [ |
371 | diff --git a/src/templates/dashboards/grafana/IntelRDT.json.j2 b/src/templates/dashboards/grafana/IntelRDT.json.j2 |
372 | new file mode 100644 |
373 | index 0000000..5a915c4 |
374 | --- /dev/null |
375 | +++ b/src/templates/dashboards/grafana/IntelRDT.json.j2 |
376 | @@ -0,0 +1,508 @@ |
377 | +{ |
378 | + "annotations": { |
379 | + "list": [ |
380 | + { |
381 | + "builtIn": 1, |
382 | + "datasource": "-- Grafana --", |
383 | + "enable": true, |
384 | + "hide": true, |
385 | + "iconColor": "rgba(0, 211, 255, 1)", |
386 | + "name": "Annotations & Alerts", |
387 | + "type": "dashboard" |
388 | + } |
389 | + ] |
390 | + }, |
391 | + "editable": true, |
392 | + "gnetId": null, |
393 | + "graphTooltip": 0, |
394 | + "id": null, |
395 | + "iteration": 1625126969668, |
396 | + "links": [], |
397 | + "panels": [ |
398 | + { |
399 | + "collapsed": false, |
400 | + "datasource": null, |
401 | + "gridPos": { |
402 | + "h": 1, |
403 | + "w": 24, |
404 | + "x": 0, |
405 | + "y": 0 |
406 | + }, |
407 | + "id": 8, |
408 | + "panels": [], |
409 | + "title": "Memory Bandwidth", |
410 | + "type": "row" |
411 | + }, |
412 | + { |
413 | + "datasource": "<< datasource >>", |
414 | + "fieldConfig": { |
415 | + "defaults": { |
416 | + "color": { |
417 | + "mode": "palette-classic" |
418 | + }, |
419 | + "custom": { |
420 | + "axisLabel": "MB/s", |
421 | + "axisPlacement": "auto", |
422 | + "barAlignment": 0, |
423 | + "drawStyle": "line", |
424 | + "fillOpacity": 0, |
425 | + "gradientMode": "none", |
426 | + "hideFrom": { |
427 | + "legend": false, |
428 | + "tooltip": false, |
429 | + "viz": false |
430 | + }, |
431 | + "lineInterpolation": "linear", |
432 | + "lineWidth": 1, |
433 | + "pointSize": 5, |
434 | + "scaleDistribution": { |
435 | + "type": "linear" |
436 | + }, |
437 | + "showPoints": "auto", |
438 | + "spanNulls": false, |
439 | + "stacking": { |
440 | + "group": "A", |
441 | + "mode": "none" |
442 | + }, |
443 | + "thresholdsStyle": { |
444 | + "mode": "off" |
445 | + } |
446 | + }, |
447 | + "mappings": [], |
448 | + "thresholds": { |
449 | + "mode": "absolute", |
450 | + "steps": [ |
451 | + { |
452 | + "color": "green", |
453 | + "value": null |
454 | + }, |
455 | + { |
456 | + "color": "red", |
457 | + "value": 80 |
458 | + } |
459 | + ] |
460 | + } |
461 | + }, |
462 | + "overrides": [] |
463 | + }, |
464 | + "gridPos": { |
465 | + "h": 11, |
466 | + "w": 24, |
467 | + "x": 0, |
468 | + "y": 1 |
469 | + }, |
470 | + "id": 2, |
471 | + "options": { |
472 | + "legend": { |
473 | + "calcs": [], |
474 | + "displayMode": "list", |
475 | + "placement": "bottom" |
476 | + }, |
477 | + "tooltip": { |
478 | + "mode": "single" |
479 | + } |
480 | + }, |
481 | + "targets": [ |
482 | + { |
483 | + "exemplar": true, |
484 | + "expr": "{name=\"MBL\"}", |
485 | + "interval": "", |
486 | + "legendFormat": "{{ host }}", |
487 | + "queryType": "randomWalk", |
488 | + "refId": "Memory Bandwidth" |
489 | + } |
490 | + ], |
491 | + "title": "MBL", |
492 | + "type": "timeseries" |
493 | + }, |
494 | + { |
495 | + "datasource": "<< datasource >>", |
496 | + "fieldConfig": { |
497 | + "defaults": { |
498 | + "color": { |
499 | + "mode": "palette-classic" |
500 | + }, |
501 | + "custom": { |
502 | + "axisLabel": "MB/s", |
503 | + "axisPlacement": "auto", |
504 | + "barAlignment": 0, |
505 | + "drawStyle": "line", |
506 | + "fillOpacity": 0, |
507 | + "gradientMode": "none", |
508 | + "hideFrom": { |
509 | + "legend": false, |
510 | + "tooltip": false, |
511 | + "viz": false |
512 | + }, |
513 | + "lineInterpolation": "linear", |
514 | + "lineWidth": 1, |
515 | + "pointSize": 5, |
516 | + "scaleDistribution": { |
517 | + "type": "linear" |
518 | + }, |
519 | + "showPoints": "auto", |
520 | + "spanNulls": false, |
521 | + "stacking": { |
522 | + "group": "A", |
523 | + "mode": "none" |
524 | + }, |
525 | + "thresholdsStyle": { |
526 | + "mode": "off" |
527 | + } |
528 | + }, |
529 | + "mappings": [], |
530 | + "thresholds": { |
531 | + "mode": "absolute", |
532 | + "steps": [ |
533 | + { |
534 | + "color": "green", |
535 | + "value": null |
536 | + }, |
537 | + { |
538 | + "color": "red", |
539 | + "value": 80 |
540 | + } |
541 | + ] |
542 | + } |
543 | + }, |
544 | + "overrides": [] |
545 | + }, |
546 | + "gridPos": { |
547 | + "h": 11, |
548 | + "w": 24, |
549 | + "x": 0, |
550 | + "y": 12 |
551 | + }, |
552 | + "id": 11, |
553 | + "options": { |
554 | + "legend": { |
555 | + "calcs": [], |
556 | + "displayMode": "list", |
557 | + "placement": "bottom" |
558 | + }, |
559 | + "tooltip": { |
560 | + "mode": "single" |
561 | + } |
562 | + }, |
563 | + "targets": [ |
564 | + { |
565 | + "exemplar": true, |
566 | + "expr": "{name=\"MBR\"}", |
567 | + "interval": "", |
568 | + "legendFormat": "{{ host }}", |
569 | + "queryType": "randomWalk", |
570 | + "refId": "Memory Bandwidth" |
571 | + } |
572 | + ], |
573 | + "title": "MBR", |
574 | + "type": "timeseries" |
575 | + }, |
576 | + { |
577 | + "datasource": "<< datasource >>", |
578 | + "fieldConfig": { |
579 | + "defaults": { |
580 | + "color": { |
581 | + "mode": "palette-classic" |
582 | + }, |
583 | + "custom": { |
584 | + "axisLabel": "MB/s", |
585 | + "axisPlacement": "auto", |
586 | + "barAlignment": 0, |
587 | + "drawStyle": "line", |
588 | + "fillOpacity": 0, |
589 | + "gradientMode": "none", |
590 | + "hideFrom": { |
591 | + "legend": false, |
592 | + "tooltip": false, |
593 | + "viz": false |
594 | + }, |
595 | + "lineInterpolation": "linear", |
596 | + "lineWidth": 1, |
597 | + "pointSize": 5, |
598 | + "scaleDistribution": { |
599 | + "type": "linear" |
600 | + }, |
601 | + "showPoints": "auto", |
602 | + "spanNulls": false, |
603 | + "stacking": { |
604 | + "group": "A", |
605 | + "mode": "none" |
606 | + }, |
607 | + "thresholdsStyle": { |
608 | + "mode": "off" |
609 | + } |
610 | + }, |
611 | + "mappings": [], |
612 | + "thresholds": { |
613 | + "mode": "absolute", |
614 | + "steps": [ |
615 | + { |
616 | + "color": "green", |
617 | + "value": null |
618 | + }, |
619 | + { |
620 | + "color": "red", |
621 | + "value": 80 |
622 | + } |
623 | + ] |
624 | + } |
625 | + }, |
626 | + "overrides": [] |
627 | + }, |
628 | + "gridPos": { |
629 | + "h": 11, |
630 | + "w": 24, |
631 | + "x": 0, |
632 | + "y": 23 |
633 | + }, |
634 | + "id": 10, |
635 | + "options": { |
636 | + "legend": { |
637 | + "calcs": [], |
638 | + "displayMode": "list", |
639 | + "placement": "bottom" |
640 | + }, |
641 | + "tooltip": { |
642 | + "mode": "single" |
643 | + } |
644 | + }, |
645 | + "targets": [ |
646 | + { |
647 | + "exemplar": true, |
648 | + "expr": "{name=\"MBT\"}", |
649 | + "interval": "", |
650 | + "legendFormat": "{{ host }}", |
651 | + "queryType": "randomWalk", |
652 | + "refId": "Memory Bandwidth" |
653 | + } |
654 | + ], |
655 | + "title": "MBT", |
656 | + "type": "timeseries" |
657 | + }, |
658 | + { |
659 | + "collapsed": true, |
660 | + "datasource": null, |
661 | + "gridPos": { |
662 | + "h": 1, |
663 | + "w": 24, |
664 | + "x": 0, |
665 | + "y": 34 |
666 | + }, |
667 | + "id": 6, |
668 | + "panels": [ |
669 | + { |
670 | + "datasource": "<< datasource >>", |
671 | + "fieldConfig": { |
672 | + "defaults": { |
673 | + "color": { |
674 | + "mode": "palette-classic" |
675 | + }, |
676 | + "custom": { |
677 | + "axisLabel": "", |
678 | + "axisPlacement": "auto", |
679 | + "barAlignment": 0, |
680 | + "drawStyle": "line", |
681 | + "fillOpacity": 0, |
682 | + "gradientMode": "none", |
683 | + "hideFrom": { |
684 | + "legend": false, |
685 | + "tooltip": false, |
686 | + "viz": false |
687 | + }, |
688 | + "lineInterpolation": "linear", |
689 | + "lineWidth": 1, |
690 | + "pointSize": 5, |
691 | + "scaleDistribution": { |
692 | + "type": "linear" |
693 | + }, |
694 | + "showPoints": "auto", |
695 | + "spanNulls": false, |
696 | + "stacking": { |
697 | + "group": "A", |
698 | + "mode": "none" |
699 | + }, |
700 | + "thresholdsStyle": { |
701 | + "mode": "off" |
702 | + } |
703 | + }, |
704 | + "mappings": [], |
705 | + "thresholds": { |
706 | + "mode": "absolute", |
707 | + "steps": [ |
708 | + { |
709 | + "color": "green", |
710 | + "value": null |
711 | + }, |
712 | + { |
713 | + "color": "red", |
714 | + "value": 80 |
715 | + } |
716 | + ] |
717 | + }, |
718 | + "unit": "deckbytes" |
719 | + }, |
720 | + "overrides": [] |
721 | + }, |
722 | + "gridPos": { |
723 | + "h": 11, |
724 | + "w": 24, |
725 | + "x": 0, |
726 | + "y": 2 |
727 | + }, |
728 | + "id": 4, |
729 | + "options": { |
730 | + "legend": { |
731 | + "calcs": [], |
732 | + "displayMode": "list", |
733 | + "placement": "bottom" |
734 | + }, |
735 | + "tooltip": { |
736 | + "mode": "single" |
737 | + } |
738 | + }, |
739 | + "targets": [ |
740 | + { |
741 | + "exemplar": true, |
742 | + "expr": "rdt_metric{name=\"LLC\", host=\"$host\"}", |
743 | + "interval": "", |
744 | + "legendFormat": "{{ host }}", |
745 | + "queryType": "randomWalk", |
746 | + "refId": "A" |
747 | + } |
748 | + ], |
749 | + "title": "LLC", |
750 | + "type": "timeseries" |
751 | + }, |
752 | + { |
753 | + "datasource": "<< datasource >>", |
754 | + "fieldConfig": { |
755 | + "defaults": { |
756 | + "color": { |
757 | + "mode": "palette-classic" |
758 | + }, |
759 | + "custom": { |
760 | + "axisLabel": "", |
761 | + "axisPlacement": "auto", |
762 | + "barAlignment": 0, |
763 | + "drawStyle": "line", |
764 | + "fillOpacity": 0, |
765 | + "gradientMode": "none", |
766 | + "hideFrom": { |
767 | + "legend": false, |
768 | + "tooltip": false, |
769 | + "viz": false |
770 | + }, |
771 | + "lineInterpolation": "linear", |
772 | + "lineWidth": 1, |
773 | + "pointSize": 5, |
774 | + "scaleDistribution": { |
775 | + "type": "linear" |
776 | + }, |
777 | + "showPoints": "auto", |
778 | + "spanNulls": false, |
779 | + "stacking": { |
780 | + "group": "A", |
781 | + "mode": "none" |
782 | + }, |
783 | + "thresholdsStyle": { |
784 | + "mode": "off" |
785 | + } |
786 | + }, |
787 | + "mappings": [], |
788 | + "thresholds": { |
789 | + "mode": "absolute", |
790 | + "steps": [ |
791 | + { |
792 | + "color": "green", |
793 | + "value": null |
794 | + }, |
795 | + { |
796 | + "color": "red", |
797 | + "value": 80 |
798 | + } |
799 | + ] |
800 | + }, |
801 | + "unit": "short" |
802 | + }, |
803 | + "overrides": [] |
804 | + }, |
805 | + "gridPos": { |
806 | + "h": 11, |
807 | + "w": 24, |
808 | + "x": 0, |
809 | + "y": 13 |
810 | + }, |
811 | + "id": 9, |
812 | + "options": { |
813 | + "legend": { |
814 | + "calcs": [], |
815 | + "displayMode": "list", |
816 | + "placement": "bottom" |
817 | + }, |
818 | + "tooltip": { |
819 | + "mode": "single" |
820 | + } |
821 | + }, |
822 | + "targets": [ |
823 | + { |
824 | + "exemplar": true, |
825 | + "expr": "rdt_metric{name=\"LLC_Misses\", host=\"$host\"}", |
826 | + "interval": "", |
827 | + "legendFormat": "{{ host }}", |
828 | + "queryType": "randomWalk", |
829 | + "refId": "A" |
830 | + } |
831 | + ], |
832 | + "title": "LLC Misses", |
833 | + "type": "timeseries" |
834 | + } |
835 | + ], |
836 | + "title": "Cache Occupancy", |
837 | + "type": "row" |
838 | + } |
839 | + ], |
840 | + "refresh": "", |
841 | + "schemaVersion": 30, |
842 | + "style": "dark", |
843 | + "tags": [], |
844 | + "templating": { |
845 | + "list": [ |
846 | + { |
847 | + "allValue": null, |
848 | + "current": { |
849 | + "selected": false, |
850 | + "text": "controller:ubuntu-1", |
851 | + "value": "controller:ubuntu-1" |
852 | + }, |
853 | + "datasource": "<< datasource >>", |
854 | + "definition": "label_values(host)", |
855 | + "description": null, |
856 | + "error": null, |
857 | + "hide": 0, |
858 | + "includeAll": false, |
859 | + "label": null, |
860 | + "multi": true, |
861 | + "name": "host", |
862 | + "options": [], |
863 | + "query": { |
864 | + "query": "label_values(host)", |
865 | + "refId": "StandardVariableQuery" |
866 | + }, |
867 | + "refresh": 1, |
868 | + "regex": "", |
869 | + "skipUrlSync": false, |
870 | + "sort": 0, |
871 | + "type": "query" |
872 | + } |
873 | + ] |
874 | + }, |
875 | + "time": { |
876 | + "from": "now-6h", |
877 | + "to": "now" |
878 | + }, |
879 | + "timepicker": {}, |
880 | + "timezone": "utc", |
881 | + "title": "Intel RDT - Memory Bandwidth Monitoring", |
882 | + "uid": "GWblKcRnd", |
883 | + "version": 5 |
884 | +} |
885 | diff --git a/src/templates/sudoers/telegraf_intel_rdt.tmpl b/src/templates/sudoers/telegraf_intel_rdt.tmpl |
886 | new file mode 100644 |
887 | index 0000000..0324a77 |
888 | --- /dev/null |
889 | +++ b/src/templates/sudoers/telegraf_intel_rdt.tmpl |
890 | @@ -0,0 +1,4 @@ |
891 | +Cmnd_Alias PQOS = /usr/sbin/pqos -r --iface-os --mon-file-type=csv --mon-interval=* |
892 | +{{ telegraf_user }} ALL=(root) NOPASSWD: PQOS |
893 | +Defaults!PQOS !logfile, !syslog, !pam_session |
894 | + |
895 | diff --git a/src/tests/unit/test_telegraf.py b/src/tests/unit/test_telegraf.py |
896 | index 4178a34..251a9b9 100644 |
897 | --- a/src/tests/unit/test_telegraf.py |
898 | +++ b/src/tests/unit/test_telegraf.py |
899 | @@ -20,6 +20,7 @@ import getpass |
900 | import grp |
901 | import json |
902 | import os |
903 | +import platform |
904 | import shutil |
905 | import subprocess |
906 | import sys |
907 | @@ -27,9 +28,11 @@ from textwrap import dedent |
908 | from unittest import mock |
909 | from unittest.mock import MagicMock, call, patch |
910 | |
911 | -from charmhelpers.core import host |
912 | +from charmhelpers import fetch |
913 | +from charmhelpers.core import host, kernel |
914 | from charmhelpers.core.hookenv import Config |
915 | from charmhelpers.core.templating import render |
916 | +from charmhelpers.fetch import apt_pkg |
917 | |
918 | import charms |
919 | from charms.reactive import RelationBase, bus, helpers, set_flag |
920 | @@ -1465,7 +1468,9 @@ class TestGrafanaDashboard: |
921 | mock_render, |
922 | ): |
923 | expected_datasource = "my_prometheus" |
924 | - fake_config = dict(prometheus_datasource=expected_datasource) |
925 | + fake_config = dict( |
926 | + prometheus_datasource=expected_datasource, collect_intel_rdt_metrics=False |
927 | + ) |
928 | expected_dashboard_context = dict( |
929 | datasource="{} - Juju generated source".format(expected_datasource), |
930 | bonds_enabled=True, |
931 | @@ -1484,15 +1489,19 @@ class TestGrafanaDashboard: |
932 | mock_render.return_value = mock_rendered_content |
933 | |
934 | telegraf.register_grafana_dashboard() |
935 | + dashboard_name = "telegraf" |
936 | + dashboard_filename = telegraf.GRAFANA_DASHBOARD_CONFIG[dashboard_name][ |
937 | + "template_file" |
938 | + ] |
939 | |
940 | mock_render.assert_called_once_with( |
941 | - source=telegraf.GRAFANA_DASHBOARD_TELEGRAF_FILE_NAME, |
942 | + source=dashboard_filename, |
943 | render_context=expected_dashboard_context, |
944 | variable_start_string="<<", |
945 | variable_end_string=">>", |
946 | ) |
947 | mock_grafana.register_dashboard.assert_called_once_with( |
948 | - name=telegraf.GRAFANA_DASHBOARD_NAME, dashboard=mock_dashboard_dict |
949 | + name=dashboard_name, dashboard=mock_dashboard_dict |
950 | ) |
951 | mock_set_flag.assert_called_once_with("grafana.configured") |
952 | |
953 | @@ -1617,3 +1626,108 @@ def test_collect_ipmi_sensor_metrics(monkeypatch, config): |
954 | """ |
955 | config_file = base_dir().join("telegraf.conf") |
956 | assert expected in config_file.read() |
957 | + |
958 | + |
959 | +def test_collect_intel_rdt_metrics(monkeypatch, config): |
960 | + monkeypatch.setattr(telegraf, "is_container", lambda: False) |
961 | + config["collect_intel_rdt_metrics"] = True |
962 | + monkeypatch.setattr(telegraf, "get_cpu_cores", lambda: '["0-23"]') |
963 | + monkeypatch.setattr(telegraf, "check_valid_intel_rdt_configuration", lambda: "") |
964 | + monkeypatch.setattr(kernel, "modprobe", lambda module, persist: None) |
965 | + telegraf.configure_telegraf() |
966 | + |
967 | + expected = """ |
968 | +[[inputs.intel_rdt]] |
969 | +cores = ["0-23"] |
970 | +use_sudo = true |
971 | +""" |
972 | + config_file = base_dir().join("telegraf.conf") |
973 | + assert expected in config_file.read() |
974 | + |
975 | + |
976 | +def test_get_cpu_cores(monkeypatch): |
977 | + lscpu_output = """ |
978 | +{ |
979 | + "lscpu": [ |
980 | + {"field": "Architecture:", "data": "x86_64"}, |
981 | + {"field": "On-line CPU(s) list:", "data": "0-23"} |
982 | + ] |
983 | +} |
984 | +""".encode( |
985 | + "utf8" |
986 | + ) |
987 | + monkeypatch.setattr(subprocess, "check_output", lambda cmd: lscpu_output) |
988 | + cores = telegraf.get_cpu_cores() |
989 | + assert cores == '["0-23"]' |
990 | + |
991 | + |
992 | +def test_check_valid_intel_rdt_configuration_kernel_version(monkeypatch): |
993 | + monkeypatch.setattr(telegraf, "is_container", lambda: False) |
994 | + monkeypatch.setattr(platform, "release", lambda: "4.4.0-73-generic") |
995 | + with pytest.raises( |
996 | + telegraf.InvalidIntelRDTConfigurationError, match="unsupported kernel version" |
997 | + ): |
998 | + telegraf.check_valid_intel_rdt_configuration() |
999 | + |
1000 | + |
1001 | +def test_check_valid_intel_rdt_configuration_pkg_present(monkeypatch): |
1002 | + monkeypatch.setattr(telegraf, "is_container", lambda: False) |
1003 | + monkeypatch.setattr(platform, "release", lambda: "5.4.0-73-generic") |
1004 | + monkeypatch.setattr(fetch, "get_installed_version", lambda pkg: None) |
1005 | + with pytest.raises( |
1006 | + telegraf.InvalidIntelRDTConfigurationError, |
1007 | + match="package 'intel-cmt-cat' is not installed yet", |
1008 | + ): |
1009 | + telegraf.check_valid_intel_rdt_configuration() |
1010 | + |
1011 | + |
1012 | +def test_check_valid_intel_rdt_configuration_pkg_version(monkeypatch): |
1013 | + monkeypatch.setattr(telegraf, "is_container", lambda: False) |
1014 | + monkeypatch.setattr(platform, "release", lambda: "5.4.0-73-generic") |
1015 | + monkeypatch.setattr(fetch, "get_installed_version", lambda pkg: {"ver_str": "0.0"}) |
1016 | + monkeypatch.setattr(apt_pkg, "version_compare", lambda a, b: -1) |
1017 | + with pytest.raises( |
1018 | + telegraf.InvalidIntelRDTConfigurationError, |
1019 | + match="package 'intel-cmt-cat' is older than required", |
1020 | + ): |
1021 | + telegraf.check_valid_intel_rdt_configuration() |
1022 | + |
1023 | + |
1024 | +def test_check_valid_intel_rdt_configuration_kernel_module(monkeypatch): |
1025 | + monkeypatch.setattr(telegraf, "is_container", lambda: False) |
1026 | + monkeypatch.setattr(platform, "release", lambda: "5.4.0-73-generic") |
1027 | + monkeypatch.setattr(kernel, "is_module_loaded", lambda module: False) |
1028 | + monkeypatch.setattr( |
1029 | + fetch, |
1030 | + "get_installed_version", |
1031 | + lambda pkg: {"ver_str": telegraf.RDT_MINIMUM_PKG_VERSION}, |
1032 | + ) |
1033 | + monkeypatch.setattr(apt_pkg, "version_compare", lambda a, b: 0) |
1034 | + with pytest.raises( |
1035 | + telegraf.InvalidIntelRDTConfigurationError, |
1036 | + match="required module", |
1037 | + ): |
1038 | + telegraf.check_valid_intel_rdt_configuration() |
1039 | + |
1040 | + |
1041 | +def test_check_valid_intel_rdt_configuration_pqos(monkeypatch): |
1042 | + def mock_check_call(*args, **kwargs): |
1043 | + raise subprocess.CalledProcessError( |
1044 | + cmd="fake", returncode=1, output="fail".encode("utf8") |
1045 | + ) |
1046 | + |
1047 | + monkeypatch.setattr(telegraf, "is_container", lambda: False) |
1048 | + monkeypatch.setattr(platform, "release", lambda: "5.4.0-73-generic") |
1049 | + monkeypatch.setattr(kernel, "is_module_loaded", lambda module: True) |
1050 | + monkeypatch.setattr( |
1051 | + fetch, |
1052 | + "get_installed_version", |
1053 | + lambda pkg: {"ver_str": telegraf.RDT_MINIMUM_PKG_VERSION}, |
1054 | + ) |
1055 | + monkeypatch.setattr(apt_pkg, "version_compare", lambda a, b: 0) |
1056 | + monkeypatch.setattr(subprocess, "check_call", mock_check_call) |
1057 | + with pytest.raises( |
1058 | + telegraf.InvalidIntelRDTConfigurationError, |
1059 | + match="pqos -d failed", |
1060 | + ): |
1061 | + telegraf.check_valid_intel_rdt_configuration() |
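For readers following test_get_cpu_cores above: the test feeds lscpu-style JSON to telegraf.get_cpu_cores() and expects the string '["0-23"]'. The charm's actual implementation lives in src/reactive/telegraf.py and is not shown in this part of the diff, so the following is only a minimal sketch of one way such a helper could work. The names parse_online_cpus and get_cpu_cores_sketch are hypothetical, and the use of `lscpu -J` as the command is an assumption.

```python
import json
import subprocess


def parse_online_cpus(lscpu_json: bytes) -> str:
    """Format the on-line CPU list from lscpu JSON as a quoted list string.

    Hypothetical helper: mirrors the input and output shapes used in
    test_get_cpu_cores, not the charm's real code.
    """
    data = json.loads(lscpu_json.decode("utf8"))
    for entry in data["lscpu"]:
        # The field name matches the one in the test fixture above.
        if entry["field"] == "On-line CPU(s) list:":
            return '["{}"]'.format(entry["data"])
    raise ValueError("lscpu output has no 'On-line CPU(s) list:' field")


def get_cpu_cores_sketch() -> str:
    """Assumed wrapper: run `lscpu -J` (JSON output) and parse the result."""
    return parse_online_cpus(subprocess.check_output(["lscpu", "-J"]))
```

Keeping the parsing separate from the subprocess call means the parsing step can be tested with a plain bytes fixture, without monkeypatching subprocess.check_output as the unit test in the diff does.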