7ed9751...
by
Michael Jeanson <email address hidden>
fix: Unify possible CPU number fallback
The MUSL specific fallback to get the number of possible CPUs in the
system has the same issue with hot-unplugged CPUs as the Glibc
implementation we worked around by using the possible CPU mask from
sysfs.
To address this, unify our fallback code across all C libraries to get
the maximum CPU id from the directories in "/sys/devices/system/cpu".
625fc7b...
by
Michael Jeanson <email address hidden>
fix: removed accidental VLA in _get_num_possible_cpus()
The LTTNG_UST_PAGE_SIZE define can either point to a literal value or
the sysconf() function making buf[] a VLA. Replace this by a
cpumask specifc define that will always be a literal value.
4783663...
by
Michael Jeanson <email address hidden>
fix: num_possible_cpus() with hot-unplugged CPUs
We rely on sysconf(_SC_NPROCESSORS_CONF) to get the maximum possible
number of CPUs that can be attached to the system for the lifetime of an
application. We use this value to allocate an array of per-CPU buffers
that is indexed by the numerical id of the CPUs.
As such we expect that the highest possible CPU id would be one less
than the number returned by sysconf(_SC_NPROCESSORS_CONF) which is
unfortunatly not always the case and can vary across libc
implementations and versions.
Glibc up to 2.35 will count the number of "cpuX" directories in
"/sys/devices/system/cpu" which doesn't include CPUS that were
hot-unplugged.
This information is however provided by the kernel in
"/sys/devices/system/cpu/possible" in the form of a mask listing all the
CPUs that could possibly be hot-plugged in the system.
This patch changes the implementation of num_possible_cpus() to first
try parsing the possible CPU mask to extract the highest possible value
and if this fails fallback to the previous behavior.
b94384c...
by
Michael Jeanson <email address hidden>
fix: allow building with userspace-rcu 0.13
The liburcu detection code was checking for the 'synchronize_rcu_bp'
symbol in liburcu-bp which was dropped from the ABI in 0.13, add a
fallback check for 'urcu_bp_synchronize_rcu' which replaced it.
Also drop the check for 'call_rcu_bp' which we don't use and was just a
leftover of version check for 0.6 which has since been superseded.
Both those changes allow to build against liburcu >= 0.13.
Testing with a fixed number of loops per-thread only works if the
workload is distributed perfectly across CPUs. For instance, if a lock
is held in the workload (e.g. internally by open() and close()), those
may cause starvation of some threads, and therefore cause the benchmark
to be wrong because it will wait for the slowest thread to complete its
loops.
It is also not good for testing overcommit of threads vs cpus.
Change the test to report the number of loops performed in a given wall
time, and use this to report the average and std.dev. of tracing
overhead per event on each active CPU.
Change the benchmark workload to be only CPU-bound and not generate
system calls to minimize the inherent non-scalability of the workload
(e.g. locks held within the kernel).