glibc:azanella/atexit-order

Last commit made on 2019-07-11
Get this branch:
git clone -b azanella/atexit-order https://git.launchpad.net/glibc

Branch information

Name:
azanella/atexit-order
Repository:
lp:glibc

Recent commits

ba11e52... by Adhemerval Zanella on 2019-06-12

stdlib: Make atexit to not act as __cxa_atexit

This patch addresses the issue raised in the recent discussion
regarding atexit and shared libraries [1] [2]. As indicated in the
libc-alpha discussion, the issue is that, since atexit uses
__cxa_atexit, its interaction with __cxa_finalize could lead to atexit
handlers being executed in a different order than expected. The
GitHub project gives a small example that triggers it [3].

The changes I could come up with slightly alter the atexit semantics,
as described in my last email [4]. Basically, the changes are:

  1. Add the __atexit symbol, which is linked as __cxa_finalize in
     static mode (so __dso_handle is correctly set). The __atexit
     symbol adds an ef_at exit_function entry to __exit_funcs,
     as opposed to the ef_cxa entry added by __cxa_atexit.

     Old binaries would still call __cxa_atexit, so we do not actually
     need to add a compat symbol.

  2. Make __cxa_finalize handle ef_at as well, similar to ef_cxa.

  3. Change how the internal exit handlers are organized, so that ef_at
     and ef_on handlers (registered by atexit and on_exit) are executed
     before ef_cxa ones (registered by __cxa_atexit).

     Each entry set (struct exit_function_list) has one associated type
     (el_at or el_cxa) that represents the kind of handler it contains.
     New insertions (done by __atexit, __cxa_atexit, etc.) keep the
     node order, with the following constraints:

     3.1. el_at nodes should come before el_cxa nodes.
     3.2. el_at should contain only ef_at, ef_on, or ef_free elements.
     3.3. el_cxa should contain only ef_cxa or ef_free elements.
     3.4. New insertions on each node type should be kept in LIFO order.
     3.5. The original first element should be the last one (since it is
          statically allocated and 'exit' will deallocate the nodes in
          order).

     So execution in __cxa_finalize, exit, or quick_exit will
     iterate over the list, running atexit/on_exit handlers first and
     then the __cxa_atexit ones. New handlers added by registered functions
     are handled as before, by using the ef_free entry and resetting the
     list iteration.

[1] https://sourceware.org/ml/libc-alpha/2019-06/msg00229.html
[2] https://sourceware.org/ml/libc-help/2019-06/msg00025.html
[3] https://github.com/mulle-nat/ld-so-breakage
[4] https://sourceware.org/ml/libc-alpha/2019-06/msg00231.html
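
The ordering described in item 3 of the commit above can be illustrated
with a small model. This is only a sketch under simplifying assumptions:
it uses two fixed arrays rather than glibc's linked
struct exit_function_list nodes, it ignores ef_free slots and handlers
registered while the list is being run, and the names enum kind,
model_register and model_run_all are hypothetical, not glibc symbols.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified model: not glibc's actual data structures.  */
enum kind { KIND_AT, KIND_CXA };

#define MAX_HANDLERS 16

struct handler_stack
{
  char names[MAX_HANDLERS];   /* One-letter labels standing in for handlers.  */
  size_t count;
};

static struct handler_stack at_stack;    /* Models atexit/on_exit entries.  */
static struct handler_stack cxa_stack;   /* Models __cxa_atexit entries.  */

/* Register a handler label on the stack for its kind.
   Returns 0 on success, -1 if the stack is full.  */
static int
model_register (enum kind k, char name)
{
  struct handler_stack *s = (k == KIND_AT) ? &at_stack : &cxa_stack;
  if (s->count == MAX_HANDLERS)
    return -1;
  s->names[s->count++] = name;
  return 0;
}

/* "Run" every handler: all KIND_AT entries in LIFO order first, then
   all KIND_CXA entries in LIFO order.  The labels are written to OUT,
   which must hold at least 2 * MAX_HANDLERS + 1 bytes.  */
static void
model_run_all (char *out)
{
  size_t n = 0;
  assert (out != NULL);
  while (at_stack.count > 0)
    out[n++] = at_stack.names[--at_stack.count];
  while (cxa_stack.count > 0)
    out[n++] = cxa_stack.names[--cxa_stack.count];
  out[n] = '\0';
}
```

In this model, registering 'x' (cxa), 'a' (at), 'y' (cxa), 'b' (at) in
that order and then calling model_run_all yields "bayx": both atexit-style
handlers run first, newest first, followed by the __cxa_atexit-style ones.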

3cd72ed... by Adhemerval Zanella on 2019-07-11

stdlib: Testcase to show wrong atexit execution order

This testcase is based on the one discussed on libc-help [1].

[1] https://github.com/mulle-nat/ld-so-breakage

The atexit calls are made by the DSO constructors, and the program expects
the handlers to be executed in the reverse of the order in which the
libraries are loaded.

The issue is that _dl_fini might re-sort the order when libraries are
unloaded, and __cxa_finalize (called by __do_global_dtors_aux, which is
set in the DSO's fini_array) might in turn call its registered atexit
handlers in the wrong order.
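
The effect can be modelled without real shared libraries. The sketch
below is a simulation only, not the testcase itself: each handler is
tagged with an integer that stands in for the registering DSO's
__dso_handle, and sim_finalize plays the role of a per-DSO
__cxa_finalize call. The names struct entry, sim_register and
sim_finalize are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_ENTRIES 16

struct entry
{
  int dso;     /* Hypothetical DSO id standing in for __dso_handle.  */
  char name;   /* One-letter label standing in for the handler.  */
  int done;    /* Nonzero once the handler has "run".  */
};

static struct entry entries[MAX_ENTRIES];
static size_t entry_count;

/* Record a handler registered by DSO.  Returns 0 on success, -1 if full.  */
static int
sim_register (int dso, char name)
{
  if (entry_count == MAX_ENTRIES)
    return -1;
  entries[entry_count].dso = dso;
  entries[entry_count].name = name;
  entries[entry_count].done = 0;
  entry_count++;
  return 0;
}

/* Model of per-DSO finalization: run, newest first, only the pending
   handlers that were registered by DSO.  Labels are appended to OUT
   starting at index N; the new length is returned.  OUT must hold at
   least MAX_ENTRIES + 1 bytes.  */
static size_t
sim_finalize (int dso, char *out, size_t n)
{
  assert (out != NULL);
  for (size_t i = entry_count; i-- > 0; )
    if (!entries[i].done && entries[i].dso == dso)
      {
        entries[i].done = 1;
        out[n++] = entries[i].name;
      }
  out[n] = '\0';
  return n;
}
```

If DSO 1 registers 'a' and then DSO 2 registers 'b', the program expects
"ba". Finalizing DSO 1 before DSO 2 produces "ab" instead, because each
finalization call only sees its own DSO's handlers and the global LIFO
order is lost.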

d0093c5... by "H.J. Lu" <email address hidden> on 2019-07-01

Call _dl_open_check after relocation [BZ #24259]

This is a workaround for [BZ #20839] which doesn't remove the NODELETE
object when _dl_open_check throws an exception. Move it after relocation
in dl_open_worker to avoid leaving the NODELETE object mapped without
relocation.

 [BZ #24259]
 * elf/dl-open.c (dl_open_worker): Call _dl_open_check after
 relocation.
 * sysdeps/x86/Makefile (tests): Add tst-cet-legacy-5a,
 tst-cet-legacy-5b, tst-cet-legacy-6a and tst-cet-legacy-6b.
 (modules-names): Add tst-cet-legacy-mod-5a, tst-cet-legacy-mod-5b,
 tst-cet-legacy-mod-5c, tst-cet-legacy-mod-6a, tst-cet-legacy-mod-6b
 and tst-cet-legacy-mod-6c.
 (CFLAGS-tst-cet-legacy-5a.c): New.
 (CFLAGS-tst-cet-legacy-5b.c): Likewise.
 (CFLAGS-tst-cet-legacy-mod-5a.c): Likewise.
 (CFLAGS-tst-cet-legacy-mod-5b.c): Likewise.
 (CFLAGS-tst-cet-legacy-mod-5c.c): Likewise.
 (CFLAGS-tst-cet-legacy-6a.c): Likewise.
 (CFLAGS-tst-cet-legacy-6b.c): Likewise.
 (CFLAGS-tst-cet-legacy-mod-6a.c): Likewise.
 (CFLAGS-tst-cet-legacy-mod-6b.c): Likewise.
 (CFLAGS-tst-cet-legacy-mod-6c.c): Likewise.
 ($(objpfx)tst-cet-legacy-5a): Likewise.
 ($(objpfx)tst-cet-legacy-5a.out): Likewise.
 ($(objpfx)tst-cet-legacy-mod-5a.so): Likewise.
 ($(objpfx)tst-cet-legacy-mod-5b.so): Likewise.
 ($(objpfx)tst-cet-legacy-5b): Likewise.
 ($(objpfx)tst-cet-legacy-5b.out): Likewise.
 (tst-cet-legacy-5b-ENV): Likewise.
 ($(objpfx)tst-cet-legacy-6a): Likewise.
 ($(objpfx)tst-cet-legacy-6a.out): Likewise.
 ($(objpfx)tst-cet-legacy-mod-6a.so): Likewise.
 ($(objpfx)tst-cet-legacy-mod-6b.so): Likewise.
 ($(objpfx)tst-cet-legacy-6b): Likewise.
 ($(objpfx)tst-cet-legacy-6b.out): Likewise.
 (tst-cet-legacy-6b-ENV): Likewise.
 * sysdeps/x86/tst-cet-legacy-5.c: New file.
 * sysdeps/x86/tst-cet-legacy-5a.c: Likewise.
 * sysdeps/x86/tst-cet-legacy-5b.c: Likewise.
 * sysdeps/x86/tst-cet-legacy-6.c: Likewise.
 * sysdeps/x86/tst-cet-legacy-6a.c: Likewise.
 * sysdeps/x86/tst-cet-legacy-6b.c: Likewise.
 * sysdeps/x86/tst-cet-legacy-mod-5.c: Likewise.
 * sysdeps/x86/tst-cet-legacy-mod-5a.c: Likewise.
 * sysdeps/x86/tst-cet-legacy-mod-5b.c: Likewise.
 * sysdeps/x86/tst-cet-legacy-mod-5c.c: Likewise.
 * sysdeps/x86/tst-cet-legacy-mod-6.c: Likewise.
 * sysdeps/x86/tst-cet-legacy-mod-6a.c: Likewise.
 * sysdeps/x86/tst-cet-legacy-mod-6b.c: Likewise.
 * sysdeps/x86/tst-cet-legacy-mod-6c.c: Likewise.

3db85a9... by Paul Clarke on 2019-06-20

powerpc: Use faster means to access FPSCR when possible in some cases

Using the 'mffs' instruction to read the Floating-Point Status and Control
Register (FPSCR) can force a processor flush in some cases, with undesirable
performance impact. If the values of the bits in the FPSCR which force the
flush are not needed, an instruction that is new to POWER9 (ISA version 3.0),
'mffsl', can be used instead.

Cases included: get_rounding_mode, fegetround, fegetmode, fegetexcept.

 * sysdeps/powerpc/bits/fenvinline.h (__fegetround): Use
 __fegetround_ISA300() or __fegetround_ISA2() as appropriate.
 (__fegetround_ISA300): New.
 (__fegetround_ISA2): New.
 * sysdeps/powerpc/fpu_control.h (IS_ISA300): New.
 (_FPU_MFFS): Move implementation...
 (_FPU_GETCW): Here.
 (_FPU_MFFSL): Move implementation...
 (_FPU_GET_RC_ISA300): Here. New.
 (_FPU_GET_RC): Use _FPU_GET_RC_ISA300() or _FPU_GETCW() as appropriate.
 * sysdeps/powerpc/fpu/fenv_libc.h (fegetenv_status_ISA300): New.
 (fegetenv_status): New.
 * sysdeps/powerpc/fpu/fegetmode.c (fegetmode): Use fegetenv_status()
 instead of fegetenv_register().
 * sysdeps/powerpc/fpu/fegetexcept.c (__fegetexcept): Likewise.

Reviewed-by: Tulio Magno Quites Machado Filho <email address hidden>

d064591... by Wilco Dijkstra <email address hidden> on 2019-06-28

Further improve string bench timing

Further improve the timings of the string benchmarks. Ensure most take
between 1 and 4 seconds to improve accuracy. Overall time taken increases
by 35%. Tested on AArch64.

Reviewed-by: Adhemerval Zanella <email address hidden>

 * benchtests/bench-math-inlines.c: Increase iterations.
 * benchtests/bench-memcmp.c: Likewise.
 * benchtests/bench-rawmemchr.c: Likewise.
 * benchtests/bench-strcmp.c: Likewise.
 * benchtests/bench-strcpy_chk.c: Likewise.
 * benchtests/bench-string.h (INNER_LOOP_ITERS8): Add define.
 (INNER_LOOP_ITERS_MEDIUM): Increase iterations.
 (INNER_LOOP_ITERS_SMALL): Likewise.
 * benchtests/bench-strncat.c: Increase iterations.
 * benchtests/bench-strncmp.c: Increase iterations.
 * benchtests/bench-strncpy.c: Reduce iterations for wide strings.
 * benchtests/bench-strrchr.c: Increase iterations.
 * benchtests/bench-strstr.c: Keep iterations unchanged.
 * benchtests/bench-strtod.c: Increase iterations.

afe23eb... by Anton Youdkevitch <email address hidden> on 2019-06-28

Bump up the runtime for "short" benchmarks

Some benchmarks with a very short runtime show significantly
different results across runs on AArch64, by up to tens of percent.
Increasing the runtime to 100ms+ brings the deviation under 5%.

Tested on AArch64 and x86-64.

Reviewed-by: Carlos O'Donell <email address hidden>

 * benchtests/bench-memccpy.c: Replace INNER_LOOP_ITERS
 with INNER_LOOP_ITERS_LARGE.
 * benchtests/bench-memchr.c: Likewise.
 * benchtests/bench-rawmemchr.c: Likewise.
 * benchtests/bench-strcat.c: Likewise.
 * benchtests/bench-strchr.c: Likewise.
 * benchtests/bench-string.h: Likewise.
 * benchtests/bench-strlen.c: Likewise.
 * benchtests/bench-strncpy.c: Likewise.
 * benchtests/bench-strnlen.c: Likewise.

507f55c... by Florian Weimer on 2019-06-28

Linux: Use mmap instead of malloc in dirent/tst-getdents64

malloc dirties the entire allocated memory region due to M_PERTURB
in the test harness.
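
As an illustration of the idea, a buffer can be obtained from an
anonymous private mapping instead of malloc, so that nothing writes to
it before the test does. This is a minimal sketch using plain mmap and
munmap; the helper names alloc_untouched and free_untouched are
hypothetical and are not taken from the actual test.

```c
#define _DEFAULT_SOURCE 1   /* For MAP_ANONYMOUS under strict -std modes.  */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: map LENGTH bytes of private anonymous memory.
   The mapping is zero-filled and is not written to by the allocator,
   unlike a malloc block when M_PERTURB is in effect.  Returns NULL on
   failure.  */
static void *
alloc_untouched (size_t length)
{
  assert (length > 0);   /* mmap rejects zero-length mappings.  */
  void *p = mmap (NULL, length, PROT_READ | PROT_WRITE,
                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  return p == MAP_FAILED ? NULL : p;
}

/* Hypothetical helper: release a mapping made by alloc_untouched.
   Returns 0 on success, -1 on failure (as munmap does).  */
static int
free_untouched (void *p, size_t length)
{
  return munmap (p, length);
}
```

A caller would request a large buffer with alloc_untouched, pass it to
the function under test, and release it with free_untouched using the
same length.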

589787f... by Tobias Klauser on 2019-06-28

Replace PREPARE_VERSION macro with inline function

 * sysdeps/unix/sysv/linux/dl-vdso.h (PREPARE_VERSION): Remove macro.
 (prepare_version_base): New helper inline function.
 (prepare_version): New macro replacing PREPARE_VERSION.
 (PREPARE_VERSION_KNOWN): Use prepare_version instead of PREPARE_VERSION.

Reviewed-by: Adhemerval Zanella <email address hidden>

f0b2132... by Florian Weimer on 2019-06-28

ld.so: Support moving versioned symbols between sonames [BZ #24741]

This change should be fully backwards-compatible because the old
code aborted the load if a soname mismatch was encountered
(instead of searching further for a matching symbol). This means
that no different symbols are found.

The soname check was explicitly disabled for the skip_map != NULL
case. However, this only happens with dl(v)sym and RTLD_NEXT,
and those lookups do not come with a verneed entry that could be used
for the check.

The error check was already explicitly disabled for the skip_map !=
NULL case, that is, when dl(v)sym was called with RTLD_NEXT. But
_dl_vsym always sets filename in the struct r_found_version argument
to NULL, so the check was not active anyway. This means that
symbol lookup results for the skip_map != NULL case do not change,
either.

17432d7... by Florian Weimer on 2019-06-28

support: Add xdlvsym function