hjl/pr20309/master : lp:glibc : Git : Code : GLibC

X86-64: Properly align stack in _dl_tlsdesc_dynamic

Since _dl_tlsdesc_dynamic is called via PLT, we need to add 8 bytes for
push in the PLT entry to align the stack.

[BZ #20309]
* configure.ac (have-mtls-dialect-gnu2): Set to yes if
-mtls-dialect=gnu2 works.
* configure: Regenerated.
* elf/Makefile [have-mtls-dialect-gnu2 = yes]
(tests): Add tst-gnu2-tls1.
(modules-names): Add tst-gnu2-tls1mod.
($(objpfx)tst-gnu2-tls1): New.
(tst-gnu2-tls1mod.so-no-z-defs): Likewise.
(CFLAGS-tst-gnu2-tls1mod.c): Likewise.
* elf/tst-gnu2-tls1.c: New file.
* elf/tst-gnu2-tls1mod.c: Likewise.
* sysdeps/x86_64/dl-tlsdesc.S (_dl_tlsdesc_dynamic): Add 8
bytes for push in the PLT entry to align the stack.

elf.h: Add declarations for BPF

The EM_BPF number has been officially assigned, though it
has not yet been posted to the gabi webpage yet.

        * elf/elf.h (EM_BPF): New.
        (EM_NUM): Update.
        (R_BPF_NONE, R_BPF_MAP_FD): New.

elf.h: Sync with the gabi webpage

http://www.sco.com/developers/gabi/latest/ch4.eheader.html

Retrieved 2016-06-20.

        * elf/elf.h (EM_IAMCU, EM_SPU, EM_PDP10, EM_PDP11, EM_ARC_COMPACT,
        EM_VIDEOCORE, EM_TMM_GPP, EM_NS32K, EM_TPC, EM_SNP1K, EM_ST200,
        EM_IP2K, EM_MAX, EM_CR, EM_F2MC16, EM_MSP430, EM_BLACKFIN, EM_SE_C33,
        EM_SEP, EM_ARCA, EM_UNICORE, EM_EXCESS, EM_DXP, EM_ALTERA_NIOS2,
        EM_CRX, EM_XGATE, EM_C166, EM_M16C, EM_DSPIC30F, EM_CE, EM_M32C,
        EM_TSK3000, EM_RS08, EM_SHARC, EM_ECOG2, EM_SCORE7, EM_DSP24,
        EM_VIDEOCORE3, EM_LATTICEMICO32, EM_SE_C17, EM_TI_C6000, EM_TI_C2000,
        EM_TI_C5500, EM_TI_ARP32, EM_TI_PRU, EM_MMDSP_PLUS, EM_CYPRESS_M8C,
        EM_R32C, EM_TRIMEDIA, EM_QDSP6, EM_8051, EM_STXP7X, EM_NDS32,
        EM_ECOG1X, EM_MAXQ30, EM_XIMO16, EM_MANIK, EM_CRAYNV2, EM_RX,
        EM_METAG, EM_MCST_ELBRUS, EM_ECOG16, EM_CR16, EM_ETPU, EM_SLE9X,
        EM_L10M, EM_K10M, EM_AVR32, EM_STM8, EM_TILE64, EM_CUDA,
        EM_CLOUDSHIELD, EM_COREA_1ST, EM_COREA_2ND, EM_ARC_COMPACT2,
        EM_OPEN8, EM_RL78, EM_VIDEOCORE5, EM_78KOR, EM_56800EX, EM_BA1,
        EM_BA2, EM_XCORE, EM_MCHP_PIC, EM_KM32, EM_KMX32, EM_EMX16, EM_EMX8,
        EM_KVARC, EM_CDP, EM_COGE, EM_COOL, EM_NORC, EM_CSR_KALIMBA, EM_Z80,
        EM_VISIUM, EM_FT32, EM_MOXIE, EM_AMDGPU, EM_RISCV): New.
        (EM_NUM): Update.

S390: Fix relocation of _nl_current_LC_CATETORY_used in static build. [BZ #19860]

With shared libc, all locale categories are always loaded.
For static libc they aren't, but there exist a weak
_nl_current_LC_CATEGORY_used symbol for each category.
If the category is used, the locale/lc-CATEGORY.o is linked in
where _NL_CURRENT_DEFINE (LC_CATEGORY) defines and sets the
_nl_current_LC_CATEGORY_used symbol to one.

As reported by Marcin
"Bug 18960 - s390: _nl_locale_subfreeres uses larl opcode on misaligned symbol"
(https://sourceware.org/bugzilla/show_bug.cgi?id=18960)
In function _nl_locale_subfreeres (locale/setlocale.c) for each category
a check - &_nl_current_LC_CATEGORY_used != 0 - decides whether the category
is used or not.
There is also a second usage with the same mechanism in function __uselocale
(locale/uselocale.c).

On s390 a larl instruction with R_390_PC32DBL relocation is used to
get the address of _nl_current_LC_CATEGORY_used symbols. As larl loads the
address relative in halfwords and the code is always 2-byte aligned,
larl can only load even addresses.
At the end, the relocated address is always zero and never one.

Marcins patch (see bugzilla) uses the following declaration in locale/setlocale.c:
extern char _nl_current_##category##_used __attribute__((__aligned__(1)));
In function _nl_locale_subfreeres all categories are checked and therefore gcc
is now building an array of addresses in rodata section with an R_390_64
relocation for every address. This array is loaded with larl instruction and
each address is accessed by index.
This fixes only the usage in _nl_locale_subfreeres. Each user has to add the
alignment attribute.

This patch set the _nl_current_LC_CATEGORY_used symbols to two instead of one.
This way gcc can use larl instruction and the check against zero works on
every usage.

ChangeLog:

[BZ #19860]
* locale/localeinfo.h (_NL_CURRENT_DEFINE):
Set _nl_current_LC_CATEGORY_used to two instead of one.

MIPS: run tst-mode-switch-{1,2,3}.c using test-skeleton.c

For some reasons I have not investigated yet, tst-mode-switch-1 hangs on
a MIPS UTM-8 machine running an o32 userland and a 3.6.1 kernel.

This patch changes the test so that it runs under the test-skeleton
framework, causing the test to fail after a timeout instead of hanging
the whole testsuite. At the same time, also change the tst-mode-switch-2
and tst-mode-switch-3 tests.

Changelog:
* sysdeps/mips/tst-mode-switch-1.c (main): Converted to ...
(do_test): ... this.
(TEST_FUNCTION): New macro.
Include test-skeleton.c.
* sysdeps/mips/tst-mode-switch-2.c (main): Likewise.
* sysdeps/mips/tst-mode-switch-3.c (main): Likewise.

Avoid "inexact" exceptions in i386/x86_64 trunc functions (bug 15479).

As discussed in
<https://sourceware.org/ml/libc-alpha/2016-05/msg00577.html>, TS
18661-1 disallows ceil, floor, round and trunc functions from raising
the "inexact" exception, in accordance with general IEEE 754 semantics
for when that exception is raised. Fixing this for x87 floating point
is more complicated than for the other versions of these functions,
because they use the frndint instruction that raises "inexact" and
this can only be avoided by saving and restoring the whole
floating-point environment.

As I noted in
<https://sourceware.org/ml/libc-alpha/2016-06/msg00128.html>, I have
now implemented a GCC option -fno-fp-int-builtin-inexact for GCC 7,
such that GCC will inline these functions on x86, without caring about
"inexact", when the default -ffp-int-builtin-inexact is in effect.
This allows users to get optimized code depending on the options they
pass to the compiler, while making the out-of-line functions follow TS
18661-1 semantics and avoid "inexact".

This patch duly fixes the out-of-line trunc function implementations
to avoid "inexact", in the same way as the nearbyint implementations.

I do not know how the performance of implementations such as these
based on saving the environment and changing the rounding mode
temporarily compares to that of the C versions or SSE 4.1 versions (of
course, for 32-bit x86 SSE implementations still need to get the
return value in an x87 register); it's entirely possible other
implementations could be faster in some cases.

Tested for x86_64 and x86.

[BZ #15479]
* sysdeps/i386/fpu/s_trunc.S (__trunc): Save and restore
floating-point environment rather than just control word.
* sysdeps/i386/fpu/s_truncf.S (__truncf): Likewise.
* sysdeps/i386/fpu/s_truncl.S (__truncl): Save and restore
floating-point environment, with "invalid" exceptions merged in,
rather than just control word.
* sysdeps/x86_64/fpu/s_truncl.S (__truncl): Likewise.
* math/libm-test.inc (trunc_test_data): Do not allow spurious
"inexact" exceptions.

Avoid "inexact" exceptions in i386/x86_64 floor functions (bug 15479).

As discussed in
<https://sourceware.org/ml/libc-alpha/2016-05/msg00577.html>, TS
18661-1 disallows ceil, floor, round and trunc functions from raising
the "inexact" exception, in accordance with general IEEE 754 semantics
for when that exception is raised. Fixing this for x87 floating point
is more complicated than for the other versions of these functions,
because they use the frndint instruction that raises "inexact" and
this can only be avoided by saving and restoring the whole
floating-point environment.

As I noted in
<https://sourceware.org/ml/libc-alpha/2016-06/msg00128.html>, I have
now implemented a GCC option -fno-fp-int-builtin-inexact for GCC 7,
such that GCC will inline these functions on x86, without caring about
"inexact", when the default -ffp-int-builtin-inexact is in effect.
This allows users to get optimized code depending on the options they
pass to the compiler, while making the out-of-line functions follow TS
18661-1 semantics and avoid "inexact".

This patch duly fixes the out-of-line floor function implementations
to avoid "inexact", in the same way as the nearbyint implementations.

I do not know how the performance of implementations such as these
based on saving the environment and changing the rounding mode
temporarily compares to that of the C versions or SSE 4.1 versions (of
course, for 32-bit x86 SSE implementations still need to get the
return value in an x87 register); it's entirely possible other
implementations could be faster in some cases.

Tested for x86_64 and x86.

[BZ #15479]
* sysdeps/i386/fpu/s_floor.S (__floor): Save and restore
floating-point environment rather than just control word.
* sysdeps/i386/fpu/s_floorf.S (__floorf): Likewise.
* sysdeps/i386/fpu/s_floorl.S (__floorl): Save and restore
floating-point environment, with "invalid" exceptions merged in,
rather than just control word.
* sysdeps/x86_64/fpu/s_floorl.S (__floorl): Likewise.
* math/libm-test.inc (floor_test_data): Do not allow spurious
"inexact" exceptions.

Avoid "inexact" exceptions in i386/x86_64 ceil functions (bug 15479).

As discussed in
<https://sourceware.org/ml/libc-alpha/2016-05/msg00577.html>, TS
18661-1 disallows ceil, floor, round and trunc functions from raising
the "inexact" exception, in accordance with general IEEE 754 semantics
for when that exception is raised. Fixing this for x87 floating point
is more complicated than for the other versions of these functions,
because they use the frndint instruction that raises "inexact" and
this can only be avoided by saving and restoring the whole
floating-point environment.

As I noted in
<https://sourceware.org/ml/libc-alpha/2016-06/msg00128.html>, I have
now implemented a GCC option -fno-fp-int-builtin-inexact for GCC 7,
such that GCC will inline these functions on x86, without caring about
"inexact", when the default -ffp-int-builtin-inexact is in effect.
This allows users to get optimized code depending on the options they
pass to the compiler, while making the out-of-line functions follow TS
18661-1 semantics and avoid "inexact".

This patch duly fixes the out-of-line ceil function implementations to
avoid "inexact", in the same way as the nearbyint implementations.

I do not know how the performance of implementations such as these
based on saving the environment and changing the rounding mode
temporarily compares to that of the C versions or SSE 4.1 versions (of
course, for 32-bit x86 SSE implementations still need to get the
return value in an x87 register); it's entirely possible other
implementations could be faster in some cases.

Tested for x86_64 and x86.

[BZ #15479]
* sysdeps/i386/fpu/s_ceil.S (__ceil): Save and restore
floating-point environment rather than just control word.
* sysdeps/i386/fpu/s_ceilf.S (__ceilf): Likewise.
* sysdeps/i386/fpu/s_ceill.S (__ceill): Save and restore
floating-point environment, with "invalid" exceptions merged in,
rather than just control word.
* sysdeps/x86_64/fpu/s_ceill.S (__ceill): Likewise.
* math/libm-test.inc (ceil_test_data): Do not allow spurious
"inexact" exceptions.

MIPS, SPARC: more fixes to the vfork aliases in libpthread.so

Commit 43c29487 tried to fix the vfork aliases in libpthread.so on MIPS
and SPARC, but failed to do it correctly, introducing an ABI change.

This patch does the remaining changes needed to align the MIPS and SPARC
vfork implementations with the other architectures. That way the the
alpha version of pt-vfork.S works correctly for MIPS and SPARC. The
changes for alpha were done in 82aab97c.

Changelog:
* sysdeps/unix/sysv/linux/mips/vfork.S (__vfork): Rename into
__libc_vfork.
(__vfork) [IS_IN (libc)]: Remove alias.
(__libc_vfork) [IS_IN (libc)]: Define as an alias.
* sysdeps/unix/sysv/linux/sparc/sparc32/vfork.S: Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc64/vfork.S: Likewise.

Remove atomic_compare_and_exchange_bool_rel.

atomic_compare_and_exchange_bool_rel and
catomic_compare_and_exchange_bool_rel are removed and replaced with the
new C11-like atomic_compare_exchange_weak_release. The concurrent code
in nscd/cache.c has not been reviewed yet, so this patch does not add
detailed comments.

* nscd/cache.c (cache_add): Use new C11-like atomic operation instead
of atomic_compare_and_exchange_bool_rel.
* nptl/pthread_mutex_unlock.c (__pthread_mutex_unlock_full): Likewise.
* include/atomic.h (atomic_compare_and_exchange_bool_rel,
catomic_compare_and_exchange_bool_rel): Remove.
* sysdeps/aarch64/atomic-machine.h
(atomic_compare_and_exchange_bool_rel): Likewise.
* sysdeps/alpha/atomic-machine.h
(atomic_compare_and_exchange_bool_rel): Likewise.
* sysdeps/arm/atomic-machine.h
(atomic_compare_and_exchange_bool_rel): Likewise.
* sysdeps/mips/atomic-machine.h
(atomic_compare_and_exchange_bool_rel): Likewise.
* sysdeps/tile/atomic-machine.h
(atomic_compare_and_exchange_bool_rel): Likewise.

GLibC

glibc:hjl/pr20309/master

Branch merges

Branch information

Recent commits

GLibC

glibc:hjl/pr20309/master

Branch merges

Related source package recipes

Related snap packages

Related charm recipes

Branch information

Recent commits