Get this repository:
git clone https://git.launchpad.net/glibc

Branches

Name Last Modified Last Commit
master 2017-12-13 02:34:33 UTC 4 hours ago
Fix testing with nss-crypt.

Author: Carlos-0
Author Date: 2017-12-13 02:32:42 UTC

Fix testing with nss-crypt.

A glibc master build with --enable-nss-crypt using the NSS
crypto libraries fails during make check with the following error:

<command-line>:0:0: error: "USE_CRYPT" redefined [-Werror]
<command-line>:0:0: note: this is the location of the previous
definition

This is caused by commit 36975e8e7ea227f7006abdc722ecfefe2079429b
by H.J. Lu which replaces all = with +=. The fix is to undefine
USE_CRYPT before defining it to zero.

Committed as an obvious fix. Fixes the build issue on x86_64 with
no regressions.

Signed-off-by: Carlos O'Donell <carlos@redhat.com>

hjl/cet/master 2017-12-13 01:24:09 UTC 5 hours ago
x86: Check GNU_PROPERTY_X86_FEATURE_1_SHSTK isn't set

Author: H.J. Lu
Author Date: 2017-12-08 12:47:49 UTC

x86: Check GNU_PROPERTY_X86_FEATURE_1_SHSTK isn't set

-fcf-protection -mcet is incompatible with the makecontext family of
functions since they can't properly set up and destroy the shadow stack
pointer. When they are used, GNU_PROPERTY_X86_FEATURE_1_SHSTK shouldn't
be set on the program. Add a test to check it, which is expected to fail
until the GCC bug:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81842

is fixed.

 * sysdeps/unix/sysv/linux/x86/Makefile (tests-special): Add
 $(objpfx)tst-setcontext3-shstk.out.
 ($(objpfx)tst-setcontext3-shstk.out): New target.
 * sysdeps/unix/sysv/linux/x86/tst-setcontext3-shstk.sh: New file.

release/2.25/master 2017-12-12 19:14:43 UTC 11 hours ago
ia64: Add ipc_priv.h header to set __IPC_64 to zero

Author: James Clarke
Author Date: 2017-12-12 14:17:10 UTC

ia64: Add ipc_priv.h header to set __IPC_64 to zero

When running strace, IPC_64 was set in the command, but ia64 is
an architecture where CONFIG_ARCH_WANT_IPC_PARSE_VERSION *isn't* set
in the kernel, so ipc_parse_version just returns IPC_64 without
clearing the IPC_64 bit in the command.

 * sysdeps/unix/sysv/linux/ia64/ipc_priv.h: New file defining
 __IPC_64 to 0 to avoid IPC_64 being set.

Signed-off-by: James Clarke <jrtc27@jrtc27.com>

(cherry picked from commit 89bd8016b30e504829bea48c4cd556769abfcf3a)
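The behaviour described in this commit can be modelled with a small sketch. The constants `IPC_64 = 0x0100`, `IPC_OLD = 0` and `IPC_STAT = 2` follow the Linux UAPI headers; the function names `kernel_parse_version` and `libc_command` are hypothetical and only mimic what the commit message describes, they are not the kernel or glibc code.

```python
IPC_64 = 0x0100   # flag bit requesting the 64-bit IPC structures
IPC_OLD = 0       # version value for the legacy structures
IPC_STAT = 2      # example command


def kernel_parse_version(cmd, arch_wants_parse_version):
    """Model of ipc_parse_version: return (version, command the kernel acts on)."""
    if arch_wants_parse_version:
        if cmd & IPC_64:
            # The flag is stripped before the command is dispatched.
            return IPC_64, cmd & ~IPC_64
        return IPC_OLD, cmd
    # Kernels without CONFIG_ARCH_WANT_IPC_PARSE_VERSION (ia64 in the commit):
    # always the 64-bit layout, and the command is left untouched.
    return IPC_64, cmd


def libc_command(cmd, ipc_64_flag):
    """Model of libc OR-ing its __IPC_64 value into the command."""
    return cmd | ipc_64_flag


# With __IPC_64 == IPC_64 the stray bit survives on an ia64-style kernel.
print(kernel_parse_version(libc_command(IPC_STAT, IPC_64), False))
# With __IPC_64 == 0 (the fix) the kernel sees the plain command.
print(kernel_parse_version(libc_command(IPC_STAT, 0), False))
```

Defining `__IPC_64` to 0 for ia64 keeps the bit out of the command in the first place, which is why the fix needs no kernel-side change.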

release/2.26/master 2017-12-12 18:43:00 UTC 12 hours ago
ia64: Add ipc_priv.h header to set __IPC_64 to zero

Author: James Clarke
Author Date: 2017-12-12 14:17:10 UTC

ia64: Add ipc_priv.h header to set __IPC_64 to zero

When running strace, IPC_64 was set in the command, but ia64 is
an architecture where CONFIG_ARCH_WANT_IPC_PARSE_VERSION *isn't* set
in the kernel, so ipc_parse_version just returns IPC_64 without
clearing the IPC_64 bit in the command.

 * sysdeps/unix/sysv/linux/ia64/ipc_priv.h: New file defining
 __IPC_64 to 0 to avoid IPC_64 being set.

Signed-off-by: James Clarke <jrtc27@jrtc27.com>

(cherry picked from commit 89bd8016b30e504829bea48c4cd556769abfcf3a)

azanella/bz12683 2017-12-12 16:56:37 UTC 14 hours ago
nptl: Consolidate pthread_{timed,try}join{_np}

Author: Adhemerval Zanella
Author Date: 2017-01-25 19:08:51 UTC

nptl: Consolidate pthread_{timed,try}join{_np}

This patch consolidates pthread_join and its GNU extensions to
simplify the implementation and avoid code duplication. Both pthread_join
and pthread_tryjoin are now based on pthread_timedjoin_np.

It also fixes some inconsistencies in ESRCH, EINVAL, and EDEADLK handling
(where the implementations differ from each other) and in the
cleanup handler (which now always uses a CAS). It also replaces the
atomic operations with the C11 ones.

Checked on i686-linux-gnu, x86_64-linux-gnu, x86_64-linux-gnux32,
aarch64-linux-gnu, arm-linux-gnueabihf, and powerpc64le-linux-gnu.

 * nptl/pthreadP.h (__pthread_timedjoin_np): Define.
 * nptl/pthread_join.c (pthread_join): Use __pthread_timedjoin_np.
 * nptl/pthread_tryjoin.c (pthread_tryjoin): Likewise.
 * nptl/pthread_timedjoin.c (cleanup): Use CAS on argument setting.
 (pthread_timedjoin_np): Define internal symbol and common code from
 pthread_join.
 * sysdeps/unix/sysv/linux/i386/lowlevellock.h (__lll_timedwait_tid):
 Remove superfluous checks.
 * sysdeps/unix/sysv/linux/x86_64/lowlevellock.h (__lll_timedwait_tid):
 Likewise.

hjl/x86/math 2017-12-11 20:46:50 UTC 2017-12-11
x86-64: Add cosf with FMA

Author: H.J. Lu
Author Date: 2017-12-11 20:35:29 UTC

x86-64: Add cosf with FMA

On Skylake, bench-cosf reports performance improvement:

        Before     After      Improvement
max     135.362    94.552     43%
min     8.532      7.688      11%
mean    17.1446    11.8128    45%

 * sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines):
 Add s_cosf-sse2 and s_cosf-fma.
 (CFLAGS-s_cosf-fma.c): New.
 * sysdeps/x86_64/fpu/multiarch/s_cosf-fma.c: New file.
 * sysdeps/x86_64/fpu/multiarch/s_cosf-sse2.c: Likewise.
 * sysdeps/x86_64/fpu/multiarch/s_cosf.c: Likewise.
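The Improvement column in the bench-cosf table appears to be the ratio of the Before and After timings minus one, rounded to a whole percent. That formula is inferred from the numbers and is not stated in the commit; the helper name `improvement_percent` is hypothetical.

```python
def improvement_percent(before, after):
    """Speed-up of `after` relative to `before`, as a rounded percentage."""
    return round((before / after - 1.0) * 100)


# Timings copied from the table in the commit message.
results = {
    "max": (135.362, 94.552),
    "min": (8.532, 7.688),
    "mean": (17.1446, 11.8128),
}

for name, (before, after) in results.items():
    print(f"{name:5s}{improvement_percent(before, after):3d}%")
```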

azanella/ifunc-c-sparc-m7 2017-12-11 20:36:54 UTC 2017-12-11
sparc: M7 optimized memset/bzero

Author: Jose E. Marchesi
Author Date: 2017-10-20 22:29:06 UTC

sparc: M7 optimized memset/bzero

Tested in sparcv9-*-* and sparc64-*-* targets in both multi and
non-multi arch configurations.

Support added to identify Sparc M7/T7/S7/M8/T8 processor capability.
Usual "make check" correctness tests run with no regressions.
Performance tests run on Sparc S7 using new code and old niagara4 code.

Optimizations for memset also apply to bzero as they share code.

For memset/bzero, performance comparison with niagara4 code:
For memset nonzero data,
  256-1023 bytes - 60-90% gain (in cache); 5% gain (out of cache)
  1K+ bytes - 80-260% gain (in cache); 40-80% gain (out of cache)
For memset zero data (and bzero),
  256-1023 bytes - 80-120% gain (in cache), 0% gain (out of cache)
  1024+ bytes - 2-4x gain (in cache), 10-35% gain (out of cache)

 Jose E. Marchesi <jose.marchesi@oracle.com>
 Adhemerval Zanella <adhemerval.zanella@linaro.org>

 * sysdeps/sparc/sparc32/sparcv9/multiarch/Makefile
 (sysdeps_routines): Add memset-niagara7.
 * sysdeps/sparc/sparc64/multiarch/Makefile (sysdeps_routines):
 Likewise.
 * sysdeps/sparc/sparc32/sparcv9/multiarch/memset-niagara7.S: New
 file.
 * sysdeps/sparc/sparc64/multiarch/memset-niagara7.S: Likewise.
 * sysdeps/sparc/sparc64/multiarch/ifunc-impl-list.c
 (__libc_ifunc_impl_list): Add __bzero_niagara7 and __memset_niagara7.
 * sysdeps/sparc/sparc64/multiarch/ifunc-memset.h (IFUNC_SELECTOR):
 Add niagara7 option.

hjl/cet/setjmp 2017-12-08 02:09:37 UTC 2017-12-08
x86: Add feature_1 to tcbhead_t [BZ #22563]

Author: H.J. Lu
Author Date: 2017-12-07 13:47:21 UTC

x86: Add feature_1 to tcbhead_t [BZ #22563]

On x86, padding in struct __jmp_buf_tag is used for the shadow stack pointer
to support Shadow Stack in Intel Control-flow Enforcement Technology.
cancel_jmp_buf has been updated to include saved_mask so that it is as
large as struct __jmp_buf_tag. We must support the old cancel_jmp_buf
in existing binaries. Since symbol versioning doesn't work on
cancel_jmp_buf, feature_1 is added to tcbhead_t so that setjmp and
longjmp can check if shadow stack is enabled. NB: Shadow stack is
enabled only if all modules are shadow stack enabled.

 [BZ #22563]
 * sysdeps/i386/nptl/tcb-offsets.sym (FEATURE_1_OFFSET): New.
 * sysdeps/i386/nptl/tls.h (tcbhead_t): Add feature_1.
 * sysdeps/x86_64/nptl/tcb-offsets.sym (FEATURE_1_OFFSET): New.
 * sysdeps/x86_64/nptl/tls.h (tcbhead_t): Rename __glibc_unused1
 to feature_1.
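The rule that shadow stack is enabled only if all modules are shadow stack enabled amounts to AND-ing the per-module feature bits. The sketch below assumes the bit values `GNU_PROPERTY_X86_FEATURE_1_IBT = 1 << 0` and `GNU_PROPERTY_X86_FEATURE_1_SHSTK = 1 << 1` from the x86 psABI; the function names are hypothetical and this is not how glibc structures the check internally.

```python
from functools import reduce

GNU_PROPERTY_X86_FEATURE_1_IBT = 1 << 0
GNU_PROPERTY_X86_FEATURE_1_SHSTK = 1 << 1
ALL_FEATURES = GNU_PROPERTY_X86_FEATURE_1_IBT | GNU_PROPERTY_X86_FEATURE_1_SHSTK


def process_feature_1(module_features):
    """AND together the feature_1 bits of every loaded module."""
    return reduce(lambda acc, bits: acc & bits, module_features, ALL_FEATURES)


def shadow_stack_enabled(feature_1):
    """What setjmp/longjmp would test in the hypothetical feature_1 word."""
    return bool(feature_1 & GNU_PROPERTY_X86_FEATURE_1_SHSTK)


# One module built without SHSTK disables shadow stack for the whole process.
print(shadow_stack_enabled(process_feature_1([0b11, 0b11, 0b11])))
print(shadow_stack_enabled(process_feature_1([0b11, 0b01, 0b11])))
```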

hjl/pie/static 2017-12-08 01:28:04 UTC 2017-12-08
Add --enable-static-pie to build-many-glibcs.py

Author: H.J. Lu
Author Date: 2017-09-27 23:47:29 UTC

Add --enable-static-pie to build-many-glibcs.py

aaribaud/y2038-2.26-rfc-2 2017-12-03 13:41:57 UTC 2017-12-03
Y2038: add _TIME_BITS support

Author: Albert ARIBAUD (3ADEV)
Author Date: 2017-09-06 08:00:42 UTC

Y2038: add _TIME_BITS support

This makes all previously defined Y2038-proof API types, functions and
implementations the default when _TIME_BITS==64 and __WORDSIZE==32 (so
that 64-bit architectures are unaffected).

Note: it is assumed that the API is consistent, i.e. for each API type
which is enabled here, all API functions which depend on this type are
enabled and mapped to Y2038-proof implementations.
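As a rough illustration of why the switch matters, the sketch below shows the last instant a signed 32-bit time_t can represent, together with the selection condition quoted in the commit message. The function names are hypothetical; only the `_TIME_BITS == 64 and __WORDSIZE == 32` condition comes from the commit.

```python
from datetime import datetime, timezone

INT32_MAX = 2**31 - 1  # largest value of a signed 32-bit time_t


def last_32bit_time():
    """Last UTC instant representable in a signed 32-bit time_t."""
    return datetime.fromtimestamp(INT32_MAX, tz=timezone.utc)


def selects_y2038_api(time_bits, wordsize):
    """Condition from the commit: _TIME_BITS == 64 and __WORDSIZE == 32."""
    return time_bits == 64 and wordsize == 32


print(last_32bit_time().isoformat())
print(selects_y2038_api(64, 32), selects_y2038_api(64, 64))
```

On 64-bit architectures the condition is false, which matches the statement that those architectures are unaffected.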

release/2.24/master 2017-12-02 21:59:06 UTC 2017-12-02
Update NEWS to add CVE-2017-15804 entry

Author: Aurelien Jarno
Author Date: 2017-12-01 20:53:51 UTC

Update NEWS to add CVE-2017-15804 entry

(cherry picked from commit 15e84c63c05e0652047ba5e738c54d79d62ba74b)

arm/ilp32 2017-11-27 15:20:50 UTC 2017-11-27
aarch64: Fix jmp_buf-macros.h for ILP32.

Author: Szabolcs Nagy
Author Date: 2017-11-10 18:59:31 UTC

aarch64: Fix jmp_buf-macros.h for ILP32.

The offset is different on ILP32 because __saved_mask is 4 byte aligned.

2017-11-27 Szabolcs Nagy <szabolcs.nagy@arm.com>

 * sysdeps/unix/sysv/linux/aarch64/jmp_buf-macros.h (SAVED_MASK_OFFSET):
 Fix for ILP32.

hjl/pr22370/master 2017-11-25 14:30:17 UTC 2017-11-25
Properly compute offsets of note descriptor and next note [BZ #22370]

Author: H.J. Lu
Author Date: 2017-10-31 11:51:41 UTC

Properly compute offsets of note descriptor and next note [BZ #22370]

A note header has three 4-byte fields, followed by note name and note
descriptor. According to gABI, in a note entry, the note name field,
not note name size, is padded for the note descriptor. And the note
descriptor field, not note descriptor size, is padded for the next
note entry. Notes are aligned to 4 bytes in 32-bit objects and 8 bytes
in 64-bit objects.

For all GNU notes, the name is "GNU" which is 4 bytes. They have the
same format in the first 16 bytes in both 32-bit and 64-bit objects.
They differ by note descriptor size and note type. So far, .note.ABI-tag
and .note.gnu.build-id notes are always aligned to 4 bytes. The existing
code computes the note size by aligning the note name size and note
descriptor size to 4 bytes. It happens to produce the same value as
the actual note size by luck since the name size is 4 and offset of the
note descriptor is 16. But it will produce the wrong size when note
alignment is 8 bytes in 64-bit objects.

This patch defines ELF_NOTE_DESC_OFFSET and ELF_NOTE_NEXT_OFFSET to
properly compute offsets of note descriptor and next note. It uses
alignment of PT_NOTE segment to support both 4-byte and 8-byte note
alignments in 64-bit objects. To handle PT_NOTE segments with
incorrect alignment, which may lead to an infinite loop, if segment
alignment is less than 4, we treat alignment as 4 bytes.

 [BZ #22370]
 * elf/dl-hwcaps.c (ROUND): Removed.
 (_dl_important_hwcaps): Replace ROUND with ELF_NOTE_DESC_OFFSET
 and ELF_NOTE_NEXT_OFFSET.
 * elf/dl-load.c (ROUND): Removed.
 (open_verify): Replace ROUND with ELF_NOTE_NEXT_OFFSET.
 * elf/readelflib.c (ROUND): Removed.
 (process_elf_file): Replace ROUND with ELF_NOTE_NEXT_OFFSET.
 * include/elf.h [!_ISOMAC]: Include <libc-pointer-arith.h>.
 [!_ISOMAC] (ELF_NOTE_DESC_OFFSET): New.
 [!_ISOMAC] (ELF_NOTE_NEXT_OFFSET): Likewise.
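The offset arithmetic described in this commit can be sketched as follows. The helpers mirror the roles of ELF_NOTE_DESC_OFFSET and ELF_NOTE_NEXT_OFFSET as the commit message describes them, but the Python names, the `old_note_size` reconstruction of the previous ROUND-based computation, and the example descriptor sizes are assumptions made for illustration.

```python
NHDR_SIZE = 12  # note header: three 4-byte fields (namesz, descsz, type)


def align_up(value, align):
    """Round value up to a multiple of align (align must be a power of two)."""
    return (value + align - 1) & ~(align - 1)


def note_desc_offset(namesz, align):
    """Offset of the descriptor: header plus name, padded to the note alignment."""
    align = max(align, 4)  # segment alignment below 4 is treated as 4
    return align_up(NHDR_SIZE + namesz, align)


def note_next_offset(namesz, descsz, align):
    """Offset of the next note: descriptor end, padded to the note alignment."""
    align = max(align, 4)
    return align_up(note_desc_offset(namesz, align) + descsz, align)


def old_note_size(namesz, descsz):
    """Previous approach: pad the name size and descriptor size to 4 bytes each."""
    return NHDR_SIZE + align_up(namesz, 4) + align_up(descsz, 4)


# A "GNU" note (namesz 4) with a 20-byte descriptor.
print(note_next_offset(4, 20, 4), old_note_size(4, 20))  # same with 4-byte notes
print(note_next_offset(4, 20, 8), old_note_size(4, 20))  # differ with 8-byte notes
```

With 4-byte alignment both computations agree, which is the coincidence the commit message points out; with 8-byte alignment only the segment-alignment-based version accounts for the extra padding.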

hjl/pr18822 2017-11-24 23:58:04 UTC 2017-11-24
Add wcharP.h to hide internal wchar functions [BZ #18822]

Author: H.J. Lu
Author Date: 2017-11-24 23:58:04 UTC

Add wcharP.h to hide internal wchar functions [BZ #18822]

For some targets, like i386 and s390, internal IFUNC functions must be
called via PLT with a special register. Add <wcharP.h> to allow targets,
which don't need a special register to call internal IFUNC functions via
PLT or have internal non-IFUNC wchar functions, to allow direct access to
internal wchar functions within libc.so and libc.a without using GOT nor
PLT.

Tested on i686 and x86-64. It removed 11 PLT relocations on i686 and
29 PLT relocations on x86-64.

 [BZ #18822]
 * include/wchar.h: Include <wcharP.h>.
 * sysdeps/generic/wcharP.h: New file.
 * sysdeps/i386/wcharP.h: Likewise.
 * sysdeps/x86_64/wcharP.h: Likewise.

ibm/2.26/master 2017-11-24 21:31:41 UTC 2017-11-24
Merge branch 'release/2.26/master' into ibm/2.26/master

Author: Tulio Magno Quites Machado Filho
Author Date: 2017-11-24 21:31:41 UTC

Merge branch 'release/2.26/master' into ibm/2.26/master

hjl/pr22353/master 2017-10-30 12:44:19 UTC 2017-10-30
Add strcpy-stosb.S

Author: H.J. Lu
Author Date: 2017-10-27 13:31:57 UTC

Add strcpy-stosb.S

hjl/pr22363/master 2017-10-29 21:15:47 UTC 2017-10-29
x32: Set GLRO(dl_platform) to "x86_64" by default [BZ #22363]

Author: H.J. Lu
Author Date: 2017-10-29 21:12:01 UTC

x32: Set GLRO(dl_platform) to "x86_64" by default [BZ #22363]

Set dl_platform to "x86_64" for x32 by default since kernel may set it
to "i686". This fixed:

FAIL: elf/tst-platform-1

on x32. Tested on x86-64 and x32.

 [BZ #22363]
 * sysdeps/x86/cpu-features.c (init_cpu_features): Set
 GLRO(dl_platform) to "x86_64" by default for x32.

hjl/pr22362/symlink 2017-10-29 19:03:03 UTC 2017-10-29
nptl: Simplify libpthread.so rules

Author: H.J. Lu
Author Date: 2017-10-29 19:03:03 UTC

nptl: Simplify libpthread.so rules

libpthread.so must be created against a special crti.o and when multi-lib
GCC is used to build glibc, the special crti.o must be placed under the
`gcc -print-multi-directory` subdirectory so that -B$(common-objpfx)nptl/
will pick it up. This patch compiles the special crti.o directly with
"-include pt-crti.h" and uses the new make-link-multidir from the fix
for [BZ #22362] to create the symlink for the multi-lib subdirectory.
It simplifies libpthread.so rules to support -B$(common-objpfx)nptl/
and -B$(common-objpfx)/ for correct crt*.o files.

Tested on x86-64 for x86-64, i686 and x32.

 * nptl/Makefile (multidir.mk): Don't generate.
 ($(objpfx)multidir.mk): Removed.
 (crtn-objs): Likewise.
 (generated-dirs): Don't add $(multidir).
 (multilib-crti-objs): New.
 (extra-objs): Use $(multilib-crti-objs). Don't add pt-crti.o.
 ($(objpfx)libpthread.so): Removed.
 (CPPFLAGS-crti.o): New.
 ($(objpfx)crti.o): Removed.
 ($(objpfx)$(multidir)/crti.o): Likewise.
 ($(objpfx)$(multidir)/crtn.o): Likewise.
 ($(addprefix $(objpfx)$(multidir)/, $(crti-objs))): New target.
 (generated): Remove multidir.mk.
 * nptl/pt-crti.S: Renamed to ...
 * nptl/pt-crti.h: This. Don't include <crti.S>.

hjl/pr22362/master 2017-10-29 02:05:09 UTC 2017-10-29
Use newly built crt*.o files to build shared objects [BZ #22362]

Author: H.J. Lu
Author Date: 2017-10-29 00:41:16 UTC

Use newly built crt*.o files to build shared objects [BZ #22362]

When multi-lib GCC is used to build glibc, the search order of GCC driver
for crt*.o is -B*/`gcc -print-multi-directory`, the installed directory,
-B*/. This patch extends multi-lib support from nptl/Makefile to
csu/Makefile so that -B/glibc-build-directory/csu/ will pick up the newly
built crt*.o.

Tested on x86-64 for i686 and x32.

 [BZ #22362]
 * config.make.in (multidir): New.
 * configure.ac (libc_cv_multidir): New. AC_SUBST.
 * configure: Regenerated.
 * csu/Makefile [$(multidir) != .](multilib-extra-objs): New.
 [$(multidir) != .](extra-objs): Add $(multilib-extra-objs).
 [$(multidir) != .]($(addprefix $(objpfx)$(multidir)/, $(install-lib))):
 New target.
 * nptl/Makefile: Don't include multidir.mk.
 ($(objpfx)multidir.mk): Removed.

hjl/x86/master 2017-10-27 21:31:23 UTC 2017-10-27
x86: Add sysdeps/x86/sysdep.h

Author: H.J. Lu
Author Date: 2017-10-27 21:24:51 UTC

x86: Add sysdeps/x86/sysdep.h

Add a new header file, sysdeps/x86/sysdep.h, for common assembly code
macros between i386 and x86-64. Tested on i686 and x86-64. There are
no differences in outputs of "readelf -a" and "objdump -dw" on all glibc
shared objects before and after the patch.

 * sysdeps/i386/sysdep.h: Include <sysdeps/x86/sysdep.h> instead
 of <sysdeps/generic/sysdep.h>.
 (ALIGNARG): Removed.
 (ASM_SIZE_DIRECTIVE): Likewise.
 (ENTRY): Likewise.
 (END): Likewise.
 (ENTRY_CHK): Likewise.
 (END_CHK): Likewise.
 (syscall_error): Likewise.
 (mcount): Likewise.
 (PSEUDO_END): Likewise.
 (L): Likewise.
 (atom_text_section): Likewise.
 * sysdeps/x86/sysdep.h: New file.
 * sysdeps/x86_64/sysdep.h: Include <sysdeps/x86/sysdep.h> instead
 of <sysdeps/generic/sysdep.h>.
 (ALIGNARG): Removed.
 (ASM_SIZE_DIRECTIVE): Likewise.
 (ENTRY): Likewise.
 (END): Likewise.
 (ENTRY_CHK): Likewise.
 (END_CHK): Likewise.
 (syscall_error): Likewise.
 (mcount): Likewise.
 (PSEUDO_END): Likewise.
 (L): Likewise.
 (atom_text_section): Likewise.

release/2.23/master 2017-10-22 15:47:20 UTC 2017-10-22
x86-64: Use fxsave/xsave/xsavec in _dl_runtime_resolve [BZ #21265]

Author: H.J. Lu
Author Date: 2017-10-22 15:47:03 UTC

x86-64: Use fxsave/xsave/xsavec in _dl_runtime_resolve [BZ #21265]

In _dl_runtime_resolve, use fxsave/xsave/xsavec to preserve all vector,
mask and bound registers. It simplifies _dl_runtime_resolve and supports
different calling conventions. ld.so code size is reduced by more than
1 KB. However, using fxsave/xsave/xsavec takes a few more cycles
than saving and restoring vector and bound registers individually.

Latency for _dl_runtime_resolve to look up the function, foo, from one
shared library plus libc.so:

                              Before    After    Change

Westmere (SSE)/fxsave         345       866      151%
IvyBridge (AVX)/xsave         420       643      53%
Haswell (AVX)/xsave           713       1252     75%
Skylake (AVX+MPX)/xsavec      559       719      28%
Skylake (AVX512+MPX)/xsavec   145       272      87%
Ryzen (AVX)/xsavec            280       553      97%

This is the worst case, where the portion of time spent saving and
restoring registers is larger than in the majority of cases. With the
smaller _dl_runtime_resolve code size, the overall performance impact is
negligible.

On IvyBridge, differences in build and test time of binutils with lazy
binding GCC and binutils are within noise. On Westmere, differences in
bootstrap and "make check" time of GCC 7 with lazy binding GCC and
binutils are also within noise.

 [BZ #21265]
 * sysdeps/x86/cpu-features-offsets.sym (XSAVE_STATE_SIZE_OFFSET):
 New.
 * sysdeps/x86/cpu-features.c: Include <libc-internal.h>.
 (init_cpu_features): Set xsave_state_size and bit_XSAVEC_Usable
 if needed.
 * sysdeps/x86/cpu-features.h (bit_XSAVEC_Usable): New.
 (STATE_SAVE_OFFSET): Likewise.
 (STATE_SAVE_MASK): Likewise.
 [__ASSEMBLER__]: Include <cpu-features-offsets.h>.
 (cpu_features): Add xsave_state_size.
 (index_XSAVEC_Usable): New.
 * sysdeps/x86_64/dl-machine.h (elf_machine_runtime_setup):
 Replace _dl_runtime_resolve_sse, _dl_runtime_resolve_avx and
 _dl_runtime_resolve_avx512 with _dl_runtime_resolve_fxsave,
 _dl_runtime_resolve_xsave and _dl_runtime_resolve_xsavec.
 * sysdeps/x86_64/dl-trampoline.S: Include <cpu-features.h>.
 (DL_RUNTIME_UNALIGNED_VEC_SIZE): Removed.
 (DL_RUNTIME_RESOLVE_REALIGN_STACK): Check STATE_SAVE_ALIGNMENT
 instead of VEC_SIZE.
 (REGISTER_SAVE_BND0): Removed.
 (REGISTER_SAVE_BND1): Likewise.
 (REGISTER_SAVE_BND3): Likewise.
 (REGISTER_SAVE_RAX): Always defined to 0.
 (VMOV): Removed.
 (_dl_runtime_resolve_avx512): Likewise.
 (_dl_runtime_resolve_avx): Likewise.
 (_dl_runtime_resolve_sse): Likewise.
 (USE_FXSAVE): New.
 (_dl_runtime_resolve_fxsave): Likewise.
 (USE_XSAVE): Likewise.
 (_dl_runtime_resolve_xsave): Likewise.
 (USE_XSAVEC): Likewise.
 (_dl_runtime_resolve_xsavec): Likewise.
 * sysdeps/x86_64/dl-trampoline.h (_dl_runtime_resolve_avx512):
 Removed.
 (_dl_runtime_resolve_avx): Likewise.
 (_dl_runtime_resolve_sse): Likewise.
 (_dl_runtime_resolve_fxsave): New.
 (_dl_runtime_resolve_xsave): Likewise.
 (_dl_runtime_resolve_xsavec): Likewise.
 (_dl_runtime_profile): Defined only if _dl_runtime_profile is
 defined.

(cherry picked from commit b52b0d793dcb226ecb0ecca1e672ca265973233c)
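The Change column in the latency table is consistent with the relative increase from Before to After, truncated to a whole percent. That reading is inferred from the figures rather than stated in the commit, and the helper name `change_percent` is hypothetical.

```python
def change_percent(before, after):
    """Relative latency increase, truncated to a whole percent."""
    return (after - before) * 100 // before


# (configuration, before, after) copied from the commit message.
measurements = [
    ("Westmere (SSE)/fxsave", 345, 866),
    ("IvyBridge (AVX)/xsave", 420, 643),
    ("Haswell (AVX)/xsave", 713, 1252),
    ("Skylake (AVX+MPX)/xsavec", 559, 719),
    ("Skylake (AVX512+MPX)/xsavec", 145, 272),
    ("Ryzen (AVX)/xsavec", 280, 553),
]

for name, before, after in measurements:
    print(f"{name:30s}{change_percent(before, after):4d}%")
```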

hjl/pr21265/2.23 2017-10-22 11:48:11 UTC 2017-10-22
x86-64: Use fxsave/xsave/xsavec in _dl_runtime_resolve [BZ #21265]

Author: H.J. Lu
Author Date: 2017-03-23 15:21:52 UTC

x86-64: Use fxsave/xsave/xsavec in _dl_runtime_resolve [BZ #21265]

In _dl_runtime_resolve, use fxsave/xsave/xsavec to preserve all vector,
mask and bound registers. It simplifies _dl_runtime_resolve and supports
different calling conventions. ld.so code size is reduced by more than
1 KB. However, using fxsave/xsave/xsavec takes a few more cycles
than saving and restoring vector and bound registers individually.

Latency for _dl_runtime_resolve to look up the function, foo, from one
shared library plus libc.so:

                              Before    After    Change

Westmere (SSE)/fxsave         345       866      151%
IvyBridge (AVX)/xsave         420       643      53%
Haswell (AVX)/xsave           713       1252     75%
Skylake (AVX+MPX)/xsavec      559       719      28%
Skylake (AVX512+MPX)/xsavec   145       272      87%
Ryzen (AVX)/xsavec            280       553      97%

This is the worst case, where the portion of time spent saving and
restoring registers is larger than in the majority of cases. With the
smaller _dl_runtime_resolve code size, the overall performance impact is
negligible.

On IvyBridge, differences in build and test time of binutils with lazy
binding GCC and binutils are within noise. On Westmere, differences in
bootstrap and "make check" time of GCC 7 with lazy binding GCC and
binutils are also within noise.

 [BZ #21265]
 * sysdeps/x86/cpu-features-offsets.sym (XSAVE_STATE_SIZE_OFFSET):
 New.
 * sysdeps/x86/cpu-features.c: Include <libc-internal.h>.
 (init_cpu_features): Set xsave_state_size and bit_XSAVEC_Usable
 if needed.
 * sysdeps/x86/cpu-features.h (bit_XSAVEC_Usable): New.
 (STATE_SAVE_OFFSET): Likewise.
 (STATE_SAVE_MASK): Likewise.
 [__ASSEMBLER__]: Include <cpu-features-offsets.h>.
 (cpu_features): Add xsave_state_size.
 (index_XSAVEC_Usable): New.
 * sysdeps/x86_64/dl-machine.h (elf_machine_runtime_setup):
 Replace _dl_runtime_resolve_sse, _dl_runtime_resolve_avx and
 _dl_runtime_resolve_avx512 with _dl_runtime_resolve_fxsave,
 _dl_runtime_resolve_xsave and _dl_runtime_resolve_xsavec.
 * sysdeps/x86_64/dl-trampoline.S: Include <cpu-features.h>.
 (DL_RUNTIME_UNALIGNED_VEC_SIZE): Removed.
 (DL_RUNTIME_RESOLVE_REALIGN_STACK): Check STATE_SAVE_ALIGNMENT
 instead of VEC_SIZE.
 (REGISTER_SAVE_BND0): Removed.
 (REGISTER_SAVE_BND1): Likewise.
 (REGISTER_SAVE_BND3): Likewise.
 (REGISTER_SAVE_RAX): Always defined to 0.
 (VMOV): Removed.
 (_dl_runtime_resolve_avx512): Likewise.
 (_dl_runtime_resolve_avx): Likewise.
 (_dl_runtime_resolve_sse): Likewise.
 (USE_FXSAVE): New.
 (_dl_runtime_resolve_fxsave): Likewise.
 (USE_XSAVE): Likewise.
 (_dl_runtime_resolve_xsave): Likewise.
 (USE_XSAVEC): Likewise.
 (_dl_runtime_resolve_xsavec): Likewise.
 * sysdeps/x86_64/dl-trampoline.h (_dl_runtime_resolve_avx512):
 Removed.
 (_dl_runtime_resolve_avx): Likewise.
 (_dl_runtime_resolve_sse): Likewise.
 (_dl_runtime_resolve_fxsave): New.
 (_dl_runtime_resolve_xsave): Likewise.
 (_dl_runtime_resolve_xsavec): Likewise.
 (_dl_runtime_profile): Defined only if _dl_runtime_profile is
 defined.

(cherry picked from commit b52b0d793dcb226ecb0ecca1e672ca265973233c)

hjl/pr21265/2.26 2017-10-22 11:38:41 UTC 2017-10-22
x86-64: Use fxsave/xsave/xsavec in _dl_runtime_resolve [BZ #21265]

Author: H.J. Lu
Author Date: 2017-03-23 15:21:52 UTC

x86-64: Use fxsave/xsave/xsavec in _dl_runtime_resolve [BZ #21265]

In _dl_runtime_resolve, use fxsave/xsave/xsavec to preserve all vector,
mask and bound registers. It simplifies _dl_runtime_resolve and supports
different calling conventions. ld.so code size is reduced by more than
1 KB. However, using fxsave/xsave/xsavec takes a few more cycles
than saving and restoring vector and bound registers individually.

Latency for _dl_runtime_resolve to look up the function, foo, from one
shared library plus libc.so:

                              Before    After    Change

Westmere (SSE)/fxsave         345       866      151%
IvyBridge (AVX)/xsave         420       643      53%
Haswell (AVX)/xsave           713       1252     75%
Skylake (AVX+MPX)/xsavec      559       719      28%
Skylake (AVX512+MPX)/xsavec   145       272      87%
Ryzen (AVX)/xsavec            280       553      97%

This is the worst case, where the portion of time spent saving and
restoring registers is larger than in the majority of cases. With the
smaller _dl_runtime_resolve code size, the overall performance impact is
negligible.

On IvyBridge, differences in build and test time of binutils with lazy
binding GCC and binutils are within noise. On Westmere, differences in
bootstrap and "make check" time of GCC 7 with lazy binding GCC and
binutils are also within noise.

 [BZ #21265]
 * sysdeps/x86/cpu-features-offsets.sym (XSAVE_STATE_SIZE_OFFSET):
 New.
 * sysdeps/x86/cpu-features.c: Include <libc-pointer-arith.h>.
 (get_common_indeces): Set xsave_state_size, xsave_state_full_size
 and bit_arch_XSAVEC_Usable if needed.
 (init_cpu_features): Remove bit_arch_Use_dl_runtime_resolve_slow
 and bit_arch_Use_dl_runtime_resolve_opt.
 * sysdeps/x86/cpu-features.h (bit_arch_Use_dl_runtime_resolve_opt):
 Removed.
 (bit_arch_Use_dl_runtime_resolve_slow): Likewise.
 (bit_arch_Prefer_No_AVX512): Updated.
 (bit_arch_MathVec_Prefer_No_AVX512): Likewise.
 (bit_arch_XSAVEC_Usable): New.
 (STATE_SAVE_OFFSET): Likewise.
 (STATE_SAVE_MASK): Likewise.
 [__ASSEMBLER__]: Include <cpu-features-offsets.h>.
 (cpu_features): Add xsave_state_size and xsave_state_full_size.
 (index_arch_Use_dl_runtime_resolve_opt): Removed.
 (index_arch_Use_dl_runtime_resolve_slow): Likewise.
 (index_arch_XSAVEC_Usable): New.
 * sysdeps/x86/cpu-tunables.c (TUNABLE_CALLBACK (set_hwcaps)):
 Support XSAVEC_Usable. Remove Use_dl_runtime_resolve_slow.
 * sysdeps/x86_64/Makefile (tst-x86_64-1-ENV): New if tunables
 is enabled.
 * sysdeps/x86_64/dl-machine.h (elf_machine_runtime_setup):
 Replace _dl_runtime_resolve_sse, _dl_runtime_resolve_avx,
 _dl_runtime_resolve_avx_slow, _dl_runtime_resolve_avx_opt,
 _dl_runtime_resolve_avx512 and _dl_runtime_resolve_avx512_opt
 with _dl_runtime_resolve_fxsave, _dl_runtime_resolve_xsave and
 _dl_runtime_resolve_xsavec.
 * sysdeps/x86_64/dl-trampoline.S (DL_RUNTIME_UNALIGNED_VEC_SIZE):
 Removed.
 (DL_RUNTIME_RESOLVE_REALIGN_STACK): Check STATE_SAVE_ALIGNMENT
 instead of VEC_SIZE.
 (REGISTER_SAVE_BND0): Removed.
 (REGISTER_SAVE_BND1): Likewise.
 (REGISTER_SAVE_BND3): Likewise.
 (REGISTER_SAVE_RAX): Always defined to 0.
 (VMOV): Removed.
 (_dl_runtime_resolve_avx): Likewise.
 (_dl_runtime_resolve_avx_slow): Likewise.
 (_dl_runtime_resolve_avx_opt): Likewise.
 (_dl_runtime_resolve_avx512): Likewise.
 (_dl_runtime_resolve_avx512_opt): Likewise.
 (_dl_runtime_resolve_sse): Likewise.
 (_dl_runtime_resolve_sse_vex): Likewise.
 (USE_FXSAVE): New.
 (_dl_runtime_resolve_fxsave): Likewise.
 (USE_XSAVE): Likewise.
 (_dl_runtime_resolve_xsave): Likewise.
 (USE_XSAVEC): Likewise.
 (_dl_runtime_resolve_xsavec): Likewise.
 * sysdeps/x86_64/dl-trampoline.h (_dl_runtime_resolve_avx512):
 Removed.
 (_dl_runtime_resolve_avx512_opt): Likewise.
 (_dl_runtime_resolve_avx): Likewise.
 (_dl_runtime_resolve_avx_opt): Likewise.
 (_dl_runtime_resolve_sse): Likewise.
 (_dl_runtime_resolve_sse_vex): Likewise.
 (_dl_runtime_resolve_fxsave): New.
 (_dl_runtime_resolve_xsave): Likewise.
 (_dl_runtime_resolve_xsavec): Likewise.

(cherry picked from commit b52b0d793dcb226ecb0ecca1e672ca265973233c)

hjl/pr21265/2.24 2017-10-20 20:48:14 UTC 2017-10-20
x86-64: Use fxsave/xsave/xsavec in _dl_runtime_resolve [BZ #21265]

Author: H.J. Lu
Author Date: 2017-03-23 15:21:52 UTC

x86-64: Use fxsave/xsave/xsavec in _dl_runtime_resolve [BZ #21265]

In _dl_runtime_resolve, use fxsave/xsave/xsavec to preserve all vector,
mask and bound registers. It simplifies _dl_runtime_resolve and supports
different calling conventions. ld.so code size is reduced by more than
1 KB. However, using fxsave/xsave/xsavec takes a few more cycles
than saving and restoring vector and bound registers individually.

Latency for _dl_runtime_resolve to look up the function, foo, from one
shared library plus libc.so:

                              Before    After    Change

Westmere (SSE)/fxsave         345       866      151%
IvyBridge (AVX)/xsave         420       643      53%
Haswell (AVX)/xsave           713       1252     75%
Skylake (AVX+MPX)/xsavec      559       719      28%
Skylake (AVX512+MPX)/xsavec   145       272      87%
Ryzen (AVX)/xsavec            280       553      97%

This is the worst case, where the portion of time spent saving and
restoring registers is larger than in the majority of cases. With the
smaller _dl_runtime_resolve code size, the overall performance impact is
negligible.

On IvyBridge, differences in build and test time of binutils with lazy
binding GCC and binutils are within noise. On Westmere, differences in
bootstrap and "make check" time of GCC 7 with lazy binding GCC and
binutils are also within noise.

 [BZ #21265]
 * sysdeps/x86/cpu-features-offsets.sym (XSAVE_STATE_SIZE_OFFSET):
 New.
 * sysdeps/x86/cpu-features.c: Include <libc-internal.h>.
 (get_common_indeces): Set xsave_state_size and
 bit_arch_XSAVEC_Usable if needed.
 (init_cpu_features): Remove bit_arch_Use_dl_runtime_resolve_slow
 and bit_arch_Use_dl_runtime_resolve_opt.
 * sysdeps/x86/cpu-features.h (bit_arch_Use_dl_runtime_resolve_opt):
 Removed.
 (bit_arch_Use_dl_runtime_resolve_slow): Likewise.
 (bit_arch_Prefer_No_AVX512): Updated.
 (bit_arch_MathVec_Prefer_No_AVX512): Likewise.
 (bit_arch_XSAVEC_Usable): New.
 (STATE_SAVE_OFFSET): Likewise.
 (STATE_SAVE_MASK): Likewise.
 [__ASSEMBLER__]: Include <cpu-features-offsets.h>.
 (cpu_features): Add xsave_state_size.
 (index_arch_Use_dl_runtime_resolve_opt): Removed.
 (index_arch_Use_dl_runtime_resolve_slow): Likewise.
 (index_arch_XSAVEC_Usable): New.
 * sysdeps/x86_64/dl-machine.h (elf_machine_runtime_setup):
 Replace _dl_runtime_resolve_sse, _dl_runtime_resolve_avx,
 _dl_runtime_resolve_avx_slow, _dl_runtime_resolve_avx_opt,
 _dl_runtime_resolve_avx512 and _dl_runtime_resolve_avx512_opt
 with _dl_runtime_resolve_fxsave, _dl_runtime_resolve_xsave and
 _dl_runtime_resolve_xsavec.
 * sysdeps/x86_64/dl-trampoline.S (DL_RUNTIME_UNALIGNED_VEC_SIZE):
 Removed.
 (DL_RUNTIME_RESOLVE_REALIGN_STACK): Check STATE_SAVE_ALIGNMENT
 instead of VEC_SIZE.
 (REGISTER_SAVE_BND0): Removed.
 (REGISTER_SAVE_BND1): Likewise.
 (REGISTER_SAVE_BND3): Likewise.
 (REGISTER_SAVE_RAX): Always defined to 0.
 (VMOV): Removed.
 (_dl_runtime_resolve_avx): Likewise.
 (_dl_runtime_resolve_avx_slow): Likewise.
 (_dl_runtime_resolve_avx_opt): Likewise.
 (_dl_runtime_resolve_avx512): Likewise.
 (_dl_runtime_resolve_avx512_opt): Likewise.
 (_dl_runtime_resolve_sse): Likewise.
 (_dl_runtime_resolve_sse_vex): Likewise.
 (USE_FXSAVE): New.
 (_dl_runtime_resolve_fxsave): Likewise.
 (USE_XSAVE): Likewise.
 (_dl_runtime_resolve_xsave): Likewise.
 (USE_XSAVEC): Likewise.
 (_dl_runtime_resolve_xsavec): Likewise.
 * sysdeps/x86_64/dl-trampoline.h (_dl_runtime_resolve_avx512):
 Removed.
 (_dl_runtime_resolve_avx512_opt): Likewise.
 (_dl_runtime_resolve_avx): Likewise.
 (_dl_runtime_resolve_avx_opt): Likewise.
 (_dl_runtime_resolve_sse): Likewise.
 (_dl_runtime_resolve_sse_vex): Likewise.
 (_dl_runtime_resolve_fxsave): New.
 (_dl_runtime_resolve_xsave): Likewise.
 (_dl_runtime_resolve_xsavec): Likewise.

(cherry picked from commit b52b0d793dcb226ecb0ecca1e672ca265973233c)

hjl/pr21265/2.25 2017-10-20 20:46:16 UTC 2017-10-20
x86-64: Use fxsave/xsave/xsavec in _dl_runtime_resolve [BZ #21265]

Author: H.J. Lu
Author Date: 2017-03-23 15:21:52 UTC

x86-64: Use fxsave/xsave/xsavec in _dl_runtime_resolve [BZ #21265]

In _dl_runtime_resolve, use fxsave/xsave/xsavec to preserve all vector,
mask and bound registers. It simplifies _dl_runtime_resolve and supports
different calling conventions. ld.so code size is reduced by more than
1 KB. However, using fxsave/xsave/xsavec takes a few more cycles
than saving and restoring vector and bound registers individually.

Latency for _dl_runtime_resolve to look up the function, foo, from one
shared library plus libc.so:

                              Before    After    Change

Westmere (SSE)/fxsave         345       866      151%
IvyBridge (AVX)/xsave         420       643      53%
Haswell (AVX)/xsave           713       1252     75%
Skylake (AVX+MPX)/xsavec      559       719      28%
Skylake (AVX512+MPX)/xsavec   145       272      87%
Ryzen (AVX)/xsavec            280       553      97%

This is the worst case, where the portion of time spent saving and
restoring registers is larger than in the majority of cases. With the
smaller _dl_runtime_resolve code size, the overall performance impact is
negligible.

On IvyBridge, differences in build and test time of binutils with lazy
binding GCC and binutils are within noise. On Westmere, differences in
bootstrap and "make check" time of GCC 7 with lazy binding GCC and
binutils are also within noise.

 [BZ #21265]
 * sysdeps/x86/cpu-features-offsets.sym (XSAVE_STATE_SIZE_OFFSET):
 New.
 * sysdeps/x86/cpu-features.c: Include <libc-internal.h>.
 (get_common_indeces): Set xsave_state_size and
 bit_arch_XSAVEC_Usable if needed.
 (init_cpu_features): Remove bit_arch_Use_dl_runtime_resolve_slow
 and bit_arch_Use_dl_runtime_resolve_opt.
 * sysdeps/x86/cpu-features.h (bit_arch_Use_dl_runtime_resolve_opt):
 Removed.
 (bit_arch_Use_dl_runtime_resolve_slow): Likewise.
 (bit_arch_Prefer_No_AVX512): Updated.
 (bit_arch_MathVec_Prefer_No_AVX512): Likewise.
 (bit_arch_XSAVEC_Usable): New.
 (STATE_SAVE_OFFSET): Likewise.
 (STATE_SAVE_MASK): Likewise.
 [__ASSEMBLER__]: Include <cpu-features-offsets.h>.
 (cpu_features): Add xsave_state_size.
 (index_arch_Use_dl_runtime_resolve_opt): Removed.
 (index_arch_Use_dl_runtime_resolve_slow): Likewise.
 (index_arch_XSAVEC_Usable): New.
 * sysdeps/x86_64/dl-machine.h (elf_machine_runtime_setup):
 Replace _dl_runtime_resolve_sse, _dl_runtime_resolve_avx,
 _dl_runtime_resolve_avx_slow, _dl_runtime_resolve_avx_opt,
 _dl_runtime_resolve_avx512 and _dl_runtime_resolve_avx512_opt
 with _dl_runtime_resolve_fxsave, _dl_runtime_resolve_xsave and
 _dl_runtime_resolve_xsavec.
 * sysdeps/x86_64/dl-trampoline.S (DL_RUNTIME_UNALIGNED_VEC_SIZE):
 Removed.
 (DL_RUNTIME_RESOLVE_REALIGN_STACK): Check STATE_SAVE_ALIGNMENT
 instead of VEC_SIZE.
 (REGISTER_SAVE_BND0): Removed.
 (REGISTER_SAVE_BND1): Likewise.
 (REGISTER_SAVE_BND3): Likewise.
 (REGISTER_SAVE_RAX): Always defined to 0.
 (VMOV): Removed.
 (_dl_runtime_resolve_avx): Likewise.
 (_dl_runtime_resolve_avx_slow): Likewise.
 (_dl_runtime_resolve_avx_opt): Likewise.
 (_dl_runtime_resolve_avx512): Likewise.
 (_dl_runtime_resolve_avx512_opt): Likewise.
 (_dl_runtime_resolve_sse): Likewise.
 (_dl_runtime_resolve_sse_vex): Likewise.
 (USE_FXSAVE): New.
 (_dl_runtime_resolve_fxsave): Likewise.
 (USE_XSAVE): Likewise.
 (_dl_runtime_resolve_xsave): Likewise.
 (USE_XSAVEC): Likewise.
 (_dl_runtime_resolve_xsavec): Likewise.
 * sysdeps/x86_64/dl-trampoline.h (_dl_runtime_resolve_avx512):
 Removed.
 (_dl_runtime_resolve_avx512_opt): Likewise.
 (_dl_runtime_resolve_avx): Likewise.
 (_dl_runtime_resolve_avx_opt): Likewise.
 (_dl_runtime_resolve_sse): Likewise.
 (_dl_runtime_resolve_sse_vex): Likewise.
 (_dl_runtime_resolve_fxsave): New.
 (_dl_runtime_resolve_xsave): Likewise.
 (_dl_runtime_resolve_xsavec): Likewise.

(cherry picked from commit b52b0d793dcb226ecb0ecca1e672ca265973233c)

hjl/pr22298/master 2017-10-15 14:48:58 UTC 2017-10-15
Define __PTHREAD_MUTEX_HAVE_PREV only if undefined [BZ #22298]

Author: H.J. Lu
Author Date: 2017-10-15 14:48:58 UTC

Define __PTHREAD_MUTEX_HAVE_PREV only if undefined [BZ #22298]

It is incorrect to define __PTHREAD_MUTEX_HAVE_PREV to 1 only when
__WORDSIZE == 64. For x32, __PTHREAD_MUTEX_HAVE_PREV should be 1, but
it has __WORDSIZE == 32. This patch defines __PTHREAD_MUTEX_HAVE_PREV
based on __WORDSIZE only if it is undefined. __PTHREAD_MUTEX_HAVE_PREV
check is changed from "#ifdef" to "#if" to support values of 0 or 1.
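
The pattern can be sketched as follows. All DEMO_ names are hypothetical
stand-ins (DEMO_WORDSIZE for __WORDSIZE, DEMO_MUTEX_HAVE_PREV for
__PTHREAD_MUTEX_HAVE_PREV); this is not the glibc header itself.

```c
#include <assert.h>

/* Stand-in for __WORDSIZE; 32 is an arbitrary value for illustration.  */
#ifndef DEMO_WORDSIZE
# define DEMO_WORDSIZE 32
#endif

/* An architecture header (x32, say) could define the macro first:
   #define DEMO_MUTEX_HAVE_PREV 1  */

/* Generic header: fall back to the word-size test only if the
   architecture header did not define the macro.  */
#ifndef DEMO_MUTEX_HAVE_PREV
# if DEMO_WORDSIZE == 64
#  define DEMO_MUTEX_HAVE_PREV 1
# else
#  define DEMO_MUTEX_HAVE_PREV 0
# endif
#endif

/* Users test the value with #if; #ifdef would also be true for 0.  */
struct demo_list
{
#if DEMO_MUTEX_HAVE_PREV
  struct demo_list *prev;
#endif
  struct demo_list *next;
};

static_assert (sizeof (struct demo_list)
               == (DEMO_MUTEX_HAVE_PREV ? 2 : 1) * sizeof (struct demo_list *),
               "prev is present only when DEMO_MUTEX_HAVE_PREV is non-zero");
```

Because the macro is always defined after this header, to either 0 or 1,
a leftover "#ifdef" test elsewhere would silently take the wrong branch
on targets where the value is 0.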

 [BZ #22298]
 * nptl/allocatestack.c (allocate_stack): Check if
 __PTHREAD_MUTEX_HAVE_PREV is non-zero, instead of if
 __PTHREAD_MUTEX_HAVE_PREV is defined.
 * nptl/descr.h (pthread): Likewise.
 * nptl/nptl-init.c (__pthread_initialize_minimal_internal):
 Likewise.
 * nptl/pthread_create.c (START_THREAD_DEFN): Likewise.
 * sysdeps/nptl/fork.c (__libc_fork): Likewise.
 * sysdeps/nptl/pthread.h (PTHREAD_MUTEX_INITIALIZER): Likewise.
 * sysdeps/nptl/bits/thread-shared-types.h
 (__PTHREAD_MUTEX_HAVE_PREV): Define only if it is undefined.
 (__pthread_internal_list): Check __PTHREAD_MUTEX_HAVE_PREV instead
 of __WORDSIZE.
 (__PTHREAD_SPINS_DATA): Likewise.
 (__PTHREAD_SPINS): Likewise.
 (__pthread_mutex_s): Likewise.
 * sysdeps/x86/nptl/bits/pthreadtypes-arch.h
 (__PTHREAD_MUTEX_HAVE_PREV): Defined.

hjl/pr22284/2.26 2017-10-12 22:56:40 UTC 2017-10-12
Support profiling PIE [BZ #22284]

Author: H.J. Lu
Author Date: 2017-10-12 10:45:55 UTC

Support profiling PIE [BZ #22284]

Since a PIE can be loaded at any address, we need to subtract the load
address from PCs.
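
A minimal sketch of the idea using the public dl_iterate_phdr interface
(the patch itself calls the internal __dl_iterate_phdr). The function
names are hypothetical, and the sketch assumes glibc on Linux, where the
main executable is reported with an empty dlpi_name.

```c
#ifndef _GNU_SOURCE
# define _GNU_SOURCE
#endif
#include <assert.h>
#include <link.h>
#include <stddef.h>
#include <stdint.h>

/* dl_iterate_phdr callback: record the load address of the main
   executable, whose dlpi_name is the empty string, and stop iterating.  */
int
find_exe_load_address (struct dl_phdr_info *info, size_t size, void *data)
{
  (void) size;
  if (info->dlpi_name[0] == '\0')
    {
      *(uintptr_t *) data = (uintptr_t) info->dlpi_addr;
      return 1;
    }
  return 0;
}

/* Load address of the running executable (0 for a non-PIE binary).  */
uintptr_t
exe_load_address (void)
{
  uintptr_t load_address = 0;
  dl_iterate_phdr (find_exe_load_address, &load_address);
  return load_address;
}

/* Convert a runtime PC into the address seen in the unloaded binary.  */
uintptr_t
pc_to_link_address (uintptr_t pc, uintptr_t load_address)
{
  assert (pc >= load_address);
  return pc - load_address;
}
```

With a load address of 0x4000, a sampled PC of 0x5abc maps back to
0x1abc, which is the value a tool reading the on-disk symbol table
expects.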

 [BZ #22284]
 * gmon/gmon.c [PIC]: Include <link.h>.
 [PIC] (callback): New function.
 (write_hist): Add an argument for load address. Subtract load
 address from PCs.
 (write_call_graph): Likewise.
 (write_gmon): Call __dl_iterate_phdr to get load address, pass
 it to write_hist and write_call_graph.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>

(cherry picked from commit d165ca64980f90ccace088670652cc203d1b5411)

hjl/nsz/math 2017-09-30 02:39:47 UTC 2017-09-30
i386: Replace assembly versions of e_powf with generic e_powf.c

Author: H.J. Lu
Author Date: 2017-09-30 02:39:47 UTC

i386: Replace assembly versions of e_powf with generic e_powf.c

This patch replaces i386 assembly versions of e_powf with generic
e_powf.c.

 * sysdeps/i386/fpu/e_powf.S: Removed.
 * sysdeps/i386/fpu/e_powf_log2_data.c: Likewise.
 * sysdeps/i386/fpu/w_powf.c: Likewise.
 * sysdeps/i386/fpu/libm-test-ulps: Updated for generic e_powf.c.
 * sysdeps/i386/i686/fpu/multiarch/libm-test-ulps: Likewise.
 * sysdeps/i386/i686/fpu/multiarch/Makefile (libm-sysdep_routines):
 Add e_powf-sse2.
 (CFLAGS-e_powf-sse2.c): New.
 * sysdeps/i386/i686/fpu/multiarch/e_powf-sse2.c: New file.
 * sysdeps/i386/i686/fpu/multiarch/e_powf.c: Likewise.

nsz/math 2017-09-29 10:51:12 UTC 2017-09-29
Do not wrap logf, log2f and powf

Author: Szabolcs Nagy
Author Date: 2017-09-13 17:14:26 UTC

Do not wrap logf, log2f and powf

The new generic logf, log2f and powf code doesn't need wrappers any more;
it sets errno inline, so only use the wrappers on targets that need them.
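
As an illustration of what setting errno inline means, here is a toy
routine that reports its own pole error instead of relying on a separate
wrapper. demo_recipf is a hypothetical function, not glibc code, and it
ignores the sign of zero for brevity.

```c
#include <assert.h>
#include <errno.h>
#include <math.h>

/* Hypothetical routine: computes 1/x and reports the pole error at
   x == 0 itself, so no wrapper is needed to set errno afterwards.  */
float
demo_recipf (float x)
{
  if (x == 0.0f)
    {
      errno = ERANGE;           /* Set inline, on the error path only.  */
      return HUGE_VALF;
    }
  return 1.0f / x;
}

/* Usage: the caller clears errno, calls, then inspects errno.  */
void
demo_recipf_usage (void)
{
  errno = 0;
  float r = demo_recipf (0.0f);
  assert (r == HUGE_VALF && errno == ERANGE);
}
```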

2017-09-19 Szabolcs Nagy <szabolcs.nagy@arm.com>

 * sysdeps/ieee754/flt-32/e_log2f.c (__log2f): Define without wrapper.
 * sysdeps/ieee754/flt-32/e_logf.c (__logf): Likewise
 * sysdeps/ieee754/flt-32/e_powf.c (__powf): Likewise
 * sysdeps/ieee754/flt-32/w_log2f.c: New file.
 * sysdeps/ieee754/flt-32/w_logf.c: New file.
 * sysdeps/ieee754/flt-32/w_powf.c: New file.
 * sysdeps/i386/fpu/w_log2f.c: New file.
 * sysdeps/i386/fpu/w_logf.c: New file.
 * sysdeps/i386/fpu/w_powf.c: New file.
 * sysdeps/m68k/m680x0/fpu/w_log2f.c: New file.
 * sysdeps/m68k/m680x0/fpu/w_logf.c: New file.
 * sysdeps/m68k/m680x0/fpu/w_powf.c: New file.

hjl/pr22053/master 2017-08-31 13:49:37 UTC 2017-08-31
Remove zero terminator for .eh_frame in libc.so [BZ #22053]

Author: H.J. Lu
Author Date: 2017-08-31 13:49:37 UTC

Remove zero terminator for .eh_frame in libc.so [BZ #22053]

elf/sofini.c has a zero terminator for .eh_frame in libc.so. It was
added before --eh-frame-hdr was added to ld. Since --eh-frame-hdr is
always used to build libc.so, the zero terminator in elf/sofini.c can be
removed.

 [BZ #22053]
 * elf/sofini.c (__FRAME_END__): Removed.

hjl/fma/master 2017-08-28 15:14:40 UTC 2017-08-28
Mention x86-64 FMA optimization in NEWS

Author: H.J. Lu
Author Date: 2017-08-16 15:56:24 UTC

Mention x86-64 FMA optimization in NEWS

 * NEWS: Mention x86-64 FMA optimization.

hjl/pr21967/master 2017-08-25 18:44:24 UTC 2017-08-25
x86: Add MathVec_Prefer_No_AVX512 to cpu-features [BZ #21967]

Author: H.J. Lu
Author Date: 2017-08-25 18:01:03 UTC

x86: Add MathVec_Prefer_No_AVX512 to cpu-features [BZ #21967]

AVX512 functions in mathvec are used on machines with AVX512. An AVX2
wrapper is also provided and it can be used when the AVX512 version
isn't profitable. MathVec_Prefer_No_AVX512 is added to cpu-features.
If glibc.tune.hwcaps=MathVec_Prefer_No_AVX512 is set in the GLIBC_TUNABLES
environment variable, the AVX2 wrapper will be used.

 [BZ #21967]
 * sysdeps/x86/cpu-features.h (bit_arch_MathVec_Prefer_No_AVX512):
 New.
 (index_arch_MathVec_Prefer_No_AVX512): Likewise.
 * sysdeps/x86/cpu-tunables.c (TUNABLE_CALLBACK (set_hwcaps)):
 Handle MathVec_Prefer_No_AVX512.

zack/dont-install-libio-h 2017-08-23 11:52:27 UTC 2017-08-23
Don't install libio.h or _G_config.h.

Author: Zack Weinberg
Author Date: 2017-04-20 15:21:30 UTC

Don't install libio.h or _G_config.h.

This is an experimental patch which removes libio.h (and _G_config.h)
from the set of application-exposed headers. After this change, the
public stdio.h does not define any symbols whose names begin with _G_
nor _IO_, except that when optimizing, the guts of struct _IO_FILE and
three of the flag constants are visible (see bits/stdio.h and
bits/types/FILE_internals.h). There is a small amount of code
duplication in bits/stdio.h, of macro bodies from libio.h that are no
longer available. A number of internal .c files that were manually
doing PLT bypass for flockfile/funlockfile can now rely on
include/stdio.h to do it for them.

It passes the testsuite on x86_64-linux, but it needs a great deal of
additional testing; in particular I'm almost certain I broke the
support for old-format (GLIBC_2.0) struct _IO_FILE, which is
configured out on this target. Testing this properly would require
someone to get their hands on _really_ old binaries, compiled against
glibc 2.0, possibly statically-linked-but-using-NSS. Unfortunately,
libc.so cannot be expected to be binary identical.

However, this should be ready to feed into archive rebuilds to find
out what applications break.

Substantial clean-ups to the libio implementation are possible if this
sticks, but I haven't done 'em; this is intended to be minimal.

 * libio/Makefile: Don't install libio.h or _G_config.h. Do install
 bits/types/FILE_internals.h, bits/types/cookie_io_functions_t.h,
 and bits/types/__fpos_t.h.

 * libio/stdio.h: Don't include libio.h. Get __gnuc_va_list
 directly from stdarg.h, __fpos_t and __fpos64_t from
 bits/types/__fpos_t.h, and the cookie types from
 bits/types/cookie_io_functions_t.h. Change all uses of
 _G_va_list, _G_fpos_t, _G_fpos64_t, _IO_FILE,
 _IO_cookie_io_functions_t, and _IO_ssize_t to __gnuc_va_list,
 __fpos_t, __fpos64_t, FILE, cookie_io_functions_t, and __ssize_t
 respectively.
 Do not define getc nor putc as macros.
 Define BUFSIZ as literal 8192.

 * libio/bits/types/FILE_internals.h: New header. Provide complete
 definition of struct _IO_FILE (the complete version) here.
 Duplicate definitions of _IO_EOF_SEEN, _IO_ERR_SEEN, and _IO_USER_LOCK
 here, with value assertions if they are already defined.
 * libio/bits/types/__fpos_t.h: New header. Define __fpos_t and
 __fpos64_t here.
 * libio/bits/types/cookie_io_functions_t.h: New header. Define
 cookie_read_function_t, cookie_write_function_t,
 cookie_seek_function_t, cookie_close_function_t, and
 cookie_io_functions_t here.

 * libio/libio.h: Include features.h first thing, then error out if
 either _LIBC or __USE_GNU is not defined, or if _ISOMAC is
 defined. Inline all of _G_config.h except _G_HAVE_MREMAP here.
 Get definitions of __mbstate_t, __fpos_t, __fpos64_t, struct
 _IO_FILE, and the cookie-related types from the relevant
 bits/types headers. Get definition of NULL from stddef.h.
 Make all #ifdef _LIBC and #if __GNUC__ >= (2,3) blocks
 unconditional. Remove all #if 0 and #ifdef __cplusplus blocks.
 Change all uses of _G_va_list, _G_fpos_t, and _G_fpos64_t to
 __gnuc_va_list, __fpos_t, __fpos64_t respectively. Provide
 definitions of _STDIO_USES_IOSTREAM, __HAVE_COLUMN,
 _IO_file_flags, __io_read_fn, __io_write_fn, __io_seek_fn,
 __io_close_fn, _IO_cookie_io_functions_t for the sake of the
 implementation. When _IO_USE_OLD_IO_FILE is defined, define
 struct _IO_FILE_old.
 * libio/libioP.h: When _IO_USE_OLD_IO_FILE is defined, define
 struct _IO_FILE_old_plus. Only declare _IO_old_file_init_internal
 when _IO_USE_OLD_IO_FILE is defined, and have it take an
 argument of type struct _IO_FILE_old_plus.
 * libio/oldfileops.c: Change all uses of _IO_FILE to _IO_FILE_old,
 _IO_FILE_plus to _IO_FILE_old_plus, _IO_FILE_complete to _IO_FILE,
 and _IO_FILE_complete_plus to _IO_FILE_plus. Then adjust types
 to match caller/callee's expectations.
 * libio/oldiofdopen.c, libio/oldiofopen.c, libio/oldiopopen.c
 * libio/oldstdfiles.c: Likewise.
 * sysdeps/generic/_G_config.h, sysdeps/unix/sysv/linux/_G_config.h:
 Only provide definition or non-definition of _G_HAVE_MREMAP.

 * sysdeps/ieee754/ldbl-opt/nldbl-iovfscanf.c: Delete file.
 * sysdeps/ieee754/ldbl-opt/Makefile (libnldbl-calls): Remove iovfscanf.
 * sysdeps/ieee754/ldbl-opt/nldbl-compat.c: Define
 __nldbl__IO_vsprintf as alias to __nldbl_vsprintf instead of
 the other way around.
 * sysdeps/ieee754/ldbl-opt/nldbl-compat.h:
 Change all uses of _G_va_list to __gnuc_va_list. Remove
 NLDBL_DECL for _IO_vfscanf.
 * sysdeps/ieee754/ldbl-opt/nldbl-fscanf.c
 * sysdeps/ieee754/ldbl-opt/nldbl-scanf.c
 * sysdeps/ieee754/ldbl-opt/nldbl-vfscanf.c
 * sysdeps/ieee754/ldbl-opt/nldbl-vscanf.c:
 Use __nldbl_vfscanf, not __nldbl__IO_vfscanf.

 * libio/bits/stdio.h: Add multiple-inclusion guard. Include
 bits/types/FILE_internals.h. Declare __uflow and __overflow here.
 Remove redundant __USE_EXTERN_INLINES ifdef. Change all uses of
 _G_va_list to __gnuc_va_list and _IO_ssize_t to __ssize_t.
 (getchar): Use getc, not _IO_getc.
 (__getc_unlocked, __putc_unlocked): New inlines, duplicating the
 bodies of _IO_getc_unlocked and _IO_putc_unlocked.
 (fgetc_unlocked, getc_unlocked, getchar_unlocked, fread_unlocked):
 Use __getc_unlocked.
 (fputc_unlocked, putc_unlocked, putchar_unlocked, fwrite_unlocked):
 Use __putc_unlocked.
 (feof_unlocked): Duplicate the body of _IO_feof_unlocked here.
 (ferror_unlocked): Duplicate the body of _IO_ferror_unlocked here.
 * libio/bits/stdio2.h: Change all uses of _G_va_list to __gnuc_va_list.
 (fread_unlocked): Use __getc_unlocked.
 * libio/bits/types/FILE.h, libio/bits/types/__FILE.h: Explain in
 comments why the name _IO_FILE is used.

 * include/stdio.h: Change all uses of _G_va_list to __gnuc_va_list,
 _IO_ssize_t to __ssize_t, _IO_FILE to FILE, and _IO_fpos_t to __fpos_t.
 When IS_IN (libc), redirect flockfile and funlockfile to
 __flockfile and __funlockfile respectively.
 When _IO_MTSAFE_IO and not _ISOMAC, include stdio-lock.h before
 stdio.h proper.
 * include/stdio_ext.h: Include bits/types/FILE_internals.h for the
 sake of the inline definition of __fsetlocking.
 * include/libio.h: Adjust #ifdef nest to activate multiple-include
 optimization.
 * include/bits/types/FILE_internals.h, include/bits/types/__fpos_t.h
 * include/bits/types/cookie_io_functions_t.h: New trivial wrappers.
 * include/bits/stdio.h: New wrapper; mark __uflow and __overflow
 as hidden for intra-libc callers.

 * csu/init.c: Include libio.h, not _G_config.h.

 * grp/fgetgrent_r.c, grp/putgrent.c, gshadow/fgetsgent_r.c
 * gshadow/putsgent.c, misc/getpass.c, misc/getttyent.c
 * misc/mntent_r.c, posix/getopt.c, pwd/fgetpwent_r.c
 * shadow/fgetspent_r.c, shadow/putspent.c:
 Don't include libio/iolibio.h. Don't redefine flockfile or
 funlockfile. Don't use _IO_flockfile or _IO_funlockfile.

 * libio/__fbufsize.c, libio/__flbf.c, libio/__fpending.c
 * libio/__freadable.c, libio/__freading.c, libio/__fwritable.c
 * libio/__fwriting.c, malloc/malloc.c: Include libio.h.
 * misc/err.c: Include libio.h. Don't redefine flockfile or funlockfile.

 * stdio-common/tstgetln.c: Include sys/types.h. Don't redefine ssize_t.
 * conform/data/stdio.h-data: va_list may be defined as __gnuc_va_list,
 not _G_va_list.
 * benchtests/strcoll-inputs/filelist#en_US.UTF-8: Remove _G_config.h.

zack/anon-unions 2017-08-23 11:42:53 UTC 2017-08-23
RFC: Use anonymous union for siginfo_t

Author: Zack Weinberg
Author Date: 2017-08-11 13:43:59 UTC

RFC: Use anonymous union for siginfo_t

C2011 officially includes an 'anonymous union' feature, and GCC has
supported it for many years. It makes sub-fields of a union that's a
struct field appear to be fields of the parent struct. If we use this
in the definition of siginfo_t, we don't need to define lots of
innocuous-looking identifiers like 'si_pid' as macros expanding to
chains of field accessors. The catch, however, is that the compiler
used to compile *programs that use glibc* - not just glibc itself -
must accept the use of this feature (in system headers) even when
running in an older conformance mode.
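
The difference can be sketched with a miniature type. The demo_ structs
and field names below are hypothetical and do not reflect the real
siginfo_t layout.

```c
#include <assert.h>

/* Old style: a named union, with a macro so that the short name expands
   to the full chain of field accessors.  */
struct demo_info_old
{
  int signo;
  union
  {
    struct { int pid; int uid; } kill;
    struct { void *addr; } fault;
  } fields;
};
#define demo_old_pid fields.kill.pid

/* C11 style: anonymous union and anonymous structs, so the sub-fields
   appear as ordinary members of the parent struct.  */
struct demo_info
{
  int signo;
  union
  {
    struct { int pid; int uid; };
    struct { void *addr; };
  };
};

int
demo_get_pid_old (const struct demo_info_old *info)
{
  return info->demo_old_pid;    /* Expands to info->fields.kill.pid.  */
}

int
demo_get_pid (const struct demo_info *info)
{
  return info->pid;             /* A real member name, no macro.  */
}

/* Usage: members of the anonymous union are reached directly.  */
void
demo_usage (void)
{
  struct demo_info info = { .signo = 11, .pid = 42 };
  assert (demo_get_pid (&info) == 42);
}
```

With the macro approach, any unrelated identifier called demo_old_pid in
user code is rewritten by the preprocessor; with anonymous members there
is no macro to collide with.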

This patch only touches siginfo_t, but if people like the idea, we
could also do it for several other types:

netinet/in.h (in6_addr)
sys/stat.h (stat)
utmp.h (utmp)
signal.h (sigaction, sigevent_t)
ucontext.h (ucontext_t)

and maybe also - these use names in the user namespace for the fields
that would be removed:

net/if.h (ifaddr)
ifaddrs.h (ifaddrs)
netinet/in6.h (ip6_hdr) (really should be bitfields instead)
netinet/icmp6.h (many)
net/if_ppp.h (ifpppstatsreq, ifpppcstatsreq)
net/if_shaper.h (shaperconf)
a.out.h (exec) (only some versions)
sys/quota.h (dqblk?) (the dq_* macros refer to a field that doesn't exist?!)

There are still more hits in sunrpc and nis, but since hopefully that
code is going to go away, I don't propose to mess with them. And
there may be even more that aren't caught by grepping for
'#define IDENT IDENT.IDENT'.

As a side note (and this could be split for commit if felt
appropriate), the siginfo_t field aliases 'si_int' and 'si_ptr' are
not in POSIX. There are a few uses of these within glibc itself, and
a handful more in third-party software (not glibc, not uclibc, and not
linux) so I have preserved them, but put them under __USE_MISC and
added a deprecation warning.

This passes the glibc testsuite on x86-64-linux, which probably
*doesn't* test the case where someone is compiling a program in
an older conformance mode that uses siginfo_t
(-std=c99 -D_XOPEN_SOURCE=600, perhaps).

What do you think?

zw

 * sysdeps/unix/sysv/linux/bits/types/siginfo_t.h (siginfo_t):
 Use C2011 anonymous union and anonymous struct-in-union features
 to define this type. Rename some public fields with their
 official names.
 (si_pid, si_uid, si_timerid, si_overrun, si_status, si_utime)
 (si_stime, si_value, si_addr, si_addr_lsb, si_lower, si_upper)
 (si_pkey, si_band, si_fd, si_call_addr, si_syscall, si_arch):
 Do not define as macros.
 (si_int, si_ptr): Define only when __USE_MISC, with deprecation
 warnings.
 * sysdeps/unix/sysv/linux/ia64/bits/siginfo-arch.h
 * sysdeps/unix/sysv/linux/sparc/bits/siginfo-arch.h
 * sysdeps/unix/sysv/linux/tile/bits/siginfo-arch.h
 (__SI_SIGFAULT_ADDL): Define all fields with their public names
 when __USE_GNU, or with impl-namespace names otherwise.
 (si_imm, si_segvflags, si_isr, si_trapno): Do not define as macros.

 * sysdeps/unix/sysv/linux/timer_routines.c (timer_helper_thread):
 Use si_value.sival_ptr instead of si_ptr.
 * sysdeps/unix/sysv/linux/tst-getpid1.c (do_test):
 Use si_value.sival_int instead of si_int.

hjl/gmp 2017-08-22 12:02:20 UTC 2017-08-22
rshift.c: Replace assert with DEBUG and abort

Author: H.J. Lu
Author Date: 2017-08-22 00:07:36 UTC

rshift.c: Replace assert with DEBUG and abort

assert in stdlib/rshift.c should be for debug purpose only and there is
no such check in any rshift assembly implementations nor in lshift.c.
This patch replaces assert with DEBUG and abort, similar to lshift.c
so that generic GMP codes from libc.a can be linked with libc.so in
atest-exp, atest-exp2 and atest-sincos, which depend on the GMP
implementation in glibc.
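
A simplified sketch of the resulting shape: the argument check is compiled
in only when DEBUG is defined and calls abort rather than assert. The
demo_ names are hypothetical, and the shift loop is a plain
reimplementation for illustration, not the glibc source.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

typedef unsigned long demo_limb_t;      /* Hypothetical limb type.  */
#define DEMO_LIMB_BITS (sizeof (demo_limb_t) * CHAR_BIT)

/* Shift the N-limb number at SRC right by CNT bits (0 < CNT < limb
   width), store the result at DST and return the bits shifted out,
   left-justified in a limb.  */
demo_limb_t
demo_rshift (demo_limb_t *dst, const demo_limb_t *src, size_t n,
             unsigned int cnt)
{
#ifdef DEBUG
  /* Debug-only argument check that calls abort, not assert.  */
  if (n == 0 || cnt == 0 || cnt >= DEMO_LIMB_BITS)
    abort ();
#endif
  demo_limb_t out = src[0] << (DEMO_LIMB_BITS - cnt);
  for (size_t i = 0; i + 1 < n; i++)
    dst[i] = (src[i] >> cnt) | (src[i + 1] << (DEMO_LIMB_BITS - cnt));
  dst[n - 1] = src[n - 1] >> cnt;
  return out;
}

/* Usage check in the caller: shift the two-limb value {6, 1} by one bit.  */
void
demo_rshift_usage (void)
{
  demo_limb_t src[2] = { 6, 1 }, dst[2];
  demo_limb_t out = demo_rshift (dst, src, 2, 1);
  assert (out == 0 && dst[1] == 0);
  assert (dst[0] == (3 | ((demo_limb_t) 1 << (DEMO_LIMB_BITS - 1))));
}
```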

 * stdlib/rshift.c (mpn_rshift): Replace assert with DEBUG and
 abort.

hjl/fma/2.26 2017-08-16 15:46:05 UTC 2017-08-16
x86-64: Optimize e_expf with FMA [BZ #21912]

Author: H.J. Lu
Author Date: 2017-08-16 15:43:35 UTC

x86-64: Optimize e_expf with FMA [BZ #21912]

FMA optimized e_expf improves performance by more than 50% on Skylake.

 [BZ #21912]
 * sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines):
 Add e_expf-fma.
 * sysdeps/x86_64/fpu/multiarch/e_expf-fma.S: New file.
 * sysdeps/x86_64/fpu/multiarch/e_expf.c: Likewise.
 * sysdeps/x86_64/fpu/multiarch/ifunc-fma.h: Likewise.

(cherry picked from commit 24a2e6588d2e0c91b4003878b0625d4a9360e8f3)

fw/tst-gmon 2017-08-15 11:44:56 UTC 2017-08-15
gmon: Add test for basic mcount/gprof functionality

Author: Florian Weimer
Author Date: 2017-08-15 11:44:56 UTC

gmon: Add test for basic mcount/gprof functionality

hjl/pr21864/master 2017-08-13 16:58:33 UTC 2017-08-13
Don't compile non-lib modules as lib modules [BZ #21864]

Author: H.J. Lu
Author Date: 2017-07-30 04:04:09 UTC

Don't compile non-lib modules as lib modules [BZ #21864]

Some programs have more than one source file. These non-lib modules
should not be compiled with -DMODULE_NAME=libc. This patch puts these
non-lib modules in $(others-extras) and adds $(others-extras) to
all-nonlib.

 [BZ #21864]
 * Makerules (all-nonlib): Add $(others-extras).
 * catgets/Makefile (others-extras): New.
 * elf/Makefile (others-extras): Likewise.
 * nss/Makefile (others-extras): Likewise.

hjl/cet/pr21598 2017-08-13 14:31:11 UTC 2017-08-13
x86: Add IBT/SHSTK bits to cpu-features

Author: H.J. Lu
Author Date: 2017-06-22 15:51:42 UTC

x86: Add IBT/SHSTK bits to cpu-features

Add IBT/SHSTK bits to cpu-features for Indirect Branch Tracking (IBT) and
Shadow Stack (SHSTK) in Intel Control-flow Enforcement Technology (CET):

https://software.intel.com/sites/default/files/managed/4d/2a/control-flow-enforcement-technology-preview.pdf

 * sysdeps/x86/cpu-features.h (bit_cpu_IBT): New.
 (bit_cpu_SHSTK): Likewise.
 (index_cpu_IBT): Likewise.
 (index_cpu_SHSTK): Likewise.
 (reg_IBT): Likewise.
 (reg_SHSTK): Likewise.
 * sysdeps/x86/cpu-tunables.c (TUNABLE_CALLBACK (set_hwcaps)):
 Handle index_cpu_IBT and index_cpu_SHSTK.

aaribaud/y2038-2.25-dev 2017-08-09 10:07:41 UTC 2017-08-09
Y2038: implement Y2038-ready adjtimex, ntp_adjtime

Author: Albert ARIBAUD (3ADEV)
Author Date: 2017-08-09 10:07:41 UTC

Y2038: implement Y2038-ready adjtimex, ntp_adjtime

hjl/pr21913/master 2017-08-07 20:05:12 UTC 2017-08-07
i386: Add <startup.h> [BZ #21913]

Author: H.J. Lu
Author Date: 2017-07-19 21:32:42 UTC

i386: Add <startup.h> [BZ #21913]

On Linux/i386, there are 3 ways to make a system call:

1. call *%gs:SYSINFO_OFFSET. This requires TLS initialization.
2. call *_dl_sysinfo. This requires relocation of _dl_sysinfo.
3. int $0x80. This is slower than #1 and #2, but works everywhere.

When an object file is compiled with PIC, #1 is preferred since it is
faster than #3 and doesn't require relocation of _dl_sysinfo. For
dynamic executables, ld.so initializes TLS. However, for static
executables, before TLS is initialized by __libc_setup_tls, #3 should
be used for syscalls.

This patch adds <startup.h> which defines _startup_fatal and defaults
it to __libc_fatal. It replaces __libc_fatal with _startup_fatal in
static executables where it is called before __libc_setup_tls is called.
This header file is included in all files containing functions which are
called before __libc_setup_tls is called. On Linux/i386, when PIE is
enabled by default, _startup_fatal is turned into ABORT_INSTRUCTION and
I386_USE_SYSENTER is defined to 0 so that "int $0x80" is used for system
calls before __libc_setup_tls is called.

 [BZ #21913]
 * config.h.in (BUILD_PIE_DEFAULT): New.
 * csu/libc-tls.c: Include <startup.h>.
 * elf/dl-tunables.c: Likewise.
 * sysdeps/unix/sysv/linux/i386/brk.c: Likewise.
 * csu/libc-tls.c: Include <startup.h>.
 (__libc_setup_tls): Call _startup_fatal instead of __libc_fatal.
 * sysdeps/generic/startup.h: New file.
 * sysdeps/unix/sysv/linux/i386/startup.h: Likewise.

hjl/xgetbv/2.25 2017-08-04 18:15:17 UTC 2017-08-04
x86-64: Use _dl_runtime_resolve_opt only with AVX512F [BZ #21871]

Author: H.J. Lu
Author Date: 2017-07-28 22:13:40 UTC

x86-64: Use _dl_runtime_resolve_opt only with AVX512F [BZ #21871]

On AVX machines with XGETBV (ECX == 1) like Skylake processors,

(gdb) disass _dl_runtime_resolve_avx_opt
Dump of assembler code for function _dl_runtime_resolve_avx_opt:
   0x0000000000015890 <+0>: push %rax
   0x0000000000015891 <+1>: push %rcx
   0x0000000000015892 <+2>: push %rdx
   0x0000000000015893 <+3>: mov $0x1,%ecx
   0x0000000000015898 <+8>: xgetbv
   0x000000000001589b <+11>: mov %eax,%r11d
   0x000000000001589e <+14>: pop %rdx
   0x000000000001589f <+15>: pop %rcx
   0x00000000000158a0 <+16>: pop %rax
   0x00000000000158a1 <+17>: and $0x4,%r11d
   0x00000000000158a5 <+21>: bnd je 0x16200 <_dl_runtime_resolve_sse_vex>
End of assembler dump.

is slower than:

(gdb) disass _dl_runtime_resolve_avx_slow
Dump of assembler code for function _dl_runtime_resolve_avx_slow:
   0x0000000000015850 <+0>: vorpd %ymm0,%ymm1,%ymm8
   0x0000000000015854 <+4>: vorpd %ymm2,%ymm3,%ymm9
   0x0000000000015858 <+8>: vorpd %ymm4,%ymm5,%ymm10
   0x000000000001585c <+12>: vorpd %ymm6,%ymm7,%ymm11
   0x0000000000015860 <+16>: vorpd %ymm8,%ymm9,%ymm9
   0x0000000000015865 <+21>: vorpd %ymm10,%ymm11,%ymm10
   0x000000000001586a <+26>: vpcmpeqd %xmm8,%xmm8,%xmm8
   0x000000000001586f <+31>: vorpd %ymm9,%ymm10,%ymm10
   0x0000000000015874 <+36>: vptest %ymm10,%ymm8
   0x0000000000015879 <+41>: bnd jae 0x158b0 <_dl_runtime_resolve_avx>
   0x000000000001587c <+44>: vzeroupper
   0x000000000001587f <+47>: bnd jmpq 0x16200 <_dl_runtime_resolve_sse_vex>
End of assembler dump.
(gdb)

since xgetbv takes many more cycles than single-cycle operations like
vorpd/vpcmpeqd/vptest. _dl_runtime_resolve_opt should be used only with
AVX512, where AVX512 instructions lead to lower CPU frequency on Skylake
server.

 [BZ #21871]
 * sysdeps/x86/cpu-features.c (init_cpu_features): Set
 bit_arch_Use_dl_runtime_resolve_opt only with AVX512F.

(cherry picked from commit d2cf37c0a2a375cf2fde69f1afbcc49e45368fc4)

hjl/ifunc/x86 2017-08-02 15:35:33 UTC 2017-08-02
x86: Remove assembly versions of HAS_CPU_FEATURE/HAS_ARCH_FEATURE

Author: H.J. Lu
Author Date: 2017-07-18 22:36:20 UTC

x86: Remove assembly versions of HAS_CPU_FEATURE/HAS_ARCH_FEATURE

Since all x86 IFUNC selectors are implemented in C, assembly versions of
HAS_CPU_FEATURE and HAS_ARCH_FEATURE can be removed.

 * sysdeps/x86/cpu-features.h [__ASSEMBLER__]
 (LOAD_RTLD_GLOBAL_RO_RDX, HAS_FEATURE, LOAD_FUNC_GOT_EAX,
 HAS_CPU_FEATURE, HAS_ARCH_FEATURE): Removed.

hjl/xgetbv/master 2017-08-02 15:12:41 UTC 2017-08-02
x86-64: Use _dl_runtime_resolve_opt only with AVX512F [BZ #21871]

Author: H.J. Lu
Author Date: 2017-07-28 22:13:40 UTC

x86-64: Use _dl_runtime_resolve_opt only with AVX512F [BZ #21871]

On AVX machines with XGETBV (ECX == 1) like Skylake processors,

(gdb) disass _dl_runtime_resolve_avx_opt
Dump of assembler code for function _dl_runtime_resolve_avx_opt:
   0x0000000000015890 <+0>: push %rax
   0x0000000000015891 <+1>: push %rcx
   0x0000000000015892 <+2>: push %rdx
   0x0000000000015893 <+3>: mov $0x1,%ecx
   0x0000000000015898 <+8>: xgetbv
   0x000000000001589b <+11>: mov %eax,%r11d
   0x000000000001589e <+14>: pop %rdx
   0x000000000001589f <+15>: pop %rcx
   0x00000000000158a0 <+16>: pop %rax
   0x00000000000158a1 <+17>: and $0x4,%r11d
   0x00000000000158a5 <+21>: bnd je 0x16200 <_dl_runtime_resolve_sse_vex>
End of assembler dump.

is slower than:

(gdb) disass _dl_runtime_resolve_avx_slow
Dump of assembler code for function _dl_runtime_resolve_avx_slow:
   0x0000000000015850 <+0>: vorpd %ymm0,%ymm1,%ymm8
   0x0000000000015854 <+4>: vorpd %ymm2,%ymm3,%ymm9
   0x0000000000015858 <+8>: vorpd %ymm4,%ymm5,%ymm10
   0x000000000001585c <+12>: vorpd %ymm6,%ymm7,%ymm11
   0x0000000000015860 <+16>: vorpd %ymm8,%ymm9,%ymm9
   0x0000000000015865 <+21>: vorpd %ymm10,%ymm11,%ymm10
   0x000000000001586a <+26>: vpcmpeqd %xmm8,%xmm8,%xmm8
   0x000000000001586f <+31>: vorpd %ymm9,%ymm10,%ymm10
   0x0000000000015874 <+36>: vptest %ymm10,%ymm8
   0x0000000000015879 <+41>: bnd jae 0x158b0 <_dl_runtime_resolve_avx>
   0x000000000001587c <+44>: vzeroupper
   0x000000000001587f <+47>: bnd jmpq 0x16200 <_dl_runtime_resolve_sse_vex>
End of assembler dump.
(gdb)

since xgetbv takes many more cycles than single-cycle operations like
vorpd/vpcmpeqd/vptest. _dl_runtime_resolve_opt should be used only with
AVX512, where AVX512 instructions lead to lower CPU frequency on Skylake
server.

 [BZ #21871]
 * sysdeps/x86/cpu-features.c (init_cpu_features): Set
 bit_arch_Use_dl_runtime_resolve_opt only with AVX512F.

hjl/pr21815/master 2017-08-02 14:50:15 UTC 2017-08-02
Compile tst-prelink.c without PIE [BZ #21815]

Author: H.J. Lu
Author Date: 2017-07-19 19:04:14 UTC

Compile tst-prelink.c without PIE [BZ #21815]

tst-prelink.c checks for conflict with GLOB_DAT relocation against
stdout. On i386, there is no GLOB_DAT relocation against stdout with
PIE. Compile tst-prelink.c without PIE to generate GLOB_DAT relocation.

 [BZ #21815]
 * elf/Makefile (CFLAGS-tst-prelink.c): New.
 (LDFLAGS-tst-prelink): Likewise.

aaribaud/y2038-2.25 2017-08-02 10:56:01 UTC 2017-08-02
Y2038: add Y2038-ready getrusage

Author: Albert ARIBAUD (3ADEV)
Author Date: 2017-08-02 10:56:01 UTC

Y2038: add Y2038-ready getrusage

hjl/ifunc/master 2017-07-30 00:06:08 UTC 2017-07-30
x86-64: Use IFUNC memcpy and mempcpy in libc.a

Author: H.J. Lu
Author Date: 2017-07-09 13:42:29 UTC

x86-64: Use IFUNC memcpy and mempcpy in libc.a

Since apply_irel is called before memcpy and mempcpy are called, we
can use IFUNC memcpy and mempcpy in libc.a.

 * sysdeps/x86_64/memmove.S (MEMCPY_SYMBOL): Don't check SHARED.
 (MEMPCPY_SYMBOL): Likewise.
 * sysdeps/x86_64/multiarch/ifunc-impl-list.c
 (__libc_ifunc_impl_list): Test memcpy and mempcpy in libc.a.
 * sysdeps/x86_64/multiarch/memcpy-ssse3-back.S: Also include
 in libc.a.
 * sysdeps/x86_64/multiarch/memcpy-ssse3.S: Likewise.
 * sysdeps/x86_64/multiarch/memmove-avx512-no-vzeroupper.S:
 Likewise.
 * sysdeps/x86_64/multiarch/memcpy.c: Also include in libc.a.
 (__hidden_ver1): Don't use in libc.a.
 * sysdeps/x86_64/multiarch/memmove-sse2-unaligned-erms.S
 (__mempcpy): Don't create a weak alias in libc.a.
 * sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S: Support
 libc.a.
 * sysdeps/x86_64/multiarch/mempcpy.c: Also include in libc.a.
 (__hidden_ver1): Don't use in libc.a.

hjl/pr21752/master 2017-07-19 18:10:31 UTC 2017-07-19
Avoid accessing corrupted stack from __stack_chk_fail [BZ #21752]

Author: H.J. Lu
Author Date: 2017-07-19 17:56:19 UTC

Avoid accessing corrupted stack from __stack_chk_fail [BZ #21752]

__libc_argv[0] points to an address on the stack, and __libc_secure_getenv
accesses environment variables, which are on the stack. We should avoid
accessing the stack when it is corrupted.

This patch also renames function argument in __fortify_fail_abort
from do_backtrace to need_backtrace to avoid confusion with do_backtrace
from enum __libc_message_action.

 [BZ #21752]
 * debug/fortify_fail.c (__fortify_fail_abort): Don't pass down
 __libc_argv[0] if we aren't doing backtrace. Rename do_backtrace
 to need_backtrace.
 * sysdeps/posix/libc_fatal.c (__libc_message): Don't call
 __libc_secure_getenv if we aren't doing backtrace.

hjl/pr21666/2.25 2017-07-19 10:39:49 UTC 2017-07-19
Avoid .symver on common symbols [BZ #21666]

Author: H.J. Lu
Author Date: 2017-06-23 21:38:46 UTC

Avoid .symver on common symbols [BZ #21666]

The .symver directive on a common symbol just creates a new common symbol,
not an alias, and the newer assembler with the bug fix for

https://sourceware.org/bugzilla/show_bug.cgi?id=21661

will issue an error. Before the fix, we got

$ readelf -sW libc.so | grep "loc[12s]"
  5109: 00000000003a0608 8 OBJECT LOCAL DEFAULT 36 loc1
  5188: 00000000003a0610 8 OBJECT LOCAL DEFAULT 36 loc2
  5455: 00000000003a0618 8 OBJECT LOCAL DEFAULT 36 locs
  6575: 00000000003a05f0 8 OBJECT GLOBAL DEFAULT 36 locs@GLIBC_2.2.5
  7156: 00000000003a05f8 8 OBJECT GLOBAL DEFAULT 36 loc1@GLIBC_2.2.5
  7312: 00000000003a0600 8 OBJECT GLOBAL DEFAULT 36 loc2@GLIBC_2.2.5

in libc.so. The versioned loc1, loc2 and locs have the wrong addresses.
After the fix, we got

$ readelf -sW libc.so | grep "loc[12s]"
  6570: 000000000039e3b8 8 OBJECT GLOBAL DEFAULT 34 locs@GLIBC_2.2.5
  7151: 000000000039e3c8 8 OBJECT GLOBAL DEFAULT 34 loc1@GLIBC_2.2.5
  7307: 000000000039e3c0 8 OBJECT GLOBAL DEFAULT 34 loc2@GLIBC_2.2.5

 [BZ #21666]
 * misc/regexp.c (loc1): Add __attribute__ ((nocommon));
 (loc2): Likewise.
 (locs): Likewise.

(cherry picked from commit 388b4f1a02f3a801965028bbfcd48d905638b797)

linaro/2.23/master 2017-07-13 14:36:20 UTC 2017-07-13
Ignore and remove LD_HWCAP_MASK for AT_SECURE programs (bug #21209)

Author: Siddhesh Poyarekar
Author Date: 2017-03-07 15:22:04 UTC

Ignore and remove LD_HWCAP_MASK for AT_SECURE programs (bug #21209)

The LD_HWCAP_MASK environment variable may alter the selection of
function variants for some architectures. For an AT_SECURE process, this
means that if an outdated routine has a bug that would otherwise not
affect newer platforms by default, LD_HWCAP_MASK will allow that bug
to be exploited.

To be on the safe side, ignore and disable LD_HWCAP_MASK for setuid
binaries.

 [BZ #21209]
 * elf/rtld.c (process_envvars): Ignore LD_HWCAP_MASK for
 AT_SECURE processes.
 * sysdeps/generic/unsecvars.h: Add LD_HWCAP_MASK.

(cherry picked from commit 1c1243b6fc33c029488add276e56570a07803bfd)

hjl/pr12189 2017-07-10 21:14:27 UTC 2017-07-10
Replace int with bool in __fortify_fail_abort

Author: H.J. Lu
Author Date: 2017-07-10 21:14:27 UTC

Replace int with bool in __fortify_fail_abort

 * debug/fortify_fail.c (__fortify_fail_abort): Replace int with
 bool.
 (__fortify_fail): Pass false to __fortify_fail_abort.
 * debug/stack_chk_fail.c (__stack_chk_fail): Pass true to
 __fortify_fail_abort.
 * include/stdio.h: Include <stdbool.h>.
 (__fortify_fail_abort): Replace int with bool.

zack/build-many-improvements 2017-07-09 14:04:55 UTC 2017-07-09
build-many-glibcs: Impose a memory limit on build processes.

Author: Zack Weinberg
Author Date: 2017-07-07 14:45:37 UTC

build-many-glibcs: Impose a memory limit on build processes.

There are sometimes bugs in the compiler
(e.g. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78460) that cause
it to consume all available memory. To limit the impact of this on
automated test robots, impose memory limits on all subprocesses in
build-many-glibcs.py. When the bug hits, the compiler will still run
out of memory and crash, but that should not affect any other
simultaneous task.

The limit can be configured with the --memory-limit command line switch.
The default is 1.5 gigabytes or physical RAM divided by the number of
jobs to run in parallel, whichever is larger. (Empirically, 1.5 gigs
per process is enough for everything but the files affected by GCC
bug 78460, but 1 gig per process is insufficient for some of the math
tests and also for the "genautomata" step when building compilers for
powerpc64.)

Rather than continue to lengthen the argument list of the Context
constructor, it now takes the entire 'opts' object as its sole argument.

 * scripts/build-many-glibcs.py (total_ram): New function.
 (Context.set_memory_limits): New function.
 (Context.run_builds): Call set_memory_limits immediately before
 do_build.
 (get_parser): Add --memory-limit command-line switch.
 (Context.__init__): Take 'opts' object as sole argument.
 Add 'memory_limit' attribute to self. Make topdir absolute here.
 (main): Update to match.
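
As a rough illustration of the default described in this commit message, the rule "1.5 gigabytes or physical RAM divided by the number of parallel jobs, whichever is larger" can be sketched as follows. The helper names, the reading of "gigabyte" as 2**30 bytes, and the choice of RLIMIT_AS are assumptions made for the example; they are not taken from build-many-glibcs.py.

```python
import resource

GIB = 1024 ** 3  # assumption: "gigabyte" is read as 2**30 bytes


def default_memory_limit(total_ram_bytes, jobs, floor_bytes=int(1.5 * GIB)):
    """Return the larger of the floor and an equal per-job share of RAM."""
    if jobs < 1:
        raise ValueError("jobs must be at least 1")
    return max(floor_bytes, total_ram_bytes // jobs)


def apply_memory_limit(limit_bytes):
    """Cap the address space of this process (hypothetical helper).

    Limits set with setrlimit are inherited by child processes, so a
    compiler started afterwards runs under the same cap.
    """
    resource.setrlimit(resource.RLIMIT_AS, (limit_bytes, limit_bytes))
```

Under these assumptions, 16 GiB of RAM shared by 4 jobs gives 4 GiB per process, while 8 GiB shared by 16 jobs falls back to the 1.5 GiB floor.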

zack/jda-hppa-patch 2017-07-09 14:03:38 UTC 2017-07-09
JDA's patches for HPPA

Author: Zack Weinberg
Author Date: 2017-06-01 12:50:10 UTC

JDA's patches for HPPA

hjl/pr21609/master 2017-07-05 13:18:48 UTC 2017-07-05
x86-64: Redirect __tls_get_addr to ___tls_get_addr

Author: H.J. Lu
Author Date: 2017-07-05 11:46:45 UTC

x86-64: Redirect __tls_get_addr to ___tls_get_addr

On x86-64, __tls_get_addr realigns stack to support GCC older than GCC
4.9.4:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58066

___tls_get_addr is the alternative x86-64 runtime interface to TLS with
aligned stack. If linker treats ___tls_get_addr just like __tls_get_addr
and the weakref assembly directive is extended to redirect __tls_get_addr
to ___tls_get_addr, we can replace reference to __tls_get_addr with
___tls_get_addr if GCC aligns stack for __tls_get_addr, but doesn't
support ___tls_get_addr.

 * sysdeps/x86_64/Makefile (+cflags): New. Add
 -include $(..)sysdeps/x86_64/tls_get_addr.h if __tls_get_addr
 can be redirected to ___tls_get_addr.
 * sysdeps/x86_64/configure.ac: Check if GCC supports
 ___tls_get_addr. If not, check if __tls_get_addr to
 ___tls_get_addr redirection works.
 * sysdeps/x86_64/configure: Regenerated.
 * sysdeps/x86_64/tls_get_addr.h: New file.

hjl/pr21120/2.24 2017-06-30 16:53:31 UTC 2017-06-30
i386: Increase MALLOC_ALIGNMENT to 16 [BZ #21120]

Author: H.J. Lu
Author Date: 2017-06-30 16:11:08 UTC

i386: Increase MALLOC_ALIGNMENT to 16 [BZ #21120]

GCC 7 changed the definition of max_align_t on i386:

https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=9b5c49ef97e63cc63f1ffa13baf771368105ebe2

As a result, glibc malloc no longer returns memory blocks which are as
aligned as max_align_t requires.

This causes malloc/tst-malloc-thread-fail to fail with an error like this
one:

error: allocation function 0, size 144 not aligned to 16

This patch moves the MALLOC_ALIGNMENT definition to <malloc-alignment.h>
and increases the malloc alignment to 16 for i386.

 [BZ #21120]
 * malloc/malloc.c (MALLOC_ALIGNMENT): Moved to ...
 * sysdeps/generic/malloc-alignment.h: Here. New file.
 * sysdeps/i386/malloc-alignment.h: Likewise.
 * sysdeps/generic/malloc-machine.h: Include <malloc-alignment.h>.

(cherry picked from commit 4e61a6be446026c327aa70cef221c9082bf0085d)
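
The kind of check behind "not aligned to 16" can be sketched in a few lines. This is only an illustration, not the code of malloc/tst-malloc-thread-fail; both helper names are hypothetical, and the second helper assumes a platform where ctypes.CDLL(None) exposes the C library's malloc and free, as on typical Linux systems.

```python
import ctypes


def is_aligned(address, alignment):
    """Return True when address is a multiple of alignment."""
    return address % alignment == 0


def malloc_is_aligned(size, alignment):
    """Allocate size bytes with the C library's malloc and test the result.

    Assumes ctypes.CDLL(None) gives access to malloc and free.
    """
    libc = ctypes.CDLL(None)
    libc.malloc.restype = ctypes.c_void_p
    libc.malloc.argtypes = [ctypes.c_size_t]
    libc.free.restype = None
    libc.free.argtypes = [ctypes.c_void_p]
    pointer = libc.malloc(size)
    if pointer is None:
        raise MemoryError("malloc returned NULL")
    try:
        return is_aligned(pointer, alignment)
    finally:
        libc.free(pointer)
```

Calling malloc_is_aligned(144, 16) mirrors the size and alignment quoted in the error message; the result depends on the C library in use.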

hjl/pr21120/2.25 2017-06-30 16:19:29 UTC 2017-06-30
i386: Increase MALLOC_ALIGNMENT to 16 [BZ #21120]

Author: H.J. Lu
Author Date: 2017-06-30 16:11:08 UTC

i386: Increase MALLOC_ALIGNMENT to 16 [BZ #21120]

GCC 7 changed the definition of max_align_t on i386:

https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=9b5c49ef97e63cc63f1ffa13baf771368105ebe2

As a result, glibc malloc no longer returns memory blocks which are as
aligned as max_align_t requires.

This causes malloc/tst-malloc-thread-fail to fail with an error like this
one:

error: allocation function 0, size 144 not aligned to 16

This patch moves the MALLOC_ALIGNMENT definition to <malloc-alignment.h>
and increases the malloc alignment to 16 for i386.

 [BZ #21120]
 * malloc/malloc-internal.h (MALLOC_ALIGNMENT): Moved to ...
 * sysdeps/generic/malloc-alignment.h: Here. New file.
 * sysdeps/i386/malloc-alignment.h: Likewise.
 * sysdeps/generic/malloc-machine.h: Include <malloc-alignment.h>.

(cherry picked from commit 4e61a6be446026c327aa70cef221c9082bf0085d)

hjl/pr21120/master 2017-06-30 14:43:58 UTC 2017-06-30
i386: Increase MALLOC_ALIGNMENT to 16 [BZ #21120]

Author: H.J. Lu
Author Date: 2017-06-29 17:26:04 UTC

i386: Increase MALLOC_ALIGNMENT to 16 [BZ #21120]

GCC 7 changed the definition of max_align_t on i386:

https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=9b5c49ef97e63cc63f1ffa13baf771368105ebe2

As a result, glibc malloc no longer returns memory blocks which are as
aligned as max_align_t requires.

This causes malloc/tst-malloc-thread-fail to fail with an error like this
one:

error: allocation function 0, size 144 not aligned to 16

This patch moves the MALLOC_ALIGNMENT definition to <malloc-alignment.h>
and increases the malloc alignment to 16 for i386.

 [BZ #21120]
 * malloc/malloc-internal.h (MALLOC_ALIGNMENT): Moved to ...
 * sysdeps/generic/malloc-alignment.h: Here. New file.
 * sysdeps/i386/malloc-alignment.h: Likewise.
 * sysdeps/generic/malloc-machine.h: Include <malloc-alignment.h>.

hjl/pr14995 2017-06-28 18:13:24 UTC 2017-06-28
Check linker support for INSERT in linker script

Author: H.J. Lu
Author Date: 2017-06-26 19:46:08 UTC

Check linker support for INSERT in linker script

Since gold doesn't support INSERT in linker script:

https://sourceware.org/bugzilla/show_bug.cgi?id=15373

tst-split-dynreloc fails to link with gold. Check if linker supports
INSERT in linker script before using it.

 * config.make.in (have-insert): New.
 * configure.ac (libc_cv_insert): New. Set to yes if linker
 supports INSERT in linker script.
 (AC_SUBST(libc_cv_insert)): New.
 * configure: Regenerated.
 * sysdeps/x86_64/Makefile (tests): Add tst-split-dynreloc only
 if $(have-insert) == yes.

hjl/pr21666/master 2017-06-23 18:29:38 UTC 2017-06-23
x86-64: Optimize L(between_2_3) in memcmp-avx2-movbe.S

Author: H.J. Lu
Author Date: 2017-06-23 18:29:38 UTC

x86-64: Optimize L(between_2_3) in memcmp-avx2-movbe.S

Turn

 movzbl -1(%rdi, %rdx), %edi
 movzbl -1(%rsi, %rdx), %esi
 orl %edi, %eax
 orl %esi, %ecx

into

 movb -1(%rdi, %rdx), %al
 movb -1(%rsi, %rdx), %cl

 * sysdeps/x86_64/multiarch/memcmp-avx2-movbe.S (between_2_3):
 Replace movzbl and orl with movb.
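
Why the two sequences can be interchanged may be easier to see in a small model of the 32-bit registers. The sketch assumes that the low byte of %eax and %ecx is already zero when this code runs, which the commit message does not state; the function names are hypothetical.

```python
def merge_with_or(reg, byte):
    """Model movzbl + orl: zero-extend the byte, then OR it into reg."""
    return (reg | (byte & 0xFF)) & 0xFFFFFFFF


def merge_with_movb(reg, byte):
    """Model movb into %al or %cl: overwrite only the low 8 bits of reg."""
    return (reg & 0xFFFFFF00) | (byte & 0xFF)
```

When the low byte of the register is zero, both functions return the same value. When it is not, they differ, so the replacement is only valid under that assumption.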

zack/errno-prettyprint 2017-06-22 22:28:45 UTC 2017-06-22
Add pretty-printer for errno.

Author: Zack Weinberg
Author Date: 2017-06-22 22:28:45 UTC

Add pretty-printer for errno.

This patch adds the actual pretty-printer for errno. I could have
used Python's built-in errno module to get the symbolic names for the
constants, but it seemed better to do something entirely under our
control, so there's a .pysym file generated from errnos.texi, with a
hook that allows the Hurd to add additional constants. Then a .py
module is generated from that plus errno.h in the usual manner; many
thanks to the authors of the .pysym mechanism.
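
For comparison, the alternative that was considered and rejected here, looking names up in Python's built-in errno module, would look roughly like this. It is a sketch of the approach not taken, not of errno-printer.py, and errno_name is a hypothetical helper.

```python
import errno


def errno_name(value):
    """Return the symbolic name for an errno value, or the number as text."""
    return errno.errorcode.get(value, str(value))
```

The table in errno.errorcode reflects the platform Python was built for and maps each number to a single name, which is one reason a list generated from errnos.texi gives the project more control.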

There is also a test which verifies that the .py file (not the .pysym
file) covers all of the constants defined in errno.h.

hurd-add-errno-constants.awk has been manually tested, but the
makefile logic that runs it has not been tested.

 * stdlib/errno-printer.py: New pretty-printer.
 * stdlib/test-errno-constants.py: New special test.
 * stdlib/test-errno-printer.c, stdlib/test-errno-printer.py:
 New pretty-printer test.
 * stdlib/make-errno-constants.awk: New script to generate the
 .pysym file needed by errno-printer.py.
 * stdlib/Makefile: Install, run, and test all of the above, as
 appropriate.

 * sysdeps/mach/hurd/hurd-add-errno-constants.awk: New script to
 add Mach/Hurd-specific errno constants to the .pysym file used by
 stdlib/errno-printer.py.
 * sysdeps/mach/hurd/Makefile: Hook hurd-add-errno-constants.awk
 into the generation of that .pysym file.

hjl/memcmp/avx2 2017-06-21 17:16:07 UTC 2017-06-21
Add do_test2

Author: H.J. Lu
Author Date: 2017-06-21 17:16:07 UTC

Add do_test2

ibm/2.24/master 2017-06-21 14:18:55 UTC 2017-06-21
Merge branch 'release/2.24/master' into ibm/2.24/master

Author: Tulio Magno Quites Machado Filho
Author Date: 2017-06-21 14:18:55 UTC

Merge branch 'release/2.24/master' into ibm/2.24/master

hjl/tunables/master 2017-06-21 13:09:43 UTC 2017-06-21
x86: Rename glibc.tune.ifunc to glibc.tune.hwcaps

Author: H.J. Lu
Author Date: 2017-06-21 12:38:03 UTC

x86: Rename glibc.tune.ifunc to glibc.tune.hwcaps

Rename glibc.tune.ifunc to glibc.tune.hwcaps and move it to
sysdeps/x86/dl-tunables.list since it is x86 specific. Also
change the type of data_cache_size, shared_cache_size and
non_temporal_threshold to unsigned long int to match size_t.
Remove the use of DEFAULT_STRLEN from cpu-tunables.c.

 * elf/dl-tunables.list (glibc.tune.ifunc): Removed.
 * sysdeps/x86/dl-tunables.list (glibc.tune.hwcaps): New.
 Remove security_level on all fields.
 * manual/tunables.texi: Replace ifunc with hwcaps.
 * sysdeps/x86/cpu-features.c (TUNABLE_CALLBACK (set_ifunc)):
 Renamed to ...
 (TUNABLE_CALLBACK (set_hwcaps)): This.
 (init_cpu_features): Updated.
 * sysdeps/x86/cpu-features.h (cpu_features): Change type of
 data_cache_size, shared_cache_size and non_temporal_threshold to
 unsigned long int.
 * sysdeps/x86/cpu-tunables.c (DEFAULT_STRLEN): Removed.
 (TUNABLE_CALLBACK (set_ifunc)): Renamed to ...
 (TUNABLE_CALLBACK (set_hwcaps)): This. Update comments. Don't
 use DEFAULT_STRLEN.

zack/build-layout-experiment 2017-06-08 19:39:03 UTC 2017-06-08
Prepare for radical source tree reorganization.

Author: Zack Weinberg
Author Date: 2017-06-08 19:39:03 UTC

Prepare for radical source tree reorganization.

All top-level files and directories are moved into a temporary storage
directory, REORG.TODO, except for files that will certainly still
exist in their current form at top level when we're done (COPYING,
COPYING.LIB, LICENSES, NEWS, README), all old ChangeLog files (which
are moved to the new directory OldChangeLogs, instead), and the
generated file INSTALL (which is just deleted; in the new order, there
will be no generated files checked into version control).

hjl/avx2/master 2017-06-08 12:07:18 UTC 2017-06-08
x86-64: Optimize strrchr/wcsrchr with AVX2

Author: H.J. Lu
Author Date: 2017-05-26 19:21:55 UTC

x86-64: Optimize strrchr/wcsrchr with AVX2

Optimize strrchr/wcsrchr with AVX2 to check 32 bytes with vector
instructions. It is as fast as SSE2 version for small data sizes
and up to 1X faster for large data sizes on Haswell. Select AVX2
version on AVX2 machines where vzeroupper is preferred and AVX
unaligned load is fast.

 * sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Add
 strrchr-sse2, strrchr-avx2, wcsrchr-sse2 and wcsrchr-avx2.
 * sysdeps/x86_64/multiarch/ifunc-impl-list.c
 (__libc_ifunc_impl_list): Add tests for __strrchr_avx2,
 __strrchr_sse2, __wcsrchr_avx2 and __wcsrchr_sse2.
 * sysdeps/x86_64/multiarch/strrchr-avx2.S: New file.
 * sysdeps/x86_64/multiarch/strrchr-sse2.S: Likewise.
 * sysdeps/x86_64/multiarch/strrchr.c: Likewise.
 * sysdeps/x86_64/multiarch/wcsrchr-avx2.S: Likewise.
 * sysdeps/x86_64/multiarch/wcsrchr-sse2.S: Likewise.
 * sysdeps/x86_64/multiarch/wcsrchr.c: Likewise.

hjl/avx2/fix 2017-06-06 12:31:48 UTC 2017-06-06
x86-64: Move wcsnlen.S to multiarch/wcsnlen-sse4_1.S

Author: H.J. Lu
Author Date: 2017-06-06 12:31:48 UTC

x86-64: Move wcsnlen.S to multiarch/wcsnlen-sse4_1.S

Since wcsnlen.S uses pminud, which is part of SSE4.1, move wcsnlen.S
to multiarch/wcsnlen-sse4_1.S.

 * sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Add
 wcsnlen-sse4_1 and wcsnlen-c.
 * sysdeps/x86_64/multiarch/ifunc-impl-list.c
 (__libc_ifunc_impl_list): Test __wcsnlen_sse4_1 and
 __wcsnlen_sse2.
 * sysdeps/x86_64/multiarch/ifunc-sse4_1.h: New file.
 * sysdeps/x86_64/multiarch/wcsnlen-c.c: Likewise.
 * sysdeps/x86_64/multiarch/wcsnlen-sse4_1.S: Likewise.
 * sysdeps/x86_64/multiarch/wcsnlen.c: Likewise.
 * sysdeps/x86_64/wcsnlen.S: Removed.

hjl/avx2/c 2017-06-05 22:09:59 UTC 2017-06-05
x86-64: Optimize strrchr/wcsrchr with AVX2

Author: H.J. Lu
Author Date: 2017-05-26 19:21:55 UTC

x86-64: Optimize strrchr/wcsrchr with AVX2

Optimize strrchr/wcsrchr with AVX2 to check 32 bytes with vector
instructions. It is as fast as SSE2 version for small data sizes
and up to 1X faster for large data sizes on Haswell. Select AVX2
version on AVX2 machines where vzeroupper is preferred and AVX
unaligned load is fast.

 * sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Add
 strrchr-sse2, strrchr-avx2, wcsrchr-sse2 and wcsrchr-avx2.
 * sysdeps/x86_64/multiarch/ifunc-impl-list.c
 (__libc_ifunc_impl_list): Add tests for __strrchr_avx2,
 __strrchr_sse2, __wcsrchr_avx2 and __wcsrchr_sse2.
 * sysdeps/x86_64/multiarch/strrchr-avx2.S: New file.
 * sysdeps/x86_64/multiarch/strrchr-sse2.S: Likewise.
 * sysdeps/x86_64/multiarch/strrchr.c: Likewise.
 * sysdeps/x86_64/multiarch/wcsrchr-avx2.S: Likewise.
 * sysdeps/x86_64/multiarch/wcsrchr-sse2.S: Likewise.
 * sysdeps/x86_64/multiarch/wcsrchr.c: Likewise.

zack/build-experiments 2017-06-01 12:47:44 UTC 2017-06-01
Experimenting with alternatives to VPATH.

Author: Zack Weinberg
Author Date: 2017-06-01 12:47:44 UTC

Experimenting with alternatives to VPATH.

hjl/master 2017-05-29 13:47:22 UTC 2017-05-29
x86: Update __x86_shared_non_temporal_threshold

Author: H.J. Lu
Author Date: 2017-05-12 20:38:04 UTC

x86: Update __x86_shared_non_temporal_threshold

__x86_shared_non_temporal_threshold was set to 6 times the per-core
shared cache size, based on the large memcpy micro benchmark in glibc
on an 8-core processor. For a processor with more than 8 cores, the
threshold is too low. Set __x86_shared_non_temporal_threshold to
3/4 of the total shared cache size so that it is unchanged on 8-core
processors. On processors with fewer than 8 cores, the threshold is
lower.

 * sysdeps/x86/cacheinfo.c (__x86_shared_non_temporal_threshold):
 Set to the 3/4 of the total shared cache size.
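
The arithmetic behind "unchanged on 8-core processors" can be checked with a short sketch. The function names are hypothetical, and the model assumes the shared cache is split evenly across cores.

```python
MIB = 1024 * 1024


def old_threshold(total_shared, cores):
    """Former rule: six times the per-core share of the shared cache."""
    return 6 * (total_shared // cores)


def new_threshold(total_shared):
    """New rule: three quarters of the total shared cache."""
    return total_shared * 3 // 4
```

With 8 cores the per-core share is 1/8 of the total, and 6 × 1/8 = 3/4, so both rules agree. With 16 cores and 32 MiB of shared cache the old rule gives 12 MiB and the new one 24 MiB; with 4 cores and 8 MiB the old rule gives 12 MiB and the new one 6 MiB.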

tuliom/float128 2017-05-25 19:14:47 UTC 2017-05-25
powerpc64le: Enable float128

Author: Paul E. Murphy
Author Date: 2016-08-09 21:48:54 UTC

powerpc64le: Enable float128

Add ulps for the float128 type, bits/floatn.h, and float128-abi.h.

Likewise, sqrt is not implemented in libgcc. The sfp-machine.h
header is taken from libgcc, and used to build a P7/P8 soft-fp
sqrtf128.

 * sysdeps/powerpc/fpu/libm-test-ulps: Regenerated.
 * sysdeps/powerpc/fpu/math_private.h
 (__ieee754_sqrtf128): New inline override.
 * sysdeps/powerpc/powerpc64le/Implies-before: New file.
 * sysdeps/powerpc/powerpc64le/Makefile: New file.
 * sysdeps/powerpc/powerpc64le/bits/floatn.h: New file.
 * sysdeps/powerpc/powerpc64le/fpu/e_sqrtf128.c: New file.
 * sysdeps/powerpc/powerpc64le/fpu/sfp-machine.h: New file.
 * sysdeps/powerpc/powerpc64le/power9/fpu/e_sqrtf128.c: New file.

 * sysdeps/unix/sysv/linux/powerpc/powerpc64/libc-le.abilist:
 Regenerated.
 * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist:
 Likewise.

 * sysdeps/unix/sysv/linux/powerpc/powerpc64le/float128-abi.h:
 New file.

hjl/cacheinfo/master 2017-05-24 13:35:25 UTC 2017-05-24
x86: Add cache info to cpu_features

Author: H.J. Lu
Author Date: 2017-05-24 03:22:13 UTC

x86: Add cache info to cpu_features

This patch adds cache info to cpu_features to support tunables for both
cache info as well as CPU features in a single x86 namespace. Since
init_cacheinfo is in libc.so and cpu_features is in ld.so, cache info
and CPU features must be in a place for tunables.

 * sysdeps/x86/cacheinfo.c (init_cacheinfo): Use data_size,
 shared_size and non_temporal_threshold from cpu_features if
 they aren't zero.
 * sysdeps/x86/cpu-features.h (cache_info): New.
 (cpu_features): Add cache.

aaribaud/y2038-2.23 2017-05-24 08:29:39 UTC 2017-05-24
Add __fstatat64_t64, __fxstatat_t64

Author: Albert ARIBAUD (3ADEV)
Author Date: 2017-05-24 08:27:17 UTC

Add __fstatat64_t64, __fxstatat_t64

hjl/x86/optimize 2017-05-22 20:15:35 UTC 2017-05-22
Add x86_cache.non_temporal_threshold to GLIBC_TUNABLES

Author: H.J. Lu
Author Date: 2017-05-22 19:00:43 UTC

Add x86_cache.non_temporal_threshold to GLIBC_TUNABLES

Add support for "glibc.x86_cache.non_temporal_threshold=number" to
GLIBC_TUNABLES.

 * elf/dl-tunables.list (x86_cache): New name space.
 * sysdeps/x86/cacheinfo.c [HAVE_TUNABLES] (TUNABLE_NAMESPACE):
 New.
 [HAVE_TUNABLES]: Include <elf/dl-tunables.h>.
 [HAVE_TUNABLES] (DL_TUNABLE_CALLBACK (set_non_temporal_threshold)):
 New.
 [HAVE_TUNABLES] (init_cacheinfo): Call TUNABLE_SET_VAL_WITH_CALLBACK
 with set_non_temporal_threshold.

dj/malloc-tcache 2017-05-11 21:27:35 UTC 2017-05-11
Tweak Makefile, asserts, comments.

Author: DJ Delorie
Author Date: 2017-05-11 21:25:10 UTC

Tweak Makefile, asserts, comments.

* Un-Wundef-ify -DUSE_TCACHE
* More asserts in tcache get/put functions
* Clarify redundancy in tcache structure

linaro/2.21/master 2017-04-21 13:07:56 UTC 2017-04-21
Make io/ftwtest-sh remove temporary files on early exit.

Author: Joseph Myers
Author Date: 2015-10-21 21:18:21 UTC

Make io/ftwtest-sh remove temporary files on early exit.

The test io/ftwtest-sh creates a directory that at some points during
the test does not have execute permission. To avoid leaving behind
such a directory that prevents the build directory from being removed
with a simple "rm -rf", it traps various signals to make the directory
executable and remove it before exit. However, this doesn't cover the
case where one of the tests simply fails (which happens with cross
testing if testing on a remote system where the path to the build
directory involves a symlink, or if that remote system fell over
during testing - I think the latter is the case where the directory is
left behind with bad permissions).

This patch makes that test also trap signal 0 (exit) so that the
directory gets properly removed in such failure cases as well.

Tested in both configurations where the test passes and where it fails
to verify that the result of the test is unchanged but the directory
is no longer left behind where it was previously left behind.

 * io/ftwtest-sh: Also trap on exit to remove temporary files.

hjl/pr21258/2.23 2017-04-20 14:55:44 UTC 2017-04-20
x86-64: Improve branch predication in _dl_runtime_resolve_avx512_opt [BZ #21258]

Author: H.J. Lu
Author Date: 2017-03-21 17:59:31 UTC

x86-64: Improve branch predication in _dl_runtime_resolve_avx512_opt [BZ #21258]

On Skylake server, _dl_runtime_resolve_avx512_opt is used to preserve
the first 8 vector registers. The code layout is

  if only %xmm0 - %xmm7 registers are used
     preserve %xmm0 - %xmm7 registers
  if only %ymm0 - %ymm7 registers are used
     preserve %ymm0 - %ymm7 registers
  preserve %zmm0 - %zmm7 registers

Branch prediction always executes the fallthrough code path to preserve
%zmm0 - %zmm7 registers speculatively, even though only %xmm0 - %xmm7
registers are used. This leads to lower CPU frequency on Skylake
server. This patch changes the fallthrough code path to preserve
%xmm0 - %xmm7 registers instead:

  if whole %zmm0 - %zmm7 registers are used
    preserve %zmm0 - %zmm7 registers
  if only %ymm0 - %ymm7 registers are used
     preserve %ymm0 - %ymm7 registers
  preserve %xmm0 - %xmm7 registers

Tested on Skylake server.

 [BZ #21258]
 * sysdeps/x86_64/dl-trampoline.S (_dl_runtime_resolve_opt):
 Define only if _dl_runtime_resolve is defined to
 _dl_runtime_resolve_sse_vex.
 * sysdeps/x86_64/dl-trampoline.h (_dl_runtime_resolve_opt):
 Fallthrough to _dl_runtime_resolve_sse_vex.

(cherry picked from commit c15f8eb50cea7ad1a4ccece6e0982bf426d52c00)

fw/accept4 2017-04-14 08:30:01 UTC 2017-04-14
Assume that accept4 is always available and works

Author: Florian Weimer
Author Date: 2017-04-14 08:30:01 UTC

Assume that accept4 is always available and works

Simplify the Linux accept4 implementation based on the assumption
that it is available in some way. __ASSUME_ACCEPT4_SOCKETCALL was
previously unused, so remove it. Its functionality is implied
by the complex #if condition in accept4.c.

hjl/hwcap/master 2017-04-07 15:09:47 UTC 2017-04-07
x86: Set dl_hwcap from CPU features

Author: H.J. Lu
Author Date: 2017-04-05 22:24:30 UTC

x86: Set dl_hwcap from CPU features

On x86, the usage of AT_HWCAP in glibc is obsolete since addition of
dl_x86_cpu_features. dl_hwcap, which was set from AT_HWCAP, is used by
dynamic linker to build an array of hardware capability names, which are
added to search path when loading shared object. dl_hwcap was unused on
x86-64 and only SSE2 was used on i386.

This patch sets dl_hwcap with new hardware capabilities from CPU
features. Currently, 2 capabilities, SSE2 and AVX2, are supported.
The maximum number of hardware capabilities is 64. Since x86-64
includes SSE2, SSE2 is skipped on x86-64. dl_x86_cap_flags is kept
for i386 and is used by _dl_show_auxv. dl_x86_hwcap_flags is added
for new hardware capabilities.

 * sysdeps/i386/dl-hwcap.h: New file.
 * sysdeps/x86/dl-hwcap.h: Likewise.
 * sysdeps/x86_64/dl-hwcap.h: Likewise.
 * sysdeps/x86_64/dl-procinfo.h: Likewise.
 * sysdeps/i386/dl-procinfo.c (_dl_x86_hwcap_flags): New.
 * sysdeps/i386/dl-procinfo.h: Include <dl-hwcap.h>.
 (_DL_HWCAP_COUNT): Removed.
 (HWCAP_I386_XXX): Likewise.
 (HWCAP_IMPORTANT): Likewise.
 (_dl_procinfo): Likewise.
 (_dl_hwcap_string): Likewise.
 (_dl_string_hwcap): Likewise.
 * sysdeps/unix/sysv/linux/i386/dl-procinfo.h (_dl_procinfo):
 Replace _DL_HWCAP_COUNT with 32.
 * sysdeps/unix/sysv/linux/x86_64/dl-procinfo.h [!IS_IN (ldconfig)]:
 Include <sysdeps/x86_64/dl-procinfo.h>.
 * sysdeps/x86/cpu-features.c: Include <dl-hwcap.h>.
 (init_cpu_features): Set dl_hwcap and dl_hwcap_mask.
 * sysdeps/x86_64/dl-procinfo.c (_dl_x86_hwcap_flags): New.

gentoo/2.25 2017-03-20 14:57:14 UTC 2017-03-20
posix_spawn: fix stack setup on ia64 [BZ #21275]

Author: Mike Frysinger
Author Date: 2017-03-20 08:47:56 UTC

posix_spawn: fix stack setup on ia64 [BZ #21275]

The ia64-specific clone2 call expects the base of the stack mapping and
the stack size as separate arguments, not an initial stack value as on other
stack-grows-down architectures. Reuse the stack-grows-up macro so we
pass in the right stack base.

Reported-by: Matt Turner <mattst88@gentoo.org>
(cherry picked from commit ddc3fb333469c2997798742dc0509dc1e3201d91)
(cherry picked from commit 27ab0d9518746dfb59ed2ba59daefc981dc10e38)

gentoo/2.24 2017-03-20 14:56:52 UTC 2017-03-20
posix_spawn: fix stack setup on ia64 [BZ #21275]

Author: Mike Frysinger
Author Date: 2017-03-20 08:47:56 UTC

posix_spawn: fix stack setup on ia64 [BZ #21275]

The ia64-specific clone2 call expects the base of the stack mapping and
the stack size as separate arguments, not an initial stack value as on other
stack-grows-down architectures. Reuse the stack-grows-up macro so we
pass in the right stack base.

Reported-by: Matt Turner <mattst88@gentoo.org>
(cherry picked from commit ddc3fb333469c2997798742dc0509dc1e3201d91)
(cherry picked from commit 7043946c7921c0e3850dd2b3d948336624bb0f62)

fw/out_buffer 2017-03-10 19:45:25 UTC 2017-03-10
WIP struct out_buffer

Author: Florian Weimer
Author Date: 2017-03-10 19:45:25 UTC

WIP struct out_buffer

fw/bug16145 2017-03-09 15:34:11 UTC 2017-03-09
WIP reorganization to improve scalability of localtime

Author: Florian Weimer
Author Date: 2017-03-09 15:33:57 UTC

WIP reorganization to improve scalability of localtime

aaribaud/y2038-pre-5th-draft 2017-02-22 07:06:33 UTC 2017-02-22
Add 64-bit version of ctime_r

Author: Albert ARIBAUD (3ADEV)
Author Date: 2017-02-22 07:06:33 UTC

Add 64-bit version of ctime_r

aaribaud/y2038 2017-02-22 07:06:33 UTC 2017-02-22
Add 64-bit version of ctime_r

Author: Albert ARIBAUD (3ADEV)
Author Date: 2017-02-22 07:06:33 UTC

Add 64-bit version of ctime_r

release/2.19/master 2017-02-20 21:04:52 UTC 2017-02-20
Fix powerpc software sqrt (bug 17964).

Author: Joseph Myers
Author Date: 2015-02-12 23:05:37 UTC

Fix powerpc software sqrt (bug 17964).

As Adhemerval noted in
<https://sourceware.org/ml/libc-alpha/2015-01/msg00451.html>, the
powerpc sqrt implementation for when _ARCH_PPCSQ is not defined is
inaccurate in some cases.

The problem is that this code relies on fused multiply-add, and relies
on the compiler contracting a * b + c to get a fused operation. But
sysdeps/ieee754/dbl-64/Makefile disables contraction for e_sqrt.c,
because the implementation in that directory relies on *not* having
contracted operations.

While it would be possible to arrange makefiles so that an earlier
sysdeps directory can disable the setting in
sysdeps/ieee754/dbl-64/Makefile, it seems a lot cleaner to make the
dependence on fused operations explicit in the .c file. GCC 4.6
introduced support for __builtin_fma on powerpc and other
architectures with such instructions, so we can rely on that; this
patch duly makes the code use __builtin_fma for all such fused
operations.

Tested for powerpc32 (hard float).

2015-02-12 Joseph Myers <joseph@codesourcery.com>

 [BZ #17964]
 * sysdeps/powerpc/fpu/e_sqrt.c (__slow_ieee754_sqrt): Use
 __builtin_fma instead of relying on contraction of a * b + c.

(cherry picked from commit e8bd5286c68bc35be3b41e94c15c4387dcb3bec9)

fw/math-split-tests 2017-02-17 07:22:29 UTC 2017-02-17
RFC: Run libm tests separately for each function

Author: Joseph Myers
Author Date: 2017-02-16 23:08:20 UTC

RFC: Run libm tests separately for each function

At present, libm tests for each function get built into a single
executable (for each floating point type, for each of normal / inline
/ finite-math-only functions, plus vector variants) and run together,
resulting in a single PASS or FAIL (for each of those nine variants
plus vector variants). Building this executable involves reading
about 40 MB of libm-test-*.c sources, which would grow to maybe 70 or
80 MB when the complex inverse trig and hyperbolic functions move to
the auto-libm-test-* mechanism (that move is practical now that tests
for all functions don't need regenerating for any change to
auto-libm-test-in but you can instead regenerate each
auto-libm-test-out-* file independently; auto-libm-test-out-casin and
auto-libm-test-out-casinh take about 38 minutes each to generate on my
system after such a move, auto-libm-test-out-cacos and
auto-libm-test-out-cacosh take about 80 minutes each).

This patch arranges for tests of each function to be run separately
from the makefiles instead. There are 121 functions being tested for
each (type, variant) pair (actually 126, but run as 121 from the
Makefile because each of the pairs (exp10, pow10), (isfinite, finite),
(lgamma, gamma), (remainder, drem), (scalbn, ldexp), shares a table of
test results and so is run together), so 1089 separate tests run from
the Makefile, plus 48 vector tests on x86_64 (six functions for eight
vector variants). Each test only involves a libm-test-<func>.c file
of no more than about 4 MB, rather than all such files taking about 40
MB. With tests run separately, test summaries will indicate which
functions actually have problems (of course, those problems may just
be out-of-date libm-test-ulps files if the file hasn't been updated
for the architecture in question recently).
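
The test counts quoted in the paragraph above follow from simple multiplication, sketched here. The three type names are an assumption based on the float, double and ldouble file names in the ChangeLog of this message; the function names are hypothetical.

```python
FUNCTIONS_TESTED = 126   # functions with libm tests
SHARED_TABLE_PAIRS = 5   # (exp10, pow10), (isfinite, finite), and so on
TYPES = ("float", "double", "ldouble")   # assumed from the ChangeLog
KINDS = ("normal", "inline", "finite")


def scalar_test_count():
    """One test per (function or shared pair, type, kind) combination."""
    runs = FUNCTIONS_TESTED - SHARED_TABLE_PAIRS
    return runs * len(TYPES) * len(KINDS)


def vector_test_count(functions=6, variants=8):
    """Vector tests on x86_64: six functions for eight vector variants."""
    return functions * variants
```

This gives 121 × 9 = 1089 scalar tests and 6 × 8 = 48 vector tests, matching the "1089+48" figure used later in the message.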

All the .c files for the 1089+48 tests are generated automatically
from the Makefiles. Various checked-in boilerplate .c files are
removed as no longer needed. CFLAGS definitions for the different
kinds of tests are generated using makefile iterators to apply
target-specific variable settings. libm-have-vector-test.h is no
longer needed; the list of functions to test for each vector type is
now in the sysdeps Makefile.

This should reduce the amount of boilerplate needed for float128
testing support; test-float128.h will still be needed, but not various
.c files or Makefile CFLAGS definitions. The logic for creating
dependencies on libm-test-support-*.o files should also render
<https://sourceware.org/ml/libc-alpha/2017-02/msg00279.html>
unnecessary.

Any comments? Especially regarding the use of iterators; there is
existing precedent (in elf/Makefile) for using o-iterator.mk as a
generic iterator with object-suffixes-left set to something other than
a list of object suffixes, but maybe there should be a differently
named iterator for such generic uses?

2017-02-16 Joseph Myers <joseph@codesourcery.com>

 * math/Makefile (libm-tests-generated): Remove variable.
 (libm-tests-base-normal): New variable.
 (libm-tests-base-finite): Likewise.
 (libm-tests-base-inline): Likewise.
 (libm-tests-base): Likewise.
 (libm-tests-normal): Likewise.
 (libm-tests-finite): Likewise.
 (libm-tests-inline): Likewise.
 (libm-tests-vector): Likewise.
 (libm-tests): Define in terms of these new variables.
 (libm-tests-for-type): New variable.
 (libm-tests.o): Move definition.
 (tests): Move addition of $(libm-tests).
 (generated): Update for new and removed libm test files.
 ($(objpfx)libm-test.c): Remove target.
 ($(objpfx)libm-have-vector-test.h): Likewise.
 (CFLAGS-test-double-vlen2.c): Remove variable.
 (CFLAGS-test-double-vlen4.c): Likewise.
 (CFLAGS-test-double-vlen8.c): Likewise.
 (CFLAGS-test-float-vlen4.c): Likewise.
 (CFLAGS-test-float-vlen8.c): Likewise.
 (CFLAGS-test-float-vlen16.c): Likewise.
 (CFLAGS-test-float.c): Likewise.
 (CFLAGS-test-float-finite.c): Likewise.
 (CFLAGS-libm-test-support-float.c): Likewise.
 (CFLAGS-test-double.c): Likewise.
 (CFLAGS-test-double-finite.c): Likewise.
 (CFLAGS-libm-test-support-double.c): Likewise.
 (CFLAGS-test-ldouble.c): Likewise.
 (CFLAGS-test-ldouble-finite.c): Likewise.
 (CFLAGS-libm-test-support-ldouble.c): Likewise.
 (libm-test-inline-cflags): New variable.
 (CFLAGS-test-ifloat.c): Remove variable.
 (CFLAGS-test-idouble.c): Likewise.
 (CFLAGS-test-ildouble.c): Likewise.
 ($(addprefix $(objpfx), $(libm-tests.o))): Move target and update
 dependencies.
 ($(foreach t,$(libm-tests-normal),$(objpfx)$(t).c)): New rule.
 ($(foreach t,$(libm-tests-finite),$(objpfx)$(t).c)): Likewise.
 ($(foreach t,$(libm-tests-inline),$(objpfx)$(t).c)): Likewise.
 ($(foreach t,$(libm-tests-vector),$(objpfx)$(t).c)): Likewise.
 ($(foreach t,$(types),$(objpfx)libm-test-support-$(t).c)):
 Likewise.
 (dependencies on libm-test-support-*.o): Remove.
 ($(foreach f,$(libm-test-funcs-all),$(objpfx)$(o)-$(f).o)): New
 rules using iterators.
 ($(addprefix $(objpfx),$(call libm-tests-for-type,$(o)))):
 Likewise.
 ($(objpfx)libm-test-support-$(o).o): Likewise.
 ($(addprefix $(objpfx),$(filter-out $(tests-static)
 $(libm-vec-tests),$(tests)))): Filter out $(libm-tests-vector)
 instead.
 ($(addprefix $(objpfx), $(libm-vec-tests))): Use iterator to
 define rule instead.
 * math/README.libm-test: Update.
 * math/libm-test-acos.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-acosh.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-asin.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-asinh.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-atan.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-atan2.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-atanh.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-cabs.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-cacos.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-cacosh.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-canonicalize.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-carg.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-casin.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-casinh.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-catan.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-catanh.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-cbrt.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-ccos.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-ccosh.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-ceil.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-cexp.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-cimag.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-clog.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-clog10.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-conj.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-copysign.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-cos.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-cosh.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-cpow.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-cproj.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-creal.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-csin.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-csinh.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-csqrt.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-ctan.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-ctanh.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-erf.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-erfc.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-exp.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-exp10.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-exp2.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-expm1.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-fabs.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-fdim.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-floor.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-fma.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-fmax.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-fmaxmag.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-fmin.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-fminmag.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-fmod.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-fpclassify.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-frexp.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-fromfp.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-fromfpx.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-getpayload.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-hypot.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-ilogb.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-iscanonical.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-iseqsig.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-isfinite.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-isgreater.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-isgreaterequal.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-isinf.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-isless.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-islessequal.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-islessgreater.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-isnan.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-isnormal.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-issignaling.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-issubnormal.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-isunordered.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-iszero.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-j0.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-j1.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-jn.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-lgamma.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-llogb.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-llrint.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-llround.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-log.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-log10.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-log1p.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-log2.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-logb.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-lrint.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-lround.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-modf.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-nearbyint.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-nextafter.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-nextdown.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-nexttoward.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-nextup.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-pow.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-remainder.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-remquo.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-rint.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-round.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-roundeven.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-scalb.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-scalbln.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-scalbn.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-setpayload.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-setpayloadsig.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-signbit.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-significand.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-sin.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-sincos.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-sinh.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-sqrt.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-tan.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-tanh.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-tgamma.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-totalorder.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-totalordermag.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-trunc.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-ufromfp.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-ufromfpx.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-y0.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-y1.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-yn.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-driver.c: Do not include libm-have-vector-test.h.
 (HAVE_VECTOR): Remove macro.
 (START): Do not call HAVE_VECTOR.
 * math/test-double-vlen2.h (FUNC_TEST): Remove macro.
 * math/test-double-vlen4.h (FUNC_TEST): Remove macro.
 * math/test-double-vlen8.h (FUNC_TEST): Remove macro.
 * math/test-float-vlen16.h (FUNC_TEST): Remove macro.
 * math/test-float-vlen4.h (FUNC_TEST): Remove macro.
 * math/test-float-vlen8.h (FUNC_TEST): Remove macro.
 * math/test-math-vector.h (FUNC_TEST): New macro.
 (WRAPPER_DECL): Rename to WRAPPER_DECL_f.
 * sysdeps/x86_64/fpu/Makefile (double-vlen2-funcs): New variable.
 (double-vlen4-funcs): Likewise.
 (double-vlen4-avx2-funcs): Likewise.
 (double-vlen8-funcs): Likewise.
 (float-vlen4-funcs): Likewise.
 (float-vlen8-funcs): Likewise.
 (float-vlen8-avx2-funcs): Likewise.
 (float-vlen16-funcs): Likewise.
 (CFLAGS-test-double-vlen4-avx2.c): Remove variable.
 (CFLAGS-test-float-vlen8-avx2.c): Likewise.
 * sysdeps/x86_64/fpu/test-double-vlen4.h (TEST_VECTOR_cos): Remove
 macro.
 (TEST_VECTOR_sin): Likewise.
 (TEST_VECTOR_sincos): Likewise.
 (TEST_VECTOR_log): Likewise.
 (TEST_VECTOR_exp): Likewise.
 (TEST_VECTOR_pow): Likewise.
 * sysdeps/x86_64/fpu/test-double-vlen8.h (TEST_VECTOR_cos):
 Likewise.
 (TEST_VECTOR_sin): Likewise.
 (TEST_VECTOR_sincos): Likewise.
 (TEST_VECTOR_log): Likewise.
 (TEST_VECTOR_exp): Likewise.
 (TEST_VECTOR_pow): Likewise.
 * sysdeps/x86_64/fpu/test-float-vlen16.h (TEST_VECTOR_cosf):
 Likewise.
 (TEST_VECTOR_sinf): Likewise.
 (TEST_VECTOR_sincosf): Likewise.
 (TEST_VECTOR_logf): Likewise.
 (TEST_VECTOR_expf): Likewise.
 (TEST_VECTOR_powf): Likewise.
 * sysdeps/x86_64/fpu/test-float-vlen8.h (TEST_VECTOR_cosf):
 Likewise.
 (TEST_VECTOR_sinf): Likewise.
 (TEST_VECTOR_sincosf): Likewise.
 (TEST_VECTOR_logf): Likewise.
 (TEST_VECTOR_expf): Likewise.
 (TEST_VECTOR_powf): Likewise.
 * math/gen-libm-have-vector-test.sh: Remove file.
 * math/libm-test.inc: Likewise.
 * math/libm-test-support-double.c: Likewise.
 * math/libm-test-support-float.c: Likewise.
 * math/libm-test-support-ldouble.c: Likewise.
 * math/test-double-finite.c: Likewise.
 * math/test-double.c: Likewise.
 * math/test-float-finite.c: Likewise.
 * math/test-float.c: Likewise.
 * math/test-idouble.c: Likewise.
 * math/test-ifloat.c: Likewise.
 * math/test-ildouble.c: Likewise.
 * math/test-ldouble-finite.c: Likewise.
 * math/test-ldouble.c: Likewise.
 * sysdeps/x86_64/fpu/test-double-vlen2.c: Likewise.
 * sysdeps/x86_64/fpu/test-double-vlen2.h: Likewise.
 * sysdeps/x86_64/fpu/test-double-vlen4-avx2.c: Likewise.
 * sysdeps/x86_64/fpu/test-double-vlen4.c: Likewise.
 * sysdeps/x86_64/fpu/test-double-vlen8.c: Likewise.
 * sysdeps/x86_64/fpu/test-float-vlen16.c: Likewise.
 * sysdeps/x86_64/fpu/test-float-vlen4.c: Likewise.
 * sysdeps/x86_64/fpu/test-float-vlen4.h: Likewise.
 * sysdeps/x86_64/fpu/test-float-vlen8-avx2.c: Likewise.
 * sysdeps/x86_64/fpu/test-float-vlen8.c: Likewise.

fw/bug21041 2017-01-24 17:32:30 UTC 2017-01-24
WIP delayed IFUNC relocation

Author: Florian Weimer
Author Date: 2017-01-24 17:32:30 UTC

WIP delayed IFUNC relocation

azanella/c11-threads 2017-01-24 17:21:18 UTC 2017-01-24
Add manual documentation for threads.h

Author: Juan Manuel Torres Palma
Author Date: 2016-12-06 22:47:02 UTC

Add manual documentation for threads.h

This patch adds a new chapter to the manual that documents the types,
macros, constants and functions defined by the ISO C11 threads.h
header.

 * manual/Makefile (chapters): Add isothreads.texi.
 * manual/isothreads.texi: New file. Add new chapter for ISO C11
 threads documentation.
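
As a rough illustration of the kind of interface the new chapter covers, the sketch below uses a few standard ISO C11 <threads.h> calls (thrd_create, thrd_join, mtx_init, mtx_lock, mtx_unlock, mtx_destroy). It assumes a C library that ships <threads.h>; the helper name run_workers and the limit of 8 threads are made up for this example and do not come from the patch.

```c
#include <stddef.h>
#include <threads.h>

static mtx_t lock;      /* protects counter */
static int counter;

/* Thread start routine: bump the shared counter *arg times. */
static int worker(void *arg)
{
    int n = *(int *)arg;
    for (int i = 0; i < n; i++) {
        mtx_lock(&lock);
        counter++;
        mtx_unlock(&lock);
    }
    return 0;
}

/* Hypothetical helper: start nthreads workers, wait for all of them,
   and return the final counter value, or -1 on error. */
int run_workers(int nthreads, int increments)
{
    thrd_t tids[8];
    int started = 0;

    if (nthreads < 1 || nthreads > 8)
        return -1;
    if (mtx_init(&lock, mtx_plain) != thrd_success)
        return -1;
    counter = 0;

    for (; started < nthreads; started++)
        if (thrd_create(&tids[started], worker, &increments) != thrd_success)
            break;
    for (int i = 0; i < started; i++)
        thrd_join(tids[i], NULL);

    mtx_destroy(&lock);
    return started == nthreads ? counter : -1;
}
```

Calling run_workers(4, 1000) should return 4000 when all threads start. On some systems the program may need to be linked with -pthread.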

google/grte/v4-2.19/master 2017-01-19 22:01:46 UTC 2017-01-19
Fix nan functions handling of payload strings (BZ16962, CVE-2014-9761)

Author: Joseph Myers
Author Date: 2017-01-19 22:01:46 UTC

Fix nan functions handling of payload strings (BZ16962, CVE-2014-9761)

gentoo/2.23 2016-12-08 05:38:41 UTC 2016-12-08
alpha: fix trunc for big input values

Author: Aurelien Jarno
Author Date: 2016-08-02 07:18:59 UTC

alpha: fix trunc for big input values

The alpha-specific versions of trunc and truncf always add and subtract
0x1.0p23 or 0x1.0p52, even for big values. This causes errors like the
following in the testsuite:

  Failure: Test: trunc_towardzero (0x1p107)
  Result:
   is: 1.6225927682921334e+32 0x1.fffffffffffffp+106
   should be: 1.6225927682921336e+32 0x1.0000000000000p+107
   difference: 1.8014398509481984e+16 0x1.0000000000000p+54
   ulp : 0.5000
   max.ulp : 0.0000

Fix this by returning the input value when its absolute value is
greater than 0x1.0p23 or 0x1.0p52. NaNs still have to go through the add
and subtract operations so that they can be quieted.

Finally, remove the code that handles the inexact exception; trunc
should never generate such an exception.

Changelog:
 * sysdeps/alpha/fpu/s_trunc.c (__trunc): Return the input value
 when its absolute value is greater than 0x1.0p52.
 [_IEEE_FP_INEXACT] Remove.
 * sysdeps/alpha/fpu/s_truncf.c (__truncf): Return the input value
 when its absolute value is greater than 0x1.0p23.
 [_IEEE_FP_INEXACT] Remove.

(cherry picked from commit b74d259fe793499134eb743222cd8dd7c74a31ce)
(cherry picked from commit 3a5aa2ee4ffc515c8e7e615ea38d6b3b20ed0a30)
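
The guard described above can be sketched in portable C. This is not the alpha code from the commit: it is a minimal illustration that assumes the default round-to-nearest mode and adds a correction step, and the name trunc_sketch is made up for this example.

```c
#include <math.h>   /* signbit */

/* Sketch of trunc() for double built on the add-and-subtract trick.
   Assumes round-to-nearest; this is not the alpha implementation. */
double trunc_sketch(double x)
{
    const double two52 = 0x1.0p52;
    double ax = signbit(x) ? -x : x;    /* |x|; a NaN stays a NaN */

    /* Above 2^52 every double is already an integer, and adding 2^52
       could lose low-order bits, so return the input unchanged. */
    if (ax > two52)
        return x;

    /* A NaN fails the comparison above and goes through the arithmetic. */
    volatile double t = ax + two52;     /* volatile forces rounding to double */
    double r = t - two52;               /* ax rounded to an integer */
    if (r > ax)
        r -= 1.0;                       /* rounding went up: step back toward zero */

    return signbit(x) ? -r : r;         /* restore the sign, including -0.0 */
}
```

With the guard in place, trunc_sketch(0x1p107) returns its argument unchanged, which is the case the failing test above exercises.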

dj/malloc 2016-11-10 21:08:28 UTC 2016-11-10
Updates to trace2wl

Author: DJ Delorie
Author Date: 2016-11-10 21:08:28 UTC

Updates to trace2wl

* command line option -p to show progress
* command line option -f to use file-based buffers
* reduced memory footprint
* more 32/64-bit fixes

ibm/2.22/master 2016-10-14 20:01:21 UTC 2016-10-14
Merge branch release/2.22/master into ibm/2.22/master

Author: Tulio Magno Quites Machado Filho
Author Date: 2016-10-14 20:01:21 UTC

Merge branch release/2.22/master into ibm/2.22/master

release/2.22/master 2016-10-14 19:57:32 UTC 2016-10-14
powerpc: Sync hwcap.h with kernel

Author: Carlos Eduardo Seo
Author Date: 2016-10-14 19:57:32 UTC

powerpc: Sync hwcap.h with kernel

Linux commit b4b56f9ecab40f3b4ef53e130c9f6663be491894 introduced
a new HWCAP2 bit to indicate that the kernel now aborts a memory
transaction when a syscall is made. This patch adds that bit to
sysdeps/powerpc/bits/hwcap.h.

 * sysdeps/powerpc/bits/hwcap.h: Add PPC_FEATURE2_HTM_NOSC.
 * sysdeps/powerpc/dl-procinfo.c (_dl_powerpc_cap_flags): Add
 descriptor for this hwcap feature so it shows when LD_SHOW_AUXV=1.

(cherry picked from commit 3c13f28c8eac1e5a883d1b3801314430a094fc99)
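
A program could test for the new bit roughly as follows. This is a sketch, not code from the patch: getauxval and AT_HWCAP2 come from <sys/auxv.h>, the helper names are made up, and the fallback value 0x01000000 for PPC_FEATURE2_HTM_NOSC is an assumption that should be checked against the installed header.

```c
#include <sys/auxv.h>   /* getauxval, AT_HWCAP2 */

/* Fallback for non-powerpc builds; assumed value, verify against
   sysdeps/powerpc/bits/hwcap.h. */
#ifndef PPC_FEATURE2_HTM_NOSC
# define PPC_FEATURE2_HTM_NOSC 0x01000000
#endif

/* Return 1 if the given bit is set in a hwcap word, 0 otherwise. */
int hwcap_has(unsigned long hwcap, unsigned long bit)
{
    return (hwcap & bit) != 0;
}

/* Hypothetical helper: does the kernel report that it aborts a memory
   transaction when a syscall is made?  Only meaningful on powerpc. */
int kernel_aborts_htm_on_syscall(void)
{
    return hwcap_has(getauxval(AT_HWCAP2), PPC_FEATURE2_HTM_NOSC);
}
```

Keeping the bit test in a separate pure function makes it easy to check without depending on the machine the code runs on.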

azanella/aarch64-split-stack 2016-07-28 20:11:06 UTC 2016-07-28
aarch64: Add split-stack TCB field

Author: Adhemerval Zanella
Author Date: 2016-04-25 19:15:49 UTC

aarch64: Add split-stack TCB field

This patch adds a new TCB field meant to be used by GCC split-stack
option. A new symbol, __tcb_private_ss, is also added to version
control the private TCB field.

Checked on aarch64.

 * sysdeps/aarch64/Makefile [$(subdir) == elf] (sysdep-dl-routines):
 Add tcb-version.
 * sysdeps/aarch64/Version [ld] (GLIBC_2.25): Define
 __tcb_private_ss.
 * sysdeps/aarch64/nptl/tls.h (tcbhead_t): Add __private_ss field.
 * sysdeps/unix/sysv/linux/aarch64/ld.abilist: Add GLIBC_2.25 and
 __tcb_private_ss.
 * sysdeps/aarch64/tcb-version.c: New file.

hjl/libmvec/master 2016-07-19 20:20:10 UTC 2016-07-19
Don't compile do_test with -mavx/-mavx2/-mavx512

Author: H.J. Lu
Author Date: 2016-07-19 20:01:36 UTC

Don't compile do_test with -mavx/-mavx2/-mavx512

Don't compile do_test with -mavx, -mavx2 or -mavx512, since the resulting
code won't run on non-AVX machines.

 [BZ #20384]
 * sysdeps/x86_64/fpu/Makefile (extra-test-objs): Add
 test-double-libmvec-sincos-avx-main.o,
 test-double-libmvec-sincos-avx2-main.o,
 test-double-libmvec-sincos-main.o,
 test-float-libmvec-sincosf-avx-main.o,
 test-float-libmvec-sincosf-avx2-main.o and
 test-float-libmvec-sincosf-main.o.
 ($(objpfx)test-double-libmvec-sincos): Also link with
 $(objpfx)test-double-libmvec-sincos-main.o.
 ($(objpfx)test-double-libmvec-sincos-avx): Also link with
 $(objpfx)test-double-libmvec-sincos-avx-main.o.
 ($(objpfx)test-double-libmvec-sincos-avx2): Also link with
 $(objpfx)test-double-libmvec-sincos-avx2-main.o.
 ($(objpfx)test-float-libmvec-sincosf): Also link with
 $(objpfx)test-float-libmvec-sincosf-main.o.
 ($(objpfx)test-float-libmvec-sincosf-avx): Also link with
 $(objpfx)test-float-libmvec-sincosf-avx2-main.o.
 [$(config-cflags-avx512) == yes] (extra-test-objs): Add
 test-double-libmvec-sincos-avx512-main.o and
 test-float-libmvec-sincosf-avx512-main.o.
 ($(objpfx)test-double-libmvec-sincos-avx512): Also link with
 $(objpfx)test-double-libmvec-sincos-avx512-main.o.
 ($(objpfx)test-float-libmvec-sincosf-avx512): Also link with
 $(objpfx)test-float-libmvec-sincosf-avx512-main.o.
 (CFLAGS-test-double-libmvec-sincos.c): Removed.
 (CFLAGS-test-float-libmvec-sincosf.c): Likewise.
 (CFLAGS-test-double-libmvec-sincos-main.c): New.
 (CFLAGS-test-double-libmvec-sincos-avx-main.c): Likewise.
 (CFLAGS-test-double-libmvec-sincos-avx2-main.c): Likewise.
 (CFLAGS-test-float-libmvec-sincosf-main.c): Likewise.
 (CFLAGS-test-float-libmvec-sincosf-avx-main.c): Likewise.
 (CFLAGS-test-float-libmvec-sincosf-avx2-main.c): Likewise.
 (CFLAGS-test-float-libmvec-sincosf-avx512-main.c): Likewise.
 (CFLAGS-test-double-libmvec-sincos-avx.c): Set to -DREQUIRE_AVX.
 (CFLAGS-test-float-libmvec-sincosf-avx.c): Likewise.
 (CFLAGS-test-double-libmvec-sincos-avx2.c): Set to
 -DREQUIRE_AVX2.
 (CFLAGS-test-float-libmvec-sincosf-avx2.c): Likewise.
 (CFLAGS-test-double-libmvec-sincos-avx512.c): Set to
 -DREQUIRE_AVX512F.
 (CFLAGS-test-float-libmvec-sincosf-avx512.c): Likewise.
 * sysdeps/x86_64/fpu/test-double-libmvec-sincos.c: Rewritten.
 * sysdeps/x86_64/fpu/test-float-libmvec-sincosf.c: Likewise.
 * sysdeps/x86_64/fpu/test-double-libmvec-sincos-avx-main.c: New
 file.
 * sysdeps/x86_64/fpu/test-double-libmvec-sincos-avx2-main.c:
 Likewise.
 * sysdeps/x86_64/fpu/test-double-libmvec-sincos-avx512-main.c:
 Likewise.
 * sysdeps/x86_64/fpu/test-double-libmvec-sincos-main.c:
 Likewise.
 * sysdeps/x86_64/fpu/test-float-libmvec-sincosf-avx-main.c:
 Likewise.
 * sysdeps/x86_64/fpu/test-float-libmvec-sincosf-avx2-main.c:
 Likewise.
 * sysdeps/x86_64/fpu/test-float-libmvec-sincosf-avx512-main.c:
 Likewise.
 * sysdeps/x86_64/fpu/test-float-libmvec-sincosf-main.c:
 Likewise.
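
The split between a plainly compiled do_test and a separately compiled test body can be sketched as below. It illustrates the idea only: the names do_test_sketch and avx_test_main are made up, the exit status 77 is assumed to follow the glibc testsuite convention for unsupported tests, and the GCC built-ins __builtin_cpu_init and __builtin_cpu_supports stand in for whatever CPU feature check the real test uses.

```c
#include <stdio.h>

#define EXIT_UNSUPPORTED 77   /* assumed "test not applicable" status */

/* Stand-in for a test body that would live in its own file and be
   compiled with the vector ISA flags. */
static int avx_test_main(void)
{
    return 0;
}

/* Runtime AVX check; this function itself needs no -mavx. */
int cpu_has_avx(void)
{
#if defined(__x86_64__) || defined(__i386__)
    __builtin_cpu_init();
    return __builtin_cpu_supports("avx") != 0;
#else
    return 0;
#endif
}

/* Built without vector flags, so it is safe to run on any machine:
   it only calls into the vector code after the feature check passes. */
int do_test_sketch(int have_avx, int (*test_main)(void))
{
    if (!have_avx) {
        puts("AVX not available, skipping");
        return EXIT_UNSUPPORTED;
    }
    return test_main();
}
```

A caller would typically write do_test_sketch(cpu_has_avx(), avx_test_main), so the AVX code is reached only after the check has passed.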

ibm/2.19/master 2016-07-11 17:16:01 UTC 2016-07-11
Merge branch 'release/2.19/master' into ibm/2.19/master

Author: Tulio Magno Quites Machado Filho
Author Date: 2016-07-11 17:16:01 UTC

Merge branch 'release/2.19/master' into ibm/2.19/master

Conflicts:
 NEWS

hjl/pr20314 2016-06-30 16:45:25 UTC 2016-06-30
Make copies of cstdlib/cmath and use them

Author: H.J. Lu
Author Date: 2016-06-30 13:56:19 UTC

Make copies of cstdlib/cmath and use them

If C++ headers <cstdlib> or <cmath> are used, GCC 6 will include
/usr/include/stdlib.h or /usr/include/math.h from "#include_next"
(instead of stdlib/stdlib.h or math/math.h in the glibc source
directory), and this turns up as a make dependency. An implicit
rule will kick in and make will try to install stdlib/stdlib.h or
math/math.h as /usr/include/stdlib.h or /usr/include/math.h because
the target is out of date. We make a copy of <cstdlib> and <cmath>
in the glibc build directory so that stdlib/stdlib.h and math/math.h
will be used instead of /usr/include/stdlib.h and /usr/include/math.h.

 [BZ #20314]
 * Makeconfig (CXXFLAGS): Prepend -I$(common-objpfx).
 * Makerules (before-compile): Add $(common-objpfx)cstdlib and
 $(common-objpfx)cmath.
 ($(common-objpfx)cstdlib): New target.
 ($(common-objpfx)cmath): Likewise.

hjl/pr18645 2016-06-30 03:00:39 UTC 2016-06-30
Compile tst-cleanupx4 test with -fexceptions

Author: H.J. Lu
Author Date: 2016-06-01 21:11:16 UTC

Compile tst-cleanupx4 test with -fexceptions

tst-cleanupx4 is linked with tst-cleanupx4.o and tst-cleanup4aux.o.
Since tst-cleanupx4.o is compiled from tst-cleanup4.c with -fexceptions,
tst-cleanup4aux.c should also be compiled with -fexceptions.

Tested on x86-64 and i686.

 [BZ #18645]
 * nptl/Makefile (extra-test-objs): Add tst-cleanupx4aux.o.
 (test-extras): Add tst-cleanupx4aux.
 (CFLAGS-tst-cleanupx4aux.c): New. Set to -fexceptions.
 ($(objpfx)tst-cleanupx4): Replace tst-cleanup4aux.o with
 tst-cleanupx4aux.o.
 * nptl/tst-cleanupx4aux.c: New file.

hjl/pr20139/master 2016-06-29 21:42:06 UTC 2016-06-29
Require binutils 2.24 to build x86-64 glibc

Author: H.J. Lu
Author Date: 2016-06-29 19:37:06 UTC

Require binutils 2.24 to build x86-64 glibc

If the assembler doesn't support AVX512DQ, _dl_runtime_resolve_avx is
used for lazy binding to save the first 8 vector registers, but it only
saves the lower 256 bits of each register. When it is called on an
AVX512 platform, the upper 256 bits of the ZMM registers are clobbered,
so parameters passed in ZMM registers will be wrong the first time the
function is called. This patch requires binutils 2.24, whose assembler
can store and load ZMM registers, to build x86-64 glibc. Since the
mathvec library needs assembler support for AVX512DQ, we disable mathvec
if the assembler doesn't support AVX512DQ.

 [BZ #20139]
 * config.h.in (HAVE_AVX512_ASM_SUPPORT): Renamed to ...
 (HAVE_AVX512DQ_ASM_SUPPORT): This.
 * sysdeps/x86_64/configure.ac: Require assembler from binutils
 2.24 or above.
 (HAVE_AVX512_ASM_SUPPORT): Removed.
 (HAVE_AVX512DQ_ASM_SUPPORT): New.
 * sysdeps/x86_64/configure: Regenerated.
 * sysdeps/x86_64/dl-trampoline.S: Make HAVE_AVX512_ASM_SUPPORT
 check unconditional.
 * sysdeps/x86_64/multiarch/ifunc-impl-list.c: Likewise.
 * sysdeps/x86_64/multiarch/memcpy.S: Likewise.
 * sysdeps/x86_64/multiarch/memcpy_chk.S: Likewise.
 * sysdeps/x86_64/multiarch/memmove-avx512-no-vzeroupper.S:
 Likewise.
 * sysdeps/x86_64/multiarch/memmove-avx512-unaligned-erms.S:
 Likewise.
 * sysdeps/x86_64/multiarch/memmove.S: Likewise.
 * sysdeps/x86_64/multiarch/memmove_chk.S: Likewise.
 * sysdeps/x86_64/multiarch/mempcpy.S: Likewise.
 * sysdeps/x86_64/multiarch/mempcpy_chk.S: Likewise.
 * sysdeps/x86_64/multiarch/memset-avx512-no-vzeroupper.S:
 Likewise.
 * sysdeps/x86_64/multiarch/memset-avx512-unaligned-erms.S:
 Likewise.
 * sysdeps/x86_64/multiarch/memset.S: Likewise.
 * sysdeps/x86_64/multiarch/memset_chk.S: Likewise.
 * sysdeps/x86_64/fpu/multiarch/svml_d_cos8_core_avx512.S: Check
 HAVE_AVX512DQ_ASM_SUPPORT instead of HAVE_AVX512_ASM_SUPPORT.
 * sysdeps/x86_64/fpu/multiarch/svml_d_exp8_core_avx512.S:
 Likewise.
 * sysdeps/x86_64/fpu/multiarch/svml_d_log8_core_avx512.S:
 Likewise.
 * sysdeps/x86_64/fpu/multiarch/svml_d_pow8_core_avx512.S:
 Likewise.
 * sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core_avx512.S:
 Likewise.
 * sysdeps/x86_64/fpu/multiarch/svml_d_sincos8_core_avx512.:
 Likewise.
 * sysdeps/x86_64/fpu/multiarch/svml_s_cosf16_core_avx512.S:
 Likewise.
 * sysdeps/x86_64/fpu/multiarch/svml_s_expf16_core_avx512.S:
 Likewise.
 * sysdeps/x86_64/fpu/multiarch/svml_s_logf16_core_avx512.S:
 Likewise.
 * sysdeps/x86_64/fpu/multiarch/svml_s_powf16_core_avx512.S:
 Likewise.
 * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf16_core_avx51:
 Likewise.
 * sysdeps/x86_64/fpu/multiarch/svml_s_sinf16_core_avx512.S:
 Likewise.

hjl/pr20309/master 2016-06-29 14:38:22 UTC 2016-06-29
X86-64: Properly align stack in _dl_tlsdesc_dynamic

Author: H.J. Lu
Author Date: 2016-06-29 11:01:58 UTC

X86-64: Properly align stack in _dl_tlsdesc_dynamic

Since _dl_tlsdesc_dynamic is called via PLT, we need to add 8 bytes for
push in the PLT entry to align the stack.

 [BZ #20309]
 * configure.ac (have-mtls-dialect-gnu2): Set to yes if
 -mtls-dialect=gnu2 works.
 * configure: Regenerated.
 * elf/Makefile [have-mtls-dialect-gnu2 = yes]
 (tests): Add tst-gnu2-tls1.
 (modules-names): Add tst-gnu2-tls1mod.
 ($(objpfx)tst-gnu2-tls1): New.
 (tst-gnu2-tls1mod.so-no-z-defs): Likewise.
 (CFLAGS-tst-gnu2-tls1mod.c): Likewise.
 * elf/tst-gnu2-tls1.c: New file.
 * elf/tst-gnu2-tls1mod.c: Likewise.
 * sysdeps/x86_64/dl-tlsdesc.S (_dl_tlsdesc_dynamic): Add 8
 bytes for push in the PLT entry to align the stack.
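
The alignment arithmetic behind this fix can be illustrated with a small helper. The x86-64 psABI requires the stack to be 16-byte aligned at a call, so the 8-byte return address leaves it 8 bytes off, and the extra 8-byte push in the PLT entry shifts it again. The function name and the 72-byte frame below are hypothetical and are not taken from dl-tlsdesc.S.

```c
/* Smallest frame size >= needed such that already_pushed + frame is a
   multiple of align.  already_pushed counts the bytes put on the stack
   since the last aligned point (return address, PLT push, ...). */
unsigned long aligned_frame_size(unsigned long already_pushed,
                                 unsigned long needed,
                                 unsigned long align)
{
    unsigned long total = already_pushed + needed;
    unsigned long rounded = (total + align - 1) / align * align;
    return rounded - already_pushed;
}
```

For a hypothetical 72-byte frame, aligned_frame_size(8, 72, 16) is 72 when only the return address has been pushed, while aligned_frame_size(16, 72, 16) is 80 once the PLT push is counted: the 8 extra bytes the commit message refers to.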

1-100 of 285 results

Other repositories

Name Last Modified
lp:glibc 4 hours ago
1-1 of 1 result