Get this repository:
git clone https://git.launchpad.net/glibc

Branches

Name Last Modified Last Commit
hjl/cet/pr21598 2017-06-22 16:26:26 UTC 5 hours ago
i386: Add _dl_runtime_resolve_shstk [BZ #21598]

Author: H.J. Lu
Author Date: 2017-06-22 15:51:42 UTC

i386: Add _dl_runtime_resolve_shstk [BZ #21598]

Add a SHSTK compatible symbol resolver to support Shadow Stack in Intel
Control-flow Enforcement Technology (CET) instructions:

https://software.intel.com/sites/default/files/managed/4d/2a/control-flow-enforcement-technology-preview.pdf

 [BZ #21598]
 * sysdeps/i386/dl-machine.h (elf_machine_runtime_setup): Use
 _dl_runtime_resolve_shstk if SHSTK is usable.
 * sysdeps/i386/dl-trampoline.S (_dl_runtime_resolve_shstk): New.
 * sysdeps/x86/cpu-features.h (bit_arch_IBT_Usable): New.
 (bit_arch_SHSTK_Usable): Likewise.
 (bit_cpu_SHSTK): Likewise.
 (index_cpu_IBT): Likewise.
 (index_cpu_SHSTK): Likewise.
 (index_arch_IBT_Usable): Likewise.
 (index_arch_SHSTK_Usable): Likewise.
 (reg_IBT): Likewise.
 (reg_SHSTK): Likewise.

hjl/cet/property 2017-06-22 16:22:48 UTC 5 hours ago
x86: Add <sys/cet.h> to support Intel CET

Author: H.J. Lu
Author Date: 2017-06-22 11:15:39 UTC

x86: Add <sys/cet.h> to support Intel CET

To support Intel Control-flow Enforcement Technology (CET) instructions:

https://software.intel.com/sites/default/files/managed/4d/2a/control-flow-enforcement-technology-preview.pdf

include sysdeps/unix/sysv/linux/x86/sys/cet.h for assembly and C sources
so that the ELF program property can be added to all relocatable objects if
__IBT__ or __SHSTK__ is defined. If the compiler defines __IBT__, the
IBT bit is turned on in the x86 feature. If the compiler defines __SHSTK__,
the SHSTK bit is turned on in the x86 feature.

 * sysdeps/unix/sysv/linux/x86/Makefile (CPPFLAGS-.o): Add
 -include $(..)sysdeps/unix/sysv/linux/x86/sys/cet.h.
 (CPPFLAGS-.os): Likewise.
 (CPPFLAGS-.op): Likewise.
 * sysdeps/unix/sysv/linux/x86/sys/cet.h: New file.
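
The macro-to-bit mapping described in this commit message can be sketched in C as follows. This is only an illustration, not the contents of sys/cet.h: the EXAMPLE_ constants and both helper functions are hypothetical names, and the bit positions (IBT in bit 0, SHSTK in bit 1) are assumed from the x86 psABI feature property rather than taken from this branch.

```c
#include <stdint.h>

/* Assumed bit positions (x86 psABI feature property); the EXAMPLE_
   prefix marks these names as local to this sketch.  */
#define EXAMPLE_FEATURE_1_IBT   (UINT32_C (1) << 0)
#define EXAMPLE_FEATURE_1_SHSTK (UINT32_C (1) << 1)

/* Hypothetical helper: build the feature word from two flags that stand
   in for "the compiler defined __IBT__" and "the compiler defined
   __SHSTK__".  */
static uint32_t
example_cet_feature_word (int have_ibt, int have_shstk)
{
  uint32_t word = 0;
  if (have_ibt)
    word |= EXAMPLE_FEATURE_1_IBT;
  if (have_shstk)
    word |= EXAMPLE_FEATURE_1_SHSTK;
  return word;
}

/* The feature word this translation unit would request, judged only by
   the two macros named in the commit message.  */
static uint32_t
example_cet_feature_word_for_this_unit (void)
{
#ifdef __IBT__
  const int ibt = 1;
#else
  const int ibt = 0;
#endif
#ifdef __SHSTK__
  const int shstk = 1;
#else
  const int shstk = 0;
#endif
  return example_cet_feature_word (ibt, shstk);
}
```

An object built with neither macro defined would request no CET feature bits under this sketch.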

master 2017-06-22 16:08:17 UTC 5 hours ago
Describe remainder as primary and drem as alternative in the manual

Author: Gabriel F. T. Gomes
Author Date: 2017-06-20 18:00:16 UTC

Describe remainder as primary and drem as alternative in the manual

In preparation for the documentation of _FloatN and _FloatNx variants of
the remainder function, this patch changes the descriptions of remainder
and drem, so that remainder is described as primary and drem as an
alternative name for the same functionality.

 * manual/arith.texi (Remainder Functions): Describe remainder as
 primary and drem as an alternative name. Change the comment on
 remainder to ISO, since it is defined in ISO C99.

hjl/cet/setjmp 2017-06-21 19:06:57 UTC 2017-06-21
Add bits/setjmp3.h

Author: H.J. Lu
Author Date: 2017-06-11 00:23:06 UTC

Add bits/setjmp3.h

hjl/memcmp/avx2 2017-06-21 17:16:07 UTC 2017-06-21
Add do_test2

Author: H.J. Lu
Author Date: 2017-06-21 17:16:07 UTC

Add do_test2

ibm/2.24/master 2017-06-21 14:18:55 UTC 2017-06-21
Merge branch 'release/2.24/master' into ibm/2.24/master

Author: Tulio Magno Quites Machado Filho
Author Date: 2017-06-21 14:18:55 UTC

Merge branch 'release/2.24/master' into ibm/2.24/master

hjl/tunables/master 2017-06-21 13:09:43 UTC 2017-06-21
x86: Rename glibc.tune.ifunc to glibc.tune.hwcaps

Author: H.J. Lu
Author Date: 2017-06-21 12:38:03 UTC

x86: Rename glibc.tune.ifunc to glibc.tune.hwcaps

Rename glibc.tune.ifunc to glibc.tune.hwcaps and move it to
sysdeps/x86/dl-tunables.list since it is x86 specific. Also
change type of data_cache_size, shared_cache_size and
non_temporal_threshold to unsigned long int to match size_t.
Remove usage of DEFAULT_STRLEN from cpu-tunables.c.

 * elf/dl-tunables.list (glibc.tune.ifunc): Removed.
 * sysdeps/x86/dl-tunables.list (glibc.tune.hwcaps): New.
 Remove security_level on all fields.
 * manual/tunables.texi: Replace ifunc with hwcaps.
 * sysdeps/x86/cpu-features.c (TUNABLE_CALLBACK (set_ifunc)):
 Renamed to ...
 (TUNABLE_CALLBACK (set_hwcaps)): This.
 (init_cpu_features): Updated.
 * sysdeps/x86/cpu-features.h (cpu_features): Change type of
 data_cache_size, shared_cache_size and non_temporal_threshold to
 unsigned long int.
 * sysdeps/x86/cpu-tunables.c (DEFAULT_STRLEN): Removed.
 (TUNABLE_CALLBACK (set_ifunc)): Renamed to ...
 (TUNABLE_CALLBACK (set_hwcaps)): This. Update comments. Don't
 use DEFAULT_STRLEN.

aaribaud/y2038-2.25 2017-06-21 10:46:07 UTC 2017-06-21
Y2038: implement Y2038-ready fstatat64, fxstatat (WIP)

Author: Albert ARIBAUD (3ADEV)
Author Date: 2017-05-24 08:27:17 UTC

Y2038: implement Y2038-ready fstatat64, fxstatat (WIP)

WIP: there is no Y2038-proof linux struct stat for now,
so these implementations just use the existing syscalls
and convert from kernel 32-bit-time struct stat64 to
GLIBC Y2038-ready struct stat64
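
The conversion this WIP note describes amounts to widening each 32-bit timestamp into a 64-bit one. A minimal sketch follows, assuming a signed 32-bit seconds field on the kernel side; the struct and function names are hypothetical and do not reflect the layout of struct stat64 in this branch.

```c
#include <stdint.h>

/* Hypothetical timestamp layouts: a kernel-style one with 32-bit
   seconds, and a Y2038-ready one with 64-bit seconds.  */
struct example_ts32 { int32_t sec; int32_t nsec; };
struct example_ts64 { int64_t sec; int32_t nsec; };

/* Widen one timestamp.  The cast sign-extends, so times before 1970
   stay negative.  */
static struct example_ts64
example_widen_timestamp (struct example_ts32 in)
{
  struct example_ts64 out;
  out.sec = (int64_t) in.sec;
  out.nsec = in.nsec;
  return out;
}
```

Widening alone cannot represent times past 2038; it only changes the container. That matches the note that these implementations are placeholders until a Y2038-proof kernel structure exists.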

zack/headers-cleanups 2017-06-21 00:33:19 UTC 2017-06-21
Don't install libio.h or _G_config.h.

Author: Zack Weinberg
Author Date: 2017-04-20 15:21:30 UTC

Don't install libio.h or _G_config.h.

This is an experimental patch which removes libio.h (and _G_config.h)
from the set of application-exposed headers. After this change, the
public stdio.h does not define any symbols whose names begin with _G_
nor _IO_, except that when optimizing, the guts of struct _IO_FILE and
three of the flag constants are visible (see bits/stdio.h and
bits/types/FILE_internals.h). There is a small amount of code
duplication in bits/stdio.h, of macro bodies from libio.h that are no
longer available. A number of internal .c files that were manually
doing PLT bypass for flockfile/funlockfile can now rely on
include/stdio.h to do it for them.

It passes the testsuite on x86_64-linux, but it needs a great deal of
additional testing; in particular I'm almost certain I broke the
support for old-format (GLIBC_2.0) struct _IO_FILE, which is
configured out on this target. Testing this properly would require
someone to get their hands on _really_ old binaries, compiled against
glibc 2.0, possibly statically-linked-but-using-NSS. Unfortunately,
libc.so cannot be expected to be binary identical.

However, this should be ready to feed into archive rebuilds to find
out what applications break.

Substantial clean-ups to the libio implementation are possible if this
sticks, but I haven't done 'em; this is intended to be minimal.

 * libio/Makefile: Don't install libio.h or _G_config.h. Do install
 bits/types/FILE_internals.h, bits/types/cookie_io_functions_t.h,
 and bits/types/__fpos_t.h.

 * libio/stdio.h: Don't include libio.h. Get __gnuc_va_list
 directly from stdarg.h, __fpos_t and __fpos64_t from
 bits/types/__fpos_t.h, and the cookie types from
 bits/types/cookie_io_functions_t.h. Change all uses of
 _G_va_list, _G_fpos_t, _G_fpos64_t, _IO_FILE,
 _IO_cookie_io_functions_t, and _IO_ssize_t to __gnuc_va_list,
 __fpos_t, __fpos64_t, FILE, cookie_io_functions_t, and __ssize_t
 respectively.
 Do not define getc nor putc as macros.
 Define BUFSIZ as literal 8192.

 * libio/bits/types/FILE_internals.h: New header. Provide complete
 definition of struct _IO_FILE (the complete version) here.
 Duplicate definitions of _IO_EOF_SEEN, _IO_ERR_SEEN, and _IO_USER_LOCK
 here, with value assertions if they are already defined.
 * libio/bits/types/__fpos_t.h: New header. Define __fpos_t and
 __fpos64_t here.
 * libio/bits/types/cookie_io_functions_t.h: New header. Define
 cookie_read_function_t, cookie_write_function_t,
 cookie_seek_function_t, cookie_close_function_t, and
 cookie_io_functions_t here.

 * libio/libio.h: Include features.h first thing, then error out if
 either _LIBC or __USE_GNU is not defined, or if _ISOMAC is
 defined. Inline all of _G_config.h except _G_HAVE_MREMAP here.
 Get definitions of __mbstate_t, __fpos_t, __fpos64_t, struct
 _IO_FILE, and the cookie-related types from the relevant
 bits/types headers. Get definition of NULL from stddef.h.
 Make all #ifdef _LIBC and #if __GNUC__ >= (2,3) blocks
 unconditional. Remove all #if 0 and #ifdef __cplusplus blocks.
 Change all uses of _G_va_list, _G_fpos_t, and _G_fpos64_t to
 __gnuc_va_list, __fpos_t, __fpos64_t respectively. Provide
 definitions of _STDIO_USES_IOSTREAM, __HAVE_COLUMN,
 _IO_file_flags, __io_read_fn, __io_write_fn, __io_seek_fn,
 __io_close_fn, _IO_cookie_io_functions_t for the sake of the
 implementation. When _IO_USE_OLD_IO_FILE is defined, define
 struct _IO_FILE_old.
 * libio/libioP.h: When _IO_USE_OLD_IO_FILE is defined, define
 struct _IO_FILE_old_plus. Only declare _IO_old_file_init_internal
 when _IO_USE_OLD_IO_FILE is defined, and have it take an
 argument of type struct _IO_FILE_old_plus.
 * libio/oldfileops.c: Change all uses of _IO_FILE to _IO_FILE_old,
 _IO_FILE_plus to _IO_FILE_old_plus, _IO_FILE_complete to _IO_FILE,
 and _IO_FILE_complete_plus to _IO_FILE_plus. Then adjust types
 to match caller/callee's expectations.
 * libio/oldiofdopen.c, libio/oldiofopen.c, libio/oldiopopen.c
 * libio/oldstdfiles.c: Likewise.
 * sysdeps/generic/_G_config.h, sysdeps/unix/sysv/linux/_G_config.h:
 Only provide definition or non-definition of _G_HAVE_MREMAP.

 * sysdeps/ieee754/ldbl-opt/nldbl-iovfscanf.c: Delete file.
 * sysdeps/ieee754/ldbl-opt/Makefile (libnldbl-calls): Remove iovfscanf.
 * sysdeps/ieee754/ldbl-opt/nldbl-compat.c: Define
 __nldbl__IO_vsprintf as alias to __nldbl_vsprintf instead of
 the other way around.
 * sysdeps/ieee754/ldbl-opt/nldbl-compat.h:
 Change all uses of _G_va_list to __gnuc_va_list. Remove
 NLDBL_DECL for _IO_vfscanf.
 * sysdeps/ieee754/ldbl-opt/nldbl-fscanf.c
 * sysdeps/ieee754/ldbl-opt/nldbl-scanf.c
 * sysdeps/ieee754/ldbl-opt/nldbl-vfscanf.c
 * sysdeps/ieee754/ldbl-opt/nldbl-vscanf.c:
 Use __nldbl_vfscanf, not __nldbl__IO_vfscanf.

 * libio/bits/stdio.h: Add multiple-inclusion guard. Include
 bits/types/FILE_internals.h. Declare __uflow and __overflow here.
 Remove redundant __USE_EXTERN_INLINES ifdef. Change all uses of
 _G_va_list to __gnuc_va_list and _IO_ssize_t to __ssize_t.
 (getchar): Use getc, not _IO_getc.
 (__getc_unlocked, __putc_unlocked): New inlines, duplicating the
 bodies of _IO_getc_unlocked and _IO_putc_unlocked.
 (fgetc_unlocked, getc_unlocked, getchar_unlocked, fread_unlocked):
 Use __getc_unlocked.
 (fputc_unlocked, putc_unlocked, putchar_unlocked, fwrite_unlocked):
 Use __putc_unlocked.
 (feof_unlocked): Duplicate the body of _IO_feof_unlocked here.
 (ferror_unlocked): Duplicate the body of _IO_ferror_unlocked here.
 * libio/bits/stdio2.h: Change all uses of _G_va_list to __gnuc_va_list.
 (fread_unlocked): Use __getc_unlocked.
 * libio/bits/types/FILE.h, libio/bits/types/__FILE.h: Explain in
 comments why the name _IO_FILE is used.

 * include/stdio.h: Change all uses of _G_va_list to __gnuc_va_list,
 _IO_ssize_t to __ssize_t, _IO_FILE to FILE, and _IO_fpos_t to __fpos_t.
 When IS_IN (libc), redirect flockfile and funlockfile to
 __flockfile and __funlockfile respectively.
 When _IO_MTSAFE_IO and not _ISOMAC, include stdio-lock.h before
 stdio.h proper.
 * include/stdio_ext.h: Include bits/types/FILE_internals.h for the
 sake of the inline definition of __fsetlocking.
 * include/libio.h: Adjust #ifdef nest to activate multiple-include
 optimization.
 * include/bits/types/FILE_internals.h, include/bits/types/__fpos_t.h
 * include/bits/types/cookie_io_functions_t.h: New trivial wrappers.
 * include/bits/stdio.h: New wrapper; mark __uflow and __overflow
 as hidden for intra-libc callers.

 * csu/init.c: Include libio.h, not _G_config.h.

 * grp/fgetgrent_r.c, grp/putgrent.c, gshadow/fgetsgent_r.c
 * gshadow/putsgent.c, misc/getpass.c, misc/getttyent.c
 * misc/mntent_r.c, posix/getopt.c, pwd/fgetpwent_r.c
 * shadow/fgetspent_r.c, shadow/putspent.c:
 Don't include libio/iolibio.h. Don't redefine flockfile or
 funlockfile. Don't use _IO_flockfile or _IO_funlockfile.

 * libio/__fbufsize.c, libio/__flbf.c, libio/__fpending.c
 * libio/__freadable.c, libio/__freading.c, libio/__fwritable.c
 * libio/__fwriting.c, malloc/malloc.c: Include libio.h.
 * misc/err.c: Include libio.h. Don't redefine flockfile or funlockfile.

 * stdio-common/tstgetln.c: Include sys/types.h. Don't redefine ssize_t.
 * conform/data/stdio.h-data: va_list may be defined as __gnuc_va_list,
 not _G_va_list.
 * benchtests/strcoll-inputs/filelist#en_US.UTF-8: Remove _G_config.h.

release/2.23/master 2017-06-20 04:28:29 UTC 2017-06-20
i686: Add missing IS_IN (libc) guards to vectorized strcspn

Author: Florian Weimer
Author Date: 2017-06-14 06:11:22 UTC

i686: Add missing IS_IN (libc) guards to vectorized strcspn

Since commit d957c4d3fa48d685ff2726c605c988127ef99395 (i386: Compile
rtld-*.os with -mno-sse -mno-mmx -mfpmath=387), vector intrinsics can
no longer be used in ld.so, even if the compiled code never makes it
into the final ld.so link. This commit adds the missing IS_IN (libc)
guard to the SSE 4.2 strcspn implementation, so that it can be used from
ld.so in the future.

(cherry picked from commit 69052a3a95da37169a08f9e59b2cc1808312753c)
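
The guard pattern this commit refers to can be sketched as below. The MODULE_ numbers and the IN_MODULE setting are hypothetical stand-ins for values the glibc build system supplies, and the enum and function are invented for illustration; only the shape of the IS_IN test is meant to match.

```c
/* Hypothetical stand-ins for build-time module macros.  Here the file
   pretends to be compiled for ld.so (rtld).  */
#define MODULE_libc 1
#define MODULE_rtld 2
#define IN_MODULE MODULE_rtld
#define IS_IN(lib) (IN_MODULE == MODULE_##lib)

enum example_variant { EXAMPLE_GENERIC, EXAMPLE_SSE42 };

/* Report which implementation this build of the file would contain.  */
static enum example_variant
example_strcspn_variant (void)
{
#if IS_IN (libc)
  /* Vector-intrinsic code would be compiled only in this branch.  */
  return EXAMPLE_SSE42;
#else
  /* Built for ld.so: the vector code is never seen by the compiler.  */
  return EXAMPLE_GENERIC;
#endif
}
```

Because the preprocessor drops the guarded branch entirely, the intrinsics are never compiled with the restrictive rtld flags.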

release/2.24/master 2017-06-20 04:27:56 UTC 2017-06-20
i686: Add missing IS_IN (libc) guards to vectorized strcspn

Author: Florian Weimer
Author Date: 2017-06-14 06:11:22 UTC

i686: Add missing IS_IN (libc) guards to vectorized strcspn

Since commit d957c4d3fa48d685ff2726c605c988127ef99395 (i386: Compile
rtld-*.os with -mno-sse -mno-mmx -mfpmath=387), vector intrinsics can
no longer be used in ld.so, even if the compiled code never makes it
into the final ld.so link. This commit adds the missing IS_IN (libc)
guard to the SSE 4.2 strcspn implementation, so that it can be used from
ld.so in the future.

(cherry picked from commit 69052a3a95da37169a08f9e59b2cc1808312753c)

release/2.25/master 2017-06-20 04:27:09 UTC 2017-06-20
i686: Add missing IS_IN (libc) guards to vectorized strcspn

Author: Florian Weimer
Author Date: 2017-06-14 06:11:22 UTC

i686: Add missing IS_IN (libc) guards to vectorized strcspn

Since commit d957c4d3fa48d685ff2726c605c988127ef99395 (i386: Compile
rtld-*.os with -mno-sse -mno-mmx -mfpmath=387), vector intrinsics can
no longer be used in ld.so, even if the compiled code never makes it
into the final ld.so link. This commit adds the missing IS_IN (libc)
guard to the SSE 4.2 strcspn implementation, so that it can be used from
ld.so in the future.

(cherry picked from commit 69052a3a95da37169a08f9e59b2cc1808312753c)

hjl/ifunc/master 2017-06-19 14:42:44 UTC 2017-06-19
x86-64: Implement strcmp family IFUNC selectors in C

Author: H.J. Lu
Author Date: 2017-06-13 00:08:44 UTC

x86-64: Implement strcmp family IFUNC selectors in C

Implement strcmp family IFUNC selectors in C.

All internal calls within libc.so can use IFUNC on x86-64 since unlike
x86, x86-64 supports PC-relative addressing to access the GOT entry so
that it can call via PLT without using an extra register. For libc.a,
we can't use IFUNC for functions which are called before IFUNC has been
initialized. Using IFUNC internally reduces the icache footprint since
libc.so and other code in the process use the same implementations.
This patch uses IFUNC for strcmp family functions within libc.

 * sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Add
 strcmp-sse2, strcmp-sse4_2, strncmp-sse2, strncmp-sse4_2,
 strcasecmp_l-sse2, strcasecmp_l-sse4_2, strcasecmp_l-avx,
 strncase_l-sse2, strncase_l-sse4_2 and strncase_l-avx.
 * sysdeps/x86_64/multiarch/ifunc-strcasecmp.h: New file.
 * sysdeps/x86_64/multiarch/strcasecmp.c: Likewise.
 * sysdeps/x86_64/multiarch/strcasecmp_l-avx.S: Likewise.
 * sysdeps/x86_64/multiarch/strcasecmp_l-sse2.S: Likewise.
 * sysdeps/x86_64/multiarch/strcasecmp_l-sse4_2.S: Likewise.
 * sysdeps/x86_64/multiarch/strcasecmp_l.c: Likewise.
 * sysdeps/x86_64/multiarch/strcmp-sse2.S: Likewise.
 * sysdeps/x86_64/multiarch/strcmp-sse4_2.S: Likewise.
 * sysdeps/x86_64/multiarch/strcmp.c: Likewise.
 * sysdeps/x86_64/multiarch/strncase.c: Likewise.
 * sysdeps/x86_64/multiarch/strncase_l-avx.S: Likewise.
 * sysdeps/x86_64/multiarch/strncase_l-sse2.S: Likewise.
 * sysdeps/x86_64/multiarch/strncase_l-sse4_2.S: Likewise.
 * sysdeps/x86_64/multiarch/strncase_l.c: Likewise.
 * sysdeps/x86_64/multiarch/strncmp-sse2.S: Likewise.
 * sysdeps/x86_64/multiarch/strncmp-sse4_2.S: Likewise.
 * sysdeps/x86_64/multiarch/strncmp.c: Likewise.
 * sysdeps/x86_64/multiarch/strcasecmp_l.S: Removed.
 * sysdeps/x86_64/multiarch/strcmp.S: Likewise.
 * sysdeps/x86_64/multiarch/strncase_l.S: Likewise.
 * sysdeps/x86_64/multiarch/strncmp.S: Likewise.
 * sysdeps/x86_64/multiarch/strcmp-sse42.S: Include <sysdep.h>.
 (STRCMP_SSE42): New. Defined to __strcmp_sse42 if not defined.
 [USE_AS_STRCASECMP_L || USE_AS_STRNCASECMP_L]: Include
 "locale-defines.h".
 (UPDATE_STRNCMP_COUNTER): New.
 (SECTION): Likewise.
 (GLABEL): Likewise.
 (LABEL): Likewise.
 * sysdeps/x86_64/multiarch/strncmp-ssse3.S: Rewrite and enable
 for libc.a.

hjl/pr21598 2017-06-17 13:15:23 UTC 2017-06-17
i386: Update _dl_runtime_resolve/_dl_runtime_profile

Author: H.J. Lu
Author Date: 2017-06-16 21:32:02 UTC

i386: Update _dl_runtime_resolve/_dl_runtime_profile

To make symbol resolver compatible with Shadow Stack in Intel Control-flow
Enforcement Technology (CET) instructions:

https://software.intel.com/sites/default/files/managed/4d/2a/control-flow-enforcement-technology-preview.pdf

call resolved function indirectly with %ecx.

 [BZ #21598]
 * sysdeps/i386/dl-trampoline.S (_dl_runtime_resolve): Call
 resolved function indirectly with %ecx.
 (_dl_runtime_profile): Likewise.

zack/build-layout-experiment 2017-06-08 19:39:03 UTC 2017-06-08
Prepare for radical source tree reorganization.

Author: Zack Weinberg
Author Date: 2017-06-08 19:39:03 UTC

Prepare for radical source tree reorganization.

All top-level files and directories are moved into a temporary storage
directory, REORG.TODO, except for files that will certainly still
exist in their current form at top level when we're done (COPYING,
COPYING.LIB, LICENSES, NEWS, README), all old ChangeLog files (which
are moved to the new directory OldChangeLogs, instead), and the
generated file INSTALL (which is just deleted; in the new order, there
will be no generated files checked into version control).

hjl/avx2/master 2017-06-08 12:07:18 UTC 2017-06-08
x86-64: Optimize strrchr/wcsrchr with AVX2

Author: H.J. Lu
Author Date: 2017-05-26 19:21:55 UTC

x86-64: Optimize strrchr/wcsrchr with AVX2

Optimize strrchr/wcsrchr with AVX2 to check 32 bytes with vector
instructions. It is as fast as SSE2 version for small data sizes
and up to 1X faster for large data sizes on Haswell. Select AVX2
version on AVX2 machines where vzeroupper is preferred and AVX
unaligned load is fast.

 * sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Add
 strrchr-sse2, strrchr-avx2, wcsrchr-sse2 and wcsrchr-avx2.
 * sysdeps/x86_64/multiarch/ifunc-impl-list.c
 (__libc_ifunc_impl_list): Add tests for __strrchr_avx2,
 __strrchr_sse2, __wcsrchr_avx2 and __wcsrchr_sse2.
 * sysdeps/x86_64/multiarch/strrchr-avx2.S: New file.
 * sysdeps/x86_64/multiarch/strrchr-sse2.S: Likewise.
 * sysdeps/x86_64/multiarch/strrchr.c: Likewise.
 * sysdeps/x86_64/multiarch/wcsrchr-avx2.S: Likewise.
 * sysdeps/x86_64/multiarch/wcsrchr-sse2.S: Likewise.
 * sysdeps/x86_64/multiarch/wcsrchr.c: Likewise.

hjl/avx2/fix 2017-06-06 12:31:48 UTC 2017-06-06
x86-64: Move wcsnlen.S to multiarch/wcsnlen-sse4_1.S

Author: H.J. Lu
Author Date: 2017-06-06 12:31:48 UTC

x86-64: Move wcsnlen.S to multiarch/wcsnlen-sse4_1.S

Since wcsnlen.S uses pminud, which is part of SSE4.1, move wcsnlen.S
to multiarch/wcsnlen-sse4_1.S.

 * sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Add
 wcsnlen-sse4_1 and wcsnlen-c.
 * sysdeps/x86_64/multiarch/ifunc-impl-list.c
 (__libc_ifunc_impl_list): Test __wcsnlen_sse4_1 and
 __wcsnlen_sse2.
 * sysdeps/x86_64/multiarch/ifunc-sse4_1.h: New file.
 * sysdeps/x86_64/multiarch/wcsnlen-c.c: Likewise.
 * sysdeps/x86_64/multiarch/wcsnlen-sse4_1.S: Likewise.
 * sysdeps/x86_64/multiarch/wcsnlen.c: Likewise.
 * sysdeps/x86_64/wcsnlen.S: Removed.

hjl/avx2/c 2017-06-05 22:09:59 UTC 2017-06-05
x86-64: Optimize strrchr/wcsrchr with AVX2

Author: H.J. Lu
Author Date: 2017-05-26 19:21:55 UTC

x86-64: Optimize strrchr/wcsrchr with AVX2

Optimize strrchr/wcsrchr with AVX2 to check 32 bytes with vector
instructions. It is as fast as SSE2 version for small data sizes
and up to 1X faster for large data sizes on Haswell. Select AVX2
version on AVX2 machines where vzeroupper is preferred and AVX
unaligned load is fast.

 * sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Add
 strrchr-sse2, strrchr-avx2, wcsrchr-sse2 and wcsrchr-avx2.
 * sysdeps/x86_64/multiarch/ifunc-impl-list.c
 (__libc_ifunc_impl_list): Add tests for __strrchr_avx2,
 __strrchr_sse2, __wcsrchr_avx2 and __wcsrchr_sse2.
 * sysdeps/x86_64/multiarch/strrchr-avx2.S: New file.
 * sysdeps/x86_64/multiarch/strrchr-sse2.S: Likewise.
 * sysdeps/x86_64/multiarch/strrchr.c: Likewise.
 * sysdeps/x86_64/multiarch/wcsrchr-avx2.S: Likewise.
 * sysdeps/x86_64/multiarch/wcsrchr-sse2.S: Likewise.
 * sysdeps/x86_64/multiarch/wcsrchr.c: Likewise.

zack/build-experiments 2017-06-01 12:47:44 UTC 2017-06-01
Experimenting with alternatives to VPATH.

Author: Zack Weinberg
Author Date: 2017-06-01 12:47:44 UTC

Experimenting with alternatives to VPATH.

hjl/master 2017-05-29 13:47:22 UTC 2017-05-29
x86: Update __x86_shared_non_temporal_threshold

Author: H.J. Lu
Author Date: 2017-05-12 20:38:04 UTC

x86: Update __x86_shared_non_temporal_threshold

__x86_shared_non_temporal_threshold was set to 6 times the per-core
shared cache size, based on the large memcpy micro benchmark in glibc
on an 8-core processor. For a processor with more than 8 cores, the
threshold is too low. Set __x86_shared_non_temporal_threshold to
3/4 of the total shared cache size so that it is unchanged on 8-core
processors. On processors with fewer than 8 cores, the threshold is
lower.

 * sysdeps/x86/cacheinfo.c (__x86_shared_non_temporal_threshold):
 Set to 3/4 of the total shared cache size.
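
The arithmetic behind this change can be checked with a short sketch. The function names are hypothetical, and the total shared cache size is assumed here to be the per-core share multiplied by the core count.

```c
#include <stddef.h>

/* Old rule: 6 times the per-core share of the shared cache.  */
static size_t
example_old_threshold (size_t per_core_shared)
{
  return per_core_shared * 6;
}

/* New rule: 3/4 of the total shared cache, with the total assumed to be
   per-core share times number of cores.  */
static size_t
example_new_threshold (size_t per_core_shared, unsigned int cores)
{
  size_t total = per_core_shared * cores;
  return total * 3 / 4;
}
```

With a 1 MiB per-core share, both rules give 6 MiB on 8 cores (8 x 3/4 = 6), the new rule gives 12 MiB on 16 cores, and 3 MiB on 4 cores.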

tuliom/float128 2017-05-25 19:14:47 UTC 2017-05-25
powerpc64le: Enable float128

Author: Paul E. Murphy
Author Date: 2016-08-09 21:48:54 UTC

powerpc64le: Enable float128

Add ulps for the float128 type, bits/floatn.h, and float128-abi.h.

Likewise, sqrt is not implemented in libgcc. The sfp-machine.h
header is taken from libgcc, and used to build a P7/P8 soft-fp
sqrtf128.

 * sysdeps/powerpc/fpu/libm-test-ulps: Regenerated.
 * sysdeps/powerpc/fpu/math_private.h:
 (__ieee754_sqrtf128): New inline override.
 * sysdeps/powerpc/powerpc64le/Implies-before: New file.
 * sysdeps/powerpc/powerpc64le/Makefile: New file.
 * sysdeps/powerpc/powerpc64le/bits/floatn.h: New file.
 * sysdeps/powerpc/powerpc64le/fpu/e_sqrtf128.c: New file.
 * sysdeps/powerpc/powerpc64le/fpu/sfp-machine.h: New file.
 * sysdeps/powerpc/powerpc64le/power9/fpu/e_sqrtf128.c: New file.

 * sysdeps/unix/sysv/linux/powerpc/powerpc64/libc-le.abilist:
 Regenerated.
 * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist:
 Likewise.

 * sysdeps/unix/sysv/linux/powerpc/powerpc64le/float128-abi.h:
 New file.

hjl/cacheinfo/master 2017-05-24 13:35:25 UTC 2017-05-24
x86: Add cache info to cpu_features

Author: H.J. Lu
Author Date: 2017-05-24 03:22:13 UTC

x86: Add cache info to cpu_features

This patch adds cache info to cpu_features to support tunables for both
cache info as well as CPU features in a single x86 namespace. Since
init_cacheinfo is in libc.so and cpu_features is in ld.so, cache info
and CPU features must be kept in one place for tunables.

 * sysdeps/x86/cacheinfo.c (init_cacheinfo): Use data_size,
 shared_size and non_temporal_threshold from cpu_features if
 they aren't zero.
 * sysdeps/x86/cpu-features.h (cache_info): New.
 (cpu_features): Add cache.

aaribaud/y2038-2.23 2017-05-24 08:29:39 UTC 2017-05-24
Add __fstatat64_t64, __fxstatat_t64

Author: Albert ARIBAUD (3ADEV)
Author Date: 2017-05-24 08:27:17 UTC

Add __fstatat64_t64, __fxstatat_t64

hjl/x86/optimize 2017-05-22 20:15:35 UTC 2017-05-22
Add x86_cache.non_temporal_threshold to GLIBC_TUNABLES

Author: H.J. Lu
Author Date: 2017-05-22 19:00:43 UTC

Add x86_cache.non_temporal_threshold to GLIBC_TUNABLES

Add support for "glibc.x86_cache.non_temporal_threshold=number" to
GLIBC_TUNABLES.

 * elf/dl-tunables.list (x86_cache): New name space.
 * sysdeps/x86/cacheinfo.c [HAVE_TUNABLES] (TUNABLE_NAMESPACE):
 New.
 [HAVE_TUNABLES]: Include <elf/dl-tunables.h>.
 [HAVE_TUNABLES] (DL_TUNABLE_CALLBACK (set_non_temporal_threshold)):
 New.
 [HAVE_TUNABLES] (init_cacheinfo): Call TUNABLE_SET_VAL_WITH_CALLBACK
 with set_non_temporal_threshold.

dj/malloc-tcache 2017-05-11 21:27:35 UTC 2017-05-11
Tweak Makefile, asserts, comments.

Author: DJ Delorie
Author Date: 2017-05-11 21:25:10 UTC

Tweak Makefile, asserts, comments.

* Un-Wundef-ify -DUSE_TCACHE
* More asserts in tcache get/put functions
* Clarify redundancy in tcache structure

fw/syscall-list 2017-05-01 09:55:58 UTC 2017-05-01
<bits/syscall.h>: Use an arch-independent system call list on Linux

Author: Florian Weimer
Author Date: 2017-04-05 19:30:40 UTC

<bits/syscall.h>: Use an arch-independent system call list on Linux

This commit changes the way the list of SYS_* system call macros
is created on Linux. glibc now contains a list of all known system
calls, and the generated <bits/syscall.h> file defines the SYS_
macro only if the corresponding __NR_ macro is defined by the kernel
headers.

As a result, glibc does not have to be rebuilt to pick up
system calls if the glibc sources already know about them. This
means that glibc can be built with older kernel headers, and if
the installed kernel headers are upgraded afterwards, additional
SYS_ macros become available as long as glibc has a record for
those system calls.
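
The conditional definition described above might look like the following sketch. The two system call names are invented, and the first #define stands in for a kernel header; the real <bits/syscall.h> is generated and is not reproduced here.

```c
/* Stand-in for a kernel header: it knows one call but not the other.
   Both names are hypothetical.  */
#define __NR_example_old_call 1
/* __NR_example_new_call is deliberately left undefined.  */

/* Shape of the generated list: one guarded entry per known call.  */
#ifdef __NR_example_old_call
# define SYS_example_old_call __NR_example_old_call
#endif
#ifdef __NR_example_new_call
# define SYS_example_new_call __NR_example_new_call
#endif

/* Helpers that report which SYS_ macros ended up defined.  */
static int
example_have_old_call (void)
{
#ifdef SYS_example_old_call
  return 1;
#else
  return 0;
#endif
}

static int
example_have_new_call (void)
{
#ifdef SYS_example_new_call
  return 1;
#else
  return 0;
#endif
}
```

If newer kernel headers later define __NR_example_new_call, the second SYS_ macro appears the next time an application is compiled, with no change to the list itself.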

linaro/2.21/master 2017-04-21 13:07:56 UTC 2017-04-21
Make io/ftwtest-sh remove temporary files on early exit.

Author: Joseph Myers
Author Date: 2015-10-21 21:18:21 UTC

Make io/ftwtest-sh remove temporary files on early exit.

The test io/ftwtest-sh creates a directory that at some points during
the test does not have execute permission. To avoid leaving behind
such a directory that prevents the build directory from being removed
with a simple "rm -rf", it traps various signals to make the directory
executable and remove it before exit. However, this doesn't cover the
case where one of the tests simply fails (which happens with cross
testing if testing on a remote system where the path to the build
directory involves a symlink, or if that remote system fell over
during testing - I think the latter is the case where the directory is
left behind with bad permissions).

This patch makes that test also trap signal 0 (exit) so that the
directory gets properly removed in such failure cases as well.

Tested in both configurations where the test passes and where it fails
to verify that the result of the test is unchanged but the directory
is no longer left behind where it was previously left behind.

 * io/ftwtest-sh: Also trap on exit to remove temporary files.

linaro/2.23/master 2017-04-20 17:59:22 UTC 2017-04-20
posix: Add cleanup on the trap list for globtest.sh

Author: Adhemerval Zanella
Author Date: 2017-04-11 18:08:02 UTC

posix: Add cleanup on the trap list for globtest.sh

This patch prevents lingering files for SIGSEGV failures by adding
a cleanup handler to the trap handler. Checked on x86_64-linux-gnu.

 * posix/globtest.sh: Add cleanup routine on trap 0.

Cherry-pick of 4fee33f.

hjl/pr21258/2.23 2017-04-20 14:55:44 UTC 2017-04-20
x86-64: Improve branch prediction in _dl_runtime_resolve_avx512_opt [BZ #21258]

Author: H.J. Lu
Author Date: 2017-03-21 17:59:31 UTC

x86-64: Improve branch prediction in _dl_runtime_resolve_avx512_opt [BZ #21258]

On Skylake server, _dl_runtime_resolve_avx512_opt is used to preserve
the first 8 vector registers. The code layout is

  if only %xmm0 - %xmm7 registers are used
     preserve %xmm0 - %xmm7 registers
  if only %ymm0 - %ymm7 registers are used
     preserve %ymm0 - %ymm7 registers
  preserve %zmm0 - %zmm7 registers

Branch prediction always executes the fallthrough code path to preserve
%zmm0 - %zmm7 registers speculatively, even though only %xmm0 - %xmm7
registers are used. This leads to lower CPU frequency on Skylake
server. This patch changes the fallthrough code path to preserve
%xmm0 - %xmm7 registers instead:

  if whole %zmm0 - %zmm7 registers are used
     preserve %zmm0 - %zmm7 registers
  if only %ymm0 - %ymm7 registers are used
     preserve %ymm0 - %ymm7 registers
  preserve %xmm0 - %xmm7 registers

Tested on Skylake server.

 [BZ #21258]
 * sysdeps/x86_64/dl-trampoline.S (_dl_runtime_resolve_opt):
 Define only if _dl_runtime_resolve is defined to
 _dl_runtime_resolve_sse_vex.
 * sysdeps/x86_64/dl-trampoline.h (_dl_runtime_resolve_opt):
 Fallthrough to _dl_runtime_resolve_sse_vex.

(cherry picked from commit c15f8eb50cea7ad1a4ccece6e0982bf426d52c00)

fw/accept4 2017-04-14 08:30:01 UTC 2017-04-14
Assume that accept4 is always available and works

Author: Florian Weimer
Author Date: 2017-04-14 08:30:01 UTC

Assume that accept4 is always available and works

Simplify the Linux accept4 implementation based on the assumption
that it is available in some way. __ASSUME_ACCEPT4_SOCKETCALL was
previously unused, so remove it. Its functionality is implied
by the complex #if condition in accept4.c.

hjl/hwcap/master 2017-04-07 15:09:47 UTC 2017-04-07
x86: Set dl_hwcap from CPU features

Author: H.J. Lu
Author Date: 2017-04-05 22:24:30 UTC

x86: Set dl_hwcap from CPU features

On x86, the usage of AT_HWCAP in glibc is obsolete since the addition of
dl_x86_cpu_features. dl_hwcap, which was set from AT_HWCAP, is used by
the dynamic linker to build an array of hardware capability names, which
are added to the search path when loading a shared object. dl_hwcap was
unused on x86-64 and only SSE2 was used on i386.

This patch sets dl_hwcap with new hardware capabilities from CPU
features. Currently, 2 capabilities, SSE2 and AVX2, are supported.
The maximum number of hardware capabilities is 64. Since x86-64
includes SSE2, SSE2 is skipped on x86-64. dl_x86_cap_flags is kept
for i386 and is used by _dl_show_auxv. dl_x86_hwcap_flags is added
for new hardware capabilities.

 * sysdeps/i386/dl-hwcap.h: New file.
 * sysdeps/x86/dl-hwcap.h: Likewise.
 * sysdeps/x86_64/dl-hwcap.h: Likewise.
 * sysdeps/x86_64/dl-procinfo.h: Likewise.
 * sysdeps/i386/dl-procinfo.c (_dl_x86_hwcap_flags): New.
 * sysdeps/i386/dl-procinfo.h: Include <dl-hwcap.h>.
 (_DL_HWCAP_COUNT): Removed.
 (HWCAP_I386_XXX): Likewise.
 (HWCAP_IMPORTANT): Likewise.
 (_dl_procinfo): Likewise.
 (_dl_hwcap_string): Likewise.
 (_dl_string_hwcap): Likewise.
 * sysdeps/unix/sysv/linux/i386/dl-procinfo.h (_dl_procinfo):
 Replace _DL_HWCAP_COUNT with 32.
 * sysdeps/unix/sysv/linux/x86_64/dl-procinfo.h [!IS_IN (ldconfig)]:
 Include <sysdeps/x86_64/dl-procinfo.h>.
 * sysdeps/x86/cpu-features.c: Include <dl-hwcap.h>.
 (init_cpu_features): Set dl_hwcap and dl_hwcap_mask.
 * sysdeps/x86_64/dl-procinfo.c (_dl_x86_hwcap_flags): New.

hjl/pr21265/xsavec 2017-03-28 20:46:42 UTC 2017-03-28
X86-64: Use fxsave/xsave/xsavec in _dl_runtime_resolve

Author: H.J. Lu
Author Date: 2017-03-23 15:21:52 UTC

X86-64: Use fxsave/xsave/xsavec in _dl_runtime_resolve

In _dl_runtime_resolve, use fxsave/xsave/xsavec to preserve all vector,
mask and bound registers. It simplifies _dl_runtime_resolve and supports
different calling conventions. However, using xsave can be 10X slower
than saving and restoring vector and bound registers individually.

gentoo/2.25 2017-03-20 14:57:14 UTC 2017-03-20
posix_spawn: fix stack setup on ia64 [BZ #21275]

Author: Mike Frysinger
Author Date: 2017-03-20 08:47:56 UTC

posix_spawn: fix stack setup on ia64 [BZ #21275]

The ia64-specific clone2 call expects the base of the stack mapping and
the stack size as separate arguments, not an initial stack value as on other
stack-grows-down architectures. Reuse the stack-grows-up macro so we
pass in the right stack base.

Reported-by: Matt Turner <mattst88@gentoo.org>
(cherry picked from commit ddc3fb333469c2997798742dc0509dc1e3201d91)
(cherry picked from commit 27ab0d9518746dfb59ed2ba59daefc981dc10e38)

gentoo/2.24 2017-03-20 14:56:52 UTC 2017-03-20
posix_spawn: fix stack setup on ia64 [BZ #21275]

Author: Mike Frysinger
Author Date: 2017-03-20 08:47:56 UTC

posix_spawn: fix stack setup on ia64 [BZ #21275]

The ia64-specific clone2 call expects the base of the stack mapping and
the stack size as separate arguments, not an initial stack value as on other
stack-grows-down architectures. Reuse the stack-grows-up macro so we
pass in the right stack base.

Reported-by: Matt Turner <mattst88@gentoo.org>
(cherry picked from commit ddc3fb333469c2997798742dc0509dc1e3201d91)
(cherry picked from commit 7043946c7921c0e3850dd2b3d948336624bb0f62)

fw/out_buffer 2017-03-10 19:45:25 UTC 2017-03-10
WIP struct out_buffer

Author: Florian Weimer
Author Date: 2017-03-10 19:45:25 UTC

WIP struct out_buffer

fw/bug16145 2017-03-09 15:34:11 UTC 2017-03-09
WIP reorganization to improve scalability of localtime

Author: Florian Weimer
Author Date: 2017-03-09 15:33:57 UTC

WIP reorganization to improve scalability of localtime

aaribaud/y2038-pre-5th-draft 2017-02-22 07:06:33 UTC 2017-02-22
Add 64-bit version of ctime_r

Author: Albert ARIBAUD (3ADEV)
Author Date: 2017-02-22 07:06:33 UTC

Add 64-bit version of ctime_r

aaribaud/y2038 2017-02-22 07:06:33 UTC 2017-02-22
Add 64-bit version of ctime_r

Author: Albert ARIBAUD (3ADEV)
Author Date: 2017-02-22 07:06:33 UTC

Add 64-bit version of ctime_r

release/2.19/master 2017-02-20 21:04:52 UTC 2017-02-20
Fix powerpc software sqrt (bug 17964).

Author: Joseph Myers
Author Date: 2015-02-12 23:05:37 UTC

Fix powerpc software sqrt (bug 17964).

As Adhemerval noted in
<https://sourceware.org/ml/libc-alpha/2015-01/msg00451.html>, the
powerpc sqrt implementation for when _ARCH_PPCSQ is not defined is
inaccurate in some cases.

The problem is that this code relies on fused multiply-add, and relies
on the compiler contracting a * b + c to get a fused operation. But
sysdeps/ieee754/dbl-64/Makefile disables contraction for e_sqrt.c,
because the implementation in that directory relies on *not* having
contracted operations.

While it would be possible to arrange makefiles so that an earlier
sysdeps directory can disable the setting in
sysdeps/ieee754/dbl-64/Makefile, it seems a lot cleaner to make the
dependence on fused operations explicit in the .c file. GCC 4.6
introduced support for __builtin_fma on powerpc and other
architectures with such instructions, so we can rely on that; this
patch duly makes the code use __builtin_fma for all such fused
operations.

Tested for powerpc32 (hard float).

2015-02-12 Joseph Myers <joseph@codesourcery.com>

 [BZ #17964]
 * sysdeps/powerpc/fpu/e_sqrt.c (__slow_ieee754_sqrt): Use
 __builtin_fma instead of relying on contraction of a * b + c.

(cherry picked from commit e8bd5286c68bc35be3b41e94c15c4387dcb3bec9)

fw/math-split-tests 2017-02-17 07:22:29 UTC 2017-02-17
RFC: Run libm tests separately for each function

Author: Joseph Myers
Author Date: 2017-02-16 23:08:20 UTC

RFC: Run libm tests separately for each function

At present, libm tests for each function get built into a single
executable (for each floating point type, for each of normal / inline
/ finite-math-only functions, plus vector variants) and run together,
resulting in a single PASS or FAIL (for each of those nine variants
plus vector variants). Building this executable involves reading
about 40 MB of libm-test-*.c sources, which would grow to maybe 70 or
80 MB when the complex inverse trig and hyperbolic functions move to
the auto-libm-test-* mechanism (that move is practical now that tests
for all functions don't need regenerating for any change to
auto-libm-test-in but you can instead regenerate each
auto-libm-test-out-* file independently; auto-libm-test-out-casin and
auto-libm-test-out-casinh take about 38 minutes each to generate on my
system after such a move, auto-libm-test-out-cacos and
auto-libm-test-out-cacosh take about 80 minutes each).

This patch arranges for tests of each function to be run separately
from the makefiles instead. There are 121 functions being tested for
each (type, variant) pair (actually 126, but run as 121 from the
Makefile because each of the pairs (exp10, pow10), (isfinite, finite),
(lgamma, gamma), (remainder, drem), (scalbn, ldexp), shares a table of
test results and so is run together), so 1089 separate tests run from
the Makefile, plus 48 vector tests on x86_64 (six functions for eight
vector variants). Each test only involves a libm-test-<func>.c file
of no more than about 4 MB, rather than all such files taking about 40
MB. With tests run separately, test summaries will indicate which
functions actually have problems (of course, those problems may just
be out-of-date libm-test-ulps files if the file hasn't been updated
for the architecture in question recently).
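
The counts quoted above follow from simple multiplication. A minimal
sketch, with illustrative names that are not part of the patch; it
assumes the nine variants are three floating-point types times the
normal, inline and finite-math-only builds.

```c
/* Number of separately run tests: one per (function group, type,
   variant) combination.  */
static unsigned int
libm_test_count (unsigned int groups, unsigned int types,
                 unsigned int variants)
{
  return groups * types * variants;
}

/* Number of vector tests: one per (function, vector variant) pair.  */
static unsigned int
libm_vector_test_count (unsigned int funcs, unsigned int vector_variants)
{
  return funcs * vector_variants;
}
```

Here 126 functions minus the 5 that share a table with their pair give
121 groups, so libm_test_count (121, 3, 3) is 1089 and
libm_vector_test_count (6, 8) is 48.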

All the .c files for the 1089+48 tests are generated automatically
from the Makefiles. Various checked-in boilerplate .c files are
removed as no longer needed. CFLAGS definitions for the different
kinds of tests are generated using makefile iterators to apply
target-specific variable settings. libm-have-vector-test.h is no
longer needed; the list of functions to test for each vector type is
now in the sysdeps Makefile.

This should reduce the amount of boilerplate needed for float128
testing support; test-float128.h will still be needed, but not various
.c files or Makefile CFLAGS definitions. The logic for creating
dependencies on libm-test-support-*.o files should also render
<https://sourceware.org/ml/libc-alpha/2017-02/msg00279.html>
unnecessary.

Any comments? Especially regarding the use of iterators; there is
existing precedent (in elf/Makefile) for using o-iterator.mk as a
generic iterator with object-suffixes-left set to something other than
a list of object suffixes, but maybe there should be a differently
named iterator for such generic uses?

2017-02-16 Joseph Myers <joseph@codesourcery.com>

 * math/Makefile (libm-tests-generated): Remove variable.
 (libm-tests-base-normal): New variable.
 (libm-tests-base-finite): Likewise.
 (libm-tests-base-inline): Likewise.
 (libm-tests-base): Likewise.
 (libm-tests-normal): Likewise.
 (libm-tests-finite): Likewise.
 (libm-tests-inline): Likewise.
 (libm-tests-vector): Likewise.
 (libm-tests): Define in terms of these new variables.
 (libm-tests-for-type): New variable.
 (libm-tests.o): Move definition.
 (tests): Move addition of $(libm-tests).
 (generated): Update for new and removed libm test files.
 ($(objpfx)libm-test.c): Remove target.
 ($(objpfx)libm-have-vector-test.h): Likewise.
 (CFLAGS-test-double-vlen2.c): Remove variable.
 (CFLAGS-test-double-vlen4.c): Likewise.
 (CFLAGS-test-double-vlen8.c): Likewise.
 (CFLAGS-test-float-vlen4.c): Likewise.
 (CFLAGS-test-float-vlen8.c): Likewise.
 (CFLAGS-test-float-vlen16.c): Likewise.
 (CFLAGS-test-float.c): Likewise.
 (CFLAGS-test-float-finite.c): Likewise.
 (CFLAGS-libm-test-support-float.c): Likewise.
 (CFLAGS-test-double.c): Likewise.
 (CFLAGS-test-double-finite.c): Likewise.
 (CFLAGS-libm-test-support-double.c): Likewise.
 (CFLAGS-test-ldouble.c): Likewise.
 (CFLAGS-test-ldouble-finite.c): Likewise.
 (CFLAGS-libm-test-support-ldouble.c): Likewise.
 (libm-test-inline-cflags): New variable.
 (CFLAGS-test-ifloat.c): Remove variable.
 (CFLAGS-test-idouble.c): Likewise.
 (CFLAGS-test-ildouble.c): Likewise.
 ($(addprefix $(objpfx), $(libm-tests.o))): Move target and update
 dependencies.
 ($(foreach t,$(libm-tests-normal),$(objpfx)$(t).c)): New rule.
 ($(foreach t,$(libm-tests-finite),$(objpfx)$(t).c)): Likewise.
 ($(foreach t,$(libm-tests-inline),$(objpfx)$(t).c)): Likewise.
 ($(foreach t,$(libm-tests-vector),$(objpfx)$(t).c)): Likewise.
 ($(foreach t,$(types),$(objpfx)libm-test-support-$(t).c)):
 Likewise.
 (dependencies on libm-test-support-*.o): Remove.
 ($(foreach f,$(libm-test-funcs-all),$(objpfx)$(o)-$(f).o)): New
 rules using iterators.
 ($(addprefix $(objpfx),$(call libm-tests-for-type,$(o)))):
 Likewise.
 ($(objpfx)libm-test-support-$(o).o): Likewise.
 ($(addprefix $(objpfx),$(filter-out $(tests-static)
 $(libm-vec-tests),$(tests)))): Filter out $(libm-tests-vector)
 instead.
 ($(addprefix $(objpfx), $(libm-vec-tests))): Use iterator to
 define rule instead.
 * math/README.libm-test: Update.
 * math/libm-test-acos.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-acosh.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-asin.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-asinh.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-atan.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-atan2.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-atanh.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-cabs.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-cacos.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-cacosh.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-canonicalize.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-carg.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-casin.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-casinh.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-catan.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-catanh.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-cbrt.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-ccos.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-ccosh.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-ceil.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-cexp.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-cimag.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-clog.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-clog10.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-conj.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-copysign.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-cos.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-cosh.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-cpow.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-cproj.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-creal.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-csin.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-csinh.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-csqrt.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-ctan.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-ctanh.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-erf.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-erfc.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-exp.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-exp10.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-exp2.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-expm1.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-fabs.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-fdim.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-floor.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-fma.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-fmax.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-fmaxmag.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-fmin.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-fminmag.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-fmod.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-fpclassify.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-frexp.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-fromfp.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-fromfpx.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-getpayload.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-hypot.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-ilogb.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-iscanonical.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-iseqsig.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-isfinite.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-isgreater.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-isgreaterequal.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-isinf.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-isless.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-islessequal.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-islessgreater.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-isnan.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-isnormal.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-issignaling.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-issubnormal.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-isunordered.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-iszero.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-j0.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-j1.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-jn.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-lgamma.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-llogb.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-llrint.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-llround.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-log.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-log10.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-log1p.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-log2.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-logb.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-lrint.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-lround.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-modf.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-nearbyint.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-nextafter.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-nextdown.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-nexttoward.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-nextup.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-pow.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-remainder.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-remquo.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-rint.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-round.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-roundeven.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-scalb.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-scalbln.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-scalbn.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-setpayload.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-setpayloadsig.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-signbit.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-significand.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-sin.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-sincos.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-sinh.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-sqrt.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-tan.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-tanh.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-tgamma.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-totalorder.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-totalordermag.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-trunc.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-ufromfp.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-ufromfpx.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-y0.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-y1.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-yn.inc: Include libm-test-driver.c.
 (do_test): New function.
 * math/libm-test-driver.c: Do not include libm-have-vector-test.h.
 (HAVE_VECTOR): Remove macro.
 (START): Do not call HAVE_VECTOR.
 * math/test-double-vlen2.h (FUNC_TEST): Remove macro.
 * math/test-double-vlen4.h (FUNC_TEST): Remove macro.
 * math/test-double-vlen8.h (FUNC_TEST): Remove macro.
 * math/test-float-vlen16.h (FUNC_TEST): Remove macro.
 * math/test-float-vlen4.h (FUNC_TEST): Remove macro.
 * math/test-float-vlen8.h (FUNC_TEST): Remove macro.
 * math/test-math-vector.h (FUNC_TEST): New macro.
 (WRAPPER_DECL): Rename to WRAPPER_DECL_f.
 * sysdeps/x86_64/fpu/Makefile (double-vlen2-funcs): New variable.
 (double-vlen4-funcs): Likewise.
 (double-vlen4-avx2-funcs): Likewise.
 (double-vlen8-funcs): Likewise.
 (float-vlen4-funcs): Likewise.
 (float-vlen8-funcs): Likewise.
 (float-vlen8-avx2-funcs): Likewise.
 (float-vlen16-funcs): Likewise.
 (CFLAGS-test-double-vlen4-avx2.c): Remove variable.
 (CFLAGS-test-float-vlen8-avx2.c): Likewise.
 * sysdeps/x86_64/fpu/test-double-vlen4.h (TEST_VECTOR_cos): Remove
 macro.
 (TEST_VECTOR_sin): Likewise.
 (TEST_VECTOR_sincos): Likewise.
 (TEST_VECTOR_log): Likewise.
 (TEST_VECTOR_exp): Likewise.
 (TEST_VECTOR_pow): Likewise.
 * sysdeps/x86_64/fpu/test-double-vlen8.h (TEST_VECTOR_cos):
 Likewise.
 (TEST_VECTOR_sin): Likewise.
 (TEST_VECTOR_sincos): Likewise.
 (TEST_VECTOR_log): Likewise.
 (TEST_VECTOR_exp): Likewise.
 (TEST_VECTOR_pow): Likewise.
 * sysdeps/x86_64/fpu/test-float-vlen16.h (TEST_VECTOR_cosf):
 Likewise.
 (TEST_VECTOR_sinf): Likewise.
 (TEST_VECTOR_sincosf): Likewise.
 (TEST_VECTOR_logf): Likewise.
 (TEST_VECTOR_expf): Likewise.
 (TEST_VECTOR_powf): Likewise.
 * sysdeps/x86_64/fpu/test-float-vlen8.h (TEST_VECTOR_cosf):
 Likewise.
 (TEST_VECTOR_sinf): Likewise.
 (TEST_VECTOR_sincosf): Likewise.
 (TEST_VECTOR_logf): Likewise.
 (TEST_VECTOR_expf): Likewise.
 (TEST_VECTOR_powf): Likewise.
 * math/gen-libm-have-vector-test.sh: Remove file.
 * math/libm-test.inc: Likewise.
 * math/libm-test-support-double.c: Likewise.
 * math/libm-test-support-float.c: Likewise.
 * math/libm-test-support-ldouble.c: Likewise.
 * math/test-double-finite.c: Likewise.
 * math/test-double.c: Likewise.
 * math/test-float-finite.c: Likewise.
 * math/test-float.c: Likewise.
 * math/test-idouble.c: Likewise.
 * math/test-ifloat.c: Likewise.
 * math/test-ildouble.c: Likewise.
 * math/test-ldouble-finite.c: Likewise.
 * math/test-ldouble.c: Likewise.
 * sysdeps/x86_64/fpu/test-double-vlen2.c: Likewise.
 * sysdeps/x86_64/fpu/test-double-vlen2.h: Likewise.
 * sysdeps/x86_64/fpu/test-double-vlen4-avx2.c: Likewise.
 * sysdeps/x86_64/fpu/test-double-vlen4.c: Likewise.
 * sysdeps/x86_64/fpu/test-double-vlen8.c: Likewise.
 * sysdeps/x86_64/fpu/test-float-vlen16.c: Likewise.
 * sysdeps/x86_64/fpu/test-float-vlen4.c: Likewise.
 * sysdeps/x86_64/fpu/test-float-vlen4.h: Likewise.
 * sysdeps/x86_64/fpu/test-float-vlen8-avx2.c: Likewise.
 * sysdeps/x86_64/fpu/test-float-vlen8.c: Likewise.

azanella/bz12683 2017-02-08 20:27:57 UTC 2017-02-08
nptl: Consolidate pthread_{timed,try}join{_np}

Author: Adhemerval Zanella
Author Date: 2017-01-25 19:08:51 UTC

nptl: Consolidate pthread_{timed,try}join{_np}

This patch consolidates pthread_join and the GNU extensions to
simplify the implementation and avoid code duplication. Both pthread_join
and pthread_tryjoin are now based on pthread_timedjoin_np.

It also fixes some inconsistencies in ESRCH, EINVAL, and EDEADLK handling
(where the implementations differ from each other) and in the
cleanup handler (which now always uses a CAS). It also replaces the
atomic operations with the C11 ones.
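
A cleanup handler that uses a C11 compare-and-swap might look like the
sketch below. It is an illustration under stated assumptions, not the
nptl code: struct fake_thread and cancel_join_registration are
hypothetical, and the real thread descriptor has many more fields.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a thread descriptor; it records which
   thread, if any, is currently joining this one.  */
struct fake_thread
{
  _Atomic (struct fake_thread *) joinid;
};

/* Undo a pending join registration, but only if TARGET still names SELF
   as its joiner.  The compare-and-swap leaves the field untouched when
   another thread has changed it in the meantime.  */
static bool
cancel_join_registration (struct fake_thread *target,
                          struct fake_thread *self)
{
  struct fake_thread *expected = self;
  return atomic_compare_exchange_strong (&target->joinid, &expected,
                                         (struct fake_thread *) NULL);
}
```

A plain store would clear the field unconditionally; the CAS only clears
it when the caller is still the registered joiner.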

Checked on i686-linux-gnu, x86_64-linux-gnu, x86_64-linux-gnux32,
aarch64-linux-gnu, arm-linux-gnueabihf, and powerpc64le-linux-gnu.

 * nptl/pthreadP.h (__pthread_timedjoin_np): Define.
 * nptl/pthread_join.c (pthread_join): Use __pthread_timedjoin_np.
 * nptl/pthread_tryjoin.c (pthread_tryjoin): Likewise.
 * nptl/pthread_timedjoin.c (cleanup): Use CAS on argument setting.
 (pthread_timedjoin_np): Define internal symbol and common code from
 pthread_join.
 * sysdeps/unix/sysv/linux/i386/lowlevellock.h (__lll_timedwait_tid):
 Remove superfluous checks.
 * sysdeps/unix/sysv/linux/x86_64/lowlevellock.h (__lll_timedwait_tid):
 Likewise.

fw/bug21041 2017-01-24 17:32:30 UTC 2017-01-24
WIP delayed IFUNC relocation

Author: Florian Weimer
Author Date: 2017-01-24 17:32:30 UTC

WIP delayed IFUNC relocation

azanella/c11-threads 2017-01-24 17:21:18 UTC 2017-01-24
Add manual documentation for threads.h

Author: Juan Manuel Torres Palma
Author Date: 2016-12-06 22:47:02 UTC

Add manual documentation for threads.h

This patch updates the manual and adds a new chapter
explaining the types, macros, constants and functions defined by the ISO C11
threads.h standard.

 * manual/Makefile (chapters): Add isothreads.texi.
 * manual/isothreads.texi: New file. Add new chapter for ISO C11
 threads documentation.

google/grte/v4-2.19/master 2017-01-19 22:01:46 UTC 2017-01-19
Fix nan functions handling of payload strings (BZ16962, CVE-2014-9761)

Author: Joseph Myers
Author Date: 2017-01-19 22:01:46 UTC

Fix nan functions handling of payload strings (BZ16962, CVE-2014-9761)

gentoo/2.23 2016-12-08 05:38:41 UTC 2016-12-08
alpha: fix trunc for big input values

Author: Aurelien Jarno
Author Date: 2016-08-02 07:18:59 UTC

alpha: fix trunc for big input values

The alpha-specific versions of trunc and truncf always add and subtract
0x1.0p23 or 0x1.0p52, even for big values. This causes errors of this
kind in the testsuite:

  Failure: Test: trunc_towardzero (0x1p107)
  Result:
   is: 1.6225927682921334e+32 0x1.fffffffffffffp+106
   should be: 1.6225927682921336e+32 0x1.0000000000000p+107
   difference: 1.8014398509481984e+16 0x1.0000000000000p+54
   ulp : 0.5000
   max.ulp : 0.0000

Change this by returning the input value when its absolute value is
greater than 0x1.0p23 or 0x1.0p52. NaNs have to go through the add and
subtract operations so that they can be quieted.
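
The guard can be sketched in portable C for the double case. This is not
the alpha implementation: trunc_sketch is a hypothetical name, and the
sketch assumes the default round-to-nearest mode, correcting the result
afterwards instead of using a round-toward-zero instruction.

```c
static double
trunc_sketch (double x)
{
  const double two52 = 0x1p52;
  double ax = x < 0 ? -x : x;

  /* Every double with magnitude >= 2^52 is already an integer, so it is
     returned unchanged.  NaN fails this comparison and falls through,
     so it still passes through the arithmetic below.  */
  if (ax >= two52)
    return x;

  /* Adding and subtracting 2^52 rounds AX to an integer; volatile keeps
     the compiler from folding the two operations away.  */
  volatile double shifted = ax + two52;
  double r = shifted - two52;
  if (r > ax)
    r -= 1.0;                   /* Rounded up: step back toward zero.  */

  return x < 0 ? -r : r;
}
```

With the early return in place, an input such as 0x1p107 comes back
unchanged instead of being perturbed by the add and subtract.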

Finally, remove the code to handle the inexact exception; trunc should
never generate such an exception.

Changelog:
 * sysdeps/alpha/fpu/s_trunc.c (__trunc): Return the input value
 when its absolute value is greater than 0x1.0p52.
 [_IEEE_FP_INEXACT] Remove.
 * sysdeps/alpha/fpu/s_truncf.c (__truncf): Return the input value
 when its absolute value is greater than 0x1.0p23.
 [_IEEE_FP_INEXACT] Remove.

(cherry picked from commit b74d259fe793499134eb743222cd8dd7c74a31ce)
(cherry picked from commit 3a5aa2ee4ffc515c8e7e615ea38d6b3b20ed0a30)

dj/malloc 2016-11-10 21:08:28 UTC 2016-11-10
Updates to trace2wl

Author: DJ Delorie
Author Date: 2016-11-10 21:08:28 UTC

Updates to trace2wl

* command line option -p to show progress
* command line option -f to use file-based buffers
* reduced memory footprint
* more 32/64-bit fixes

ibm/2.22/master 2016-10-14 20:01:21 UTC 2016-10-14
Merge branch release/2.22/master into ibm/2.22/master

Author: Tulio Magno Quites Machado Filho
Author Date: 2016-10-14 20:01:21 UTC

Merge branch release/2.22/master into ibm/2.22/master

release/2.22/master 2016-10-14 19:57:32 UTC 2016-10-14
powerpc: Sync hwcap.h with kernel

Author: Carlos Eduardo Seo
Author Date: 2016-10-14 19:57:32 UTC

powerpc: Sync hwcap.h with kernel

Linux commit b4b56f9ecab40f3b4ef53e130c9f6663be491894 introduced
a new HWCAP2 bit to indicate that the kernel now aborts a memory
transaction when a syscall is made. This patch adds that bit to
sysdeps/powerpc/bits/hwcap.h.
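
A program could test for the new bit roughly as follows. This is a
sketch: the macro below uses a different name on purpose, and its value
is an assumption that should be checked against
sysdeps/powerpc/bits/hwcap.h before use.

```c
/* Assumed value of PPC_FEATURE2_HTM_NOSC; verify against the header.  */
#define SKETCH_PPC_FEATURE2_HTM_NOSC 0x01000000UL

/* Return nonzero if the HWCAP2 word says the kernel aborts a memory
   transaction when a syscall is made.  */
static int
htm_aborts_on_syscall (unsigned long hwcap2)
{
  return (hwcap2 & SKETCH_PPC_FEATURE2_HTM_NOSC) != 0;
}
```

On Linux the argument would typically come from getauxval (AT_HWCAP2),
declared in <sys/auxv.h>.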

 * sysdeps/powerpc/bits/hwcap.h: Add PPC_FEATURE2_HTM_NOSC.
 * sysdeps/powerpc/dl-procinfo.c:
 (_dl_powerpc_cap_flags): Added descriptor for this hwcap
 feature so it shows when LD_SHOW_AUXV=1.

(cherry picked from commit 3c13f28c8eac1e5a883d1b3801314430a094fc99)

azanella/aarch64-split-stack 2016-07-28 20:11:06 UTC 2016-07-28
aarch64: Add split-stack TCB field

Author: Adhemerval Zanella
Author Date: 2016-04-25 19:15:49 UTC

aarch64: Add split-stack TCB field

This patch adds a new TCB field meant to be used by the GCC split-stack
option. A new symbol, __tcb_private_ss, is also added to version
control the private TCB field.

Checked on aarch64.

 * sysdeps/aarch64/Makefile [$(subdir) == elf] (sysdep-dl-routines):
 Add tcb-version.
 * sysdeps/aarch64/Version [ld] (GLIBC_2.25): Define
 __tcb_private_ss.
 * sysdeps/aarch64/nptl/tls.h (tcbhead_t): Add __private_ss field.
 * sysdeps/unix/sysv/linux/aarch64/ld.abilist: Add GLIBC_2.25 and
 __tcb_private_ss.
 * sysdeps/aarch64/tcb-version.c: New file.

hjl/libmvec/master 2016-07-19 20:20:10 UTC 2016-07-19
Don't compile do_test with -mavx/-mavx/-mavx512

Author: H.J. Lu
Author Date: 2016-07-19 20:01:36 UTC

Don't compile do_test with -mavx/-mavx/-mavx512

Don't compile do_test with -mavx, -mavx2 or -mavx512 since it won't run
on non-AVX machines.
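
The general pattern is a driver built without the vector flags that
checks the CPU at run time before calling into separately compiled
vector code. The sketch below is an illustration only: all names are
hypothetical, the exit status 77 for "unsupported" is an assumption, and
the feature check relies on the GCC/Clang x86 builtins.

```c
#define SKETCH_EXIT_UNSUPPORTED 77  /* Assumed "test skipped" status.  */

/* Run-time check; compiled without any -mavx* flag so that it can
   execute on every x86 machine.  */
static int
cpu_has_avx2 (void)
{
#if (defined (__x86_64__) || defined (__i386__)) && defined (__GNUC__)
  __builtin_cpu_init ();
  return __builtin_cpu_supports ("avx2") != 0;
#else
  return 0;
#endif
}

/* Stand-in for a test body that would live in its own object file built
   with the vector flags.  */
static int
sketch_body (void)
{
  return 0;
}

/* Call TEST_BODY only when the required feature is present.  */
static int
run_guarded (int feature_present, int (*test_body) (void))
{
  if (!feature_present)
    return SKETCH_EXIT_UNSUPPORTED;
  return test_body ();
}
```

A driver would call run_guarded (cpu_has_avx2 (), sketch_body) and use
the result as its exit status.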

 [BZ #20384]
 * sysdeps/x86_64/fpu/Makefile (extra-test-objs): Add
 test-double-libmvec-sincos-avx-main.o,
 test-double-libmvec-sincos-avx2-main.o,
 test-double-libmvec-sincos-main.o,
 test-float-libmvec-sincosf-avx-main.o,
 test-float-libmvec-sincosf-avx2-main.o and
 test-float-libmvec-sincosf-main.o.
 test-float-libmvec-sincosf-avx512-main.o.
 ($(objpfx)test-double-libmvec-sincos): Also link with
 $(objpfx)test-double-libmvec-sincos-main.o.
 ($(objpfx)test-double-libmvec-sincos-avx): Also link with
 $(objpfx)test-double-libmvec-sincos-avx-main.o.
 ($(objpfx)test-double-libmvec-sincos-avx2): Also link with
 $(objpfx)test-double-libmvec-sincos-avx2-main.o.
 ($(objpfx)test-float-libmvec-sincosf): Also link with
 $(objpfx)test-float-libmvec-sincosf-main.o.
 ($(objpfx)test-float-libmvec-sincosf-avx): Also link with
 $(objpfx)test-float-libmvec-sincosf-avx2-main.o.
 [$(config-cflags-avx512) == yes] (extra-test-objs): Add
 test-double-libmvec-sincos-avx512-main.o and
 ($(objpfx)test-double-libmvec-sincos-avx512): Also link with
 $(objpfx)test-double-libmvec-sincos-avx512-main.o.
 ($(objpfx)test-float-libmvec-sincosf-avx512): Also link with
 $(objpfx)test-float-libmvec-sincosf-avx512-main.o.
 (CFLAGS-test-double-libmvec-sincos.c): Removed.
 (CFLAGS-test-float-libmvec-sincosf.c): Likewise.
 (CFLAGS-test-double-libmvec-sincos-main.c): New.
 (CFLAGS-test-double-libmvec-sincos-avx-main.c): Likewise.
 (CFLAGS-test-double-libmvec-sincos-avx2-main.c): Likewise.
 (CFLAGS-test-float-libmvec-sincosf-main.c): Likewise.
 (CFLAGS-test-float-libmvec-sincosf-avx-main.c): Likewise.
 (CFLAGS-test-float-libmvec-sincosf-avx2-main.c): Likewise.
 (CFLAGS-test-float-libmvec-sincosf-avx512-main.c): Likewise.
 (CFLAGS-test-double-libmvec-sincos-avx.c): Set to -DREQUIRE_AVX.
 (CFLAGS-test-float-libmvec-sincosf-avx.c ): Likewise.
 (CFLAGS-test-double-libmvec-sincos-avx2.c): Set to
 -DREQUIRE_AVX2.
 (CFLAGS-test-float-libmvec-sincosf-avx2.c ): Likewise.
 (CFLAGS-test-double-libmvec-sincos-avx512.c): Set to
 -DREQUIRE_AVX512F.
 (CFLAGS-test-float-libmvec-sincosf-avx512.c): Likewise.
 * sysdeps/x86_64/fpu/test-double-libmvec-sincos.c: Rewritten.
 * sysdeps/x86_64/fpu/test-float-libmvec-sincosf.c: Likewise.
 * sysdeps/x86_64/fpu/test-double-libmvec-sincos-avx-main.c: New
 file.
 * sysdeps/x86_64/fpu/test-double-libmvec-sincos-avx2-main.c:
 Likewise.
 * sysdeps/x86_64/fpu/test-double-libmvec-sincos-avx512-main.c:
 Likewise.
 * sysdeps/x86_64/fpu/test-double-libmvec-sincos-main.c:
 Likewise.
 * sysdeps/x86_64/fpu/test-float-libmvec-sincosf-avx-main.c:
 Likewise.
 * sysdeps/x86_64/fpu/test-float-libmvec-sincosf-avx2-main.c:
 Likewise.
 * sysdeps/x86_64/fpu/test-float-libmvec-sincosf-avx512-main.c:
 Likewise.
 * sysdeps/x86_64/fpu/test-float-libmvec-sincosf-main.c:
 Likewise.

ibm/2.19/master 2016-07-11 17:16:01 UTC 2016-07-11
Merge branch 'release/2.19/master' into ibm/2.19/master

Author: Tulio Magno Quites Machado Filho
Author Date: 2016-07-11 17:16:01 UTC

Merge branch 'release/2.19/master' into ibm/2.19/master

Conflicts:
 NEWS

hjl/pr20314 2016-06-30 16:45:25 UTC 2016-06-30
Make copies of cstdlib/cmath and use them

Author: H.J. Lu
Author Date: 2016-06-30 13:56:19 UTC

Make copies of cstdlib/cmath and use them

If C++ headers <cstdlib> or <cmath> are used, GCC 6 will include
/usr/include/stdlib.h or /usr/include/math.h from "#include_next"
(instead of stdlib/stdlib.h or math/math.h in the glibc source
directory), and this turns up as a make dependency. An implicit
rule will kick in and make will try to install stdlib/stdlib.h or
math/math.h as /usr/include/stdlib.h or /usr/include/math.h because
the target is out of date. We make a copy of <cstdlib> and <cmath>
in the glibc build directory so that stdlib/stdlib.h and math/math.h
will be used instead of /usr/include/stdlib.h and /usr/include/math.h.

 [BZ #20314]
 * Makeconfig (CXXFLAGS): Prepend -I$(common-objpfx).
 * Makerules (before-compile): Add $(common-objpfx)cstdlib and
 $(common-objpfx)cmath.
 ($(common-objpfx)cstdlib): New target.
 ($(common-objpfx)cmath): Likewise.

hjl/pr18645 2016-06-30 03:00:39 UTC 2016-06-30
Compile tst-cleanupx4 test with -fexceptions

Author: H.J. Lu
Author Date: 2016-06-01 21:11:16 UTC

Compile tst-cleanupx4 test with -fexceptions

tst-cleanupx4 is linked with tst-cleanupx4.o and tst-cleanup4aux.o.
Since tst-cleanupx4.o is compiled from tst-cleanup4.c with -fexceptions,
tst-cleanup4aux.c should also be compiled with -fexceptions.

Tested on x86-64 and i686.

 [BZ #18645]
 * nptl/Makefile (extra-test-objs): Add tst-cleanupx4aux.o.
 (test-extras): Add tst-cleanupx4aux.
 (CFLAGS-tst-cleanupx4aux.c): New. Set to -fexceptions.
 ($(objpfx)tst-cleanupx4): Replace tst-cleanup4aux.o with
 tst-cleanupx4aux.o.
 * nptl/tst-cleanupx4aux.c: New file.

hjl/pr20139/master 2016-06-29 21:42:06 UTC 2016-06-29
Require binutils 2.24 to build x86-64 glibc

Author: H.J. Lu
Author Date: 2016-06-29 19:37:06 UTC

Require binutils 2.24 to build x86-64 glibc

If the assembler doesn't support AVX512DQ, _dl_runtime_resolve_avx is
used for lazy binding to save the first 8 vector registers, but it saves
only the lower 256 bits of each vector register. When it is called on an
AVX512 platform, the upper 256 bits of the ZMM registers are clobbered,
so parameters passed in ZMM registers will be wrong when the function is
called the first time. This patch requires binutils 2.24, whose assembler
can store and load ZMM registers, to build x86-64 glibc. Since the
mathvec library needs assembler support for AVX512DQ, we disable mathvec
if the assembler doesn't support AVX512DQ.

 [BZ #20139]
 * config.h.in (HAVE_AVX512_ASM_SUPPORT): Renamed to ...
 (HAVE_AVX512DQ_ASM_SUPPORT): This.
 * sysdeps/x86_64/configure.ac: Require assembler from binutils
 2.24 or above.
 (HAVE_AVX512_ASM_SUPPORT): Removed.
 (HAVE_AVX512DQ_ASM_SUPPORT): New.
 * sysdeps/x86_64/configure: Regenerated.
 * sysdeps/x86_64/dl-trampoline.S: Make HAVE_AVX512_ASM_SUPPORT
 check unconditional.
 * sysdeps/x86_64/multiarch/ifunc-impl-list.c: Likewise.
 * sysdeps/x86_64/multiarch/memcpy.S: Likewise.
 * sysdeps/x86_64/multiarch/memcpy_chk.S: Likewise.
 * sysdeps/x86_64/multiarch/memmove-avx512-no-vzeroupper.S:
 Likewise.
 * sysdeps/x86_64/multiarch/memmove-avx512-unaligned-erms.S:
 Likewise.
 * sysdeps/x86_64/multiarch/memmove.S: Likewise.
 * sysdeps/x86_64/multiarch/memmove_chk.S: Likewise.
 * sysdeps/x86_64/multiarch/mempcpy.S: Likewise.
 * sysdeps/x86_64/multiarch/mempcpy_chk.S: Likewise.
 * sysdeps/x86_64/multiarch/memset-avx512-no-vzeroupper.S:
 Likewise.
 * sysdeps/x86_64/multiarch/memset-avx512-unaligned-erms.S:
 Likewise.
 * sysdeps/x86_64/multiarch/memset.S: Likewise.
 * sysdeps/x86_64/multiarch/memset_chk.S: Likewise.
 * sysdeps/x86_64/fpu/multiarch/svml_d_cos8_core_avx512.S: Check
 HAVE_AVX512DQ_ASM_SUPPORT instead of HAVE_AVX512_ASM_SUPPORT.
 * sysdeps/x86_64/fpu/multiarch/svml_d_exp8_core_avx512.S:
 Likewise.
 * sysdeps/x86_64/fpu/multiarch/svml_d_log8_core_avx512.S:
 Likewise.
 * sysdeps/x86_64/fpu/multiarch/svml_d_pow8_core_avx512.S:
 Likewise.
 * sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core_avx512.S:
 Likewise.
 * sysdeps/x86_64/fpu/multiarch/svml_d_sincos8_core_avx512.:
 Likewise.
 * sysdeps/x86_64/fpu/multiarch/svml_s_cosf16_core_avx512.S:
 Likewise.
 * sysdeps/x86_64/fpu/multiarch/svml_s_expf16_core_avx512.S:
 Likewise.
 * sysdeps/x86_64/fpu/multiarch/svml_s_logf16_core_avx512.S:
 Likewise.
 * sysdeps/x86_64/fpu/multiarch/svml_s_powf16_core_avx512.S:
 Likewise.
 * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf16_core_avx51:
 Likewise.
 * sysdeps/x86_64/fpu/multiarch/svml_s_sinf16_core_avx512.S:
 Likewise.

hjl/pr20309/master 2016-06-29 14:38:22 UTC 2016-06-29
X86-64: Properly align stack in _dl_tlsdesc_dynamic

Author: H.J. Lu
Author Date: 2016-06-29 11:01:58 UTC

X86-64: Properly align stack in _dl_tlsdesc_dynamic

Since _dl_tlsdesc_dynamic is called via PLT, we need to add 8 bytes for
push in the PLT entry to align the stack.

 [BZ #20309]
 * configure.ac (have-mtls-dialect-gnu2): Set to yes if
 -mtls-dialect=gnu2 works.
 * configure: Regenerated.
 * elf/Makefile [have-mtls-dialect-gnu2 = yes]
 (tests): Add tst-gnu2-tls1.
 (modules-names): Add tst-gnu2-tls1mod.
 ($(objpfx)tst-gnu2-tls1): New.
 (tst-gnu2-tls1mod.so-no-z-defs): Likewise.
 (CFLAGS-tst-gnu2-tls1mod.c): Likewise.
 * elf/tst-gnu2-tls1.c: New file.
 * elf/tst-gnu2-tls1mod.c: Likewise.
 * sysdeps/x86_64/dl-tlsdesc.S (_dl_tlsdesc_dynamic): Add 8
 bytes for push in the PLT entry to align the stack.

hjl/benchtests/wall-time 2016-06-21 18:31:45 UTC 2016-06-21
Force to use clock_gettime

Author: H.J. Lu
Author Date: 2016-06-21 18:31:45 UTC

Force to use clock_gettime

hjl/erms/2.23 2016-06-06 20:34:29 UTC 2016-06-06
Count number of logical processors sharing L2 cache

Author: H.J. Lu
Author Date: 2016-05-27 22:16:22 UTC

Count number of logical processors sharing L2 cache

For Intel processors, when there are both L2 and L3 caches, SMT level
type should be used to count number of available logical processors
sharing L2 cache. If there is only L2 cache, core level type should
be used to count number of available logical processors sharing L2
cache. Number of available logical processors sharing L2 cache should
be used for non-inclusive L2 and L3 caches.

 * sysdeps/x86/cacheinfo.c (init_cacheinfo): Count number of
 available logical processors with SMT level type sharing L2
 cache for Intel processors.

hjl/erms/2.22 2016-06-06 20:23:12 UTC 2016-06-06
Count number of logical processors sharing L2 cache

Author: H.J. Lu
Author Date: 2016-05-27 22:16:22 UTC

Count number of logical processors sharing L2 cache

For Intel processors, when there are both L2 and L3 caches, SMT level
type should be used to count number of available logical processors
sharing L2 cache. If there is only L2 cache, core level type should
be used to count number of available logical processors sharing L2
cache. Number of available logical processors sharing L2 cache should
be used for non-inclusive L2 and L3 caches.

 * sysdeps/x86/cacheinfo.c (init_cacheinfo): Count number of
 available logical processors with SMT level type sharing L2
 cache for Intel processors.

hjl/erms/ifunc 2016-05-25 17:10:45 UTC 2016-05-25
X86-64: Add dummy memcopy.h and wordcopy.c

Author: H.J. Lu
Author Date: 2016-04-01 21:01:24 UTC

X86-64: Add dummy memcopy.h and wordcopy.c

Since x86-64 doesn't use memory copy functions, add dummy memcopy.h and
wordcopy.c to reduce code size. It reduces the size of libc.so by about
1 KB.

 * sysdeps/x86_64/memcopy.h: New file.
 * sysdeps/x86_64/wordcopy.c: Likewise.

ibm/2.20/master 2016-05-25 13:04:06 UTC 2016-05-25
Merge release/2.20/master into ibm/2.20/master

Author: Gabriel F. T. Gomes
Author Date: 2016-05-25 13:04:06 UTC

Merge release/2.20/master into ibm/2.20/master

Conflicts:
 NEWS

release/2.20/master 2016-05-24 21:08:55 UTC 2016-05-24
CVE-2016-3075: Stack overflow in _nss_dns_getnetbyname_r [BZ #19879]

Author: Florian Weimer
Author Date: 2016-03-29 10:57:56 UTC

CVE-2016-3075: Stack overflow in _nss_dns_getnetbyname_r [BZ #19879]

The defensive copy is not needed because the name may not alias the
output buffer.

(cherry picked from commit 317b199b4aff8cfa27f2302ab404d2bb5032b9a4)
(cherry picked from commit f5b3338d70a7a2c626331ac4589b6deb2f610432)

hjl/cache/master 2016-05-20 21:57:00 UTC 2016-05-20
Count number of logical processors sharing L2 cache

Author: H.J. Lu
Author Date: 2016-05-13 20:26:37 UTC

Count number of logical processors sharing L2 cache

For Intel processors, when there are both L2 and L3 caches, SMT level
type should be used to count number of available logical processors
sharing L2 cache. If there is only L2 cache, core level type should
be used to count number of available logical processors sharing L2
cache. Number of available logical processors sharing L2 cache should
be used for non-inclusive L2 and L3 caches.

 * sysdeps/x86/cacheinfo.c (init_cacheinfo): Count number of
 available logical processors with SMT level type sharing L2
 cache for Intel processors.

hjl/benchtests/memset 2016-05-20 19:10:57 UTC 2016-05-20
Skip simple and builtin memory implementations

Author: H.J. Lu
Author Date: 2016-04-07 15:27:12 UTC

Skip simple and builtin memory implementations

hjl/ld.so/master 2016-05-14 16:19:01 UTC 2016-05-14
X86: Add cache info to _dl_x86_cpu_features

Author: H.J. Lu
Author Date: 2016-05-10 12:42:49 UTC

X86: Add cache info to _dl_x86_cpu_features

This patch adds cache info to _dl_x86_cpu_features to allow a processor
to override cache info derived from CPUID.

Tested on x86 and x86-64.

 * sysdeps/x86/cacheinfo.c: Skip if not in libc.
 (init_cacheinfo): Use raw_data_size, raw_shared_size and
 shared_non_temporal_threshold from _dl_x86_cpu_features if
 not zero.
 * sysdeps/x86/cpu-features.h (cache_info): New.
 (cpu_features): Add cache.

release/2.21/master 2016-05-09 13:03:50 UTC 2016-05-09
Suppress GCC 6 warning about ambiguous 'else' with -Wparentheses

Author: Yvan Roux
Author Date: 2016-04-15 11:29:26 UTC

Suppress GCC 6 warning about ambiguous 'else' with -Wparentheses

Backport of df1cf48777fe4cd81ad7fb09ecbe5b31432b7c1c.

 * stdlib/setenv.c (unsetenv): Fix ambiguous 'else'.
 * nis/nis_call.c (nis_server_cache_add): Likewise.

hjl/benchtests/master 2016-05-08 16:10:50 UTC 2016-05-08
Clear destination buffer updated by the previous run

Author: H.J. Lu
Author Date: 2016-04-04 16:38:30 UTC

Clear destination buffer updated by the previous run

Clear the destination buffer updated by the previous run in bench-memcpy.c
and test-memcpy.c to catch the error when the following implementations do
not copy anything.
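
A minimal sketch of why the clearing step matters, not the actual
benchtests code. The names memcpy_impl_t, noop_memcpy and check_one_impl
are hypothetical, and the sketch assumes the source data is not all zero
bytes. Without the memset, a later implementation that copies nothing
would still pass because the buffer holds the previous run's output.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical signature shared by the implementations under test.  */
typedef void *(*memcpy_impl_t) (void *, const void *, size_t);

/* A deliberately broken implementation that copies nothing.  */
static void *
noop_memcpy (void *dst, const void *src, size_t len)
{
  (void) src;
  (void) len;
  return dst;
}

/* Return 1 if IMPL copied LEN bytes from SRC to DST, 0 otherwise.
   The sketch assumes SRC is not all zero bytes.  */
static int
check_one_impl (memcpy_impl_t impl, char *dst, const char *src, size_t len)
{
  assert (impl != NULL && dst != NULL && src != NULL);
  /* Erase the bytes left behind by the previous implementation.  */
  memset (dst, 0, len);
  impl (dst, src, len);
  return memcmp (dst, src, len) == 0;
}
```

Calling check_one_impl first with memcpy and then with noop_memcpy on the
same buffers reports success for the first and failure for the second.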

 PR string/19907
 * benchtests/bench-memcpy.c (do_one_test): Clear the destination
 buffer updated by the previous run.
 * string/test-memcpy.c (do_one_test): Likewise.
 * benchtests/bench-memmove.c (do_one_test): Add a comment.
 * string/test-memmove.c (do_one_test): Likewise.

hjl/2.17/memset 2016-05-05 13:29:28 UTC 2016-05-05
Faster memset on x64

Author: Ondrej Bilka
Author Date: 2013-05-20 06:26:00 UTC

Faster memset on x64

This implementation speeds up memset in several ways. The first is
avoiding an expensive computed jump. The second is using the fact that
the arguments of memset are aligned to 8 bytes most of the time.

Benchmark results on:

kam.mff.cuni.cz/~ondra/benchmark_string/memset_profile_result27_04_13.tar.bz2

(cherry picked from commit b2b671b677d92429a3d41bf451668f476aa267ed)

hjl/cacheline/ifunc 2016-04-25 15:32:09 UTC 2016-04-25
X86-64: Add dummy memcopy.h and wordcopy.c

Author: H.J. Lu
Author Date: 2016-04-01 21:01:24 UTC

X86-64: Add dummy memcopy.h and wordcopy.c

Since x86-64 doesn't use memory copy functions, add dummy memcopy.h and
wordcopy.c to reduce code size. It reduces the size of libc.so by about
1 KB.

 * sysdeps/x86_64/memcopy.h: New file.
 * sysdeps/x86_64/wordcopy.c: Likewise.

hjl/cacheline/master 2016-04-25 15:13:42 UTC 2016-04-25
Skip simple and builtin memory implementations

Author: H.J. Lu
Author Date: 2016-04-07 15:27:12 UTC

Skip simple and builtin memory implementations

hjl/erms/nt 2016-04-25 11:57:38 UTC 2016-04-25
Skip simple and builtin memory implementations

Author: H.J. Lu
Author Date: 2016-04-07 15:27:12 UTC

Skip simple and builtin memory implementations

fw/extend_alloca 2016-04-25 04:19:31 UTC 2016-04-25
Remove macros extend_alloca, extend_alloca_account [BZ #18023]

Author: Florian Weimer
Author Date: 2015-03-01 22:22:45 UTC

Remove macros extend_alloca, extend_alloca_account [BZ #18023]

And also the helper macro stackinfo_alloca_round.

extend_alloca simply does not work on x86_64 and current i386 because
of its peculiar stack alignment rules.

Here's an analysis of the _dl_fini situation (before the removal of
extend_alloca).

Dump of assembler code for function _dl_fini:
<+0>: push %rbp
<+1>: mov %rsp,%rbp
<+4>: push %r15
<+6>: push %r14
<+8>: push %r13
<+10>: push %r12
<+12>: push %rbx
<+13>: sub $0x38,%rsp

The function pushes 6 registers on the stack and allocates 0x38 bytes,
which means that %rsp is a multiple of 16 after function prologue.

The initial alloca allocation does not change %rsp alignment:

<+210>: shr $0x4,%rcx
<+214>: shl $0x4,%rcx
<+218>: sub %rcx,%rsp

%r15 is the address of the previous stack allocation, it is used below.

This is the extend_alloca reallocation branch:

<+734>: add $0xf,%rdx
<+738>: and $0xfffffffffffffff0,%rdx
<+742>: lea 0x1e(%rdx),%rcx
<+746>: shr $0x4,%rcx
<+750>: shl $0x4,%rcx
<+754>: sub %rcx,%rsp
<+757>: lea 0xf(%rsp),%rcx
<+762>: and $0xfffffffffffffff0,%rcx
<+766>: lea (%rcx,%rdx,1),%rsi
<+770>: cmp %rsi,%r15
<+773>: je 0x7f963940b673 <_dl_fini+787>
<+775>: mov %rdx,-0x58(%rbp)
<+787>: add %rdx,-0x58(%rbp)

(a) %rdx, the new requested size, is rounded up to a multiple of 16
(+734, +738), and the result is stored in %rdx@738.

(b) %rdx@738 + 31 is rounded down to a multiple of 16, the result is
stored in %rcx@750 (+742, +746, +750). So %rcx@750 == %rdx@738 + 16.

(c) %rcx@750 bytes are allocated on the stack (+754). %rsp is rounded
upwards to a multiple of 16, result is stored in %rcx@762 (+757, +762).
This does not change the value of %rsp because it already was a multiple
of 16.

(d) %rsi@766 == %rcx@762 + %rdx@738 is compared against %r15. But this
comparison is always false because we allocated 16 extra bytes on the
stack in (b), which were reserved for the alignment in (c), but in fact
unused. We are left with a gap in stack usage, and the comparison is
always false.

(@XXX refers to register values after executing the instruction at
offset +XXX.)

If the alignment gap was actually used because of different alignment
for %rsp, then the comparison failure would still occur because the gap
would not have been added after this reallocation, but before the
previous allocation.

As a result, extend_alloca is never able to merge allocations. It also
turns out that the interface is difficult to use, especially in
conjunction with alloca account (which is rarely optional).
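
The arithmetic in steps (a) to (d) can be replayed in C. This is a
sketch under stated assumptions, not code from glibc: the function name
extend_alloca_gap is hypothetical, %rsp is assumed to be a multiple of
16 on entry, and the previous allocation (%r15) is assumed to start at
the old %rsp. The function returns the distance between the end of the
new block and the previous allocation; 0 would mean the two are adjacent
and could be merged.

```c
#include <assert.h>
#include <stdint.h>

/* Replay steps (a)-(d) for a request of SIZE bytes.  RSP is the stack
   pointer before the reallocation and PREV the address of the previous
   allocation.  Returns PREV minus the end of the new block.  */
static uint64_t
extend_alloca_gap (uint64_t rsp, uint64_t prev, uint64_t size)
{
  assert (rsp % 16 == 0);

  /* (a) Round the request up to a multiple of 16 (+734, +738).  */
  uint64_t rounded = (size + 15) & ~UINT64_C (15);

  /* (b) Add 0x1e and round down to a multiple of 16 (+742..+750).
     Because ROUNDED is already aligned, this is ROUNDED + 16.  */
  uint64_t reserved = ((rounded + 0x1e) >> 4) << 4;

  /* (c) Allocate, then round the block address up to 16 (+754..+762).
     The rounding is a no-op because RSP was already aligned.  */
  rsp -= reserved;
  uint64_t block = (rsp + 15) & ~UINT64_C (15);

  /* (d) End of the new block, which is compared against PREV (+766).  */
  uint64_t end = block + rounded;
  return prev - end;
}
```

Under these assumptions the result is 16 for every request size, which
matches the observation that the comparison at +770 never succeeds.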

 [BZ #18023]
 * include/alloca.h (stackinfo_alloca_round, extend_alloca,
 extend_alloca_account): Remove.

release/2.18/master 2016-04-22 19:53:46 UTC 2016-04-22
resolv: Always set *resplen2 out parameter in send_dg [BZ #19791]

Author: Florian Weimer
Author Date: 2016-04-22 15:38:15 UTC

resolv: Always set *resplen2 out parameter in send_dg [BZ #19791]

Since commit 44d20bca52ace85850012b0ead37b360e3ecd96e (Implement
second fallback mode for DNS requests), there is a code path which
returns early, before *resplen2 is initialized. This happens if the
name server address is immediately recognized as invalid (because of
lack of protocol support, or if it is a broadcast address such as
255.255.255.255, or another invalid address).

If this happens and *resplen2 was non-zero (which is the case if a
previous query resulted in a failure), __libc_res_nquery would reuse
an existing second answer buffer. This answer has been previously
identified as unusable (for example, it could be an NXDOMAIN
response). Due to the presence of a second answer, no name server
switching will occur. The result is a name resolution failure,
although a successful resolution would have been possible if name
servers had been switched and queries had proceeded along the search
path.

The above paragraph still simplifies the situation. Before glibc
2.23, if the second answer needed malloc, the stub resolver would
still attempt to reuse the second answer, but this is not possible
because __libc_res_nsearch has freed it, after the unsuccessful call
to __libc_res_nquerydomain, and set the buffer pointer to NULL. This
eventually leads to an assertion failure in __libc_res_nquery:

 /* Make sure both hp and hp2 are defined */
 assert((hp != NULL) && (hp2 != NULL));

If assertions are disabled, the consequence is a NULL pointer
dereference on the next line.

Starting with glibc 2.23, as a result of commit
e9db92d3acfe1822d56d11abcea5bfc4c41cf6ca (CVE-2015-7547: getaddrinfo()
stack-based buffer overflow (Bug 18665)), the second answer is always
allocated with malloc. This means that the assertion failure happens
with small responses as well because there is no buffer to reuse, as
soon as there is a name resolution failure which triggers a search for
an answer along the search path.

This commit addresses the issue by ensuring that *resplen2 is
initialized before the send_dg function returns.

This commit also addresses a bug where an invalid second reply is
incorrectly returned as valid to the caller.
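
The general pattern behind the fix can be sketched as follows. This is
an illustration only and not the real send_dg: the function name
send_query_sketch, its parameters and the 64-byte reply length are all
hypothetical. The point is that the out parameter is written before any
early return, so a caller never sees a stale length from an earlier
query.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a send routine with a second-reply length
   out parameter.  Returns 1 if a reply was received, 0 if the caller
   should move on to the next name server.  */
static int
send_query_sketch (bool address_usable, int *resplen2)
{
  assert (resplen2 != NULL);

  /* Initialize the out parameter on every path, including the early
     return below.  */
  *resplen2 = 0;

  if (!address_usable)
    return 0;

  /* Pretend that a 64-byte second reply arrived.  */
  *resplen2 = 64;
  return 1;
}
```

If the first assignment is removed, a caller that passes in a non-zero
length from a previous failed query would still see that length after
the early return.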

(cherry picked from commit b66d837bb5398795c6b0f651bd5a5d66091d8577)

ibm/2.18/master 2016-04-20 14:18:20 UTC 2016-04-20
NEWS: Add 18665 and 19791 to fixed bug list.

Author: Paul E. Murphy
Author Date: 2016-04-20 14:18:20 UTC

NEWS: Add 18665 and 19791 to fixed bug list.

gentoo/2.22 2016-04-10 00:13:40 UTC 2016-04-10
configure: fix `test ==` usage

Author: Mike Frysinger
Author Date: 2016-04-10 00:02:48 UTC

configure: fix `test ==` usage

POSIX defines the = operator, but not ==. Fix the few places where we
incorrectly used ==.

(cherry picked from commit b2d4456b333970ab4cb01ed8045b9a8d2c4832f3)
(cherry picked from commit e2c17de539da301c96afa4181347c63eb94d99b1)

hjl/erms/master 2016-03-31 16:00:47 UTC 2016-03-31
Add memmove/memset-avx512-unaligned-erms-no-vzeroupper.S

Author: H.J. Lu
Author Date: 2016-03-13 08:26:57 UTC

Add memmove/memset-avx512-unaligned-erms-no-vzeroupper.S

hjl/erms/i386 2016-03-28 13:11:19 UTC 2016-03-28
Add 32-bit Enhanced REP MOVSB/STOSB (ERMS) memcpy/memset

Author: H.J. Lu
Author Date: 2011-09-21 22:21:28 UTC

Add 32-bit Enhanced REP MOVSB/STOSB (ERMS) memcpy/memset

Add and test 32-bit memcpy/memset with Enhanced REP MOVSB/STOSB (ERMS).

 * sysdeps/i386/i686/multiarch/Makefile (sysdep_routines): Add
 bcopy-erms, memcpy-erms, memmove-erms, mempcpy-erms, bzero-erms
 and memset-erms.
 * sysdeps/i386/i686/multiarch/bcopy-erms.S: New file.
 * sysdeps/i386/i686/multiarch/bzero-erms.S: Likewise.
 * sysdeps/i386/i686/multiarch/memcpy-erms.S: Likewise.
 * sysdeps/i386/i686/multiarch/memmove-erms.S: Likewise.
 * sysdeps/i386/i686/multiarch/mempcpy-erms.S: Likewise.
 * sysdeps/i386/i686/multiarch/memset-erms.S: Likewise.
 * sysdeps/i386/i686/multiarch/ifunc-impl-list.c
 (__libc_ifunc_impl_list): Add __bcopy_erms, __bzero_erms,
 __memmove_chk_erms, __memmove_erms, __memset_chk_erms,
 __memset_erms, __memcpy_chk_erms, __memcpy_erms,
 __mempcpy_chk_erms and __mempcpy_erms.

hjl/pr19583 2016-03-23 17:56:38 UTC 2016-03-23
[x86] Add a feature bit: Fast_Unaligned_Copy

Author: H.J. Lu
Author Date: 2016-03-23 17:33:19 UTC

[x86] Add a feature bit: Fast_Unaligned_Copy

On AMD processors, memcpy optimized with unaligned SSE load is
slower than memcpy optimized with aligned SSSE3 while other string
functions are faster with unaligned SSE load. A feature bit,
Fast_Unaligned_Copy, is added to select memcpy optimized with
unaligned SSE load.

 [BZ #19583]
 * sysdeps/x86/cpu-features.c (init_cpu_features): Set
 Fast_Unaligned_Copy with Fast_Unaligned_Load for Intel
 processors. Set Fast_Copy_Backward for AMD Excavator
 processors.
 * sysdeps/x86/cpu-features.h (bit_arch_Fast_Unaligned_Copy):
 New.
 (index_arch_Fast_Unaligned_Copy): Likewise.
 * sysdeps/x86_64/multiarch/memcpy.S (__new_memcpy): Check
 Fast_Unaligned_Copy instead of Fast_Unaligned_Load.

hjl/pr19776/master 2016-03-07 17:32:48 UTC 2016-03-07
Enable __memcpy_chk_sse2_unaligned

Author: H.J. Lu
Author Date: 2016-03-07 13:47:26 UTC

Enable __memcpy_chk_sse2_unaligned

Check Fast_Unaligned_Load for __memcpy_chk_sse2_unaligned. The new
selection order is:

1. __memcpy_chk_avx_unaligned if AVX_Fast_Unaligned_Load bit is set.
2. __memcpy_chk_sse2_unaligned if Fast_Unaligned_Load bit is set.
3. __memcpy_chk_sse2 if SSSE3 isn't available.
4. __memcpy_chk_ssse3_back if Fast_Copy_Backward bit is set.
5. __memcpy_chk_ssse3
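
The selection order above can be written as a chain of checks. The real
selector is assembly in sysdeps/x86_64/multiarch; the C below is only a
sketch, and struct cpu_bits_sketch and select_memcpy_chk are
hypothetical names that stand in for the cpu_features bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flattened view of the feature bits named in the list.  */
struct cpu_bits_sketch
{
  bool avx_fast_unaligned_load;
  bool fast_unaligned_load;
  bool ssse3;
  bool fast_copy_backward;
};

/* Return the name of the implementation the list above would pick.  */
static const char *
select_memcpy_chk (const struct cpu_bits_sketch *cpu)
{
  assert (cpu != NULL);
  if (cpu->avx_fast_unaligned_load)
    return "__memcpy_chk_avx_unaligned";
  if (cpu->fast_unaligned_load)
    return "__memcpy_chk_sse2_unaligned";
  if (!cpu->ssse3)
    return "__memcpy_chk_sse2";
  if (cpu->fast_copy_backward)
    return "__memcpy_chk_ssse3_back";
  return "__memcpy_chk_ssse3";
}
```

Because the checks run in order, a processor with both
Fast_Unaligned_Load and SSSE3 gets the sse2_unaligned variant, which is
the case this commit enables.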

 [BZ #19776]
 * sysdeps/x86_64/multiarch/mempcpy_chk.S (__mempcpy_chk): Check
 Fast_Unaligned_Load to enable __mempcpy_chk_sse2_unaligned.

hjl/pr18858/master 2016-03-06 23:30:00 UTC 2016-03-06
Test unaligned_1 mempcpy functions

Author: H.J. Lu
Author Date: 2016-03-06 23:18:04 UTC

Test unaligned_1 mempcpy functions

hjl/mempcpy 2016-03-04 13:44:17 UTC 2016-03-04
Add a comment in sysdeps/x86_64/Makefile

Author: H.J. Lu
Author Date: 2016-03-04 13:44:17 UTC

Add a comment in sysdeps/x86_64/Makefile

Mention recursive calls when ENTRY is used in _mcount.S.

 * sysdeps/x86_64/Makefile (sysdep_noprof): Add a comment.

hjl/plt/2.22 2016-02-23 19:21:45 UTC 2016-02-23
[x86_64] Set DL_RUNTIME_UNALIGNED_VEC_SIZE to 8

Author: H.J. Lu
Author Date: 2016-02-22 17:32:57 UTC

[x86_64] Set DL_RUNTIME_UNALIGNED_VEC_SIZE to 8

Due to GCC bug:

   https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58066

__tls_get_addr may be called with 8-byte stack alignment. Although
this bug has been fixed in GCC 4.9.4, 5.3 and 6, we can't assume
that stack will be always aligned at 16 bytes. Since SSE optimized
memory/string functions with aligned SSE register load and store are
used in the dynamic linker, we must set DL_RUNTIME_UNALIGNED_VEC_SIZE
to 8 so that _dl_runtime_resolve_sse will align the stack before
calling _dl_fixup:

Dump of assembler code for function _dl_runtime_resolve_sse:
   0x00007ffff7deea90 <+0>: push %rbx
   0x00007ffff7deea91 <+1>: mov %rsp,%rbx
   0x00007ffff7deea94 <+4>: and $0xfffffffffffffff0,%rsp
                                ^^^^^^^^^^^ Align stack to 16 bytes
   0x00007ffff7deea98 <+8>: sub $0x100,%rsp
   0x00007ffff7deea9f <+15>: mov %rax,0xc0(%rsp)
   0x00007ffff7deeaa7 <+23>: mov %rcx,0xc8(%rsp)
   0x00007ffff7deeaaf <+31>: mov %rdx,0xd0(%rsp)
   0x00007ffff7deeab7 <+39>: mov %rsi,0xd8(%rsp)
   0x00007ffff7deeabf <+47>: mov %rdi,0xe0(%rsp)
   0x00007ffff7deeac7 <+55>: mov %r8,0xe8(%rsp)
   0x00007ffff7deeacf <+63>: mov %r9,0xf0(%rsp)
   0x00007ffff7deead7 <+71>: movaps %xmm0,(%rsp)
   0x00007ffff7deeadb <+75>: movaps %xmm1,0x10(%rsp)
   0x00007ffff7deeae0 <+80>: movaps %xmm2,0x20(%rsp)
   0x00007ffff7deeae5 <+85>: movaps %xmm3,0x30(%rsp)
   0x00007ffff7deeaea <+90>: movaps %xmm4,0x40(%rsp)
   0x00007ffff7deeaef <+95>: movaps %xmm5,0x50(%rsp)
   0x00007ffff7deeaf4 <+100>: movaps %xmm6,0x60(%rsp)
   0x00007ffff7deeaf9 <+105>: movaps %xmm7,0x70(%rsp)
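
The and instruction at +4 rounds %rsp down to a multiple of 16, which is
what lets the movaps stores that follow use aligned addresses. The same
masking can be sketched in C; align_down is a hypothetical helper name
and the addresses in the usage note are made up.

```c
#include <assert.h>
#include <stdint.h>

/* Round SP down to a multiple of ALIGN, which must be a power of two.
   With ALIGN == 16 this is the C equivalent of
   "and $0xfffffffffffffff0,%rsp".  */
static uint64_t
align_down (uint64_t sp, uint64_t align)
{
  assert (align != 0 && (align & (align - 1)) == 0);
  return sp & ~(align - 1);
}
```

For example, a stack pointer of 0x7ffc12345678, which is only 8-byte
aligned, becomes 0x7ffc12345670, while an already aligned value is
returned unchanged.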

 [BZ #19679]
 * sysdeps/x86_64/dl-trampoline.S (DL_RUNIME_UNALIGNED_VEC_SIZE):
 Renamed to ...
 (DL_RUNTIME_UNALIGNED_VEC_SIZE): This. Set to 8.
 (DL_RUNIME_RESOLVE_REALIGN_STACK): Renamed to ...
 (DL_RUNTIME_RESOLVE_REALIGN_STACK): This. Updated.
 (DL_RUNIME_RESOLVE_REALIGN_STACK): Renamed to ...
 (DL_RUNTIME_RESOLVE_REALIGN_STACK): This.
 * sysdeps/x86_64/dl-trampoline.h
 (DL_RUNIME_RESOLVE_REALIGN_STACK): Renamed to ...
 (DL_RUNTIME_RESOLVE_REALIGN_STACK): This.

hjl/pr19679/2.23 2016-02-19 23:52:31 UTC 2016-02-19
[x86_64] Set DL_RUNTIME_UNALIGNED_VEC_SIZE to 8

Author: H.J. Lu
Author Date: 2016-02-19 23:43:45 UTC

[x86_64] Set DL_RUNTIME_UNALIGNED_VEC_SIZE to 8

Due to GCC bug:

   https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58066

__tls_get_addr may be called with 8-byte stack alignment. Although
this bug has been fixed in GCC 4.9.4, 5.3 and 6, we can't assume
that stack will be always aligned at 16 bytes. Since SSE optimized
memory/string functions with aligned SSE register load and store are
used in the dynamic linker, we must set DL_RUNTIME_UNALIGNED_VEC_SIZE
to 8 so that _dl_runtime_resolve_sse will align the stack before
calling _dl_fixup:

Dump of assembler code for function _dl_runtime_resolve_sse:
   0x00007ffff7deea90 <+0>: push %rbx
   0x00007ffff7deea91 <+1>: mov %rsp,%rbx
   0x00007ffff7deea94 <+4>: and $0xfffffffffffffff0,%rsp
                                ^^^^^^^^^^^ Align stack to 16 bytes
   0x00007ffff7deea98 <+8>: sub $0x100,%rsp
   0x00007ffff7deea9f <+15>: mov %rax,0xc0(%rsp)
   0x00007ffff7deeaa7 <+23>: mov %rcx,0xc8(%rsp)
   0x00007ffff7deeaaf <+31>: mov %rdx,0xd0(%rsp)
   0x00007ffff7deeab7 <+39>: mov %rsi,0xd8(%rsp)
   0x00007ffff7deeabf <+47>: mov %rdi,0xe0(%rsp)
   0x00007ffff7deeac7 <+55>: mov %r8,0xe8(%rsp)
   0x00007ffff7deeacf <+63>: mov %r9,0xf0(%rsp)
   0x00007ffff7deead7 <+71>: movaps %xmm0,(%rsp)
   0x00007ffff7deeadb <+75>: movaps %xmm1,0x10(%rsp)
   0x00007ffff7deeae0 <+80>: movaps %xmm2,0x20(%rsp)
   0x00007ffff7deeae5 <+85>: movaps %xmm3,0x30(%rsp)
   0x00007ffff7deeaea <+90>: movaps %xmm4,0x40(%rsp)
   0x00007ffff7deeaef <+95>: movaps %xmm5,0x50(%rsp)
   0x00007ffff7deeaf4 <+100>: movaps %xmm6,0x60(%rsp)
   0x00007ffff7deeaf9 <+105>: movaps %xmm7,0x70(%rsp)

 [BZ #19679]
 * sysdeps/x86_64/dl-trampoline.S (DL_RUNIME_UNALIGNED_VEC_SIZE):
 Renamed to ...
 (DL_RUNTIME_UNALIGNED_VEC_SIZE): This. Set to 8.
 (DL_RUNIME_RESOLVE_REALIGN_STACK): Renamed to ...
 (DL_RUNTIME_RESOLVE_REALIGN_STACK): This. Updated.
 (DL_RUNIME_RESOLVE_REALIGN_STACK): Renamed to ...
 (DL_RUNTIME_RESOLVE_REALIGN_STACK): This.
 * sysdeps/x86_64/dl-trampoline.h
 (DL_RUNIME_RESOLVE_REALIGN_STACK): Renamed to ...
 (DL_RUNTIME_RESOLVE_REALIGN_STACK): This.

andros/pr19654 2016-02-18 11:17:51 UTC 2016-02-18
Added tests to ensure link with *_finite aliases from libmvec (BZ #19654 fix).

Author: Andrew Senkevich
Author Date: 2016-02-18 11:17:51 UTC

Added tests to ensure link with *_finite aliases from libmvec (BZ #19654 fix).

 [BZ #19654]
 * sysdeps/x86_64/fpu/Makefile: Added new tests.
 * sysdeps/x86_64/fpu/test-double-libmvec-alias-avx-main.c: New.
 * sysdeps/x86_64/fpu/test-double-libmvec-alias-avx-mod.c: Likewise.
 * sysdeps/x86_64/fpu/test-double-libmvec-alias-avx.c: Likewise.
 * sysdeps/x86_64/fpu/test-double-libmvec-alias-avx2-main.c: Likewise.
 * sysdeps/x86_64/fpu/test-double-libmvec-alias-avx2-mod.c: Likewise.
 * sysdeps/x86_64/fpu/test-double-libmvec-alias-avx2.c: Likewise.
 * sysdeps/x86_64/fpu/test-double-libmvec-alias-avx512-main.c: Likewise.
 * sysdeps/x86_64/fpu/test-double-libmvec-alias-avx512-mod.c: Likewise.
 * sysdeps/x86_64/fpu/test-double-libmvec-alias-avx512.c: Likewise.
 * sysdeps/x86_64/fpu/test-double-libmvec-alias-main.c: Likewise.
 * sysdeps/x86_64/fpu/test-double-libmvec-alias-mod.c: Likewise.
 * sysdeps/x86_64/fpu/test-double-libmvec-alias.c: Likewise.
 * sysdeps/x86_64/fpu/test-float-libmvec-alias-avx-main.c: Likewise.
 * sysdeps/x86_64/fpu/test-float-libmvec-alias-avx-mod.c: Likewise.
 * sysdeps/x86_64/fpu/test-float-libmvec-alias-avx.c: Likewise.
 * sysdeps/x86_64/fpu/test-float-libmvec-alias-avx2-main.c: Likewise.
 * sysdeps/x86_64/fpu/test-float-libmvec-alias-avx2-mod.c: Likewise.
 * sysdeps/x86_64/fpu/test-float-libmvec-alias-avx2.c: Likewise.
 * sysdeps/x86_64/fpu/test-float-libmvec-alias-avx512-main.c: Likewise.
 * sysdeps/x86_64/fpu/test-float-libmvec-alias-avx512-mod.c: Likewise.
 * sysdeps/x86_64/fpu/test-float-libmvec-alias-avx512.c: Likewise.
 * sysdeps/x86_64/fpu/test-float-libmvec-alias-main.c: Likewise.
 * sysdeps/x86_64/fpu/test-float-libmvec-alias-mod.c: Likewise.
 * sysdeps/x86_64/fpu/test-float-libmvec-alias.c: Likewise.
 * sysdeps/x86_64/fpu/test-libmvec-alias-mod.c: Likewise.

gentoo/2.21 2016-02-17 16:18:58 UTC 2016-02-17
Fix parallel build error

Author: Andreas Schwab
Author Date: 2015-03-02 14:47:56 UTC

Fix parallel build error

https://bugs.gentoo.org/74948

(cherry picked from commit e8b6be0016f131c2ac72bf3213eabdb59800e63b)
(cherry picked from commit e04da210f7cd564c46a8db0e15a0c6e726f3977e)

hjl/pr19590 2016-02-16 20:20:37 UTC 2016-02-16
Remove test-double-libmvec-alias-*-wrappers.c

Author: H.J. Lu
Author Date: 2016-02-16 20:20:37 UTC

Remove test-double-libmvec-alias-*-wrappers.c

rth/execl 2016-02-09 11:27:56 UTC 2016-02-09
alpha: Implement execl{,e,p} without double stack allocation

Author: Richard Henderson
Author Date: 2016-02-09 02:43:08 UTC

alpha: Implement execl{,e,p} without double stack allocation

hjl/avx512f-mem/prefetcht1 2016-01-15 20:58:16 UTC 2016-01-15
Use prefetcht1 with non-temporal stores

Author: H.J. Lu
Author Date: 2016-01-15 20:58:16 UTC

Use prefetcht1 with non-temporal stores

hjl/avx512f-mem/master 2016-01-15 20:44:13 UTC 2016-01-15
Always use prefetchnta with non-temporal stores

Author: H.J. Lu
Author Date: 2016-01-15 20:44:13 UTC

Always use prefetchnta with non-temporal stores

hjl/avx512f-mem/old 2016-01-15 20:21:15 UTC 2016-01-15
Use prefetchnta with non-temporal stores

Author: H.J. Lu
Author Date: 2016-01-15 20:21:15 UTC

Use prefetchnta with non-temporal stores

andros/avx512f-mem 2016-01-15 20:03:44 UTC 2016-01-15
Tuned loops with non-temporal access.

Author: Andrew Senkevich
Author Date: 2016-01-15 20:03:44 UTC

Tuned loops with non-temporal access.

    * sysdeps/x86_64/multiarch/memcpy-avx512-no-vzeroupper.S: Tuned
    prefetch.

hjl/pr19463 2016-01-15 16:43:07 UTC 2016-01-15
Avoid strdup/strndup/strsep

Author: H.J. Lu
Author Date: 2016-01-13 23:03:46 UTC

Avoid strdup/strndup/strsep

hjl/pr19363/2.22 2016-01-04 16:15:26 UTC 2016-01-04
Provide x32 times

Author: H.J. Lu
Author Date: 2015-12-17 19:46:49 UTC

Provide x32 times

Since times returns 64-bit clock_t on x32, we need to provide x32 times
by redefining INTERNAL_SYSCALL_NCS and INTERNAL_SYSCALL_ERROR_P with
64-bit return type for syscall. All system calls returning 64-bit
integer, which are lseek, time and times, must be handled specially for
x32. lseek is handled by x32 lseek.S and time doesn't check syscall
return. times is the only missed one. Before this patch, there are

0000000 <__times>:
   0: b8 64 00 00 40 mov $0x40000064,%eax
   5: 0f 05 syscall
   7: 48 63 d0 movslq %eax,%rdx
                                ^^^^^^^^^^ Incorrect signed extension
   a: 48 83 fa f2 cmp $0xfffffffffffffff2,%rdx
   e: 75 07 jne 17 <__times+0x17>
  10: 3d 00 f0 ff ff cmp $0xfffff000,%eax
                                ^^^^^^^^^^^^^^^^^^^^^ 32-bit compare
  15: 77 11 ja 28 <__times+0x28>
  17: 48 83 fa ff cmp $0xffffffffffffffff,%rdx
  1b: b8 00 00 00 00 mov $0x0,%eax
  20: 48 0f 45 c2 cmovne %rdx,%rax
  24: c3 retq

After this patch, there are

00000000 <__times>:
   0: b8 64 00 00 40 mov $0x40000064,%eax
   5: 0f 05 syscall
   7: 48 83 f8 f2 cmp $0xfffffffffffffff2,%rax
   b: 75 08 jne 15 <__times+0x15>
   d: 48 3d 00 f0 ff ff cmp $0xfffffffffffff000,%rax
  13: 77 13 ja 28 <__times+0x28>
  15: 48 83 f8 ff cmp $0xffffffffffffffff,%rax
  19: ba 00 00 00 00 mov $0x0,%edx
  1e: 48 0f 44 c2 cmove %rdx,%rax
  22: c3 retq

The incorrect signed extension and 32-bit compare are gone.
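
The effect of the old code path can be reproduced in C. This is a sketch
with hypothetical function names, not the macros from the patch. It
assumes the usual behaviour where narrowing to int32_t keeps the low 32
bits (the conversion is implementation-defined in ISO C).

```c
#include <assert.h>
#include <stdint.h>

/* Old path: the result passes through a 32-bit int and is then
   sign-extended, as "movslq %eax,%rdx" does.  */
static int64_t
times_result_via_int32 (int64_t raw)
{
  int32_t narrowed = (int32_t) (uint32_t) raw;
  return (int64_t) narrowed;
}

/* New path: the full 64-bit result is kept.  */
static int64_t
times_result_via_int64 (int64_t raw)
{
  return raw;
}

/* Error check on the full 64-bit value: errors lie in [-4095, -1],
   which matches "cmp $0xfffffffffffff000,%rax; ja".  */
static int
is_syscall_error64 (int64_t value)
{
  return (uint64_t) value > (uint64_t) -4096;
}

/* A tick count that needs more than 32 bits and has bit 31 set.  */
static void
times_truncation_example (void)
{
  int64_t ticks = INT64_C (0x180000000);
  assert (times_result_via_int32 (ticks) == -INT64_C (0x80000000));
  assert (times_result_via_int64 (ticks) == ticks);
  assert (!is_syscall_error64 (ticks));
}
```

The 32-bit path turns a valid tick count into a negative number, while
the 64-bit path returns it unchanged and does not mistake it for an
error.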

 [BZ #19363]
 * sysdeps/unix/sysv/linux/x86_64/x32/times.c: New file.

hjl/pr19371/master 2015-12-16 14:50:42 UTC 2015-12-16
Properly handle x32 syscall

Author: H.J. Lu
Author Date: 2015-12-16 14:50:42 UTC

Properly handle x32 syscall

On x32, syscall() may return a 64-bit integer, as lseek, time and times do. Its
return type should be __syscall_slong_t instead of long int. We need
to properly return 64-bit error value.

Before the patch:

Dump of assembler code for function syscall:
   0x000dab20 <+0>: mov %rdi,%rax
   0x000dab23 <+3>: mov %rsi,%rdi
   0x000dab26 <+6>: mov %rdx,%rsi
   0x000dab29 <+9>: mov %rcx,%rdx
   0x000dab2c <+12>: mov %r8,%r10
   0x000dab2f <+15>: mov %r9,%r8
   0x000dab32 <+18>: mov 0x8(%rsp),%r9
   0x000dab37 <+23>: syscall
   0x000dab39 <+25>: cmp $0xfffffffffffff001,%rax
   0x000dab3f <+31>: jae 0xdab42 <syscall+34>
   0x000dab41 <+33>: retq
   0x000dab42 <+34>: mov 0x2b3367(%rip),%rcx # 0x38deb0
   0x000dab49 <+41>: neg %eax
   0x000dab4b <+43>: mov %eax,%fs:(%rcx)
   0x000dab4e <+46>: or $0xffffffff,%eax
                        ^^^^^^^^^^^^^^^^^^ This is 32-bit error return.
   0x000dab51 <+49>: retq
End of assembler dump.

After the patch:

Dump of assembler code for function syscall:
   0x000daaf0 <+0>: mov %rdi,%rax
   0x000daaf3 <+3>: mov %rsi,%rdi
   0x000daaf6 <+6>: mov %rdx,%rsi
   0x000daaf9 <+9>: mov %rcx,%rdx
   0x000daafc <+12>: mov %r8,%r10
   0x000daaff <+15>: mov %r9,%r8
   0x000dab02 <+18>: mov 0x8(%rsp),%r9
   0x000dab07 <+23>: syscall
   0x000dab09 <+25>: cmp $0xfffffffffffff001,%rax
   0x000dab0f <+31>: jae 0xdab12 <syscall+34>
   0x000dab11 <+33>: retq
   0x000dab12 <+34>: mov 0x2b3397(%rip),%rcx # 0x38deb0
   0x000dab19 <+41>: neg %eax
   0x000dab1b <+43>: mov %eax,%fs:(%rcx)
   0x000dab1e <+46>: or $0xffffffffffffffff,%rax
   0x000dab22 <+50>: retq
End of assembler dump.
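
The difference between the two or instructions at +46 can be modelled in
C. On x86-64 a write to %eax zero-extends into %rax, so the old code
returns 4294967295 rather than -1 when the result is read as a 64-bit
value. The function names below are hypothetical, and the final
conversion to int64_t assumes the usual two's complement behaviour.

```c
#include <assert.h>
#include <stdint.h>

/* Before: "or $0xffffffff,%eax".  The 32-bit write clears the upper
   half of the 64-bit register.  */
static int64_t
error_return_32bit (uint64_t rax)
{
  uint32_t eax = (uint32_t) rax | UINT32_C (0xffffffff);
  return (int64_t) (uint64_t) eax;
}

/* After: "or $0xffffffffffffffff,%rax".  All 64 bits are set.  */
static int64_t
error_return_64bit (uint64_t rax)
{
  rax |= UINT64_MAX;
  return (int64_t) rax;
}

/* 22 stands in for whatever was left in the register.  */
static void
error_return_example (void)
{
  assert (error_return_32bit (22) == INT64_C (4294967295));
  assert (error_return_64bit (22) == -1);
}
```

A caller that compares the 64-bit return value against -1 would miss
the failure with the first variant and detect it with the second.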

 [BZ #19371]
 * posix/unistd.h (syscall): Use __syscall_slong_t for return
 type.
 * sysdeps/unix/sysv/linux/x86_64/x32/syscall.S: New file.

hjl/pr19363/master 2015-12-16 13:46:20 UTC 2015-12-16
Use INTERNAL_SYSCALL_TIMES* macros for Linux times

Author: H.J. Lu
Author Date: 2015-12-15 03:09:13 UTC

Use INTERNAL_SYSCALL_TIMES* macros for Linux times

The Linux times function, which returns clock_t, is implemented with
INTERNAL_SYSCALL_DECL, INTERNAL_SYSCALL, INTERNAL_SYSCALL_ERROR_P and
INTERNAL_SYSCALL_ERRNO. The INTERNAL_SYSCALL* macros use a 32-bit
integer, but clock_t is 64-bit on x32, so the types are mismatched
there. All system calls returning a 64-bit integer (lseek, time and
times) must be handled specially for x32. lseek is handled by the x32
lseek.S and time doesn't check the syscall return value, so times is
the only one that was missed.

This patch replaces the INTERNAL_SYSCALL* macros in Linux times.c with
INTERNAL_SYSCALL_TIMES* macros, which default to the INTERNAL_SYSCALL*
macros, and provides an x32 times.c with proper INTERNAL_SYSCALL_TIMES*
macros.

There is no code change in times for i686 or x86-64. For x32, before
this patch, the generated code is:

0000000 <__times>:
   0: b8 64 00 00 40 mov $0x40000064,%eax
   5: 0f 05 syscall
   7: 48 63 d0 movslq %eax,%rdx
                                ^^^^^^^^^^ Incorrect sign extension
   a: 48 83 fa f2 cmp $0xfffffffffffffff2,%rdx
   e: 75 07 jne 17 <__times+0x17>
  10: 3d 00 f0 ff ff cmp $0xfffff000,%eax
                                ^^^^^^^^^^^^^^^^^^^^^ 32-bit compare
  15: 77 11 ja 28 <__times+0x28>
  17: 48 83 fa ff cmp $0xffffffffffffffff,%rdx
  1b: b8 00 00 00 00 mov $0x0,%eax
  20: 48 0f 45 c2 cmovne %rdx,%rax
  24: c3 retq

After this patch, the generated code is:

00000000 <__times>:
   0: b8 64 00 00 40 mov $0x40000064,%eax
   5: 0f 05 syscall
   7: 48 83 f8 f2 cmp $0xfffffffffffffff2,%rax
   b: 75 08 jne 15 <__times+0x15>
   d: 48 3d 00 f0 ff ff cmp $0xfffffffffffff000,%rax
  13: 77 13 ja 28 <__times+0x28>
  15: 48 83 f8 ff cmp $0xffffffffffffffff,%rax
  19: ba 00 00 00 00 mov $0x0,%edx
  1e: 48 0f 44 c2 cmove %rdx,%rax
  22: c3 retq

The incorrect sign extension and 32-bit compare are gone.

 [BZ #19363]
 * sysdeps/unix/sysv/linux/times.c (INTERNAL_SYSCALL_TIMES_DECL):
 New.
 (INTERNAL_SYSCALL_TIMES): Likewise.
 (INTERNAL_SYSCALL_TIMES_ERROR_P): Likewise.
 (INTERNAL_SYSCALL_TIMES_ERRNO): Likewise.
 (__times): Replace INTERNAL_SYSCALL* macros with
 INTERNAL_SYSCALL_TIMES* macros.
 * sysdeps/unix/sysv/linux/x86_64/x32/times.c: New file.
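
The effect of the movslq in the pre-patch listing can be reproduced in
portable C. This is a minimal sketch under stated assumptions: the
helper names are hypothetical, -14 is the constant compared against in
the listing (0xfffffffffffffff2), and the sample tick count is chosen
only because its low 32 bits equal 0xfffffff2.

```c
#include <stdint.h>

/* Hypothetical model of `movslq %eax,%rdx`: keep the low 32 bits of
   the 64-bit result and sign-extend them back to 64 bits. */
static int64_t sign_extend_low32(int64_t rax)
{
    uint32_t low = (uint32_t)rax;
    if (low & 0x80000000u)
        return (int64_t)low - INT64_C(0x100000000);
    return (int64_t)low;
}

/* Pre-patch check: compares the sign-extended low half with -14. */
static int matches_minus14_before(int64_t rax)
{
    return sign_extend_low32(rax) == -14;
}

/* Post-patch check: compares the full 64-bit value with -14. */
static int matches_minus14_after(int64_t rax)
{
    return rax == -14;
}
```

With a 64-bit tick count of 0x1fffffff2, the pre-patch check matches
even though the value is a valid positive result, while the post-patch
check does not.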

hjl/32bit/2.22 2015-12-15 21:45:12 UTC 2015-12-15
Add Prefer_MAP_32BIT_EXEC to map executable pages with MAP_32BIT

Author: H.J. Lu
Author Date: 2015-10-21 21:44:23 UTC

Add Prefer_MAP_32BIT_EXEC to map executable pages with MAP_32BIT

According to the Silvermont software optimization guide, for 64-bit
applications, branch prediction performance can be negatively impacted
when the target of a branch is more than 4GB away from the branch. Add
the Prefer_MAP_32BIT_EXEC bit so that mmap will try to map executable
pages with MAP_32BIT first. NB: MAP_32BIT maps into the lower 2GB, not
the lower 4GB, of the address space. Because Prefer_MAP_32BIT_EXEC
reduces the bits available for address space layout randomization
(ASLR), it is always disabled for SUID programs and can only be enabled
by setting the environment variable LD_PREFER_MAP_32BIT_EXEC.

On Fedora 23, this patch speeds up GCC 5 testsuite by 3% on Silvermont.

 [BZ #19367]
 * sysdeps/unix/sysv/linux/wordsize-64/mmap.c: New file.
 * sysdeps/unix/sysv/linux/x86_64/64/dl-librecon.h: Likewise.
 * sysdeps/unix/sysv/linux/x86_64/64/mmap.c: Likewise.
 * sysdeps/x86/cpu-features.h (bit_Prefer_MAP_32BIT_EXEC): New.
 (index_Prefer_MAP_32BIT_EXEC): Likewise.

(cherry picked from commit b9eb92ab05204df772eb4929eccd018637c9f3e9)
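
The "try MAP_32BIT first" behavior described above can be sketched in
user-level C. This is an illustration, not the patch itself: the
function name map_prefer_32bit is hypothetical, the
LD_PREFER_MAP_32BIT_EXEC and SUID checks are omitted, and the sketch
assumes that a failed MAP_32BIT attempt is followed by an ordinary
mmap. MAP_32BIT is only defined on x86-64 Linux, so it is guarded.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: for executable anonymous mappings, try the low
   2GB first, then fall back to an unconstrained mapping. */
static void *map_prefer_32bit(size_t len, int prot)
{
    int flags = MAP_PRIVATE | MAP_ANONYMOUS;

#ifdef MAP_32BIT
    if (prot & PROT_EXEC) {
        void *p = mmap(NULL, len, prot, flags | MAP_32BIT, -1, 0);
        if (p != MAP_FAILED)
            return p;
        /* The low 2GB may be exhausted; retry without MAP_32BIT. */
    }
#endif
    return mmap(NULL, len, prot, flags, -1, 0);
}
```

Non-executable mappings skip the MAP_32BIT attempt entirely, so only
code pages give up ASLR bits.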

hjl/32bit/master 2015-12-15 21:16:02 UTC 2015-12-15
Add Prefer_MAP_32BIT_EXEC to map executable pages with MAP_32BIT

Author: H.J. Lu
Author Date: 2015-10-21 21:44:23 UTC

Add Prefer_MAP_32BIT_EXEC to map executable pages with MAP_32BIT

According to the Silvermont software optimization guide, for 64-bit
applications, branch prediction performance can be negatively impacted
when the target of a branch is more than 4GB away from the branch. Add
the Prefer_MAP_32BIT_EXEC bit so that mmap will try to map executable
pages with MAP_32BIT first. NB: MAP_32BIT maps into the lower 2GB, not
the lower 4GB, of the address space. Because Prefer_MAP_32BIT_EXEC
reduces the bits available for address space layout randomization
(ASLR), it is always disabled for SUID programs and can only be enabled
by setting the environment variable LD_PREFER_MAP_32BIT_EXEC.

On Fedora 23, this patch speeds up GCC 5 testsuite by 3% on Silvermont.

 [BZ #19367]
 * sysdeps/unix/sysv/linux/wordsize-64/mmap.c: New file.
 * sysdeps/unix/sysv/linux/x86_64/64/dl-librecon.h: Likewise.
 * sysdeps/unix/sysv/linux/x86_64/64/mmap.c: Likewise.
 * sysdeps/x86/cpu-features.h (bit_Prefer_MAP_32BIT_EXEC): New.
 (index_Prefer_MAP_32BIT_EXEC): Likewise.

hjl/pr19363/clobber 2015-12-15 04:35:51 UTC 2015-12-15
Use REGISTERS_CLOBBERED_BY_SYSCALL

Author: H.J. Lu
Author Date: 2015-12-15 04:35:51 UTC

Use REGISTERS_CLOBBERED_BY_SYSCALL

hjl/pr19178/master 2015-11-10 23:54:22 UTC 2015-11-10
Run tst-prelink test for GLOB_DAT reloc

Author: H.J. Lu
Author Date: 2015-11-10 23:54:22 UTC

Run tst-prelink test for GLOB_DAT reloc

Run tst-prelink test on targets with GLOB_DAT relocation.

 * config.make.in (have-glob-dat-reloc): New.
 * configure.ac (libc_cv_has_glob_dat): New. Set to yes if
 target supports GLOB_DAT relocation. AC_SUBST.
 * configure: Regenerated.
 * elf/Makefile (tests): Add tst-prelink.
 (tests-special): Add $(objpfx)tst-prelink-cmp.out.
 (tst-prelink-ENV): New.
 ($(objpfx)tst-prelink-conflict.out): Likewise.
 ($(objpfx)tst-prelink-cmp.out): Likewise.
 * sysdeps/x86/tst-prelink.c: Moved to ...
 * elf/tst-prelink.c: Here.
 * sysdeps/x86/tst-prelink.exp: Moved to ...
 * elf/tst-prelink.exp: Here.
 * sysdeps/x86/Makefile (tests): Don't add tst-prelink.
 (tst-prelink-ENV): Removed.
 ($(objpfx)tst-prelink-conflict.out): Likewise.
 ($(objpfx)tst-prelink-cmp.out): Likewise.
 (tests-special): Don't add $(objpfx)tst-prelink-cmp.out.

hjl/pr19122 2015-10-20 12:43:19 UTC 2015-10-20
Mark internal unistd functions hidden in ld.so

Author: H.J. Lu
Author Date: 2015-10-14 22:21:55 UTC

Mark internal unistd functions hidden in ld.so

Since internal unistd functions are only used internally in ld.so and
libc.so, they can be made hidden. Some functions can't be hidden in
ld.so on Hurd since they will be preempted by the ones in libc.so after
bootstrap.

 [BZ #19122]
 * include/unistd.h [IS_IN (rtld)]: Include <dl-unistd.h>.
 * sysdeps/generic/dl-unistd.h: New file.
 * sysdeps/mach/hurd/dl-unistd.h: Likewise.

hjl/i386/master 2015-10-19 17:45:25 UTC 2015-10-19
Avoid reading errno in syscall implementations

Author: H.J. Lu
Author Date: 2015-08-21 21:46:05 UTC

Avoid reading errno in syscall implementations

Reading errno is expensive for x86 PIC. With INTERNAL_SYSCALL,
INTERNAL_SYSCALL_ERROR_P, INTERNAL_SYSCALL_ERRNO and
INLINE_SYSCALL_ERROR_RETURN_VALUE, we can avoid reading errno.

There are no code changes on x86-64. On i686, libc.so sizes in bytes
show:

          text   data    bss      dec
after  1748495  11380  11132  1771007
before 1748403  11380  11132  1770915

 * sysdeps/unix/sysv/linux/eventfd.c (eventfd): Use
 INTERNAL_SYSCALL, INTERNAL_SYSCALL_ERROR_P and
 INTERNAL_SYSCALL_ERRNO to avoid reading errno.
 * sysdeps/unix/sysv/linux/fstatfs64.c (__fstatfs64): Likewise.
 * sysdeps/unix/sysv/linux/getrlimit64.c (__getrlimit64):
 Likewise.
 * sysdeps/unix/sysv/linux/setrlimit64.c (setrlimit64):
 Likewise.
 * sysdeps/unix/sysv/linux/signalfd.c (signalfd): Likewise.
 * sysdeps/unix/sysv/linux/statfs64.c (__statfs64): Likewise.

1-100 of 243 results

Other repositories

Name Last Modified
lp:glibc 4 hours ago
1-1 of 1 result