glibc:andros/avx512f-mem

Last commit made on 2016-01-15
Get this branch:
git clone -b andros/avx512f-mem https://git.launchpad.net/glibc

Branch information

Name:
andros/avx512f-mem
Repository:
lp:glibc

Recent commits

e8ae919... by Andrew Senkevich <email address hidden>

Tuned loops with non-temporal access.

    * sysdeps/x86_64/multiarch/memcpy-avx512-no-vzeroupper.S: Tuned
    prefetch.

6e0be40... by Andrew Senkevich <email address hidden>

Added memcpy/memmove family optimized with AVX512 for KNL hardware.

Added AVX512 implementations of memcpy, mempcpy, memmove, memcpy_chk,
mempcpy_chk, memmove_chk.
They show an average improvement of more than 30% over the AVX versions
on KNL hardware (performance results are attached in the thread
<https://sourceware.org/ml/libc-alpha/2016-01/msg00258.html>).

    * sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Added new files.
    * sysdeps/x86_64/multiarch/ifunc-impl-list.c: Added new tests.
    * sysdeps/x86_64/multiarch/memcpy-avx512-no-vzeroupper.S: New file.
    * sysdeps/x86_64/multiarch/mempcpy-avx512-no-vzeroupper.S: Likewise.
    * sysdeps/x86_64/multiarch/memmove-avx512-no-vzeroupper.S: Likewise.
    * sysdeps/x86_64/multiarch/memcpy.S: Added new IFUNC branch.
    * sysdeps/x86_64/multiarch/memcpy_chk.S: Likewise.
    * sysdeps/x86_64/multiarch/memmove.c: Likewise.
    * sysdeps/x86_64/multiarch/memmove_chk.c: Likewise.
    * sysdeps/x86_64/multiarch/mempcpy.S: Likewise.
    * sysdeps/x86_64/multiarch/mempcpy_chk.S: Likewise.
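
The "new IFUNC branch" entries are about picking an implementation at run time
according to CPU features. A minimal sketch of that idea in plain C follows,
assuming GCC or Clang on x86-64. It is not the code from this commit, which is
hand-written assembly selected through glibc's internal feature checks; the
names select_copy, copy_generic and copy_avx512 are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef void *(*copy_fn)(void *, const void *, size_t);

/* Fallback used when AVX512F is unavailable (or on non-x86 builds).  */
static void *copy_generic(void *dst, const void *src, size_t n)
{
    return memcpy(dst, src, n);
}

#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
#include <immintrin.h>
#define HAVE_AVX512_SKETCH 1

/* Hypothetical AVX512F copy for non-overlapping buffers: 64 bytes per
   iteration with unaligned loads/stores, then memcpy for the tail.  */
__attribute__((target("avx512f")))
static void *copy_avx512(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    size_t i = 0;

    for (; i + 64 <= n; i += 64) {
        __m512i v = _mm512_loadu_si512((const void *)(s + i));
        _mm512_storeu_si512((void *)(d + i), v);
    }
    memcpy(d + i, s + i, n - i);
    return dst;
}
#endif

/* Choose an implementation once, based on what the running CPU reports.  */
static copy_fn select_copy(void)
{
#ifdef HAVE_AVX512_SKETCH
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx512f"))
        return copy_avx512;
#endif
    return copy_generic;
}
```

A caller would store the result of select_copy() once and call through the
pointer afterwards. A real IFUNC resolver does the equivalent at symbol
binding time, so callers pay no per-call dispatch cost.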

a3e5b4f... by "Paul E. Murphy" <email address hidden>

Fix race in tst-mqueue5

The check is done on line 117 by a thread spawned
from do_child(), forked from do_test(). This test
generates a signal in the forked process.

Either thread may handle the signal, and on ppc it
happens to be handled in do_child, by the thread
that is not doing the check on line 117.

This exposes a race condition whereby the test
incorrectly fails as the signal is caught during
or after the check.

This is mitigated by ensuring the signal is blocked
in the child thread while the thread is running.

692de4b... by Martin Sebor <email address hidden>

Have iconv accept redundant escape sequences in IBM930, IBM933, IBM935,
IBM937, and IBM939.

The patch for bug #17197 changes the encoder to avoid generating redundant
shift sequences. However, those sequences may already be present in
data encoded by prior versions of the encoder. This change modifies
the decoder so that it no longer rejects redundant shift sequences.

        [BZ #19432]
        * iconvdata/Makefile: Add bug-iconv11.
        * iconvdata/bug-iconv11.c: New test.
        * iconvdata/ibm930.c: Do not reject redundant shift sequences.
        * iconvdata/ibm933.c: Same.
        * iconvdata/ibm935.c: Same.
        * iconvdata/ibm937.c: Same.
        * iconvdata/ibm939.c: Same.
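
To make "redundant shift sequence" concrete, here is a toy decoder for a
stateful single-byte/double-byte encoding that switches modes with the SO
(0x0E) and SI (0x0F) control bytes. It is a sketch under stated assumptions,
not the iconvdata code: count_chars and its strict flag are hypothetical, and
it only counts characters instead of converting them.

```c
#include <assert.h>
#include <stddef.h>

enum { SO = 0x0E, SI = 0x0F };  /* shift-out / shift-in control bytes */

/* Count the characters in buf.  The decoder starts in single-byte mode;
   SO switches to double-byte mode and SI switches back.  A shift to the
   mode that is already active is "redundant": with strict != 0 it is
   rejected, otherwise it is treated as a no-op.
   Returns the character count, or -1 on invalid input.  */
static long count_chars(const unsigned char *buf, size_t n, int strict)
{
    int dbcs = 0;
    long chars = 0;
    size_t i = 0;

    while (i < n) {
        unsigned char b = buf[i];

        if (b == SO || b == SI) {
            int want_dbcs = (b == SO);
            if (want_dbcs == dbcs && strict)
                return -1;      /* strict mode: reject redundant shift */
            dbcs = want_dbcs;   /* lenient mode: redundant shift changes nothing */
            i += 1;
        } else if (dbcs) {
            if (i + 1 >= n)
                return -1;      /* truncated double-byte character */
            chars += 1;
            i += 2;
        } else {
            chars += 1;
            i += 1;
        }
    }
    return chars;
}
```

With strict set to 0, input that begins with SI while already in single-byte
mode, or that repeats SO, decodes normally; with strict set to 1 the same
input is rejected, which mirrors the behaviour this commit describes removing.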

f2b3078... by Martin Sebor <email address hidden>

Fix build failures with -DDEBUG.

        [BZ #19443]
        * crypt/crypt_util.c [DEBUG] (_ufc_prbits): Correct format string.
        [DEBUG] (_ufc_set_bits): Declare used.
        * iconv/gconv_dl.c [DEBUG]: Add a missing include directive.
        [DEBUG] (print_all): Declare used.
        * resolv/res_send.c [DEBUG] (__libc_res_nsend): Explicitly convert
        operands of the ternary ?: expression to target type.
        * stdlib/rshift.c [DEBUG] (mpn_rshift): Use assert() instead of
        calling the undeclared abort.
        * time/mktime.c [DEBUG] (DEBUG): Rename to DEBUG_MKTIME.

ad37480... by Martin Sebor <email address hidden>

Fix build errors with -DNDEBUG.

        [BZ #18755]
        * iconv/skeleton.c (FUNCTION_NAME): Suppress -Wunused-but-set-variable
        warnings.
        * sysdeps/nptl/gai_misc.h (__gai_start_notify_thread): Same.
        (__gai_create_helper_thread): Same.
        * nscd/nscd.c (do_exit): Suppress -Wunused-variable.
        * iconvdata/iso-2022-cn-ext.c (BODY): Initialize local variable
        to suppress -Wmaybe-uninitialized warnings.
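
For context, this class of warning typically appears when a variable is only
read inside assert(): with -DNDEBUG the assert expands to nothing and the
variable becomes set but unused. The sketch below shows one generic way to
silence it; the commit may well use a different technique, and set_flag and
enable are hypothetical names.

```c
#include <assert.h>

/* Hypothetical helper that always succeeds.  */
static int set_flag(int *flag)
{
    *flag = 1;
    return 0;
}

static int enable(int *flag)
{
    int rc;

    rc = set_flag(flag);
    assert(rc == 0);  /* disappears entirely when NDEBUG is defined */
    (void) rc;        /* explicit use, so -Wunused-but-set-variable stays quiet */
    return *flag;
}
```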

0924537... by "H.J. Lu" <email address hidden>

Call math_opt_barrier inside if

Since a floating-point operation may trigger floating-point exceptions,
we call math_opt_barrier inside the if block to prevent code motion.

 [BZ #19465]
 * sysdeps/ieee754/dbl-64/s_fma.c (__fma): Call math_opt_barrier
 inside if.
 * sysdeps/ieee754/ldbl-128/s_fmal.c (__fmal): Likewise.
 * sysdeps/ieee754/ldbl-96/s_fma.c (__fma): Likewise.
 * sysdeps/ieee754/ldbl-96/s_fmal.c (__fmal): Likewise.

82c9a4f... by "H.J. Lu" <email address hidden>

Use TIME_T_MAX and TIME_T_MIN in tst-mktime2.c

GCC 5.3 compiles

for (time_t_max = 1; 0 < time_t_max; time_t_max *= 2)
    continue;

into an infinite loop with -Os. We can copy TIME_T_MAX and TIME_T_MIN
from time/mktime.c.

 [BZ #19466]
 * time/tst-mktime2.c (time_t_max): Removed.
 (time_t_min): Likewise.
 (TYPE_SIGNED): New.
 (TYPE_MINIMUM): Likewise.
 (TYPE_MAXIMUM): Likewise.
 (TIME_T_MIN): Likewise.
 (TIME_T_MAX): Likewise.
 (mktime_test): Replace time_t_max and time_t_min with TIME_T_MAX
 and TIME_T_MIN.
 (do_test): Likewise.
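
The doubling loop relies on signed integer overflow, which is undefined
behaviour in C, so a compiler may assume that 0 < time_t_max always holds. The
limits can instead be computed as constant expressions that never overflow.
The macros below are a sketch in the style of the names listed above, not a
verbatim copy of time/mktime.c; they assume an integer time_t and two's
complement representation.

```c
#include <assert.h>
#include <limits.h>
#include <time.h>

/* Nonzero if the integer type t is signed.  */
#define TYPE_SIGNED(t) (!((t) 0 < (t) -1))

/* Largest value of integer type t.  For signed types, build 2^(bits-2),
   subtract 1, double and add 1, so no intermediate step overflows.  */
#define TYPE_MAXIMUM(t)                                            \
    ((t) (!TYPE_SIGNED(t)                                          \
          ? (t) -1                                                 \
          : ((((t) 1 << (sizeof(t) * CHAR_BIT - 2)) - 1) * 2 + 1)))

/* Smallest value of integer type t (two's complement assumed).  */
#define TYPE_MINIMUM(t) \
    ((t) (!TYPE_SIGNED(t) ? (t) 0 : -TYPE_MAXIMUM(t) - 1))

#define TIME_T_MAX TYPE_MAXIMUM(time_t)
#define TIME_T_MIN TYPE_MINIMUM(time_t)
```

Because these are compile-time constants, there is no loop left for the
optimizer to reason about.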

d7890e6... by Amit Pawar <email address hidden>

Set index_Fast_Unaligned_Load for Excavator family CPUs

GLIBC benchtests show that the SSE2_Unaligned implementations are
faster than the SSE2 implementations of these routines: strcmp,
strcat, strncat, stpcpy, stpncpy, strcpy, strncpy and strstr. The
index_Fast_Unaligned_Load flag is now set for Excavator (family 15h)
CPUs, which makes the SSE2_Unaligned implementations the default for
these routines.

 [BZ #19467]
 * sysdeps/x86/cpu-features.c (init_cpu_features): Set
 index_Fast_Unaligned_Load flag for Excavator family CPUs.
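
As background, a CPU's family and model are decoded from EAX of CPUID leaf 1.
The sketch below shows that decoding and a family 15h check. It is
illustrative only: decode_family_model and prefers_unaligned_load are
hypothetical names, and the model range 60h-7Fh used to identify Excavator
parts is an assumption that is not stated in this commit message.

```c
#include <assert.h>

/* Decode the display family and model from CPUID leaf 1 EAX.  The
   extended fields are added only when the base family is 0xF.  */
static void decode_family_model(unsigned eax, unsigned *family,
                                unsigned *model)
{
    unsigned base_family = (eax >> 8) & 0x0f;
    unsigned base_model = (eax >> 4) & 0x0f;
    unsigned ext_family = (eax >> 20) & 0xff;
    unsigned ext_model = (eax >> 16) & 0x0f;

    *family = base_family;
    *model = base_model;
    if (base_family == 0x0f) {
        *family += ext_family;
        *model += ext_model << 4;
    }
}

/* Assumption for illustration: Excavator = family 15h, models 60h-7Fh.  */
static int prefers_unaligned_load(unsigned eax)
{
    unsigned family, model;

    decode_family_model(eax, &family, &model);
    return family == 0x15 && model >= 0x60 && model <= 0x7f;
}
```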

a4b5177... by Marcin Kościelnicki

Add __private_ss to s390 struct tcbhead.

Preparation for gcc -fsplit-stack support (gcc bug #68191). The new
field is basically identical to the one on x86. Its TCB offset needs
to be constant, as it'll be hardcoded in gcc.

ChangeLog:

 * sysdeps/s390/nptl/tls.h (struct tcbhead_t): Add __private_ss field.
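
When a compiler hardcodes a field's offset, a compile-time check can guard
against the layout drifting. The sketch below uses a made-up structure, not
the real s390 tcbhead_t; example_tcbhead_t and its fields are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical thread control block header, for illustration only.  */
typedef struct {
    void *tcb;
    void *dtv;
    void *self;
    void *private_ss;   /* slot reserved for the split-stack boundary */
} example_tcbhead_t;

/* Fail the build if the field moves from its agreed offset.  */
_Static_assert(offsetof(example_tcbhead_t, private_ss) == 3 * sizeof(void *),
               "private_ss offset must stay constant");
```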