glibc:ibm/2.28/master

Last commit made on 2021-02-13
Get this branch:
git clone -b ibm/2.28/master https://git.launchpad.net/glibc

Branch merges

Branch information

Name:
ibm/2.28/master
Repository:
lp:glibc

Recent commits

cde88e7... by Tulio Magno Quites Machado Filho <email address hidden>

Merge branch release/2.28/master into ibm/2.28/master

e9db776... by Florian Weimer

gconv: Fix assertion failure in ISO-2022-JP-3 module (bug 27256)

The conversion loop to the internal encoding does not follow
the interface contract that __GCONV_FULL_OUTPUT is only returned
after the internal wchar_t buffer has been filled completely. This
is enforced by the first of the two asserts in iconv/skeleton.c:

       /* We must run out of output buffer space in this
   rerun. */
       assert (outbuf == outerr);
       assert (nstatus == __GCONV_FULL_OUTPUT);

This commit solves this issue by queuing a second wide character
which cannot be written immediately in the state variable, like
other converters already do (e.g., BIG5-HKSCS or TSCII).

Reported-by: Tavis Ormandy <email address hidden>
(cherry picked from commit 7d88c6142c6efc160c0ee5e4f85cde382c072888)

44fd888... by "H.J. Lu" <email address hidden>

x86: Check IFUNC definition in unrelocated executable [BZ #20019]

Calling an IFUNC function defined in unrelocated executable also leads to
segfault. Issue a fatal error message when calling IFUNC function defined
in the unrelocated executable from a shared library.

On x86, ifuncmain6pie failed with:

[hjl@gnu-cfl-2 build-i686-linux]$ ./elf/ifuncmain6pie --direct
./elf/ifuncmain6pie: IFUNC symbol 'foo' referenced in '/export/build/gnu/tools-build/glibc-32bit/build-i686-linux/elf/ifuncmod6.so' is defined in the executable and creates an unsatisfiable circular dependency.
[hjl@gnu-cfl-2 build-i686-linux]$ readelf -rW elf/ifuncmod6.so | grep foo
00003ff4 00000706 R_386_GLOB_DAT 0000400c foo_ptr
00003ff8 00000406 R_386_GLOB_DAT 00000000 foo
0000400c 00000401 R_386_32 00000000 foo
[hjl@gnu-cfl-2 build-i686-linux]$

Remove non-JUMP_SLOT relocations against foo in ifuncmod6.so, which
trigger the circular IFUNC dependency, and build ifuncmain6pie with
-Wl,-z,lazy.

(cherry picked from commits 6ea5b57afa5cdc9ce367d2b69a2cebfb273e4617
 and 7137d682ebfcb6db5dfc5f39724718699922f06c)

4a68828... by "H.J. Lu" <email address hidden>

x86: Set header.feature_1 in TCB for always-on CET [BZ #27177]

Update dl_cet_check() to set header.feature_1 in TCB when both IBT and
SHSTK are always on.

(cherry picked from commit 2ef23b520597f4ea1790a669b83e608f24f4cf12)

56fbd0b... by "H.J. Lu" <email address hidden>

x86-64: Avoid rep movsb with short distance [BZ #27130]

When copying with "rep movsb", if the distance between source and
destination is N*4GB + [1..63] with N >= 0, performance may be very
slow. This patch updates memmove-vec-unaligned-erms.S for AVX and
AVX512 versions with the distance in RCX:

 cmpl $63, %ecx
 // Don't use "rep movsb" if ECX <= 63
 jbe L(Don't use rep movsb")
 Use "rep movsb"

Benchtests data with bench-memcpy, bench-memcpy-large, bench-memcpy-random
and bench-memcpy-walk on Skylake, Ice Lake and Tiger Lake show that its
performance impact is within noise range as "rep movsb" is only used for
data size >= 4KB.

(cherry picked from commit 3ec5d83d2a237d39e7fd6ef7a0bc8ac4c171a4a5)

d81114e... by Szabolcs Nagy <email address hidden>

aarch64: Fix DT_AARCH64_VARIANT_PCS handling [BZ #26798]

The variant PCS support was ineffective because in the common case
linkmap->l_mach.plt == 0 but then the symbol table flags were ignored
and normal lazy binding was used instead of resolving the relocs early.
(This was a misunderstanding about how GOT[1] is setup by the linker.)

In practice this mainly affects SVE calls when the vector length is
more than 128 bits, then the top bits of the argument registers get
clobbered during lazy binding.

Fixes bug 26798.

(cherry picked from commit 558251bd8785760ad40fcbfeaaee5d27fa5b0fe4)

e5dac99... by Wilco Dijkstra <email address hidden>

AArch64: Use __memcpy_simd on Neoverse N2/V1

Add CPU detection of Neoverse N2 and Neoverse V1, and select __memcpy_simd as
the memcpy/memmove ifunc.

Reviewed-by: Adhemerval Zanella <email address hidden>
(cherry picked from commit e11ed9d2b4558eeacff81557dc9557001af42a6b)

98979f6... by Wilco Dijkstra <email address hidden>

AArch64: Rename IS_ARES to IS_NEOVERSE_N1

Rename IS_ARES to IS_NEOVERSE_N1 since that is a bit clearer.

Reviewed-by: Carlos O'Donell <email address hidden>
(cherry picked from commit 0f6278a8793a5d04ea31878119eccf99f469a02d)

fe09348... by Wilco Dijkstra <email address hidden>

[AArch64] Improve integer memcpy

Further optimize integer memcpy. Small cases now include copies up
to 32 bytes. 64-128 byte copies are split into two cases to improve
performance of 64-96 byte copies. Comments have been rewritten.

(cherry picked from commit 700065132744e0dfa6d4d9142d63f6e3a1934726)

722c935... by Krzysztof Koch <email address hidden>

aarch64: Increase small and medium cases for __memcpy_generic

Increase the upper bound on medium cases from 96 to 128 bytes.
Now, up to 128 bytes are copied unrolled.

Increase the upper bound on small cases from 16 to 32 bytes so that
copies of 17-32 bytes are not impacted by the larger medium case.

Benchmarking:
The attached figures show relative timing difference with respect
to 'memcpy_generic', which is the existing implementation.
'memcpy_med_128' denotes the the version of memcpy_generic with
only the medium case enlarged. The 'memcpy_med_128_small_32' numbers
are for the version of memcpy_generic submitted in this patch, which
has both medium and small cases enlarged. The figures were generated
using the script from:
https://www.sourceware.org/ml/libc-alpha/2019-10/msg00563.html

Depending on the platform, the performance improvement in the
bench-memcpy-random.c benchmark ranges from 6% to 20% between
the original and final version of memcpy.S

Tested against GLIBC testsuite and randomized tests.

(cherry picked from commit b9f145df85145506f8e61bac38b792584a38d88f)