glibc:codonell/c-utf8

Last commit made on 2021-07-29
Get this branch:
git clone -b codonell/c-utf8 https://git.launchpad.net/glibc

Branch information

Name:
codonell/c-utf8
Repository:
lp:glibc

Recent commits

b2a7720... by Carlos-0

Add generic C.UTF-8 locale (Bug 17318)

We add a new C.UTF-8 locale. This locale is not built into glibc, but
is provided as a distinct locale. The locale provides full support
for UTF-8, including full code point sorting via strcmp-based
collation.

The collation uses a new keyword 'strcmp_collation' which drops all
collation rules and generates an empty zero rules collation to enable
strcmp usage in collation. This ensures that we get full code point
sorting for C.UTF-8 with a minimal 92 bytes of overhead (LC_COLLATE
structure information).
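The reason strcmp is enough for code point sorting is that UTF-8 was designed so that byte-wise comparison of valid strings gives the same order as comparing the decoded code points. The following is a minimal sketch of that property, not code from the commit; `decode_utf8`, `codepoint_cmp`, and `sign` are hypothetical helper names, and the input is assumed to be valid UTF-8.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Decode one UTF-8 sequence at *s (input assumed valid), advance *s,
   and return the code point.  Returns 0 at the terminating NUL.  */
static uint32_t
decode_utf8 (const unsigned char **s)
{
  const unsigned char *p = *s;
  uint32_t cp;
  int extra;

  if (*p == 0)
    return 0;                       /* Terminating NUL: do not advance.  */
  if (*p < 0x80)      { cp = *p;        extra = 0; }
  else if (*p < 0xE0) { cp = *p & 0x1F; extra = 1; }
  else if (*p < 0xF0) { cp = *p & 0x0F; extra = 2; }
  else                { cp = *p & 0x07; extra = 3; }
  p++;
  while (extra-- > 0)
    {
      assert ((*p & 0xC0) == 0x80); /* Must be a continuation byte.  */
      cp = (cp << 6) | (*p++ & 0x3F);
    }
  *s = p;
  return cp;
}

/* Compare two valid UTF-8 strings by code point: returns -1, 0, or 1.  */
static int
codepoint_cmp (const char *a, const char *b)
{
  const unsigned char *pa = (const unsigned char *) a;
  const unsigned char *pb = (const unsigned char *) b;

  for (;;)
    {
      uint32_t ca = decode_utf8 (&pa);
      uint32_t cb = decode_utf8 (&pb);
      if (ca != cb)
        return ca < cb ? -1 : 1;
      if (ca == 0)
        return 0;
    }
}

/* Reduce any comparison result to -1, 0, or 1.  */
static int
sign (int x)
{
  return (x > 0) - (x < 0);
}
```

For any two valid UTF-8 strings `a` and `b`, `sign (strcmp (a, b))` should equal `codepoint_cmp (a, b)`, including for code points above U+FFFF, where UTF-16 code unit order would differ.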

The new locale is added to SUPPORTED. Minimal test data for specific
code points (minus those not supported by collate-test) is provided
in C.UTF-8.in, and this verifies code point sorting is working
reasonably across the range. The locale was tested manually with the
full set of code points without failure.

The locale is harmonized with locales already shipping in Gentoo,
Debian, Ubuntu, Fedora, CentOS Stream, and RHEL. A new tst-iconv9 test
is added which verifies the C.UTF-8 locale is generally usable.

Testing for fnmatch, regexec, and regcomp is provided by extending
bug-regex1, bug-regex19, bug-regex4, bug-regex6, transbug, tst-fnmatch,
tst-regcomp-truncated, and tst-regex to use C.UTF-8.

Tested on x86_64 and i686 without regression.

7de41f9... by Carlos-0

Add 'strcmp_collation' support for LC_COLLATE.

Support a new directive 'strcmp_collation' in the LC_COLLATE
section of a locale source file. This new directive causes all
collation rules to be dropped and instead 'strcmp' is used for
collation of the input character set. This is required to allow
for a C.UTF-8 that contains zero collation rules (minimal size)
and sorts using code point sorting.
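From an application's point of view, the effect described above should be visible through the standard `strcoll` interface. The sketch below is an illustration, not part of the commit: it assumes the C.UTF-8 locale may or may not be installed, and `use_c_utf8_collation` and `sort_by_collation` are hypothetical helper names. If the locale is missing, the process stays in its current locale (the "C" locale by default), which also collates by byte value.

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* qsort callback: compare two strings with the current LC_COLLATE.  */
static int
collate_cmp (const void *a, const void *b)
{
  return strcoll (*(const char *const *) a, *(const char *const *) b);
}

/* Try to select C.UTF-8 for collation.  Returns 1 on success, or 0 if
   the locale is not available on this system.  */
static int
use_c_utf8_collation (void)
{
  return setlocale (LC_COLLATE, "C.UTF-8") != NULL;
}

/* Sort an array of n strings in place using strcoll.  */
static void
sort_by_collation (const char **v, size_t n)
{
  qsort (v, n, sizeof v[0], collate_cmp);
}
```

A caller would invoke `use_c_utf8_collation ()` once at startup and then sort as usual. With strcmp-based collation the resulting order is expected to match code point order, for example "A", "z", U+00E9, U+20AC.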

Tested on x86_64 and i686 without regression.

52d5eb7... by Carlos-0

Add support for locales with zero collation rules.

While there is code to handle 'nrules == 0' in various locations
within posix/fnmatch_loop.c, posix/regcomp.c and posix/regexec.c,
these conditionals do not work. The only collation with zero
rules in effect today is the builtin C/POSIX locale, which is
built by hand, and despite having zero rules it has collseqmb
and collseqwc tables stored in the locale data. These tables are
simple identity tables which are not actually required and could
be removed at a later date after this change. The changes are in
order to prepare for C.UTF-8 which has zero rules and has no
collation sequence tables (multibyte or widechar).
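To illustrate why identity tables are redundant when there are zero rules, here is a minimal sketch of a collation sequence lookup. It is not the glibc implementation: `struct collate_info` and `collseq_of_byte` are hypothetical names, and only the field names `nrules` and `collseqmb` are taken from the commit message.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the collation data of a loaded locale.  */
struct collate_info
{
  uint32_t nrules;                 /* Number of collation rules.  */
  const unsigned char *collseqmb;  /* 256-entry table, or NULL if absent.  */
};

/* Return the collation sequence value for a single byte.  With zero
   rules (or no table at all) the byte value itself is the sequence
   value, which is exactly what an identity table would return.  */
static unsigned int
collseq_of_byte (const struct collate_info *ci, unsigned char ch)
{
  if (ci->nrules == 0 || ci->collseqmb == NULL)
    return ch;
  return ci->collseqmb[ch];
}
```

Under these assumptions, a zero-rule locale that ships an identity table and one that ships no table at all give the same answer for every byte, so callers can skip the table whenever `nrules == 0`.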

No regressions on x86_64 or i686.

c37fc3e... by Carlos-0

Update libc.pot for 2.34 release.

91cc803... by H.J. Lu

x86-64: Add Avoid_Short_Distance_REP_MOVSB

commit 3ec5d83d2a237d39e7fd6ef7a0bc8ac4c171a4a5
Author: H.J. Lu
Date: Sat Jan 25 14:19:40 2020 -0800

    x86-64: Avoid rep movsb with short distance [BZ #27130]

introduced some regressions on Intel processors without Fast Short REP
MOV (FSRM). Add Avoid_Short_Distance_REP_MOVSB to avoid rep movsb with
short distance only on Intel processors with FSRM. bench-memmove-large
on Skylake server shows that the cycle counts of
__memmove_evex_unaligned_erms improve for the following data sizes:

                                    before    after  Improvement
length=4127, align1=3, align2=0:    479.38   349.25  27%
length=4223, align1=9, align2=5:    405.62   333.25  18%
length=8223, align1=3, align2=0:    786.12   496.38  37%
length=8319, align1=9, align2=5:    727.50   501.38  31%
length=16415, align1=3, align2=0:  1436.88   840.00  41%
length=16511, align1=9, align2=5:  1375.50   836.38  39%
length=32799, align1=3, align2=0:  2890.00  1860.12  36%
length=32895, align1=9, align2=5:  2891.38  1931.88  33%
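The idea of gating the short-distance check behind a per-CPU preference can be sketched in C as follows. This is only an illustration: the real logic lives in glibc's x86-64 assembly, and the function name `should_use_rep_movsb`, the `avoid_short_distance` parameter, and the `SHORT_DISTANCE` threshold of 64 bytes are all assumptions made for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical threshold: distances below this count as "short".  */
#define SHORT_DISTANCE 64

/* Decide whether a copy may use rep movsb.  avoid_short_distance
   models a per-CPU preference such as Avoid_Short_Distance_REP_MOVSB:
   when it is clear, the distance check is skipped entirely.  */
static bool
should_use_rep_movsb (const void *dst, const void *src,
                      bool avoid_short_distance)
{
  if (!avoid_short_distance)
    return true;

  uintptr_t d = (uintptr_t) dst;
  uintptr_t s = (uintptr_t) src;
  uintptr_t distance = d > s ? d - s : s - d;

  return distance >= SHORT_DISTANCE;
}
```

In this sketch, processors that do not set the preference never pay for the distance check, which corresponds to restricting the workaround to the processors that need it.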

c25c321... by H.J. Lu

Typo: Rename HAVE_CLONE3_WAPPER to HAVE_CLONE3_WRAPPER

5f18453... by Florian Weimer

build-many-glibcs.py: Add x86_64-linux-gnu-minimal configuration

This configuration exercises various --disable-* configure options.
It is expected to catch -Werror failures that only affect these
configurations.

70d08ba... by Siddhesh Poyarekar

tests: use xmalloc to allocate implementation array

The benchmark and tests must fail in case of allocation failure in the
implementation array. Also annotate the x* allocators in support.h so
that the compiler has more information about them.
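A minimal sketch of an annotated, fail-fast allocator in the spirit of the x* allocators follows. It is not the code from support.h: `demo_xmalloc` is a hypothetical name, the error message is invented for the example, and the attribute set shown is an assumption based on standard GCC function attributes.

```c
#include <stdio.h>
#include <stdlib.h>

/* Allocate n bytes or terminate the process.  The attributes tell the
   compiler that the result is a fresh allocation of n bytes, is never
   NULL, and should not be ignored.  */
__attribute__ ((malloc, alloc_size (1), returns_nonnull, warn_unused_result))
static void *
demo_xmalloc (size_t n)
{
  /* Request at least one byte so that a zero-size request cannot
     legitimately come back as NULL.  */
  void *p = malloc (n != 0 ? n : 1);
  if (p == NULL)
    {
      fprintf (stderr, "demo_xmalloc: out of memory (%zu bytes)\n", n);
      abort ();
    }
  return p;
}
```

A test that allocates its implementation array this way fails immediately on allocation failure instead of continuing with a NULL pointer, and callers need no NULL check of their own.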

Reviewed-by: Florian Weimer

b8e8bb3... by Siddhesh Poyarekar

xmalloc: Fix warnings with gcc analyzer

Tell the compiler that the xmalloc family of allocators always returns
non-NULL. xrealloc in locale/programs also always returns non-NULL,
but that conflicts with the default realloc behaviour and with that of
xrealloc in libsupport, so keep it as is for now and resolve the
differences later.

Reviewed-by: Florian Weimer

4aedc25... by Siddhesh Poyarekar

__cxa_thread_atexit_impl: Abort on allocation failure [BZ #18524]

Abort in the unlikely event that allocation fails when trying to
register a TLS destructor.

Reviewed-by: Florian Weimer