glibc:nsz/math

Last commit made on 2017-09-29
Get this branch:
git clone -b nsz/math https://git.launchpad.net/glibc

Branch information

Name:
nsz/math
Repository:
lp:glibc

Recent commits

dc230e2... by Szabolcs Nagy <email address hidden>

Do not wrap logf, log2f and powf

The new generic logf, log2f and powf code no longer needs wrappers
because it sets errno inline, so the wrappers are used only on targets
that need them.

2017-09-19 Szabolcs Nagy <email address hidden>

 * sysdeps/ieee754/flt-32/e_log2f.c (__log2f): Define without wrapper.
 * sysdeps/ieee754/flt-32/e_logf.c (__logf): Likewise.
 * sysdeps/ieee754/flt-32/e_powf.c (__powf): Likewise.
 * sysdeps/ieee754/flt-32/w_log2f.c: New file.
 * sysdeps/ieee754/flt-32/w_logf.c: New file.
 * sysdeps/ieee754/flt-32/w_powf.c: New file.
 * sysdeps/i386/fpu/w_log2f.c: New file.
 * sysdeps/i386/fpu/w_logf.c: New file.
 * sysdeps/i386/fpu/w_powf.c: New file.
 * sysdeps/m68k/m680x0/fpu/w_log2f.c: New file.
 * sysdeps/m68k/m680x0/fpu/w_logf.c: New file.
 * sysdeps/m68k/m680x0/fpu/w_powf.c: New file.
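
As a rough illustration of what "set errno inline" means for logf, the sketch below classifies the special inputs and sets errno directly, with no separate wrapper function. The helper name logf_special_case is hypothetical and is not a glibc function; the sketch also skips raising floating-point exceptions, which a real implementation would do.

```c
#include <assert.h>
#include <errno.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret the bits of a float as an unsigned integer. */
static inline uint32_t asuint(float f)
{
  uint32_t u;
  memcpy(&u, &f, sizeof u);
  return u;
}

/* Hypothetical helper, not a glibc function.  Returns 1 and stores the
   result when x is a special case, 0 when the normal path applies.
   errno is set inline following the POSIX rules for log.  */
static int logf_special_case(float x, float *result)
{
  uint32_t ix = asuint(x);
  uint32_t abs2 = (uint32_t) (ix << 1);   /* bits with the sign shifted out */

  assert(result != NULL);

  if (abs2 == 0)                 /* +0 or -0: pole error */
    {
      errno = ERANGE;
      *result = -INFINITY;
      return 1;
    }
  if (ix == 0x7f800000u)         /* +inf: returned unchanged, no error */
    {
      *result = x;
      return 1;
    }
  if (abs2 > 0xff000000u)        /* NaN: propagated, errno untouched */
    {
      *result = x;
      return 1;
    }
  if (ix & 0x80000000u)          /* negative, including -inf: domain error */
    {
      errno = EDOM;
      *result = NAN;
      return 1;
    }
  return 0;                      /* positive finite: normal computation */
}
```

A caller would clear errno, call the helper, and fall through to the polynomial evaluation only when it returns 0.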

7dc9a7c... by Szabolcs Nagy <email address hidden>

Do not wrap expf and exp2f

The new generic expf and exp2f code no longer needs wrappers because it
sets errno inline, so the wrappers are used only on targets that need
them. (If the wrapper is needed, the top-level wrapper code is included;
otherwise an empty w_exp*f.c is used to suppress the wrapper.)

A powerpc64 expf implementation includes the expf C code directly, so
it needed some changes as well.

2017-09-25 Szabolcs Nagy <email address hidden>
     H.J. Lu <email address hidden>

 * sysdeps/ieee754/flt-32/e_exp2f.c (__exp2f): Define without wrapper.
 * sysdeps/ieee754/flt-32/e_expf.c (__expf): Likewise.
 * sysdeps/ieee754/flt-32/w_exp2f.c: New file.
 * sysdeps/ieee754/flt-32/w_expf.c: New file.
 * sysdeps/powerpc/powerpc64/fpu/multiarch/e_expf-ppc64.c: Update for
 the new expf code.
 * sysdeps/powerpc/powerpc64/fpu/multiarch/w_expf.c: New file.
 * sysdeps/powerpc/powerpc64/power8/fpu/w_expf.c: New file.
 * sysdeps/m68k/m680x0/fpu/w_exp2f.c: New file.
 * sysdeps/m68k/m680x0/fpu/w_expf.c: New file.
 * sysdeps/i386/fpu/w_exp2f.c: New file.
 * sysdeps/i386/fpu/w_expf.c: New file.
 * sysdeps/i386/i686/fpu/multiarch/w_expf.c: New file.
 * sysdeps/x86_64/fpu/w_expf.c: New file.
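
For expf the errors reported inline are range errors rather than domain errors. The sketch below only shows which inputs would lead to ERANGE; the helper name expf_range_errno is hypothetical, and the two thresholds are single-precision approximations of 128*ln(2) and 150*ln(2) chosen for this illustration, so inputs very close to them may be classified differently by a real implementation.

```c
#include <assert.h>
#include <errno.h>
#include <math.h>

/* Hypothetical helper: the errno value a POSIX-style expf would be
   expected to report for x, or 0 when no error applies.  */
static int expf_range_errno(float x)
{
  if (x != x)                            /* NaN: no error */
    return 0;
  if (x == INFINITY || x == -INFINITY)   /* exact results: +inf or 0 */
    return 0;
  if (x > 0x1.62e42ep6f)                 /* about 88.72: result overflows */
    return ERANGE;
  if (x < -0x1.9fe368p6f)                /* about -103.97: result underflows to 0 */
    return ERANGE;
  return 0;
}
```

Because the check sits next to the computation, no second pass over the arguments is needed after the result is produced.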

309a2d3... by Szabolcs Nagy <email address hidden>

New symbol version for logf, log2f and powf without SVID compat

This patch changes the logf, log2f and powf error handling semantics
to only set errno according to POSIX rules. A new symbol version is
introduced at GLIBC_2.27.

The old wrappers are kept for compat symbols.

ia64 needed an assembly change so that the new and compat versioned
symbols map to the same function.

All Linux libm abilists are updated.

2017-09-19 Szabolcs Nagy <email address hidden>

 * math/Versions (logf): New libm symbol at GLIBC_2.27.
 (log2f): Likewise.
 (powf): Likewise.
 * math/w_log2f.c: New file.
 * math/w_logf.c: New file.
 * math/w_powf.c: New file.
 * math/w_log2f_compat.c (__log2f_compat): For compat symbol only.
 * math/w_logf_compat.c (__logf_compat): Likewise.
 * math/w_powf_compat.c (__powf_compat): Likewise.
 * sysdeps/ia64/fpu/e_log2f.S: Add versioned symbols.
 * sysdeps/ia64/fpu/e_logf.S: Likewise.
 * sysdeps/ia64/fpu/e_powf.S: Likewise.
 * sysdeps/unix/sysv/linux/aarch64/libm.abilist: Update.
 * sysdeps/unix/sysv/linux/alpha/libm.abilist: Likewise.
 * sysdeps/unix/sysv/linux/arm/libm.abilist: Likewise.
 * sysdeps/unix/sysv/linux/hppa/libm.abilist: Likewise.
 * sysdeps/unix/sysv/linux/i386/libm.abilist: Likewise.
 * sysdeps/unix/sysv/linux/ia64/libm.abilist: Likewise.
 * sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Likewise.
 * sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Likewise.
 * sysdeps/unix/sysv/linux/microblaze/libm.abilist: Likewise.
 * sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Likewise.
 * sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Likewise.
 * sysdeps/unix/sysv/linux/nios2/libm.abilist: Likewise.
 * sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist:
 Likewise.
 * sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist:
 Likewise.
 * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist:
 Likewise.
 * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist:
 Likewise.
 * sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Likewise.
 * sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Likewise.
 * sysdeps/unix/sysv/linux/sh/libm.abilist: Likewise.
 * sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Likewise.
 * sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Likewise.
 * sysdeps/unix/sysv/linux/tile/tilegx/tilegx32/libm.abilist:
 Likewise.
 * sysdeps/unix/sysv/linux/tile/tilegx/tilegx64/libm.abilist:
 Likewise.
 * sysdeps/unix/sysv/linux/tile/tilepro/libm.abilist: Likewise.
 * sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Likewise.
 * sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Likewise.

e260a9f... by Szabolcs Nagy <email address hidden>

New generic powf

without wrapper on aarch64:
powf reciprocal-throughput: 4.2x faster
powf latency: 2.6x faster
old worst-case error: 1.11 ulp
new worst-case error: 0.82 ulp
aarch64 .text size: -780 bytes
aarch64 .rodata size: +144 bytes

powf(x,y) is implemented as exp2(y*log2(x)) with the same algorithms
that are used in exp2f and log2f, except that the log2f polynomial is
larger for extra precision and its output (and thus the exp2f input)
may be scaled by a power of 2 (POWF_SCALE) to simplify the argument
reduction step of exp2 (possible when efficient round-to-integer and
convert-to-integer operations are available).
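
One way to picture the scaling: exp2 reduces its input as x = k/N + r with N table entries per octave, which needs round(x*N). If the log2 step already returns N*log2(x), the multiplication disappears. The sketch below is an illustration under assumptions, not the glibc code: TABLE_BITS is an assumed value, the function names are hypothetical, and the add-and-subtract trick stands in for a hardware round-to-integer instruction.

```c
#include <assert.h>
#include <stdint.h>

#define TABLE_BITS 5              /* assumed: 2^5 table entries per octave */
#define N (1 << TABLE_BITS)
#define SHIFT 0x1.8p52            /* adding it pushes fraction bits out of a double */

/* Portable stand-in for a round-to-nearest-integer instruction.
   Only valid for |x| < 2^51.  */
static double round_to_int(double x)
{
  assert(x > -0x1p51 && x < 0x1p51);
  volatile double t = x + SHIFT;  /* volatile: force rounding to double */
  return t - SHIFT;
}

/* Unscaled input: x = k/N + r, so the reduction multiplies by N first. */
static int64_t reduce_unscaled(double x, double *r)
{
  double kd = round_to_int(x * N);
  *r = x - kd / N;
  return (int64_t) kd;
}

/* Input already scaled by N: xs = k + N*r, no multiplication needed. */
static int64_t reduce_scaled(double xs, double *r_scaled)
{
  double kd = round_to_int(xs);
  *r_scaled = xs - kd;
  return (int64_t) kd;
}
```

Both functions return the same k; the scaled variant returns the remainder multiplied by N, which can be absorbed into the polynomial coefficients.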

The special case handling tries to minimize the checks in the hot path.
When the input of exp2_inline is checked, integer arithmetic is used
because that was faster on the tested aarch64 cores.
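
A minimal sketch of such an integer range check follows. It relies on the fact that non-negative IEEE 754 doubles are ordered the same way as their bit patterns read as unsigned integers. The helper name abs_at_least is hypothetical, and the exact form of the check in e_powf.c may differ.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret the bits of a double as an unsigned integer. */
static inline uint64_t asuint64(double x)
{
  uint64_t u;
  memcpy(&u, &x, sizeof u);
  return u;
}

/* Hypothetical helper: nonzero when |x| >= limit or when x is NaN,
   using only integer operations.  limit must be positive and finite.
   NaN has a larger bit pattern than any finite value, so it is sent
   to the slow path together with out-of-range inputs.  */
static int abs_at_least(double x, double limit)
{
  assert(limit > 0.0 && limit < INFINITY);
  uint64_t ax = asuint64(x) & 0x7fffffffffffffffULL;  /* clear the sign bit */
  return ax >= asuint64(limit);
}
```

A single unsigned comparison therefore covers overflow, underflow and NaN candidates at once, leaving the detailed classification to the rarely taken branch.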

2017-09-19 Szabolcs Nagy <email address hidden>

 * math/Makefile (type-float-routines): Add e_powf_log2_data.
 * sysdeps/ieee754/flt-32/e_powf.c: New implementation.
 * sysdeps/ieee754/flt-32/e_powf_log2_data.c: New file.
 * sysdeps/ieee754/flt-32/math_config.h (__powf_data): Define.
 (issignalingf_inline): Likewise.
 (POWF_LOG2_TABLE_BITS): Likewise.
 (POWF_LOG2_POLY_ORDER): Likewise.
 (POWF_SCALE_BITS): Likewise.
 (POWF_SCALE): Likewise.
 * sysdeps/i386/fpu/e_powf_log2_data.c: New file.
 * sysdeps/ia64/fpu/e_powf_log2_data.c: New file.
 * sysdeps/m68k/m680x0/fpu/e_powf_log2_data.c: New file.

eb38f71... by Szabolcs Nagy <email address hidden>

New generic log2f

Similar to the new logf: it uses double-precision arithmetic and a
small lookup table. The argument reduction step is the same as in
the new logf.

without wrapper on aarch64:
log2f reciprocal-throughput: 2.3x faster
log2f latency: 2.1x faster
old worst case error: 1.72 ulp
new worst case error: 0.75 ulp
aarch64 .text size: -252 bytes
aarch64 .rodata size: +244 bytes

2017-09-19 Szabolcs Nagy <email address hidden>

 * math/Makefile (type-float-routines): Add e_log2f_data.
 * sysdeps/ieee754/flt-32/e_log2f.c: New implementation.
 * sysdeps/ieee754/flt-32/e_log2f_data.c: New file.
 * sysdeps/ieee754/flt-32/math_config.h (__log2f_data): Define.
 (LOG2F_TABLE_BITS, LOG2F_POLY_ORDER): Define.
 * sysdeps/i386/fpu/e_log2f_data.c: New file.
 * sysdeps/ia64/fpu/e_log2f_data.c: New file.
 * sysdeps/m68k/m680x0/fpu/e_log2f_data.c: New file.

90c42e4... by Szabolcs Nagy <email address hidden>

missed ChangeLog entry

bf27d39... by Szabolcs Nagy <email address hidden>

New generic logf

without wrapper on aarch64:
logf reciprocal-throughput: 2.2x faster
logf latency: 1.9x faster
old worst case error: 0.89 ulp
new worst case error: 0.82 ulp
aarch64 .text size: -356 bytes
aarch64 .rodata size: +240 bytes

Uses double-precision arithmetic and a lookup table to allow a smaller
polynomial and to avoid the use of division.

Data is in a separate translation unit with a fixed layout to prevent
the compiler from generating suboptimal literal accesses.

Errors are handled inline according to POSIX rules, but this patch
keeps the wrapper with SVID compatible error handling.

Needs libm-test-ulps adjustment for clogf in non-nearest rounding mode.

 * math/Makefile (type-float-routines): Add e_logf_data.
 * sysdeps/ieee754/flt-32/e_logf.c: New implementation.
 * sysdeps/ieee754/flt-32/e_logf_data.c: New file.
 * sysdeps/ieee754/flt-32/math_config.h (__logf_data): Define.
 (LOGF_TABLE_BITS, LOGF_POLY_ORDER): Define.
 * sysdeps/i386/fpu/e_logf_data.c: New file.
 * sysdeps/ia64/fpu/e_logf_data.c: New file.
 * sysdeps/m68k/m680x0/fpu/e_logf_data.c: New file.
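
The table-based reduction can be sketched as follows: x is split as x = 2^k * z with z in a fixed interval around 1, and the top mantissa bits of z select a table entry i. With a table of 1/c_i and log(c_i) for a point c_i in each subinterval, log(x) = k*ln2 + log(c_i) + log1p(z/c_i - 1), so only a multiplication by the stored reciprocal is needed instead of a division. The code below shows the splitting step only; TABLE_BITS, OFF and the names are assumptions made for this illustration, and the input must be positive, finite and normal.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

#define TABLE_BITS 4              /* assumed: 16 subintervals */
#define N (1 << TABLE_BITS)
#define OFF 0x3f330000u           /* assumed: bits of about 0.7, lower edge of z */

static inline uint32_t asuint(float f)
{
  uint32_t u;
  memcpy(&u, &f, sizeof u);
  return u;
}

static inline float asfloat(uint32_t u)
{
  float f;
  memcpy(&f, &u, sizeof f);
  return f;
}

struct reduced
{
  int k;      /* exponent: x = 2^k * z */
  int i;      /* table index selected by the top bits of z */
  float z;    /* reduced argument in roughly [0.7, 1.4) */
};

/* Hypothetical helper; x must be positive, finite and normal. */
static struct reduced reduce_logf_arg(float x)
{
  assert(x >= 0x1p-126f && x < INFINITY);
  uint32_t ix = asuint(x);
  uint32_t tmp = ix - OFF;        /* wraps around for x below OFF */
  struct reduced out;
  out.i = (int) ((tmp >> (23 - TABLE_BITS)) % N);
  /* Relies on two's complement conversion and arithmetic right shift,
     which is what GCC and Clang provide.  */
  out.k = (int32_t) tmp >> 23;
  out.z = asfloat(ix - (tmp & 0xff800000u));
  return out;
}
```

For example, reduce_logf_arg(8.0f) yields k = 3 and z = 1.0f, and reduce_logf_arg(0.5f) yields k = -1 and z = 1.0f.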

4088d8d... by "H.J. Lu" <email address hidden>

x86: Allow undefined _DYNAMIC in static executable

When --enable-static-pie is used to build static PIE, _DYNAMIC is used
to compute the load address of the static PIE. But _DYNAMIC is
undefined when creating a static executable. This patch makes _DYNAMIC
weak in the PIE libc.a so that it can be undefined.

 * sysdeps/i386/dl-machine.h (elf_machine_load_address): Allow
 undefined _DYNAMIC in PIE libc.a.
 * sysdeps/x86_64/dl-machine.h (elf_machine_load_address):
 Likewise.
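
The general mechanism can be sketched with any weak reference: a symbol declared weak may stay undefined at link time, in which case its address is null and the code can test for that. The symbol hypothetical_dynamic below is a stand-in invented for this example (nothing defines it), and the sketch assumes a GCC-compatible compiler and an ELF target; it is not the code in dl-machine.h.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for _DYNAMIC: declared weak and defined nowhere,
   so the linker resolves its address to 0 instead of failing.  */
extern const char hypothetical_dynamic[] __attribute__((weak));

/* Returns nonzero only when some object file actually defines the symbol. */
static int have_dynamic_section(void)
{
  return hypothetical_dynamic != NULL;
}
```

Code that needs the symbol can then take a fallback path (for example, assuming a load address of 0) when have_dynamic_section() returns 0.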

4d3693e... by Wilco Dijkstra <email address hidden>

Remove ancient __signbit inlines

Remove __signbit inlines from mathinline.h. math.h already uses
the builtin when supported, so the additional inlines are only used
with GCC versions before 4.0. Similarly remove ancient copysign and
fabs inlines.

 * sysdeps/alpha/fpu/bits/mathinline.h: Delete file.
 * sysdeps/ia64/fpu/bits/mathinline.h: Delete file.
 * sysdeps/m68k/coldfire/fpu/bits/mathinline.h: Delete file.
 * sysdeps/m68k/m680x0/fpu/bits/mathinline.h (__signbitf): Remove.
 (__signbit): Remove.
 (__signbitl): Remove.
 * sysdeps/powerpc/bits/mathinline.h (__signbitf): Remove.
 (__signbit): Remove.
 (__signbitl): Remove.
 * sysdeps/s390/fpu/bits/mathinline.h (__signbitf): Remove.
 (__signbit): Remove.
 (__signbitl): Remove.
 * sysdeps/sparc/fpu/bits/mathinline.h (__signbitf): Remove.
 (__signbit): Remove.
 (__signbitl): Remove.
 * sysdeps/tile/bits/mathinline.h: Delete file.
 * sysdeps/x86/fpu/bits/mathinline.h (__signbitf): Remove.
 (__signbit): Remove.
 (__signbitl): Remove.
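
For reference, a sign test of the kind the removed inlines performed by hand can be written as a plain bit extraction, and the standard signbit macro from math.h gives the same answer. The function names below are hypothetical; this is a sketch for comparison, not the deleted code.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Hand-written sign test: extract the top bit of the representation. */
static int signbit_by_bits(float x)
{
  uint32_t u;
  assert(sizeof u == sizeof x);
  memcpy(&u, &x, sizeof u);
  return (int) (u >> 31);
}

/* The same test through the C99 macro, which modern compilers
   expand to a builtin without a function call.  */
static int signbit_by_macro(float x)
{
  return signbit(x) != 0;
}
```

Both report 1 for -0.0f and 0 for 0.0f, a distinction that a comparison such as x < 0 cannot make.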

1e6d072... by Wilco Dijkstra <email address hidden>

Simplify C99 isgreater macros

Simplify the C99 isgreater macros. Although some support was added
in GCC 2.97, not all targets added support until GCC 3.1. Therefore
only use the builtins in math.h from GCC 3.1 onwards, and defer to
generic macros otherwise. Improve the generic isunordered macro
to use comparisons rather than calling fpclassify twice; this is not
only faster but also correct for signaling NaNs.

 * math/math.h: Improve handling of C99 isgreater macros.
 * sysdeps/alpha/fpu/bits/mathinline.h: Remove isgreater macros.
 * sysdeps/m68k/m680x0/fpu/bits/mathinline.h: Likewise.
 * sysdeps/powerpc/bits/mathinline.h: Likewise.
 * sysdeps/sparc/fpu/bits/mathinline.h: Likewise.
 * sysdeps/x86/fpu/bits/mathinline.h: Likewise.
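
A comparison-based unordered test can be sketched as below: a value is a NaN exactly when it compares unequal to itself, so two self-comparisons replace two fpclassify calls. The function names are hypothetical and the exact expression used by the generic macro in math.h may differ; the sketch also assumes the compiler honours IEEE comparison semantics (no -ffast-math).

```c
#include <assert.h>
#include <math.h>

/* Nonzero when x and y are unordered, that is, when either is a NaN. */
static int unordered_by_compare(double x, double y)
{
  return x != x || y != y;
}

/* A generic isgreater built on top of it: false whenever a NaN is involved. */
static int greater_if_ordered(double x, double y)
{
  return !unordered_by_compare(x, y) && x > y;
}
```

For ordinary and quiet NaN inputs the result agrees with the standard isunordered and isgreater macros.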