Load specific external function addresses via GOT slot
Drawbacks of -fno-plt and the noplt attribute:
1. -fno-plt may force locally defined functions to be called through
their GOT slots with an indirect branch instead of a direct branch.
2. The noplt attribute doesn't work on libcalls of builtin functions.
3. The noplt attribute requires modifying source code, which may not be
desirable for third-party source packages.
Add -fno-plt=file and -fno-plt=[symbol,...] options to specify which
external function addresses should be loaded from the GOT slot to avoid
the PLT. We don't set a REG_EQUAL note for external function symbols
whose addresses are loaded from GOT slots, so that the load from the GOT
slot isn't optimized away in favor of a register load.
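The three drawbacks listed above can be seen in one small translation unit. This is a minimal sketch, not code from the patch: the names `NOPLT`, `local_add`, `add_three`, `env_or_default` and `copy_big` are hypothetical, and the command line in the first comment is an assumed use of the option proposed here.

```c
/* Assumed invocation of the proposed option (not verified against a
   compiler carrying this patch):
     gcc -O2 -fno-plt=getenv,memcpy example.c  */
#include <stdlib.h>

/* Hypothetical helper: expands to the noplt attribute only where the
   compiler reports support for it, and to nothing elsewhere.  */
#if defined(__has_attribute)
# if __has_attribute(noplt)
#  define NOPLT __attribute__((noplt))
# endif
#endif
#ifndef NOPLT
# define NOPLT
#endif

/* Drawback 1: this function is defined locally, yet a plain -fno-plt
   may still route calls to it through its GOT slot.  */
int local_add(int a, int b)
{
  return a + b;
}

int add_three(int a, int b, int c)
{
  return local_add(local_add(a, b), c);
}

/* Drawback 3: using the attribute means editing the declaration,
   which is awkward for third-party sources.  */
extern char *getenv(const char *name) NOPLT;

const char *env_or_default(const char *name, const char *fallback)
{
  const char *value = getenv(name);
  return value ? value : fallback;
}

/* Drawback 2: the compiler may expand this structure copy into a
   memcpy libcall.  There is no declaration in the source to attach
   the attribute to.  */
struct big { char bytes[256]; };

void copy_big(struct big *dst, const struct big *src)
{
  *dst = *src;
}
```

Naming the symbols on the command line is meant to cover all three cases without touching the source: only the listed external functions are reached through the GOT, and `local_add` keeps its direct branch.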
gcc/
PR target/67400
* Makefile.in (OBJS): Add noplt-symbols.o.
* common.opt (fno-plt=): New option.
* explow.c (force_reg): Don't set REG_EQUAL if
targetm.cannot_set_reg_equal_const returns true.
* expr.c (emit_move_insn): Likewise.
* noplt-symbols.c: New file.
* noplt-symbols.h: Likewise.
* target.def (cannot_set_reg_equal_const): New target hook.
* toplev.c: Include "noplt-symbols.h".
(process_options): Call noplt_symbols_initialize.
(toplev::main): Call noplt_symbols_finish.
* config/i386/i386-protos.h (ix86_noplt_operand): New.
(ix86_noplt_addr_symbol_rtx): Likewise.
* config/i386/i386.c: Include "noplt-symbols.h".
(ix86_noplt_rtx_p): New function.
(ix86_noplt_operand): Likewise.
(ix86_noplt_addr_symbol_rtx): Likewise.
(ix86_cannot_set_reg_equal_const): Likewise.
(ix86_function_ok_for_sibcall): Replace flag_plt with
!noplt_decl_p.
(ix86_legitimate_address_p): Allow UNSPEC_GOT and UNSPEC_GOTPCREL
if ix86_noplt_addr_symbol_rtx doesn't return NULL.
(ix86_print_operand_address): Support UNSPEC_GOT and
UNSPEC_GOTPCREL if ix86_noplt_addr_symbol_rtx doesn't return
NULL.
(ix86_expand_move): Load the external function address via the
GOT slot if ix86_noplt_operand returns true.
(ix86_expand_call): Replace flag_plt and noplt attribute check
with !ix86_noplt_rtx_p.
(ix86_nopic_noplt_attribute_p): Call ix86_noplt_rtx_p.
(TARGET_CANNOT_SET_REG_EQUAL_CONST): New.
* config/i386/i386.h (SYMBOL_FLAG_PLT): New.
(SYMBOL_REF_PLT_P): Likewise.
(SYMBOL_FLAG_NOPLT): Likewise.
(SYMBOL_REF_NOPLT_P): Likewise.
* doc/tm.texi.in (TARGET_CANNOT_SET_REG_EQUAL_CONST): New hook.
* doc/tm.texi: Updated.
Compare address of external function via its GOT slot
Load the address of an external function from its GOT slot for
-fno-plt -fno-pic if the assembler and linker support R_386_GOT32X and
R_X86_64_GOTPCRELX, to avoid the PLT slot. R_386_GOT32X and
R_X86_64_GOTPCRELX instruct the linker to re-encode the instruction,
converting the load of the function address from its GOT slot into an
immediate, if the function is defined locally.
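A function-pointer comparison is the kind of source construct this change affects. The sketch below is illustrative only: `release_fn`, `uses_libc_free` and `release_nothing` are hypothetical names, and the instruction mentioned in the comment is an expectation based on the description above, not verified compiler output.

```c
#include <stdlib.h>

typedef void (*release_fn)(void *);

/* Returns nonzero when FN is the C library's free.  With
   -fno-plt -fno-pic on x86-64, the address of free would be read
   from its GOT slot, for example with a compare against
   free@GOTPCREL(%rip), rather than taken from a PLT entry.  */
int uses_libc_free(release_fn fn)
{
  return fn == free;
}

/* A locally defined function to compare against.  */
void release_nothing(void *p)
{
  (void) p;
}
```

If `free` were defined in the same link unit, the relaxable relocation would let the linker rewrite the GOT load into an immediate, so the comparison would not pay for the extra memory access.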
Generate R_386_GOT32X relocation for -fno-plt -fno-pic
This patch extends -fno-plt to non-PIC code on x86. -fno-plt works in
64-bit mode with the existing binutils. For 32-bit, we need an updated
assembler and linker that support "call/jmp *foo@GOT", which accesses the
GOT slot without a base register, using the new R_386_GOT32X relocation.
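As a concrete case, the function below contains one ordinary call and one call in tail position to an external function. It is a sketch with a hypothetical name (`print_twice`); the assembly in the comments follows the "call/jmp *foo@GOT" form quoted above and is not captured compiler output.

```c
#include <stdio.h>

/* Built with -m32 -fno-plt -fno-pic, the two calls to puts are the
   ones this patch targets:
     first call:  call *puts@GOT
     tail call:   jmp  *puts@GOT
   Both reference the GOT slot without a base register, which is what
   requires the R_386_GOT32X relocation.  */
int print_twice(const char *s)
{
  if (puts(s) < 0)
    return -1;
  return puts(s);
}
```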
gcc/
* config/i386/i386.c (ix86_nopic_noplt_attribute_p): Check
HAVE_LD_R_386_GOT32X == 0 before returning false.
(ix86_output_call_insn): Generate "%!jmp/call\t*%p0@GOT" for
32-bit.
Check if x86 binutils supports R_386_GOT32X/R_X86_64_GOTPCRELX
Define HAVE_LD_R_386_GOT32X to 1 if 32-bit x86 assembler generates
R_386_GOT32X for "jmp *foo@GOT". Define HAVE_LD_R_X86_64_GOTPCRELX
to 1 if 64-bit x86 assembler generates R_X86_64_GOTPCRELX for
"jmp *foo@GOTPCREL(%rip)".
* configure.ac (HAVE_LD_R_386_GOT32X): New. Defined to 1
if 32-bit assembler generates R_386_GOT32X for "jmp *foo@GOT".
Otherwise, defined to 0.
(HAVE_LD_R_X86_64_GOTPCRELX): New. Defined to 1 if 64-bit
x86 assembler generates R_X86_64_GOTPCRELX for
"jmp *foo@GOTPCREL(%rip)".
* config.in: Regenerated.
* configure: Likewise.
prepare_call_address in calls.c is the wrong place to handle -fno-plt.
We shouldn't force the function address into a register and hope that
the load of the function address via the GOT and the indirect call via
the register will be folded into an indirect call via the GOT, which
doesn't always happen. Also, the non-PIC case can only be handled in the
backend. Instead, the backend should expand an external function call
into an indirect call via the GOT for -fno-plt.
This patch reverts the -fno-plt handling in prepare_call_address and
handles it in ix86_expand_call. Other backends may need similar changes
to support -fno-plt. Alternatively, we can introduce a target hook to
indicate whether an external function should be called via a register
for -fno-plt so that the i386 backend can disable it in
prepare_call_address.
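The difference between the two strategies is easiest to see on a function that calls the same external function more than once. This is a sketch with a hypothetical name (`parse_sum`); the two x86-64 sequences in the comment are the usual shapes of each approach, given for illustration rather than as output of a particular compiler.

```c
#include <stdlib.h>

/* Each call to atoi under -fno-plt can take one of two shapes.

   Address forced into a register (the old prepare_call_address
   approach, when folding does not happen):
     movq  atoi@GOTPCREL(%rip), %rax
     call  *%rax

   Indirect call via the GOT (what the backend expands directly):
     call  *atoi@GOTPCREL(%rip)  */
int parse_sum(const char *a, const char *b)
{
  return atoi(a) + atoi(b);
}
```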
gcc/
PR target/67215
* calls.c (prepare_call_address): Don't handle -fno-plt here.
* config/i386/i386.c (ix86_expand_call): Generate indirect call
via GOT for -fno-plt. Support indirect call via GOT for x32.
* config/i386/predicates.md (sibcall_memory_operand): Allow
GOT memory operand.
9202af5...
by
ktkachov <ktkachov@138bc75d-0d04-0410-961f-82ee72b054a4>
[AArch64] Add support for 64-bit vector-mode ldp/stp
* config/aarch64/aarch64.c (aarch64_mode_valid_for_sched_fusion_p):
New function.
(fusion_load_store): Use it.
* config/aarch64/aarch64-ldpstp.md: Add new peephole2s for
ldp and stp in VD modes.
* config/aarch64/aarch64-simd.md (load_pair<mode>, VD): New pattern.
(store_pair<mode>, VD): Likewise.
* gcc.target/aarch64/stp_vec_64_1.c: New test.
* gcc.target/aarch64/ldp_vec_64_1.c: Likewise.
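A pair of adjacent 64-bit vector loads and stores is the pattern the new peephole2s look for. The sketch below uses the GCC vector extension so that it compiles on any target; `v2si` and `copy_pair` are hypothetical names and are not taken from the new tests, and the instructions in the comment are an expectation rather than verified output.

```c
/* A 64-bit vector of two ints, i.e. one of the VD modes on AArch64.  */
typedef int v2si __attribute__ ((vector_size (8)));

/* Loads two adjacent 64-bit vectors and stores them to adjacent
   slots.  On AArch64 with this patch, the two loads may be combined
   into a single ldp of two D registers and the two stores into a
   single stp.  */
void copy_pair(v2si *dst, const v2si *src)
{
  v2si lo = src[0];
  v2si hi = src[1];
  dst[0] = lo;
  dst[1] = hi;
}
```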