Merge lp:~ramana/gcc-linaro/a5-a15-costs-backport-4.6 into lp:gcc-linaro/4.6

Proposed by Ramana Radhakrishnan
Status: Merged
Approved by: Ulrich Weigand
Approved revision: no longer in the source branch.
Merged at revision: 106759
Proposed branch: lp:~ramana/gcc-linaro/a5-a15-costs-backport-4.6
Merge into: lp:gcc-linaro/4.6
Diff against target: 562 lines (+251/-66) (has conflicts)
10 files modified
ChangeLog.linaro (+80/-0)
gcc/config/arm/arm-cores.def (+16/-15)
gcc/config/arm/arm-protos.h (+5/-0)
gcc/config/arm/arm-tune.md (+1/-1)
gcc/config/arm/arm.c (+108/-23)
gcc/config/arm/arm.h (+15/-5)
gcc/config/arm/arm.md (+23/-1)
gcc/config/arm/thumb2.md (+0/-20)
gcc/doc/invoke.texi (+2/-1)
gcc/dojump.c (+1/-0)
Text conflict in ChangeLog.linaro
To merge this branch: bzr merge lp:~ramana/gcc-linaro/a5-a15-costs-backport-4.6
Reviewer: Linaro Toolchain Developers
Review status: Pending
Review via email: mp+64575@code.launchpad.net

Description of the change

Hi,

This contains a set of backports of the cost models and the A5 / A15 tuning that were committed recently to trunk. Part of the costs infrastructure should make the BRANCH_COST tuning and other cost tuning for the A9 a bit easier to backport from trunk, and it also allows us to add more tuning parameters for the A9.
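
As an illustration of the BRANCH_COST part, the backport replaces a single macro expression with a per-core hook stored in the tuning structure. The following is a minimal, self-contained sketch of that idea, not the GCC source: `model_target`, `model_tune` and `model_branch_cost` are hypothetical stand-ins for GCC's target state, `tune_params` and the `BRANCH_COST` macro, while the cost values are copied from the arm.c hunk in the diff further down.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the compiler state that the real hooks read
   through TARGET_32BIT, TARGET_THUMB2 and optimize.  */
struct model_target {
  bool is_32bit;
  bool is_thumb2;
  int optimize;
};

/* Hypothetical, cut-down tune_params: only the branch-cost hook.  */
struct model_tune {
  int (*branch_cost)(const struct model_target *, bool speed_p,
                     bool predictable_p);
};

/* Same arithmetic as arm_default_branch_cost in the diff.  */
static int model_default_branch_cost(const struct model_target *t,
                                     bool speed_p, bool predictable_p)
{
  (void) predictable_p;
  if (t->is_32bit)
    return (t->is_thumb2 && !speed_p) ? 1 : 4;
  return (t->optimize > 0) ? 2 : 0;
}

/* Same arithmetic as arm_cortex_a5_branch_cost in the diff: branches are
   treated as free when optimizing for speed.  */
static int model_cortex_a5_branch_cost(const struct model_target *t,
                                       bool speed_p, bool predictable_p)
{
  return speed_p ? 0 : model_default_branch_cost(t, speed_p, predictable_p);
}

static const struct model_tune model_generic_tune = { model_default_branch_cost };
static const struct model_tune model_cortex_a5_tune = { model_cortex_a5_branch_cost };

/* Plays the role of BRANCH_COST: dispatch through the selected tuning.  */
static int model_branch_cost(const struct model_tune *tune,
                             const struct model_target *t,
                             bool speed_p, bool predictable_p)
{
  return tune->branch_cost(t, speed_p, predictable_p);
}

/* A few sanity checks on the model.  */
void model_branch_cost_self_check(void)
{
  const struct model_target arm = { true, false, 2 };
  const struct model_target thumb2 = { true, true, 2 };

  assert(model_branch_cost(&model_generic_tune, &arm, true, false) == 4);
  assert(model_branch_cost(&model_generic_tune, &thumb2, false, false) == 1);
  assert(model_branch_cost(&model_cortex_a5_tune, &arm, true, false) == 0);
  assert(model_branch_cost(&model_cortex_a5_tune, &thumb2, false, false) == 1);
}
```

With the cost behind a function pointer, tuning another core only needs a new hook and a new table entry, which is presumably what makes later A9 cost backports easier.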

Some R5 tuning is also pulled back, but that is a zero-cost backport since it comes by default with the A15 div instruction work.
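
The div work splits the old single hardware-divide flag into one flag per instruction set, which is why the Cortex-R5 entry comes along with it. A small sketch of how the two flags combine, under stated assumptions: the bit positions and the shape of the test follow the FL_THUMB_DIV / FL_ARM_DIV defines and the TARGET_IDIV macro in the diff further down, while `model_mode` and `model_target_idiv` are hypothetical names introduced only for this example.

```c
#include <assert.h>
#include <stdbool.h>

/* Bit positions as defined in the arm.c hunk of the diff.  */
#define FL_THUMB_DIV (1u << 18) /* Hardware divide (Thumb mode).  */
#define FL_ARM_DIV   (1u << 23) /* Hardware divide (ARM mode).  */

/* Hypothetical stand-in for TARGET_ARM / TARGET_THUMB2 / Thumb-1.  */
enum model_mode { MODE_ARM, MODE_THUMB1, MODE_THUMB2 };

/* Mirrors the TARGET_IDIV condition: sdiv/udiv may be used only if the
   core has the divide instructions in the instruction set being used.  */
static bool model_target_idiv(unsigned insn_flags, enum model_mode mode)
{
  bool arm_hwdiv = (insn_flags & FL_ARM_DIV) != 0;
  bool thumb_hwdiv = (insn_flags & FL_THUMB_DIV) != 0;

  return (mode == MODE_ARM && arm_hwdiv)
         || (mode == MODE_THUMB2 && thumb_hwdiv);
}

/* A few sanity checks on the model.  */
void model_idiv_self_check(void)
{
  /* Flag sets as they appear in the diff: cortex-a15 lists both flags,
     cortex-r5 gets FL_THUMB_DIV from ARMv7-R plus FL_ARM_DIV, and an
     ARMv7-M core such as cortex-m3 only has FL_THUMB_DIV.  */
  const unsigned a15 = FL_THUMB_DIV | FL_ARM_DIV;
  const unsigned r5 = FL_THUMB_DIV | FL_ARM_DIV;
  const unsigned m3 = FL_THUMB_DIV;
  const unsigned a9 = 0;

  assert(model_target_idiv(a15, MODE_ARM));
  assert(model_target_idiv(r5, MODE_THUMB2));
  assert(model_target_idiv(m3, MODE_THUMB2));
  assert(!model_target_idiv(m3, MODE_ARM));
  assert(!model_target_idiv(a9, MODE_ARM));
  assert(!model_target_idiv(a15, MODE_THUMB1));
}
```

In the real patch the same condition also decides whether `__ARM_ARCH_EXT_IDIV__` is defined and whether the `divsi3` / `udivsi3` patterns are enabled.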

It would be useful to get this sanity tested once with the default --with-tune=cortex-a9, to be sure nothing else breaks now that this infrastructure is merged in.

cheers
Ramana

Linaro Toolchain Builder (cbuild) wrote :

cbuild has taken a snapshot of this branch at r106759 and queued it for build.

The snapshot is available at:
 http://ex.seabright.co.nz/snapshots/gcc-linaro-4.6+bzr106759~ramana~a5-a15-costs-backport-4.6.tar.xdelta3.xz

and will be built on the following builders:
 a9-builder i686 x86_64

You can track the build queue at:
 http://ex.seabright.co.nz/helpers/scheduler

cbuild-snapshot: gcc-linaro-4.6+bzr106759~ramana~a5-a15-costs-backport-4.6
cbuild-ancestor: lp:gcc-linaro/4.6+bzr106756
cbuild-state: check

Linaro Toolchain Builder (cbuild) wrote :

cbuild successfully built this on i686-lucid-cbuild132-scorpius-i686r1.

The build results are available at:
 http://ex.seabright.co.nz/build/gcc-linaro-4.6+bzr106759~ramana~a5-a15-costs-backport-4.6/logs/i686-lucid-cbuild132-scorpius-i686r1

The test suite results were unchanged compared to the branch point lp:gcc-linaro/4.6+bzr106756.

The full testsuite results are at:
 http://ex.seabright.co.nz/build/gcc-linaro-4.6+bzr106759~ramana~a5-a15-costs-backport-4.6/logs/i686-lucid-cbuild132-scorpius-i686r1/gcc-testsuite.txt

cbuild-checked: i686-lucid-cbuild132-scorpius-i686r1

Linaro Toolchain Builder (cbuild) wrote :

cbuild successfully built this on x86_64-maverick-cbuild132-crucis-x86_64r1.

The build results are available at:
 http://ex.seabright.co.nz/build/gcc-linaro-4.6+bzr106759~ramana~a5-a15-costs-backport-4.6/logs/x86_64-maverick-cbuild132-crucis-x86_64r1

The test suite results were unchanged compared to the branch point lp:gcc-linaro/4.6+bzr106756.

The full testsuite results are at:
 http://ex.seabright.co.nz/build/gcc-linaro-4.6+bzr106759~ramana~a5-a15-costs-backport-4.6/logs/x86_64-maverick-cbuild132-crucis-x86_64r1/gcc-testsuite.txt

cbuild-checked: x86_64-maverick-cbuild132-crucis-x86_64r1

Linaro Toolchain Builder (cbuild) wrote :

cbuild successfully built this on armv7l-maverick-cbuild132-ursa2-cortexa9r1.

The build results are available at:
 http://ex.seabright.co.nz/build/gcc-linaro-4.6+bzr106759~ramana~a5-a15-costs-backport-4.6/logs/armv7l-maverick-cbuild132-ursa2-cortexa9r1

The test suite results were unchanged compared to the branch point lp:gcc-linaro/4.6+bzr106756.

The full testsuite results are at:
 http://ex.seabright.co.nz/build/gcc-linaro-4.6+bzr106759~ramana~a5-a15-costs-backport-4.6/logs/armv7l-maverick-cbuild132-ursa2-cortexa9r1/gcc-testsuite.txt

cbuild-checked: armv7l-maverick-cbuild132-ursa2-cortexa9r1

Ulrich Weigand (uweigand) wrote :

The backport looks good to me. Given that there are no testsuite regressions either, this is OK.

Preview Diff

=== modified file 'ChangeLog.linaro'
--- ChangeLog.linaro 2011-06-14 14:09:57 +0000
+++ ChangeLog.linaro 2011-06-14 16:54:35 +0000
@@ -1,3 +1,4 @@
+<<<<<<< TREE
 2011-06-14 Andrew Stubbs <ams@codesourcery.com>
 
         gcc/
@@ -10,6 +11,85 @@
         gcc/
         * LINARO-VERSION: Update.
 
+=======
+2011-06-14 Ramana Radhakrishnan <ramana.radhakrishnan@linaro.org>
+
+        Backport from mainline.
+        2011-06-03 Julian Brown <julian@codesourcery.com>
+
+        * config/arm/arm-cores.def (strongarm, strongarm110, strongarm1100)
+        (strongarm1110): Use strongarm tuning.
+        * config/arm/arm-protos.h (tune_params): Add max_insns_skipped
+        field.
+        * config/arm/arm.c (arm_strongarm_tune): New.
+        (arm_slowmul_tune, arm_fastmul_tune, arm_xscale_tune, arm_9e_tune)
+        (arm_v6t2_tune, arm_cortex_tune, arm_cortex_a5_tune)
+        (arm_cortex_a9_tune, arm_fa726te_tune): Add max_insns_skipped field
+        setting, using previous defaults or 1 for Cortex-A5.
+        (arm_option_override): Set max_insns_skipped from current tuning.
+
+2011-06-14 Ramana Radhakrishnan <ramana.radhakrishnan@linaro.org>
+
+        Backport from mainline.
+        2011-06-02 Julian Brown <julian@codesourcery.com>
+
+        * config/arm/arm-cores.def (cortex-a5): Use cortex_a5 tuning.
+        * config/arm/arm.c (arm_cortex_a5_branch_cost): New.
+        (arm_cortex_a5_tune): New.
+
+        2011-06-02 Julian Brown <julian@codesourcery.com>
+
+        * config/arm/arm-protos.h (tune_params): Add branch_cost hook.
+        * config/arm/arm.c (arm_default_branch_cost): New.
+        (arm_slowmul_tune, arm_fastmul_tune, arm_xscale_tune, arm_9e_tune)
+        (arm_v6t2_tune, arm_cortex_tune, arm_cortex_a9_tune)
+        (arm_fa726_tune): Set branch_cost field using
+        arm_default_branch_cost.
+        * config/arm/arm.h (BRANCH_COST): Use branch_cost hook from
+        current_tune structure.
+        * dojump.c (tm_p.h): Include file.
+
+        2011-06-02 Julian Brown <julian@codesourcery.com>
+
+        * config/arm/arm-cores.def (arm1156t2-s, arm1156t2f-s): Use v6t2
+        tuning.
+        (cortex-a5, cortex-a8, cortex-a15, cortex-r4, cortex-r4f, cortex-m4)
+        (cortex-m3, cortex-m1, cortex-m0): Use cortex tuning.
+        * config/arm/arm-protos.h (tune_params): Add prefer_constant_pool
+        field.
+        * config/arm/arm.c (arm_slowmul_tune, arm_fastmul_tune)
+        (arm_xscale_tune, arm_9e_tune, arm_cortex_a9_tune)
+        (arm_fa726te_tune): Add prefer_constant_pool setting.
+        (arm_v6t2_tune, arm_cortex_tune): New.
+        * config/arm/arm.h (TARGET_USE_MOVT): Make dependent on
+        prefer_constant_pool setting.
+
+2011-06-14 Ramana Radhakrishnan <ramana.radhakrishnan@linaro.org>
+
+        Backport from mainline
+        2011-06-01 Paul Brook <paul@cpodesourcery.com>
+
+        * config/arm/arm-cores.def: Add cortex-r5. Add DIV flags to
+        Cortex-A15.
+        * config/arm/arm-tune.md: Regenerate.
+        * config/arm/arm.c (FL_DIV): Rename...
+        (FL_THUMB_DIV): ... to this.
+        (FL_ARM_DIV): Define.
+        (FL_FOR_ARCH7R, FL_FOR_ARCH7M): Use FL_THUMB_DIV.
+        (arm_arch_hwdiv): Remove.
+        (arm_arch_thumb_hwdiv, arm_arch_arm_hwdiv): New variables.
+        (arm_issue_rate): Add cortexr5.
+        * config/arm/arm.h (TARGET_CPU_CPP_BUILTINS): Set
+        __ARM_ARCH_EXT_IDIV__.
+        (TARGET_IDIV): Define.
+        (arm_arch_hwdiv): Remove.
+        (arm_arch_arm_hwdiv, arm_arch_thumb_hwdiv): New prototypes.
+        * config/arm/arm.md (tune_cortexr4): Add cortexr5.
+        (divsi3, udivsi3): New patterns.
+        * config/arm/thumb2.md (divsi3, udivsi3): Remove.
+        * doc/invoke.texi: Document ARM -mcpu=cortex-r5
+
+>>>>>>> MERGE-SOURCE
 2011-06-13 Ramana Radhakrishnan <ramana.radhakrishnan@linaro.org>
 
         Backport from mainline:
=== modified file 'gcc/config/arm/arm-cores.def'
--- gcc/config/arm/arm-cores.def 2011-01-03 20:52:22 +0000
+++ gcc/config/arm/arm-cores.def 2011-06-14 16:54:35 +0000
@@ -70,10 +70,10 @@
 /* V4 Architecture Processors */
 ARM_CORE("arm8", arm8, 4, FL_MODE26 | FL_LDSCHED, fastmul)
 ARM_CORE("arm810", arm810, 4, FL_MODE26 | FL_LDSCHED, fastmul)
-ARM_CORE("strongarm", strongarm, 4, FL_MODE26 | FL_LDSCHED | FL_STRONG, fastmul)
-ARM_CORE("strongarm110", strongarm110, 4, FL_MODE26 | FL_LDSCHED | FL_STRONG, fastmul)
-ARM_CORE("strongarm1100", strongarm1100, 4, FL_MODE26 | FL_LDSCHED | FL_STRONG, fastmul)
-ARM_CORE("strongarm1110", strongarm1110, 4, FL_MODE26 | FL_LDSCHED | FL_STRONG, fastmul)
+ARM_CORE("strongarm", strongarm, 4, FL_MODE26 | FL_LDSCHED | FL_STRONG, strongarm)
+ARM_CORE("strongarm110", strongarm110, 4, FL_MODE26 | FL_LDSCHED | FL_STRONG, strongarm)
+ARM_CORE("strongarm1100", strongarm1100, 4, FL_MODE26 | FL_LDSCHED | FL_STRONG, strongarm)
+ARM_CORE("strongarm1110", strongarm1110, 4, FL_MODE26 | FL_LDSCHED | FL_STRONG, strongarm)
 ARM_CORE("fa526", fa526, 4, FL_LDSCHED, fastmul)
 ARM_CORE("fa626", fa626, 4, FL_LDSCHED, fastmul)
 
@@ -122,15 +122,16 @@
 ARM_CORE("arm1176jzf-s", arm1176jzfs, 6ZK, FL_LDSCHED | FL_VFPV2, 9e)
 ARM_CORE("mpcorenovfp", mpcorenovfp, 6K, FL_LDSCHED, 9e)
 ARM_CORE("mpcore", mpcore, 6K, FL_LDSCHED | FL_VFPV2, 9e)
-ARM_CORE("arm1156t2-s", arm1156t2s, 6T2, FL_LDSCHED, 9e)
-ARM_CORE("arm1156t2f-s", arm1156t2fs, 6T2, FL_LDSCHED | FL_VFPV2, 9e)
-ARM_CORE("cortex-a5", cortexa5, 7A, FL_LDSCHED, 9e)
-ARM_CORE("cortex-a8", cortexa8, 7A, FL_LDSCHED, 9e)
+ARM_CORE("arm1156t2-s", arm1156t2s, 6T2, FL_LDSCHED, v6t2)
+ARM_CORE("arm1156t2f-s", arm1156t2fs, 6T2, FL_LDSCHED | FL_VFPV2, v6t2)
+ARM_CORE("cortex-a5", cortexa5, 7A, FL_LDSCHED, cortex_a5)
+ARM_CORE("cortex-a8", cortexa8, 7A, FL_LDSCHED, cortex)
 ARM_CORE("cortex-a9", cortexa9, 7A, FL_LDSCHED, cortex_a9)
-ARM_CORE("cortex-a15", cortexa15, 7A, FL_LDSCHED, 9e)
-ARM_CORE("cortex-r4", cortexr4, 7R, FL_LDSCHED, 9e)
-ARM_CORE("cortex-r4f", cortexr4f, 7R, FL_LDSCHED, 9e)
-ARM_CORE("cortex-m4", cortexm4, 7EM, FL_LDSCHED, 9e)
-ARM_CORE("cortex-m3", cortexm3, 7M, FL_LDSCHED, 9e)
-ARM_CORE("cortex-m1", cortexm1, 6M, FL_LDSCHED, 9e)
-ARM_CORE("cortex-m0", cortexm0, 6M, FL_LDSCHED, 9e)
+ARM_CORE("cortex-a15", cortexa15, 7A, FL_LDSCHED | FL_THUMB_DIV | FL_ARM_DIV, cortex)
+ARM_CORE("cortex-r4", cortexr4, 7R, FL_LDSCHED, cortex)
+ARM_CORE("cortex-r4f", cortexr4f, 7R, FL_LDSCHED, cortex)
+ARM_CORE("cortex-r5", cortexr5, 7R, FL_LDSCHED | FL_ARM_DIV, cortex)
+ARM_CORE("cortex-m4", cortexm4, 7EM, FL_LDSCHED, cortex)
+ARM_CORE("cortex-m3", cortexm3, 7M, FL_LDSCHED, cortex)
+ARM_CORE("cortex-m1", cortexm1, 6M, FL_LDSCHED, cortex)
+ARM_CORE("cortex-m0", cortexm0, 6M, FL_LDSCHED, cortex)
=== modified file 'gcc/config/arm/arm-protos.h'
--- gcc/config/arm/arm-protos.h 2011-05-03 15:17:25 +0000
+++ gcc/config/arm/arm-protos.h 2011-06-14 16:54:35 +0000
@@ -219,9 +219,14 @@
   bool (*rtx_costs) (rtx, RTX_CODE, RTX_CODE, int *, bool);
   bool (*sched_adjust_cost) (rtx, rtx, rtx, int *);
   int constant_limit;
+  /* Maximum number of instructions to conditionalise in
+     arm_final_prescan_insn. */
+  int max_insns_skipped;
   int num_prefetch_slots;
   int l1_cache_size;
   int l1_cache_line_size;
+  bool prefer_constant_pool;
+  int (*branch_cost) (bool, bool);
 };
 
 extern const struct tune_params *current_tune;
=== modified file 'gcc/config/arm/arm-tune.md'
--- gcc/config/arm/arm-tune.md 2010-12-20 17:48:51 +0000
+++ gcc/config/arm/arm-tune.md 2011-06-14 16:54:35 +0000
@@ -1,5 +1,5 @@
 ;; -*- buffer-read-only: t -*-
 ;; Generated automatically by gentune.sh from arm-cores.def
 (define_attr "tune"
-        "arm2,arm250,arm3,arm6,arm60,arm600,arm610,arm620,arm7,arm7d,arm7di,arm70,arm700,arm700i,arm710,arm720,arm710c,arm7100,arm7500,arm7500fe,arm7m,arm7dm,arm7dmi,arm8,arm810,strongarm,strongarm110,strongarm1100,strongarm1110,fa526,fa626,arm7tdmi,arm7tdmis,arm710t,arm720t,arm740t,arm9,arm9tdmi,arm920,arm920t,arm922t,arm940t,ep9312,arm10tdmi,arm1020t,arm9e,arm946es,arm966es,arm968es,arm10e,arm1020e,arm1022e,xscale,iwmmxt,iwmmxt2,fa606te,fa626te,fmp626,fa726te,arm926ejs,arm1026ejs,arm1136js,arm1136jfs,arm1176jzs,arm1176jzfs,mpcorenovfp,mpcore,arm1156t2s,arm1156t2fs,cortexa5,cortexa8,cortexa9,cortexa15,cortexr4,cortexr4f,cortexm4,cortexm3,cortexm1,cortexm0"
+        "arm2,arm250,arm3,arm6,arm60,arm600,arm610,arm620,arm7,arm7d,arm7di,arm70,arm700,arm700i,arm710,arm720,arm710c,arm7100,arm7500,arm7500fe,arm7m,arm7dm,arm7dmi,arm8,arm810,strongarm,strongarm110,strongarm1100,strongarm1110,fa526,fa626,arm7tdmi,arm7tdmis,arm710t,arm720t,arm740t,arm9,arm9tdmi,arm920,arm920t,arm922t,arm940t,ep9312,arm10tdmi,arm1020t,arm9e,arm946es,arm966es,arm968es,arm10e,arm1020e,arm1022e,xscale,iwmmxt,iwmmxt2,fa606te,fa626te,fmp626,fa726te,arm926ejs,arm1026ejs,arm1136js,arm1136jfs,arm1176jzs,arm1176jzfs,mpcorenovfp,mpcore,arm1156t2s,arm1156t2fs,cortexa5,cortexa8,cortexa9,cortexa15,cortexr4,cortexr4f,cortexr5,cortexm4,cortexm3,cortexm1,cortexm0"
         (const (symbol_ref "((enum attr_tune) arm_tune)")))
=== modified file 'gcc/config/arm/arm.c'
--- gcc/config/arm/arm.c 2011-05-11 14:49:48 +0000
+++ gcc/config/arm/arm.c 2011-06-14 16:54:35 +0000
@@ -255,6 +255,8 @@
 static void arm_conditional_register_usage (void);
 static reg_class_t arm_preferred_rename_class (reg_class_t rclass);
 static unsigned int arm_autovectorize_vector_sizes (void);
+static int arm_default_branch_cost (bool, bool);
+static int arm_cortex_a5_branch_cost (bool, bool);
 
 
 /* Table of machine attributes. */
@@ -672,12 +674,13 @@
 #define FL_THUMB2 (1 << 16) /* Thumb-2. */
 #define FL_NOTM (1 << 17) /* Instructions not present in the 'M'
                              profile. */
-#define FL_DIV (1 << 18) /* Hardware divide. */
+#define FL_THUMB_DIV (1 << 18) /* Hardware divide (Thumb mode). */
 #define FL_VFPV3 (1 << 19) /* Vector Floating Point V3. */
 #define FL_NEON (1 << 20) /* Neon instructions. */
 #define FL_ARCH7EM (1 << 21) /* Instructions present in the ARMv7E-M
                                 architecture. */
 #define FL_ARCH7 (1 << 22) /* Architecture 7. */
+#define FL_ARM_DIV (1 << 23) /* Hardware divide (ARM mode). */
 
 #define FL_IWMMXT (1 << 29) /* XScale v2 or "Intel Wireless MMX technology". */
 
@@ -704,8 +707,8 @@
 #define FL_FOR_ARCH6M (FL_FOR_ARCH6 & ~FL_NOTM)
 #define FL_FOR_ARCH7 ((FL_FOR_ARCH6T2 & ~FL_NOTM) | FL_ARCH7)
 #define FL_FOR_ARCH7A (FL_FOR_ARCH7 | FL_NOTM | FL_ARCH6K)
-#define FL_FOR_ARCH7R (FL_FOR_ARCH7A | FL_DIV)
-#define FL_FOR_ARCH7M (FL_FOR_ARCH7 | FL_DIV)
+#define FL_FOR_ARCH7R (FL_FOR_ARCH7A | FL_THUMB_DIV)
+#define FL_FOR_ARCH7M (FL_FOR_ARCH7 | FL_THUMB_DIV)
 #define FL_FOR_ARCH7EM (FL_FOR_ARCH7M | FL_ARCH7EM)
 
 /* The bits in this mask specify which
@@ -791,7 +794,8 @@
 int arm_arch_thumb2;
 
 /* Nonzero if chip supports integer division instruction. */
-int arm_arch_hwdiv;
+int arm_arch_arm_hwdiv;
+int arm_arch_thumb_hwdiv;
 
 /* In case of a PRE_INC, POST_INC, PRE_DEC, POST_DEC memory reference,
    we must report the mode of the memory reference from
@@ -864,48 +868,117 @@
 {
   arm_slowmul_rtx_costs,
   NULL,
-  3,
-  ARM_PREFETCH_NOT_BENEFICIAL
+  3, /* Constant limit. */
+  5, /* Max cond insns. */
+  ARM_PREFETCH_NOT_BENEFICIAL,
+  true, /* Prefer constant pool. */
+  arm_default_branch_cost
 };
 
 const struct tune_params arm_fastmul_tune =
 {
   arm_fastmul_rtx_costs,
   NULL,
-  1,
-  ARM_PREFETCH_NOT_BENEFICIAL
+  1, /* Constant limit. */
+  5, /* Max cond insns. */
+  ARM_PREFETCH_NOT_BENEFICIAL,
+  true, /* Prefer constant pool. */
+  arm_default_branch_cost
+};
+
+/* StrongARM has early execution of branches, so a sequence that is worth
+   skipping is shorter. Set max_insns_skipped to a lower value. */
+
+const struct tune_params arm_strongarm_tune =
+{
+  arm_fastmul_rtx_costs,
+  NULL,
+  1, /* Constant limit. */
+  3, /* Max cond insns. */
+  ARM_PREFETCH_NOT_BENEFICIAL,
+  true, /* Prefer constant pool. */
+  arm_default_branch_cost
 };
 
 const struct tune_params arm_xscale_tune =
 {
   arm_xscale_rtx_costs,
   xscale_sched_adjust_cost,
-  2,
-  ARM_PREFETCH_NOT_BENEFICIAL
+  2, /* Constant limit. */
+  3, /* Max cond insns. */
+  ARM_PREFETCH_NOT_BENEFICIAL,
+  true, /* Prefer constant pool. */
+  arm_default_branch_cost
 };
 
 const struct tune_params arm_9e_tune =
 {
   arm_9e_rtx_costs,
   NULL,
-  1,
-  ARM_PREFETCH_NOT_BENEFICIAL
+  1, /* Constant limit. */
+  5, /* Max cond insns. */
+  ARM_PREFETCH_NOT_BENEFICIAL,
+  true, /* Prefer constant pool. */
+  arm_default_branch_cost
+};
+
+const struct tune_params arm_v6t2_tune =
+{
+  arm_9e_rtx_costs,
+  NULL,
+  1, /* Constant limit. */
+  5, /* Max cond insns. */
+  ARM_PREFETCH_NOT_BENEFICIAL,
+  false, /* Prefer constant pool. */
+  arm_default_branch_cost
+};
+
+/* Generic Cortex tuning. Use more specific tunings if appropriate. */
+const struct tune_params arm_cortex_tune =
+{
+  arm_9e_rtx_costs,
+  NULL,
+  1, /* Constant limit. */
+  5, /* Max cond insns. */
+  ARM_PREFETCH_NOT_BENEFICIAL,
+  false, /* Prefer constant pool. */
+  arm_default_branch_cost
+};
+
+/* Branches can be dual-issued on Cortex-A5, so conditional execution is
+   less appealing. Set max_insns_skipped to a low value. */
+
+const struct tune_params arm_cortex_a5_tune =
+{
+  arm_9e_rtx_costs,
+  NULL,
+  1, /* Constant limit. */
+  1, /* Max cond insns. */
+  ARM_PREFETCH_NOT_BENEFICIAL,
+  false, /* Prefer constant pool. */
+  arm_cortex_a5_branch_cost
 };
 
 const struct tune_params arm_cortex_a9_tune =
 {
   arm_9e_rtx_costs,
   cortex_a9_sched_adjust_cost,
-  1,
-  ARM_PREFETCH_BENEFICIAL(4,32,32)
+  1, /* Constant limit. */
+  5, /* Max cond insns. */
+  ARM_PREFETCH_BENEFICIAL(4,32,32),
+  false, /* Prefer constant pool. */
+  arm_default_branch_cost
 };
 
 const struct tune_params arm_fa726te_tune =
 {
   arm_9e_rtx_costs,
   fa726te_sched_adjust_cost,
-  1,
-  ARM_PREFETCH_NOT_BENEFICIAL
+  1, /* Constant limit. */
+  5, /* Max cond insns. */
+  ARM_PREFETCH_NOT_BENEFICIAL,
+  true, /* Prefer constant pool. */
+  arm_default_branch_cost
 };
 
 
@@ -1711,7 +1784,8 @@
   arm_tune_wbuf = (tune_flags & FL_WBUF) != 0;
   arm_tune_xscale = (tune_flags & FL_XSCALE) != 0;
   arm_arch_iwmmxt = (insn_flags & FL_IWMMXT) != 0;
-  arm_arch_hwdiv = (insn_flags & FL_DIV) != 0;
+  arm_arch_thumb_hwdiv = (insn_flags & FL_THUMB_DIV) != 0;
+  arm_arch_arm_hwdiv = (insn_flags & FL_ARM_DIV) != 0;
   arm_tune_cortex_a9 = (arm_tune == cortexa9) != 0;
 
   /* If we are not using the default (ARM mode) section anchor offset
@@ -1991,12 +2065,7 @@
       max_insns_skipped = 6;
     }
   else
-    {
-      /* StrongARM has early execution of branches, so a sequence
-         that is worth skipping is shorter. */
-      if (arm_tune_strongarm)
-        max_insns_skipped = 3;
-    }
+    max_insns_skipped = current_tune->max_insns_skipped;
 
   /* Hot/Cold partitioning is not currently supported, since we can't
      handle literal pool placement in that case. */
@@ -8211,6 +8280,21 @@
   return cost;
 }
 
+static int
+arm_default_branch_cost (bool speed_p, bool predictable_p ATTRIBUTE_UNUSED)
+{
+  if (TARGET_32BIT)
+    return (TARGET_THUMB2 && !speed_p) ? 1 : 4;
+  else
+    return (optimize > 0) ? 2 : 0;
+}
+
+static int
+arm_cortex_a5_branch_cost (bool speed_p, bool predictable_p)
+{
+  return speed_p ? 0 : arm_default_branch_cost (speed_p, predictable_p);
+}
+
 static int fp_consts_inited = 0;
 
 /* Only zero is valid for VFP. Other values are also valid for FPA. */
@@ -23123,6 +23207,7 @@
     {
     case cortexr4:
     case cortexr4f:
+    case cortexr5:
     case cortexa5:
     case cortexa8:
     case cortexa9:
=== modified file 'gcc/config/arm/arm.h'
--- gcc/config/arm/arm.h 2011-06-02 12:12:00 +0000
+++ gcc/config/arm/arm.h 2011-06-14 16:54:35 +0000
@@ -101,6 +101,8 @@
             builtin_define ("__ARM_PCS"); \
             builtin_define ("__ARM_EABI__"); \
           } \
+        if (TARGET_IDIV) \
+          builtin_define ("__ARM_ARCH_EXT_IDIV__"); \
     } while (0)
 
 /* The various ARM cores. */
@@ -282,7 +284,8 @@
   (TARGET_32BIT && arm_arch6 && (arm_arch_notm || arm_arch7em))
 
 /* Should MOVW/MOVT be used in preference to a constant pool. */
-#define TARGET_USE_MOVT (arm_arch_thumb2 && !optimize_size)
+#define TARGET_USE_MOVT \
+  (arm_arch_thumb2 && !optimize_size && !current_tune->prefer_constant_pool)
 
 /* We could use unified syntax for arm mode, but for now we just use it
    for Thumb-2. */
@@ -303,6 +306,10 @@
 /* Nonzero if this chip supports ldrex{bhd} and strex{bhd}. */
 #define TARGET_HAVE_LDREXBHD ((arm_arch6k && TARGET_ARM) || arm_arch7)
 
+/* Nonzero if integer division instructions supported. */
+#define TARGET_IDIV ((TARGET_ARM && arm_arch_arm_hwdiv) \
+                     || (TARGET_THUMB2 && arm_arch_thumb_hwdiv))
+
 /* True iff the full BPABI is being used. If TARGET_BPABI is true,
    then TARGET_AAPCS_BASED must be true -- but the converse does not
    hold. TARGET_BPABI implies the use of the BPABI runtime library,
@@ -487,8 +494,11 @@
 /* Nonzero if chip supports Thumb 2. */
 extern int arm_arch_thumb2;
 
-/* Nonzero if chip supports integer division instruction. */
-extern int arm_arch_hwdiv;
+/* Nonzero if chip supports integer division instruction in ARM mode. */
+extern int arm_arch_arm_hwdiv;
+
+/* Nonzero if chip supports integer division instruction in Thumb mode. */
+extern int arm_arch_thumb_hwdiv;
 
 #ifndef TARGET_DEFAULT
 #define TARGET_DEFAULT (MASK_APCS_FRAME)
@@ -2018,8 +2028,8 @@
 /* Try to generate sequences that don't involve branches, we can then use
    conditional instructions */
 #define BRANCH_COST(speed_p, predictable_p) \
-  (TARGET_32BIT ? (TARGET_THUMB2 && !speed_p ? 1 : 4) \
-   : (optimize > 0 ? 2 : 0))
+  (current_tune->branch_cost (speed_p, predictable_p))
+
 
 /* Position Independent Code. */
 /* We decide which register to use based on the compilation options and
=== modified file 'gcc/config/arm/arm.md'
--- gcc/config/arm/arm.md 2011-06-02 15:58:33 +0000
+++ gcc/config/arm/arm.md 2011-06-14 16:54:35 +0000
@@ -490,7 +490,7 @@
 
 (define_attr "tune_cortexr4" "yes,no"
   (const (if_then_else
-          (eq_attr "tune" "cortexr4,cortexr4f")
+          (eq_attr "tune" "cortexr4,cortexr4f,cortexr5")
           (const_string "yes")
           (const_string "no"))))
 
@@ -3738,6 +3738,28 @@
    (set_attr "predicable" "yes")]
 )
 
+
+;; Division instructions
+(define_insn "divsi3"
+  [(set (match_operand:SI 0 "s_register_operand" "=r")
+        (div:SI (match_operand:SI 1 "s_register_operand" "r")
+                (match_operand:SI 2 "s_register_operand" "r")))]
+  "TARGET_IDIV"
+  "sdiv%?\t%0, %1, %2"
+  [(set_attr "predicable" "yes")
+   (set_attr "insn" "sdiv")]
+)
+
+(define_insn "udivsi3"
+  [(set (match_operand:SI 0 "s_register_operand" "=r")
+        (udiv:SI (match_operand:SI 1 "s_register_operand" "r")
+                 (match_operand:SI 2 "s_register_operand" "r")))]
+  "TARGET_IDIV"
+  "udiv%?\t%0, %1, %2"
+  [(set_attr "predicable" "yes")
+   (set_attr "insn" "udiv")]
+)
+
 
 ;; Unary arithmetic insns
 
=== modified file 'gcc/config/arm/thumb2.md'
--- gcc/config/arm/thumb2.md 2011-05-11 07:15:47 +0000
+++ gcc/config/arm/thumb2.md 2011-06-14 16:54:35 +0000
@@ -779,26 +779,6 @@
    (set_attr "length" "2")]
 )
 
-(define_insn "divsi3"
-  [(set (match_operand:SI 0 "s_register_operand" "=r")
-        (div:SI (match_operand:SI 1 "s_register_operand" "r")
-                (match_operand:SI 2 "s_register_operand" "r")))]
-  "TARGET_THUMB2 && arm_arch_hwdiv"
-  "sdiv%?\t%0, %1, %2"
-  [(set_attr "predicable" "yes")
-   (set_attr "insn" "sdiv")]
-)
-
-(define_insn "udivsi3"
-  [(set (match_operand:SI 0 "s_register_operand" "=r")
-        (udiv:SI (match_operand:SI 1 "s_register_operand" "r")
-                 (match_operand:SI 2 "s_register_operand" "r")))]
-  "TARGET_THUMB2 && arm_arch_hwdiv"
-  "udiv%?\t%0, %1, %2"
-  [(set_attr "predicable" "yes")
-   (set_attr "insn" "udiv")]
-)
-
 (define_insn "*thumb2_subsi_short"
   [(set (match_operand:SI 0 "low_register_operand" "=l")
         (minus:SI (match_operand:SI 1 "low_register_operand" "l")
=== modified file 'gcc/doc/invoke.texi'
--- gcc/doc/invoke.texi 2011-05-11 07:15:47 +0000
+++ gcc/doc/invoke.texi 2011-06-14 16:54:35 +0000
@@ -10208,7 +10208,8 @@
 @samp{arm1136j-s}, @samp{arm1136jf-s}, @samp{mpcore}, @samp{mpcorenovfp},
 @samp{arm1156t2-s}, @samp{arm1156t2f-s}, @samp{arm1176jz-s}, @samp{arm1176jzf-s},
 @samp{cortex-a5}, @samp{cortex-a8}, @samp{cortex-a9}, @samp{cortex-a15},
-@samp{cortex-r4}, @samp{cortex-r4f}, @samp{cortex-m4}, @samp{cortex-m3},
+@samp{cortex-r4}, @samp{cortex-r4f}, @samp{cortex-r5},
+@samp{cortex-m4}, @samp{cortex-m3},
 @samp{cortex-m1},
 @samp{cortex-m0},
 @samp{xscale}, @samp{iwmmxt}, @samp{iwmmxt2}, @samp{ep9312}.
=== modified file 'gcc/dojump.c'
--- gcc/dojump.c 2010-05-19 19:09:57 +0000
+++ gcc/dojump.c 2011-06-14 16:54:35 +0000
@@ -36,6 +36,7 @@
 #include "ggc.h"
 #include "basic-block.h"
 #include "output.h"
+#include "tm_p.h"
 
 static bool prefer_and_bit_test (enum machine_mode, int);
 static void do_jump_by_parts_greater (tree, tree, int, rtx, rtx, int);
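
The two data fields added to the tuning table in the diff above, max_insns_skipped and prefer_constant_pool, can be modelled in isolation. This is a rough sketch rather than the GCC code: `model_tune_params` and `model_use_movt` are hypothetical names, only four of the tunings are reproduced, and the values are the ones shown in the arm.c hunk.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, cut-down tune_params with only the two new data fields.  */
struct model_tune_params {
  int max_insns_skipped;     /* "Max cond insns." in the diff.  */
  bool prefer_constant_pool; /* "Prefer constant pool." in the diff.  */
};

/* Values copied from the corresponding initialisers in the arm.c hunk.  */
static const struct model_tune_params model_9e_tune = { 5, true };
static const struct model_tune_params model_strongarm_tune = { 3, true };
static const struct model_tune_params model_cortex_tune = { 5, false };
static const struct model_tune_params model_cortex_a5_tune = { 1, false };

/* Mirrors the new TARGET_USE_MOVT condition from the arm.h hunk.  */
static bool model_use_movt(const struct model_tune_params *tune,
                           bool arch_thumb2, bool optimize_size)
{
  return arch_thumb2 && !optimize_size && !tune->prefer_constant_pool;
}

/* A few sanity checks on the model.  */
void model_tune_self_check(void)
{
  /* StrongARM keeps its old limit of 3; Cortex-A5 drops to 1.  */
  assert(model_strongarm_tune.max_insns_skipped == 3);
  assert(model_cortex_a5_tune.max_insns_skipped == 1);
  assert(model_9e_tune.max_insns_skipped == 5);

  /* MOVW/MOVT is used only on Thumb-2 capable cores whose tuning does
     not prefer the constant pool, and never when optimizing for size.  */
  assert(model_use_movt(&model_cortex_tune, true, false));
  assert(!model_use_movt(&model_9e_tune, true, false));
  assert(!model_use_movt(&model_cortex_tune, true, true));
  assert(!model_use_movt(&model_cortex_tune, false, false));
}
```

Moving the StrongARM special case out of arm_option_override and into a table entry means the limit is now data, so a new core can pick its own value without another `if` in the option-handling code.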
