Merge lp:~ramana/gcc-linaro/a5-a15-costs-backport-4.6 into lp:gcc-linaro/4.6

Proposed by Ramana Radhakrishnan
Status: Merged
Approved by: Ulrich Weigand
Approved revision: no longer in the source branch.
Merged at revision: 106759
Proposed branch: lp:~ramana/gcc-linaro/a5-a15-costs-backport-4.6
Merge into: lp:gcc-linaro/4.6
Diff against target: 562 lines (+251/-66) (has conflicts)
10 files modified
ChangeLog.linaro (+80/-0)
gcc/config/arm/arm-cores.def (+16/-15)
gcc/config/arm/arm-protos.h (+5/-0)
gcc/config/arm/arm-tune.md (+1/-1)
gcc/config/arm/arm.c (+108/-23)
gcc/config/arm/arm.h (+15/-5)
gcc/config/arm/arm.md (+23/-1)
gcc/config/arm/thumb2.md (+0/-20)
gcc/doc/invoke.texi (+2/-1)
gcc/dojump.c (+1/-0)
Text conflict in ChangeLog.linaro
To merge this branch: bzr merge lp:~ramana/gcc-linaro/a5-a15-costs-backport-4.6
Reviewer: Linaro Toolchain Developers
Review status: Pending
Review via email: mp+64575@code.launchpad.net

Description of the change

Hi,

This contains a set of backports of the cost models and A5 / A15 tuning that were committed recently to trunk. Part of the costs infrastructure should make the BRANCH_COST tuning and other cost tuning work a bit easier to backport from trunk for the A9, and should also allow us to add more parameters for the A9.
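
As a rough illustration of the costs infrastructure mentioned above, the following is a minimal sketch of a per-core branch cost hook. It is not the GCC source: `struct tune_params_sketch`, `branch_cost_for` and the three state variables are hypothetical stand-ins for `struct tune_params`, the `BRANCH_COST` macro and the compiler's `TARGET_32BIT`, `TARGET_THUMB2` and `optimize` state. The two cost functions follow the logic shown in the diff further down.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-in for GCC's struct tune_params:
   only the branch cost hook is modelled here.  */
struct tune_params_sketch
{
  int (*branch_cost) (bool speed_p, bool predictable_p);
};

/* Stand-ins for compiler state (TARGET_32BIT, TARGET_THUMB2, optimize).  */
static bool target_32bit = true;
static bool target_thumb2 = false;
static int optimize_level = 2;

/* Same arithmetic as the old BRANCH_COST macro body.  */
static int
default_branch_cost (bool speed_p, bool predictable_p)
{
  (void) predictable_p;
  if (target_32bit)
    return (target_thumb2 && !speed_p) ? 1 : 4;
  return (optimize_level > 0) ? 2 : 0;
}

/* Cortex-A5: a branch costs nothing when optimising for speed,
   otherwise fall back to the default.  */
static int
cortex_a5_branch_cost (bool speed_p, bool predictable_p)
{
  return speed_p ? 0 : default_branch_cost (speed_p, predictable_p);
}

static const struct tune_params_sketch generic_tune = { default_branch_cost };
static const struct tune_params_sketch cortex_a5_tune = { cortex_a5_branch_cost };

/* What BRANCH_COST reduces to once it dispatches through the
   currently selected tuning structure.  */
static int
branch_cost_for (const struct tune_params_sketch *tune,
                 bool speed_p, bool predictable_p)
{
  assert (tune != NULL && tune->branch_cost != NULL);
  return tune->branch_cost (speed_p, predictable_p);
}
```

With the hook in place, adding a new core-specific cost only needs a new function and a new table entry rather than another condition in a shared macro.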

Some R5 tuning is also pulled back, but that is a zero-cost backport since it comes by default with the A15 div instruction work.
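
For context, the div instruction work splits the single hardware-divide flag into separate ARM-mode and Thumb-mode flags. The sketch below models the resulting check. The flag values and the shape of the condition are taken from the diff further down; `enum insn_mode` and `target_idiv` are hypothetical names used only for this illustration, standing in for `TARGET_ARM`, `TARGET_THUMB2` and the `TARGET_IDIV` macro.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag bits as defined in the diff to arm.c.  */
#define FL_THUMB_DIV (1u << 18) /* Hardware divide (Thumb mode).  */
#define FL_ARM_DIV   (1u << 23) /* Hardware divide (ARM mode).  */

/* Hypothetical stand-in for the TARGET_ARM / TARGET_THUMB2 state.  */
enum insn_mode { MODE_ARM, MODE_THUMB1, MODE_THUMB2 };

/* Models TARGET_IDIV: the divide instructions are usable only when the
   core has the divide flag that matches the mode being compiled for.  */
static bool
target_idiv (unsigned insn_flags, enum insn_mode mode)
{
  bool arm_hwdiv = (insn_flags & FL_ARM_DIV) != 0;
  bool thumb_hwdiv = (insn_flags & FL_THUMB_DIV) != 0;

  assert (mode == MODE_ARM || mode == MODE_THUMB1 || mode == MODE_THUMB2);
  return (mode == MODE_ARM && arm_hwdiv)
         || (mode == MODE_THUMB2 && thumb_hwdiv);
}
```

A core carrying both flags, as Cortex-A15 does after this change, passes the check in both ARM and Thumb-2 mode, while a core with only `FL_THUMB_DIV` passes it in Thumb-2 mode alone.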

It would be useful to get this sanity tested once with the default --with-tune=cortex-a9 to be sure nothing else breaks with this infrastructure now merged in.

cheers
Ramana

Linaro Toolchain Builder (cbuild) wrote :

cbuild has taken a snapshot of this branch at r106759 and queued it for build.

The snapshot is available at:
 http://ex.seabright.co.nz/snapshots/gcc-linaro-4.6+bzr106759~ramana~a5-a15-costs-backport-4.6.tar.xdelta3.xz

and will be built on the following builders:
 a9-builder i686 x86_64

You can track the build queue at:
 http://ex.seabright.co.nz/helpers/scheduler

cbuild-snapshot: gcc-linaro-4.6+bzr106759~ramana~a5-a15-costs-backport-4.6
cbuild-ancestor: lp:gcc-linaro/4.6+bzr106756
cbuild-state: check

Linaro Toolchain Builder (cbuild) wrote :

cbuild successfully built this on i686-lucid-cbuild132-scorpius-i686r1.

The build results are available at:
 http://ex.seabright.co.nz/build/gcc-linaro-4.6+bzr106759~ramana~a5-a15-costs-backport-4.6/logs/i686-lucid-cbuild132-scorpius-i686r1

The test suite results were unchanged compared to the branch point lp:gcc-linaro/4.6+bzr106756.

The full testsuite results are at:
 http://ex.seabright.co.nz/build/gcc-linaro-4.6+bzr106759~ramana~a5-a15-costs-backport-4.6/logs/i686-lucid-cbuild132-scorpius-i686r1/gcc-testsuite.txt

cbuild-checked: i686-lucid-cbuild132-scorpius-i686r1

Linaro Toolchain Builder (cbuild) wrote :

cbuild successfully built this on x86_64-maverick-cbuild132-crucis-x86_64r1.

The build results are available at:
 http://ex.seabright.co.nz/build/gcc-linaro-4.6+bzr106759~ramana~a5-a15-costs-backport-4.6/logs/x86_64-maverick-cbuild132-crucis-x86_64r1

The test suite results were unchanged compared to the branch point lp:gcc-linaro/4.6+bzr106756.

The full testsuite results are at:
 http://ex.seabright.co.nz/build/gcc-linaro-4.6+bzr106759~ramana~a5-a15-costs-backport-4.6/logs/x86_64-maverick-cbuild132-crucis-x86_64r1/gcc-testsuite.txt

cbuild-checked: x86_64-maverick-cbuild132-crucis-x86_64r1

Linaro Toolchain Builder (cbuild) wrote :

cbuild successfully built this on armv7l-maverick-cbuild132-ursa2-cortexa9r1.

The build results are available at:
 http://ex.seabright.co.nz/build/gcc-linaro-4.6+bzr106759~ramana~a5-a15-costs-backport-4.6/logs/armv7l-maverick-cbuild132-ursa2-cortexa9r1

The test suite results were unchanged compared to the branch point lp:gcc-linaro/4.6+bzr106756.

The full testsuite results are at:
 http://ex.seabright.co.nz/build/gcc-linaro-4.6+bzr106759~ramana~a5-a15-costs-backport-4.6/logs/armv7l-maverick-cbuild132-ursa2-cortexa9r1/gcc-testsuite.txt

cbuild-checked: armv7l-maverick-cbuild132-ursa2-cortexa9r1

Ulrich Weigand (uweigand) wrote :

The backport looks good to me. Given that there are no testsuite regressions either, this is OK.

Preview Diff

1=== modified file 'ChangeLog.linaro'
2--- ChangeLog.linaro 2011-06-14 14:09:57 +0000
3+++ ChangeLog.linaro 2011-06-14 16:54:35 +0000
4@@ -1,3 +1,4 @@
5+<<<<<<< TREE
6 2011-06-14 Andrew Stubbs <ams@codesourcery.com>
7
8 gcc/
9@@ -10,6 +11,85 @@
10 gcc/
11 * LINARO-VERSION: Update.
12
13+=======
14+2011-06-14 Ramana Radhakrishnan <ramana.radhakrishnan@linaro.org>
15+
16+ Backport from mainline.
17+ 2011-06-03 Julian Brown <julian@codesourcery.com>
18+
19+ * config/arm/arm-cores.def (strongarm, strongarm110, strongarm1100)
20+ (strongarm1110): Use strongarm tuning.
21+ * config/arm/arm-protos.h (tune_params): Add max_insns_skipped
22+ field.
23+ * config/arm/arm.c (arm_strongarm_tune): New.
24+ (arm_slowmul_tune, arm_fastmul_tune, arm_xscale_tune, arm_9e_tune)
25+ (arm_v6t2_tune, arm_cortex_tune, arm_cortex_a5_tune)
26+ (arm_cortex_a9_tune, arm_fa726te_tune): Add max_insns_skipped field
27+ setting, using previous defaults or 1 for Cortex-A5.
28+ (arm_option_override): Set max_insns_skipped from current tuning.
29+
30+2011-06-14 Ramana Radhakrishnan <ramana.radhakrishnan@linaro.org>
31+
32+ Backport from mainline.
33+ 2011-06-02 Julian Brown <julian@codesourcery.com>
34+
35+ * config/arm/arm-cores.def (cortex-a5): Use cortex_a5 tuning.
36+ * config/arm/arm.c (arm_cortex_a5_branch_cost): New.
37+ (arm_cortex_a5_tune): New.
38+
39+ 2011-06-02 Julian Brown <julian@codesourcery.com>
40+
41+ * config/arm/arm-protos.h (tune_params): Add branch_cost hook.
42+ * config/arm/arm.c (arm_default_branch_cost): New.
43+ (arm_slowmul_tune, arm_fastmul_tune, arm_xscale_tune, arm_9e_tune)
44+ (arm_v6t2_tune, arm_cortex_tune, arm_cortex_a9_tune)
45+ (arm_fa726_tune): Set branch_cost field using
46+ arm_default_branch_cost.
47+ * config/arm/arm.h (BRANCH_COST): Use branch_cost hook from
48+ current_tune structure.
49+ * dojump.c (tm_p.h): Include file.
50+
51+ 2011-06-02 Julian Brown <julian@codesourcery.com>
52+
53+ * config/arm/arm-cores.def (arm1156t2-s, arm1156t2f-s): Use v6t2
54+ tuning.
55+ (cortex-a5, cortex-a8, cortex-a15, cortex-r4, cortex-r4f, cortex-m4)
56+ (cortex-m3, cortex-m1, cortex-m0): Use cortex tuning.
57+ * config/arm/arm-protos.h (tune_params): Add prefer_constant_pool
58+ field.
59+ * config/arm/arm.c (arm_slowmul_tune, arm_fastmul_tune)
60+ (arm_xscale_tune, arm_9e_tune, arm_cortex_a9_tune)
61+ (arm_fa726te_tune): Add prefer_constant_pool setting.
62+ (arm_v6t2_tune, arm_cortex_tune): New.
63+ * config/arm/arm.h (TARGET_USE_MOVT): Make dependent on
64+ prefer_constant_pool setting.
65+
66+2011-06-14 Ramana Radhakrishnan <ramana.radhakrishnan@linaro.org>
67+
68+ Backport from mainline
69+ 2011-06-01 Paul Brook <paul@cpodesourcery.com>
70+
71+ * config/arm/arm-cores.def: Add cortex-r5. Add DIV flags to
72+ Cortex-A15.
73+ * config/arm/arm-tune.md: Regenerate.
74+ * config/arm/arm.c (FL_DIV): Rename...
75+ (FL_THUMB_DIV): ... to this.
76+ (FL_ARM_DIV): Define.
77+ (FL_FOR_ARCH7R, FL_FOR_ARCH7M): Use FL_THUMB_DIV.
78+ (arm_arch_hwdiv): Remove.
79+ (arm_arch_thumb_hwdiv, arm_arch_arm_hwdiv): New variables.
80+ (arm_issue_rate): Add cortexr5.
81+ * config/arm/arm.h (TARGET_CPU_CPP_BUILTINS): Set
82+ __ARM_ARCH_EXT_IDIV__.
83+ (TARGET_IDIV): Define.
84+ (arm_arch_hwdiv): Remove.
85+ (arm_arch_arm_hwdiv, arm_arch_thumb_hwdiv): New prototypes.
86+ * config/arm/arm.md (tune_cortexr4): Add cortexr5.
87+ (divsi3, udivsi3): New patterns.
88+ * config/arm/thumb2.md (divsi3, udivsi3): Remove.
89+ * doc/invoke.texi: Document ARM -mcpu=cortex-r5
90+
91+>>>>>>> MERGE-SOURCE
92 2011-06-13 Ramana Radhakrishnan <ramana.radhakrishnan@linaro.org>
93
94 Backport from mainline:
95
96=== modified file 'gcc/config/arm/arm-cores.def'
97--- gcc/config/arm/arm-cores.def 2011-01-03 20:52:22 +0000
98+++ gcc/config/arm/arm-cores.def 2011-06-14 16:54:35 +0000
99@@ -70,10 +70,10 @@
100 /* V4 Architecture Processors */
101 ARM_CORE("arm8", arm8, 4, FL_MODE26 | FL_LDSCHED, fastmul)
102 ARM_CORE("arm810", arm810, 4, FL_MODE26 | FL_LDSCHED, fastmul)
103-ARM_CORE("strongarm", strongarm, 4, FL_MODE26 | FL_LDSCHED | FL_STRONG, fastmul)
104-ARM_CORE("strongarm110", strongarm110, 4, FL_MODE26 | FL_LDSCHED | FL_STRONG, fastmul)
105-ARM_CORE("strongarm1100", strongarm1100, 4, FL_MODE26 | FL_LDSCHED | FL_STRONG, fastmul)
106-ARM_CORE("strongarm1110", strongarm1110, 4, FL_MODE26 | FL_LDSCHED | FL_STRONG, fastmul)
107+ARM_CORE("strongarm", strongarm, 4, FL_MODE26 | FL_LDSCHED | FL_STRONG, strongarm)
108+ARM_CORE("strongarm110", strongarm110, 4, FL_MODE26 | FL_LDSCHED | FL_STRONG, strongarm)
109+ARM_CORE("strongarm1100", strongarm1100, 4, FL_MODE26 | FL_LDSCHED | FL_STRONG, strongarm)
110+ARM_CORE("strongarm1110", strongarm1110, 4, FL_MODE26 | FL_LDSCHED | FL_STRONG, strongarm)
111 ARM_CORE("fa526", fa526, 4, FL_LDSCHED, fastmul)
112 ARM_CORE("fa626", fa626, 4, FL_LDSCHED, fastmul)
113
114@@ -122,15 +122,16 @@
115 ARM_CORE("arm1176jzf-s", arm1176jzfs, 6ZK, FL_LDSCHED | FL_VFPV2, 9e)
116 ARM_CORE("mpcorenovfp", mpcorenovfp, 6K, FL_LDSCHED, 9e)
117 ARM_CORE("mpcore", mpcore, 6K, FL_LDSCHED | FL_VFPV2, 9e)
118-ARM_CORE("arm1156t2-s", arm1156t2s, 6T2, FL_LDSCHED, 9e)
119-ARM_CORE("arm1156t2f-s", arm1156t2fs, 6T2, FL_LDSCHED | FL_VFPV2, 9e)
120-ARM_CORE("cortex-a5", cortexa5, 7A, FL_LDSCHED, 9e)
121-ARM_CORE("cortex-a8", cortexa8, 7A, FL_LDSCHED, 9e)
122+ARM_CORE("arm1156t2-s", arm1156t2s, 6T2, FL_LDSCHED, v6t2)
123+ARM_CORE("arm1156t2f-s", arm1156t2fs, 6T2, FL_LDSCHED | FL_VFPV2, v6t2)
124+ARM_CORE("cortex-a5", cortexa5, 7A, FL_LDSCHED, cortex_a5)
125+ARM_CORE("cortex-a8", cortexa8, 7A, FL_LDSCHED, cortex)
126 ARM_CORE("cortex-a9", cortexa9, 7A, FL_LDSCHED, cortex_a9)
127-ARM_CORE("cortex-a15", cortexa15, 7A, FL_LDSCHED, 9e)
128-ARM_CORE("cortex-r4", cortexr4, 7R, FL_LDSCHED, 9e)
129-ARM_CORE("cortex-r4f", cortexr4f, 7R, FL_LDSCHED, 9e)
130-ARM_CORE("cortex-m4", cortexm4, 7EM, FL_LDSCHED, 9e)
131-ARM_CORE("cortex-m3", cortexm3, 7M, FL_LDSCHED, 9e)
132-ARM_CORE("cortex-m1", cortexm1, 6M, FL_LDSCHED, 9e)
133-ARM_CORE("cortex-m0", cortexm0, 6M, FL_LDSCHED, 9e)
134+ARM_CORE("cortex-a15", cortexa15, 7A, FL_LDSCHED | FL_THUMB_DIV | FL_ARM_DIV, cortex)
135+ARM_CORE("cortex-r4", cortexr4, 7R, FL_LDSCHED, cortex)
136+ARM_CORE("cortex-r4f", cortexr4f, 7R, FL_LDSCHED, cortex)
137+ARM_CORE("cortex-r5", cortexr5, 7R, FL_LDSCHED | FL_ARM_DIV, cortex)
138+ARM_CORE("cortex-m4", cortexm4, 7EM, FL_LDSCHED, cortex)
139+ARM_CORE("cortex-m3", cortexm3, 7M, FL_LDSCHED, cortex)
140+ARM_CORE("cortex-m1", cortexm1, 6M, FL_LDSCHED, cortex)
141+ARM_CORE("cortex-m0", cortexm0, 6M, FL_LDSCHED, cortex)
142
143=== modified file 'gcc/config/arm/arm-protos.h'
144--- gcc/config/arm/arm-protos.h 2011-05-03 15:17:25 +0000
145+++ gcc/config/arm/arm-protos.h 2011-06-14 16:54:35 +0000
146@@ -219,9 +219,14 @@
147 bool (*rtx_costs) (rtx, RTX_CODE, RTX_CODE, int *, bool);
148 bool (*sched_adjust_cost) (rtx, rtx, rtx, int *);
149 int constant_limit;
150+ /* Maximum number of instructions to conditionalise in
151+ arm_final_prescan_insn. */
152+ int max_insns_skipped;
153 int num_prefetch_slots;
154 int l1_cache_size;
155 int l1_cache_line_size;
156+ bool prefer_constant_pool;
157+ int (*branch_cost) (bool, bool);
158 };
159
160 extern const struct tune_params *current_tune;
161
162=== modified file 'gcc/config/arm/arm-tune.md'
163--- gcc/config/arm/arm-tune.md 2010-12-20 17:48:51 +0000
164+++ gcc/config/arm/arm-tune.md 2011-06-14 16:54:35 +0000
165@@ -1,5 +1,5 @@
166 ;; -*- buffer-read-only: t -*-
167 ;; Generated automatically by gentune.sh from arm-cores.def
168 (define_attr "tune"
169- "arm2,arm250,arm3,arm6,arm60,arm600,arm610,arm620,arm7,arm7d,arm7di,arm70,arm700,arm700i,arm710,arm720,arm710c,arm7100,arm7500,arm7500fe,arm7m,arm7dm,arm7dmi,arm8,arm810,strongarm,strongarm110,strongarm1100,strongarm1110,fa526,fa626,arm7tdmi,arm7tdmis,arm710t,arm720t,arm740t,arm9,arm9tdmi,arm920,arm920t,arm922t,arm940t,ep9312,arm10tdmi,arm1020t,arm9e,arm946es,arm966es,arm968es,arm10e,arm1020e,arm1022e,xscale,iwmmxt,iwmmxt2,fa606te,fa626te,fmp626,fa726te,arm926ejs,arm1026ejs,arm1136js,arm1136jfs,arm1176jzs,arm1176jzfs,mpcorenovfp,mpcore,arm1156t2s,arm1156t2fs,cortexa5,cortexa8,cortexa9,cortexa15,cortexr4,cortexr4f,cortexm4,cortexm3,cortexm1,cortexm0"
170+ "arm2,arm250,arm3,arm6,arm60,arm600,arm610,arm620,arm7,arm7d,arm7di,arm70,arm700,arm700i,arm710,arm720,arm710c,arm7100,arm7500,arm7500fe,arm7m,arm7dm,arm7dmi,arm8,arm810,strongarm,strongarm110,strongarm1100,strongarm1110,fa526,fa626,arm7tdmi,arm7tdmis,arm710t,arm720t,arm740t,arm9,arm9tdmi,arm920,arm920t,arm922t,arm940t,ep9312,arm10tdmi,arm1020t,arm9e,arm946es,arm966es,arm968es,arm10e,arm1020e,arm1022e,xscale,iwmmxt,iwmmxt2,fa606te,fa626te,fmp626,fa726te,arm926ejs,arm1026ejs,arm1136js,arm1136jfs,arm1176jzs,arm1176jzfs,mpcorenovfp,mpcore,arm1156t2s,arm1156t2fs,cortexa5,cortexa8,cortexa9,cortexa15,cortexr4,cortexr4f,cortexr5,cortexm4,cortexm3,cortexm1,cortexm0"
171 (const (symbol_ref "((enum attr_tune) arm_tune)")))
172
173=== modified file 'gcc/config/arm/arm.c'
174--- gcc/config/arm/arm.c 2011-05-11 14:49:48 +0000
175+++ gcc/config/arm/arm.c 2011-06-14 16:54:35 +0000
176@@ -255,6 +255,8 @@
177 static void arm_conditional_register_usage (void);
178 static reg_class_t arm_preferred_rename_class (reg_class_t rclass);
179 static unsigned int arm_autovectorize_vector_sizes (void);
180+static int arm_default_branch_cost (bool, bool);
181+static int arm_cortex_a5_branch_cost (bool, bool);
182
183
184
185 /* Table of machine attributes. */
186@@ -672,12 +674,13 @@
187 #define FL_THUMB2 (1 << 16) /* Thumb-2. */
188 #define FL_NOTM (1 << 17) /* Instructions not present in the 'M'
189 profile. */
190-#define FL_DIV (1 << 18) /* Hardware divide. */
191+#define FL_THUMB_DIV (1 << 18) /* Hardware divide (Thumb mode). */
192 #define FL_VFPV3 (1 << 19) /* Vector Floating Point V3. */
193 #define FL_NEON (1 << 20) /* Neon instructions. */
194 #define FL_ARCH7EM (1 << 21) /* Instructions present in the ARMv7E-M
195 architecture. */
196 #define FL_ARCH7 (1 << 22) /* Architecture 7. */
197+#define FL_ARM_DIV (1 << 23) /* Hardware divide (ARM mode). */
198
199 #define FL_IWMMXT (1 << 29) /* XScale v2 or "Intel Wireless MMX technology". */
200
201@@ -704,8 +707,8 @@
202 #define FL_FOR_ARCH6M (FL_FOR_ARCH6 & ~FL_NOTM)
203 #define FL_FOR_ARCH7 ((FL_FOR_ARCH6T2 & ~FL_NOTM) | FL_ARCH7)
204 #define FL_FOR_ARCH7A (FL_FOR_ARCH7 | FL_NOTM | FL_ARCH6K)
205-#define FL_FOR_ARCH7R (FL_FOR_ARCH7A | FL_DIV)
206-#define FL_FOR_ARCH7M (FL_FOR_ARCH7 | FL_DIV)
207+#define FL_FOR_ARCH7R (FL_FOR_ARCH7A | FL_THUMB_DIV)
208+#define FL_FOR_ARCH7M (FL_FOR_ARCH7 | FL_THUMB_DIV)
209 #define FL_FOR_ARCH7EM (FL_FOR_ARCH7M | FL_ARCH7EM)
210
211 /* The bits in this mask specify which
212@@ -791,7 +794,8 @@
213 int arm_arch_thumb2;
214
215 /* Nonzero if chip supports integer division instruction. */
216-int arm_arch_hwdiv;
217+int arm_arch_arm_hwdiv;
218+int arm_arch_thumb_hwdiv;
219
220 /* In case of a PRE_INC, POST_INC, PRE_DEC, POST_DEC memory reference,
221 we must report the mode of the memory reference from
222@@ -864,48 +868,117 @@
223 {
224 arm_slowmul_rtx_costs,
225 NULL,
226- 3,
227- ARM_PREFETCH_NOT_BENEFICIAL
228+ 3, /* Constant limit. */
229+ 5, /* Max cond insns. */
230+ ARM_PREFETCH_NOT_BENEFICIAL,
231+ true, /* Prefer constant pool. */
232+ arm_default_branch_cost
233 };
234
235 const struct tune_params arm_fastmul_tune =
236 {
237 arm_fastmul_rtx_costs,
238 NULL,
239- 1,
240- ARM_PREFETCH_NOT_BENEFICIAL
241+ 1, /* Constant limit. */
242+ 5, /* Max cond insns. */
243+ ARM_PREFETCH_NOT_BENEFICIAL,
244+ true, /* Prefer constant pool. */
245+ arm_default_branch_cost
246+};
247+
248+/* StrongARM has early execution of branches, so a sequence that is worth
249+ skipping is shorter. Set max_insns_skipped to a lower value. */
250+
251+const struct tune_params arm_strongarm_tune =
252+{
253+ arm_fastmul_rtx_costs,
254+ NULL,
255+ 1, /* Constant limit. */
256+ 3, /* Max cond insns. */
257+ ARM_PREFETCH_NOT_BENEFICIAL,
258+ true, /* Prefer constant pool. */
259+ arm_default_branch_cost
260 };
261
262 const struct tune_params arm_xscale_tune =
263 {
264 arm_xscale_rtx_costs,
265 xscale_sched_adjust_cost,
266- 2,
267- ARM_PREFETCH_NOT_BENEFICIAL
268+ 2, /* Constant limit. */
269+ 3, /* Max cond insns. */
270+ ARM_PREFETCH_NOT_BENEFICIAL,
271+ true, /* Prefer constant pool. */
272+ arm_default_branch_cost
273 };
274
275 const struct tune_params arm_9e_tune =
276 {
277 arm_9e_rtx_costs,
278 NULL,
279- 1,
280- ARM_PREFETCH_NOT_BENEFICIAL
281+ 1, /* Constant limit. */
282+ 5, /* Max cond insns. */
283+ ARM_PREFETCH_NOT_BENEFICIAL,
284+ true, /* Prefer constant pool. */
285+ arm_default_branch_cost
286+};
287+
288+const struct tune_params arm_v6t2_tune =
289+{
290+ arm_9e_rtx_costs,
291+ NULL,
292+ 1, /* Constant limit. */
293+ 5, /* Max cond insns. */
294+ ARM_PREFETCH_NOT_BENEFICIAL,
295+ false, /* Prefer constant pool. */
296+ arm_default_branch_cost
297+};
298+
299+/* Generic Cortex tuning. Use more specific tunings if appropriate. */
300+const struct tune_params arm_cortex_tune =
301+{
302+ arm_9e_rtx_costs,
303+ NULL,
304+ 1, /* Constant limit. */
305+ 5, /* Max cond insns. */
306+ ARM_PREFETCH_NOT_BENEFICIAL,
307+ false, /* Prefer constant pool. */
308+ arm_default_branch_cost
309+};
310+
311+/* Branches can be dual-issued on Cortex-A5, so conditional execution is
312+ less appealing. Set max_insns_skipped to a low value. */
313+
314+const struct tune_params arm_cortex_a5_tune =
315+{
316+ arm_9e_rtx_costs,
317+ NULL,
318+ 1, /* Constant limit. */
319+ 1, /* Max cond insns. */
320+ ARM_PREFETCH_NOT_BENEFICIAL,
321+ false, /* Prefer constant pool. */
322+ arm_cortex_a5_branch_cost
323 };
324
325 const struct tune_params arm_cortex_a9_tune =
326 {
327 arm_9e_rtx_costs,
328 cortex_a9_sched_adjust_cost,
329- 1,
330- ARM_PREFETCH_BENEFICIAL(4,32,32)
331+ 1, /* Constant limit. */
332+ 5, /* Max cond insns. */
333+ ARM_PREFETCH_BENEFICIAL(4,32,32),
334+ false, /* Prefer constant pool. */
335+ arm_default_branch_cost
336 };
337
338 const struct tune_params arm_fa726te_tune =
339 {
340 arm_9e_rtx_costs,
341 fa726te_sched_adjust_cost,
342- 1,
343- ARM_PREFETCH_NOT_BENEFICIAL
344+ 1, /* Constant limit. */
345+ 5, /* Max cond insns. */
346+ ARM_PREFETCH_NOT_BENEFICIAL,
347+ true, /* Prefer constant pool. */
348+ arm_default_branch_cost
349 };
350
351
352@@ -1711,7 +1784,8 @@
353 arm_tune_wbuf = (tune_flags & FL_WBUF) != 0;
354 arm_tune_xscale = (tune_flags & FL_XSCALE) != 0;
355 arm_arch_iwmmxt = (insn_flags & FL_IWMMXT) != 0;
356- arm_arch_hwdiv = (insn_flags & FL_DIV) != 0;
357+ arm_arch_thumb_hwdiv = (insn_flags & FL_THUMB_DIV) != 0;
358+ arm_arch_arm_hwdiv = (insn_flags & FL_ARM_DIV) != 0;
359 arm_tune_cortex_a9 = (arm_tune == cortexa9) != 0;
360
361 /* If we are not using the default (ARM mode) section anchor offset
362@@ -1991,12 +2065,7 @@
363 max_insns_skipped = 6;
364 }
365 else
366- {
367- /* StrongARM has early execution of branches, so a sequence
368- that is worth skipping is shorter. */
369- if (arm_tune_strongarm)
370- max_insns_skipped = 3;
371- }
372+ max_insns_skipped = current_tune->max_insns_skipped;
373
374 /* Hot/Cold partitioning is not currently supported, since we can't
375 handle literal pool placement in that case. */
376@@ -8211,6 +8280,21 @@
377 return cost;
378 }
379
380+static int
381+arm_default_branch_cost (bool speed_p, bool predictable_p ATTRIBUTE_UNUSED)
382+{
383+ if (TARGET_32BIT)
384+ return (TARGET_THUMB2 && !speed_p) ? 1 : 4;
385+ else
386+ return (optimize > 0) ? 2 : 0;
387+}
388+
389+static int
390+arm_cortex_a5_branch_cost (bool speed_p, bool predictable_p)
391+{
392+ return speed_p ? 0 : arm_default_branch_cost (speed_p, predictable_p);
393+}
394+
395 static int fp_consts_inited = 0;
396
397 /* Only zero is valid for VFP. Other values are also valid for FPA. */
398@@ -23123,6 +23207,7 @@
399 {
400 case cortexr4:
401 case cortexr4f:
402+ case cortexr5:
403 case cortexa5:
404 case cortexa8:
405 case cortexa9:
406
407=== modified file 'gcc/config/arm/arm.h'
408--- gcc/config/arm/arm.h 2011-06-02 12:12:00 +0000
409+++ gcc/config/arm/arm.h 2011-06-14 16:54:35 +0000
410@@ -101,6 +101,8 @@
411 builtin_define ("__ARM_PCS"); \
412 builtin_define ("__ARM_EABI__"); \
413 } \
414+ if (TARGET_IDIV) \
415+ builtin_define ("__ARM_ARCH_EXT_IDIV__"); \
416 } while (0)
417
418 /* The various ARM cores. */
419@@ -282,7 +284,8 @@
420 (TARGET_32BIT && arm_arch6 && (arm_arch_notm || arm_arch7em))
421
422 /* Should MOVW/MOVT be used in preference to a constant pool. */
423-#define TARGET_USE_MOVT (arm_arch_thumb2 && !optimize_size)
424+#define TARGET_USE_MOVT \
425+ (arm_arch_thumb2 && !optimize_size && !current_tune->prefer_constant_pool)
426
427 /* We could use unified syntax for arm mode, but for now we just use it
428 for Thumb-2. */
429@@ -303,6 +306,10 @@
430 /* Nonzero if this chip supports ldrex{bhd} and strex{bhd}. */
431 #define TARGET_HAVE_LDREXBHD ((arm_arch6k && TARGET_ARM) || arm_arch7)
432
433+/* Nonzero if integer division instructions supported. */
434+#define TARGET_IDIV ((TARGET_ARM && arm_arch_arm_hwdiv) \
435+ || (TARGET_THUMB2 && arm_arch_thumb_hwdiv))
436+
437 /* True iff the full BPABI is being used. If TARGET_BPABI is true,
438 then TARGET_AAPCS_BASED must be true -- but the converse does not
439 hold. TARGET_BPABI implies the use of the BPABI runtime library,
440@@ -487,8 +494,11 @@
441 /* Nonzero if chip supports Thumb 2. */
442 extern int arm_arch_thumb2;
443
444-/* Nonzero if chip supports integer division instruction. */
445-extern int arm_arch_hwdiv;
446+/* Nonzero if chip supports integer division instruction in ARM mode. */
447+extern int arm_arch_arm_hwdiv;
448+
449+/* Nonzero if chip supports integer division instruction in Thumb mode. */
450+extern int arm_arch_thumb_hwdiv;
451
452 #ifndef TARGET_DEFAULT
453 #define TARGET_DEFAULT (MASK_APCS_FRAME)
454@@ -2018,8 +2028,8 @@
455 /* Try to generate sequences that don't involve branches, we can then use
456 conditional instructions */
457 #define BRANCH_COST(speed_p, predictable_p) \
458- (TARGET_32BIT ? (TARGET_THUMB2 && !speed_p ? 1 : 4) \
459- : (optimize > 0 ? 2 : 0))
460+ (current_tune->branch_cost (speed_p, predictable_p))
461+
462
463
464 /* Position Independent Code. */
465 /* We decide which register to use based on the compilation options and
466
467=== modified file 'gcc/config/arm/arm.md'
468--- gcc/config/arm/arm.md 2011-06-02 15:58:33 +0000
469+++ gcc/config/arm/arm.md 2011-06-14 16:54:35 +0000
470@@ -490,7 +490,7 @@
471
472 (define_attr "tune_cortexr4" "yes,no"
473 (const (if_then_else
474- (eq_attr "tune" "cortexr4,cortexr4f")
475+ (eq_attr "tune" "cortexr4,cortexr4f,cortexr5")
476 (const_string "yes")
477 (const_string "no"))))
478
479@@ -3738,6 +3738,28 @@
480 (set_attr "predicable" "yes")]
481 )
482
483+
484+;; Division instructions
485+(define_insn "divsi3"
486+ [(set (match_operand:SI 0 "s_register_operand" "=r")
487+ (div:SI (match_operand:SI 1 "s_register_operand" "r")
488+ (match_operand:SI 2 "s_register_operand" "r")))]
489+ "TARGET_IDIV"
490+ "sdiv%?\t%0, %1, %2"
491+ [(set_attr "predicable" "yes")
492+ (set_attr "insn" "sdiv")]
493+)
494+
495+(define_insn "udivsi3"
496+ [(set (match_operand:SI 0 "s_register_operand" "=r")
497+ (udiv:SI (match_operand:SI 1 "s_register_operand" "r")
498+ (match_operand:SI 2 "s_register_operand" "r")))]
499+ "TARGET_IDIV"
500+ "udiv%?\t%0, %1, %2"
501+ [(set_attr "predicable" "yes")
502+ (set_attr "insn" "udiv")]
503+)
504+
505
506
507 ;; Unary arithmetic insns
508
509
510=== modified file 'gcc/config/arm/thumb2.md'
511--- gcc/config/arm/thumb2.md 2011-05-11 07:15:47 +0000
512+++ gcc/config/arm/thumb2.md 2011-06-14 16:54:35 +0000
513@@ -779,26 +779,6 @@
514 (set_attr "length" "2")]
515 )
516
517-(define_insn "divsi3"
518- [(set (match_operand:SI 0 "s_register_operand" "=r")
519- (div:SI (match_operand:SI 1 "s_register_operand" "r")
520- (match_operand:SI 2 "s_register_operand" "r")))]
521- "TARGET_THUMB2 && arm_arch_hwdiv"
522- "sdiv%?\t%0, %1, %2"
523- [(set_attr "predicable" "yes")
524- (set_attr "insn" "sdiv")]
525-)
526-
527-(define_insn "udivsi3"
528- [(set (match_operand:SI 0 "s_register_operand" "=r")
529- (udiv:SI (match_operand:SI 1 "s_register_operand" "r")
530- (match_operand:SI 2 "s_register_operand" "r")))]
531- "TARGET_THUMB2 && arm_arch_hwdiv"
532- "udiv%?\t%0, %1, %2"
533- [(set_attr "predicable" "yes")
534- (set_attr "insn" "udiv")]
535-)
536-
537 (define_insn "*thumb2_subsi_short"
538 [(set (match_operand:SI 0 "low_register_operand" "=l")
539 (minus:SI (match_operand:SI 1 "low_register_operand" "l")
540
541=== modified file 'gcc/doc/invoke.texi'
542--- gcc/doc/invoke.texi 2011-05-11 07:15:47 +0000
543+++ gcc/doc/invoke.texi 2011-06-14 16:54:35 +0000
544@@ -10208,7 +10208,8 @@
545 @samp{arm1136j-s}, @samp{arm1136jf-s}, @samp{mpcore}, @samp{mpcorenovfp},
546 @samp{arm1156t2-s}, @samp{arm1156t2f-s}, @samp{arm1176jz-s}, @samp{arm1176jzf-s},
547 @samp{cortex-a5}, @samp{cortex-a8}, @samp{cortex-a9}, @samp{cortex-a15},
548-@samp{cortex-r4}, @samp{cortex-r4f}, @samp{cortex-m4}, @samp{cortex-m3},
549+@samp{cortex-r4}, @samp{cortex-r4f}, @samp{cortex-r5},
550+@samp{cortex-m4}, @samp{cortex-m3},
551 @samp{cortex-m1},
552 @samp{cortex-m0},
553 @samp{xscale}, @samp{iwmmxt}, @samp{iwmmxt2}, @samp{ep9312}.
554
555=== modified file 'gcc/dojump.c'
556--- gcc/dojump.c 2010-05-19 19:09:57 +0000
557+++ gcc/dojump.c 2011-06-14 16:54:35 +0000
558@@ -36,6 +36,7 @@
559 #include "ggc.h"
560 #include "basic-block.h"
561 #include "output.h"
562+#include "tm_p.h"
563
564 static bool prefer_and_bit_test (enum machine_mode, int);
565 static void do_jump_by_parts_greater (tree, tree, int, rtx, rtx, int);
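
The two other fields this diff adds to the tuning structure can be illustrated with a small sketch. It assumes a hypothetical cut-down `struct tune_sketch` holding only `max_insns_skipped` and `prefer_constant_pool`; the table values are copied from the tuning tables in the diff, and `use_movt` mirrors the new `TARGET_USE_MOVT` condition with plain booleans standing in for `arm_arch_thumb2` and `optimize_size`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical cut-down version of struct tune_params with only the
   two fields illustrated here.  */
struct tune_sketch
{
  int max_insns_skipped;      /* "Max cond insns" in the diff.  */
  bool prefer_constant_pool;  /* "Prefer constant pool" in the diff.  */
};

/* Values copied from the corresponding tables in the diff.  */
static const struct tune_sketch tune_9e        = { 5, true };
static const struct tune_sketch tune_strongarm = { 3, true };
static const struct tune_sketch tune_cortex    = { 5, false };
static const struct tune_sketch tune_cortex_a5 = { 1, false };

/* Mirrors the new TARGET_USE_MOVT condition: MOVW/MOVT is used only
   when Thumb-2 is available, the code is not optimised for size, and
   the selected tuning does not prefer the constant pool.  */
static bool
use_movt (const struct tune_sketch *tune, bool arch_thumb2, bool optimize_size)
{
  assert (tune != NULL);
  return arch_thumb2 && !optimize_size && !tune->prefer_constant_pool;
}
```

Under these tables the generic Cortex tuning keeps using MOVW/MOVT, the 9e tuning falls back to the constant pool, and the Cortex-A5 entry limits conditional execution to a single instruction.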
