Merge lp:~ams-codesourcery/gcc-linaro/discourage-neon-on-a8 into lp:gcc-linaro/4.6
Status: | Rejected |
---|---|
Rejected by: | Andrew Stubbs |
Proposed branch: | lp:~ams-codesourcery/gcc-linaro/discourage-neon-on-a8 |
Merge into: | lp:gcc-linaro/4.6 |
Diff against target: | 306 lines (+124/-41) (has conflicts), 4 files modified: ChangeLog.linaro (+21/-0), gcc/config/arm/arm.md (+12/-4), gcc/config/arm/neon.md (+46/-34), gcc/config/arm/vfp.md (+45/-3). Text conflict in ChangeLog.linaro. |
To merge this branch: | bzr merge lp:~ams-codesourcery/gcc-linaro/discourage-neon-on-a8 |
Related bugs: |
Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
Ramana Radhakrishnan (community) | | | Needs Information |
Review via email: mp+56564@code.launchpad.net |
This proposal supersedes a proposal from 2011-03-25.
Commit message
Description of the change
Discourage use of NEON for integer operations on Cortex-A8.
Transfers from NEON/VFP registers to core registers are prohibitively expensive on A8. Modelling this cost is difficult, so simply discourage the use of NEON for integers. (Float values must be transferred anyway, so leave those unaffected.)
This is merged from gcc-linaro/
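As a rough illustration of the mechanism (not the actual GCC code), the sketch below models how the new "onlya8" and "nota8" values of the "arch" attribute select between the duplicated NEON alternatives in adddi3_neon. The C names (arch_enabled, struct alternative, has_preferred_alternative) are hypothetical and exist only for this example; the real logic lives in the define_attr "arch_enabled" expression in arm.md and in the register allocator's handling of the '?' constraint modifier.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-ins for three of the "arch" attribute values;
   the real attribute is defined in gcc/config/arm/arm.md. */
enum arch_attr { ARCH_ANY, ARCH_ONLYA8, ARCH_NOTA8 };

/* Mirrors the two new "arch_enabled" clauses: an alternative tagged
   onlya8 is usable only when tuning for Cortex-A8, and an alternative
   tagged nota8 only when tuning for anything else. */
bool arch_enabled(enum arch_attr arch, bool tune_is_cortexa8)
{
    switch (arch) {
    case ARCH_ANY:    return true;
    case ARCH_ONLYA8: return tune_is_cortexa8;
    case ARCH_NOTA8:  return !tune_is_cortexa8;
    }
    assert(0 && "unknown arch value");  /* plays the role of gcc_unreachable () */
    return false;
}

/* One constraint alternative: its arch tag, and whether its constraint
   string carries '?' (slightly disparaged by the register allocator). */
struct alternative {
    enum arch_attr arch;
    bool disparaged;
};

/* The four adddi3_neon alternatives after this patch:
   constraints "=w,?&r,?&r,?w" with arch "nota8,*,*,onlya8"
   ("*" means the default value, "any"). */
static const struct alternative adddi3_neon_alts[4] = {
    { ARCH_NOTA8,  false },  /* NEON, full preference */
    { ARCH_ANY,    true  },  /* core registers */
    { ARCH_ANY,    true  },  /* core registers */
    { ARCH_ONLYA8, true  },  /* NEON, disparaged like the core alternatives */
};

/* True if at least one enabled alternative is not disparaged. */
bool has_preferred_alternative(bool tune_is_cortexa8)
{
    for (int i = 0; i < 4; i++) {
        if (arch_enabled(adddi3_neon_alts[i].arch, tune_is_cortexa8)
            && !adddi3_neon_alts[i].disparaged)
            return true;
    }
    return false;
}
```

Under this model, has_preferred_alternative(false) holds because the undisparaged NEON alternative is enabled, whereas has_preferred_alternative(true) does not: when tuning for Cortex-A8, every enabled alternative carries a '?', so NEON no longer has an advantage over the core-register alternatives.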
Linaro Toolchain Builder (cbuild) wrote : Posted in a previous version of this proposal
cbuild successfully built this on i686-lucid-
The build results are available at:
http://
cbuild-checked: i686-lucid-
Linaro Toolchain Builder (cbuild) wrote : Posted in a previous version of this proposal
cbuild successfully built this on armv7l-
The build results are available at:
http://
cbuild-checked: armv7l-
Linaro Toolchain Builder (cbuild) wrote : Posted in a previous version of this proposal
cbuild successfully built this on armv7l-
The build results are available at:
http://
cbuild-checked: armv7l-
Linaro Toolchain Builder (cbuild) wrote : Posted in a previous version of this proposal
cbuild successfully built this on x86_64-
The build results are available at:
http://
cbuild-checked: x86_64-
Linaro Toolchain Builder (cbuild) wrote :
cbuild has taken a snapshot of this branch at r106730 and queued it for build.
The snapshot is available at:
http://
and will be built on the following builders:
a8-builder a9-builder i686 x86_64
You can track the build queue at:
http://
cbuild-snapshot: gcc-linaro-
cbuild-ancestor: lp:gcc-linaro/4.6+bzr106729
cbuild-state: check
Linaro Toolchain Builder (cbuild) wrote :
cbuild successfully built this on i686-lucid-
The build results are available at:
http://
The test suite results were unchanged compared to the branch point lp:gcc-linaro/4.6+bzr106729.
The full testsuite results are at:
http://
cbuild-checked: i686-lucid-
Linaro Toolchain Builder (cbuild) wrote :
cbuild successfully built this on x86_64-
The build results are available at:
http://
The test suite results were unchanged compared to the branch point lp:gcc-linaro/4.6+bzr106729.
The full testsuite results are at:
http://
cbuild-checked: x86_64-
Linaro Toolchain Builder (cbuild) wrote :
cbuild successfully built this on armv7l-
The build results are available at:
http://
The test suite was not checked as the branch point lp:gcc-linaro/4.6+bzr106729 has nothing to compare against.
The full testsuite results are at:
http://
cbuild-checked: armv7l-
Ramana Radhakrishnan (ramana) wrote :
I have marked this as Needs Information because the cbuild run didn't seem to have a baseline to compare the test results against.
I would think this is largely OK because it has already been approved upstream for trunk.
http://
cheers
Ramana
Ira Rosen (irar) wrote :
I am out of the office until 17/04/2011.
Note: This is an automated response to your message "Re: [Merge]
lp:~ams-codesourcery/gcc-linaro/discourage-neon-on-a8 into
lp:gcc-linaro/4.6" sent on 12/4/11 15:49:29.
This is the only notification you will receive while this person is away.
Unmerged revisions
- 106730. By Andrew Stubbs
  Discourage use of NEON on Cortex-A8.
  Backport from FSF.
Preview Diff
=== modified file 'ChangeLog.linaro'
--- ChangeLog.linaro 2011-04-05 16:18:11 +0000
+++ ChangeLog.linaro 2011-04-06 13:23:50 +0000
@@ -1,3 +1,4 @@
+<<<<<<< TREE
 2011-03-23 Andrew Stubbs <ams@codesourcery.com>

 Backport from FSF:
@@ -12,6 +13,26 @@

 Merge from FSF GCC 4.6 (svn branches/gcc-4_6-branch 171336).

+=======
+2011-03-25 Andrew Stubbs <ams@codesourcery.com>
+
+ Backport from FSF:
+
+ 2011-03-25 Bernd Schmidt <bernds@codesourcery.com>
+ Andrew Stubbs <ams@codesourcery.com>
+
+ gcc/
+ * config/arm/vfp.md (arm_movdi_vfp): Enable only when not tuning
+ for Cortex-A8.
+ (arm_movdi_vfp_cortexa8): New pattern.
+ * config/arm/neon.md (adddi3_neon, subdi3_neon, anddi3_neon,
+ iordi3_neon, xordi3_neon): Add alternatives to discourage Neon
+ instructions when tuning for Cortex-A8. Set attribute "arch".
+ * config/arm/arm.md: Move include arm-tune.md up a bit.
+ (define_attr "arch"): Add "onlya8" and "nota8" values.
+ (define_attr "arch_enabled"): Handle "onlya8" and "nota8".
+
+>>>>>>> MERGE-SOURCE
 2011-03-22 Andrew Stubbs <ams@codesourcery.com>

 Backport from FSF:

=== modified file 'gcc/config/arm/arm.md'
--- gcc/config/arm/arm.md 2011-03-15 19:59:25 +0000
+++ gcc/config/arm/arm.md 2011-04-06 13:23:50 +0000
@@ -149,6 +149,9 @@
 ;;---------------------------------------------------------------------------
 ;; Attributes

+;; Processor type. This is created automatically from arm-cores.def.
+(include "arm-tune.md")
+
 ; IS_THUMB is set to 'yes' when we are generating Thumb code, and 'no' when
 ; generating ARM code. This is used to control the length of some insn
 ; patterns that share the same RTL in both ARM and Thumb code.
@@ -192,7 +195,7 @@
 ; for ARM or Thumb-2 with arm_arch6, and nov6 for ARM without
 ; arm_arch6. This attribute is used to compute attribute "enabled",
 ; use type "any" to enable an alternative in all cases.
-(define_attr "arch" "any,a,t,32,t1,t2,v6,nov6"
+(define_attr "arch" "any,a,t,32,t1,t2,v6,nov6,onlya8,nota8"
 (const_string "any"))

 (define_attr "arch_enabled" "no,yes"
@@ -225,6 +228,14 @@

 (and (eq_attr "arch" "nov6")
 (ne (symbol_ref "(TARGET_32BIT && !arm_arch6)") (const_int 0)))
+ (const_string "yes")
+
+ (and (eq_attr "arch" "onlya8")
+ (eq_attr "tune" "cortexa8"))
+ (const_string "yes")
+
+ (and (eq_attr "arch" "nota8")
+ (not (eq_attr "tune" "cortexa8")))
 (const_string "yes")]
 (const_string "no")))

@@ -485,9 +496,6 @@
 ;;---------------------------------------------------------------------------
 ;; Pipeline descriptions

-;; Processor type. This is created automatically from arm-cores.def.
-(include "arm-tune.md")
-
 (define_attr "tune_cortexr4" "yes,no"
 (const (if_then_else
 (eq_attr "tune" "cortexr4,cortexr4f")

=== modified file 'gcc/config/arm/neon.md'
--- gcc/config/arm/neon.md 2011-01-03 20:52:22 +0000
+++ gcc/config/arm/neon.md 2011-04-06 13:23:50 +0000
@@ -583,23 +583,25 @@
 )

 (define_insn "adddi3_neon"
- [(set (match_operand:DI 0 "s_register_operand" "=w,?&r,?&r")
- (plus:DI (match_operand:DI 1 "s_register_operand" "%w,0,0")
- (match_operand:DI 2 "s_register_operand" "w,r,0")))
+ [(set (match_operand:DI 0 "s_register_operand" "=w,?&r,?&r,?w")
+ (plus:DI (match_operand:DI 1 "s_register_operand" "%w,0,0,w")
+ (match_operand:DI 2 "s_register_operand" "w,r,0,w")))
 (clobber (reg:CC CC_REGNUM))]
 "TARGET_NEON"
 {
 switch (which_alternative)
 {
- case 0: return "vadd.i64\t%P0, %P1, %P2";
+ case 0: /* fall through */
+ case 3: return "vadd.i64\t%P0, %P1, %P2";
 case 1: return "#";
 case 2: return "#";
 default: gcc_unreachable ();
 }
 }
- [(set_attr "neon_type" "neon_int_1,*,*")
- (set_attr "conds" "*,clob,clob")
- (set_attr "length" "*,8,8")]
+ [(set_attr "neon_type" "neon_int_1,*,*,neon_int_1")
+ (set_attr "conds" "*,clob,clob,*")
+ (set_attr "length" "*,8,8,*")
+ (set_attr "arch" "nota8,*,*,onlya8")]
 )

 (define_insn "*sub<mode>3_neon"
@@ -617,24 +619,26 @@
 )

 (define_insn "subdi3_neon"
- [(set (match_operand:DI 0 "s_register_operand" "=w,?&r,?&r,?&r")
- (minus:DI (match_operand:DI 1 "s_register_operand" "w,0,r,0")
- (match_operand:DI 2 "s_register_operand" "w,r,0,0")))
+ [(set (match_operand:DI 0 "s_register_operand" "=w,?&r,?&r,?&r,?w")
+ (minus:DI (match_operand:DI 1 "s_register_operand" "w,0,r,0,w")
+ (match_operand:DI 2 "s_register_operand" "w,r,0,0,w")))
 (clobber (reg:CC CC_REGNUM))]
 "TARGET_NEON"
 {
 switch (which_alternative)
 {
- case 0: return "vsub.i64\t%P0, %P1, %P2";
+ case 0: /* fall through */
+ case 4: return "vsub.i64\t%P0, %P1, %P2";
 case 1: /* fall through */
 case 2: /* fall through */
 case 3: return "subs\\t%Q0, %Q1, %Q2\;sbc\\t%R0, %R1, %R2";
 default: gcc_unreachable ();
 }
 }
- [(set_attr "neon_type" "neon_int_2,*,*,*")
- (set_attr "conds" "*,clob,clob,clob")
- (set_attr "length" "*,8,8,8")]
+ [(set_attr "neon_type" "neon_int_2,*,*,*,neon_int_2")
+ (set_attr "conds" "*,clob,clob,clob,*")
+ (set_attr "length" "*,8,8,8,*")
+ (set_attr "arch" "nota8,*,*,*,onlya8")]
 )

 (define_insn "*mul<mode>3_neon"
@@ -720,23 +724,26 @@
 )

 (define_insn "iordi3_neon"
- [(set (match_operand:DI 0 "s_register_operand" "=w,w,?&r,?&r")
- (ior:DI (match_operand:DI 1 "s_register_operand" "%w,0,0,r")
- (match_operand:DI 2 "neon_logic_op2" "w,Dl,r,r")))]
+ [(set (match_operand:DI 0 "s_register_operand" "=w,w,?&r,?&r,?w,?w")
+ (ior:DI (match_operand:DI 1 "s_register_operand" "%w,0,0,r,w,0")
+ (match_operand:DI 2 "neon_logic_op2" "w,Dl,r,r,w,Dl")))]
 "TARGET_NEON"
 {
 switch (which_alternative)
 {
- case 0: return "vorr\t%P0, %P1, %P2";
- case 1: return neon_output_logic_immediate ("vorr", &operands[2],
+ case 0: /* fall through */
+ case 4: return "vorr\t%P0, %P1, %P2";
+ case 1: /* fall through */
+ case 5: return neon_output_logic_immediate ("vorr", &operands[2],
 DImode, 0, VALID_NEON_QREG_MODE (DImode));
 case 2: return "#";
 case 3: return "#";
 default: gcc_unreachable ();
 }
 }
- [(set_attr "neon_type" "neon_int_1,neon_int_1,*,*")
- (set_attr "length" "*,*,8,8")]
+ [(set_attr "neon_type" "neon_int_1,neon_int_1,*,*,neon_int_1,neon_int_1")
+ (set_attr "length" "*,*,8,8,*,*")
+ (set_attr "arch" "nota8,nota8,*,*,onlya8,onlya8")]
 )

 ;; The concrete forms of the Neon immediate-logic instructions are vbic and
@@ -762,23 +769,26 @@
 )

 (define_insn "anddi3_neon"
- [(set (match_operand:DI 0 "s_register_operand" "=w,w,?&r,?&r")
- (and:DI (match_operand:DI 1 "s_register_operand" "%w,0,0,r")
- (match_operand:DI 2 "neon_inv_logic_op2" "w,DL,r,r")))]
+ [(set (match_operand:DI 0 "s_register_operand" "=w,w,?&r,?&r,?w,?w")
+ (and:DI (match_operand:DI 1 "s_register_operand" "%w,0,0,r,w,0")
+ (match_operand:DI 2 "neon_inv_logic_op2" "w,DL,r,r,w,DL")))]
 "TARGET_NEON"
 {
 switch (which_alternative)
 {
- case 0: return "vand\t%P0, %P1, %P2";
- case 1: return neon_output_logic_immediate ("vand", &operands[2],
+ case 0: /* fall through */
+ case 4: return "vand\t%P0, %P1, %P2";
+ case 1: /* fall through */
+ case 5: return neon_output_logic_immediate ("vand", &operands[2],
 DImode, 1, VALID_NEON_QREG_MODE (DImode));
 case 2: return "#";
 case 3: return "#";
 default: gcc_unreachable ();
 }
 }
- [(set_attr "neon_type" "neon_int_1,neon_int_1,*,*")
- (set_attr "length" "*,*,8,8")]
+ [(set_attr "neon_type" "neon_int_1,neon_int_1,*,*,neon_int_1,neon_int_1")
+ (set_attr "length" "*,*,8,8,*,*")
+ (set_attr "arch" "nota8,nota8,*,*,onlya8,onlya8")]
 )

 (define_insn "orn<mode>3_neon"
@@ -836,16 +846,18 @@
 )

 (define_insn "xordi3_neon"
- [(set (match_operand:DI 0 "s_register_operand" "=w,?&r,?&r")
- (xor:DI (match_operand:DI 1 "s_register_operand" "%w,0,r")
- (match_operand:DI 2 "s_register_operand" "w,r,r")))]
+ [(set (match_operand:DI 0 "s_register_operand" "=w,?&r,?&r,?w")
+ (xor:DI (match_operand:DI 1 "s_register_operand" "%w,0,r,w")
+ (match_operand:DI 2 "s_register_operand" "w,r,r,w")))]
 "TARGET_NEON"
 "@
 veor\t%P0, %P1, %P2
 #
- #"
- [(set_attr "neon_type" "neon_int_1,*,*")
- (set_attr "length" "*,8,8")]
+ #
+ veor\t%P0, %P1, %P2"
+ [(set_attr "neon_type" "neon_int_1,*,*,neon_int_1")
+ (set_attr "length" "*,8,8,*")
+ (set_attr "arch" "nota8,*,*,onlya8")]
 )

 (define_insn "one_cmpl<mode>2"

=== modified file 'gcc/config/arm/vfp.md'
--- gcc/config/arm/vfp.md 2011-01-20 22:03:29 +0000
+++ gcc/config/arm/vfp.md 2011-04-06 13:23:50 +0000
@@ -134,9 +134,51 @@
 ;; DImode moves

 (define_insn "*arm_movdi_vfp"
- [(set (match_operand:DI 0 "nonimmediate_di_operand" "=r, r,m,w,r,w,w, Uv")
- (match_operand:DI 1 "di_operand" "rIK,mi,r,r,w,w,Uvi,w"))]
- "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_VFP
+ [(set (match_operand:DI 0 "nonimmediate_di_operand" "=r, r, m,w,r,w,w, Uv")
+ (match_operand:DI 1 "di_operand" "rIK,mi,r,r,w,w,Uvi,w"))]
+ "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_VFP && arm_tune != cortexa8
+ && ( register_operand (operands[0], DImode)
+ || register_operand (operands[1], DImode))"
+ "*
+ switch (which_alternative)
+ {
+ case 0:
+ return \"#\";
+ case 1:
+ case 2:
+ return output_move_double (operands);
+ case 3:
+ return \"fmdrr%?\\t%P0, %Q1, %R1\\t%@ int\";
+ case 4:
+ return \"fmrrd%?\\t%Q0, %R0, %P1\\t%@ int\";
+ case 5:
+ if (TARGET_VFP_SINGLE)
+ return \"fcpys%?\\t%0, %1\\t%@ int\;fcpys%?\\t%p0, %p1\\t%@ int\";
+ else
+ return \"fcpyd%?\\t%P0, %P1\\t%@ int\";
+ case 6: case 7:
+ return output_move_vfp (operands);
+ default:
+ gcc_unreachable ();
+ }
+ "
+ [(set_attr "type" "*,load2,store2,r_2_f,f_2_r,ffarithd,f_loadd,f_stored")
+ (set_attr "neon_type" "*,*,*,neon_mcr_2_mcrr,neon_mrrc,neon_vmov,*,*")
+ (set (attr "length") (cond [(eq_attr "alternative" "0,1,2") (const_int 8)
+ (eq_attr "alternative" "5")
+ (if_then_else
+ (eq (symbol_ref "TARGET_VFP_SINGLE") (const_int 1))
+ (const_int 8)
+ (const_int 4))]
+ (const_int 4)))
+ (set_attr "pool_range" "*,1020,*,*,*,*,1020,*")
+ (set_attr "neg_pool_range" "*,1008,*,*,*,*,1008,*")]
+)
+
+(define_insn "*arm_movdi_vfp_cortexa8"
+ [(set (match_operand:DI 0 "nonimmediate_di_operand" "=r, r,m,w,!r,w,w, Uv")
+ (match_operand:DI 1 "di_operand" "rIK,mi,r,r,w,w,Uvi,w"))]
+ "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_VFP && arm_tune == cortexa8
 && ( register_operand (operands[0], DImode)
 || register_operand (operands[1], DImode))"
 "*
cbuild has taken a snapshot of this branch at r106730 and queued it for build.
The snapshot is available at:
http://ex.seabright.co.nz/snapshots/
and named something like gcc-linaro-4.5+bzr106730~ams-codesourcery~discourage-neon-on-a8.*
You can track the build queue at:
http://ex.seabright.co.nz/helpers/scheduler
cbuild-snapshot: gcc-linaro-4.5+bzr106730~ams-codesourcery~discourage-neon-on-a8
cbuild-state: check