Merge lp:~ams-codesourcery/gcc-linaro/widening-multiplies-4.6 into lp:gcc-linaro/4.6
Status: | Superseded |
---|---|
Proposed branch: | lp:~ams-codesourcery/gcc-linaro/widening-multiplies-4.6 |
Merge into: | lp:gcc-linaro/4.6 |
Diff against target: | 1157 lines (+645/-134) (has conflicts), 17 files modified. Text conflict in ChangeLog.linaro. |
ChangeLog.linaro (+105/-0)
gcc/config/arm/arm.md (+1/-1)
gcc/expr.c (+14/-15)
gcc/genopinit.c (+24/-20)
gcc/optabs.c (+56/-15)
gcc/optabs.h (+52/-0)
gcc/testsuite/gcc.target/arm/no-wmla-1.c (+11/-0)
gcc/testsuite/gcc.target/arm/wmul-10.c (+10/-0)
gcc/testsuite/gcc.target/arm/wmul-5.c (+10/-0)
gcc/testsuite/gcc.target/arm/wmul-6.c (+10/-0)
gcc/testsuite/gcc.target/arm/wmul-7.c (+10/-0)
gcc/testsuite/gcc.target/arm/wmul-8.c (+10/-0)
gcc/testsuite/gcc.target/arm/wmul-9.c (+10/-0)
gcc/testsuite/gcc.target/arm/wmul-bitfield-1.c (+17/-0)
gcc/testsuite/gcc.target/arm/wmul-bitfield-2.c (+17/-0)
gcc/tree-cfg.c (+2/-2)
gcc/tree-ssa-math-opts.c (+286/-81)
To merge this branch: | bzr merge lp:~ams-codesourcery/gcc-linaro/widening-multiplies-4.6 |
Related bugs: |
Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
Michael Hope | | | Needs Fixing |
Review via email: mp+68349@code.launchpad.net |
This proposal supersedes a proposal from 2011-07-15.
This proposal has been superseded by a proposal from 2011-07-22.
Commit message
Description of the change
Widening multiplies optimizations.
The first commit is not approved yet, but the rest have been reviewed upstream and are ready to commit.
http://<email address hidden>
UPDATE: Now with an extra bug-fix.
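For readers unfamiliar with the term, the following is a minimal C sketch of the kind of source pattern these optimizations target, modelled loosely on the test cases added in this branch (for example wmul-6.c and wmul-10.c). The function names `wmla_s16` and `wmul_u32` are hypothetical and do not appear in the branch. Which instruction the compiler actually selects depends on the target and options, so no particular output is claimed here.

```c
#include <stdint.h>

/* Hypothetical illustration, not part of the branch.
   Widening multiply-accumulate: two 16-bit inputs are multiplied
   and the product is added to a 64-bit accumulator.  The inputs are
   widened by more than one mode (16 -> 64 bits), which is the case
   the ChangeLog describes as "Allow widening by more than one mode".  */
int64_t
wmla_s16 (int64_t acc, int16_t b, int16_t c)
{
  return acc + (int64_t) b * (int64_t) c;
}

/* Hypothetical illustration, not part of the branch.
   Plain widening multiply: two unsigned 32-bit inputs produce a
   64-bit product, so the result cannot overflow.  */
uint64_t
wmul_u32 (uint32_t a, uint32_t b)
{
  return (uint64_t) a * (uint64_t) b;
}
```

The casts on both operands matter: without them the multiplication is carried out in the narrower (promoted) type and only the result is widened, which is a different computation when the product overflows.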
Linaro Toolchain Builder (cbuild) wrote: Posted in a previous version of this proposal
Michael Hope (michaelh1) wrote: Posted in a previous version of this proposal
cbuild had trouble building this on <proposals.Build instance at 0x2b85680>.
See the *failed.txt logs under the build results at:
http://
The test suite results were not checked.
cbuild-checked: armv7l-
Michael Hope (michaelh1) wrote: Posted in a previous version of this proposal
cbuild had trouble building this on <proposals.Build instance at 0x2141ea8>.
See the *failed.txt logs under the build results at:
http://
The test suite results were not checked.
cbuild-checked: armv7l-
Michael Hope (michaelh1) wrote: Posted in a previous version of this proposal
cbuild had trouble building this on <proposals.Build instance at 0x7fe8f501a050>.
See the *failed.txt logs under the build results at:
http://
The test suite results were not checked.
cbuild-checked: i686-natty-
Michael Hope (michaelh1) wrote: Posted in a previous version of this proposal
cbuild had trouble building this on <proposals.Build instance at 0x7fe8f5028ab8>.
See the *failed.txt logs under the build results at:
http://
The test suite results were not checked.
cbuild-checked: x86_64-
Linaro Toolchain Builder (cbuild) wrote:
cbuild has taken a snapshot of this branch at r106782 and queued it for build.
The snapshot is available at:
http://
and will be built on the following builders:
a9-builder armv5-builder i686 x86_64
You can track the build queue at:
http://
cbuild-snapshot: gcc-linaro-
cbuild-ancestor: lp:gcc-linaro/4.6+bzr106774
cbuild-state: check
Michael Hope (michaelh1) wrote:
cbuild had trouble building this on <proposals.Build instance at 0x3fec710>.
See the *failed.txt logs under the build results at:
http://
The test suite results were not checked.
cbuild-checked: i686-natty-
Michael Hope (michaelh1) wrote:
cbuild had trouble building this on <proposals.Build instance at 0x2b08c68>.
See the *failed.txt logs under the build results at:
http://
The test suite results were not checked.
cbuild-checked: x86_64-
Michael Hope (michaelh1) wrote:
cbuild had trouble building this on armv7l-
See the *failed.txt logs under the build results at:
http://
The test suite results were not checked.
cbuild-checked: armv7l-
Michael Hope (michaelh1) wrote:
cbuild had trouble building this on armv7l-
See the *failed.txt logs under the build results at:
http://
The test suite results were not checked.
cbuild-checked: armv7l-
Preview Diff
1 | === modified file 'ChangeLog.linaro' |
2 | --- ChangeLog.linaro 2011-07-18 14:47:22 +0000 |
3 | +++ ChangeLog.linaro 2011-07-19 09:04:38 +0000 |
4 | @@ -1,3 +1,4 @@ |
5 | +<<<<<<< TREE |
6 | 2011-07-18 Andrew Stubbs <ams@codesourcery.com> |
7 | |
8 | gcc/ |
9 | @@ -83,6 +84,110 @@ |
10 | |
11 | * gcc.c-torture/compile/20110401-1.c: New test. |
12 | |
13 | +======= |
14 | +2011-07-15 Andrew Stubbs <ams@codesourcery.com> |
15 | + |
16 | + Backport from patches proposed for 4.7: |
17 | + |
18 | + 2011-06-24 Andrew Stubbs <ams@codesourcery.com> |
19 | + |
20 | + gcc/ |
21 | + * tree-ssa-math-opts.c (convert_mult_to_widen): Better handle |
22 | + unsigned inputs of different modes. |
23 | + (convert_plusminus_to_widen): Likewise. |
24 | + |
25 | + gcc/testsuite/ |
26 | + * gcc.target/arm/wmul-9.c: New file. |
27 | + * gcc.target/arm/wmul-bitfield-2.c: New file. |
28 | + |
29 | + 2011-07-14 Andrew Stubbs <ams@codesourcery.com> |
30 | + |
31 | + gcc/ |
32 | + * tree-ssa-math-opts.c (is_widening_mult_rhs_p): Add new argument |
33 | + 'type'. |
34 | + Use 'type' from caller, not inferred from 'rhs'. |
35 | + Don't reject non-conversion statements. Do return lhs in this case. |
36 | + (is_widening_mult_p): Add new argument 'type'. |
37 | + Use 'type' from caller, not inferred from 'stmt'. |
38 | + Pass type to is_widening_mult_rhs_p. |
39 | + (convert_mult_to_widen): Pass type to is_widening_mult_p. |
40 | + (convert_plusminus_to_widen): Likewise. |
41 | + |
42 | + gcc/testsuite/ |
43 | + * gcc.target/arm/wmul-8.c: New file. |
44 | + |
45 | + 2011-07-14 Andrew Stubbs <ams@codesourcery.com> |
46 | + |
47 | + gcc/ |
48 | + * tree-ssa-math-opts.c (is_widening_mult_p): Remove FIXME. |
49 | + Ensure the the larger type is the first operand. |
50 | + |
51 | + gcc/testsuite/ |
52 | + * gcc.target/arm/wmul-7.c: New file. |
53 | + |
54 | + 2011-07-14 Andrew Stubbs <ams@codesourcery.com> |
55 | + |
56 | + gcc/ |
57 | + * tree-ssa-math-opts.c (convert_mult_to_widen): Convert |
58 | + unsupported unsigned multiplies to signed. |
59 | + (convert_plusminus_to_widen): Likewise. |
60 | + |
61 | + gcc/testsuite/ |
62 | + * gcc.target/arm/wmul-6.c: New file. |
63 | + |
64 | + 2011-07-14 Andrew Stubbs <ams@codesourcery.com> |
65 | + |
66 | + gcc/ |
67 | + * tree-ssa-math-opts.c (convert_plusminus_to_widen): Permit a single |
68 | + conversion statement separating multiply-and-accumulate. |
69 | + |
70 | + gcc/testsuite/ |
71 | + * gcc.target/arm/wmul-5.c: New file. |
72 | + * gcc.target/arm/no-wmla-1.c: New file. |
73 | + |
74 | + 2011-07-14 Andrew Stubbs <ams@codesourcery.com> |
75 | + |
76 | + gcc/ |
77 | + * config/arm/arm.md (maddhidi4): Remove '*' from name. |
78 | + * expr.c (expand_expr_real_2): Use find_widening_optab_handler. |
79 | + * optabs.c (find_widening_optab_handler_and_mode): New function. |
80 | + (expand_widen_pattern_expr): Use find_widening_optab_handler. |
81 | + (expand_binop_directly): Likewise. |
82 | + (expand_binop): Likewise. |
83 | + * optabs.h (find_widening_optab_handler): New macro define. |
84 | + (find_widening_optab_handler_and_mode): New prototype. |
85 | + * tree-cfg.c (verify_gimple_assign_binary): Adjust WIDEN_MULT_EXPR |
86 | + type precision rules. |
87 | + (verify_gimple_assign_ternary): Likewise for WIDEN_MULT_PLUS_EXPR. |
88 | + * tree-ssa-math-opts.c (build_and_insert_cast): New function. |
89 | + (is_widening_mult_rhs_p): Allow widening by more than one mode. |
90 | + Explicitly disallow mis-matched input types. |
91 | + (convert_mult_to_widen): Use find_widening_optab_handler, and cast |
92 | + input types to fit the new handler. |
93 | + (convert_plusminus_to_widen): Likewise. |
94 | + |
95 | + gcc/testsuite/ |
96 | + * gcc.target/arm/wmul-bitfield-1.c: New file. |
97 | + |
98 | + |
99 | + 2011-07-09 Andrew Stubbs <ams@codesourcery.com> |
100 | + |
101 | + gcc/ |
102 | + * expr.c (expand_expr_real_2): Use widening_optab_handler. |
103 | + * genopinit.c (optabs): Use set_widening_optab_handler for $N. |
104 | + (gen_insn): $N now means $a must be wider than $b, not consecutive. |
105 | + * optabs.c (expand_widen_pattern_expr): Use widening_optab_handler. |
106 | + (expand_binop_directly): Likewise. |
107 | + (expand_binop): Likewise. |
108 | + * optabs.h (widening_optab_handlers): New struct. |
109 | + (optab_d): New member, 'widening'. |
110 | + (widening_optab_handler): New function. |
111 | + (set_widening_optab_handler): New function. |
112 | + * tree-ssa-math-opts.c (convert_mult_to_widen): Use |
113 | + widening_optab_handler. |
114 | + (convert_plusminus_to_widen): Likewise. |
115 | + |
116 | +>>>>>>> MERGE-SOURCE |
117 | 2011-07-13 Richard Sandiford <richard.sandiford@linaro.org> |
118 | |
119 | Backport from mainline: |
120 | |
121 | === modified file 'gcc/config/arm/arm.md' |
122 | --- gcc/config/arm/arm.md 2011-06-28 12:02:27 +0000 |
123 | +++ gcc/config/arm/arm.md 2011-07-19 09:04:38 +0000 |
124 | @@ -1839,7 +1839,7 @@ |
125 | (set_attr "predicable" "yes")] |
126 | ) |
127 | |
128 | -(define_insn "*maddhidi4" |
129 | +(define_insn "maddhidi4" |
130 | [(set (match_operand:DI 0 "s_register_operand" "=r") |
131 | (plus:DI |
132 | (mult:DI (sign_extend:DI |
133 | |
134 | === modified file 'gcc/expr.c' |
135 | --- gcc/expr.c 2011-06-02 12:12:00 +0000 |
136 | +++ gcc/expr.c 2011-07-19 09:04:38 +0000 |
137 | @@ -7658,18 +7658,16 @@ |
138 | { |
139 | enum machine_mode innermode = TYPE_MODE (TREE_TYPE (treeop0)); |
140 | this_optab = usmul_widen_optab; |
141 | - if (mode == GET_MODE_2XWIDER_MODE (innermode)) |
142 | + if (find_widening_optab_handler (this_optab, mode, innermode, 0) |
143 | + != CODE_FOR_nothing) |
144 | { |
145 | - if (optab_handler (this_optab, mode) != CODE_FOR_nothing) |
146 | - { |
147 | - if (TYPE_UNSIGNED (TREE_TYPE (treeop0))) |
148 | - expand_operands (treeop0, treeop1, NULL_RTX, &op0, &op1, |
149 | - EXPAND_NORMAL); |
150 | - else |
151 | - expand_operands (treeop0, treeop1, NULL_RTX, &op1, &op0, |
152 | - EXPAND_NORMAL); |
153 | - goto binop3; |
154 | - } |
155 | + if (TYPE_UNSIGNED (TREE_TYPE (treeop0))) |
156 | + expand_operands (treeop0, treeop1, NULL_RTX, &op0, &op1, |
157 | + EXPAND_NORMAL); |
158 | + else |
159 | + expand_operands (treeop0, treeop1, NULL_RTX, &op1, &op0, |
160 | + EXPAND_NORMAL); |
161 | + goto binop3; |
162 | } |
163 | } |
164 | /* Check for a multiplication with matching signedness. */ |
165 | @@ -7684,10 +7682,10 @@ |
166 | optab other_optab = zextend_p ? smul_widen_optab : umul_widen_optab; |
167 | this_optab = zextend_p ? umul_widen_optab : smul_widen_optab; |
168 | |
169 | - if (mode == GET_MODE_2XWIDER_MODE (innermode) |
170 | - && TREE_CODE (treeop0) != INTEGER_CST) |
171 | + if (TREE_CODE (treeop0) != INTEGER_CST) |
172 | { |
173 | - if (optab_handler (this_optab, mode) != CODE_FOR_nothing) |
174 | + if (find_widening_optab_handler (this_optab, mode, innermode, 0) |
175 | + != CODE_FOR_nothing) |
176 | { |
177 | expand_operands (treeop0, treeop1, NULL_RTX, &op0, &op1, |
178 | EXPAND_NORMAL); |
179 | @@ -7695,7 +7693,8 @@ |
180 | unsignedp, this_optab); |
181 | return REDUCE_BIT_FIELD (temp); |
182 | } |
183 | - if (optab_handler (other_optab, mode) != CODE_FOR_nothing |
184 | + if (find_widening_optab_handler (other_optab, mode, innermode, 0) |
185 | + != CODE_FOR_nothing |
186 | && innermode == word_mode) |
187 | { |
188 | rtx htem, hipart; |
189 | |
190 | === modified file 'gcc/genopinit.c' |
191 | --- gcc/genopinit.c 2011-05-05 15:43:06 +0000 |
192 | +++ gcc/genopinit.c 2011-07-19 09:04:38 +0000 |
193 | @@ -46,10 +46,12 @@ |
194 | used. $A and $B are replaced with the full name of the mode; $a and $b |
195 | are replaced with the short form of the name, as above. |
196 | |
197 | - If $N is present in the pattern, it means the two modes must be consecutive |
198 | - widths in the same mode class (e.g, QImode and HImode). $I means that |
199 | - only full integer modes should be considered for the next mode, and $F |
200 | - means that only float modes should be considered. |
201 | + If $N is present in the pattern, it means the two modes must be in |
202 | + the same mode class, and $b must be greater than $a (e.g, QImode |
203 | + and HImode). |
204 | + |
205 | + $I means that only full integer modes should be considered for the |
206 | + next mode, and $F means that only float modes should be considered. |
207 | $P means that both full and partial integer modes should be considered. |
208 | $Q means that only fixed-point modes should be considered. |
209 | |
210 | @@ -99,17 +101,17 @@ |
211 | "set_optab_handler (smulv_optab, $A, CODE_FOR_$(mulv$I$a3$))", |
212 | "set_optab_handler (umul_highpart_optab, $A, CODE_FOR_$(umul$a3_highpart$))", |
213 | "set_optab_handler (smul_highpart_optab, $A, CODE_FOR_$(smul$a3_highpart$))", |
214 | - "set_optab_handler (smul_widen_optab, $B, CODE_FOR_$(mul$a$b3$)$N)", |
215 | - "set_optab_handler (umul_widen_optab, $B, CODE_FOR_$(umul$a$b3$)$N)", |
216 | - "set_optab_handler (usmul_widen_optab, $B, CODE_FOR_$(usmul$a$b3$)$N)", |
217 | - "set_optab_handler (smadd_widen_optab, $B, CODE_FOR_$(madd$a$b4$)$N)", |
218 | - "set_optab_handler (umadd_widen_optab, $B, CODE_FOR_$(umadd$a$b4$)$N)", |
219 | - "set_optab_handler (ssmadd_widen_optab, $B, CODE_FOR_$(ssmadd$a$b4$)$N)", |
220 | - "set_optab_handler (usmadd_widen_optab, $B, CODE_FOR_$(usmadd$a$b4$)$N)", |
221 | - "set_optab_handler (smsub_widen_optab, $B, CODE_FOR_$(msub$a$b4$)$N)", |
222 | - "set_optab_handler (umsub_widen_optab, $B, CODE_FOR_$(umsub$a$b4$)$N)", |
223 | - "set_optab_handler (ssmsub_widen_optab, $B, CODE_FOR_$(ssmsub$a$b4$)$N)", |
224 | - "set_optab_handler (usmsub_widen_optab, $B, CODE_FOR_$(usmsub$a$b4$)$N)", |
225 | + "set_widening_optab_handler (smul_widen_optab, $B, $A, CODE_FOR_$(mul$a$b3$)$N)", |
226 | + "set_widening_optab_handler (umul_widen_optab, $B, $A, CODE_FOR_$(umul$a$b3$)$N)", |
227 | + "set_widening_optab_handler (usmul_widen_optab, $B, $A, CODE_FOR_$(usmul$a$b3$)$N)", |
228 | + "set_widening_optab_handler (smadd_widen_optab, $B, $A, CODE_FOR_$(madd$a$b4$)$N)", |
229 | + "set_widening_optab_handler (umadd_widen_optab, $B, $A, CODE_FOR_$(umadd$a$b4$)$N)", |
230 | + "set_widening_optab_handler (ssmadd_widen_optab, $B, $A, CODE_FOR_$(ssmadd$a$b4$)$N)", |
231 | + "set_widening_optab_handler (usmadd_widen_optab, $B, $A, CODE_FOR_$(usmadd$a$b4$)$N)", |
232 | + "set_widening_optab_handler (smsub_widen_optab, $B, $A, CODE_FOR_$(msub$a$b4$)$N)", |
233 | + "set_widening_optab_handler (umsub_widen_optab, $B, $A, CODE_FOR_$(umsub$a$b4$)$N)", |
234 | + "set_widening_optab_handler (ssmsub_widen_optab, $B, $A, CODE_FOR_$(ssmsub$a$b4$)$N)", |
235 | + "set_widening_optab_handler (usmsub_widen_optab, $B, $A, CODE_FOR_$(usmsub$a$b4$)$N)", |
236 | "set_optab_handler (sdiv_optab, $A, CODE_FOR_$(div$a3$))", |
237 | "set_optab_handler (ssdiv_optab, $A, CODE_FOR_$(ssdiv$Q$a3$))", |
238 | "set_optab_handler (sdivv_optab, $A, CODE_FOR_$(div$V$I$a3$))", |
239 | @@ -304,7 +306,7 @@ |
240 | { |
241 | int force_float = 0, force_int = 0, force_partial_int = 0; |
242 | int force_fixed = 0; |
243 | - int force_consec = 0; |
244 | + int force_wider = 0; |
245 | int matches = 1; |
246 | |
247 | for (pp = optabs[pindex]; pp[0] != '$' || pp[1] != '('; pp++) |
248 | @@ -322,7 +324,7 @@ |
249 | switch (*++pp) |
250 | { |
251 | case 'N': |
252 | - force_consec = 1; |
253 | + force_wider = 1; |
254 | break; |
255 | case 'I': |
256 | force_int = 1; |
257 | @@ -391,7 +393,10 @@ |
258 | || mode_class[i] == MODE_VECTOR_FRACT |
259 | || mode_class[i] == MODE_VECTOR_UFRACT |
260 | || mode_class[i] == MODE_VECTOR_ACCUM |
261 | - || mode_class[i] == MODE_VECTOR_UACCUM)) |
262 | + || mode_class[i] == MODE_VECTOR_UACCUM) |
263 | + && (! force_wider |
264 | + || *pp == 'a' |
265 | + || m1 < i)) |
266 | break; |
267 | } |
268 | |
269 | @@ -411,8 +416,7 @@ |
270 | } |
271 | |
272 | if (matches && pp[0] == '$' && pp[1] == ')' |
273 | - && *np == 0 |
274 | - && (! force_consec || (int) GET_MODE_WIDER_MODE(m1) == m2)) |
275 | + && *np == 0) |
276 | break; |
277 | } |
278 | |
279 | |
280 | === modified file 'gcc/optabs.c' |
281 | --- gcc/optabs.c 2011-07-04 14:03:49 +0000 |
282 | +++ gcc/optabs.c 2011-07-19 09:04:38 +0000 |
283 | @@ -225,6 +225,37 @@ |
284 | return 1; |
285 | } |
286 | |
287 | |
288 | +/* Find a widening optab even if it doesn't widen as much as we want. |
289 | + E.g. if from_mode is HImode, and to_mode is DImode, and there is no |
290 | + direct HI->SI insn, then return SI->DI, if that exists. |
291 | + If PERMIT_NON_WIDENING is non-zero then this can be used with |
292 | + non-widening optabs also. */ |
293 | + |
294 | +enum insn_code |
295 | +find_widening_optab_handler_and_mode (optab op, enum machine_mode to_mode, |
296 | + enum machine_mode from_mode, |
297 | + int permit_non_widening, |
298 | + enum machine_mode *found_mode) |
299 | +{ |
300 | + for (; (permit_non_widening || from_mode != to_mode) |
301 | + && GET_MODE_SIZE (from_mode) <= GET_MODE_SIZE (to_mode) |
302 | + && from_mode != VOIDmode; |
303 | + from_mode = GET_MODE_WIDER_MODE (from_mode)) |
304 | + { |
305 | + enum insn_code handler = widening_optab_handler (op, to_mode, |
306 | + from_mode); |
307 | + |
308 | + if (handler != CODE_FOR_nothing) |
309 | + { |
310 | + if (found_mode) |
311 | + *found_mode = from_mode; |
312 | + return handler; |
313 | + } |
314 | + } |
315 | + |
316 | + return CODE_FOR_nothing; |
317 | +} |
318 | + |
319 | |
320 | /* Widen OP to MODE and return the rtx for the widened operand. UNSIGNEDP |
321 | says whether OP is signed or unsigned. NO_EXTEND is nonzero if we need |
322 | not actually do a sign-extend or zero-extend, but can leave the |
323 | @@ -517,8 +548,9 @@ |
324 | optab_for_tree_code (ops->code, TREE_TYPE (oprnd0), optab_default); |
325 | if (ops->code == WIDEN_MULT_PLUS_EXPR |
326 | || ops->code == WIDEN_MULT_MINUS_EXPR) |
327 | - icode = (int) optab_handler (widen_pattern_optab, |
328 | - TYPE_MODE (TREE_TYPE (ops->op2))); |
329 | + icode = (int) find_widening_optab_handler (widen_pattern_optab, |
330 | + TYPE_MODE (TREE_TYPE (ops->op2)), |
331 | + tmode0, 0); |
332 | else |
333 | icode = (int) optab_handler (widen_pattern_optab, tmode0); |
334 | gcc_assert (icode != CODE_FOR_nothing); |
335 | @@ -1389,7 +1421,9 @@ |
336 | rtx target, int unsignedp, enum optab_methods methods, |
337 | rtx last) |
338 | { |
339 | - int icode = (int) optab_handler (binoptab, mode); |
340 | + enum machine_mode from_mode = GET_MODE (op0); |
341 | + int icode = (int) find_widening_optab_handler (binoptab, mode, |
342 | + from_mode, 1); |
343 | enum machine_mode mode0 = insn_data[icode].operand[1].mode; |
344 | enum machine_mode mode1 = insn_data[icode].operand[2].mode; |
345 | enum machine_mode tmp_mode; |
346 | @@ -1546,7 +1580,8 @@ |
347 | /* If we can do it with a three-operand insn, do so. */ |
348 | |
349 | if (methods != OPTAB_MUST_WIDEN |
350 | - && optab_handler (binoptab, mode) != CODE_FOR_nothing) |
351 | + && find_widening_optab_handler (binoptab, mode, GET_MODE (op0), 1) |
352 | + != CODE_FOR_nothing) |
353 | { |
354 | temp = expand_binop_directly (mode, binoptab, op0, op1, target, |
355 | unsignedp, methods, last); |
356 | @@ -1585,9 +1620,10 @@ |
357 | takes operands of this mode and makes a wider mode. */ |
358 | |
359 | if (binoptab == smul_optab |
360 | - && GET_MODE_WIDER_MODE (mode) != VOIDmode |
361 | - && (optab_handler ((unsignedp ? umul_widen_optab : smul_widen_optab), |
362 | - GET_MODE_WIDER_MODE (mode)) |
363 | + && GET_MODE_2XWIDER_MODE (mode) != VOIDmode |
364 | + && (widening_optab_handler ((unsignedp ? umul_widen_optab |
365 | + : smul_widen_optab), |
366 | + GET_MODE_2XWIDER_MODE (mode), mode) |
367 | != CODE_FOR_nothing)) |
368 | { |
369 | temp = expand_binop (GET_MODE_WIDER_MODE (mode), |
370 | @@ -1615,12 +1651,15 @@ |
371 | wider_mode != VOIDmode; |
372 | wider_mode = GET_MODE_WIDER_MODE (wider_mode)) |
373 | { |
374 | - if (optab_handler (binoptab, wider_mode) != CODE_FOR_nothing |
375 | + if (optab_handler (binoptab, wider_mode) |
376 | + != CODE_FOR_nothing |
377 | || (binoptab == smul_optab |
378 | && GET_MODE_WIDER_MODE (wider_mode) != VOIDmode |
379 | - && (optab_handler ((unsignedp ? umul_widen_optab |
380 | - : smul_widen_optab), |
381 | - GET_MODE_WIDER_MODE (wider_mode)) |
382 | + && (find_widening_optab_handler ((unsignedp |
383 | + ? umul_widen_optab |
384 | + : smul_widen_optab), |
385 | + GET_MODE_WIDER_MODE (wider_mode), |
386 | + mode, 0) |
387 | != CODE_FOR_nothing))) |
388 | { |
389 | rtx xop0 = op0, xop1 = op1; |
390 | @@ -2043,8 +2082,8 @@ |
391 | && optab_handler (add_optab, word_mode) != CODE_FOR_nothing) |
392 | { |
393 | rtx product = NULL_RTX; |
394 | - |
395 | - if (optab_handler (umul_widen_optab, mode) != CODE_FOR_nothing) |
396 | + if (widening_optab_handler (umul_widen_optab, mode, word_mode) |
397 | + != CODE_FOR_nothing) |
398 | { |
399 | product = expand_doubleword_mult (mode, op0, op1, target, |
400 | true, methods); |
401 | @@ -2053,7 +2092,8 @@ |
402 | } |
403 | |
404 | if (product == NULL_RTX |
405 | - && optab_handler (smul_widen_optab, mode) != CODE_FOR_nothing) |
406 | + && widening_optab_handler (smul_widen_optab, mode, word_mode) |
407 | + != CODE_FOR_nothing) |
408 | { |
409 | product = expand_doubleword_mult (mode, op0, op1, target, |
410 | false, methods); |
411 | @@ -2144,7 +2184,8 @@ |
412 | wider_mode != VOIDmode; |
413 | wider_mode = GET_MODE_WIDER_MODE (wider_mode)) |
414 | { |
415 | - if (optab_handler (binoptab, wider_mode) != CODE_FOR_nothing |
416 | + if (find_widening_optab_handler (binoptab, wider_mode, mode, 1) |
417 | + != CODE_FOR_nothing |
418 | || (methods == OPTAB_LIB |
419 | && optab_libfunc (binoptab, wider_mode))) |
420 | { |
421 | |
422 | === modified file 'gcc/optabs.h' |
423 | --- gcc/optabs.h 2011-05-05 15:43:06 +0000 |
424 | +++ gcc/optabs.h 2011-07-19 09:04:38 +0000 |
425 | @@ -42,6 +42,11 @@ |
426 | int insn_code; |
427 | }; |
428 | |
429 | +struct widening_optab_handlers |
430 | +{ |
431 | + struct optab_handlers handlers[NUM_MACHINE_MODES][NUM_MACHINE_MODES]; |
432 | +}; |
433 | + |
434 | struct optab_d |
435 | { |
436 | enum rtx_code code; |
437 | @@ -50,6 +55,7 @@ |
438 | void (*libcall_gen)(struct optab_d *, const char *name, char suffix, |
439 | enum machine_mode); |
440 | struct optab_handlers handlers[NUM_MACHINE_MODES]; |
441 | + struct widening_optab_handlers *widening; |
442 | }; |
443 | typedef struct optab_d * optab; |
444 | |
445 | @@ -799,6 +805,15 @@ |
446 | extern void emit_unop_insn (int, rtx, rtx, enum rtx_code); |
447 | extern bool maybe_emit_unop_insn (int, rtx, rtx, enum rtx_code); |
448 | |
449 | +/* Find a widening optab even if it doesn't widen as much as we want. */ |
450 | +#define find_widening_optab_handler(A,B,C,D) \ |
451 | + find_widening_optab_handler_and_mode (A, B, C, D, NULL) |
452 | +extern enum insn_code find_widening_optab_handler_and_mode (optab, |
453 | + enum machine_mode, |
454 | + enum machine_mode, |
455 | + int, |
456 | + enum machine_mode *); |
457 | + |
458 | /* An extra flag to control optab_for_tree_code's behavior. This is needed to |
459 | distinguish between machines with a vector shift that takes a scalar for the |
460 | shift amount vs. machines that take a vector for the shift amount. */ |
461 | @@ -874,6 +889,23 @@ |
462 | + (int) CODE_FOR_nothing); |
463 | } |
464 | |
465 | +/* Like optab_handler, but for widening_operations that have a TO_MODE and |
466 | + a FROM_MODE. */ |
467 | + |
468 | +static inline enum insn_code |
469 | +widening_optab_handler (optab op, enum machine_mode to_mode, |
470 | + enum machine_mode from_mode) |
471 | +{ |
472 | + if (to_mode == from_mode) |
473 | + return optab_handler (op, to_mode); |
474 | + |
475 | + if (op->widening) |
476 | + return (enum insn_code) (op->widening->handlers[(int) to_mode][(int) from_mode].insn_code |
477 | + + (int) CODE_FOR_nothing); |
478 | + |
479 | + return CODE_FOR_nothing; |
480 | +} |
481 | + |
482 | /* Record that insn CODE should be used to implement mode MODE of OP. */ |
483 | |
484 | static inline void |
485 | @@ -882,6 +914,26 @@ |
486 | op->handlers[(int) mode].insn_code = (int) code - (int) CODE_FOR_nothing; |
487 | } |
488 | |
489 | +/* Like set_optab_handler, but for widening operations that have a TO_MODE |
490 | + and a FROM_MODE. */ |
491 | + |
492 | +static inline void |
493 | +set_widening_optab_handler (optab op, enum machine_mode to_mode, |
494 | + enum machine_mode from_mode, enum insn_code code) |
495 | +{ |
496 | + if (to_mode == from_mode) |
497 | + set_optab_handler (op, to_mode, code); |
498 | + else |
499 | + { |
500 | + if (op->widening == NULL) |
501 | + op->widening = (struct widening_optab_handlers *) |
502 | + xcalloc (1, sizeof (struct widening_optab_handlers)); |
503 | + |
504 | + op->widening->handlers[(int) to_mode][(int) from_mode].insn_code |
505 | + = (int) code - (int) CODE_FOR_nothing; |
506 | + } |
507 | +} |
508 | + |
509 | /* Return the insn used to perform conversion OP from mode FROM_MODE |
510 | to mode TO_MODE; return CODE_FOR_nothing if the target does not have |
511 | such an insn. */ |
512 | |
513 | === added file 'gcc/testsuite/gcc.target/arm/no-wmla-1.c' |
514 | --- gcc/testsuite/gcc.target/arm/no-wmla-1.c 1970-01-01 00:00:00 +0000 |
515 | +++ gcc/testsuite/gcc.target/arm/no-wmla-1.c 2011-07-19 09:04:38 +0000 |
516 | @@ -0,0 +1,11 @@ |
517 | +/* { dg-do compile } */ |
518 | +/* { dg-options "-O2 -march=armv7-a" } */ |
519 | + |
520 | +int |
521 | +foo (int a, short b, short c) |
522 | +{ |
523 | + int bc = b * c; |
524 | + return a + (short)bc; |
525 | +} |
526 | + |
527 | +/* { dg-final { scan-assembler "mul" } } */ |
528 | |
529 | === added file 'gcc/testsuite/gcc.target/arm/wmul-10.c' |
530 | --- gcc/testsuite/gcc.target/arm/wmul-10.c 1970-01-01 00:00:00 +0000 |
531 | +++ gcc/testsuite/gcc.target/arm/wmul-10.c 2011-07-19 09:04:38 +0000 |
532 | @@ -0,0 +1,10 @@ |
533 | +/* { dg-do compile } */ |
534 | +/* { dg-options "-O2 -march=armv7-a" } */ |
535 | + |
536 | +unsigned long long |
537 | +foo (unsigned short a, unsigned short *b, unsigned short *c) |
538 | +{ |
539 | + return (unsigned)a + (unsigned long long)*b * (unsigned long long)*c; |
540 | +} |
541 | + |
542 | +/* { dg-final { scan-assembler "umlal" } } */ |
543 | |
544 | === added file 'gcc/testsuite/gcc.target/arm/wmul-5.c' |
545 | --- gcc/testsuite/gcc.target/arm/wmul-5.c 1970-01-01 00:00:00 +0000 |
546 | +++ gcc/testsuite/gcc.target/arm/wmul-5.c 2011-07-19 09:04:38 +0000 |
547 | @@ -0,0 +1,10 @@ |
548 | +/* { dg-do compile } */ |
549 | +/* { dg-options "-O2 -march=armv7-a" } */ |
550 | + |
551 | +long long |
552 | +foo (long long a, char *b, char *c) |
553 | +{ |
554 | + return a + *b * *c; |
555 | +} |
556 | + |
557 | +/* { dg-final { scan-assembler "umlal" } } */ |
558 | |
559 | === added file 'gcc/testsuite/gcc.target/arm/wmul-6.c' |
560 | --- gcc/testsuite/gcc.target/arm/wmul-6.c 1970-01-01 00:00:00 +0000 |
561 | +++ gcc/testsuite/gcc.target/arm/wmul-6.c 2011-07-19 09:04:38 +0000 |
562 | @@ -0,0 +1,10 @@ |
563 | +/* { dg-do compile } */ |
564 | +/* { dg-options "-O2 -march=armv7-a" } */ |
565 | + |
566 | +long long |
567 | +foo (long long a, unsigned char *b, signed char *c) |
568 | +{ |
569 | + return a + (long long)*b * (long long)*c; |
570 | +} |
571 | + |
572 | +/* { dg-final { scan-assembler "smlal" } } */ |
573 | |
574 | === added file 'gcc/testsuite/gcc.target/arm/wmul-7.c' |
575 | --- gcc/testsuite/gcc.target/arm/wmul-7.c 1970-01-01 00:00:00 +0000 |
576 | +++ gcc/testsuite/gcc.target/arm/wmul-7.c 2011-07-19 09:04:38 +0000 |
577 | @@ -0,0 +1,10 @@ |
578 | +/* { dg-do compile } */ |
579 | +/* { dg-options "-O2 -march=armv7-a" } */ |
580 | + |
581 | +unsigned long long |
582 | +foo (unsigned long long a, unsigned char *b, unsigned short *c) |
583 | +{ |
584 | + return a + *b * *c; |
585 | +} |
586 | + |
587 | +/* { dg-final { scan-assembler "umlal" } } */ |
588 | |
589 | === added file 'gcc/testsuite/gcc.target/arm/wmul-8.c' |
590 | --- gcc/testsuite/gcc.target/arm/wmul-8.c 1970-01-01 00:00:00 +0000 |
591 | +++ gcc/testsuite/gcc.target/arm/wmul-8.c 2011-07-19 09:04:38 +0000 |
592 | @@ -0,0 +1,10 @@ |
593 | +/* { dg-do compile } */ |
594 | +/* { dg-options "-O2 -march=armv7-a" } */ |
595 | + |
596 | +long long |
597 | +foo (long long a, int *b, int *c) |
598 | +{ |
599 | + return a + *b * *c; |
600 | +} |
601 | + |
602 | +/* { dg-final { scan-assembler "smlal" } } */ |
603 | |
604 | === added file 'gcc/testsuite/gcc.target/arm/wmul-9.c' |
605 | --- gcc/testsuite/gcc.target/arm/wmul-9.c 1970-01-01 00:00:00 +0000 |
606 | +++ gcc/testsuite/gcc.target/arm/wmul-9.c 2011-07-19 09:04:38 +0000 |
607 | @@ -0,0 +1,10 @@ |
608 | +/* { dg-do compile } */ |
609 | +/* { dg-options "-O2 -march=armv7-a" } */ |
610 | + |
611 | +long long |
612 | +foo (long long a, short *b, char *c) |
613 | +{ |
614 | + return a + *b * *c; |
615 | +} |
616 | + |
617 | +/* { dg-final { scan-assembler "smlalbb" } } */ |
618 | |
619 | === added file 'gcc/testsuite/gcc.target/arm/wmul-bitfield-1.c' |
620 | --- gcc/testsuite/gcc.target/arm/wmul-bitfield-1.c 1970-01-01 00:00:00 +0000 |
621 | +++ gcc/testsuite/gcc.target/arm/wmul-bitfield-1.c 2011-07-19 09:04:38 +0000 |
622 | @@ -0,0 +1,17 @@ |
623 | +/* { dg-do compile } */ |
624 | +/* { dg-options "-O2 -march=armv7-a" } */ |
625 | + |
626 | +struct bf |
627 | +{ |
628 | + int a : 3; |
629 | + int b : 15; |
630 | + int c : 3; |
631 | +}; |
632 | + |
633 | +long long |
634 | +foo (long long a, struct bf b, struct bf c) |
635 | +{ |
636 | + return a + b.b * c.b; |
637 | +} |
638 | + |
639 | +/* { dg-final { scan-assembler "smlalbb" } } */ |
640 | |
641 | === added file 'gcc/testsuite/gcc.target/arm/wmul-bitfield-2.c' |
642 | --- gcc/testsuite/gcc.target/arm/wmul-bitfield-2.c 1970-01-01 00:00:00 +0000 |
643 | +++ gcc/testsuite/gcc.target/arm/wmul-bitfield-2.c 2011-07-19 09:04:38 +0000 |
644 | @@ -0,0 +1,17 @@ |
645 | +/* { dg-do compile } */ |
646 | +/* { dg-options "-O2 -march=armv7-a" } */ |
647 | + |
648 | +struct bf |
649 | +{ |
650 | + int a : 3; |
651 | + unsigned int b : 15; |
652 | + int c : 3; |
653 | +}; |
654 | + |
655 | +long long |
656 | +foo (long long a, struct bf b, struct bf c) |
657 | +{ |
658 | + return a + b.b * c.c; |
659 | +} |
660 | + |
661 | +/* { dg-final { scan-assembler "smlalbb" } } */ |
662 | |
663 | === modified file 'gcc/tree-cfg.c' |
664 | --- gcc/tree-cfg.c 2011-07-01 09:19:21 +0000 |
665 | +++ gcc/tree-cfg.c 2011-07-19 09:04:38 +0000 |
666 | @@ -3574,7 +3574,7 @@ |
667 | case WIDEN_MULT_EXPR: |
668 | if (TREE_CODE (lhs_type) != INTEGER_TYPE) |
669 | return true; |
670 | - return ((2 * TYPE_PRECISION (rhs1_type) != TYPE_PRECISION (lhs_type)) |
671 | + return ((2 * TYPE_PRECISION (rhs1_type) > TYPE_PRECISION (lhs_type)) |
672 | || (TYPE_PRECISION (rhs1_type) != TYPE_PRECISION (rhs2_type))); |
673 | |
674 | case WIDEN_SUM_EXPR: |
675 | @@ -3667,7 +3667,7 @@ |
676 | && !FIXED_POINT_TYPE_P (rhs1_type)) |
677 | || !useless_type_conversion_p (rhs1_type, rhs2_type) |
678 | || !useless_type_conversion_p (lhs_type, rhs3_type) |
679 | - || 2 * TYPE_PRECISION (rhs1_type) != TYPE_PRECISION (lhs_type) |
680 | + || 2 * TYPE_PRECISION (rhs1_type) > TYPE_PRECISION (lhs_type) |
681 | || TYPE_PRECISION (rhs1_type) != TYPE_PRECISION (rhs2_type)) |
682 | { |
683 | error ("type mismatch in widening multiply-accumulate expression"); |
684 | |
685 | === modified file 'gcc/tree-ssa-math-opts.c' |
686 | --- gcc/tree-ssa-math-opts.c 2011-03-11 16:36:16 +0000 |
687 | +++ gcc/tree-ssa-math-opts.c 2011-07-19 09:04:38 +0000 |
688 | @@ -1266,42 +1266,68 @@ |
689 | } |
690 | }; |
691 | |
692 | -/* Return true if RHS is a suitable operand for a widening multiplication. |
693 | +/* Build a gimple assignment to cast VAL to TARGET. Insert the statement |
694 | + prior to GSI's current position, and return the fresh SSA name. */ |
695 | + |
696 | +static tree |
697 | +build_and_insert_cast (gimple_stmt_iterator *gsi, location_t loc, |
698 | + tree target, tree val) |
699 | +{ |
700 | + tree result = make_ssa_name (target, NULL); |
701 | + gimple stmt = gimple_build_assign_with_ops (CONVERT_EXPR, result, val, NULL); |
702 | + gimple_set_location (stmt, loc); |
703 | + gsi_insert_before (gsi, stmt, GSI_SAME_STMT); |
704 | + return result; |
705 | +} |
706 | + |
707 | +/* Return true if RHS is a suitable operand for a widening multiplication, |
708 | + assuming a target type of TYPE. |
709 | There are two cases: |
710 | |
711 | - - RHS makes some value twice as wide. Store that value in *NEW_RHS_OUT |
712 | - if so, and store its type in *TYPE_OUT. |
713 | + - RHS makes some value at least twice as wide. Store that value |
714 | + in *NEW_RHS_OUT if so, and store its type in *TYPE_OUT. |
715 | |
716 | - RHS is an integer constant. Store that value in *NEW_RHS_OUT if so, |
717 | but leave *TYPE_OUT untouched. */ |
718 | |
719 | static bool |
720 | -is_widening_mult_rhs_p (tree rhs, tree *type_out, tree *new_rhs_out) |
721 | +is_widening_mult_rhs_p (tree type, tree rhs, tree *type_out, |
722 | + tree *new_rhs_out) |
723 | { |
724 | gimple stmt; |
725 | - tree type, type1, rhs1; |
726 | + tree type1, rhs1; |
727 | enum tree_code rhs_code; |
728 | |
729 | if (TREE_CODE (rhs) == SSA_NAME) |
730 | { |
731 | - type = TREE_TYPE (rhs); |
732 | stmt = SSA_NAME_DEF_STMT (rhs); |
733 | if (!is_gimple_assign (stmt)) |
734 | - return false; |
735 | - |
736 | - rhs_code = gimple_assign_rhs_code (stmt); |
737 | - if (TREE_CODE (type) == INTEGER_TYPE |
738 | - ? !CONVERT_EXPR_CODE_P (rhs_code) |
739 | - : rhs_code != FIXED_CONVERT_EXPR) |
740 | - return false; |
741 | - |
742 | - rhs1 = gimple_assign_rhs1 (stmt); |
743 | - type1 = TREE_TYPE (rhs1); |
744 | + { |
745 | + rhs1 = NULL; |
746 | + type1 = TREE_TYPE (rhs); |
747 | + } |
748 | + else |
749 | + { |
750 | + rhs1 = gimple_assign_rhs1 (stmt); |
751 | + type1 = TREE_TYPE (rhs1); |
752 | + } |
753 | + |
754 | if (TREE_CODE (type1) != TREE_CODE (type) |
755 | - || TYPE_PRECISION (type1) * 2 != TYPE_PRECISION (type)) |
756 | + || TYPE_PRECISION (type1) * 2 > TYPE_PRECISION (type)) |
757 | return false; |
758 | |
759 | - *new_rhs_out = rhs1; |
760 | + if (rhs1) |
761 | + { |
762 | + rhs_code = gimple_assign_rhs_code (stmt); |
763 | + if (TREE_CODE (type) == INTEGER_TYPE |
764 | + ? !CONVERT_EXPR_CODE_P (rhs_code) |
765 | + : rhs_code != FIXED_CONVERT_EXPR) |
766 | + *new_rhs_out = rhs; |
767 | + else |
768 | + *new_rhs_out = rhs1; |
769 | + } |
770 | + else |
771 | + *new_rhs_out = rhs; |
772 | *type_out = type1; |
773 | return true; |
774 | } |
775 | @@ -1316,28 +1342,27 @@ |
776 | return false; |
777 | } |
778 | |
779 | -/* Return true if STMT performs a widening multiplication. If so, |
780 | - store the unwidened types of the operands in *TYPE1_OUT and *TYPE2_OUT |
781 | - respectively. Also fill *RHS1_OUT and *RHS2_OUT such that converting |
782 | - those operands to types *TYPE1_OUT and *TYPE2_OUT would give the |
783 | - operands of the multiplication. */ |
784 | +/* Return true if STMT performs a widening multiplication, assuming the |
785 | + output type is TYPE. If so, store the unwidened types of the operands |
786 | + in *TYPE1_OUT and *TYPE2_OUT respectively. Also fill *RHS1_OUT and |
787 | + *RHS2_OUT such that converting those operands to types *TYPE1_OUT |
788 | + and *TYPE2_OUT would give the operands of the multiplication. */ |
789 | |
790 | static bool |
791 | -is_widening_mult_p (gimple stmt, |
792 | +is_widening_mult_p (tree type, gimple stmt, |
793 | tree *type1_out, tree *rhs1_out, |
794 | tree *type2_out, tree *rhs2_out) |
795 | { |
796 | - tree type; |
797 | - |
798 | - type = TREE_TYPE (gimple_assign_lhs (stmt)); |
799 | if (TREE_CODE (type) != INTEGER_TYPE |
800 | && TREE_CODE (type) != FIXED_POINT_TYPE) |
801 | return false; |
802 | |
803 | - if (!is_widening_mult_rhs_p (gimple_assign_rhs1 (stmt), type1_out, rhs1_out)) |
804 | + if (!is_widening_mult_rhs_p (type, gimple_assign_rhs1 (stmt), type1_out, |
805 | + rhs1_out)) |
806 | return false; |
807 | |
808 | - if (!is_widening_mult_rhs_p (gimple_assign_rhs2 (stmt), type2_out, rhs2_out)) |
809 | + if (!is_widening_mult_rhs_p (type, gimple_assign_rhs2 (stmt), type2_out, |
810 | + rhs2_out)) |
811 | return false; |
812 | |
813 | if (*type1_out == NULL) |
814 | @@ -1354,6 +1379,18 @@ |
815 | *type2_out = *type1_out; |
816 | } |
817 | |
818 | + /* Ensure that the larger of the two operands comes first. */ |
819 | + if (TYPE_PRECISION (*type1_out) < TYPE_PRECISION (*type2_out)) |
820 | + { |
821 | + tree tmp; |
822 | + tmp = *type1_out; |
823 | + *type1_out = *type2_out; |
824 | + *type2_out = tmp; |
825 | + tmp = *rhs1_out; |
826 | + *rhs1_out = *rhs2_out; |
827 | + *rhs2_out = tmp; |
828 | + } |
829 | + |
830 | return true; |
831 | } |
832 | |
833 | @@ -1362,31 +1399,94 @@ |
834 | value is true iff we converted the statement. */ |
835 | |
836 | static bool |
837 | -convert_mult_to_widen (gimple stmt) |
838 | +convert_mult_to_widen (gimple stmt, gimple_stmt_iterator *gsi) |
839 | { |
840 | - tree lhs, rhs1, rhs2, type, type1, type2; |
841 | + tree lhs, rhs1, rhs2, type, type1, type2, tmp = NULL; |
842 | enum insn_code handler; |
843 | + enum machine_mode to_mode, from_mode, actual_mode; |
844 | + optab op; |
845 | + int actual_precision; |
846 | + location_t loc = gimple_location (stmt); |
847 | + bool from_unsigned1, from_unsigned2; |
848 | |
849 | lhs = gimple_assign_lhs (stmt); |
850 | type = TREE_TYPE (lhs); |
851 | if (TREE_CODE (type) != INTEGER_TYPE) |
852 | return false; |
853 | |
854 | - if (!is_widening_mult_p (stmt, &type1, &rhs1, &type2, &rhs2)) |
855 | + if (!is_widening_mult_p (type, stmt, &type1, &rhs1, &type2, &rhs2)) |
856 | return false; |
857 | |
858 | - if (TYPE_UNSIGNED (type1) && TYPE_UNSIGNED (type2)) |
859 | - handler = optab_handler (umul_widen_optab, TYPE_MODE (type)); |
860 | - else if (!TYPE_UNSIGNED (type1) && !TYPE_UNSIGNED (type2)) |
861 | - handler = optab_handler (smul_widen_optab, TYPE_MODE (type)); |
862 | + to_mode = TYPE_MODE (type); |
863 | + from_mode = TYPE_MODE (type1); |
864 | + from_unsigned1 = TYPE_UNSIGNED (type1); |
865 | + from_unsigned2 = TYPE_UNSIGNED (type2); |
866 | + |
867 | + if (from_unsigned1 && from_unsigned2) |
868 | + op = umul_widen_optab; |
869 | + else if (!from_unsigned1 && !from_unsigned2) |
870 | + op = smul_widen_optab; |
871 | else |
872 | - handler = optab_handler (usmul_widen_optab, TYPE_MODE (type)); |
873 | + op = usmul_widen_optab; |
874 | + |
875 | + handler = find_widening_optab_handler_and_mode (op, to_mode, from_mode, |
876 | + 0, &actual_mode); |
877 | |
878 | if (handler == CODE_FOR_nothing) |
879 | - return false; |
880 | - |
881 | - gimple_assign_set_rhs1 (stmt, fold_convert (type1, rhs1)); |
882 | - gimple_assign_set_rhs2 (stmt, fold_convert (type2, rhs2)); |
883 | + { |
884 | + if (op != smul_widen_optab) |
885 | + { |
886 | + /* We can use a signed multiply with unsigned types as long as |
887 | + there is a wider mode to use, or it is the smaller of the two |
888 | + types that is unsigned. Note that type1 >= type2, always. */ |
889 | + if ((TYPE_UNSIGNED (type1) |
890 | + && TYPE_PRECISION (type1) == GET_MODE_PRECISION (from_mode)) |
891 | + || (TYPE_UNSIGNED (type2) |
892 | + && TYPE_PRECISION (type2) == GET_MODE_PRECISION (from_mode))) |
893 | + { |
894 | + from_mode = GET_MODE_WIDER_MODE (from_mode); |
895 | + if (GET_MODE_SIZE (to_mode) <= GET_MODE_SIZE (from_mode)) |
896 | + return false; |
897 | + } |
898 | + |
899 | + op = smul_widen_optab; |
900 | + handler = find_widening_optab_handler_and_mode (op, to_mode, |
901 | + from_mode, 0, |
902 | + &actual_mode); |
903 | + |
904 | + if (handler == CODE_FOR_nothing) |
905 | + return false; |
906 | + |
907 | + from_unsigned1 = from_unsigned2 = false; |
908 | + } |
909 | + else |
910 | + return false; |
911 | + } |
912 | + |
913 | + /* Ensure that the inputs to the handler are in the correct precision |
914 | + for the opcode. This will be the full mode size. */ |
915 | + actual_precision = GET_MODE_PRECISION (actual_mode); |
916 | + if (actual_precision != TYPE_PRECISION (type1) |
917 | + || from_unsigned1 != TYPE_UNSIGNED (type1)) |
918 | + { |
919 | + tmp = create_tmp_var (build_nonstandard_integer_type |
920 | + (actual_precision, from_unsigned1), |
921 | + NULL); |
922 | + rhs1 = build_and_insert_cast (gsi, loc, tmp, rhs1); |
923 | + } |
924 | + if (actual_precision != TYPE_PRECISION (type2) |
925 | + || from_unsigned2 != TYPE_UNSIGNED (type2)) |
926 | + { |
927 | + /* Reuse the same type info, if possible. */ |
928 | + if (!tmp || from_unsigned1 != from_unsigned2) |
929 | + tmp = create_tmp_var (build_nonstandard_integer_type |
930 | + (actual_precision, from_unsigned2), |
931 | + NULL); |
932 | + rhs2 = build_and_insert_cast (gsi, loc, tmp, rhs2); |
933 | + } |
934 | + |
935 | + gimple_assign_set_rhs1 (stmt, rhs1); |
936 | + gimple_assign_set_rhs2 (stmt, rhs2); |
937 | gimple_assign_set_rhs_code (stmt, WIDEN_MULT_EXPR); |
938 | update_stmt (stmt); |
939 | return true; |
940 | @@ -1403,11 +1503,17 @@ |
941 | enum tree_code code) |
942 | { |
943 | gimple rhs1_stmt = NULL, rhs2_stmt = NULL; |
944 | - tree type, type1, type2; |
945 | + gimple conv1_stmt = NULL, conv2_stmt = NULL, conv_stmt; |
946 | + tree type, type1, type2, optype, tmp = NULL; |
947 | tree lhs, rhs1, rhs2, mult_rhs1, mult_rhs2, add_rhs; |
948 | enum tree_code rhs1_code = ERROR_MARK, rhs2_code = ERROR_MARK; |
949 | optab this_optab; |
950 | enum tree_code wmult_code; |
951 | + enum insn_code handler; |
952 | + enum machine_mode to_mode, from_mode, actual_mode; |
953 | + location_t loc = gimple_location (stmt); |
954 | + int actual_precision; |
955 | + bool from_unsigned1, from_unsigned2; |
956 | |
957 | lhs = gimple_assign_lhs (stmt); |
958 | type = TREE_TYPE (lhs); |
959 | @@ -1441,54 +1547,153 @@ |
960 | else |
961 | return false; |
962 | |
963 | - if (code == PLUS_EXPR && rhs1_code == MULT_EXPR) |
964 | - { |
965 | - if (!is_widening_mult_p (rhs1_stmt, &type1, &mult_rhs1, |
966 | - &type2, &mult_rhs2)) |
967 | - return false; |
968 | - add_rhs = rhs2; |
969 | - } |
970 | - else if (rhs2_code == MULT_EXPR) |
971 | - { |
972 | - if (!is_widening_mult_p (rhs2_stmt, &type1, &mult_rhs1, |
973 | - &type2, &mult_rhs2)) |
974 | - return false; |
975 | - add_rhs = rhs1; |
976 | - } |
977 | - else if (code == PLUS_EXPR && rhs1_code == WIDEN_MULT_EXPR) |
978 | - { |
979 | - mult_rhs1 = gimple_assign_rhs1 (rhs1_stmt); |
980 | - mult_rhs2 = gimple_assign_rhs2 (rhs1_stmt); |
981 | - type1 = TREE_TYPE (mult_rhs1); |
982 | - type2 = TREE_TYPE (mult_rhs2); |
983 | - add_rhs = rhs2; |
984 | - } |
985 | - else if (rhs2_code == WIDEN_MULT_EXPR) |
986 | - { |
987 | - mult_rhs1 = gimple_assign_rhs1 (rhs2_stmt); |
988 | - mult_rhs2 = gimple_assign_rhs2 (rhs2_stmt); |
989 | - type1 = TREE_TYPE (mult_rhs1); |
990 | - type2 = TREE_TYPE (mult_rhs2); |
991 | - add_rhs = rhs1; |
992 | + /* Allow for one conversion statement between the multiply |
993 | + and addition/subtraction statement. If there is more than |
994 | + one conversion then we assume they would invalidate this |
995 | + transformation. If that's not the case then they should have |
996 | + been folded before now. */ |
997 | + if (CONVERT_EXPR_CODE_P (rhs1_code)) |
998 | + { |
999 | + conv1_stmt = rhs1_stmt; |
1000 | + rhs1 = gimple_assign_rhs1 (rhs1_stmt); |
1001 | + if (TREE_CODE (rhs1) == SSA_NAME) |
1002 | + { |
1003 | + rhs1_stmt = SSA_NAME_DEF_STMT (rhs1); |
1004 | + if (is_gimple_assign (rhs1_stmt)) |
1005 | + rhs1_code = gimple_assign_rhs_code (rhs1_stmt); |
1006 | + } |
1007 | + else |
1008 | + return false; |
1009 | + } |
1010 | + if (CONVERT_EXPR_CODE_P (rhs2_code)) |
1011 | + { |
1012 | + conv2_stmt = rhs2_stmt; |
1013 | + rhs2 = gimple_assign_rhs1 (rhs2_stmt); |
1014 | + if (TREE_CODE (rhs2) == SSA_NAME) |
1015 | + { |
1016 | + rhs2_stmt = SSA_NAME_DEF_STMT (rhs2); |
1017 | + if (is_gimple_assign (rhs2_stmt)) |
1018 | + rhs2_code = gimple_assign_rhs_code (rhs2_stmt); |
1019 | + } |
1020 | + else |
1021 | + return false; |
1022 | + } |
1023 | + |
1024 | + /* If code is WIDEN_MULT_EXPR then it would seem unnecessary to call |
1025 | + is_widening_mult_p, but we still need the rhs returns. |
1026 | + |
1027 | + It might also appear that it would be sufficient to use the existing |
1028 | + operands of the widening multiply, but that would limit the choice of |
1029 | + multiply-and-accumulate instructions. */ |
1030 | + if (code == PLUS_EXPR |
1031 | + && (rhs1_code == MULT_EXPR || rhs1_code == WIDEN_MULT_EXPR)) |
1032 | + { |
1033 | + if (!is_widening_mult_p (type, rhs1_stmt, &type1, &mult_rhs1, |
1034 | + &type2, &mult_rhs2)) |
1035 | + return false; |
1036 | + add_rhs = rhs2; |
1037 | + conv_stmt = conv1_stmt; |
1038 | + } |
1039 | + else if (rhs2_code == MULT_EXPR || rhs2_code == WIDEN_MULT_EXPR) |
1040 | + { |
1041 | + if (!is_widening_mult_p (type, rhs2_stmt, &type1, &mult_rhs1, |
1042 | + &type2, &mult_rhs2)) |
1043 | + return false; |
1044 | + add_rhs = rhs1; |
1045 | + conv_stmt = conv2_stmt; |
1046 | } |
1047 | else |
1048 | return false; |
1049 | |
1050 | - if (TYPE_UNSIGNED (type1) != TYPE_UNSIGNED (type2)) |
1051 | - return false; |
1052 | + to_mode = TYPE_MODE (type); |
1053 | + from_mode = TYPE_MODE (type1); |
1054 | + from_unsigned1 = TYPE_UNSIGNED (type1); |
1055 | + from_unsigned2 = TYPE_UNSIGNED (type2); |
1056 | + |
1057 | + /* There's no such thing as a mixed sign madd yet, so use a wider mode. */ |
1058 | + if (from_unsigned1 != from_unsigned2) |
1059 | + { |
1060 | + /* We can use a signed multiply with unsigned types as long as |
1061 | + there is a wider mode to use, or it is the smaller of the two |
1062 | + types that is unsigned. Note that type1 >= type2, always. */ |
1063 | + if ((from_unsigned1 |
1064 | + && TYPE_PRECISION (type1) == GET_MODE_PRECISION (from_mode)) |
1065 | + || (from_unsigned2 |
1066 | + && TYPE_PRECISION (type2) == GET_MODE_PRECISION (from_mode))) |
1067 | + { |
1068 | + from_mode = GET_MODE_WIDER_MODE (from_mode); |
1069 | + if (GET_MODE_SIZE (from_mode) >= GET_MODE_SIZE (to_mode)) |
1070 | + return false; |
1071 | + } |
1072 | + |
1073 | + from_unsigned1 = from_unsigned2 = false; |
1074 | + } |
1075 | + |
1076 | + /* If there was a conversion between the multiply and addition |
1077 | + then we need to make sure it fits a multiply-and-accumulate. |
1078 | + There should be a single mode change which does not change the |
1079 | + value. */ |
1080 | + if (conv_stmt) |
1081 | + { |
1082 | + /* We use the original, unmodified data types for this. */ |
1083 | + tree from_type = TREE_TYPE (gimple_assign_rhs1 (conv_stmt)); |
1084 | + tree to_type = TREE_TYPE (gimple_assign_lhs (conv_stmt)); |
1085 | + int data_size = TYPE_PRECISION (type1) + TYPE_PRECISION (type2); |
1086 | + bool is_unsigned = TYPE_UNSIGNED (type1) && TYPE_UNSIGNED (type2); |
1087 | + |
1088 | + if (TYPE_PRECISION (from_type) > TYPE_PRECISION (to_type)) |
1089 | + { |
1090 | + /* Conversion is a truncate. */ |
1091 | + if (TYPE_PRECISION (to_type) < data_size) |
1092 | + return false; |
1093 | + } |
1094 | + else if (TYPE_PRECISION (from_type) < TYPE_PRECISION (to_type)) |
1095 | + { |
1096 | + /* Conversion is an extend. Check it's the right sort. */ |
1097 | + if (TYPE_UNSIGNED (from_type) != is_unsigned |
1098 | + && !(is_unsigned && TYPE_PRECISION (from_type) > data_size)) |
1099 | + return false; |
1100 | + } |
1101 | + /* else convert is a no-op for our purposes. */ |
1102 | + } |
1103 | |
1104 | /* Verify that the machine can perform a widening multiply |
1105 | accumulate in this mode/signedness combination, otherwise |
1106 | this transformation is likely to pessimize code. */ |
1107 | - this_optab = optab_for_tree_code (wmult_code, type1, optab_default); |
1108 | - if (optab_handler (this_optab, TYPE_MODE (type)) == CODE_FOR_nothing) |
1109 | + optype = build_nonstandard_integer_type (from_mode, from_unsigned1); |
1110 | + this_optab = optab_for_tree_code (wmult_code, optype, optab_default); |
1111 | + handler = find_widening_optab_handler_and_mode (this_optab, to_mode, |
1112 | + from_mode, 0, &actual_mode); |
1113 | + |
1114 | + if (handler == CODE_FOR_nothing) |
1115 | return false; |
1116 | |
1117 | - /* ??? May need some type verification here? */ |
1118 | - |
1119 | - gimple_assign_set_rhs_with_ops_1 (gsi, wmult_code, |
1120 | - fold_convert (type1, mult_rhs1), |
1121 | - fold_convert (type2, mult_rhs2), |
1122 | + /* Ensure that the inputs to the handler are in the correct precision |
1123 | + for the opcode. This will be the full mode size. */ |
1124 | + actual_precision = GET_MODE_PRECISION (actual_mode); |
1125 | + if (actual_precision != TYPE_PRECISION (type1) |
1126 | + || from_unsigned1 != TYPE_UNSIGNED (type1)) |
1127 | + { |
1128 | + tmp = create_tmp_var (build_nonstandard_integer_type |
1129 | + (actual_precision, from_unsigned1), |
1130 | + NULL); |
1131 | + mult_rhs1 = build_and_insert_cast (gsi, loc, tmp, mult_rhs1); |
1132 | + } |
1133 | + if (actual_precision != TYPE_PRECISION (type2) |
1134 | + || from_unsigned2 != TYPE_UNSIGNED (type2)) |
1135 | + { |
1136 | + if (!tmp || from_unsigned1 != from_unsigned2) |
1137 | + tmp = create_tmp_var (build_nonstandard_integer_type |
1138 | + (actual_precision, from_unsigned2), |
1139 | + NULL); |
1140 | + mult_rhs2 = build_and_insert_cast (gsi, loc, tmp, mult_rhs2); |
1141 | + } |
1142 | + |
1143 | + if (TYPE_PRECISION (type) != TYPE_PRECISION (TREE_TYPE (add_rhs))) |
1144 | + add_rhs = build_and_insert_cast (gsi, loc, create_tmp_var (type, NULL), |
1145 | + add_rhs); |
1146 | + |
1147 | + gimple_assign_set_rhs_with_ops_1 (gsi, wmult_code, mult_rhs1, mult_rhs2, |
1148 | add_rhs); |
1149 | update_stmt (gsi_stmt (*gsi)); |
1150 | return true; |
1151 | @@ -1696,7 +1901,7 @@ |
1152 | switch (code) |
1153 | { |
1154 | case MULT_EXPR: |
1155 | - if (!convert_mult_to_widen (stmt) |
1156 | + if (!convert_mult_to_widen (stmt, &gsi) |
1157 | && convert_mult_to_fma (stmt, |
1158 | gimple_assign_rhs1 (stmt), |
1159 | gimple_assign_rhs2 (stmt))) |
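Reviewer's note: the following is a minimal standalone sketch, not code from the patch, of the arithmetic behind two of the changes above. It assumes plain bit counts in place of GCC's `tree` types and machine modes, and every function name in it is hypothetical. `widen_mult_precisions_ok` mirrors the relaxed `tree-cfg.c` check (result at least twice as wide as the operands, rather than exactly twice). `signed_fallback_from_bits` mirrors, in simplified single-operand form, the fallback that moves to a wider operand mode when an unsigned operand fills its mode completely. `mla_bitfield` models the `wmul-bitfield-2.c` case, where a 15-bit unsigned value still fits in a signed 16-bit operand.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the relaxed verifier check: operands must
   have equal precision and the result must be at least twice as wide.
   All precisions are in bits.  */
static bool
widen_mult_precisions_ok (unsigned rhs1_prec, unsigned rhs2_prec,
                          unsigned lhs_prec)
{
  return rhs1_prec == rhs2_prec && 2 * rhs1_prec <= lhs_prec;
}

/* Simplified model of the signed-multiply fallback for one operand.
   If the operand is unsigned and occupies every bit of its mode, a
   signed multiply needs the next wider mode (assumed here to be twice
   the width).  Returns the operand mode width to use, or 0 if no
   usable mode is narrower than the result.  */
static unsigned
signed_fallback_from_bits (unsigned prec, bool is_unsigned,
                           unsigned from_bits, unsigned to_bits)
{
  if (is_unsigned && prec == from_bits)
    {
      from_bits *= 2;
      if (to_bits <= from_bits)
        return 0;
    }
  return from_bits;
}

/* Signed 16x16->32 multiply, the operation a halfword multiply
   instruction would perform.  */
static int32_t
signed_halfword_multiply (int16_t a, int16_t b)
{
  return (int32_t) a * (int32_t) b;
}

/* Accumulate the product of a 15-bit unsigned value and a 3-bit signed
   value into a 64-bit sum using only the signed halfword multiply.
   This is exact because a 15-bit unsigned value always fits in
   int16_t with a clear sign bit.  */
static int64_t
mla_bitfield (int64_t acc, unsigned b15, int c3)
{
  assert (b15 < (1u << 15));
  assert (c3 >= -4 && c3 <= 3);
  return acc + signed_halfword_multiply ((int16_t) b15, (int16_t) c3);
}
```

In this model a 16-bit unsigned operand would need a 32-bit signed operand mode instead, which is why the fallback fails whenever the wider mode is no longer narrower than the result.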
cbuild has taken a snapshot of this branch at r106781 and queued it for build.
The snapshot is available at:
http://ex.seabright.co.nz/snapshots/gcc-linaro-4.6+bzr106781~ams-codesourcery~widening-multiplies-4.6.tar.xdelta3.xz
and will be built on the following builders:
a9-builder armv5-builder i686 x86_64
You can track the build queue at:
http://ex.seabright.co.nz/helpers/scheduler
cbuild-snapshot: gcc-linaro-4.6+bzr106781~ams-codesourcery~widening-multiplies-4.6
cbuild-ancestor: lp:gcc-linaro/4.6+bzr106774
cbuild-state: check