Merge lp:~ams-codesourcery/gcc-linaro/widening-multiplies-4.6 into lp:gcc-linaro/4.6
Status: | Superseded |
---|---|
Proposed branch: | lp:~ams-codesourcery/gcc-linaro/widening-multiplies-4.6 |
Merge into: | lp:gcc-linaro/4.6 |
Diff against target: |
1276 lines (+737/-143) (has conflicts) 20 files modified
ChangeLog.linaro (+129/-0) gcc/config/arm/arm.md (+1/-1) gcc/expr.c (+14/-15) gcc/genopinit.c (+24/-20) gcc/optabs.c (+69/-15) gcc/optabs.h (+52/-0) gcc/testsuite/gcc.target/arm/no-wmla-1.c (+11/-0) gcc/testsuite/gcc.target/arm/wmul-10.c (+10/-0) gcc/testsuite/gcc.target/arm/wmul-11.c (+10/-0) gcc/testsuite/gcc.target/arm/wmul-12.c (+11/-0) gcc/testsuite/gcc.target/arm/wmul-13.c (+10/-0) gcc/testsuite/gcc.target/arm/wmul-5.c (+10/-0) gcc/testsuite/gcc.target/arm/wmul-6.c (+10/-0) gcc/testsuite/gcc.target/arm/wmul-7.c (+10/-0) gcc/testsuite/gcc.target/arm/wmul-8.c (+10/-0) gcc/testsuite/gcc.target/arm/wmul-9.c (+10/-0) gcc/testsuite/gcc.target/arm/wmul-bitfield-1.c (+17/-0) gcc/testsuite/gcc.target/arm/wmul-bitfield-2.c (+17/-0) gcc/tree-cfg.c (+2/-2) gcc/tree-ssa-math-opts.c (+310/-90) Text conflict in ChangeLog.linaro |
To merge this branch: | bzr merge lp:~ams-codesourcery/gcc-linaro/widening-multiplies-4.6 |
Related bugs: |
Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
Michael Hope | Needs Fixing | ||
Review via email: mp+68866@code.launchpad.net |
This proposal supersedes a proposal from 2011-07-19.
This proposal has been superseded by a proposal from 2011-07-27.
Commit message
Description of the change
Widening multiplies optimizations.
The first commit is not approved yet, but the rest have been reviewed upstream and are ready to commit.
http://<email address hidden>
UPDATE: Now with an extra bug-fix.
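For readers unfamiliar with the term, here is a minimal sketch in C of the source shapes that a widening multiply and a widening multiply-and-accumulate correspond to. The function names (widening_mul, widening_mla, narrow_mul, self_check) are made up for illustration and do not appear in the patch. The new ARM tests in this branch scan for smull and smlal on similar inputs, but whether a given compiler emits those instructions for this exact code depends on the target and options, so treat the comments as an assumption rather than a guarantee.

```c
#include <assert.h>
#include <stdint.h>

/* Widening signed multiply: both 32-bit inputs are extended to 64 bits
   before the multiply, so the full product is kept.  This is the shape
   a single 32x32->64 instruction (e.g. ARM SMULL) can implement.  */
int64_t
widening_mul (int32_t a, int32_t b)
{
  return (int64_t) a * (int64_t) b;
}

/* Widening multiply-and-accumulate: a 64-bit accumulator plus a
   32x32->64 product (the shape ARM SMLAL can implement).  */
int64_t
widening_mla (int64_t acc, int32_t a, int32_t b)
{
  return acc + (int64_t) a * (int64_t) b;
}

/* Non-widening multiply, for contrast: the product wraps modulo 2^32.  */
uint32_t
narrow_mul (uint32_t a, uint32_t b)
{
  return a * b;
}

/* Hypothetical self-check showing the difference between the forms.  */
void
self_check (void)
{
  assert (widening_mul (100000, 100000) == 10000000000LL);
  assert (widening_mla (5, -3, 7) == -16);
  assert (narrow_mul (100000u, 100000u) == 1410065408u);
}
```

The contrast with narrow_mul is the point: the widening forms keep the high half of the product, which is why the compiler must prove the inputs really are narrower than the result type before using them.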
Linaro Toolchain Builder (cbuild) wrote : Posted in a previous version of this proposal | # |
Michael Hope (michaelh1) wrote : Posted in a previous version of this proposal | # |
cbuild had trouble building this on <proposals.Build instance at 0x2b85680>.
See the *failed.txt logs under the build results at:
http://
The test suite results were not checked.
cbuild-checked: armv7l-
Michael Hope (michaelh1) wrote : Posted in a previous version of this proposal | # |
cbuild had trouble building this on <proposals.Build instance at 0x2141ea8>.
See the *failed.txt logs under the build results at:
http://
The test suite results were not checked.
cbuild-checked: armv7l-
Michael Hope (michaelh1) wrote : Posted in a previous version of this proposal | # |
cbuild had trouble building this on <proposals.Build instance at 0x7fe8f501a050>.
See the *failed.txt logs under the build results at:
http://
The test suite results were not checked.
cbuild-checked: i686-natty-
Michael Hope (michaelh1) wrote : Posted in a previous version of this proposal | # |
cbuild had trouble building this on <proposals.Build instance at 0x7fe8f5028ab8>.
See the *failed.txt logs under the build results at:
http://
The test suite results were not checked.
cbuild-checked: x86_64-
Linaro Toolchain Builder (cbuild) wrote : Posted in a previous version of this proposal | # |
cbuild has taken a snapshot of this branch at r106782 and queued it for build.
The snapshot is available at:
http://
and will be built on the following builders:
a9-builder armv5-builder i686 x86_64
You can track the build queue at:
http://
cbuild-snapshot: gcc-linaro-
cbuild-ancestor: lp:gcc-linaro/4.6+bzr106774
cbuild-state: check
Michael Hope (michaelh1) wrote : Posted in a previous version of this proposal | # |
cbuild had trouble building this on <proposals.Build instance at 0x3fec710>.
See the *failed.txt logs under the build results at:
http://
The test suite results were not checked.
cbuild-checked: i686-natty-
Michael Hope (michaelh1) wrote : Posted in a previous version of this proposal | # |
cbuild had trouble building this on <proposals.Build instance at 0x2b08c68>.
See the *failed.txt logs under the build results at:
http://
The test suite results were not checked.
cbuild-checked: x86_64-
Michael Hope (michaelh1) wrote : Posted in a previous version of this proposal | # |
cbuild had trouble building this on armv7l-
See the *failed.txt logs under the build results at:
http://
The test suite results were not checked.
cbuild-checked: armv7l-
Michael Hope (michaelh1) wrote : Posted in a previous version of this proposal | # |
cbuild had trouble building this on armv7l-
See the *failed.txt logs under the build results at:
http://
The test suite results were not checked.
cbuild-checked: armv7l-
Linaro Toolchain Builder (cbuild) wrote : | # |
cbuild has taken a snapshot of this branch at r106783 and queued it for build.
The snapshot is available at:
http://
and will be built on the following builders:
a9-builder armv5-builder i686 x86_64
You can track the build queue at:
http://
cbuild-snapshot: gcc-linaro-
cbuild-ancestor: lp:gcc-linaro/4.6+bzr106774
cbuild-state: check
Michael Hope (michaelh1) wrote : | # |
cbuild had trouble building this on x86_64-
See the *failed.txt logs under the build results at:
http://
The test suite results were not checked.
cbuild-checked: x86_64-
Michael Hope (michaelh1) wrote : | # |
cbuild had trouble building this on i686-natty-
See the *failed.txt logs under the build results at:
http://
The test suite results were not checked.
cbuild-checked: i686-natty-
Michael Hope (michaelh1) wrote : | # |
cbuild had trouble building this on armv7l-
See the *failed.txt logs under the build results at:
http://
The test suite results were not checked.
cbuild-checked: armv7l-
Preview Diff
1 | === modified file 'ChangeLog.linaro' |
2 | --- ChangeLog.linaro 2011-07-21 11:30:53 +0000 |
3 | +++ ChangeLog.linaro 2011-07-27 14:18:25 +0000 |
4 | @@ -1,3 +1,4 @@ |
5 | +<<<<<<< TREE |
6 | 2011-07-21 Richard Sandiford <rdsandiford@googlemail.com> |
7 | |
8 | gcc/ |
9 | @@ -137,6 +138,134 @@ |
10 | |
11 | * gcc.c-torture/compile/20110401-1.c: New test. |
12 | |
13 | +======= |
14 | +2011-07-22 Andrew Stubbs <ams@codesourcery.com> |
15 | + |
16 | + Backport from patches proposed for 4.7: |
17 | + |
18 | + 2011-07-22 Andrew Stubbs <ams@codesourcery.com> |
19 | + |
20 | + gcc/ |
21 | + * tree-ssa-math-opts.c (is_widening_mult_rhs_p): Handle constants |
22 | + beyond conversions. |
23 | + (convert_mult_to_widen): Convert constant inputs to the right type. |
24 | + (convert_plusminus_to_widen): Don't automatically reject inputs that |
25 | + are not an SSA_NAME. |
26 | + Convert constant inputs to the right type. |
27 | + |
28 | + gcc/testsuite/ |
29 | + * gcc.target/arm/wmul-11.c: New file. |
30 | + * gcc.target/arm/wmul-12.c: New file. |
31 | + * gcc.target/arm/wmul-13.c: New file. |
32 | + |
33 | + 2011-07-21 Andrew Stubbs <ams@codesourcery.com> |
34 | + |
35 | + gcc/ |
36 | + * tree-ssa-math-opts.c (convert_plusminus_to_widen): Convert add_rhs |
37 | + to the correct type. |
38 | + |
39 | + gcc/testsuite/ |
40 | + * gcc.target/arm/wmul-10.c: New file. |
41 | + |
42 | + 2011-06-24 Andrew Stubbs <ams@codesourcery.com> |
43 | + |
44 | + gcc/ |
45 | + * tree-ssa-math-opts.c (convert_mult_to_widen): Better handle |
46 | + unsigned inputs of different modes. |
47 | + (convert_plusminus_to_widen): Likewise. |
48 | + |
49 | + gcc/testsuite/ |
50 | + * gcc.target/arm/wmul-9.c: New file. |
51 | + * gcc.target/arm/wmul-bitfield-2.c: New file. |
52 | + |
53 | + 2011-07-14 Andrew Stubbs <ams@codesourcery.com> |
54 | + |
55 | + gcc/ |
56 | + * tree-ssa-math-opts.c (is_widening_mult_rhs_p): Add new argument |
57 | + 'type'. |
58 | + Use 'type' from caller, not inferred from 'rhs'. |
59 | + Don't reject non-conversion statements. Do return lhs in this case. |
60 | + (is_widening_mult_p): Add new argument 'type'. |
61 | + Use 'type' from caller, not inferred from 'stmt'. |
62 | + Pass type to is_widening_mult_rhs_p. |
63 | + (convert_mult_to_widen): Pass type to is_widening_mult_p. |
64 | + (convert_plusminus_to_widen): Likewise. |
65 | + |
66 | + gcc/testsuite/ |
67 | + * gcc.target/arm/wmul-8.c: New file. |
68 | + |
69 | + 2011-07-14 Andrew Stubbs <ams@codesourcery.com> |
70 | + |
71 | + gcc/ |
72 | + * tree-ssa-math-opts.c (is_widening_mult_p): Remove FIXME. |
73 | + Ensure the the larger type is the first operand. |
74 | + |
75 | + gcc/testsuite/ |
76 | + * gcc.target/arm/wmul-7.c: New file. |
77 | + |
78 | + 2011-07-14 Andrew Stubbs <ams@codesourcery.com> |
79 | + |
80 | + gcc/ |
81 | + * tree-ssa-math-opts.c (convert_mult_to_widen): Convert |
82 | + unsupported unsigned multiplies to signed. |
83 | + (convert_plusminus_to_widen): Likewise. |
84 | + |
85 | + gcc/testsuite/ |
86 | + * gcc.target/arm/wmul-6.c: New file. |
87 | + |
88 | + 2011-07-14 Andrew Stubbs <ams@codesourcery.com> |
89 | + |
90 | + gcc/ |
91 | + * tree-ssa-math-opts.c (convert_plusminus_to_widen): Permit a single |
92 | + conversion statement separating multiply-and-accumulate. |
93 | + |
94 | + gcc/testsuite/ |
95 | + * gcc.target/arm/wmul-5.c: New file. |
96 | + * gcc.target/arm/no-wmla-1.c: New file. |
97 | + |
98 | + 2011-07-14 Andrew Stubbs <ams@codesourcery.com> |
99 | + |
100 | + gcc/ |
101 | + * config/arm/arm.md (maddhidi4): Remove '*' from name. |
102 | + * expr.c (expand_expr_real_2): Use find_widening_optab_handler. |
103 | + * optabs.c (find_widening_optab_handler_and_mode): New function. |
104 | + (expand_widen_pattern_expr): Use find_widening_optab_handler. |
105 | + (expand_binop_directly): Likewise. |
106 | + (expand_binop): Likewise. |
107 | + * optabs.h (find_widening_optab_handler): New macro define. |
108 | + (find_widening_optab_handler_and_mode): New prototype. |
109 | + * tree-cfg.c (verify_gimple_assign_binary): Adjust WIDEN_MULT_EXPR |
110 | + type precision rules. |
111 | + (verify_gimple_assign_ternary): Likewise for WIDEN_MULT_PLUS_EXPR. |
112 | + * tree-ssa-math-opts.c (build_and_insert_cast): New function. |
113 | + (is_widening_mult_rhs_p): Allow widening by more than one mode. |
114 | + Explicitly disallow mis-matched input types. |
115 | + (convert_mult_to_widen): Use find_widening_optab_handler, and cast |
116 | + input types to fit the new handler. |
117 | + (convert_plusminus_to_widen): Likewise. |
118 | + |
119 | + gcc/testsuite/ |
120 | + * gcc.target/arm/wmul-bitfield-1.c: New file. |
121 | + |
122 | + |
123 | + 2011-07-09 Andrew Stubbs <ams@codesourcery.com> |
124 | + |
125 | + gcc/ |
126 | + * expr.c (expand_expr_real_2): Use widening_optab_handler. |
127 | + * genopinit.c (optabs): Use set_widening_optab_handler for $N. |
128 | + (gen_insn): $N now means $a must be wider than $b, not consecutive. |
129 | + * optabs.c (expand_widen_pattern_expr): Use widening_optab_handler. |
130 | + (expand_binop_directly): Likewise. |
131 | + (expand_binop): Likewise. |
132 | + * optabs.h (widening_optab_handlers): New struct. |
133 | + (optab_d): New member, 'widening'. |
134 | + (widening_optab_handler): New function. |
135 | + (set_widening_optab_handler): New function. |
136 | + * tree-ssa-math-opts.c (convert_mult_to_widen): Use |
137 | + widening_optab_handler. |
138 | + (convert_plusminus_to_widen): Likewise. |
139 | + |
140 | +>>>>>>> MERGE-SOURCE |
141 | 2011-07-13 Richard Sandiford <richard.sandiford@linaro.org> |
142 | |
143 | Backport from mainline: |
144 | |
145 | === modified file 'gcc/config/arm/arm.md' |
146 | --- gcc/config/arm/arm.md 2011-06-28 12:02:27 +0000 |
147 | +++ gcc/config/arm/arm.md 2011-07-27 14:18:25 +0000 |
148 | @@ -1839,7 +1839,7 @@ |
149 | (set_attr "predicable" "yes")] |
150 | ) |
151 | |
152 | -(define_insn "*maddhidi4" |
153 | +(define_insn "maddhidi4" |
154 | [(set (match_operand:DI 0 "s_register_operand" "=r") |
155 | (plus:DI |
156 | (mult:DI (sign_extend:DI |
157 | |
158 | === modified file 'gcc/expr.c' |
159 | --- gcc/expr.c 2011-07-14 11:52:32 +0000 |
160 | +++ gcc/expr.c 2011-07-27 14:18:25 +0000 |
161 | @@ -7680,18 +7680,16 @@ |
162 | { |
163 | enum machine_mode innermode = TYPE_MODE (TREE_TYPE (treeop0)); |
164 | this_optab = usmul_widen_optab; |
165 | - if (mode == GET_MODE_2XWIDER_MODE (innermode)) |
166 | + if (find_widening_optab_handler (this_optab, mode, innermode, 0) |
167 | + != CODE_FOR_nothing) |
168 | { |
169 | - if (optab_handler (this_optab, mode) != CODE_FOR_nothing) |
170 | - { |
171 | - if (TYPE_UNSIGNED (TREE_TYPE (treeop0))) |
172 | - expand_operands (treeop0, treeop1, NULL_RTX, &op0, &op1, |
173 | - EXPAND_NORMAL); |
174 | - else |
175 | - expand_operands (treeop0, treeop1, NULL_RTX, &op1, &op0, |
176 | - EXPAND_NORMAL); |
177 | - goto binop3; |
178 | - } |
179 | + if (TYPE_UNSIGNED (TREE_TYPE (treeop0))) |
180 | + expand_operands (treeop0, treeop1, NULL_RTX, &op0, &op1, |
181 | + EXPAND_NORMAL); |
182 | + else |
183 | + expand_operands (treeop0, treeop1, NULL_RTX, &op1, &op0, |
184 | + EXPAND_NORMAL); |
185 | + goto binop3; |
186 | } |
187 | } |
188 | /* Check for a multiplication with matching signedness. */ |
189 | @@ -7706,10 +7704,10 @@ |
190 | optab other_optab = zextend_p ? smul_widen_optab : umul_widen_optab; |
191 | this_optab = zextend_p ? umul_widen_optab : smul_widen_optab; |
192 | |
193 | - if (mode == GET_MODE_2XWIDER_MODE (innermode) |
194 | - && TREE_CODE (treeop0) != INTEGER_CST) |
195 | + if (TREE_CODE (treeop0) != INTEGER_CST) |
196 | { |
197 | - if (optab_handler (this_optab, mode) != CODE_FOR_nothing) |
198 | + if (find_widening_optab_handler (this_optab, mode, innermode, 0) |
199 | + != CODE_FOR_nothing) |
200 | { |
201 | expand_operands (treeop0, treeop1, NULL_RTX, &op0, &op1, |
202 | EXPAND_NORMAL); |
203 | @@ -7717,7 +7715,8 @@ |
204 | unsignedp, this_optab); |
205 | return REDUCE_BIT_FIELD (temp); |
206 | } |
207 | - if (optab_handler (other_optab, mode) != CODE_FOR_nothing |
208 | + if (find_widening_optab_handler (other_optab, mode, innermode, 0) |
209 | + != CODE_FOR_nothing |
210 | && innermode == word_mode) |
211 | { |
212 | rtx htem, hipart; |
213 | |
214 | === modified file 'gcc/genopinit.c' |
215 | --- gcc/genopinit.c 2011-05-05 15:43:06 +0000 |
216 | +++ gcc/genopinit.c 2011-07-27 14:18:25 +0000 |
217 | @@ -46,10 +46,12 @@ |
218 | used. $A and $B are replaced with the full name of the mode; $a and $b |
219 | are replaced with the short form of the name, as above. |
220 | |
221 | - If $N is present in the pattern, it means the two modes must be consecutive |
222 | - widths in the same mode class (e.g, QImode and HImode). $I means that |
223 | - only full integer modes should be considered for the next mode, and $F |
224 | - means that only float modes should be considered. |
225 | + If $N is present in the pattern, it means the two modes must be in |
226 | + the same mode class, and $b must be greater than $a (e.g, QImode |
227 | + and HImode). |
228 | + |
229 | + $I means that only full integer modes should be considered for the |
230 | + next mode, and $F means that only float modes should be considered. |
231 | $P means that both full and partial integer modes should be considered. |
232 | $Q means that only fixed-point modes should be considered. |
233 | |
234 | @@ -99,17 +101,17 @@ |
235 | "set_optab_handler (smulv_optab, $A, CODE_FOR_$(mulv$I$a3$))", |
236 | "set_optab_handler (umul_highpart_optab, $A, CODE_FOR_$(umul$a3_highpart$))", |
237 | "set_optab_handler (smul_highpart_optab, $A, CODE_FOR_$(smul$a3_highpart$))", |
238 | - "set_optab_handler (smul_widen_optab, $B, CODE_FOR_$(mul$a$b3$)$N)", |
239 | - "set_optab_handler (umul_widen_optab, $B, CODE_FOR_$(umul$a$b3$)$N)", |
240 | - "set_optab_handler (usmul_widen_optab, $B, CODE_FOR_$(usmul$a$b3$)$N)", |
241 | - "set_optab_handler (smadd_widen_optab, $B, CODE_FOR_$(madd$a$b4$)$N)", |
242 | - "set_optab_handler (umadd_widen_optab, $B, CODE_FOR_$(umadd$a$b4$)$N)", |
243 | - "set_optab_handler (ssmadd_widen_optab, $B, CODE_FOR_$(ssmadd$a$b4$)$N)", |
244 | - "set_optab_handler (usmadd_widen_optab, $B, CODE_FOR_$(usmadd$a$b4$)$N)", |
245 | - "set_optab_handler (smsub_widen_optab, $B, CODE_FOR_$(msub$a$b4$)$N)", |
246 | - "set_optab_handler (umsub_widen_optab, $B, CODE_FOR_$(umsub$a$b4$)$N)", |
247 | - "set_optab_handler (ssmsub_widen_optab, $B, CODE_FOR_$(ssmsub$a$b4$)$N)", |
248 | - "set_optab_handler (usmsub_widen_optab, $B, CODE_FOR_$(usmsub$a$b4$)$N)", |
249 | + "set_widening_optab_handler (smul_widen_optab, $B, $A, CODE_FOR_$(mul$a$b3$)$N)", |
250 | + "set_widening_optab_handler (umul_widen_optab, $B, $A, CODE_FOR_$(umul$a$b3$)$N)", |
251 | + "set_widening_optab_handler (usmul_widen_optab, $B, $A, CODE_FOR_$(usmul$a$b3$)$N)", |
252 | + "set_widening_optab_handler (smadd_widen_optab, $B, $A, CODE_FOR_$(madd$a$b4$)$N)", |
253 | + "set_widening_optab_handler (umadd_widen_optab, $B, $A, CODE_FOR_$(umadd$a$b4$)$N)", |
254 | + "set_widening_optab_handler (ssmadd_widen_optab, $B, $A, CODE_FOR_$(ssmadd$a$b4$)$N)", |
255 | + "set_widening_optab_handler (usmadd_widen_optab, $B, $A, CODE_FOR_$(usmadd$a$b4$)$N)", |
256 | + "set_widening_optab_handler (smsub_widen_optab, $B, $A, CODE_FOR_$(msub$a$b4$)$N)", |
257 | + "set_widening_optab_handler (umsub_widen_optab, $B, $A, CODE_FOR_$(umsub$a$b4$)$N)", |
258 | + "set_widening_optab_handler (ssmsub_widen_optab, $B, $A, CODE_FOR_$(ssmsub$a$b4$)$N)", |
259 | + "set_widening_optab_handler (usmsub_widen_optab, $B, $A, CODE_FOR_$(usmsub$a$b4$)$N)", |
260 | "set_optab_handler (sdiv_optab, $A, CODE_FOR_$(div$a3$))", |
261 | "set_optab_handler (ssdiv_optab, $A, CODE_FOR_$(ssdiv$Q$a3$))", |
262 | "set_optab_handler (sdivv_optab, $A, CODE_FOR_$(div$V$I$a3$))", |
263 | @@ -304,7 +306,7 @@ |
264 | { |
265 | int force_float = 0, force_int = 0, force_partial_int = 0; |
266 | int force_fixed = 0; |
267 | - int force_consec = 0; |
268 | + int force_wider = 0; |
269 | int matches = 1; |
270 | |
271 | for (pp = optabs[pindex]; pp[0] != '$' || pp[1] != '('; pp++) |
272 | @@ -322,7 +324,7 @@ |
273 | switch (*++pp) |
274 | { |
275 | case 'N': |
276 | - force_consec = 1; |
277 | + force_wider = 1; |
278 | break; |
279 | case 'I': |
280 | force_int = 1; |
281 | @@ -391,7 +393,10 @@ |
282 | || mode_class[i] == MODE_VECTOR_FRACT |
283 | || mode_class[i] == MODE_VECTOR_UFRACT |
284 | || mode_class[i] == MODE_VECTOR_ACCUM |
285 | - || mode_class[i] == MODE_VECTOR_UACCUM)) |
286 | + || mode_class[i] == MODE_VECTOR_UACCUM) |
287 | + && (! force_wider |
288 | + || *pp == 'a' |
289 | + || m1 < i)) |
290 | break; |
291 | } |
292 | |
293 | @@ -411,8 +416,7 @@ |
294 | } |
295 | |
296 | if (matches && pp[0] == '$' && pp[1] == ')' |
297 | - && *np == 0 |
298 | - && (! force_consec || (int) GET_MODE_WIDER_MODE(m1) == m2)) |
299 | + && *np == 0) |
300 | break; |
301 | } |
302 | |
303 | |
304 | === modified file 'gcc/optabs.c' |
305 | --- gcc/optabs.c 2011-07-04 14:03:49 +0000 |
306 | +++ gcc/optabs.c 2011-07-27 14:18:25 +0000 |
307 | @@ -225,6 +225,49 @@ |
308 | return 1; |
309 | } |
310 | |
311 | |
312 | +/* Given two input operands, OP0 and OP1, determine what the correct from_mode |
313 | + for a widening operation would be. In most cases this would be OP0, but if |
314 | + that's a constant it'll be VOIDmode, which isn't useful. */ |
315 | + |
316 | +static enum machine_mode |
317 | +widened_mode (rtx op0, rtx op1) |
318 | +{ |
319 | + enum machine_mode m0 = GET_MODE (op0); |
320 | + enum machine_mode m1 = GET_MODE (op1); |
321 | + return (GET_MODE_SIZE (m0) < GET_MODE_SIZE (m1)) ? m1 : m0; |
322 | +} |
323 | + |
324 | |
325 | +/* Find a widening optab even if it doesn't widen as much as we want. |
326 | + E.g. if from_mode is HImode, and to_mode is DImode, and there is no |
327 | + direct HI->SI insn, then return SI->DI, if that exists. |
328 | + If PERMIT_NON_WIDENING is non-zero then this can be used with |
329 | + non-widening optabs also. */ |
330 | + |
331 | +enum insn_code |
332 | +find_widening_optab_handler_and_mode (optab op, enum machine_mode to_mode, |
333 | + enum machine_mode from_mode, |
334 | + int permit_non_widening, |
335 | + enum machine_mode *found_mode) |
336 | +{ |
337 | + for (; (permit_non_widening || from_mode != to_mode) |
338 | + && GET_MODE_SIZE (from_mode) <= GET_MODE_SIZE (to_mode) |
339 | + && from_mode != VOIDmode; |
340 | + from_mode = GET_MODE_WIDER_MODE (from_mode)) |
341 | + { |
342 | + enum insn_code handler = widening_optab_handler (op, to_mode, |
343 | + from_mode); |
344 | + |
345 | + if (handler != CODE_FOR_nothing) |
346 | + { |
347 | + if (found_mode) |
348 | + *found_mode = from_mode; |
349 | + return handler; |
350 | + } |
351 | + } |
352 | + |
353 | + return CODE_FOR_nothing; |
354 | +} |
355 | + |
356 | |
357 | /* Widen OP to MODE and return the rtx for the widened operand. UNSIGNEDP |
358 | says whether OP is signed or unsigned. NO_EXTEND is nonzero if we need |
359 | not actually do a sign-extend or zero-extend, but can leave the |
360 | @@ -517,8 +560,9 @@ |
361 | optab_for_tree_code (ops->code, TREE_TYPE (oprnd0), optab_default); |
362 | if (ops->code == WIDEN_MULT_PLUS_EXPR |
363 | || ops->code == WIDEN_MULT_MINUS_EXPR) |
364 | - icode = (int) optab_handler (widen_pattern_optab, |
365 | - TYPE_MODE (TREE_TYPE (ops->op2))); |
366 | + icode = (int) find_widening_optab_handler (widen_pattern_optab, |
367 | + TYPE_MODE (TREE_TYPE (ops->op2)), |
368 | + tmode0, 0); |
369 | else |
370 | icode = (int) optab_handler (widen_pattern_optab, tmode0); |
371 | gcc_assert (icode != CODE_FOR_nothing); |
372 | @@ -1389,7 +1433,9 @@ |
373 | rtx target, int unsignedp, enum optab_methods methods, |
374 | rtx last) |
375 | { |
376 | - int icode = (int) optab_handler (binoptab, mode); |
377 | + enum machine_mode from_mode = widened_mode (op0, op1); |
378 | + int icode = (int) find_widening_optab_handler (binoptab, mode, |
379 | + from_mode, 1); |
380 | enum machine_mode mode0 = insn_data[icode].operand[1].mode; |
381 | enum machine_mode mode1 = insn_data[icode].operand[2].mode; |
382 | enum machine_mode tmp_mode; |
383 | @@ -1546,7 +1592,9 @@ |
384 | /* If we can do it with a three-operand insn, do so. */ |
385 | |
386 | if (methods != OPTAB_MUST_WIDEN |
387 | - && optab_handler (binoptab, mode) != CODE_FOR_nothing) |
388 | + && find_widening_optab_handler (binoptab, mode, |
389 | + widened_mode (op0, op1), 1) |
390 | + != CODE_FOR_nothing) |
391 | { |
392 | temp = expand_binop_directly (mode, binoptab, op0, op1, target, |
393 | unsignedp, methods, last); |
394 | @@ -1585,9 +1633,10 @@ |
395 | takes operands of this mode and makes a wider mode. */ |
396 | |
397 | if (binoptab == smul_optab |
398 | - && GET_MODE_WIDER_MODE (mode) != VOIDmode |
399 | - && (optab_handler ((unsignedp ? umul_widen_optab : smul_widen_optab), |
400 | - GET_MODE_WIDER_MODE (mode)) |
401 | + && GET_MODE_2XWIDER_MODE (mode) != VOIDmode |
402 | + && (widening_optab_handler ((unsignedp ? umul_widen_optab |
403 | + : smul_widen_optab), |
404 | + GET_MODE_2XWIDER_MODE (mode), mode) |
405 | != CODE_FOR_nothing)) |
406 | { |
407 | temp = expand_binop (GET_MODE_WIDER_MODE (mode), |
408 | @@ -1615,12 +1664,15 @@ |
409 | wider_mode != VOIDmode; |
410 | wider_mode = GET_MODE_WIDER_MODE (wider_mode)) |
411 | { |
412 | - if (optab_handler (binoptab, wider_mode) != CODE_FOR_nothing |
413 | + if (optab_handler (binoptab, wider_mode) |
414 | + != CODE_FOR_nothing |
415 | || (binoptab == smul_optab |
416 | && GET_MODE_WIDER_MODE (wider_mode) != VOIDmode |
417 | - && (optab_handler ((unsignedp ? umul_widen_optab |
418 | - : smul_widen_optab), |
419 | - GET_MODE_WIDER_MODE (wider_mode)) |
420 | + && (find_widening_optab_handler ((unsignedp |
421 | + ? umul_widen_optab |
422 | + : smul_widen_optab), |
423 | + GET_MODE_WIDER_MODE (wider_mode), |
424 | + mode, 0) |
425 | != CODE_FOR_nothing))) |
426 | { |
427 | rtx xop0 = op0, xop1 = op1; |
428 | @@ -2043,8 +2095,8 @@ |
429 | && optab_handler (add_optab, word_mode) != CODE_FOR_nothing) |
430 | { |
431 | rtx product = NULL_RTX; |
432 | - |
433 | - if (optab_handler (umul_widen_optab, mode) != CODE_FOR_nothing) |
434 | + if (widening_optab_handler (umul_widen_optab, mode, word_mode) |
435 | + != CODE_FOR_nothing) |
436 | { |
437 | product = expand_doubleword_mult (mode, op0, op1, target, |
438 | true, methods); |
439 | @@ -2053,7 +2105,8 @@ |
440 | } |
441 | |
442 | if (product == NULL_RTX |
443 | - && optab_handler (smul_widen_optab, mode) != CODE_FOR_nothing) |
444 | + && widening_optab_handler (smul_widen_optab, mode, word_mode) |
445 | + != CODE_FOR_nothing) |
446 | { |
447 | product = expand_doubleword_mult (mode, op0, op1, target, |
448 | false, methods); |
449 | @@ -2144,7 +2197,8 @@ |
450 | wider_mode != VOIDmode; |
451 | wider_mode = GET_MODE_WIDER_MODE (wider_mode)) |
452 | { |
453 | - if (optab_handler (binoptab, wider_mode) != CODE_FOR_nothing |
454 | + if (find_widening_optab_handler (binoptab, wider_mode, mode, 1) |
455 | + != CODE_FOR_nothing |
456 | || (methods == OPTAB_LIB |
457 | && optab_libfunc (binoptab, wider_mode))) |
458 | { |
459 | |
460 | === modified file 'gcc/optabs.h' |
461 | --- gcc/optabs.h 2011-05-05 15:43:06 +0000 |
462 | +++ gcc/optabs.h 2011-07-27 14:18:25 +0000 |
463 | @@ -42,6 +42,11 @@ |
464 | int insn_code; |
465 | }; |
466 | |
467 | +struct widening_optab_handlers |
468 | +{ |
469 | + struct optab_handlers handlers[NUM_MACHINE_MODES][NUM_MACHINE_MODES]; |
470 | +}; |
471 | + |
472 | struct optab_d |
473 | { |
474 | enum rtx_code code; |
475 | @@ -50,6 +55,7 @@ |
476 | void (*libcall_gen)(struct optab_d *, const char *name, char suffix, |
477 | enum machine_mode); |
478 | struct optab_handlers handlers[NUM_MACHINE_MODES]; |
479 | + struct widening_optab_handlers *widening; |
480 | }; |
481 | typedef struct optab_d * optab; |
482 | |
483 | @@ -799,6 +805,15 @@ |
484 | extern void emit_unop_insn (int, rtx, rtx, enum rtx_code); |
485 | extern bool maybe_emit_unop_insn (int, rtx, rtx, enum rtx_code); |
486 | |
487 | +/* Find a widening optab even if it doesn't widen as much as we want. */ |
488 | +#define find_widening_optab_handler(A,B,C,D) \ |
489 | + find_widening_optab_handler_and_mode (A, B, C, D, NULL) |
490 | +extern enum insn_code find_widening_optab_handler_and_mode (optab, |
491 | + enum machine_mode, |
492 | + enum machine_mode, |
493 | + int, |
494 | + enum machine_mode *); |
495 | + |
496 | /* An extra flag to control optab_for_tree_code's behavior. This is needed to |
497 | distinguish between machines with a vector shift that takes a scalar for the |
498 | shift amount vs. machines that take a vector for the shift amount. */ |
499 | @@ -874,6 +889,23 @@ |
500 | + (int) CODE_FOR_nothing); |
501 | } |
502 | |
503 | +/* Like optab_handler, but for widening_operations that have a TO_MODE and |
504 | + a FROM_MODE. */ |
505 | + |
506 | +static inline enum insn_code |
507 | +widening_optab_handler (optab op, enum machine_mode to_mode, |
508 | + enum machine_mode from_mode) |
509 | +{ |
510 | + if (to_mode == from_mode || from_mode == VOIDmode) |
511 | + return optab_handler (op, to_mode); |
512 | + |
513 | + if (op->widening) |
514 | + return (enum insn_code) (op->widening->handlers[(int) to_mode][(int) from_mode].insn_code |
515 | + + (int) CODE_FOR_nothing); |
516 | + |
517 | + return CODE_FOR_nothing; |
518 | +} |
519 | + |
520 | /* Record that insn CODE should be used to implement mode MODE of OP. */ |
521 | |
522 | static inline void |
523 | @@ -882,6 +914,26 @@ |
524 | op->handlers[(int) mode].insn_code = (int) code - (int) CODE_FOR_nothing; |
525 | } |
526 | |
527 | +/* Like set_optab_handler, but for widening operations that have a TO_MODE |
528 | + and a FROM_MODE. */ |
529 | + |
530 | +static inline void |
531 | +set_widening_optab_handler (optab op, enum machine_mode to_mode, |
532 | + enum machine_mode from_mode, enum insn_code code) |
533 | +{ |
534 | + if (to_mode == from_mode) |
535 | + set_optab_handler (op, to_mode, code); |
536 | + else |
537 | + { |
538 | + if (op->widening == NULL) |
539 | + op->widening = (struct widening_optab_handlers *) |
540 | + xcalloc (1, sizeof (struct widening_optab_handlers)); |
541 | + |
542 | + op->widening->handlers[(int) to_mode][(int) from_mode].insn_code |
543 | + = (int) code - (int) CODE_FOR_nothing; |
544 | + } |
545 | +} |
546 | + |
547 | /* Return the insn used to perform conversion OP from mode FROM_MODE |
548 | to mode TO_MODE; return CODE_FOR_nothing if the target does not have |
549 | such an insn. */ |
550 | |
551 | === added file 'gcc/testsuite/gcc.target/arm/no-wmla-1.c' |
552 | --- gcc/testsuite/gcc.target/arm/no-wmla-1.c 1970-01-01 00:00:00 +0000 |
553 | +++ gcc/testsuite/gcc.target/arm/no-wmla-1.c 2011-07-27 14:18:25 +0000 |
554 | @@ -0,0 +1,11 @@ |
555 | +/* { dg-do compile } */ |
556 | +/* { dg-options "-O2 -march=armv7-a" } */ |
557 | + |
558 | +int |
559 | +foo (int a, short b, short c) |
560 | +{ |
561 | + int bc = b * c; |
562 | + return a + (short)bc; |
563 | +} |
564 | + |
565 | +/* { dg-final { scan-assembler "mul" } } */ |
566 | |
567 | === added file 'gcc/testsuite/gcc.target/arm/wmul-10.c' |
568 | --- gcc/testsuite/gcc.target/arm/wmul-10.c 1970-01-01 00:00:00 +0000 |
569 | +++ gcc/testsuite/gcc.target/arm/wmul-10.c 2011-07-27 14:18:25 +0000 |
570 | @@ -0,0 +1,10 @@ |
571 | +/* { dg-do compile } */ |
572 | +/* { dg-options "-O2 -march=armv7-a" } */ |
573 | + |
574 | +unsigned long long |
575 | +foo (unsigned short a, unsigned short *b, unsigned short *c) |
576 | +{ |
577 | + return (unsigned)a + (unsigned long long)*b * (unsigned long long)*c; |
578 | +} |
579 | + |
580 | +/* { dg-final { scan-assembler "umlal" } } */ |
581 | |
582 | === added file 'gcc/testsuite/gcc.target/arm/wmul-11.c' |
583 | --- gcc/testsuite/gcc.target/arm/wmul-11.c 1970-01-01 00:00:00 +0000 |
584 | +++ gcc/testsuite/gcc.target/arm/wmul-11.c 2011-07-27 14:18:25 +0000 |
585 | @@ -0,0 +1,10 @@ |
586 | +/* { dg-do compile } */ |
587 | +/* { dg-options "-O2 -march=armv7-a" } */ |
588 | + |
589 | +long long |
590 | +foo (int *b) |
591 | +{ |
592 | + return 10 * (long long)*b; |
593 | +} |
594 | + |
595 | +/* { dg-final { scan-assembler "smull" } } */ |
596 | |
597 | === added file 'gcc/testsuite/gcc.target/arm/wmul-12.c' |
598 | --- gcc/testsuite/gcc.target/arm/wmul-12.c 1970-01-01 00:00:00 +0000 |
599 | +++ gcc/testsuite/gcc.target/arm/wmul-12.c 2011-07-27 14:18:25 +0000 |
600 | @@ -0,0 +1,11 @@ |
601 | +/* { dg-do compile } */ |
602 | +/* { dg-options "-O2 -march=armv7-a" } */ |
603 | + |
604 | +long long |
605 | +foo (int *b, int *c) |
606 | +{ |
607 | + int tmp = *b * *c; |
608 | + return 10 + (long long)tmp; |
609 | +} |
610 | + |
611 | +/* { dg-final { scan-assembler "smlal" } } */ |
612 | |
613 | === added file 'gcc/testsuite/gcc.target/arm/wmul-13.c' |
614 | --- gcc/testsuite/gcc.target/arm/wmul-13.c 1970-01-01 00:00:00 +0000 |
615 | +++ gcc/testsuite/gcc.target/arm/wmul-13.c 2011-07-27 14:18:25 +0000 |
616 | @@ -0,0 +1,10 @@ |
617 | +/* { dg-do compile } */ |
618 | +/* { dg-options "-O2 -march=armv7-a" } */ |
619 | + |
620 | +long long |
621 | +foo (int *a, int *b) |
622 | +{ |
623 | + return *a + (long long)*b * 10; |
624 | +} |
625 | + |
626 | +/* { dg-final { scan-assembler "smlal" } } */ |
627 | |
628 | === added file 'gcc/testsuite/gcc.target/arm/wmul-5.c' |
629 | --- gcc/testsuite/gcc.target/arm/wmul-5.c 1970-01-01 00:00:00 +0000 |
630 | +++ gcc/testsuite/gcc.target/arm/wmul-5.c 2011-07-27 14:18:25 +0000 |
631 | @@ -0,0 +1,10 @@ |
632 | +/* { dg-do compile } */ |
633 | +/* { dg-options "-O2 -march=armv7-a" } */ |
634 | + |
635 | +long long |
636 | +foo (long long a, char *b, char *c) |
637 | +{ |
638 | + return a + *b * *c; |
639 | +} |
640 | + |
641 | +/* { dg-final { scan-assembler "umlal" } } */ |
642 | |
643 | === added file 'gcc/testsuite/gcc.target/arm/wmul-6.c' |
644 | --- gcc/testsuite/gcc.target/arm/wmul-6.c 1970-01-01 00:00:00 +0000 |
645 | +++ gcc/testsuite/gcc.target/arm/wmul-6.c 2011-07-27 14:18:25 +0000 |
646 | @@ -0,0 +1,10 @@ |
647 | +/* { dg-do compile } */ |
648 | +/* { dg-options "-O2 -march=armv7-a" } */ |
649 | + |
650 | +long long |
651 | +foo (long long a, unsigned char *b, signed char *c) |
652 | +{ |
653 | + return a + (long long)*b * (long long)*c; |
654 | +} |
655 | + |
656 | +/* { dg-final { scan-assembler "smlal" } } */ |
657 | |
658 | === added file 'gcc/testsuite/gcc.target/arm/wmul-7.c' |
659 | --- gcc/testsuite/gcc.target/arm/wmul-7.c 1970-01-01 00:00:00 +0000 |
660 | +++ gcc/testsuite/gcc.target/arm/wmul-7.c 2011-07-27 14:18:25 +0000 |
661 | @@ -0,0 +1,10 @@ |
662 | +/* { dg-do compile } */ |
663 | +/* { dg-options "-O2 -march=armv7-a" } */ |
664 | + |
665 | +unsigned long long |
666 | +foo (unsigned long long a, unsigned char *b, unsigned short *c) |
667 | +{ |
668 | + return a + *b * *c; |
669 | +} |
670 | + |
671 | +/* { dg-final { scan-assembler "umlal" } } */ |
672 | |
673 | === added file 'gcc/testsuite/gcc.target/arm/wmul-8.c' |
674 | --- gcc/testsuite/gcc.target/arm/wmul-8.c 1970-01-01 00:00:00 +0000 |
675 | +++ gcc/testsuite/gcc.target/arm/wmul-8.c 2011-07-27 14:18:25 +0000 |
676 | @@ -0,0 +1,10 @@ |
677 | +/* { dg-do compile } */ |
678 | +/* { dg-options "-O2 -march=armv7-a" } */ |
679 | + |
680 | +long long |
681 | +foo (long long a, int *b, int *c) |
682 | +{ |
683 | + return a + *b * *c; |
684 | +} |
685 | + |
686 | +/* { dg-final { scan-assembler "smlal" } } */ |
687 | |
688 | === added file 'gcc/testsuite/gcc.target/arm/wmul-9.c' |
689 | --- gcc/testsuite/gcc.target/arm/wmul-9.c 1970-01-01 00:00:00 +0000 |
690 | +++ gcc/testsuite/gcc.target/arm/wmul-9.c 2011-07-27 14:18:25 +0000 |
691 | @@ -0,0 +1,10 @@ |
692 | +/* { dg-do compile } */ |
693 | +/* { dg-options "-O2 -march=armv7-a" } */ |
694 | + |
695 | +long long |
696 | +foo (long long a, short *b, char *c) |
697 | +{ |
698 | + return a + *b * *c; |
699 | +} |
700 | + |
701 | +/* { dg-final { scan-assembler "smlalbb" } } */ |
702 | |
703 | === added file 'gcc/testsuite/gcc.target/arm/wmul-bitfield-1.c' |
704 | --- gcc/testsuite/gcc.target/arm/wmul-bitfield-1.c 1970-01-01 00:00:00 +0000 |
705 | +++ gcc/testsuite/gcc.target/arm/wmul-bitfield-1.c 2011-07-27 14:18:25 +0000 |
706 | @@ -0,0 +1,17 @@ |
707 | +/* { dg-do compile } */ |
708 | +/* { dg-options "-O2 -march=armv7-a" } */ |
709 | + |
710 | +struct bf |
711 | +{ |
712 | + int a : 3; |
713 | + int b : 15; |
714 | + int c : 3; |
715 | +}; |
716 | + |
717 | +long long |
718 | +foo (long long a, struct bf b, struct bf c) |
719 | +{ |
720 | + return a + b.b * c.b; |
721 | +} |
722 | + |
723 | +/* { dg-final { scan-assembler "smlalbb" } } */ |
724 | |
725 | === added file 'gcc/testsuite/gcc.target/arm/wmul-bitfield-2.c' |
726 | --- gcc/testsuite/gcc.target/arm/wmul-bitfield-2.c 1970-01-01 00:00:00 +0000 |
727 | +++ gcc/testsuite/gcc.target/arm/wmul-bitfield-2.c 2011-07-27 14:18:25 +0000 |
728 | @@ -0,0 +1,17 @@ |
729 | +/* { dg-do compile } */ |
730 | +/* { dg-options "-O2 -march=armv7-a" } */ |
731 | + |
732 | +struct bf |
733 | +{ |
734 | + int a : 3; |
735 | + unsigned int b : 15; |
736 | + int c : 3; |
737 | +}; |
738 | + |
739 | +long long |
740 | +foo (long long a, struct bf b, struct bf c) |
741 | +{ |
742 | + return a + b.b * c.c; |
743 | +} |
744 | + |
745 | +/* { dg-final { scan-assembler "smlalbb" } } */ |
746 | |
747 | === modified file 'gcc/tree-cfg.c' |
748 | --- gcc/tree-cfg.c 2011-07-01 09:19:21 +0000 |
749 | +++ gcc/tree-cfg.c 2011-07-27 14:18:25 +0000 |
750 | @@ -3574,7 +3574,7 @@ |
751 | case WIDEN_MULT_EXPR: |
752 | if (TREE_CODE (lhs_type) != INTEGER_TYPE) |
753 | return true; |
754 | - return ((2 * TYPE_PRECISION (rhs1_type) != TYPE_PRECISION (lhs_type)) |
755 | + return ((2 * TYPE_PRECISION (rhs1_type) > TYPE_PRECISION (lhs_type)) |
756 | || (TYPE_PRECISION (rhs1_type) != TYPE_PRECISION (rhs2_type))); |
757 | |
758 | case WIDEN_SUM_EXPR: |
759 | @@ -3667,7 +3667,7 @@ |
760 | && !FIXED_POINT_TYPE_P (rhs1_type)) |
761 | || !useless_type_conversion_p (rhs1_type, rhs2_type) |
762 | || !useless_type_conversion_p (lhs_type, rhs3_type) |
763 | - || 2 * TYPE_PRECISION (rhs1_type) != TYPE_PRECISION (lhs_type) |
764 | + || 2 * TYPE_PRECISION (rhs1_type) > TYPE_PRECISION (lhs_type) |
765 | || TYPE_PRECISION (rhs1_type) != TYPE_PRECISION (rhs2_type)) |
766 | { |
767 | error ("type mismatch in widening multiply-accumulate expression"); |
768 | |
769 | === modified file 'gcc/tree-ssa-math-opts.c' |
770 | --- gcc/tree-ssa-math-opts.c 2011-03-11 16:36:16 +0000 |
771 | +++ gcc/tree-ssa-math-opts.c 2011-07-27 14:18:25 +0000 |
772 | @@ -1266,42 +1266,75 @@ |
773 | } |
774 | }; |
775 | |
776 | -/* Return true if RHS is a suitable operand for a widening multiplication. |
777 | +/* Build a gimple assignment to cast VAL to TARGET. Insert the statement |
778 | + prior to GSI's current position, and return the fresh SSA name. */ |
779 | + |
780 | +static tree |
781 | +build_and_insert_cast (gimple_stmt_iterator *gsi, location_t loc, |
782 | + tree target, tree val) |
783 | +{ |
784 | + tree result = make_ssa_name (target, NULL); |
785 | + gimple stmt = gimple_build_assign_with_ops (CONVERT_EXPR, result, val, NULL); |
786 | + gimple_set_location (stmt, loc); |
787 | + gsi_insert_before (gsi, stmt, GSI_SAME_STMT); |
788 | + return result; |
789 | +} |
790 | + |
791 | +/* Return true if RHS is a suitable operand for a widening multiplication, |
792 | + assuming a target type of TYPE. |
793 | There are two cases: |
794 | |
795 | - - RHS makes some value twice as wide. Store that value in *NEW_RHS_OUT |
796 | - if so, and store its type in *TYPE_OUT. |
797 | + - RHS makes some value at least twice as wide. Store that value |
798 | + in *NEW_RHS_OUT if so, and store its type in *TYPE_OUT. |
799 | |
800 | - RHS is an integer constant. Store that value in *NEW_RHS_OUT if so, |
801 | but leave *TYPE_OUT untouched. */ |
802 | |
803 | static bool |
804 | -is_widening_mult_rhs_p (tree rhs, tree *type_out, tree *new_rhs_out) |
805 | +is_widening_mult_rhs_p (tree type, tree rhs, tree *type_out, |
806 | + tree *new_rhs_out) |
807 | { |
808 | gimple stmt; |
809 | - tree type, type1, rhs1; |
810 | + tree type1, rhs1; |
811 | enum tree_code rhs_code; |
812 | |
813 | if (TREE_CODE (rhs) == SSA_NAME) |
814 | { |
815 | - type = TREE_TYPE (rhs); |
816 | stmt = SSA_NAME_DEF_STMT (rhs); |
817 | if (!is_gimple_assign (stmt)) |
818 | - return false; |
819 | - |
820 | - rhs_code = gimple_assign_rhs_code (stmt); |
821 | - if (TREE_CODE (type) == INTEGER_TYPE |
822 | - ? !CONVERT_EXPR_CODE_P (rhs_code) |
823 | - : rhs_code != FIXED_CONVERT_EXPR) |
824 | - return false; |
825 | - |
826 | - rhs1 = gimple_assign_rhs1 (stmt); |
827 | - type1 = TREE_TYPE (rhs1); |
828 | + { |
829 | + rhs1 = NULL; |
830 | + type1 = TREE_TYPE (rhs); |
831 | + } |
832 | + else |
833 | + { |
834 | + rhs1 = gimple_assign_rhs1 (stmt); |
835 | + type1 = TREE_TYPE (rhs1); |
836 | + } |
837 | + |
838 | + if (rhs1 && TREE_CODE (rhs1) == INTEGER_CST) |
839 | + { |
840 | + *new_rhs_out = rhs1; |
841 | + *type_out = NULL; |
842 | + return true; |
843 | + } |
844 | + |
845 | if (TREE_CODE (type1) != TREE_CODE (type) |
846 | - || TYPE_PRECISION (type1) * 2 != TYPE_PRECISION (type)) |
847 | + || TYPE_PRECISION (type1) * 2 > TYPE_PRECISION (type)) |
848 | return false; |
849 | |
850 | - *new_rhs_out = rhs1; |
851 | + if (rhs1) |
852 | + { |
853 | + rhs_code = gimple_assign_rhs_code (stmt); |
854 | + if (TREE_CODE (type) == INTEGER_TYPE |
855 | + ? !CONVERT_EXPR_CODE_P (rhs_code) |
856 | + : rhs_code != FIXED_CONVERT_EXPR) |
857 | + *new_rhs_out = rhs; |
858 | + else |
859 | + *new_rhs_out = rhs1; |
860 | + } |
861 | + else |
862 | + *new_rhs_out = rhs; |
863 | *type_out = type1; |
864 | return true; |
865 | } |
866 | @@ -1316,28 +1349,27 @@ |
867 | return false; |
868 | } |
869 | |
870 | -/* Return true if STMT performs a widening multiplication. If so, |
871 | - store the unwidened types of the operands in *TYPE1_OUT and *TYPE2_OUT |
872 | - respectively. Also fill *RHS1_OUT and *RHS2_OUT such that converting |
873 | - those operands to types *TYPE1_OUT and *TYPE2_OUT would give the |
874 | - operands of the multiplication. */ |
875 | +/* Return true if STMT performs a widening multiplication, assuming the |
876 | + output type is TYPE. If so, store the unwidened types of the operands |
877 | + in *TYPE1_OUT and *TYPE2_OUT respectively. Also fill *RHS1_OUT and |
878 | + *RHS2_OUT such that converting those operands to types *TYPE1_OUT |
879 | + and *TYPE2_OUT would give the operands of the multiplication. */ |
880 | |
881 | static bool |
882 | -is_widening_mult_p (gimple stmt, |
883 | +is_widening_mult_p (tree type, gimple stmt, |
884 | tree *type1_out, tree *rhs1_out, |
885 | tree *type2_out, tree *rhs2_out) |
886 | { |
887 | - tree type; |
888 | - |
889 | - type = TREE_TYPE (gimple_assign_lhs (stmt)); |
890 | if (TREE_CODE (type) != INTEGER_TYPE |
891 | && TREE_CODE (type) != FIXED_POINT_TYPE) |
892 | return false; |
893 | |
894 | - if (!is_widening_mult_rhs_p (gimple_assign_rhs1 (stmt), type1_out, rhs1_out)) |
895 | + if (!is_widening_mult_rhs_p (type, gimple_assign_rhs1 (stmt), type1_out, |
896 | + rhs1_out)) |
897 | return false; |
898 | |
899 | - if (!is_widening_mult_rhs_p (gimple_assign_rhs2 (stmt), type2_out, rhs2_out)) |
900 | + if (!is_widening_mult_rhs_p (type, gimple_assign_rhs2 (stmt), type2_out, |
901 | + rhs2_out)) |
902 | return false; |
903 | |
904 | if (*type1_out == NULL) |
905 | @@ -1354,6 +1386,18 @@ |
906 | *type2_out = *type1_out; |
907 | } |
908 | |
909 | + /* Ensure that the larger of the two operands comes first. */ |
910 | + if (TYPE_PRECISION (*type1_out) < TYPE_PRECISION (*type2_out)) |
911 | + { |
912 | + tree tmp; |
913 | + tmp = *type1_out; |
914 | + *type1_out = *type2_out; |
915 | + *type2_out = tmp; |
916 | + tmp = *rhs1_out; |
917 | + *rhs1_out = *rhs2_out; |
918 | + *rhs2_out = tmp; |
919 | + } |
920 | + |
921 | return true; |
922 | } |
923 | |
924 | @@ -1362,31 +1406,100 @@ |
925 | value is true iff we converted the statement. */ |
926 | |
927 | static bool |
928 | -convert_mult_to_widen (gimple stmt) |
929 | +convert_mult_to_widen (gimple stmt, gimple_stmt_iterator *gsi) |
930 | { |
931 | - tree lhs, rhs1, rhs2, type, type1, type2; |
932 | + tree lhs, rhs1, rhs2, type, type1, type2, tmp = NULL; |
933 | enum insn_code handler; |
934 | + enum machine_mode to_mode, from_mode, actual_mode; |
935 | + optab op; |
936 | + int actual_precision; |
937 | + location_t loc = gimple_location (stmt); |
938 | + bool from_unsigned1, from_unsigned2; |
939 | |
940 | lhs = gimple_assign_lhs (stmt); |
941 | type = TREE_TYPE (lhs); |
942 | if (TREE_CODE (type) != INTEGER_TYPE) |
943 | return false; |
944 | |
945 | - if (!is_widening_mult_p (stmt, &type1, &rhs1, &type2, &rhs2)) |
946 | + if (!is_widening_mult_p (type, stmt, &type1, &rhs1, &type2, &rhs2)) |
947 | return false; |
948 | |
949 | - if (TYPE_UNSIGNED (type1) && TYPE_UNSIGNED (type2)) |
950 | - handler = optab_handler (umul_widen_optab, TYPE_MODE (type)); |
951 | - else if (!TYPE_UNSIGNED (type1) && !TYPE_UNSIGNED (type2)) |
952 | - handler = optab_handler (smul_widen_optab, TYPE_MODE (type)); |
953 | + to_mode = TYPE_MODE (type); |
954 | + from_mode = TYPE_MODE (type1); |
955 | + from_unsigned1 = TYPE_UNSIGNED (type1); |
956 | + from_unsigned2 = TYPE_UNSIGNED (type2); |
957 | + |
958 | + if (from_unsigned1 && from_unsigned2) |
959 | + op = umul_widen_optab; |
960 | + else if (!from_unsigned1 && !from_unsigned2) |
961 | + op = smul_widen_optab; |
962 | else |
963 | - handler = optab_handler (usmul_widen_optab, TYPE_MODE (type)); |
964 | + op = usmul_widen_optab; |
965 | + |
966 | + handler = find_widening_optab_handler_and_mode (op, to_mode, from_mode, |
967 | + 0, &actual_mode); |
968 | |
969 | if (handler == CODE_FOR_nothing) |
970 | - return false; |
971 | - |
972 | - gimple_assign_set_rhs1 (stmt, fold_convert (type1, rhs1)); |
973 | - gimple_assign_set_rhs2 (stmt, fold_convert (type2, rhs2)); |
974 | + { |
975 | + if (op != smul_widen_optab) |
976 | + { |
977 | + /* We can use a signed multiply with unsigned types as long as |
978 | + there is a wider mode to use, or it is the smaller of the two |
979 | + types that is unsigned. Note that type1 >= type2, always. */ |
980 | + if ((TYPE_UNSIGNED (type1) |
981 | + && TYPE_PRECISION (type1) == GET_MODE_PRECISION (from_mode)) |
982 | + || (TYPE_UNSIGNED (type2) |
983 | + && TYPE_PRECISION (type2) == GET_MODE_PRECISION (from_mode))) |
984 | + { |
985 | + from_mode = GET_MODE_WIDER_MODE (from_mode); |
986 | + if (GET_MODE_SIZE (to_mode) <= GET_MODE_SIZE (from_mode)) |
987 | + return false; |
988 | + } |
989 | + |
990 | + op = smul_widen_optab; |
991 | + handler = find_widening_optab_handler_and_mode (op, to_mode, |
992 | + from_mode, 0, |
993 | + &actual_mode); |
994 | + |
995 | + if (handler == CODE_FOR_nothing) |
996 | + return false; |
997 | + |
998 | + from_unsigned1 = from_unsigned2 = false; |
999 | + } |
1000 | + else |
1001 | + return false; |
1002 | + } |
1003 | + |
1004 | + /* Ensure that the inputs to the handler are in the correct precision |
1005 | + for the opcode. This will be the full mode size. */ |
1006 | + actual_precision = GET_MODE_PRECISION (actual_mode); |
1007 | + if (actual_precision != TYPE_PRECISION (type1) |
1008 | + || from_unsigned1 != TYPE_UNSIGNED (type1)) |
1009 | + { |
1010 | + tmp = create_tmp_var (build_nonstandard_integer_type |
1011 | + (actual_precision, from_unsigned1), |
1012 | + NULL); |
1013 | + rhs1 = build_and_insert_cast (gsi, loc, tmp, rhs1); |
1014 | + } |
1015 | + if (actual_precision != TYPE_PRECISION (type2) |
1016 | + || from_unsigned2 != TYPE_UNSIGNED (type2)) |
1017 | + { |
1018 | + /* Reuse the same type info, if possible. */ |
1019 | + if (!tmp || from_unsigned1 != from_unsigned2) |
1020 | + tmp = create_tmp_var (build_nonstandard_integer_type |
1021 | + (actual_precision, from_unsigned2), |
1022 | + NULL); |
1023 | + rhs2 = build_and_insert_cast (gsi, loc, tmp, rhs2); |
1024 | + } |
1025 | + |
1026 | + /* Handle constants. */ |
1027 | + if (TREE_CODE (rhs1) == INTEGER_CST) |
1028 | + rhs1 = fold_convert (type1, rhs1); |
1029 | + if (TREE_CODE (rhs2) == INTEGER_CST) |
1030 | + rhs2 = fold_convert (type2, rhs2); |
1031 | + |
1032 | + gimple_assign_set_rhs1 (stmt, rhs1); |
1033 | + gimple_assign_set_rhs2 (stmt, rhs2); |
1034 | gimple_assign_set_rhs_code (stmt, WIDEN_MULT_EXPR); |
1035 | update_stmt (stmt); |
1036 | return true; |
1037 | @@ -1403,11 +1516,17 @@ |
1038 | enum tree_code code) |
1039 | { |
1040 | gimple rhs1_stmt = NULL, rhs2_stmt = NULL; |
1041 | - tree type, type1, type2; |
1042 | + gimple conv1_stmt = NULL, conv2_stmt = NULL, conv_stmt; |
1043 | + tree type, type1, type2, optype, tmp = NULL; |
1044 | tree lhs, rhs1, rhs2, mult_rhs1, mult_rhs2, add_rhs; |
1045 | enum tree_code rhs1_code = ERROR_MARK, rhs2_code = ERROR_MARK; |
1046 | optab this_optab; |
1047 | enum tree_code wmult_code; |
1048 | + enum insn_code handler; |
1049 | + enum machine_mode to_mode, from_mode, actual_mode; |
1050 | + location_t loc = gimple_location (stmt); |
1051 | + int actual_precision; |
1052 | + bool from_unsigned1, from_unsigned2; |
1053 | |
1054 | lhs = gimple_assign_lhs (stmt); |
1055 | type = TREE_TYPE (lhs); |
1056 | @@ -1429,8 +1548,6 @@ |
1057 | if (is_gimple_assign (rhs1_stmt)) |
1058 | rhs1_code = gimple_assign_rhs_code (rhs1_stmt); |
1059 | } |
1060 | - else |
1061 | - return false; |
1062 | |
1063 | if (TREE_CODE (rhs2) == SSA_NAME) |
1064 | { |
1065 | @@ -1438,57 +1555,160 @@ |
1066 | if (is_gimple_assign (rhs2_stmt)) |
1067 | rhs2_code = gimple_assign_rhs_code (rhs2_stmt); |
1068 | } |
1069 | - else |
1070 | - return false; |
1071 | - |
1072 | - if (code == PLUS_EXPR && rhs1_code == MULT_EXPR) |
1073 | - { |
1074 | - if (!is_widening_mult_p (rhs1_stmt, &type1, &mult_rhs1, |
1075 | - &type2, &mult_rhs2)) |
1076 | - return false; |
1077 | - add_rhs = rhs2; |
1078 | - } |
1079 | - else if (rhs2_code == MULT_EXPR) |
1080 | - { |
1081 | - if (!is_widening_mult_p (rhs2_stmt, &type1, &mult_rhs1, |
1082 | - &type2, &mult_rhs2)) |
1083 | - return false; |
1084 | - add_rhs = rhs1; |
1085 | - } |
1086 | - else if (code == PLUS_EXPR && rhs1_code == WIDEN_MULT_EXPR) |
1087 | - { |
1088 | - mult_rhs1 = gimple_assign_rhs1 (rhs1_stmt); |
1089 | - mult_rhs2 = gimple_assign_rhs2 (rhs1_stmt); |
1090 | - type1 = TREE_TYPE (mult_rhs1); |
1091 | - type2 = TREE_TYPE (mult_rhs2); |
1092 | - add_rhs = rhs2; |
1093 | - } |
1094 | - else if (rhs2_code == WIDEN_MULT_EXPR) |
1095 | - { |
1096 | - mult_rhs1 = gimple_assign_rhs1 (rhs2_stmt); |
1097 | - mult_rhs2 = gimple_assign_rhs2 (rhs2_stmt); |
1098 | - type1 = TREE_TYPE (mult_rhs1); |
1099 | - type2 = TREE_TYPE (mult_rhs2); |
1100 | - add_rhs = rhs1; |
1101 | - } |
1102 | - else |
1103 | - return false; |
1104 | - |
1105 | - if (TYPE_UNSIGNED (type1) != TYPE_UNSIGNED (type2)) |
1106 | - return false; |
1107 | + |
1108 | + /* Allow for one conversion statement between the multiply |
1109 | + and addition/subtraction statement. If there is more than |
1110 | + one conversion then we assume they would invalidate this |
1111 | + transformation. If that's not the case then they should have |
1112 | + been folded before now. */ |
1113 | + if (CONVERT_EXPR_CODE_P (rhs1_code)) |
1114 | + { |
1115 | + conv1_stmt = rhs1_stmt; |
1116 | + rhs1 = gimple_assign_rhs1 (rhs1_stmt); |
1117 | + if (TREE_CODE (rhs1) == SSA_NAME) |
1118 | + { |
1119 | + rhs1_stmt = SSA_NAME_DEF_STMT (rhs1); |
1120 | + if (is_gimple_assign (rhs1_stmt)) |
1121 | + rhs1_code = gimple_assign_rhs_code (rhs1_stmt); |
1122 | + } |
1123 | + else |
1124 | + return false; |
1125 | + } |
1126 | + if (CONVERT_EXPR_CODE_P (rhs2_code)) |
1127 | + { |
1128 | + conv2_stmt = rhs2_stmt; |
1129 | + rhs2 = gimple_assign_rhs1 (rhs2_stmt); |
1130 | + if (TREE_CODE (rhs2) == SSA_NAME) |
1131 | + { |
1132 | + rhs2_stmt = SSA_NAME_DEF_STMT (rhs2); |
1133 | + if (is_gimple_assign (rhs2_stmt)) |
1134 | + rhs2_code = gimple_assign_rhs_code (rhs2_stmt); |
1135 | + } |
1136 | + else |
1137 | + return false; |
1138 | + } |
1139 | + |
1140 | + /* If code is WIDEN_MULT_EXPR then it would seem unnecessary to call |
1141 | + is_widening_mult_p, but we still need the rhs returns. |
1142 | + |
1143 | + It might also appear that it would be sufficient to use the existing |
1144 | + operands of the widening multiply, but that would limit the choice of |
1145 | + multiply-and-accumulate instructions. */ |
1146 | + if (code == PLUS_EXPR |
1147 | + && (rhs1_code == MULT_EXPR || rhs1_code == WIDEN_MULT_EXPR)) |
1148 | + { |
1149 | + if (!is_widening_mult_p (type, rhs1_stmt, &type1, &mult_rhs1, |
1150 | + &type2, &mult_rhs2)) |
1151 | + return false; |
1152 | + add_rhs = rhs2; |
1153 | + conv_stmt = conv1_stmt; |
1154 | + } |
1155 | + else if (rhs2_code == MULT_EXPR || rhs2_code == WIDEN_MULT_EXPR) |
1156 | + { |
1157 | + if (!is_widening_mult_p (type, rhs2_stmt, &type1, &mult_rhs1, |
1158 | + &type2, &mult_rhs2)) |
1159 | + return false; |
1160 | + add_rhs = rhs1; |
1161 | + conv_stmt = conv2_stmt; |
1162 | + } |
1163 | + else |
1164 | + return false; |
1165 | + |
1166 | + to_mode = TYPE_MODE (type); |
1167 | + from_mode = TYPE_MODE (type1); |
1168 | + from_unsigned1 = TYPE_UNSIGNED (type1); |
1169 | + from_unsigned2 = TYPE_UNSIGNED (type2); |
1170 | + |
1171 | + /* There's no such thing as a mixed sign madd yet, so use a wider mode. */ |
1172 | + if (from_unsigned1 != from_unsigned2) |
1173 | + { |
1174 | + /* We can use a signed multiply with unsigned types as long as |
1175 | + there is a wider mode to use, or it is the smaller of the two |
1176 | + types that is unsigned. Note that type1 >= type2, always. */ |
1177 | + if ((from_unsigned1 |
1178 | + && TYPE_PRECISION (type1) == GET_MODE_PRECISION (from_mode)) |
1179 | + || (from_unsigned2 |
1180 | + && TYPE_PRECISION (type2) == GET_MODE_PRECISION (from_mode))) |
1181 | + { |
1182 | + from_mode = GET_MODE_WIDER_MODE (from_mode); |
1183 | + if (GET_MODE_SIZE (from_mode) >= GET_MODE_SIZE (to_mode)) |
1184 | + return false; |
1185 | + } |
1186 | + |
1187 | + from_unsigned1 = from_unsigned2 = false; |
1188 | + } |
1189 | + |
1190 | + /* If there was a conversion between the multiply and addition |
1191 | + then we need to make sure it fits a multiply-and-accumulate. |
1192 | + There should be a single mode change which does not change the |
1193 | + value. */ |
1194 | + if (conv_stmt) |
1195 | + { |
1196 | + /* We use the original, unmodified data types for this. */ |
1197 | + tree from_type = TREE_TYPE (gimple_assign_rhs1 (conv_stmt)); |
1198 | + tree to_type = TREE_TYPE (gimple_assign_lhs (conv_stmt)); |
1199 | + int data_size = TYPE_PRECISION (type1) + TYPE_PRECISION (type2); |
1200 | + bool is_unsigned = TYPE_UNSIGNED (type1) && TYPE_UNSIGNED (type2); |
1201 | + |
1202 | + if (TYPE_PRECISION (from_type) > TYPE_PRECISION (to_type)) |
1203 | + { |
1204 | + /* Conversion is a truncate. */ |
1205 | + if (TYPE_PRECISION (to_type) < data_size) |
1206 | + return false; |
1207 | + } |
1208 | + else if (TYPE_PRECISION (from_type) < TYPE_PRECISION (to_type)) |
1209 | + { |
1210 | + /* Conversion is an extend. Check it's the right sort. */ |
1211 | + if (TYPE_UNSIGNED (from_type) != is_unsigned |
1212 | + && !(is_unsigned && TYPE_PRECISION (from_type) > data_size)) |
1213 | + return false; |
1214 | + } |
1215 | + /* else convert is a no-op for our purposes. */ |
1216 | + } |
1217 | |
1218 | /* Verify that the machine can perform a widening multiply |
1219 | accumulate in this mode/signedness combination, otherwise |
1220 | this transformation is likely to pessimize code. */ |
1221 | - this_optab = optab_for_tree_code (wmult_code, type1, optab_default); |
1222 | - if (optab_handler (this_optab, TYPE_MODE (type)) == CODE_FOR_nothing) |
1223 | + optype = build_nonstandard_integer_type (from_mode, from_unsigned1); |
1224 | + this_optab = optab_for_tree_code (wmult_code, optype, optab_default); |
1225 | + handler = find_widening_optab_handler_and_mode (this_optab, to_mode, |
1226 | + from_mode, 0, &actual_mode); |
1227 | + |
1228 | + if (handler == CODE_FOR_nothing) |
1229 | return false; |
1230 | |
1231 | - /* ??? May need some type verification here? */ |
1232 | - |
1233 | - gimple_assign_set_rhs_with_ops_1 (gsi, wmult_code, |
1234 | - fold_convert (type1, mult_rhs1), |
1235 | - fold_convert (type2, mult_rhs2), |
1236 | + /* Ensure that the inputs to the handler are in the correct precision |
1237 | + for the opcode. This will be the full mode size. */ |
1238 | + actual_precision = GET_MODE_PRECISION (actual_mode); |
1239 | + if (actual_precision != TYPE_PRECISION (type1) |
1240 | + || from_unsigned1 != TYPE_UNSIGNED (type1)) |
1241 | + { |
1242 | + tmp = create_tmp_var (build_nonstandard_integer_type |
1243 | + (actual_precision, from_unsigned1), |
1244 | + NULL); |
1245 | + mult_rhs1 = build_and_insert_cast (gsi, loc, tmp, mult_rhs1); |
1246 | + } |
1247 | + if (actual_precision != TYPE_PRECISION (type2) |
1248 | + || from_unsigned2 != TYPE_UNSIGNED (type2)) |
1249 | + { |
1250 | + if (!tmp || from_unsigned1 != from_unsigned2) |
1251 | + tmp = create_tmp_var (build_nonstandard_integer_type |
1252 | + (actual_precision, from_unsigned2), |
1253 | + NULL); |
1254 | + mult_rhs2 = build_and_insert_cast (gsi, loc, tmp, mult_rhs2); |
1255 | + } |
1256 | + |
1257 | + if (TYPE_PRECISION (type) != TYPE_PRECISION (TREE_TYPE (add_rhs))) |
1258 | + add_rhs = build_and_insert_cast (gsi, loc, create_tmp_var (type, NULL), |
1259 | + add_rhs); |
1260 | + |
1261 | + /* Handle constants. */ |
1262 | + if (TREE_CODE (mult_rhs1) == INTEGER_CST) |
1263 | + mult_rhs1 = fold_convert (type1, mult_rhs1); |
1264 | + if (TREE_CODE (mult_rhs2) == INTEGER_CST) |
1265 | + mult_rhs2 = fold_convert (type2, mult_rhs2); |
1266 | + |
1267 | + gimple_assign_set_rhs_with_ops_1 (gsi, wmult_code, mult_rhs1, mult_rhs2, |
1268 | add_rhs); |
1269 | update_stmt (gsi_stmt (*gsi)); |
1270 | return true; |
1271 | @@ -1696,7 +1916,7 @@ |
1272 | switch (code) |
1273 | { |
1274 | case MULT_EXPR: |
1275 | - if (!convert_mult_to_widen (stmt) |
1276 | + if (!convert_mult_to_widen (stmt, &gsi) |
1277 | && convert_mult_to_fma (stmt, |
1278 | gimple_assign_rhs1 (stmt), |
1279 | gimple_assign_rhs2 (stmt))) |
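The comment in convert_mult_to_widen about using a signed multiply with unsigned types can be illustrated outside the compiler. The following is only a sketch in plain C, not GCC code: the helper names (smul_widen_8, smul_widen_16, as_signed_8, mixed_mul_same_mode, mixed_mul_wider_mode) are hypothetical and simply model 8-bit and 16-bit signed widening multiplies. It shows why an unsigned operand that fills its mode has to be moved to a wider mode before a signed widening multiply gives the right answer.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a signed 8x8->16 widening multiply
   (illustration only, not a GCC internal).  */
static int16_t
smul_widen_8 (int8_t a, int8_t b)
{
  int product = (int) a * (int) b;
  /* The product of two 8-bit signed values always fits in 16 bits.  */
  assert (product >= INT16_MIN && product <= INT16_MAX);
  return (int16_t) product;
}

/* Hypothetical model of a signed 16x16->32 widening multiply.  */
static int32_t
smul_widen_16 (int16_t a, int16_t b)
{
  return (int32_t) a * (int32_t) b;
}

/* Reinterpret the bits of an 8-bit unsigned value as a signed value,
   written so that it avoids implementation-defined conversions.  */
static int8_t
as_signed_8 (uint8_t u)
{
  return (int8_t) (u >= 128 ? (int) u - 256 : (int) u);
}

/* Unsafe: feed a full-width unsigned operand to the signed multiply
   of the same mode.  Values >= 128 are misread as negative.  */
static int32_t
mixed_mul_same_mode (uint8_t u, int8_t s)
{
  return smul_widen_8 (as_signed_8 (u), s);
}

/* Safe: extend both operands into the next wider mode first, so the
   unsigned value keeps its magnitude under a signed interpretation.  */
static int32_t
mixed_mul_wider_mode (uint8_t u, int8_t s)
{
  return smul_widen_16 ((int16_t) u, (int16_t) s);
}
```

With u = 200 and s = -3, the wider-mode version yields -600, while the same-mode version reads 200 as -56 and yields 168. For u = 100 both agree, since 100 does not use the top bit of its mode.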
cbuild has taken a snapshot of this branch at r106781 and queued it for build.
The snapshot is available at:
http://ex.seabright.co.nz/snapshots/gcc-linaro-4.6+bzr106781~ams-codesourcery~widening-multiplies-4.6.tar.xdelta3.xz
and will be built on the following builders:
a9-builder armv5-builder i686 x86_64
You can track the build queue at:
http://ex.seabright.co.nz/helpers/scheduler
cbuild-snapshot: gcc-linaro-4.6+bzr106781~ams-codesourcery~widening-multiplies-4.6
cbuild-ancestor: lp:gcc-linaro/4.6+bzr106774
cbuild-state: check