Merge lp:~ams-codesourcery/gcc-linaro/widening-multiplies-4.6 into lp:gcc-linaro/4.6

Proposed by Andrew Stubbs
Status: Superseded
Proposed branch: lp:~ams-codesourcery/gcc-linaro/widening-multiplies-4.6
Merge into: lp:gcc-linaro/4.6
Diff against target: 1157 lines (+645/-134) (has conflicts)
17 files modified
ChangeLog.linaro (+105/-0)
gcc/config/arm/arm.md (+1/-1)
gcc/expr.c (+14/-15)
gcc/genopinit.c (+24/-20)
gcc/optabs.c (+56/-15)
gcc/optabs.h (+52/-0)
gcc/testsuite/gcc.target/arm/no-wmla-1.c (+11/-0)
gcc/testsuite/gcc.target/arm/wmul-10.c (+10/-0)
gcc/testsuite/gcc.target/arm/wmul-5.c (+10/-0)
gcc/testsuite/gcc.target/arm/wmul-6.c (+10/-0)
gcc/testsuite/gcc.target/arm/wmul-7.c (+10/-0)
gcc/testsuite/gcc.target/arm/wmul-8.c (+10/-0)
gcc/testsuite/gcc.target/arm/wmul-9.c (+10/-0)
gcc/testsuite/gcc.target/arm/wmul-bitfield-1.c (+17/-0)
gcc/testsuite/gcc.target/arm/wmul-bitfield-2.c (+17/-0)
gcc/tree-cfg.c (+2/-2)
gcc/tree-ssa-math-opts.c (+286/-81)
Text conflict in ChangeLog.linaro
To merge this branch: bzr merge lp:~ams-codesourcery/gcc-linaro/widening-multiplies-4.6
Reviewer: Michael Hope
Status: Needs Fixing
Review via email: mp+68349@code.launchpad.net

This proposal supersedes a proposal from 2011-07-15.

This proposal has been superseded by a proposal from 2011-07-22.

Description of the change

Widening multiply optimizations.

The first commit is not yet approved, but the rest have been reviewed upstream and are ready to commit.

http://<email address hidden>/msg08720.html

UPDATE: Now with an extra bug-fix.
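
For illustration, the source pattern these changes target can be sketched as below. This is a minimal sketch modelled on the wmul-*.c tests in the diff, not code from the branch: the function names mac_s16 and mac_u16 are hypothetical. The inputs are 16 bits wide and the result is 64 bits, so the multiply widens by more than one mode, which is the case the ChangeLog describes as "Allow widening by more than one mode". Whether a given compiler turns this into a single widening multiply-accumulate instruction depends on the target, the compiler version, and the options used.

```c
#include <stdint.h>

/* Hypothetical example: signed 16-bit inputs are widened to 64 bits,
   multiplied, and added to a 64-bit accumulator.  */
int64_t
mac_s16 (int64_t acc, const int16_t *b, const int16_t *c)
{
  return acc + (int64_t) *b * (int64_t) *c;
}

/* Hypothetical example: the same shape with unsigned inputs.  */
uint64_t
mac_u16 (uint64_t acc, const uint16_t *b, const uint16_t *c)
{
  return acc + (uint64_t) *b * (uint64_t) *c;
}
```

Because the casts happen before the multiply, the product is computed at full 64-bit width, so the result is the same whether or not the compiler picks a widening instruction.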

Linaro Toolchain Builder (cbuild) wrote : Posted in a previous version of this proposal

cbuild has taken a snapshot of this branch at r106781 and queued it for build.

The snapshot is available at:
 http://ex.seabright.co.nz/snapshots/gcc-linaro-4.6+bzr106781~ams-codesourcery~widening-multiplies-4.6.tar.xdelta3.xz

and will be built on the following builders:
 a9-builder armv5-builder i686 x86_64

You can track the build queue at:
 http://ex.seabright.co.nz/helpers/scheduler

cbuild-snapshot: gcc-linaro-4.6+bzr106781~ams-codesourcery~widening-multiplies-4.6
cbuild-ancestor: lp:gcc-linaro/4.6+bzr106774
cbuild-state: check

Michael Hope (michaelh1) wrote : Posted in a previous version of this proposal

cbuild had trouble building this on <proposals.Build instance at 0x2b85680>.

See the *failed.txt logs under the build results at:
 http://ex.seabright.co.nz/build/gcc-linaro-4.6+bzr106781~ams-codesourcery~widening-multiplies-4.6/logs/armv7l-natty-cbuild158-ursa3-armv5r2

The test suite results were not checked.

cbuild-checked: armv7l-natty-cbuild158-ursa3-armv5r2

review: Needs Fixing
Michael Hope (michaelh1) wrote : Posted in a previous version of this proposal

cbuild had trouble building this on <proposals.Build instance at 0x2141ea8>.

See the *failed.txt logs under the build results at:
 http://ex.seabright.co.nz/build/gcc-linaro-4.6+bzr106781~ams-codesourcery~widening-multiplies-4.6/logs/armv7l-natty-cbuild158-ursa4-cortexa9r1

The test suite results were not checked.

cbuild-checked: armv7l-natty-cbuild158-ursa4-cortexa9r1

review: Needs Fixing
Michael Hope (michaelh1) wrote : Posted in a previous version of this proposal

cbuild had trouble building this on <proposals.Build instance at 0x7fe8f501a050>.

See the *failed.txt logs under the build results at:
 http://ex.seabright.co.nz/build/gcc-linaro-4.6+bzr106781~ams-codesourcery~widening-multiplies-4.6/logs/i686-natty-cbuild158-oort2-i686r1

The test suite results were not checked.

cbuild-checked: i686-natty-cbuild158-oort2-i686r1

review: Needs Fixing
Michael Hope (michaelh1) wrote : Posted in a previous version of this proposal

cbuild had trouble building this on <proposals.Build instance at 0x7fe8f5028ab8>.

See the *failed.txt logs under the build results at:
 http://ex.seabright.co.nz/build/gcc-linaro-4.6+bzr106781~ams-codesourcery~widening-multiplies-4.6/logs/x86_64-natty-cbuild158-oort1-x86_64r1

The test suite results were not checked.

cbuild-checked: x86_64-natty-cbuild158-oort1-x86_64r1

review: Needs Fixing
Linaro Toolchain Builder (cbuild) wrote :

cbuild has taken a snapshot of this branch at r106782 and queued it for build.

The snapshot is available at:
 http://ex.seabright.co.nz/snapshots/gcc-linaro-4.6+bzr106782~ams-codesourcery~widening-multiplies-4.6.tar.xdelta3.xz

and will be built on the following builders:
 a9-builder armv5-builder i686 x86_64

You can track the build queue at:
 http://ex.seabright.co.nz/helpers/scheduler

cbuild-snapshot: gcc-linaro-4.6+bzr106782~ams-codesourcery~widening-multiplies-4.6
cbuild-ancestor: lp:gcc-linaro/4.6+bzr106774
cbuild-state: check

Michael Hope (michaelh1) wrote :

cbuild had trouble building this on <proposals.Build instance at 0x3fec710>.

See the *failed.txt logs under the build results at:
 http://ex.seabright.co.nz/build/gcc-linaro-4.6+bzr106782~ams-codesourcery~widening-multiplies-4.6/logs/i686-natty-cbuild158-oort6-i686r1

The test suite results were not checked.

cbuild-checked: i686-natty-cbuild158-oort6-i686r1

review: Needs Fixing
Michael Hope (michaelh1) wrote :

cbuild had trouble building this on <proposals.Build instance at 0x2b08c68>.

See the *failed.txt logs under the build results at:
 http://ex.seabright.co.nz/build/gcc-linaro-4.6+bzr106782~ams-codesourcery~widening-multiplies-4.6/logs/x86_64-natty-cbuild158-oort3-x86_64r1

The test suite results were not checked.

cbuild-checked: x86_64-natty-cbuild158-oort3-x86_64r1

review: Needs Fixing
Michael Hope (michaelh1) wrote :

cbuild had trouble building this on armv7l-natty-cbuild158-ursa3-cortexa9r1.

See the *failed.txt logs under the build results at:
 http://ex.seabright.co.nz/build/gcc-linaro-4.6+bzr106782~ams-codesourcery~widening-multiplies-4.6/logs/armv7l-natty-cbuild158-ursa3-cortexa9r1

The test suite results were not checked.

cbuild-checked: armv7l-natty-cbuild158-ursa3-cortexa9r1

review: Needs Fixing
Michael Hope (michaelh1) wrote :

cbuild had trouble building this on armv7l-natty-cbuild158-ursa1-armv5r2.

See the *failed.txt logs under the build results at:
 http://ex.seabright.co.nz/build/gcc-linaro-4.6+bzr106782~ams-codesourcery~widening-multiplies-4.6/logs/armv7l-natty-cbuild158-ursa1-armv5r2

The test suite results were not checked.

cbuild-checked: armv7l-natty-cbuild158-ursa1-armv5r2

review: Needs Fixing

Preview Diff

1=== modified file 'ChangeLog.linaro'
2--- ChangeLog.linaro 2011-07-18 14:47:22 +0000
3+++ ChangeLog.linaro 2011-07-19 09:04:38 +0000
4@@ -1,3 +1,4 @@
5+<<<<<<< TREE
6 2011-07-18 Andrew Stubbs <ams@codesourcery.com>
7
8 gcc/
9@@ -83,6 +84,110 @@
10
11 * gcc.c-torture/compile/20110401-1.c: New test.
12
13+=======
14+2011-07-15 Andrew Stubbs <ams@codesourcery.com>
15+
16+ Backport from patches proposed for 4.7:
17+
18+ 2011-06-24 Andrew Stubbs <ams@codesourcery.com>
19+
20+ gcc/
21+ * tree-ssa-math-opts.c (convert_mult_to_widen): Better handle
22+ unsigned inputs of different modes.
23+ (convert_plusminus_to_widen): Likewise.
24+
25+ gcc/testsuite/
26+ * gcc.target/arm/wmul-9.c: New file.
27+ * gcc.target/arm/wmul-bitfield-2.c: New file.
28+
29+ 2011-07-14 Andrew Stubbs <ams@codesourcery.com>
30+
31+ gcc/
32+ * tree-ssa-math-opts.c (is_widening_mult_rhs_p): Add new argument
33+ 'type'.
34+ Use 'type' from caller, not inferred from 'rhs'.
35+ Don't reject non-conversion statements. Do return lhs in this case.
36+ (is_widening_mult_p): Add new argument 'type'.
37+ Use 'type' from caller, not inferred from 'stmt'.
38+ Pass type to is_widening_mult_rhs_p.
39+ (convert_mult_to_widen): Pass type to is_widening_mult_p.
40+ (convert_plusminus_to_widen): Likewise.
41+
42+ gcc/testsuite/
43+ * gcc.target/arm/wmul-8.c: New file.
44+
45+ 2011-07-14 Andrew Stubbs <ams@codesourcery.com>
46+
47+ gcc/
48+ * tree-ssa-math-opts.c (is_widening_mult_p): Remove FIXME.
49+ Ensure the larger type is the first operand.
50+
51+ gcc/testsuite/
52+ * gcc.target/arm/wmul-7.c: New file.
53+
54+ 2011-07-14 Andrew Stubbs <ams@codesourcery.com>
55+
56+ gcc/
57+ * tree-ssa-math-opts.c (convert_mult_to_widen): Convert
58+ unsupported unsigned multiplies to signed.
59+ (convert_plusminus_to_widen): Likewise.
60+
61+ gcc/testsuite/
62+ * gcc.target/arm/wmul-6.c: New file.
63+
64+ 2011-07-14 Andrew Stubbs <ams@codesourcery.com>
65+
66+ gcc/
67+ * tree-ssa-math-opts.c (convert_plusminus_to_widen): Permit a single
68+ conversion statement separating multiply-and-accumulate.
69+
70+ gcc/testsuite/
71+ * gcc.target/arm/wmul-5.c: New file.
72+ * gcc.target/arm/no-wmla-1.c: New file.
73+
74+ 2011-07-14 Andrew Stubbs <ams@codesourcery.com>
75+
76+ gcc/
77+ * config/arm/arm.md (maddhidi4): Remove '*' from name.
78+ * expr.c (expand_expr_real_2): Use find_widening_optab_handler.
79+ * optabs.c (find_widening_optab_handler_and_mode): New function.
80+ (expand_widen_pattern_expr): Use find_widening_optab_handler.
81+ (expand_binop_directly): Likewise.
82+ (expand_binop): Likewise.
83+ * optabs.h (find_widening_optab_handler): New macro define.
84+ (find_widening_optab_handler_and_mode): New prototype.
85+ * tree-cfg.c (verify_gimple_assign_binary): Adjust WIDEN_MULT_EXPR
86+ type precision rules.
87+ (verify_gimple_assign_ternary): Likewise for WIDEN_MULT_PLUS_EXPR.
88+ * tree-ssa-math-opts.c (build_and_insert_cast): New function.
89+ (is_widening_mult_rhs_p): Allow widening by more than one mode.
90+ Explicitly disallow mis-matched input types.
91+ (convert_mult_to_widen): Use find_widening_optab_handler, and cast
92+ input types to fit the new handler.
93+ (convert_plusminus_to_widen): Likewise.
94+
95+ gcc/testsuite/
96+ * gcc.target/arm/wmul-bitfield-1.c: New file.
97+
98+
99+ 2011-07-09 Andrew Stubbs <ams@codesourcery.com>
100+
101+ gcc/
102+ * expr.c (expand_expr_real_2): Use widening_optab_handler.
103+ * genopinit.c (optabs): Use set_widening_optab_handler for $N.
104+ (gen_insn): $N now means $a must be wider than $b, not consecutive.
105+ * optabs.c (expand_widen_pattern_expr): Use widening_optab_handler.
106+ (expand_binop_directly): Likewise.
107+ (expand_binop): Likewise.
108+ * optabs.h (widening_optab_handlers): New struct.
109+ (optab_d): New member, 'widening'.
110+ (widening_optab_handler): New function.
111+ (set_widening_optab_handler): New function.
112+ * tree-ssa-math-opts.c (convert_mult_to_widen): Use
113+ widening_optab_handler.
114+ (convert_plusminus_to_widen): Likewise.
115+
116+>>>>>>> MERGE-SOURCE
117 2011-07-13 Richard Sandiford <richard.sandiford@linaro.org>
118
119 Backport from mainline:
120
121=== modified file 'gcc/config/arm/arm.md'
122--- gcc/config/arm/arm.md 2011-06-28 12:02:27 +0000
123+++ gcc/config/arm/arm.md 2011-07-19 09:04:38 +0000
124@@ -1839,7 +1839,7 @@
125 (set_attr "predicable" "yes")]
126 )
127
128-(define_insn "*maddhidi4"
129+(define_insn "maddhidi4"
130 [(set (match_operand:DI 0 "s_register_operand" "=r")
131 (plus:DI
132 (mult:DI (sign_extend:DI
133
134=== modified file 'gcc/expr.c'
135--- gcc/expr.c 2011-06-02 12:12:00 +0000
136+++ gcc/expr.c 2011-07-19 09:04:38 +0000
137@@ -7658,18 +7658,16 @@
138 {
139 enum machine_mode innermode = TYPE_MODE (TREE_TYPE (treeop0));
140 this_optab = usmul_widen_optab;
141- if (mode == GET_MODE_2XWIDER_MODE (innermode))
142+ if (find_widening_optab_handler (this_optab, mode, innermode, 0)
143+ != CODE_FOR_nothing)
144 {
145- if (optab_handler (this_optab, mode) != CODE_FOR_nothing)
146- {
147- if (TYPE_UNSIGNED (TREE_TYPE (treeop0)))
148- expand_operands (treeop0, treeop1, NULL_RTX, &op0, &op1,
149- EXPAND_NORMAL);
150- else
151- expand_operands (treeop0, treeop1, NULL_RTX, &op1, &op0,
152- EXPAND_NORMAL);
153- goto binop3;
154- }
155+ if (TYPE_UNSIGNED (TREE_TYPE (treeop0)))
156+ expand_operands (treeop0, treeop1, NULL_RTX, &op0, &op1,
157+ EXPAND_NORMAL);
158+ else
159+ expand_operands (treeop0, treeop1, NULL_RTX, &op1, &op0,
160+ EXPAND_NORMAL);
161+ goto binop3;
162 }
163 }
164 /* Check for a multiplication with matching signedness. */
165@@ -7684,10 +7682,10 @@
166 optab other_optab = zextend_p ? smul_widen_optab : umul_widen_optab;
167 this_optab = zextend_p ? umul_widen_optab : smul_widen_optab;
168
169- if (mode == GET_MODE_2XWIDER_MODE (innermode)
170- && TREE_CODE (treeop0) != INTEGER_CST)
171+ if (TREE_CODE (treeop0) != INTEGER_CST)
172 {
173- if (optab_handler (this_optab, mode) != CODE_FOR_nothing)
174+ if (find_widening_optab_handler (this_optab, mode, innermode, 0)
175+ != CODE_FOR_nothing)
176 {
177 expand_operands (treeop0, treeop1, NULL_RTX, &op0, &op1,
178 EXPAND_NORMAL);
179@@ -7695,7 +7693,8 @@
180 unsignedp, this_optab);
181 return REDUCE_BIT_FIELD (temp);
182 }
183- if (optab_handler (other_optab, mode) != CODE_FOR_nothing
184+ if (find_widening_optab_handler (other_optab, mode, innermode, 0)
185+ != CODE_FOR_nothing
186 && innermode == word_mode)
187 {
188 rtx htem, hipart;
189
190=== modified file 'gcc/genopinit.c'
191--- gcc/genopinit.c 2011-05-05 15:43:06 +0000
192+++ gcc/genopinit.c 2011-07-19 09:04:38 +0000
193@@ -46,10 +46,12 @@
194 used. $A and $B are replaced with the full name of the mode; $a and $b
195 are replaced with the short form of the name, as above.
196
197- If $N is present in the pattern, it means the two modes must be consecutive
198- widths in the same mode class (e.g, QImode and HImode). $I means that
199- only full integer modes should be considered for the next mode, and $F
200- means that only float modes should be considered.
201+ If $N is present in the pattern, it means the two modes must be in
202+ the same mode class, and $b must be greater than $a (e.g, QImode
203+ and HImode).
204+
205+ $I means that only full integer modes should be considered for the
206+ next mode, and $F means that only float modes should be considered.
207 $P means that both full and partial integer modes should be considered.
208 $Q means that only fixed-point modes should be considered.
209
210@@ -99,17 +101,17 @@
211 "set_optab_handler (smulv_optab, $A, CODE_FOR_$(mulv$I$a3$))",
212 "set_optab_handler (umul_highpart_optab, $A, CODE_FOR_$(umul$a3_highpart$))",
213 "set_optab_handler (smul_highpart_optab, $A, CODE_FOR_$(smul$a3_highpart$))",
214- "set_optab_handler (smul_widen_optab, $B, CODE_FOR_$(mul$a$b3$)$N)",
215- "set_optab_handler (umul_widen_optab, $B, CODE_FOR_$(umul$a$b3$)$N)",
216- "set_optab_handler (usmul_widen_optab, $B, CODE_FOR_$(usmul$a$b3$)$N)",
217- "set_optab_handler (smadd_widen_optab, $B, CODE_FOR_$(madd$a$b4$)$N)",
218- "set_optab_handler (umadd_widen_optab, $B, CODE_FOR_$(umadd$a$b4$)$N)",
219- "set_optab_handler (ssmadd_widen_optab, $B, CODE_FOR_$(ssmadd$a$b4$)$N)",
220- "set_optab_handler (usmadd_widen_optab, $B, CODE_FOR_$(usmadd$a$b4$)$N)",
221- "set_optab_handler (smsub_widen_optab, $B, CODE_FOR_$(msub$a$b4$)$N)",
222- "set_optab_handler (umsub_widen_optab, $B, CODE_FOR_$(umsub$a$b4$)$N)",
223- "set_optab_handler (ssmsub_widen_optab, $B, CODE_FOR_$(ssmsub$a$b4$)$N)",
224- "set_optab_handler (usmsub_widen_optab, $B, CODE_FOR_$(usmsub$a$b4$)$N)",
225+ "set_widening_optab_handler (smul_widen_optab, $B, $A, CODE_FOR_$(mul$a$b3$)$N)",
226+ "set_widening_optab_handler (umul_widen_optab, $B, $A, CODE_FOR_$(umul$a$b3$)$N)",
227+ "set_widening_optab_handler (usmul_widen_optab, $B, $A, CODE_FOR_$(usmul$a$b3$)$N)",
228+ "set_widening_optab_handler (smadd_widen_optab, $B, $A, CODE_FOR_$(madd$a$b4$)$N)",
229+ "set_widening_optab_handler (umadd_widen_optab, $B, $A, CODE_FOR_$(umadd$a$b4$)$N)",
230+ "set_widening_optab_handler (ssmadd_widen_optab, $B, $A, CODE_FOR_$(ssmadd$a$b4$)$N)",
231+ "set_widening_optab_handler (usmadd_widen_optab, $B, $A, CODE_FOR_$(usmadd$a$b4$)$N)",
232+ "set_widening_optab_handler (smsub_widen_optab, $B, $A, CODE_FOR_$(msub$a$b4$)$N)",
233+ "set_widening_optab_handler (umsub_widen_optab, $B, $A, CODE_FOR_$(umsub$a$b4$)$N)",
234+ "set_widening_optab_handler (ssmsub_widen_optab, $B, $A, CODE_FOR_$(ssmsub$a$b4$)$N)",
235+ "set_widening_optab_handler (usmsub_widen_optab, $B, $A, CODE_FOR_$(usmsub$a$b4$)$N)",
236 "set_optab_handler (sdiv_optab, $A, CODE_FOR_$(div$a3$))",
237 "set_optab_handler (ssdiv_optab, $A, CODE_FOR_$(ssdiv$Q$a3$))",
238 "set_optab_handler (sdivv_optab, $A, CODE_FOR_$(div$V$I$a3$))",
239@@ -304,7 +306,7 @@
240 {
241 int force_float = 0, force_int = 0, force_partial_int = 0;
242 int force_fixed = 0;
243- int force_consec = 0;
244+ int force_wider = 0;
245 int matches = 1;
246
247 for (pp = optabs[pindex]; pp[0] != '$' || pp[1] != '('; pp++)
248@@ -322,7 +324,7 @@
249 switch (*++pp)
250 {
251 case 'N':
252- force_consec = 1;
253+ force_wider = 1;
254 break;
255 case 'I':
256 force_int = 1;
257@@ -391,7 +393,10 @@
258 || mode_class[i] == MODE_VECTOR_FRACT
259 || mode_class[i] == MODE_VECTOR_UFRACT
260 || mode_class[i] == MODE_VECTOR_ACCUM
261- || mode_class[i] == MODE_VECTOR_UACCUM))
262+ || mode_class[i] == MODE_VECTOR_UACCUM)
263+ && (! force_wider
264+ || *pp == 'a'
265+ || m1 < i))
266 break;
267 }
268
269@@ -411,8 +416,7 @@
270 }
271
272 if (matches && pp[0] == '$' && pp[1] == ')'
273- && *np == 0
274- && (! force_consec || (int) GET_MODE_WIDER_MODE(m1) == m2))
275+ && *np == 0)
276 break;
277 }
278
279
280=== modified file 'gcc/optabs.c'
281--- gcc/optabs.c 2011-07-04 14:03:49 +0000
282+++ gcc/optabs.c 2011-07-19 09:04:38 +0000
283@@ -225,6 +225,37 @@
284 return 1;
285 }
286
287
288+/* Find a widening optab even if it doesn't widen as much as we want.
289+ E.g. if from_mode is HImode, and to_mode is DImode, and there is no
290+ direct HI->SI insn, then return SI->DI, if that exists.
291+ If PERMIT_NON_WIDENING is non-zero then this can be used with
292+ non-widening optabs also. */
293+
294+enum insn_code
295+find_widening_optab_handler_and_mode (optab op, enum machine_mode to_mode,
296+ enum machine_mode from_mode,
297+ int permit_non_widening,
298+ enum machine_mode *found_mode)
299+{
300+ for (; (permit_non_widening || from_mode != to_mode)
301+ && GET_MODE_SIZE (from_mode) <= GET_MODE_SIZE (to_mode)
302+ && from_mode != VOIDmode;
303+ from_mode = GET_MODE_WIDER_MODE (from_mode))
304+ {
305+ enum insn_code handler = widening_optab_handler (op, to_mode,
306+ from_mode);
307+
308+ if (handler != CODE_FOR_nothing)
309+ {
310+ if (found_mode)
311+ *found_mode = from_mode;
312+ return handler;
313+ }
314+ }
315+
316+ return CODE_FOR_nothing;
317+}
318+
319
320 /* Widen OP to MODE and return the rtx for the widened operand. UNSIGNEDP
321 says whether OP is signed or unsigned. NO_EXTEND is nonzero if we need
322 not actually do a sign-extend or zero-extend, but can leave the
323@@ -517,8 +548,9 @@
324 optab_for_tree_code (ops->code, TREE_TYPE (oprnd0), optab_default);
325 if (ops->code == WIDEN_MULT_PLUS_EXPR
326 || ops->code == WIDEN_MULT_MINUS_EXPR)
327- icode = (int) optab_handler (widen_pattern_optab,
328- TYPE_MODE (TREE_TYPE (ops->op2)));
329+ icode = (int) find_widening_optab_handler (widen_pattern_optab,
330+ TYPE_MODE (TREE_TYPE (ops->op2)),
331+ tmode0, 0);
332 else
333 icode = (int) optab_handler (widen_pattern_optab, tmode0);
334 gcc_assert (icode != CODE_FOR_nothing);
335@@ -1389,7 +1421,9 @@
336 rtx target, int unsignedp, enum optab_methods methods,
337 rtx last)
338 {
339- int icode = (int) optab_handler (binoptab, mode);
340+ enum machine_mode from_mode = GET_MODE (op0);
341+ int icode = (int) find_widening_optab_handler (binoptab, mode,
342+ from_mode, 1);
343 enum machine_mode mode0 = insn_data[icode].operand[1].mode;
344 enum machine_mode mode1 = insn_data[icode].operand[2].mode;
345 enum machine_mode tmp_mode;
346@@ -1546,7 +1580,8 @@
347 /* If we can do it with a three-operand insn, do so. */
348
349 if (methods != OPTAB_MUST_WIDEN
350- && optab_handler (binoptab, mode) != CODE_FOR_nothing)
351+ && find_widening_optab_handler (binoptab, mode, GET_MODE (op0), 1)
352+ != CODE_FOR_nothing)
353 {
354 temp = expand_binop_directly (mode, binoptab, op0, op1, target,
355 unsignedp, methods, last);
356@@ -1585,9 +1620,10 @@
357 takes operands of this mode and makes a wider mode. */
358
359 if (binoptab == smul_optab
360- && GET_MODE_WIDER_MODE (mode) != VOIDmode
361- && (optab_handler ((unsignedp ? umul_widen_optab : smul_widen_optab),
362- GET_MODE_WIDER_MODE (mode))
363+ && GET_MODE_2XWIDER_MODE (mode) != VOIDmode
364+ && (widening_optab_handler ((unsignedp ? umul_widen_optab
365+ : smul_widen_optab),
366+ GET_MODE_2XWIDER_MODE (mode), mode)
367 != CODE_FOR_nothing))
368 {
369 temp = expand_binop (GET_MODE_WIDER_MODE (mode),
370@@ -1615,12 +1651,15 @@
371 wider_mode != VOIDmode;
372 wider_mode = GET_MODE_WIDER_MODE (wider_mode))
373 {
374- if (optab_handler (binoptab, wider_mode) != CODE_FOR_nothing
375+ if (optab_handler (binoptab, wider_mode)
376+ != CODE_FOR_nothing
377 || (binoptab == smul_optab
378 && GET_MODE_WIDER_MODE (wider_mode) != VOIDmode
379- && (optab_handler ((unsignedp ? umul_widen_optab
380- : smul_widen_optab),
381- GET_MODE_WIDER_MODE (wider_mode))
382+ && (find_widening_optab_handler ((unsignedp
383+ ? umul_widen_optab
384+ : smul_widen_optab),
385+ GET_MODE_WIDER_MODE (wider_mode),
386+ mode, 0)
387 != CODE_FOR_nothing)))
388 {
389 rtx xop0 = op0, xop1 = op1;
390@@ -2043,8 +2082,8 @@
391 && optab_handler (add_optab, word_mode) != CODE_FOR_nothing)
392 {
393 rtx product = NULL_RTX;
394-
395- if (optab_handler (umul_widen_optab, mode) != CODE_FOR_nothing)
396+ if (widening_optab_handler (umul_widen_optab, mode, word_mode)
397+ != CODE_FOR_nothing)
398 {
399 product = expand_doubleword_mult (mode, op0, op1, target,
400 true, methods);
401@@ -2053,7 +2092,8 @@
402 }
403
404 if (product == NULL_RTX
405- && optab_handler (smul_widen_optab, mode) != CODE_FOR_nothing)
406+ && widening_optab_handler (smul_widen_optab, mode, word_mode)
407+ != CODE_FOR_nothing)
408 {
409 product = expand_doubleword_mult (mode, op0, op1, target,
410 false, methods);
411@@ -2144,7 +2184,8 @@
412 wider_mode != VOIDmode;
413 wider_mode = GET_MODE_WIDER_MODE (wider_mode))
414 {
415- if (optab_handler (binoptab, wider_mode) != CODE_FOR_nothing
416+ if (find_widening_optab_handler (binoptab, wider_mode, mode, 1)
417+ != CODE_FOR_nothing
418 || (methods == OPTAB_LIB
419 && optab_libfunc (binoptab, wider_mode)))
420 {
421
422=== modified file 'gcc/optabs.h'
423--- gcc/optabs.h 2011-05-05 15:43:06 +0000
424+++ gcc/optabs.h 2011-07-19 09:04:38 +0000
425@@ -42,6 +42,11 @@
426 int insn_code;
427 };
428
429+struct widening_optab_handlers
430+{
431+ struct optab_handlers handlers[NUM_MACHINE_MODES][NUM_MACHINE_MODES];
432+};
433+
434 struct optab_d
435 {
436 enum rtx_code code;
437@@ -50,6 +55,7 @@
438 void (*libcall_gen)(struct optab_d *, const char *name, char suffix,
439 enum machine_mode);
440 struct optab_handlers handlers[NUM_MACHINE_MODES];
441+ struct widening_optab_handlers *widening;
442 };
443 typedef struct optab_d * optab;
444
445@@ -799,6 +805,15 @@
446 extern void emit_unop_insn (int, rtx, rtx, enum rtx_code);
447 extern bool maybe_emit_unop_insn (int, rtx, rtx, enum rtx_code);
448
449+/* Find a widening optab even if it doesn't widen as much as we want. */
450+#define find_widening_optab_handler(A,B,C,D) \
451+ find_widening_optab_handler_and_mode (A, B, C, D, NULL)
452+extern enum insn_code find_widening_optab_handler_and_mode (optab,
453+ enum machine_mode,
454+ enum machine_mode,
455+ int,
456+ enum machine_mode *);
457+
458 /* An extra flag to control optab_for_tree_code's behavior. This is needed to
459 distinguish between machines with a vector shift that takes a scalar for the
460 shift amount vs. machines that take a vector for the shift amount. */
461@@ -874,6 +889,23 @@
462 + (int) CODE_FOR_nothing);
463 }
464
465+/* Like optab_handler, but for widening_operations that have a TO_MODE and
466+ a FROM_MODE. */
467+
468+static inline enum insn_code
469+widening_optab_handler (optab op, enum machine_mode to_mode,
470+ enum machine_mode from_mode)
471+{
472+ if (to_mode == from_mode)
473+ return optab_handler (op, to_mode);
474+
475+ if (op->widening)
476+ return (enum insn_code) (op->widening->handlers[(int) to_mode][(int) from_mode].insn_code
477+ + (int) CODE_FOR_nothing);
478+
479+ return CODE_FOR_nothing;
480+}
481+
482 /* Record that insn CODE should be used to implement mode MODE of OP. */
483
484 static inline void
485@@ -882,6 +914,26 @@
486 op->handlers[(int) mode].insn_code = (int) code - (int) CODE_FOR_nothing;
487 }
488
489+/* Like set_optab_handler, but for widening operations that have a TO_MODE
490+ and a FROM_MODE. */
491+
492+static inline void
493+set_widening_optab_handler (optab op, enum machine_mode to_mode,
494+ enum machine_mode from_mode, enum insn_code code)
495+{
496+ if (to_mode == from_mode)
497+ set_optab_handler (op, to_mode, code);
498+ else
499+ {
500+ if (op->widening == NULL)
501+ op->widening = (struct widening_optab_handlers *)
502+ xcalloc (1, sizeof (struct widening_optab_handlers));
503+
504+ op->widening->handlers[(int) to_mode][(int) from_mode].insn_code
505+ = (int) code - (int) CODE_FOR_nothing;
506+ }
507+}
508+
509 /* Return the insn used to perform conversion OP from mode FROM_MODE
510 to mode TO_MODE; return CODE_FOR_nothing if the target does not have
511 such an insn. */
512
513=== added file 'gcc/testsuite/gcc.target/arm/no-wmla-1.c'
514--- gcc/testsuite/gcc.target/arm/no-wmla-1.c 1970-01-01 00:00:00 +0000
515+++ gcc/testsuite/gcc.target/arm/no-wmla-1.c 2011-07-19 09:04:38 +0000
516@@ -0,0 +1,11 @@
517+/* { dg-do compile } */
518+/* { dg-options "-O2 -march=armv7-a" } */
519+
520+int
521+foo (int a, short b, short c)
522+{
523+ int bc = b * c;
524+ return a + (short)bc;
525+}
526+
527+/* { dg-final { scan-assembler "mul" } } */
528
529=== added file 'gcc/testsuite/gcc.target/arm/wmul-10.c'
530--- gcc/testsuite/gcc.target/arm/wmul-10.c 1970-01-01 00:00:00 +0000
531+++ gcc/testsuite/gcc.target/arm/wmul-10.c 2011-07-19 09:04:38 +0000
532@@ -0,0 +1,10 @@
533+/* { dg-do compile } */
534+/* { dg-options "-O2 -march=armv7-a" } */
535+
536+unsigned long long
537+foo (unsigned short a, unsigned short *b, unsigned short *c)
538+{
539+ return (unsigned)a + (unsigned long long)*b * (unsigned long long)*c;
540+}
541+
542+/* { dg-final { scan-assembler "umlal" } } */
543
544=== added file 'gcc/testsuite/gcc.target/arm/wmul-5.c'
545--- gcc/testsuite/gcc.target/arm/wmul-5.c 1970-01-01 00:00:00 +0000
546+++ gcc/testsuite/gcc.target/arm/wmul-5.c 2011-07-19 09:04:38 +0000
547@@ -0,0 +1,10 @@
548+/* { dg-do compile } */
549+/* { dg-options "-O2 -march=armv7-a" } */
550+
551+long long
552+foo (long long a, char *b, char *c)
553+{
554+ return a + *b * *c;
555+}
556+
557+/* { dg-final { scan-assembler "umlal" } } */
558
559=== added file 'gcc/testsuite/gcc.target/arm/wmul-6.c'
560--- gcc/testsuite/gcc.target/arm/wmul-6.c 1970-01-01 00:00:00 +0000
561+++ gcc/testsuite/gcc.target/arm/wmul-6.c 2011-07-19 09:04:38 +0000
562@@ -0,0 +1,10 @@
563+/* { dg-do compile } */
564+/* { dg-options "-O2 -march=armv7-a" } */
565+
566+long long
567+foo (long long a, unsigned char *b, signed char *c)
568+{
569+ return a + (long long)*b * (long long)*c;
570+}
571+
572+/* { dg-final { scan-assembler "smlal" } } */
573
574=== added file 'gcc/testsuite/gcc.target/arm/wmul-7.c'
575--- gcc/testsuite/gcc.target/arm/wmul-7.c 1970-01-01 00:00:00 +0000
576+++ gcc/testsuite/gcc.target/arm/wmul-7.c 2011-07-19 09:04:38 +0000
577@@ -0,0 +1,10 @@
578+/* { dg-do compile } */
579+/* { dg-options "-O2 -march=armv7-a" } */
580+
581+unsigned long long
582+foo (unsigned long long a, unsigned char *b, unsigned short *c)
583+{
584+ return a + *b * *c;
585+}
586+
587+/* { dg-final { scan-assembler "umlal" } } */
588
589=== added file 'gcc/testsuite/gcc.target/arm/wmul-8.c'
590--- gcc/testsuite/gcc.target/arm/wmul-8.c 1970-01-01 00:00:00 +0000
591+++ gcc/testsuite/gcc.target/arm/wmul-8.c 2011-07-19 09:04:38 +0000
592@@ -0,0 +1,10 @@
593+/* { dg-do compile } */
594+/* { dg-options "-O2 -march=armv7-a" } */
595+
596+long long
597+foo (long long a, int *b, int *c)
598+{
599+ return a + *b * *c;
600+}
601+
602+/* { dg-final { scan-assembler "smlal" } } */
603
604=== added file 'gcc/testsuite/gcc.target/arm/wmul-9.c'
605--- gcc/testsuite/gcc.target/arm/wmul-9.c 1970-01-01 00:00:00 +0000
606+++ gcc/testsuite/gcc.target/arm/wmul-9.c 2011-07-19 09:04:38 +0000
607@@ -0,0 +1,10 @@
608+/* { dg-do compile } */
609+/* { dg-options "-O2 -march=armv7-a" } */
610+
611+long long
612+foo (long long a, short *b, char *c)
613+{
614+ return a + *b * *c;
615+}
616+
617+/* { dg-final { scan-assembler "smlalbb" } } */
618
619=== added file 'gcc/testsuite/gcc.target/arm/wmul-bitfield-1.c'
620--- gcc/testsuite/gcc.target/arm/wmul-bitfield-1.c 1970-01-01 00:00:00 +0000
621+++ gcc/testsuite/gcc.target/arm/wmul-bitfield-1.c 2011-07-19 09:04:38 +0000
622@@ -0,0 +1,17 @@
623+/* { dg-do compile } */
624+/* { dg-options "-O2 -march=armv7-a" } */
625+
626+struct bf
627+{
628+ int a : 3;
629+ int b : 15;
630+ int c : 3;
631+};
632+
633+long long
634+foo (long long a, struct bf b, struct bf c)
635+{
636+ return a + b.b * c.b;
637+}
638+
639+/* { dg-final { scan-assembler "smlalbb" } } */
640
641=== added file 'gcc/testsuite/gcc.target/arm/wmul-bitfield-2.c'
642--- gcc/testsuite/gcc.target/arm/wmul-bitfield-2.c 1970-01-01 00:00:00 +0000
643+++ gcc/testsuite/gcc.target/arm/wmul-bitfield-2.c 2011-07-19 09:04:38 +0000
644@@ -0,0 +1,17 @@
645+/* { dg-do compile } */
646+/* { dg-options "-O2 -march=armv7-a" } */
647+
648+struct bf
649+{
650+ int a : 3;
651+ unsigned int b : 15;
652+ int c : 3;
653+};
654+
655+long long
656+foo (long long a, struct bf b, struct bf c)
657+{
658+ return a + b.b * c.c;
659+}
660+
661+/* { dg-final { scan-assembler "smlalbb" } } */
662
663=== modified file 'gcc/tree-cfg.c'
664--- gcc/tree-cfg.c 2011-07-01 09:19:21 +0000
665+++ gcc/tree-cfg.c 2011-07-19 09:04:38 +0000
666@@ -3574,7 +3574,7 @@
667 case WIDEN_MULT_EXPR:
668 if (TREE_CODE (lhs_type) != INTEGER_TYPE)
669 return true;
670- return ((2 * TYPE_PRECISION (rhs1_type) != TYPE_PRECISION (lhs_type))
671+ return ((2 * TYPE_PRECISION (rhs1_type) > TYPE_PRECISION (lhs_type))
672 || (TYPE_PRECISION (rhs1_type) != TYPE_PRECISION (rhs2_type)));
673
674 case WIDEN_SUM_EXPR:
675@@ -3667,7 +3667,7 @@
676 && !FIXED_POINT_TYPE_P (rhs1_type))
677 || !useless_type_conversion_p (rhs1_type, rhs2_type)
678 || !useless_type_conversion_p (lhs_type, rhs3_type)
679- || 2 * TYPE_PRECISION (rhs1_type) != TYPE_PRECISION (lhs_type)
680+ || 2 * TYPE_PRECISION (rhs1_type) > TYPE_PRECISION (lhs_type)
681 || TYPE_PRECISION (rhs1_type) != TYPE_PRECISION (rhs2_type))
682 {
683 error ("type mismatch in widening multiply-accumulate expression");
684
685=== modified file 'gcc/tree-ssa-math-opts.c'
686--- gcc/tree-ssa-math-opts.c 2011-03-11 16:36:16 +0000
687+++ gcc/tree-ssa-math-opts.c 2011-07-19 09:04:38 +0000
688@@ -1266,42 +1266,68 @@
689 }
690 };
691
692-/* Return true if RHS is a suitable operand for a widening multiplication.
693+/* Build a gimple assignment to cast VAL to TARGET. Insert the statement
694+ prior to GSI's current position, and return the fresh SSA name. */
695+
696+static tree
697+build_and_insert_cast (gimple_stmt_iterator *gsi, location_t loc,
698+ tree target, tree val)
699+{
700+ tree result = make_ssa_name (target, NULL);
701+ gimple stmt = gimple_build_assign_with_ops (CONVERT_EXPR, result, val, NULL);
702+ gimple_set_location (stmt, loc);
703+ gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
704+ return result;
705+}
706+
707+/* Return true if RHS is a suitable operand for a widening multiplication,
708+ assuming a target type of TYPE.
709 There are two cases:
710
711- - RHS makes some value twice as wide. Store that value in *NEW_RHS_OUT
712- if so, and store its type in *TYPE_OUT.
713+ - RHS makes some value at least twice as wide. Store that value
714+ in *NEW_RHS_OUT if so, and store its type in *TYPE_OUT.
715
716 - RHS is an integer constant. Store that value in *NEW_RHS_OUT if so,
717 but leave *TYPE_OUT untouched. */
718
719 static bool
720-is_widening_mult_rhs_p (tree rhs, tree *type_out, tree *new_rhs_out)
721+is_widening_mult_rhs_p (tree type, tree rhs, tree *type_out,
722+ tree *new_rhs_out)
723 {
724 gimple stmt;
725- tree type, type1, rhs1;
726+ tree type1, rhs1;
727 enum tree_code rhs_code;
728
729 if (TREE_CODE (rhs) == SSA_NAME)
730 {
731- type = TREE_TYPE (rhs);
732 stmt = SSA_NAME_DEF_STMT (rhs);
733 if (!is_gimple_assign (stmt))
734- return false;
735-
736- rhs_code = gimple_assign_rhs_code (stmt);
737- if (TREE_CODE (type) == INTEGER_TYPE
738- ? !CONVERT_EXPR_CODE_P (rhs_code)
739- : rhs_code != FIXED_CONVERT_EXPR)
740- return false;
741-
742- rhs1 = gimple_assign_rhs1 (stmt);
743- type1 = TREE_TYPE (rhs1);
744+ {
745+ rhs1 = NULL;
746+ type1 = TREE_TYPE (rhs);
747+ }
748+ else
749+ {
750+ rhs1 = gimple_assign_rhs1 (stmt);
751+ type1 = TREE_TYPE (rhs1);
752+ }
753+
754 if (TREE_CODE (type1) != TREE_CODE (type)
755- || TYPE_PRECISION (type1) * 2 != TYPE_PRECISION (type))
756+ || TYPE_PRECISION (type1) * 2 > TYPE_PRECISION (type))
757 return false;
758
759- *new_rhs_out = rhs1;
760+ if (rhs1)
761+ {
762+ rhs_code = gimple_assign_rhs_code (stmt);
763+ if (TREE_CODE (type) == INTEGER_TYPE
764+ ? !CONVERT_EXPR_CODE_P (rhs_code)
765+ : rhs_code != FIXED_CONVERT_EXPR)
766+ *new_rhs_out = rhs;
767+ else
768+ *new_rhs_out = rhs1;
769+ }
770+ else
771+ *new_rhs_out = rhs;
772 *type_out = type1;
773 return true;
774 }
775@@ -1316,28 +1342,27 @@
776 return false;
777 }
778
779-/* Return true if STMT performs a widening multiplication. If so,
780- store the unwidened types of the operands in *TYPE1_OUT and *TYPE2_OUT
781- respectively. Also fill *RHS1_OUT and *RHS2_OUT such that converting
782- those operands to types *TYPE1_OUT and *TYPE2_OUT would give the
783- operands of the multiplication. */
784+/* Return true if STMT performs a widening multiplication, assuming the
785+ output type is TYPE. If so, store the unwidened types of the operands
786+ in *TYPE1_OUT and *TYPE2_OUT respectively. Also fill *RHS1_OUT and
787+ *RHS2_OUT such that converting those operands to types *TYPE1_OUT
788+ and *TYPE2_OUT would give the operands of the multiplication. */
789
790 static bool
791-is_widening_mult_p (gimple stmt,
792+is_widening_mult_p (tree type, gimple stmt,
793 tree *type1_out, tree *rhs1_out,
794 tree *type2_out, tree *rhs2_out)
795 {
796- tree type;
797-
798- type = TREE_TYPE (gimple_assign_lhs (stmt));
799 if (TREE_CODE (type) != INTEGER_TYPE
800 && TREE_CODE (type) != FIXED_POINT_TYPE)
801 return false;
802
803- if (!is_widening_mult_rhs_p (gimple_assign_rhs1 (stmt), type1_out, rhs1_out))
804+ if (!is_widening_mult_rhs_p (type, gimple_assign_rhs1 (stmt), type1_out,
805+ rhs1_out))
806 return false;
807
808- if (!is_widening_mult_rhs_p (gimple_assign_rhs2 (stmt), type2_out, rhs2_out))
809+ if (!is_widening_mult_rhs_p (type, gimple_assign_rhs2 (stmt), type2_out,
810+ rhs2_out))
811 return false;
812
813 if (*type1_out == NULL)
814@@ -1354,6 +1379,18 @@
815 *type2_out = *type1_out;
816 }
817
818+ /* Ensure that the larger of the two operands comes first. */
819+ if (TYPE_PRECISION (*type1_out) < TYPE_PRECISION (*type2_out))
820+ {
821+ tree tmp;
822+ tmp = *type1_out;
823+ *type1_out = *type2_out;
824+ *type2_out = tmp;
825+ tmp = *rhs1_out;
826+ *rhs1_out = *rhs2_out;
827+ *rhs2_out = tmp;
828+ }
829+
830 return true;
831 }
832
833@@ -1362,31 +1399,94 @@
834 value is true iff we converted the statement. */
835
836 static bool
837-convert_mult_to_widen (gimple stmt)
838+convert_mult_to_widen (gimple stmt, gimple_stmt_iterator *gsi)
839 {
840- tree lhs, rhs1, rhs2, type, type1, type2;
841+ tree lhs, rhs1, rhs2, type, type1, type2, tmp = NULL;
842 enum insn_code handler;
843+ enum machine_mode to_mode, from_mode, actual_mode;
844+ optab op;
845+ int actual_precision;
846+ location_t loc = gimple_location (stmt);
847+ bool from_unsigned1, from_unsigned2;
848
849 lhs = gimple_assign_lhs (stmt);
850 type = TREE_TYPE (lhs);
851 if (TREE_CODE (type) != INTEGER_TYPE)
852 return false;
853
854- if (!is_widening_mult_p (stmt, &type1, &rhs1, &type2, &rhs2))
855+ if (!is_widening_mult_p (type, stmt, &type1, &rhs1, &type2, &rhs2))
856 return false;
857
858- if (TYPE_UNSIGNED (type1) && TYPE_UNSIGNED (type2))
859- handler = optab_handler (umul_widen_optab, TYPE_MODE (type));
860- else if (!TYPE_UNSIGNED (type1) && !TYPE_UNSIGNED (type2))
861- handler = optab_handler (smul_widen_optab, TYPE_MODE (type));
862+ to_mode = TYPE_MODE (type);
863+ from_mode = TYPE_MODE (type1);
864+ from_unsigned1 = TYPE_UNSIGNED (type1);
865+ from_unsigned2 = TYPE_UNSIGNED (type2);
866+
867+ if (from_unsigned1 && from_unsigned2)
868+ op = umul_widen_optab;
869+ else if (!from_unsigned1 && !from_unsigned2)
870+ op = smul_widen_optab;
871 else
872- handler = optab_handler (usmul_widen_optab, TYPE_MODE (type));
873+ op = usmul_widen_optab;
874+
875+ handler = find_widening_optab_handler_and_mode (op, to_mode, from_mode,
876+ 0, &actual_mode);
877
878 if (handler == CODE_FOR_nothing)
879- return false;
880-
881- gimple_assign_set_rhs1 (stmt, fold_convert (type1, rhs1));
882- gimple_assign_set_rhs2 (stmt, fold_convert (type2, rhs2));
883+ {
884+ if (op != smul_widen_optab)
885+ {
886+ /* We can use a signed multiply with unsigned types as long as
887+ there is a wider mode to use, or it is the smaller of the two
888+ types that is unsigned. Note that type1 >= type2, always. */
889+ if ((TYPE_UNSIGNED (type1)
890+ && TYPE_PRECISION (type1) == GET_MODE_PRECISION (from_mode))
891+ || (TYPE_UNSIGNED (type2)
892+ && TYPE_PRECISION (type2) == GET_MODE_PRECISION (from_mode)))
893+ {
894+ from_mode = GET_MODE_WIDER_MODE (from_mode);
895+ if (GET_MODE_SIZE (to_mode) <= GET_MODE_SIZE (from_mode))
896+ return false;
897+ }
898+
899+ op = smul_widen_optab;
900+ handler = find_widening_optab_handler_and_mode (op, to_mode,
901+ from_mode, 0,
902+ &actual_mode);
903+
904+ if (handler == CODE_FOR_nothing)
905+ return false;
906+
907+ from_unsigned1 = from_unsigned2 = false;
908+ }
909+ else
910+ return false;
911+ }
912+
913+ /* Ensure that the inputs to the handler are in the correct precision
914+ for the opcode. This will be the full mode size. */
915+ actual_precision = GET_MODE_PRECISION (actual_mode);
916+ if (actual_precision != TYPE_PRECISION (type1)
917+ || from_unsigned1 != TYPE_UNSIGNED (type1))
918+ {
919+ tmp = create_tmp_var (build_nonstandard_integer_type
920+ (actual_precision, from_unsigned1),
921+ NULL);
922+ rhs1 = build_and_insert_cast (gsi, loc, tmp, rhs1);
923+ }
924+ if (actual_precision != TYPE_PRECISION (type2)
925+ || from_unsigned2 != TYPE_UNSIGNED (type2))
926+ {
927+ /* Reuse the same type info, if possible. */
928+ if (!tmp || from_unsigned1 != from_unsigned2)
929+ tmp = create_tmp_var (build_nonstandard_integer_type
930+ (actual_precision, from_unsigned2),
931+ NULL);
932+ rhs2 = build_and_insert_cast (gsi, loc, tmp, rhs2);
933+ }
934+
935+ gimple_assign_set_rhs1 (stmt, rhs1);
936+ gimple_assign_set_rhs2 (stmt, rhs2);
937 gimple_assign_set_rhs_code (stmt, WIDEN_MULT_EXPR);
938 update_stmt (stmt);
939 return true;
940@@ -1403,11 +1503,17 @@
941 enum tree_code code)
942 {
943 gimple rhs1_stmt = NULL, rhs2_stmt = NULL;
944- tree type, type1, type2;
945+ gimple conv1_stmt = NULL, conv2_stmt = NULL, conv_stmt;
946+ tree type, type1, type2, optype, tmp = NULL;
947 tree lhs, rhs1, rhs2, mult_rhs1, mult_rhs2, add_rhs;
948 enum tree_code rhs1_code = ERROR_MARK, rhs2_code = ERROR_MARK;
949 optab this_optab;
950 enum tree_code wmult_code;
951+ enum insn_code handler;
952+ enum machine_mode to_mode, from_mode, actual_mode;
953+ location_t loc = gimple_location (stmt);
954+ int actual_precision;
955+ bool from_unsigned1, from_unsigned2;
956
957 lhs = gimple_assign_lhs (stmt);
958 type = TREE_TYPE (lhs);
959@@ -1441,54 +1547,153 @@
960 else
961 return false;
962
963- if (code == PLUS_EXPR && rhs1_code == MULT_EXPR)
964- {
965- if (!is_widening_mult_p (rhs1_stmt, &type1, &mult_rhs1,
966- &type2, &mult_rhs2))
967- return false;
968- add_rhs = rhs2;
969- }
970- else if (rhs2_code == MULT_EXPR)
971- {
972- if (!is_widening_mult_p (rhs2_stmt, &type1, &mult_rhs1,
973- &type2, &mult_rhs2))
974- return false;
975- add_rhs = rhs1;
976- }
977- else if (code == PLUS_EXPR && rhs1_code == WIDEN_MULT_EXPR)
978- {
979- mult_rhs1 = gimple_assign_rhs1 (rhs1_stmt);
980- mult_rhs2 = gimple_assign_rhs2 (rhs1_stmt);
981- type1 = TREE_TYPE (mult_rhs1);
982- type2 = TREE_TYPE (mult_rhs2);
983- add_rhs = rhs2;
984- }
985- else if (rhs2_code == WIDEN_MULT_EXPR)
986- {
987- mult_rhs1 = gimple_assign_rhs1 (rhs2_stmt);
988- mult_rhs2 = gimple_assign_rhs2 (rhs2_stmt);
989- type1 = TREE_TYPE (mult_rhs1);
990- type2 = TREE_TYPE (mult_rhs2);
991- add_rhs = rhs1;
992+ /* Allow for one conversion statement between the multiply
993+ and addition/subtraction statement. If there is more than
994+ one conversion then we assume they would invalidate this
995+ transformation. If that's not the case then they should have
996+ been folded before now. */
997+ if (CONVERT_EXPR_CODE_P (rhs1_code))
998+ {
999+ conv1_stmt = rhs1_stmt;
1000+ rhs1 = gimple_assign_rhs1 (rhs1_stmt);
1001+ if (TREE_CODE (rhs1) == SSA_NAME)
1002+ {
1003+ rhs1_stmt = SSA_NAME_DEF_STMT (rhs1);
1004+ if (is_gimple_assign (rhs1_stmt))
1005+ rhs1_code = gimple_assign_rhs_code (rhs1_stmt);
1006+ }
1007+ else
1008+ return false;
1009+ }
1010+ if (CONVERT_EXPR_CODE_P (rhs2_code))
1011+ {
1012+ conv2_stmt = rhs2_stmt;
1013+ rhs2 = gimple_assign_rhs1 (rhs2_stmt);
1014+ if (TREE_CODE (rhs2) == SSA_NAME)
1015+ {
1016+ rhs2_stmt = SSA_NAME_DEF_STMT (rhs2);
1017+ if (is_gimple_assign (rhs2_stmt))
1018+ rhs2_code = gimple_assign_rhs_code (rhs2_stmt);
1019+ }
1020+ else
1021+ return false;
1022+ }
1023+
1024+ /* If code is WIDEN_MULT_EXPR then it would seem unnecessary to call
1025+ is_widening_mult_p, but we still need the rhs returns.
1026+
1027+ It might also appear that it would be sufficient to use the existing
1028+ operands of the widening multiply, but that would limit the choice of
1029+ multiply-and-accumulate instructions. */
1030+ if (code == PLUS_EXPR
1031+ && (rhs1_code == MULT_EXPR || rhs1_code == WIDEN_MULT_EXPR))
1032+ {
1033+ if (!is_widening_mult_p (type, rhs1_stmt, &type1, &mult_rhs1,
1034+ &type2, &mult_rhs2))
1035+ return false;
1036+ add_rhs = rhs2;
1037+ conv_stmt = conv1_stmt;
1038+ }
1039+ else if (rhs2_code == MULT_EXPR || rhs2_code == WIDEN_MULT_EXPR)
1040+ {
1041+ if (!is_widening_mult_p (type, rhs2_stmt, &type1, &mult_rhs1,
1042+ &type2, &mult_rhs2))
1043+ return false;
1044+ add_rhs = rhs1;
1045+ conv_stmt = conv2_stmt;
1046 }
1047 else
1048 return false;
1049
1050- if (TYPE_UNSIGNED (type1) != TYPE_UNSIGNED (type2))
1051- return false;
1052+ to_mode = TYPE_MODE (type);
1053+ from_mode = TYPE_MODE (type1);
1054+ from_unsigned1 = TYPE_UNSIGNED (type1);
1055+ from_unsigned2 = TYPE_UNSIGNED (type2);
1056+
1057+ /* There's no such thing as a mixed sign madd yet, so use a wider mode. */
1058+ if (from_unsigned1 != from_unsigned2)
1059+ {
1060+ /* We can use a signed multiply with unsigned types as long as
1061+ there is a wider mode to use, or it is the smaller of the two
1062+ types that is unsigned. Note that type1 >= type2, always. */
1063+ if ((from_unsigned1
1064+ && TYPE_PRECISION (type1) == GET_MODE_PRECISION (from_mode))
1065+ || (from_unsigned2
1066+ && TYPE_PRECISION (type2) == GET_MODE_PRECISION (from_mode)))
1067+ {
1068+ from_mode = GET_MODE_WIDER_MODE (from_mode);
1069+ if (GET_MODE_SIZE (from_mode) >= GET_MODE_SIZE (to_mode))
1070+ return false;
1071+ }
1072+
1073+ from_unsigned1 = from_unsigned2 = false;
1074+ }
1075+
1076+ /* If there was a conversion between the multiply and addition
1077+ then we need to make sure it fits a multiply-and-accumulate.
1078+ There should be a single mode change which does not change the
1079+ value. */
1080+ if (conv_stmt)
1081+ {
1082+ /* We use the original, unmodified data types for this. */
1083+ tree from_type = TREE_TYPE (gimple_assign_rhs1 (conv_stmt));
1084+ tree to_type = TREE_TYPE (gimple_assign_lhs (conv_stmt));
1085+ int data_size = TYPE_PRECISION (type1) + TYPE_PRECISION (type2);
1086+ bool is_unsigned = TYPE_UNSIGNED (type1) && TYPE_UNSIGNED (type2);
1087+
1088+ if (TYPE_PRECISION (from_type) > TYPE_PRECISION (to_type))
1089+ {
1090+ /* Conversion is a truncate. */
1091+ if (TYPE_PRECISION (to_type) < data_size)
1092+ return false;
1093+ }
1094+ else if (TYPE_PRECISION (from_type) < TYPE_PRECISION (to_type))
1095+ {
1096+ /* Conversion is an extend. Check it's the right sort. */
1097+ if (TYPE_UNSIGNED (from_type) != is_unsigned
1098+ && !(is_unsigned && TYPE_PRECISION (from_type) > data_size))
1099+ return false;
1100+ }
1101+ /* else convert is a no-op for our purposes. */
1102+ }
1103
1104 /* Verify that the machine can perform a widening multiply
1105 accumulate in this mode/signedness combination, otherwise
1106 this transformation is likely to pessimize code. */
1107- this_optab = optab_for_tree_code (wmult_code, type1, optab_default);
1108- if (optab_handler (this_optab, TYPE_MODE (type)) == CODE_FOR_nothing)
1109+ optype = build_nonstandard_integer_type (from_mode, from_unsigned1);
1110+ this_optab = optab_for_tree_code (wmult_code, optype, optab_default);
1111+ handler = find_widening_optab_handler_and_mode (this_optab, to_mode,
1112+ from_mode, 0, &actual_mode);
1113+
1114+ if (handler == CODE_FOR_nothing)
1115 return false;
1116
1117- /* ??? May need some type verification here? */
1118-
1119- gimple_assign_set_rhs_with_ops_1 (gsi, wmult_code,
1120- fold_convert (type1, mult_rhs1),
1121- fold_convert (type2, mult_rhs2),
1122+ /* Ensure that the inputs to the handler are in the correct precision
1123+ for the opcode. This will be the full mode size. */
1124+ actual_precision = GET_MODE_PRECISION (actual_mode);
1125+ if (actual_precision != TYPE_PRECISION (type1)
1126+ || from_unsigned1 != TYPE_UNSIGNED (type1))
1127+ {
1128+ tmp = create_tmp_var (build_nonstandard_integer_type
1129+ (actual_precision, from_unsigned1),
1130+ NULL);
1131+ mult_rhs1 = build_and_insert_cast (gsi, loc, tmp, mult_rhs1);
1132+ }
1133+ if (actual_precision != TYPE_PRECISION (type2)
1134+ || from_unsigned2 != TYPE_UNSIGNED (type2))
1135+ {
1136+ if (!tmp || from_unsigned1 != from_unsigned2)
1137+ tmp = create_tmp_var (build_nonstandard_integer_type
1138+ (actual_precision, from_unsigned2),
1139+ NULL);
1140+ mult_rhs2 = build_and_insert_cast (gsi, loc, tmp, mult_rhs2);
1141+ }
1142+
1143+ if (TYPE_PRECISION (type) != TYPE_PRECISION (TREE_TYPE (add_rhs)))
1144+ add_rhs = build_and_insert_cast (gsi, loc, create_tmp_var (type, NULL),
1145+ add_rhs);
1146+
1147+ gimple_assign_set_rhs_with_ops_1 (gsi, wmult_code, mult_rhs1, mult_rhs2,
1148 add_rhs);
1149 update_stmt (gsi_stmt (*gsi));
1150 return true;
1151@@ -1696,7 +1901,7 @@
1152 switch (code)
1153 {
1154 case MULT_EXPR:
1155- if (!convert_mult_to_widen (stmt)
1156+ if (!convert_mult_to_widen (stmt, &gsi)
1157 && convert_mult_to_fma (stmt,
1158 gimple_assign_rhs1 (stmt),
1159 gimple_assign_rhs2 (stmt)))
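
A comment in convert_mult_to_widen says a signed multiply can be used with unsigned types as long as there is a wider mode to use. The sketch below illustrates only the arithmetic behind that claim, in plain C. It is not GCC code: smul_widen_16, mixed_mul_u8_s8 and check_all_u8_s8 are hypothetical names invented for this example, and the 8-bit/16-bit widths are assumed purely for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a signed 16x16->32 widening multiply
   instruction, i.e. the kind of handler a signed widening-multiply
   optab would describe.  Not part of GCC.  */
static int32_t
smul_widen_16 (int16_t a, int16_t b)
{
  return (int32_t) a * (int32_t) b;
}

/* Mixed-sign 8-bit product computed with the signed multiply above.
   Zero-extending U to 16 bits leaves the top bit clear, so reading the
   result as a signed 16-bit value does not change it.  */
static int32_t
mixed_mul_u8_s8 (uint8_t u, int8_t s)
{
  return smul_widen_16 ((int16_t) u, (int16_t) s);
}

/* Exhaustively compare against the exact product computed in int.  */
static void
check_all_u8_s8 (void)
{
  for (int u = 0; u <= UINT8_MAX; u++)
    for (int s = INT8_MIN; s <= INT8_MAX; s++)
      assert (mixed_mul_u8_s8 ((uint8_t) u, (int8_t) s) == u * s);
}
```

Calling check_all_u8_s8 () passes for every operand pair; for instance mixed_mul_u8_s8 (200, -3) gives -600. Had the unsigned operand already filled the 16-bit source mode, the same trick would need the next wider mode. This appears to be why the patch steps from_mode up with GET_MODE_WIDER_MODE and gives up once that mode is no smaller than the result mode.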
