Merge lp:~mgiuca/mars/switchfactor into lp:mars
- switchfactor
- Merge into trunk
Proposed by
Matt Giuca
Status: | Merged | ||||
---|---|---|---|---|---|
Merged at revision: | not available | ||||
Proposed branch: | lp:~mgiuca/mars/switchfactor | ||||
Merge into: | lp:mars | ||||
Diff against target: |
2194 lines (+1570/-166) 17 files modified
doc/dev/index.rst (+31/-0) doc/dev/switchfactor.rst (+795/-0) doc/index.rst (+1/-0) doc/ref/stmts.rst (+0/-8) src/ast_cfg.m (+494/-72) src/interpret.m (+46/-53) src/ir.m (+25/-9) src/mars.m (+1/-1) src/marsc.m (+1/-1) src/pretty.m (+30/-12) test/cases/compiler/switch.mar (+109/-0) test/cases/compiler/switchfail.mar (+0/-2) test/cases/compiler/switchfail.mtc (+4/-8) test/cases/compiler/switchfail2.mar (+13/-0) test/cases/compiler/switchfail2.mtc (+5/-0) test/cases/compiler/switchfail3.mar (+10/-0) test/cases/compiler/switchfail3.mtc (+5/-0) |
||||
To merge this branch: | bzr merge lp:~mgiuca/mars/switchfactor | ||||
Related bugs: |
|
Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
Matt Giuca | Approve | ||
Review via email: mp+17991@code.launchpad.net |
Commit message
Description of the change
To post a comment you must log in.
Revision history for this message
Matt Giuca (mgiuca) wrote : | # |
lp:~mgiuca/mars/switchfactor
updated
- 981. By Matt Giuca
-
Merged from trunk -- Repeated variable in pattern test.
- 982. By Matt Giuca
-
Test suite: switch: Added 3 new test cases from other switch examples.
- 983. By Matt Giuca
-
doc/dev/
switchfactor: Added another fully-worked example, from a test case (note in the test case where the full working is).
Preview Diff
[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1 | === added directory 'doc/dev' |
2 | === added file 'doc/dev/index.rst' |
3 | --- doc/dev/index.rst 1970-01-01 00:00:00 +0000 |
4 | +++ doc/dev/index.rst 2010-01-25 07:28:13 +0000 |
5 | @@ -0,0 +1,31 @@ |
6 | +.. Mars Documentation |
7 | + Copyright (C) 2010 Matt Giuca |
8 | + 19/1/2010 |
9 | + |
10 | +.. This program is free software: you can redistribute it and/or modify |
11 | + it under the terms of the GNU General Public License as published by |
12 | + the Free Software Foundation, either version 3 of the License, or |
13 | + (at your option) any later version. |
14 | + |
15 | +.. This program is distributed in the hope that it will be useful, |
16 | + but WITHOUT ANY WARRANTY; without even the implied warranty of |
17 | + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the |
18 | + GNU General Public License for more details. |
19 | + |
20 | +.. You should have received a copy of the GNU General Public License |
21 | + along with this program. If not, see <http://www.gnu.org/licenses/>. |
22 | + |
23 | +.. _ref-dev: |
24 | + |
25 | +Developer Documentation |
26 | +======================= |
27 | + |
28 | +.. sectionauthor:: Matt Giuca <matt.giuca@gmail.com> |
29 | + |
30 | +This section contains information about the workings of *Mars*, for developers |
31 | +and other interested parties. |
32 | + |
33 | +.. toctree:: |
34 | + :maxdepth: 2 |
35 | + |
36 | + switchfactor.rst |
37 | |
38 | === added file 'doc/dev/switchfactor.rst' |
39 | --- doc/dev/switchfactor.rst 1970-01-01 00:00:00 +0000 |
40 | +++ doc/dev/switchfactor.rst 2010-01-25 07:28:13 +0000 |
41 | @@ -0,0 +1,795 @@ |
42 | +.. Mars Documentation |
43 | + Copyright (C) 2010 Matt Giuca |
44 | + 19/1/2010 |
45 | + |
46 | +.. This program is free software: you can redistribute it and/or modify |
47 | + it under the terms of the GNU General Public License as published by |
48 | + the Free Software Foundation, either version 3 of the License, or |
49 | + (at your option) any later version. |
50 | + |
51 | +.. This program is distributed in the hope that it will be useful, |
52 | + but WITHOUT ANY WARRANTY; without even the implied warranty of |
53 | + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the |
54 | + GNU General Public License for more details. |
55 | + |
56 | +.. You should have received a copy of the GNU General Public License |
57 | + along with this program. If not, see <http://www.gnu.org/licenses/>. |
58 | + |
59 | +.. _ref-switchfactor: |
60 | + |
61 | +************************** |
62 | +Switch factoring algorithm |
63 | +************************** |
64 | + |
65 | +.. sectionauthor:: Matt Giuca <matt.giuca@gmail.com> |
66 | + |
67 | +This is the algorithm in Mars for transforming an AST-level switch statement |
68 | +to a CFG-level one. This flattens patterns out into multi-level switch |
69 | +statements, so it is a non-recursive data structure (like the rest of CFG), |
70 | +and represents the actual execution paths taken by the machine. It also |
71 | +generates errors and warnings for malformed switch statements in the source. |
72 | + |
73 | +First, we transform each case into a ":token:`matcher`", a temporary data |
74 | +structure representing a single case in an equivalent, but flattened, form. |
75 | +For each case, recurse through the pattern structure, and generate a flat |
76 | +"matcher". This results in a list of matchers, corresponding to the original |
77 | +list of cases. |
78 | + |
79 | +.. productionlist:: |
80 | + matcher: `pair`(`list`(`match`), `stmt_block`) |
81 | + |
82 | +A matcher list is an assoc list associating sequences of ":token:`match`"es |
83 | +with blocks, just like a switch is an assoc list associating patterns with |
84 | +blocks. A ":token:`match`" is a pattern without arguments (ctors have arities |
85 | +rather than arguments) -- unlike patterns, matches don't recurse. |
86 | + |
87 | +.. productionlist:: |
88 | + match: `match_any` | `match_var` | `match_ctor`(Name, Arity) | `match_intlit`(Val) |
89 | + |
90 | +A key operation in this algorithm is **generate_matcher_code**, which |
91 | +generates low-level code from a matcher list. This recursive operation |
92 | +performs most of the work, once the matcher list is generated. It is called |
93 | +as:: |
94 | + |
95 | + generate_matcher_code(arg_list, matcher_list) |
96 | + |
97 | +Where `arg_list` is a list of atoms (variable names), denoting the variables |
98 | +holding the switch values for the match list. The `arg_list` is always shorter |
99 | +or equal in length to the `matcher_list`, as arg_list only has the arguments |
100 | +for the current level, while `matcher_list` is fully expanded out. |
101 | + |
102 | +Example:: |
103 | + |
104 | + switch x: |
105 | + case Pair(A, 1) |
106 | + case Pair(B, 2) |
107 | + case Pair(A, 3) |
108 | + case Pair(q, _) |
109 | + case Pair(A, 5) |
110 | + |
111 | +We want the final output to be:: |
112 | + |
113 | + switch x: |
114 | + case Pair: |
115 | + a = x.Pair(0) |
116 | + n = x.Pair(1) |
117 | + switch a: |
118 | + case A: |
119 | + switch n: |
120 | + case 1 |
121 | + case 3 |
122 | + default: q = a |
123 | + case B: |
124 | + switch n: |
125 | + case 2 |
126 | + default: q = a |
127 | + default: q = a |
128 | + |
129 | +The arguments to pair are patterns, so this data structure is recursive. We |
130 | +want to flatten it out. Therefore, the matchers we generate are: |
131 | + |
132 | + [(Pair/2,A,1)-block1, (Pair/2,B,2)-block2, (Pair/2,A,3)-block3, (Pair/2,q,_)-block4, (Pair/2,A,5)-block5] |
133 | + |
134 | +(All the match arguments would have an arity, such as A/0 and 1/0, but we |
135 | +don't show /0 in the list above. But not the default arguments, such as q.) |
136 | + |
137 | +Now the constructors have no arguments -- we have a flat list of patterns. Note |
138 | +that we recursively flatten this data, so if A had two arguments, we could see |
139 | +a matcher like: |
140 | + |
141 | + (Pair/2,A/2,J,K,1)-block1 |
142 | + |
143 | +This reads operationally, "match a Pair, match the Pair's first argument with |
144 | +A, match the A's first argument with J and its second with K, then match the |
145 | +Pair's second argument with 1." So matchers are just a flat data structure for |
146 | +representing the same information as the case statement, but with an |
147 | +additional operational reading. Not all matchers have the same number of |
148 | +steps, as different ctors have different arities. |
149 | + |
150 | +Now before we generate code for the matchers, we add a layer of abstraction |
151 | +over the blocks, to avoid duplicating blocks when we duplicate matchers. |
152 | +Hence, our matcher list becomes: |
153 | + |
154 | + [(Pair/2,A,1)-"block1", (Pair/2,B,2)-"block2", (Pair/2,A,3)-"block3", (Pair/2,q,_)-"block4", (Pair/2,A,5)-"block5"] |
155 | + |
156 | + "block1": block1 |
157 | + |
158 | + "block2": block2 |
159 | + |
160 | + "block3": block3 |
161 | + |
162 | + "block4": block4 |
163 | + |
164 | + "block5": block5 |
165 | + |
166 | +(The unquoted block names represent the full block code; the quoted ones |
167 | +represent a thin reference to the block name). |
168 | + |
169 | +The overall algorithm, outside of generating matcher code, goes, for |
170 | +switch(`switchvar`, `cases`): |
171 | + |
172 | +1. Construct the matcher list, `matcher_list`, from cases, by flattening out |
173 | + the switch statement. |
174 | + |
175 | + * Generate the code for each target block, and store the Block Ids in the |
176 | + matcher list, not the blocks themselves. |
177 | +2. Call :func:`generate_matcher_code` ([`switchvar`], `matcher_list`). |
178 | + |
179 | +The result is a CFG-level switch statement. CFG-level switch statements are |
180 | +very simple jump tables. They switch over an atom, and map ints and ctor names |
181 | +to block IDs. They have an optional default clause, which is required to be |
182 | +last. It is up to the backend to decide what to do if there is no default |
183 | +clause, but the backend is free to exhibit undefined behaviour if a switch |
184 | +doesn't match and there is no default clause, as the code generator will never |
185 | +generate such an instruction due to exhaustive check, described below. |
186 | + |
187 | +In this example, we call:: |
188 | + |
189 | + generate_matcher_code(["x"], [(Pair/2,A,1)-"block1", (Pair/2,B,2)-"block2", |
190 | + (Pair/2,A,3)-"block3", (Pair/2,q,_)-"block4", |
191 | + (Pair/2,A,5)-"block5"]) |
192 | + |
193 | +====================================== |
194 | +Generating code for a list of matchers |
195 | +====================================== |
196 | + |
197 | +.. function:: generate_matcher_code(arg_list, matcher_list) |
198 | + |
199 | +Summary of the algorithm: Focus only on the first argument for each matcher -- |
200 | +the rest will be handled by recursion. Convert variable-binding matchers into |
201 | +wildcard/default matchers which bind a variable in their code. For every |
202 | +relevant head value, group together every possible matcher (those with an |
203 | +exact match, and wildcard/default matchers). Make these groupings into switch |
204 | +statements, and recurse to generate the body. |
205 | + |
206 | +Note that we assume we will switch on arguments from left to right. We could |
207 | +also pick a more intelligent order, such as choosing the argument with the |
208 | +most distinct cases first (which means earlier switches eliminate the most |
209 | +cases). In this case, the example below would choose the 2nd argument first, |
210 | +but we'll assume left-to-right for now. Note that the flattening means we have |
211 | +to make this decision *before* generating the matcher list. |
212 | + |
213 | +Each call to :func:`generate_matcher_code` is given a block to write to (this |
214 | +block will be completed by the end of the call). When a choice is required, |
215 | +the block will be finished with a switch statement, and new blocks will be |
216 | +created for each case (and finished off by recursive calls). When no choice is |
217 | +required, some code may be appended to the block, and then the (single) |
218 | +recursive call will complete the same block. |
219 | + |
220 | +Calls to :func:`generate_matcher_code` may also append phi instructions to the |
221 | +target blocks (which is the mechanism used to make explicit variable |
222 | +bindings). |
223 | + |
224 | +The function also needs to track two tables, for keeping variable bindings. |
225 | +Ultimately, variable bindings are stored in phi instructions in the target |
226 | +blocks, but during the code generation, we track them separately, because we |
227 | +can't build phi nodes piece-by-piece. The **binding table** only moves down |
228 | +through the stack, so any changes made to it are undone on the way up. This |
229 | +maps :class:`blockid` to :class:`varname` to :class:`varname`, such that if |
230 | +BB-BoundVar maps to SourceVar, it says "if BB wants BoundVar, it is currently |
231 | +available in SourceVar." The **phi table** moves down and up through the |
232 | +stack, so changes made to it are permanent throughout the algorithm. This is a |
233 | +multimap mapping :class:`blockid` to :class:`varname` to |
234 | +:class:`pair(blockid,varname)`, such that if TargetBlock-BoundVar maps to |
235 | +SourceBlock-SourceVar, it says "TargetBlock should have a phi node for |
236 | +BoundVar, where one entry maps SourceBlock to SourceVar." Both tables start |
237 | +out empty. |
238 | + |
239 | +Base case |
240 | +--------- |
241 | + |
242 | +When `arg_list` is empty. It should always be the case that |
243 | +`matcher_list` contains exactly one element (but assume it has multiple and |
244 | +pick the first), whose match list is empty. In this case, generate a branch to |
245 | +the block named by the first element of `matcher_list`. |
246 | + |
247 | +Also add entries to the phi table for the target block. Get the target block |
248 | +out of the binding table. For each entry in this table (BoundVar-SourceVar), |
249 | +create a phi table entry (TargetBlock-BoundVar) = (CurrentBlock-SourceVar). |
250 | + |
251 | +Step 1 -- Update binding table with explicit variable bindings |
252 | +-------------------------------------------------------------- |
253 | + |
254 | +The first step of the operation (which in the current example will do nothing, |
255 | +but we'll see it working later) is to update the binding table for explicit |
256 | +variable bindings (as distinct from the implicit temporary variables, which |
257 | +are created during switch generation). For each matcher whose first argument |
258 | +is a variable name: |
259 | + |
260 | +1. Add an entry in the binding table mapping (that matcher's block - the first |
261 | + variable in `arg_list`) to the given variable name. For example, if |
262 | + `arg_list` [1] is "a" and the first entry in a matcher is "q", then the |
263 | + variable "q" must be bound to the first switch value. Therefore, add to the |
264 | + binding table (``<matcher's block>`` - `q`) -> `a`. |
265 | +2. Replace the first argument of the matcher with match_any ("_"). This is |
266 | + sound, as variables match anything, and we no longer need to worry about |
267 | + the variable binding, as it will be taken care of as soon as we branch to |
268 | + the code. |
269 | + |
270 | +Step 2 -- Create association list |
271 | +--------------------------------- |
272 | + |
273 | +The next step is to create an assoc list for every head value of interest, for |
274 | +the first argument. The assoc list maps "match"es onto match lists. "Head |
275 | +value" means the ctor without its arguments, or the integer. "Of interest" |
276 | +means the head values referred to by any matcher. Head values which are never |
277 | +referred to (e.g., all the integers which aren't explicitly mentioned) are all |
278 | +the same as far as we're concerned. |
279 | + |
280 | +Now we have to be *very careful* to preserve order, because default matchers |
281 | +which are before more specific matchers need to supersede the specific ones. |
282 | +(This assumes a language like Haskell, where the first match is chosen, not |
283 | +Prolog/Mercury, where the order technically doesn't matter.) The grouping of |
284 | +matchers with the same head will conceptually change the order, but this |
285 | +*never* matters, because when we make a grouping, it is a committed choice. In |
286 | +other words, once a match is made, we cover all possible subsequent cases |
287 | +inside the match, and never fall back out to a default case. So we don't |
288 | +re-order within a group, but once we have selected the groupings, their order |
289 | +doesn't matter (though we keep them in order for the sake of predictable code |
290 | +generation). |
291 | + |
292 | +Let's return to our example, with the matcher list: |
293 | + |
294 | + [(Pair/2,A,1)-"block1", (Pair/2,B,2)-"block2", (Pair/2,A,3)-"block3", (Pair/2,q,_)-"block4", (Pair/2,A,5)-"block5"] |
295 | + |
296 | +The resulting assoc list is very basic at this level, since all of the |
297 | +matchers have the same head, Pair/2: |
298 | + |
299 | + Pair/2: [(A,1)-"block1", (B,2)-"block2", (A,3)-"block3", (q,_)-"block4", (A,5)-"block5"] |
300 | + |
301 | +It's difficult to see at this stage, but the operation is to take each head |
302 | +value of interest, and give it its own entry in the assoc list. Then, place in |
303 | +the assoc list any matcher which begins with either the head value, or a "_". |
304 | + |
305 | +* For matchers which begin with the head value, remove the first column of the |
306 | + matcher (so the matched head value no longer appears in the match list). |
307 | +* For matchers which begin with a "_", replace the first column of the |
308 | + matcher (the "_") with one "_" for each argument of the head value (0 or |
309 | + more). For example, if placing a default matcher in the Cons/2 row, replace |
310 | + its front "_" with "_", "_" -- one for each argument. If placing a default |
311 | + matcher in the Nil row, remove its front "_". |
312 | + |
313 | +If there are any default cases, also create an assoc list entry for "_", and |
314 | +only place in the assoc list any matcher which begins with a "_". Remove the |
315 | +first column of such matchers, as if matching a 0-arity head value. |
316 | + |
317 | +**Horizontal rule**: For any matcher (looking at the remaining matches), if it |
318 | +is *covered* by all of the matchers already on a row, remove it, as it is |
319 | +redundant. By "covered", I mean for all of the remaining matches, if it is an |
320 | +exact match, it already appears in a previous matcher, or if it is a default, |
321 | +there is already a default in the previous matchers or between them they |
322 | +include all constructors of that type. When removing a matcher, if the first |
323 | +argument is not a default match, give a warning, as it is a useless clause. |
324 | + |
325 | +**Vertical rule**: For any assoc list entry, if its head is *covered* by all |
326 | +of the entries already in the table, remove it, as it is redundant. By |
327 | +"covered", I mean if it is an exact match, that head already appeared in the |
328 | +assoc list. If it is a default, there is already a default in the assoc list |
329 | +or between them, all entries include all constructors of that type. When |
330 | +removing an assoc list entry, give a warning for each matcher, as they are all |
331 | +useless. (Perhaps do not give a warning for a default matcher covered by all |
332 | +constructors, as that probably represents defensive programming. For example, |
333 | +the programmer might just be using the default rule as an assert because he |
334 | +doesn't trust the compiler to generate "not exhaustive" errors -- then again, |
335 | +maybe it should be a warning because he should trust the compiler!) When a |
336 | +default entry is placed in the assoc list, it will always be the last (this is |
337 | +a structural requirement of the CFG-level switch statement). |
338 | + |
339 | +Once this grouping is done (at each level), test that there is *either* a |
340 | +default case, or that all ctors of the switch type have a case. If not, it is |
341 | +a compiler error ("switch not exhaustive"). The error message should include |
342 | +all of the missing cases ("requires default case or cases for A, B, C"), if it |
343 | +is an ADT. If it is an integer type, it must have a default case, so the error |
344 | +message should just read "switch not exhaustive: default case required for |
345 | +switch on integer". |
346 | + |
347 | +Step 3 -- Generate switch statement and implicit unpacking instructions |
348 | +----------------------------------------------------------------------- |
349 | + |
350 | +This enables us to generate the top-level switch statement. Switch over the |
351 | +first variable of `arg_list`, and generate a case statement for each assoc |
352 | +list entry. If the assoc list entry is default, generate the "default" clause |
353 | +of the switch statement (which must be last). Within each case statement, we |
354 | +unpack all arguments, giving them a temporary name. |
355 | + |
356 | +**Unnecessary optimisation**: If the variable's original pattern is _ in all |
357 | +matchers, don't unpack it -- note that to know this, you actually shouldn't |
358 | +erase the variable name in step 1, or you won't know which are needed and |
359 | +which aren't. Also, if their pattern is a variable name in all matchers, we |
360 | +can use that variable name and avoid the mov instruction on a later recursion |
361 | +of step 1, but for an optimising backend like LLVM, that won't save anything. |
362 | + |
363 | +:: |
364 | + |
365 | + <continue current block> |
366 | + switch x: |
367 | + case Pair: goto casepair |
368 | + |
369 | + casepair: |
370 | + a = x.Pair(0) |
371 | + n = x.Pair(1) |
372 | + <generate_matcher_code([a,n], [(A,1)-"block1", (B,2)-"block2", |
373 | + (A,3)-"block3", (q,_)-"block4", |
374 | + (A,5)-"block5"])> |
375 | + |
376 | +**Optimisation**: If `arg_list` has a single element, and there were no |
377 | +variables unpacked, then all recursive calls will be base cases with a simple |
378 | +goto. Avoid generating a bunch of new goto-only blocks by making the case |
379 | +statements directly branch to the first target block in each case, as the base |
380 | +case would. Must also update the phi table as the base case does. |
381 | + |
382 | +**Optimisation**: I naively generated a single-case switch statement. Instead, |
383 | +we should avoid generating such a thing. If there is exactly one entry in the |
384 | +assoc list, just generate the unpacking code directly:: |
385 | + |
386 | + <continue current block> |
387 | + a = x.Pair(0) |
388 | + n = x.Pair(1) |
389 | + <generate_matcher_code([a,n], [(A,1)-"block1", (B,2)-"block2", |
390 | + (A,3)-"block3", (q,_)-"block4", |
391 | + (A,5)-"block5"])> |
392 | + |
393 | +Step 4 -- Recurse |
394 | +----------------- |
395 | + |
396 | +Now we recurse on :func:`generate_matcher_code`. The recursive calls are shown |
397 | +in the example above. Note that we recurse once for each assoc list entry (so |
398 | +we iterate, then recurse). |
399 | + |
400 | +The `arg_list` on the recursive calls is the current `arg_list` with the first |
401 | +element removed, and the unpacked temporary variable names prepended. These |
402 | +new arguments correspond to the first entries in the new matcher lists. The |
403 | +`matcher_list` on the recursive calls is simply the corresponding value in the |
404 | +assoc list, for this head. |
405 | + |
406 | +================================================================ |
407 | +Afterwards -- Generating phi code for explicit variable bindings |
408 | +================================================================ |
409 | + |
410 | +After all calls to :func:`generate_matcher_code` have ended, the phi table has |
411 | +been built. This describes exactly where each target block should read the |
412 | +variable bindings from, conditionally dependent on the block it came from. |
413 | + |
414 | +These should be converted into phi instructions for each block, in the obvious |
415 | +way. **Unnecessary optimisation**: Any entries in the phi table which have the |
416 | +same source variable name for every source block can be converted into mov |
417 | +instructions instead. |
418 | + |
419 | +====================== |
420 | +Completing the example |
421 | +====================== |
422 | + |
423 | +I now present the rest of the worked example, stepping through the recursive |
424 | +calls. :: |
425 | + |
426 | + generate_matcher_code([a,n], [(A,1)-"block1", (B,2)-"block2", (A,3)-"block3", |
427 | + (q,_)-"block4", (A,5)-"block5"]) |
428 | + |
429 | +First, we apply step 1, binding q for the matcher for (q,_). |
430 | + |
431 | +1. prepend to the actual code of block 4, the instruction "q = a" (where a is |
432 | + the first variable we are matching against). |
433 | +2. Replace q with _ in the matchers list. Hence: |
434 | + |
435 | + [(A,1)-"block1", (B,2)-"block2", (A,3)-"block3", (_,_)-"block4", (A,5)-"block5"] |
436 | + |
437 | +Now we again group together the matchers by their first argument -- this time |
438 | +we have multiple ctors and default matchers in the first argument. For every |
439 | +matched ctor, we duplicate all of the default matchers. Thus, the groups are: |
440 | + |
441 | + A: [(1)-"block1", (3)-"block3", (_)-"block4", (5)-"block5"] |
442 | + |
443 | + B: [(2)-"block2", (_)-"block4"] |
444 | + |
445 | + _: [(_)-"block4"] |
446 | + |
447 | +Note that block4 was duplicated for the A group, even though it did not |
448 | +explicitly match A. This preserves the semantics of the switch statement -- if |
449 | +we had not duplicated it, then matching A would discount that default case. As |
450 | +explained above, when duplicating a default value, first test to see if it is |
451 | +*covered* by all of the matchers before it, and if so, do not duplicate. |
452 | + |
453 | +This allows us to generate the next switch statement:: |
454 | + |
455 | + switch a: |
456 | + case A: |
457 | + <generate_matcher_code([n], [(1)-"block1", (3)-"block3", |
458 | + (_)-"block4", (5)-"block5"])> |
459 | + case B: |
460 | + <generate_matcher_code([n], [(2)-"block2", (_)-"block4"])> |
461 | + default: |
462 | + <generate_matcher_code([n], [(_)-"block4"])> |
463 | + |
464 | +Now that we have three cases, we iterate over all of them, and recurse three |
465 | +times. |
466 | + |
467 | +The assoc list for case A: |
468 | + |
469 | + 1: [()-"block1"] |
470 | + |
471 | + 3: [()-"block3"] |
472 | + |
473 | + _: [()-"block4"] |
474 | + |
475 | +(Applying the vertical rule to remove the entry "5: [()-"block5"]"). |
476 | + |
477 | +The assoc list for case B: |
478 | + |
479 | + 2: [()-"block2"] |
480 | + |
481 | + _: [()-"block4"] |
482 | + |
483 | +The assoc list for the default case: |
484 | + |
485 | + _: [()-"block4"] |
486 | + |
487 | +This brings us to seven base cases, of <generate_matcher_code([], |
488 | +[()-"blockn"])>, which just generates the goto statement to the given block. |
489 | + |
490 | +We end up with this code:: |
491 | + |
492 | + switch a: |
493 | + case A: |
494 | + switch n: |
495 | + case 1: goto "block1" |
496 | + case 3: goto "block3" |
497 | + default: goto "block4" |
498 | + case B: |
499 | + switch n: |
500 | + case 2: goto "block2" |
501 | + default: goto "block4" |
502 | + default: goto "block4" |
503 | + |
504 | +Then we generate all of the code for each block. Note that each block only has |
505 | +code generated once, but may be targeted by multiple case statements, such as |
506 | +block4. |
507 | + |
508 | +================ |
509 | +Further examples |
510 | +================ |
511 | + |
512 | +Above example, continued |
513 | +------------------------ |
514 | + |
515 | +Now, let us imagine a slightly different scenario, where we had these |
516 | +matchers: |
517 | + |
518 | + A: [(1,_)-"block1", (3,7)-"block3", (_,_)-"block4", (5,_)-"block5"] |
519 | + |
520 | +(Same as above for case A, but with an extra argument.) This means, by the |
521 | +rule above, we need to copy the default match for each concrete match. Hence, |
522 | +we generate the following grouping for case A: |
523 | + |
524 | + 1: [(_)-"block1", (_)-"block4"] |
525 | + |
526 | + 3: [(7)-"block3", (_)-"block4"] |
527 | + |
528 | + _: [(_)-"block4"] |
529 | + |
530 | + 5: [(_)-"block5", (_)-"block4"] |
531 | + |
532 | +This is inefficient, but perfectly correct. Operationally, it reads: "If you |
533 | +match 1, match all and go to block1, and if that fails, match all and go to |
534 | +block4." (Obviously, we will pick block1). Then, "If you match 3, match 7 and |
535 | +go to block3, or match all and go to block4". Then it says "Match all and go |
536 | +to block4." Finally, "If you match 5, match all and go to block5, and if that |
537 | +fails, match all and go to block4." There are two forms of redundancy here. |
538 | + |
539 | +First, once a default case occurs, all subsequent cases can be removed |
540 | +(preferably with a warning): |
541 | + |
542 | + 1: [(_)-"block1", (_)-"block4"] |
543 | + |
544 | + 3: [(7)-"block3", (_)-"block4"] |
545 | + |
546 | + _: [(_)-"block4"] |
547 | + |
548 | +Second, once a matcher which *cannot fail* is seen, all subsequent matchers |
549 | +can be removed: |
550 | + |
551 | + 1: [(_)-"block1"] |
552 | + |
553 | + 3: [(7)-"block3", (_)-"block4"] |
554 | + |
555 | + _: [(_)-"block4"] |
556 | + |
557 | +This still perfectly preserves the semantics of the above matcher list -- |
558 | +there is no possibility of going to block5, or block4 from a match of 1. |
559 | + |
560 | +However, we would like to generate a warning in this last case as well, which |
561 | +would be spurious if we duplicated any defaults then eliminated them again. So |
562 | +we need to avoid duplicating in the first place. |
563 | + |
564 | +Therefore: When duplicating a default value, first test to see if any of the |
565 | +matchers before it will always succeed, and if so, do not duplicate. |
566 | + |
567 | +Now, we can be even more judicious in preventing duplication. First, don't |
568 | +just assume a matcher set cannot fail if it has a default case -- also check |
569 | +for exhaustive ctor coverage. Secondly, it's not necessary for the matchers to |
570 | +be unable to fail, it is merely sufficient for them to cover the duplicated |
571 | +patterns. So: When duplicating a default value, first test to see if it is |
572 | +covered by all of the matchers before it, and if so, do not duplicate. |
573 | + |
574 | +Also: Do not think of it as "duplicating" the default values. Simpler to think |
575 | +of it as "projecting" away all values which don't match. So, include all exact |
576 | +matches and also default values. Then, eliminate all later matches which are |
577 | +covered by earlier matches, giving a warning iff the eliminated matches are |
578 | +exact (and therefore are useless). If they are default matches, then they |
579 | +aren't useless unless covered by all constructors on the outside -- they |
580 | +should have already been checked by the vertical rule. |
581 | + |
582 | +An example of this: |
583 | + |
584 | + [(1,A)-"block1", (1,B)-"block2", (2,A)-"block3", (_,B)-"block4", (_,_)-"block5"] |
585 | + |
586 | +Naively transformed: |
587 | + |
588 | + 1: [(A)-"block1", (B)-"block2", (B)-"block4", (_)-"block5"] |
589 | + |
590 | + 2: [(A)-"block3", (B)-"block4", (_)-"block5"] |
591 | + |
592 | + _: [(B)-"block4", (_)-"block5"] |
593 | + |
594 | +However, we didn't need to duplicate the (_,B) case to case 1, as it is |
595 | +covered by the (1,B) case. So: |
596 | + |
597 | + 1: [(A)-"block1", (B)-"block2", (_)-"block5"] |
598 | + |
599 | + 2: [(A)-"block3", (B)-"block4", (_)-"block5"] |
600 | + |
601 | + _: [(B)-"block4", (_)-"block5"] |
602 | + |
603 | +If A and B are the only ctors for the switch type, then it would not be |
604 | +necessary to copy the (_,_) case either if 1 or 2 is matched, as it is covered |
605 | +in both cases: |
606 | + |
607 | + 1: [(A)-"block1", (B)-"block2"] |
608 | + |
609 | + 2: [(A)-"block3", (B)-"block4"] |
610 | + |
611 | + _: [(B)-"block4", (_)-"block5"] |
612 | + |
613 | +Tricky variable binding example |
614 | +------------------------------- |
615 | + |
616 | +Consider this code:: |
617 | + |
618 | + switch x: |
619 | + case Pair(49, Box(3)) |
620 | + case Pair(m, Box(n)) |
621 | + |
622 | +This example is important because in this case, the binding for `n` depends on |
623 | +the path taken to the second block. This explains the need for Phi |
624 | +instructions when making variable bindings. In this example, the second case's |
625 | +`n` variable needs to be copied from a temporary unpack variable -- which |
626 | +variable will depend on whether the first number was 49. |
627 | + |
628 | +Args: [x] |
629 | + |
630 | +Binds: [] |
631 | + |
632 | +Matchers:: |
633 | + |
634 | + Pair, 49, Box, 3 - block1 |
635 | + Pair, m, Box, n - block2 |
636 | + |
637 | +Assoc:: |
638 | + |
639 | + Pair: [49, Box, 3]-block1, [m, Box, n]-block2 |
640 | + |
641 | +Codegen:: |
642 | + |
643 | + entry: |
644 | + a = x.Pair(0) |
645 | + b = x.Pair(1) |
646 | + <continue> |
647 | + |
648 | +---- |
649 | + |
650 | +Args: [a,b] |
651 | + |
652 | +Binds: [] |
653 | + |
654 | +Matchers:: |
655 | + |
656 | + 49, Box, 3 - block1 |
657 | + m, Box, n - block2 |
658 | + |
659 | +Bind m=a in block2's binding table. |
660 | + |
661 | +:: |
662 | + |
663 | + 49, Box, 3 - block1 |
664 | + _, Box, n - block2 |
665 | + |
666 | +Assoc list:: |
667 | + |
668 | + 49: [Box, 3]-block1, [Box, n]-block2 |
669 | + _: [Box, n]-block2 |
670 | + |
671 | +Codegen:: |
672 | + |
673 | + <continue entry> |
674 | + |
675 | + switch a: |
676 | + case 49: goto case49 |
677 | + default: goto default |
678 | + case49: |
679 | + <continue> |
680 | + default: |
681 | + <continue> |
682 | + |
683 | +---- |
684 | + |
685 | +**(case 49)** |
686 | + |
687 | +Args: [b] |
688 | + |
689 | +Binds: [(block2-m)=a] |
690 | + |
691 | +Matchers:: |
692 | + |
693 | + Box, 3 - block1 |
694 | + Box, n - block2 |
695 | + |
696 | +Assoc list:: |
697 | + |
698 | + Box: [3]-block1, [n]-block2 |
699 | + |
700 | +Codegen:: |
701 | + |
702 | + <continue case49> |
703 | + c = b.Box(0) |
704 | + <continue> |
705 | + |
706 | +---- |
707 | + |
708 | +**(case 49, Box)** |
709 | + |
710 | +Args: [c] |
711 | + |
712 | +Binds: [(block2-m)=a] |
713 | + |
714 | +Matchers:: |
715 | + |
716 | + 3 - block1 |
717 | + n - block2 |
718 | + |
719 | +Bind n=c in block2's binding table. |
720 | + |
721 | +.. note:: A phi is required, as the binding is different to that below. |
722 | + |
723 | +:: |
724 | + |
725 | + 3 - block1 |
726 | + _ - block2 |
727 | + |
728 | +Codegen:: |
729 | + |
730 | + <continue case49> |
731 | + switch c: |
732 | + case 3: goto block1 |
733 | + default: goto block2 |
734 | + |
735 | +.. note:: This isn't quite the base case. We could recurse once more, |
736 | + generating two new blocks, and then each block would consist |
737 | + entirely of a goto. However, this pre-base-case allows more |
738 | + efficient code generation. |
739 | + |
740 | +Write to phi table: (block2-m)=(case49-a) |
741 | + |
742 | +Write to phi table: (block2-n)=(case49-c) |
743 | + |
744 | +---- |
745 | + |
746 | +**(case default)** |
747 | + |
748 | +Args: [b] |
749 | + |
750 | +Binds: [(block2-m)=a] |
751 | + |
752 | +Matchers:: |
753 | + |
754 | + Box, n - block2 |
755 | + |
756 | +Assoc list:: |
757 | + |
758 | + Box: [n]-block2 |
759 | + |
760 | +Codegen:: |
761 | + |
762 | + <continue default> |
763 | + d = b.Box(0) |
764 | + <continue> |
765 | + |
766 | +---- |
767 | + |
768 | +**(case default, Box)** |
769 | + |
770 | +Args: [d] |
771 | + |
772 | +Binds: [(block2-m)=a] |
773 | + |
774 | +Matchers:: |
775 | + |
776 | + n - block2 |
777 | + |
778 | +Bind n=d in block2's binding table. |
779 | + |
780 | +.. note:: A phi is required, as the binding is different to that above. |
781 | + |
782 | +:: |
783 | + |
784 | + _ - block2 |
785 | + |
786 | +Assoc list:: |
787 | + |
788 | + _: []-block2 |
789 | + |
790 | +Codegen:: |
791 | + |
792 | + <continue default> |
793 | + goto block2 |
794 | + |
795 | +Write to phi table: (block2-m)=(default-a) |
796 | + |
797 | +Write to phi table: (block2-n)=(default-d) |
798 | + |
799 | +---- |
800 | + |
801 | +**(Final generated code)** |
802 | + |
803 | +Final phi table:: |
804 | + |
805 | + (block2-m)=[(case49-a), (default-a)] |
806 | + (block2-n)=[(case49-c), (default-d)] |
807 | + |
808 | +:: |
809 | + |
810 | + entry: |
811 | + a = x.Pair(0) |
812 | + b = x.Pair(1) |
813 | + switch a: |
814 | + case 49: goto case49 |
815 | + default: goto default |
816 | + |
817 | + case49: |
818 | + c = b.Box(0) |
819 | + switch c: |
820 | + case 3: goto block1 |
821 | + default: goto block2 |
822 | + |
823 | + default: |
824 | + d = b.Box(0) |
825 | + goto block2 |
826 | + |
827 | + block1: |
828 | + ... |
829 | + |
830 | + block2: |
831 | + n = phi(case49 = c, default = d) |
832 | + m = phi(case49 = a, default = a) |
833 | + ... |
834 | + |
835 | +.. note:: The phi node for m in block2 may be converted into a mov |
836 | + instruction, because the binding is the same for all predecessors. |
837 | |
838 | === modified file 'doc/index.rst' |
839 | --- doc/index.rst 2009-10-02 06:21:23 +0000 |
840 | +++ doc/index.rst 2010-01-25 07:28:13 +0000 |
841 | @@ -35,6 +35,7 @@ |
842 | ref/index.rst |
843 | lib/index.rst |
844 | faq.rst |
845 | + dev/index.rst |
846 | testing.rst |
847 | |
848 | Indices and tables |
849 | |
850 | === modified file 'doc/ref/stmts.rst' |
851 | --- doc/ref/stmts.rst 2009-02-05 07:36:52 +0000 |
852 | +++ doc/ref/stmts.rst 2010-01-25 07:28:13 +0000 |
853 | @@ -227,14 +227,6 @@ |
854 | proven either by having a wildcard pattern (underscore or variable), or by |
855 | matching all alternatives of a user-defined data type. |
856 | |
857 | -.. warning:: |
858 | - The current implementation does not obey this last rule. Currently, this |
859 | - check is made at runtime, not compile time. If no case of a switch |
860 | - statement matches, the program will terminate with a runtime error, |
861 | - "pattern match failure". |
862 | - Programmers should ensure that the above rule holds statically for all |
863 | - switch statements, as a future version may enforce this rule. |
864 | - |
865 | Note that the effects of variable binding outlast the :keyword:`switch` |
866 | statement. Their binding is like an |
867 | :ref:`assignment statement <ref-assignment-statements>`. |
868 | |
869 | === modified file 'src/ast_cfg.m' |
870 | --- src/ast_cfg.m 2010-01-25 01:23:14 +0000 |
871 | +++ src/ast_cfg.m 2010-01-25 07:28:13 +0000 |
872 | @@ -29,12 +29,13 @@ |
873 | :- interface. |
874 | |
875 | :- import_module ir. |
876 | +:- import_module tables. |
877 | |
878 | :- import_module list. |
879 | |
880 | % prog_to_cfg(!Program) converts all functions in a program to CFG form, |
881 | % calling func_to_cfg on each of them. |
882 | -:- pred prog_to_cfg(program::in, program::out) is det. |
883 | +:- pred prog_to_cfg(progtable::in, program::in, program::out) is det. |
884 | |
885 | % func_to_cfg(!Function) converts a function to CFG form if possible. |
886 | % Built-in functions and CFG functions remain the same. |
887 | @@ -44,7 +45,7 @@ |
888 | % - There exists a viable code path which does not return a value. |
889 | % - There is an attempt to read from a variable which may not have been |
890 | % assigned on some or all code paths. |
891 | -:- pred func_to_cfg(function::in, function::out) is det. |
892 | +:- pred func_to_cfg(progtable::in, function::in, function::out) is det. |
893 | |
894 | % expr_to_instrs(+Expr, -VarName, -Instrs). |
895 | % Converts a single expression into a sequence of low-level instructions |
896 | @@ -68,28 +69,35 @@ |
897 | :- implementation. |
898 | |
899 | :- import_module pair. |
900 | +:- import_module assoc_list. |
901 | :- import_module map. |
902 | +:- import_module multi_map. |
903 | +:- import_module svmap. |
904 | +:- import_module svmulti_map. |
905 | :- import_module maybe. |
906 | :- import_module string. |
907 | :- import_module int. |
908 | +:- import_module integer. |
909 | :- import_module exception. |
910 | |
911 | :- import_module cfg. |
912 | :- import_module context. |
913 | - |
914 | -prog_to_cfg(program(Nodes0), program(Nodes)) :- |
915 | - prog_nodes_to_cfg(Nodes0, Nodes). |
916 | - |
917 | -:- pred prog_nodes_to_cfg(list(program_node)::in, list(program_node)::out). |
918 | -prog_nodes_to_cfg([], []). |
919 | -prog_nodes_to_cfg([Node0|Nodes0], [Node|Nodes]) :- |
920 | +:- import_module util. |
921 | + |
922 | +prog_to_cfg(PT, program(Nodes0), program(Nodes)) :- |
923 | + prog_nodes_to_cfg(PT, Nodes0, Nodes). |
924 | + |
925 | +:- pred prog_nodes_to_cfg(progtable::in, |
926 | + list(program_node)::in, list(program_node)::out). |
927 | +prog_nodes_to_cfg(_, [], []). |
928 | +prog_nodes_to_cfg(PT, [Node0|Nodes0], [Node|Nodes]) :- |
929 | ( Node0 = pfunction(Func0) -> |
930 | - func_to_cfg(Func0, Func), |
931 | + func_to_cfg(PT, Func0, Func), |
932 | Node = pfunction(Func) |
933 | ; |
934 | Node = Node0 |
935 | ), |
936 | - prog_nodes_to_cfg(Nodes0, Nodes). |
937 | + prog_nodes_to_cfg(PT, Nodes0, Nodes). |
938 | |
939 | :- pred new_def_map(def_map::out) is det. |
940 | new_def_map(Map) :- |
941 | @@ -111,7 +119,7 @@ |
942 | def_map_insert_local(Name-_, !DefMap) :- |
943 | map.det_insert(!.DefMap, Name, unbound, !:DefMap). |
944 | |
945 | -func_to_cfg(!Func) :- |
946 | +func_to_cfg(PT, !Func) :- |
947 | % Create a new defmap with all of the params. |
948 | new_def_map(DefMap0), |
949 | Params = !.Func ^ func_params, |
950 | @@ -123,28 +131,28 @@ |
951 | Params = yes(YesParams), |
952 | foldl(def_map_insert_param, YesParams, DefMap0, DefMap) |
953 | ), |
954 | - func_body_to_cfg(DefMap, !.Func ^ func_context, !.Func ^ func_body, |
955 | + func_body_to_cfg(PT, DefMap, !.Func ^ func_context, !.Func ^ func_body, |
956 | CFGBody), |
957 | !:Func = !.Func ^ func_body := CFGBody. |
958 | |
959 | % Note: def_map argument should have all of the function's parameters already |
960 | % bound in the map. |
961 | -:- pred func_body_to_cfg(def_map::in, context::in, |
962 | +:- pred func_body_to_cfg(progtable::in, def_map::in, context::in, |
963 | func_body::in, func_body::out) is det. |
964 | -func_body_to_cfg(_DefMap, _Ctx, B@func_builtin, B). |
965 | -func_body_to_cfg(_DefMap, _Ctx, B@func_body_cfg(_,_), B). |
966 | -func_body_to_cfg(DefMap0, Ctx, func_body_ast(Decls,Stmts), NewFuncBody) :- |
967 | +func_body_to_cfg(_PT, _DefMap, _Ctx, B@func_builtin, B). |
968 | +func_body_to_cfg(_PT, _DefMap, _Ctx, B@func_body_cfg(_,_), B). |
969 | +func_body_to_cfg(PT, DefMap0, Ctx, func_body_ast(Decls,Stmts), NewFuncBody) :- |
970 | NewFuncBody = 'new func_body_cfg'(Decls, CFG), |
971 | % Insert all of the function's locals into the defmap, unbound |
972 | foldl(def_map_insert_local, Decls, DefMap0, DefMap), |
973 | - ast_to_cfg(DefMap, Ctx, Stmts, CFG). |
974 | + ast_to_cfg(PT, DefMap, Ctx, Stmts, CFG). |
975 | |
976 | % ast_to_cfg(DefMap, Stmts, CFG) converts a statement block (AST form) to a |
977 | % control flow graph, translating the code. |
978 | % This also performs error checking (see func_to_cfg). |
979 | -:- some [S] pred ast_to_cfg(def_map::in, context::in, stmt_block::in, |
980 | - cfg(S)::out) is det. |
981 | -ast_to_cfg(DefMap, Ctx, Stmts, CFG) :- |
982 | +:- some [S] pred ast_to_cfg(progtable::in, def_map::in, context::in, |
983 | + stmt_block::in, cfg(S)::out) is det. |
984 | +ast_to_cfg(PT, DefMap, Ctx, Stmts, CFG) :- |
985 | some [!CFG] |
986 | ( |
987 | % Need to create the entry block and exit block so that |
988 | @@ -153,8 +161,8 @@ |
989 | !:CFG = cfg.new_cfg, |
990 | Entry = cfg.get_entry(!.CFG), |
991 | Exit = cfg.get_exit(!.CFG), |
992 | - stmt_block_to_cfg(Stmts, Entry, Exit, DefMap, map.init, AfterPredMap, |
993 | - map.init, ExitPredMap, !CFG), |
994 | + stmt_block_to_cfg(PT, Stmts, Entry, Exit, DefMap, |
995 | + map.init,AfterPredMap, map.init, ExitPredMap, !CFG), |
996 | % The union of AfterPredMap and ExitPredMap contains information about |
997 | % all predecessors to Exit. Use this to compute phi instructions for |
998 | % Exit. |
999 | @@ -464,38 +472,39 @@ |
1000 | % for the purpose of computing phis. |
1001 | % Throws a context error if there is an attempt to read from a variable which |
1002 | % may not have been assigned on some or all code paths. |
1003 | -:- pred stmt_block_to_cfg(stmt_block::in, bbref(S)::in, bbref(S)::in, |
1004 | - def_map::in, pred_map(S)::in, pred_map(S)::out, |
1005 | +:- pred stmt_block_to_cfg(progtable::in, stmt_block::in, bbref(S)::in, |
1006 | + bbref(S)::in, def_map::in, pred_map(S)::in, pred_map(S)::out, |
1007 | pred_map(S)::in, pred_map(S)::out, cfg(S)::in, cfg(S)::out) |
1008 | is det. |
1009 | -stmt_block_to_cfg(Stmts, BBCurrent, BBAfter, DefMap, !AfterPredMap, |
1010 | +stmt_block_to_cfg(PT, Stmts, BBCurrent, BBAfter, DefMap, !AfterPredMap, |
1011 | !ExitPredMap, !CFG) :- |
1012 | map.init(SubscriptMap), |
1013 | - stmt_block_to_cfg(Stmts, BBCurrent, BBAfter, SubscriptMap, DefMap, |
1014 | + stmt_block_to_cfg(PT, Stmts, BBCurrent, BBAfter, SubscriptMap, DefMap, |
1015 | !AfterPredMap, !ExitPredMap, !CFG). |
1016 | |
1017 | -:- pred stmt_block_to_cfg(stmt_block::in, bbref(S)::in, bbref(S)::in, |
1018 | +:- pred stmt_block_to_cfg(progtable::in, stmt_block::in, bbref(S)::in, |
1019 | + bbref(S)::in, |
1020 | subscript_map::in, def_map::in, pred_map(S)::in, pred_map(S)::out, |
1021 | pred_map(S)::in, pred_map(S)::out, cfg(S)::in, cfg(S)::out) is det. |
1022 | -stmt_block_to_cfg([], BBCurrent, BBAfter, _SubscriptMap, DefMap, |
1023 | +stmt_block_to_cfg(_PT, [], BBCurrent, BBAfter, _SubscriptMap, DefMap, |
1024 | !AfterPredMap, !ExitPredMap, !CFG) :- |
1025 | % Special case for empty block - just branch to BBAfter. |
1026 | add_to_pred_map(BBCurrent, DefMap, BBAfter, !AfterPredMap, !CFG), |
1027 | cfg.set_terminator(BBCurrent, branch(BBAfter, blank_context), !CFG). |
1028 | -stmt_block_to_cfg([S], BBCurrent, BBAfter, SubscriptMap, DefMap, |
1029 | +stmt_block_to_cfg(PT, [S], BBCurrent, BBAfter, SubscriptMap, DefMap, |
1030 | !AfterPredMap, !ExitPredMap, !CFG) :- |
1031 | % Base-case for the last statement in a block - compile statement into |
1032 | % BBCurrent, but end up by branching to BBAfter. |
1033 | - stmt_to_cfg_last(S, BBCurrent, BBAfter, SubscriptMap, DefMap, |
1034 | + stmt_to_cfg_last(PT, S, BBCurrent, BBAfter, SubscriptMap, DefMap, |
1035 | !AfterPredMap, !ExitPredMap, !CFG). |
1036 | -stmt_block_to_cfg([S|Ss@[_|_]], BBCurrent, BBAfter, SubscriptMap0, DefMap0, |
1037 | +stmt_block_to_cfg(PT, [S|Ss@[_|_]], BBCurrent, BBAfter, SubscriptMap0,DefMap0, |
1038 | !AfterPredMap, !ExitPredMap, !CFG) :- |
1039 | % Compile a single statement into BBCurrent. BBContinue is the block to |
1040 | % compile subsequent statements into (may be equal to BBCurrent). |
1041 | - stmt_to_cfg(S, BBCurrent, BBContinue, SubscriptMap0, SubscriptMap, |
1042 | + stmt_to_cfg(PT, S, BBCurrent, BBContinue, SubscriptMap0, SubscriptMap, |
1043 | DefMap0, DefMap, !ExitPredMap, !CFG), |
1044 | % Compile subsequent statements into BBContinue. |
1045 | - stmt_block_to_cfg(Ss, BBContinue, BBAfter, SubscriptMap, DefMap, |
1046 | + stmt_block_to_cfg(PT, Ss, BBContinue, BBAfter, SubscriptMap, DefMap, |
1047 | !AfterPredMap, !ExitPredMap, !CFG). |
1048 | |
1049 | %%% HIGH-LEVEL STATEMENT-COMPILATION PREDICATES %%% |
1050 | @@ -514,18 +523,18 @@ |
1051 | % will be completed (have a terminator set), and SubscriptMap will be reset |
1052 | % to refer to the new block (and typically be empty). |
1053 | % May also update def_map with any new variable definitions. |
1054 | -:- pred stmt_to_cfg(stmt::in, bbref(S)::in, bbref(S)::out, |
1055 | +:- pred stmt_to_cfg(progtable::in, stmt::in, bbref(S)::in, bbref(S)::out, |
1056 | subscript_map::in, subscript_map::out, def_map::in, def_map::out, |
1057 | pred_map(S)::in, pred_map(S)::out, cfg(S)::in, cfg(S)::out) is det. |
1058 | -stmt_to_cfg(basic_stmt(S,Ctx), !BBRef, !SubscriptMap, !DefMap, !ExitPredMap, |
1059 | - !CFG) :- |
1060 | +stmt_to_cfg(_PT, basic_stmt(S,Ctx), !BBRef, !SubscriptMap, !DefMap, |
1061 | + !ExitPredMap, !CFG) :- |
1062 | basic_stmt_to_cfg(S, Ctx, !.BBRef, !SubscriptMap, !DefMap, !CFG). |
1063 | -stmt_to_cfg(compound_stmt(S,Ctx), BBCurrent, BBAfter, !SubscriptMap, |
1064 | +stmt_to_cfg(PT, compound_stmt(S,Ctx), BBCurrent, BBAfter, !SubscriptMap, |
1065 | !DefMap, !ExitPredMap, !CFG) :- |
1066 | % Generate a target block for the compound statement to branch to |
1067 | cfg.new_basic_block(BBAfter, !CFG), |
1068 | - compound_stmt_to_cfg(S, Ctx, BBCurrent, BBAfter, !.SubscriptMap, !.DefMap, |
1069 | - map.init, AfterPredMap, !ExitPredMap, !CFG), |
1070 | + compound_stmt_to_cfg(PT, S, Ctx, BBCurrent, BBAfter, !.SubscriptMap, |
1071 | + !.DefMap, map.init, AfterPredMap, !ExitPredMap, !CFG), |
1072 | % AfterPredMap contains information about all predecessors to BBAfter. |
1073 | % Use this to compute phi instructions for BBAfter and update DefMap. |
1074 | AfterID = ref_id(BBAfter, !.CFG), |
1075 | @@ -544,19 +553,19 @@ |
1076 | % for the purpose of computing phis. |
1077 | % May update DefMap with new bindings, but SubscriptMap is read-only (since it |
1078 | % will definitely branch to a new block). |
1079 | -:- pred stmt_to_cfg_last(stmt::in, bbref(S)::in, bbref(S)::in, |
1080 | +:- pred stmt_to_cfg_last(progtable::in, stmt::in, bbref(S)::in, bbref(S)::in, |
1081 | subscript_map::in, def_map::in, pred_map(S)::in, pred_map(S)::out, |
1082 | pred_map(S)::in, pred_map(S)::out, cfg(S)::in, cfg(S)::out) is det. |
1083 | -stmt_to_cfg_last(basic_stmt(S,Ctx), BBCurrent, BBAfter, SubscriptMap, |
1084 | +stmt_to_cfg_last(_PT, basic_stmt(S,Ctx), BBCurrent, BBAfter, SubscriptMap, |
1085 | DefMap, !AfterPredMap, !ExitPredMap, !CFG) :- |
1086 | basic_stmt_to_cfg(S, Ctx, BBCurrent, SubscriptMap, _, DefMap, NewDefMap, |
1087 | !CFG), |
1088 | % Branch to the provided BBAfter |
1089 | add_to_pred_map(BBCurrent, NewDefMap, BBAfter, !AfterPredMap, !CFG), |
1090 | cfg.set_terminator(BBCurrent, branch(BBAfter, Ctx), !CFG). |
1091 | -stmt_to_cfg_last(compound_stmt(S,Ctx), BBCurrent, BBAfter, SubscriptMap, |
1092 | +stmt_to_cfg_last(PT, compound_stmt(S,Ctx), BBCurrent, BBAfter, SubscriptMap, |
1093 | DefMap, !AfterPredMap, !ExitPredMap, !CFG) :- |
1094 | - compound_stmt_to_cfg(S, Ctx, BBCurrent, BBAfter, SubscriptMap, |
1095 | + compound_stmt_to_cfg(PT, S, Ctx, BBCurrent, BBAfter, SubscriptMap, |
1096 | DefMap, !AfterPredMap, !ExitPredMap, !CFG). |
1097 | |
1098 | %%% LOW-LEVEL STATEMENT-COMPILATION PREDICATES %%% |
1099 | @@ -729,11 +738,11 @@ |
1100 | % SubscriptMap is read-only (since it will definitely branch to a new block). |
1101 | % DefMap is read-only (the resulting def-maps should be appended to |
1102 | % AfterPredMap). |
1103 | -:- pred compound_stmt_to_cfg(compound_stmt::in, context::in, bbref(S)::in, |
1104 | - bbref(S)::in, subscript_map::in, def_map::in, |
1105 | +:- pred compound_stmt_to_cfg(progtable::in, compound_stmt::in, context::in, |
1106 | + bbref(S)::in, bbref(S)::in, subscript_map::in, def_map::in, |
1107 | pred_map(S)::in, pred_map(S)::out, pred_map(S)::in, pred_map(S)::out, |
1108 | cfg(S)::in, cfg(S)::out) is det. |
1109 | -compound_stmt_to_cfg(return(Expr), Ctx, BBCurrent, _BBAfter, SubscriptMap, |
1110 | +compound_stmt_to_cfg(_PT,return(Expr), Ctx, BBCurrent, _BBAfter, SubscriptMap, |
1111 | DefMap0, !AfterPredMap, !ExitPredMap, !CFG) :- |
1112 | % Return is expressed in CFG as $RET = expr, followed by a branch to the |
1113 | % exit block. |
1114 | @@ -742,17 +751,26 @@ |
1115 | ExitBlock = cfg.get_exit(!.CFG), |
1116 | add_to_pred_map(BBCurrent, DefMap, ExitBlock, !ExitPredMap, !CFG), |
1117 | cfg.set_terminator(BBCurrent, branch(ExitBlock, Ctx), !CFG). |
1118 | -compound_stmt_to_cfg(switch(Ctrl0, Cases0), Ctx, BBCurrent, BBAfter, |
1119 | +compound_stmt_to_cfg(PT, switch(Ctrl0, Cases0), Ctx, BBCurrent, BBAfter, |
1120 | SubscriptMap, DefMap, !AfterPredMap, !ExitPredMap, !CFG) :- |
1121 | % Calculate the condition expression at the end of BBCurrent |
1122 | apply_def_map_to_expr(DefMap, Ctx, Ctrl0, Ctrl), |
1123 | BlockID = ref_id(BBCurrent, !.CFG), |
1124 | expr_to_instrs(BlockID, Ctx, Ctrl, CtrlVar, CtrlInstrs, SubscriptMap, _), |
1125 | cfg.append_instrs(BBCurrent, CtrlInstrs, !CFG), |
1126 | - list.map_foldl3(case_stmt_to_cfg(BBCurrent,BBAfter,SubscriptMap,DefMap), |
1127 | + % Convert all patterns to use SSA variables, and generate CFG code for the |
1128 | + % bodies of all case statements. The resulting Cases contains goto |
1129 | + % statements for each generated block. |
1130 | + list.map_foldl3(case_stmt_to_jmp(PT,BBCurrent,BBAfter,SubscriptMap,DefMap), |
1131 | Cases0, Cases, !AfterPredMap, !ExitPredMap, !CFG), |
1132 | - cfg.set_terminator(BBCurrent, t_switch(CtrlVar, Cases, Ctx), !CFG). |
1133 | -compound_stmt_to_cfg(if_then_else(Cond0,ThenPart,ElsePart), Ctx, BBCurrent, |
1134 | + % Now begin switch factoring transformation |
1135 | + % Convert each case statement to a matcher (flatten the patterns) |
1136 | + Matchers = list.map(case_to_matcher, Cases), |
1137 | + % Generate all switch code. This terminates the current block. |
1138 | + generate_matcher_code(PT, [CtrlVar], Matchers, map.init, BBCurrent, |
1139 | + SubscriptMap, Ctx, map.init, PhiTable, !CFG), |
1140 | + map.foldl(generate_case_block_phis, PhiTable, !CFG). |
1141 | +compound_stmt_to_cfg(PT, if_then_else(Cond0,ThenPart,ElsePart), Ctx,BBCurrent, |
1142 | BBAfter, SubscriptMap, DefMap, !AfterPredMap, !ExitPredMap, !CFG) :- |
1143 | % Calculate the condition expression at the end of BBCurrent |
1144 | apply_def_map_to_expr(DefMap, Ctx, Cond0, Cond), |
1145 | @@ -762,7 +780,7 @@ |
1146 | % Create a new basic block for the "then" part. |
1147 | cfg.new_basic_block(ThenEnter, !CFG), |
1148 | cfg.append_predecessor(ThenEnter, BBCurrent, !CFG), |
1149 | - stmt_block_to_cfg(ThenPart, ThenEnter, BBAfter, DefMap, !AfterPredMap, |
1150 | + stmt_block_to_cfg(PT, ThenPart, ThenEnter, BBAfter, DefMap, !AfterPredMap, |
1151 | !ExitPredMap, !CFG), |
1152 | ( list.is_empty(ElsePart) -> |
1153 | % Branch straight to the "after" block if the condition fails |
1154 | @@ -773,14 +791,14 @@ |
1155 | % Create a new basic block for the "else" part. |
1156 | cfg.new_basic_block(ElseEnter, !CFG), |
1157 | cfg.append_predecessor(ElseEnter, BBCurrent, !CFG), |
1158 | - stmt_block_to_cfg(ElsePart, ElseEnter, BBAfter, DefMap, !AfterPredMap, |
1159 | - !ExitPredMap, !CFG), |
1160 | + stmt_block_to_cfg(PT, ElsePart, ElseEnter, BBAfter, DefMap, |
1161 | + !AfterPredMap, !ExitPredMap, !CFG), |
1162 | % Have the current block terminate by branching into the then/else |
1163 | % blocks |
1164 | cfg.set_terminator(BBCurrent, |
1165 | cond_branch(CondVar,ThenEnter,ElseEnter,Ctx), !CFG) |
1166 | ). |
1167 | -compound_stmt_to_cfg(while(Cond0, Body), Ctx, BBCurrent, BBAfter, |
1168 | +compound_stmt_to_cfg(PT, while(Cond0, Body), Ctx, BBCurrent, BBAfter, |
1169 | SubscriptMap, DefMap, !AfterPredMap, !ExitPredMap, !CFG) :- |
1170 | % Create a new basic block for the condition and body |
1171 | cfg.new_basic_block(CondBlock, !CFG), |
1172 | @@ -800,7 +818,7 @@ |
1173 | |
1174 | % Do the whole fixpoint. Updates CFG by setting CondBlock's phis, and |
1175 | % compiling Body. |
1176 | - while_fixpoint(CondBlock, BodyEnter, Body, CondBlockPredMap, |
1177 | + while_fixpoint(PT, CondBlock, BodyEnter, Body, CondBlockPredMap, |
1178 | !ExitPredMap, DefMap, CondBlockDefMap, !CFG), |
1179 | |
1180 | % CondBlock branches to BodyBlock or BBAfter, based on the condition |
1181 | @@ -825,22 +843,24 @@ |
1182 | % Updates the CFG as follows: |
1183 | % - Sets the phis of CondBlock to the computed phi nodes. |
1184 | % - Compiles the body into BodyEnter, branching to CondBlock. |
1185 | -:- pred while_fixpoint(bbref(S)::in, bbref(S)::in, stmt_block::in, |
1186 | +:- pred while_fixpoint(progtable::in, bbref(S)::in, bbref(S)::in, |
1187 | + stmt_block::in, |
1188 | pred_map(S)::in, pred_map(S)::in, pred_map(S)::out, |
1189 | def_map::in, def_map::out, cfg(S)::in, cfg(S)::out) is det. |
1190 | -while_fixpoint(CondBlock, BodyEnter, Body, CondPredMap, !ExitPredMap, |
1191 | +while_fixpoint(PT, CondBlock, BodyEnter, Body, CondPredMap, !ExitPredMap, |
1192 | !DefMap, !CFG) :- |
1193 | % First iteration, assume there are no phis |
1194 | - while_fixpoint_(CondBlock, BodyEnter, Body, CondPredMap, [], !ExitPredMap, |
1195 | - !DefMap, !CFG). |
1196 | + while_fixpoint_(PT, CondBlock, BodyEnter, Body, CondPredMap, [], |
1197 | + !ExitPredMap, !DefMap, !CFG). |
1198 | |
1199 | -:- pred while_fixpoint_(bbref(S)::in, bbref(S)::in, stmt_block::in, |
1200 | +:- pred while_fixpoint_(progtable::in, bbref(S)::in, bbref(S)::in, |
1201 | + stmt_block::in, |
1202 | pred_map(S)::in, list(phi(S))::in, pred_map(S)::in, pred_map(S)::out, |
1203 | def_map::in, def_map::out, cfg(S)::in, cfg(S)::out) is det. |
1204 | -while_fixpoint_(CondBlock, BodyEnter, Body, CondPredMap0, Phis0, |
1205 | +while_fixpoint_(PT, CondBlock, BodyEnter, Body, CondPredMap0, Phis0, |
1206 | !ExitPredMap, !CondDefMap, !CFG) :- |
1207 | % Compute CondPredMap (after another iteration of the body) |
1208 | - stmt_block_to_cfg(Body, BodyEnter, CondBlock, !.CondDefMap, |
1209 | + stmt_block_to_cfg(PT, Body, BodyEnter, CondBlock, !.CondDefMap, |
1210 | CondPredMap0, CondPredMap, !ExitPredMap, !.CFG, PotentialFinalCFG), |
1211 | % Now we can compute the phis for the cond block |
1212 | CondBlockID = ref_id(CondBlock, !.CFG), |
1213 | @@ -853,13 +873,416 @@ |
1214 | !:CondDefMap = CondDefMap1 |
1215 | ; |
1216 | % Now iterate till fixpoint |
1217 | - while_fixpoint_(CondBlock, BodyEnter, Body, CondPredMap, Phis, |
1218 | + while_fixpoint_(PT, CondBlock, BodyEnter, Body, CondPredMap, Phis, |
1219 | !ExitPredMap, CondDefMap1, !:CondDefMap, !CFG) |
1220 | ). |
1221 | |
1222 | -% case_stmt_to_cfg(+BBAfter, +SubscriptMap, +DefMap, +CaseStmt, -CFGCaseStmt, |
1223 | +% Intermediate representation of a case statement -- still contains recursive |
1224 | +% patterns, but branches to block references rather than containing AST code. |
1225 | +% (This is half-way between a case_stmt and a matcher.) |
1226 | +:- type jmp_case_stmt(S) |
1227 | + ---> jmp_case_stmt(pattern, cfg.bbref(S), context.context). |
1228 | + |
1229 | +% Intermediate representation for switch statements being factored |
1230 | +% A "matcher" is like a flattened case statement. Rather than having just a |
1231 | +% single pattern, it has a list of matches, which represent the pattern heads |
1232 | +% which need to be matched, in order, in a depth-first left-to-right traversal |
1233 | +% of the pattern tree. |
1234 | +:- type matcher(S) |
1235 | + ---> matcher(list(match), cfg.bbref(S)). |
1236 | +% A match is the head of a pattern -- same as a pattern but ctors aren't |
1237 | +% recursive. |
1238 | +:- type match |
1239 | + % "_" pattern (don't care / wildcard) |
1240 | + ---> match_any |
1241 | + % Variable. Matches anything, and binds a new variable of the |
1242 | + % given name. |
1243 | + ; match_var(varname) |
1244 | + % Constructor match. Matches only values with the given ctor |
1245 | + % name and arity. |
1246 | + ; match_ctor(string, int) |
1247 | + % Integer literal. Matches exactly this value. |
1248 | + ; match_intlit(integer). |
1249 | + |
1250 | +% Binding table for switch generation. Determines, for a given target block |
1251 | +% and bound variable name, which source variable holds the value of the bound |
1252 | +% variable. |
1253 | +:- type binding_table(S) == map(cfg.bbref(S), map(varname, varname)). |
1254 | + |
1255 | +% Phi table for switch generation. Dynamically-constructed table of phi nodes |
1256 | +% for blocks which are targets of case statements. |
1257 | +:- type phi_table(S) == map(cfg.bbref(S), |
1258 | + multi_map(varname, pair(cfg.bbref(S), varname))). |
1259 | + |
1260 | +% Create a new entry in the phi table. |
1261 | +:- pred create_phi_table_entry(bbref(S)::in, varname::in, bbref(S)::in, |
1262 | + varname::in, phi_table(S)::in, phi_table(S)::out) is det. |
1263 | +create_phi_table_entry(Target, BoundVar, SourceBlock, SourceVar, !PhiTable) :- |
1264 | + ( map.search(!.PhiTable, Target, TPM) -> |
1265 | + TargetPhiMap0 = TPM |
1266 | + ; |
1267 | + TargetPhiMap0 = multi_map.init |
1268 | + ), |
1269 | + Source = SourceBlock-SourceVar, |
1270 | + multi_map.add(TargetPhiMap0, BoundVar, Source, TargetPhiMap), |
1271 | + svmap.set(Target, TargetPhiMap, !PhiTable). |
1272 | + |
1273 | +% Flatten a case statement out to a "matcher" |
1274 | +:- func case_to_matcher(jmp_case_stmt(S)) = matcher(S). |
1275 | +case_to_matcher(jmp_case_stmt(Pat, BBRef, _)) = matcher(Matches, BBRef) :- |
1276 | + Matches = pattern_to_matches(Pat). |
1277 | + |
1278 | +% Flatten a pattern out to a list of matches |
1279 | +:- func pattern_to_matches(pattern) = list(match). |
1280 | +pattern_to_matches(Pat0) = Matches :- |
1281 | + Pat = Pat0 ^ pat_pattern, |
1282 | + % Convert the pattern Pat into a flat list of matches, Matches |
1283 | + ( |
1284 | + Pat = pat_any, |
1285 | + Matches = [match_any] |
1286 | + ; |
1287 | + Pat = pat_var(Varname), |
1288 | + Matches = [match_var(Varname)] |
1289 | + ; |
1290 | + % Nullary constructor -- just treat as if 0-ary |
1291 | + Pat = pat_ctor(Name, no), |
1292 | + Matches = [match_ctor(Name, 0)] |
1293 | + ; |
1294 | + Pat = pat_ctor(Name, yes(Patterns)), |
1295 | + FirstMatch = match_ctor(Name, list.length(Patterns)), |
1296 | + % Need to recurse, and subsequently match all sub-patterns |
1297 | + % Generate matches for each sub-pattern, then condense all into one |
1298 | + % continuous list. |
1299 | + SubMatches = list.condense(list.map(pattern_to_matches, Patterns)), |
1300 | + % Place the first match at the front, then the subsequent matches. |
1301 | + Matches = [FirstMatch|SubMatches] |
1302 | + ; |
1303 | + Pat = pat_intlit(Val), |
1304 | + Matches = [match_intlit(Val)] |
1305 | + ). |
1306 | + |
1307 | +% generate_matcher_code(ArgList, MatcherList, BindingTable, BBCurrent, |
1308 | +% SubscriptMap, SwitchCtx, !PhiTable, !CFG). |
1309 | +% Generates switch code for a list of matchers. ArgList is a list of variable |
1310 | +% names which will be matched in order against the patterns in each matcher of |
1311 | +% MatcherList. |
1312 | +% The code is appended to block BBCurrent (may also generate new blocks). |
1313 | +% (The matches in each matcher of MatcherList may be longer than ArgList, as a |
1314 | +% top-level variable may occupy many elements of MatcherList). |
1315 | +% This algorithm is explained in detail in doc/dev/switchfactor. |
1316 | +:- pred generate_matcher_code(progtable::in, list(varname)::in, |
1317 | + list(matcher(S))::in, |
1318 | + binding_table(S)::in, bbref(S)::in, subscript_map::in, context::in, |
1319 | + phi_table(S)::in, phi_table(S)::out, cfg(S)::in, cfg(S)::out) is det. |
1320 | +generate_matcher_code(_PT, [], MatcherList, BindingTable, BBCurrent, |
1321 | + _SubscriptMap, Ctx, !PhiTable, !CFG) :- |
1322 | + % Base case -- No more args to switch on. |
1323 | + % This generates a branch, which is inefficient code. This should usually |
1324 | + % be handled on the recursive call, meaning that you should usually not |
1325 | + % get here at all. |
1326 | + % Expect that the first item in MatcherList has 0 matches, so just branch |
1327 | + % to it. |
1328 | + ( |
1329 | + MatcherList = [], |
1330 | + throw("ast_cfg.generate_matcher_code: No matchers remaining.") |
1331 | + ; |
1332 | + MatcherList = [matcher([_|_], _)|_], |
1333 | + throw("ast_cfg.generate_matcher_code: " |
1334 | + ++ "ArgList empty, matcher not empty.") |
1335 | + ; |
1336 | + MatcherList = [matcher([], Target)|_] |
1337 | + ), |
1338 | + % Just generate a branch to the Target of the first matcher |
1339 | + cfg.append_predecessor(Target, BBCurrent, !CFG), |
1340 | + cfg.set_terminator(BBCurrent, branch(Target, Ctx), !CFG), |
1341 | + % Create entries in the phi table for each entry in the binding table |
1342 | + ( map.search(BindingTable, Target, TargetBinding) -> |
1343 | + map.foldl((pred(BoundVar::in,SourceVar::in,Phi0::in,Phi::out) is det:- |
1344 | + % For each binding table entry (BoundVar-SourceVar), create a |
1345 | + % phi table entry (Target-BoundVar) = (BBCurrent-SourceVar). |
1346 | + create_phi_table_entry(Target, BoundVar, BBCurrent, SourceVar, |
1347 | + Phi0, Phi) |
1348 | + ), TargetBinding, !PhiTable) |
1349 | + ; |
1350 | + true |
1351 | + ). |
1352 | +generate_matcher_code(PT, [MatchVar|RestVars], MatcherList0, BindingTable0, |
1353 | + BBCurrent, SubscriptMap0, Ctx, !PhiTable, !CFG) :- |
1354 | + % Generate code for matching MatchVar against the first pattern in each |
1355 | + % matcher of MatcherList, then recurse. |
1356 | + |
1357 | + % Step 1 - Store explicit variable bindings |
1358 | + % If the first argument of a matcher is match_var, add a variable binding |
1359 | + % instruction to its target block. |
1360 | + % Modify MatcherList, replacing all initial match_vars with match_any. |
1361 | + list.map_foldl(matcher_variable_binding(MatchVar), |
1362 | + MatcherList0, MatcherList, BindingTable0, BindingTable), |
1363 | + |
1364 | + % Step 2 - Create association list |
1365 | + % The values we're going to switch on |
1366 | + ValuesOfInterest = list.remove_dups(list.map(matcher_first_match, |
1367 | + MatcherList)), |
1368 | + AssocList = list.map((func(K) = K-select_matchers(K, MatcherList)), |
1369 | + ValuesOfInterest), |
1370 | + |
1371 | + % Test for incomplete switch (error case) |
1372 | + switch_exhaustion_check(PT, ValuesOfInterest, Ctx), |
1373 | + |
1374 | + % Step 3 - Generate switch statement |
1375 | + % We create exactly one case (incl. default) for each value of interest |
1376 | + % This also gets AssocListBlocks, which has a third column, the block ID, |
1377 | + % and also has any entries after the first "_" removed. |
1378 | + ( (list.length(AssocList) = 1 ; AssocList = [match_any-_|_]) -> |
1379 | + % Special case -- Only one case or first case is default |
1380 | + % Do not generate a switch statement. Just continue on from the end of |
1381 | + % the current block. |
1382 | + Match-Matchers = list.det_head(AssocList), |
1383 | + AssocListBlocks = [{Match, Matchers, BBCurrent}], |
1384 | + % The target block is the same as the current one -- keep the same |
1385 | + % subscript map |
1386 | + SubscriptMap = SubscriptMap0 |
1387 | + ; |
1388 | + % Generate the switch statement |
1389 | + generate_case_stmts(BBCurrent, AssocList, Cases, Default, |
1390 | + AssocListBlocks, !CFG), |
1391 | + % Write the switch statement to the current block |
1392 | + cfg.set_terminator(BBCurrent, t_switch(MatchVar, Cases, Default, Ctx), |
1393 | + !CFG), |
1394 | + % All of the new blocks are fresh, so the target block's subscript map |
1395 | + % is empty |
1396 | + SubscriptMap = map.init |
1397 | + ), |
1398 | + |
1399 | + % Step 4 - Write implicit unpacking instructions to each block and recurse |
1400 | + % Start the new block with a fresh subscript map |
1401 | + % (This calls generate_matcher_code for each target block). |
1402 | + list.foldl2(write_unpack_instructions(PT,MatchVar, RestVars, BindingTable, |
1403 | + SubscriptMap, Ctx), |
1404 | + AssocListBlocks, !PhiTable, !CFG), |
1405 | + |
1406 | + true. |
1407 | + |
1408 | +% matcher_variable_binding(MatchVar, !Matcher, !BindingTable). |
1409 | +% If Matcher's first match is a match_var, creates a binding from MatchVar to |
1410 | +% the given variable name, in the Matcher's target block. Also updates |
1411 | +% Matcher, replacing the match_var with match_any. |
1412 | +:- pred matcher_variable_binding(varname::in, matcher(S)::in, matcher(S)::out, |
1413 | + binding_table(S)::in, binding_table(S)::out) is det. |
1414 | +matcher_variable_binding(MatchVar, !Matcher, !BindingTable) :- |
1415 | + ( !.Matcher = matcher([match_var(BindVar)|Rest], TargetBB) -> |
1416 | + % Add a binding table entry for TargetBB |
1417 | + ( map.search(!.BindingTable, TargetBB, BlockTable_) -> |
1418 | + BlockTable0 = BlockTable_ |
1419 | + ; |
1420 | + BlockTable0 = map.init |
1421 | + ), |
1422 | + svmap.det_insert(BindVar, MatchVar, BlockTable0, BlockTable), |
1423 | + svmap.set(TargetBB, BlockTable, !BindingTable), |
1424 | + % Replace match_var with match_any |
1425 | + !:Matcher = matcher([match_any|Rest], TargetBB) |
1426 | + ; |
1427 | + true |
1428 | + ). |
1429 | + |
1430 | +% Gets the first match of a matcher. |
1431 | +% Aborts if there is none. |
1432 | +:- func matcher_first_match(matcher(S)) = match. |
1433 | +matcher_first_match(matcher([First|_],_)) = First. |
1434 | +matcher_first_match(matcher([],_)) = throw("Matcher is empty"). |
1435 | + |
1436 | +% Selects all matchers which begin with match. |
1437 | +% Removes the match from the beginning, and adjusts wildcard matches so they |
1438 | +% have the correct number of start entries. |
1439 | +:- func select_matchers(match, list(matcher(S))) = list(matcher(S)). |
1440 | +select_matchers(StartMatch, Matchers) = FilteredMatchers :- |
1441 | + list.filter_map(select_matcher(StartMatch), Matchers, FilteredMatchers). |
1442 | + |
1443 | +:- pred select_matcher(match::in, matcher(S)::in, matcher(S)::out) is semidet. |
1444 | +select_matcher(StartMatch, Matcher0, Matcher) :- |
1445 | + Matcher0 = matcher([First|Rest], Target), |
1446 | + ( First = StartMatch -> |
1447 | + % First match is correct (ctor, intlit or _). Just remove it. |
1448 | + Matcher = matcher(Rest, Target) |
1449 | + ; First = match_any -> |
1450 | + % First match is _. Always match, but carefully adjust. |
1451 | + ( StartMatch = match_ctor(_, Arity) -> |
1452 | + % The wildcard match is matching a ctor, so we need to perform |
1453 | + % sub-matches (which would all be wildcards) |
1454 | + Matcher = matcher(list.duplicate(Arity, match_any) ++ Rest, |
1455 | + Target) |
1456 | + ; |
1457 | + Matcher = matcher(Rest, Target) |
1458 | + ) |
1459 | + ; |
1460 | + fail |
1461 | + ). |
1462 | + |
1463 | +% Given a list of matches, representing the cases at one level of a switch |
1464 | +% statement, test that the switch (at that level) cannot possibly fail. |
1465 | +% This means it has a default, or all cases are covered for the switched type. |
1466 | +% If a switch fails this test, throws a context error. Otherwise, succeeds. |
1467 | +:- pred switch_exhaustion_check(progtable::in, list(match)::in, context::in) |
1468 | + is det. |
1469 | +switch_exhaustion_check(PT, Matches, Ctx) :- |
1470 | + ( list.member(match_any, Matches) -> |
1471 | + % Fine -- The switch has a default case |
1472 | + true |
1473 | + % Block ends with a switch with no default. Closely analyse its cases |
1474 | + % to determine whether it is exhaustive. |
1475 | + % Note: List must be entirely match_intlit or match_ctor |
1476 | + ; SomeMatch = list.head(Matches) -> |
1477 | + % Any match will do, since they all (must) have the same type |
1478 | + % We are only interested in the match's type |
1479 | + ( |
1480 | + SomeMatch = match_ctor(SomeCtor, _), |
1481 | + % Use SomeCtor to retrieve the typedef we are switching over |
1482 | + % (Any Ctor will do). This is easier than finding the typedef from |
1483 | + % a type value. |
1484 | + ( tables.lookup_ctor(PT, SomeCtor, SwitchTypeDef, _) -> |
1485 | + % Get the names of all ctors for the type |
1486 | + AllCtors = map(func(Ctor) = Ctor ^ ctor_name, |
1487 | + SwitchTypeDef ^ typedef_ctors), |
1488 | + % Get the names of all ctors named in a case |
1489 | + MatchCtors = map((func(Match) = CtorName :- |
1490 | + ( Match = match_ctor(C, _) -> |
1491 | + CtorName = C |
1492 | + ; |
1493 | + throw("ast_cfg: Not all matches are ctors") |
1494 | + )), |
1495 | + Matches), |
1496 | + % The two sets should be equal |
1497 | + % Delete each element of CaseCtor from AllCtors |
1498 | + % UnhandledCtors = AllCtors / CaseCtors (should be empty) |
1499 | + list.foldl((pred(MatchCtor::in, AC0::in, AC::out) is det :- |
1500 | + ( list.delete_first(AC0, MatchCtor, AC_) -> |
1501 | + AC = AC_ |
1502 | + ; |
1503 | + throw("ast_cfg: Match " ++ MatchCtor ++ |
1504 | + " not a member of switch type") |
1505 | + ) |
1506 | + ), |
1507 | + MatchCtors, AllCtors, UnhandledCtors), |
1508 | + ( list.is_empty(UnhandledCtors) -> |
1509 | + % Fine -- All cases handled in switch |
1510 | + true |
1511 | + ; |
1512 | + context.throw_error( |
1513 | + "Incomplete switch: requires case(s) for " |
1514 | + ++ string.join_list(", ", UnhandledCtors), Ctx) |
1515 | + ) |
1516 | + ; |
1517 | + throw("ast_cfg: Ctor name not defined") |
1518 | + ) |
1519 | + ; |
1520 | + SomeMatch = match_intlit(_), |
1521 | + % It is ALWAYS an error to have an int switch without a default |
1522 | + % case, since you can't cover all the ints. |
1523 | + context.throw_error("Incomplete switch: requires default case " |
1524 | + ++ "for switch on integer", Ctx) |
1525 | + ; |
1526 | + SomeMatch = match_any, |
1527 | + throw("ast_cfg: match_any in match list") |
1528 | + ; |
1529 | + SomeMatch = match_var(_), |
1530 | + throw("ast_cfg: match_var in match list") |
1531 | + ) |
1532 | + ; |
1533 | + throw("ast_cfg: Switch check on empty match list") |
1534 | + ). |
1535 | + |
1536 | +% generate_case_stmts(+BBCurrent, +AssocList, -Cases, -Default, |
1537 | +% -AssocListBlocks, !CFG). |
1538 | +% Given an assoc list (a match for each case, mapped to a list of matchers), |
1539 | +% generate a case statement for each matcher, and create a new target block in |
1540 | +% the CFG for the case statement. |
1541 | +% Returns AssocListBlocks, which is AssocList augmented with a target block |
1542 | +% for each AssocList entry. |
1543 | +% Will ignore any matches after a match_any. Aborts on match_var. |
1544 | +:- pred generate_case_stmts(bbref(S)::in, |
1545 | + assoc_list(match, list(matcher(S)))::in, |
1546 | + assoc_list(case_value, bbref(S))::out, maybe(bbref(S))::out, |
1547 | + list({match, list(matcher(S)), bbref(S)})::out, |
1548 | + cfg(S)::in, cfg(S)::out) is det. |
1549 | +% Base case: No matches. Therefore no cases and no default. |
1550 | +generate_case_stmts(_, [], [], no, [], !CFG). |
1551 | +% Base case: Match is "_". Therefore, generate a default case and stop. |
1552 | +generate_case_stmts(BBCurrent, [match_any-M|_], [], yes(Default), [ALB], |
1553 | + !CFG) :- |
1554 | + cfg.new_basic_block(Default, !CFG), |
1555 | + cfg.append_predecessor(Default, BBCurrent, !CFG), |
1556 | + ALB = {match_any, M, Default}. |
1557 | +% Error case: Match is a var. Should have been taken out by now. |
1558 | +generate_case_stmts(_, [match_var(_)-_|_], _, _, _, !CFG) :- |
1559 | + throw("ast_cfg.generate_case_stmts: Found match_var"). |
1560 | +generate_case_stmts(BBCurrent, [match_ctor(Name, Arity)-M|Matches], |
1561 | + [Case|Cases], Default, [ALB|ALBs], !CFG) :- |
1562 | + cfg.new_basic_block(Target, !CFG), |
1563 | + cfg.append_predecessor(Target, BBCurrent, !CFG), |
1564 | + Case = case_ctor(Name) - Target, |
1565 | + ALB = {match_ctor(Name, Arity), M, Target}, |
1566 | + % Recurse |
1567 | + generate_case_stmts(BBCurrent, Matches, Cases, Default, ALBs, !CFG). |
1568 | +generate_case_stmts(BBCurrent, [match_intlit(Val)-M|Matches], |
1569 | + [Case|Cases], Default, [ALB|ALBs], !CFG) :- |
1570 | + cfg.new_basic_block(Target, !CFG), |
1571 | + cfg.append_predecessor(Target, BBCurrent, !CFG), |
1572 | + Case = case_intlit(Val) - Target, |
1573 | + ALB = {match_intlit(Val), M, Target}, |
1574 | + % Recurse |
1575 | + generate_case_stmts(BBCurrent, Matches, Cases, Default, ALBs, !CFG). |
1576 | + |
1577 | +% write_unpack_instructions(SourceVar, RestVars, BindingTable, SubscriptMap, |
1578 | +% Ctx, {Match, Matchers, Block}, !PhiTable, !CFG). |
1579 | +% Writes implicit record unpacking instructions for the constructor mentioned |
1580 | +% by Match to the end of Block, and then calls generate_matcher_code to |
1581 | +% recursively generate the rest of the switch statement for that block. |
1582 | +:- pred write_unpack_instructions(progtable::in, varname::in, |
1583 | + list(varname)::in, binding_table(S)::in, subscript_map::in, context::in, |
1584 | + {match, list(matcher(S)), bbref(S)}::in, |
1585 | + phi_table(S)::in, phi_table(S)::out, cfg(S)::in, cfg(S)::out) is det. |
1586 | +write_unpack_instructions(PT, SourceVar, RestVars, BindingTable, SubscriptMap, |
1587 | + Ctx, AssocEntry, !PhiTable, !CFG) :- |
1588 | + {Match, Matchers, Block} = AssocEntry, |
1589 | + ( Match = match_ctor(Name, Arity) -> |
1590 | + % Write Arity unpack instructions to Block |
1591 | + % Store the names of the unpacked vars in UnpackedVars |
1592 | + list.map_foldl2((pred(VarID::in, TempVName::out, SM0::in, SM::out, |
1593 | + C0::in, C::out) is det :- |
1594 | + % Write one, then recurse |
1595 | + new_temp_variable(ref_id(Block, C0), TempVName, SM0, SM), |
1596 | + Instr = instr(ld_field_idx(TempVName, SourceVar, Name, VarID), |
1597 | + context.blank_context), |
1598 | + cfg.append_instr(Block, Instr, C0, C) |
1599 | + ), util.range(Arity), UnpackedVars, SubscriptMap, _, !CFG) |
1600 | + ; |
1601 | + UnpackedVars = [] |
1602 | + ), |
1603 | + % Construct the new ArgList. |
1604 | + % First, need to switch on all of the unpacked variables, then the |
1605 | + % remaining ArgList variables. |
1606 | + ArgList = UnpackedVars ++ RestVars, |
1607 | + % Recursively call generate_matcher_code for this block |
1608 | + generate_matcher_code(PT, ArgList, Matchers, BindingTable, Block, |
1609 | + SubscriptMap, Ctx, !PhiTable, !CFG). |
1610 | + |
1611 | +% generate_case_block_phis(TargetBlock, Phis, !CFG). |
1612 | +% Generates all phis for case block TargetBlock. Phis is the phi table entry |
1613 | +% for TargetBlock, a multimap mapping bound varnames to source block/varname |
1614 | +% pairs. |
1615 | +:- pred generate_case_block_phis(bbref(S)::in, |
1616 | + multi_map(varname, pair(bbref(S), varname))::in, cfg(S)::in, cfg(S)::out) |
1617 | + is det. |
1618 | +generate_case_block_phis(TargetBlock, Phis, !CFG) :- |
1619 | + map.foldl((pred(BoundVar::in, Bindings::in, CFG0::in, CFG::out) is det :- |
1620 | + % For each phi |
1621 | + Preds = map.from_assoc_list(Bindings), |
1622 | + PhiInstr = phi(BoundVar, Preds), |
1623 | + cfg.append_phi(TargetBlock, PhiInstr, CFG0, CFG) |
1624 | + ), Phis, !CFG). |
1625 | + |
1626 | +% case_stmt_to_jmp(+BBAfter, +SubscriptMap, +DefMap, +CaseStmt, -CFGCaseStmt, |
1627 | % !AfterPredMap, !ExitPredMap, !CFG). |
1628 | -% Compiles a case statement into a CFG case statement (which is |
1629 | +% Compiles a case statement into a jmp case statement (which is |
1630 | % branch-oriented rather than sub-statement-block-oriented). |
1631 | % Compiles the sub-statement-block in the case statement to new basic blocks, |
1632 | % which will ultimately branch to BBAfter. |
1633 | @@ -870,12 +1293,12 @@ |
1634 | % SubscriptMap is read-only (since it will definitely branch to a new block). |
1635 | % DefMap is read-only (the resulting def-maps should be appended to |
1636 | % AfterPredMap). |
1637 | -:- pred case_stmt_to_cfg(bbref(S)::in, bbref(S)::in, |
1638 | - subscript_map::in, def_map::in, case_stmt::in, cfg_case_stmt(S)::out, |
1639 | +:- pred case_stmt_to_jmp(progtable::in, bbref(S)::in, bbref(S)::in, |
1640 | + subscript_map::in, def_map::in, case_stmt::in, jmp_case_stmt(S)::out, |
1641 | pred_map(S)::in, pred_map(S)::out, pred_map(S)::in, pred_map(S)::out, |
1642 | cfg(S)::in, cfg(S)::out) is det. |
1643 | -case_stmt_to_cfg(BBCurrent, BBAfter, SubscriptMap, DefMap, |
1644 | - case_stmt(Pattern0, Stmts, Ctx), cfg_case_stmt(Pattern, BBFirst, Ctx), |
1645 | +case_stmt_to_jmp(PT, BBCurrent, BBAfter, SubscriptMap, DefMap, |
1646 | + case_stmt(Pattern0, Stmts, Ctx), jmp_case_stmt(Pattern, BBFirst, Ctx), |
1647 | !AfterPredMap, !ExitPredMap, !CFG) :- |
1648 | % Traverse the pattern and uniquely label each newly-introduced |
1649 | % variable, as well as adding them to DefMap. |
1650 | @@ -884,8 +1307,7 @@ |
1651 | InnerScopeDefMap), |
1652 | |
1653 | cfg.new_basic_block(BBFirst, !CFG), |
1654 | - cfg.append_predecessor(BBFirst, BBCurrent, !CFG), |
1655 | - stmt_block_to_cfg(Stmts, BBFirst, BBAfter, InnerScopeDefMap, |
1656 | + stmt_block_to_cfg(PT, Stmts, BBFirst, BBAfter, InnerScopeDefMap, |
1657 | !AfterPredMap, !ExitPredMap, !CFG). |
1658 | |
1659 | % pattern_to_ssa(BlockID, !Pattern, !SubscriptMap, !DefMap). |
1660 | |
1661 | === modified file 'src/interpret.m' |
1662 | --- src/interpret.m 2010-01-25 01:23:14 +0000 |
1663 | +++ src/interpret.m 2010-01-25 07:28:13 +0000 |
1664 | @@ -128,6 +128,7 @@ |
1665 | :- import_module svmap. |
1666 | :- import_module set. |
1667 | :- import_module pair. |
1668 | +:- import_module assoc_list. |
1669 | :- import_module array. |
1670 | :- import_module exception. |
1671 | :- import_module store. |
1672 | @@ -468,6 +469,20 @@ |
1673 | env_assign_local(Dst, Result, !Env). |
1674 | exec_instr_(ld_field(_Dst, _Val, _FieldName), _Ctx, !Env, !IO) :- |
1675 | throw("Not Implemented: Field reference"). |
1676 | +exec_instr_(ld_field_idx(Dst, ValName, Ctor, Idx), _Ctx, !Env, !IO) :- |
1677 | + read_var(!.Env, ValName, Val, !IO), |
1678 | + ( Val = val_term(Ctor, Fields) -> % Must match supplied Ctor name |
1679 | + ( list.index0(Fields, Idx, FieldVal) -> % Fail if bounds error |
1680 | + Result = FieldVal |
1681 | + ; |
1682 | + throw("Interpreter Error: Field index out of range") |
1683 | + ) |
1684 | + ; Val = val_term(_, _) -> |
1685 | + throw("Interpreter Error: Field reference with wrong ctor") |
1686 | + ; |
1687 | + throw("Interpreter Error: Field reference to something not an ADT") |
1688 | + ), |
1689 | + env_assign_local(Dst, Result, !Env). |
1690 | exec_instr_(call(Dst, Func, Args), _Ctx, !Env, !IO) :- |
1691 | % Evaluate Func and Arg (eagerly) |
1692 | read_var(!.Env, Func, FuncVal, !IO), |
1693 | @@ -627,9 +642,9 @@ |
1694 | % Execute the terminator |
1695 | Term = Block ^ bb_terminator, |
1696 | ( |
1697 | - Term = t_switch(Control, Cases, _), |
1698 | + Term = t_switch(Control, Cases, Default, _), |
1699 | read_var(!.Env, Control, ControlVal, !IO), |
1700 | - exec_cases(CFG, BBRef, !Env, ControlVal, Cases, !IO) |
1701 | + exec_cases(CFG, BBRef, !Env, ControlVal, Cases, Default, !IO) |
1702 | ; |
1703 | Term = cond_branch(Cond, Then, Else, _), |
1704 | read_var(!.Env, Cond, CondVal, !IO), |
1705 | @@ -668,69 +683,47 @@ |
1706 | throw("Interpreter Error: Phi missing prececessor in CFG") |
1707 | ). |
1708 | |
1709 | -% exec_cases(CFG, Predecessor, !Env, ControlValue, Cases, !IO). |
1710 | +% exec_cases(CFG, Predecessor, !Env, ControlValue, Cases, Default, !IO). |
1711 | % Performs a switch over Cases, and executes the first one which matches. |
1712 | :- pred exec_cases(cfg(S)::in, bbref(S)::in, env::in, env::out, |
1713 | - value::in, list(cfg_case_stmt(S))::in, io::di, io::uo) is det. |
1714 | -exec_cases(_CFG, _Predecessor, !Env, _ControlValue, [], !IO) :- |
1715 | - % No pattern matched |
1716 | - % XXX Should be checked statically so this never occurs, and therefore, we |
1717 | - % raise an interpreter error at this point. |
1718 | - % Currently, no check is made, and ast_cfg goes nuts if a switch |
1719 | - % statement isn't handled by default. Therefore, we need to raise a |
1720 | - % friendly runtime error for the time being. |
1721 | - throw_mars_error("Pattern match failure"). |
1722 | - %throw("Interpreter Error: Pattern match failure"). |
1723 | -exec_cases(CFG, Predecessor, !Env, ControlValue, [Case|Cases], !IO) :- |
1724 | - Case = cfg_case_stmt(Pattern, Target, _Ctx), |
1725 | - ( pattern_match(Pattern, ControlValue, !Env) -> |
1726 | - % Succeeded. Branch to Target (with the new bindings) and ignore the |
1727 | - % remaining Cases. |
1728 | + value::in, assoc_list(case_value, bbref(S))::in, maybe(bbref(S))::in, |
1729 | + io::di, io::uo) is det. |
1730 | +exec_cases(CFG, Predecessor, !Env, _ControlValue, [], Default, !IO) :- |
1731 | + % No case matched, execute the default |
1732 | + ( |
1733 | + Default = yes(DefTarget), |
1734 | + exec_block(CFG, Predecessor, DefTarget, !Env, !IO) |
1735 | + ; |
1736 | + Default = no, |
1737 | + % Should never occur -- by the invariant of t_switch, there MUST be a |
1738 | + % default case if the case statements are not exhaustive. |
1739 | + % (This should be guaranteed by ast_cfg). Therefore, internal error. |
1740 | + throw("Interpreter Error: No case matched and no default case") |
1741 | + ). |
1742 | +exec_cases(CFG, Predecessor, !Env, ControlValue, [Case|Cases], Default,!IO) :- |
1743 | + Case = CaseVal - Target, |
1744 | + ( check_case_value(CaseVal, ControlValue) -> |
1745 | + % Succeeded. Branch to Target and ignore the remaining Cases. |
1746 | exec_block(CFG, Predecessor, Target, !Env, !IO) |
1747 | ; |
1748 | % That match failed. Try the rest. |
1749 | - exec_cases(CFG, Predecessor, !Env, ControlValue, Cases, !IO) |
1750 | + exec_cases(CFG, Predecessor, !Env, ControlValue, Cases, Default, !IO) |
1751 | ). |
1752 | |
1753 | -% pattern_match(Pattern, Value, !Env). |
1754 | -% Matches Value against Pattern, making bindings in !Env if successful. |
1755 | -% Fails if the pattern match fails. |
1756 | -:- pred pattern_match(pattern::in, value::in, env::in, env::out) is semidet. |
1757 | -pattern_match(pattern(Pattern, _PatType), Value, !Env) :- |
1758 | +% check_case_value(CaseVal, Value). |
1759 | +% Succeeds if Value matches the case value CaseVal. |
1760 | +:- pred check_case_value(case_value::in, value::in) is semidet. |
1761 | +check_case_value(CaseVal, Value) :- |
1762 | ( |
1763 | - Pattern = pat_any |
1764 | - % Done, succeed with no bindings |
1765 | - ; |
1766 | - Pattern = pat_var(VarName), |
1767 | - % Succeed, and bind |
1768 | - env_assign_local(VarName, Value, !Env) |
1769 | - ; |
1770 | - Pattern = pat_ctor(CtorName, SubPatterns), |
1771 | - Value = val_term(CtorName, SubValues), % Can fail |
1772 | - % Recursively match all sub-terms with sub-patterns |
1773 | - ( |
1774 | - SubPatterns = yes(YesSubPatterns), |
1775 | - pattern_match_list(YesSubPatterns, SubValues, !Env) % Can fail |
1776 | - ; |
1777 | - SubPatterns = no |
1778 | - ) |
1779 | - ; |
1780 | - Pattern = pat_intlit(Int), |
1781 | + CaseVal = case_ctor(CtorName), |
1782 | + Value = val_term(CtorName, _) % Can fail |
1783 | + % Note: Do not recurse; succeed regardless of the term's arguments |
1784 | + ; |
1785 | + CaseVal = case_intlit(Int), |
1786 | % Succeed if the value is a matching int |
1787 | Value = val_int(Int) % Can fail |
1788 | ). |
1789 | |
1790 | -% pattern_match_list(Patterns, Values, !Env). |
1791 | -% Matches Values against Patterns, making bindings in !Env if successful. |
1792 | -% Succeeds only if each value matches its corresponding pattern, and the lists |
1793 | -% are exactly the same length. |
1794 | -:- pred pattern_match_list(list(pattern)::in, list(value)::in, |
1795 | - env::in, env::out) is semidet. |
1796 | -pattern_match_list([], [], !Env). |
1797 | -pattern_match_list([P|Ps], [V|Vs], !Env) :- |
1798 | - pattern_match(P, V, !Env), % Can fail |
1799 | - pattern_match_list(Ps, Vs, !Env). % Can fail |
1800 | - |
1801 | % -------------------------------------------------------------------------- % |
1802 | % BUILTINS |
1803 | % -------------------------------------------------------------------------- % |
1804 | |
1805 | === modified file 'src/ir.m' |
1806 | --- src/ir.m 2010-01-25 01:23:14 +0000 |
1807 | +++ src/ir.m 2010-01-25 07:28:13 +0000 |
1808 | @@ -51,6 +51,7 @@ |
1809 | :- import_module string. |
1810 | :- import_module bool. |
1811 | :- import_module list. |
1812 | +:- import_module assoc_list. |
1813 | :- import_module maybe. |
1814 | :- import_module pair. |
1815 | |
1816 | @@ -207,6 +208,9 @@ |
1817 | % Load a field of a variable into a variable. |
1818 | % (R, S, F) <=> R := S.F |
1819 | ; ld_field(varname, varname, string) |
1820 | + % Load a field of a variable, by constructor(index) |
1821 | + % (R, S, C, I) <=> R := S.C(I) |
1822 | + ; ld_field_idx(varname, varname, string, int) |
1823 | % Function call. (Result, Function, Args). |
1824 | ; call(varname, varname, list(varname)) |
1825 | % Partial function application. (Result, Function, Args). |
1826 | @@ -219,10 +223,17 @@ |
1827 | % type" used to prevent terminators from referencing basic blocks in the wrong |
1828 | % CFG. |
1829 | :- type terminator_instr(S) |
1830 | - % Switch instruction. Performs pattern matching of algebraic data |
1831 | - % types. Branches to the first basic block to match the pattern |
1832 | - % (binding variables from the pattern over the block's scope). |
1833 | - ---> t_switch(varname, list(cfg_case_stmt(S)), context.context) |
1834 | + % t_switch(ControlVar, Cases, Default). |
1835 | + % Simplified switch instruction. Performs matching of algebraic |
1836 | + % data types' head values, or integers. |
1837 | + % If ControlVar matches one of the Cases, branches to that basic |
1838 | + % block. Otherwise, branches to Default. |
1839 | + % Default is optional, but may ONLY be omitted if the cases cover |
1840 | + % all possible values of an ADT, and is mandatory for switches |
1841 | + % over integers. |
1842 | + % Two cases may not have the same case value. |
1843 | + ---> t_switch(varname, assoc_list(case_value, bbref(S)), |
1844 | + maybe(bbref(S)), context.context) |
1845 | % Conditional branch instruction. Expr must be of type Int. |
1846 | % Evaluates Expr, then branches to the first basic block if it is |
1847 | % nonzero, or the second basic block if it is zero. |
1848 | @@ -241,11 +252,7 @@ |
1849 | % An AST representation's case statement. |
1850 | :- type case_stmt |
1851 | ---> case_stmt(pattern, stmt_block, context.context). |
1852 | - |
1853 | -% A CFG representation's case statement. |
1854 | -:- type cfg_case_stmt(S) |
1855 | - ---> cfg_case_stmt(pattern, cfg.bbref(S), context.context). |
1856 | -% A pattern (part of a case statement). |
1857 | +% An AST representation's pattern (part of a case statement). |
1858 | :- type pattern |
1859 | ---> pattern( |
1860 | pat_pattern :: pattern_, |
1861 | @@ -263,6 +270,15 @@ |
1862 | % Integer literal. Matches exactly this value. |
1863 | ; pat_intlit(integer). |
1864 | |
1865 | +% A value which is matched by a case of a t_switch instruction in CFG |
1866 | +% representation. |
1867 | +:- type case_value |
1868 | + % Constructor match. Matches only values with the given ctor |
1869 | + % name, without checking arity or arguments. |
1870 | + ---> case_ctor(string) |
1871 | + % Integer literal. Matches exactly this value. |
1872 | + ; case_intlit(integer). |
1873 | + |
1874 | % A uniqueness annotation for a particular argument of a function |
1875 | :- type argmode |
1876 | ---> argmode( |
1877 | |
1878 | === modified file 'src/mars.m' |
1879 | --- src/mars.m 2010-01-22 01:54:37 +0000 |
1880 | +++ src/mars.m 2010-01-25 07:28:13 +0000 |
1881 | @@ -155,7 +155,7 @@ |
1882 | % Typecheck the program |
1883 | typecheck.check_program(ProgTable, Prog, Prog_Checked), |
1884 | % Convert the program from AST representation into CFG representation |
1885 | - ast_cfg.prog_to_cfg(Prog_Checked, ProgCFG), |
1886 | + ast_cfg.prog_to_cfg(ProgTable, Prog_Checked, ProgCFG), |
1887 | % Build another ProgTable |
1888 | tables.build_progtable(ProgCFG, builtins.builtins_progtable, |
1889 | ProgTableCFG), |
1890 | |
1891 | === modified file 'src/marsc.m' |
1892 | --- src/marsc.m 2010-01-22 01:54:37 +0000 |
1893 | +++ src/marsc.m 2010-01-25 07:28:13 +0000 |
1894 | @@ -155,7 +155,7 @@ |
1895 | check_main_func(ProgTable, MainFunc), |
1896 | ( backend_use_cfg(BackendName) -> |
1897 | % Convert the program from AST representation into CFG representation |
1898 | - ast_cfg.prog_to_cfg(Prog_Checked, Prog2), |
1899 | + ast_cfg.prog_to_cfg(ProgTable, Prog_Checked, Prog2), |
1900 | % Build another ProgTable |
1901 | tables.build_progtable(Prog2, builtins.builtins_progtable, |
1902 | ProgTable2) |
1903 | |
1904 | === modified file 'src/pretty.m' |
1905 | --- src/pretty.m 2010-01-25 01:23:14 +0000 |
1906 | +++ src/pretty.m 2010-01-25 07:28:13 +0000 |
1907 | @@ -367,6 +367,10 @@ |
1908 | io.write_string(ir.varname_to_string_noescape(Dst), !IO), |
1909 | io.write_string(" = ", !IO), |
1910 | io.write_string(ir.varname_to_string_noescape(Obj) ++ "." ++ Field, !IO). |
1911 | +print_instr_(Indent, ld_field_idx(Dst, Obj, Ctor, Index), !IO) :- |
1912 | + indent(Indent, !IO), |
1913 | + io.format("%s = %s.%s(%d)", [s(ir.varname_to_string_noescape(Dst)), |
1914 | + s(ir.varname_to_string_noescape(Obj)), s(Ctor), i(Index)], !IO). |
1915 | print_instr_(Indent, call(Dst, Func, Args), !IO) :- |
1916 | indent(Indent, !IO), |
1917 | io.write_string(ir.varname_to_string_noescape(Dst), !IO), |
1918 | @@ -386,12 +390,22 @@ |
1919 | |
1920 | :- pred print_terminator_instr(int::in, cfg(S)::in, terminator_instr(S)::in, |
1921 | io::di,io::uo) is det. |
1922 | -print_terminator_instr(Indent, CFG, t_switch(Control, Cases, _), !IO) :- |
1923 | +print_terminator_instr(Indent, CFG, t_switch(Control,Cases,Default,_), !IO) :- |
1924 | indent(Indent, !IO), |
1925 | io.write_string("switch ", !IO), |
1926 | io.write_string(ir.varname_to_string_noescape(Control), !IO), |
1927 | io.write_string(":\n", !IO), |
1928 | - foldl(print_cfg_case(Indent+1, CFG), Cases, !IO). |
1929 | + foldl(print_cfg_case(Indent+1, CFG), Cases, !IO), |
1930 | + % Print the default case, if any |
1931 | + ( |
1932 | + Default = yes(DefTarget), |
1933 | + indent(Indent+1, !IO), |
1934 | + io.write_string("default: goto ", !IO), |
1935 | + io.write(cfg.ref_id(DefTarget, CFG), !IO), |
1936 | + io.nl(!IO) |
1937 | + ; |
1938 | + Default = no |
1939 | + ). |
1940 | print_terminator_instr(Indent, CFG, cond_branch(Cond, ThenPart, ElsePart, _), |
1941 | !IO) :- |
1942 | indent(Indent, !IO), |
1943 | @@ -419,16 +433,6 @@ |
1944 | io.write_string(":\n", !IO), |
1945 | foldl(print_stmt(Indent+1), Stmts, !IO). |
1946 | |
1947 | -:- pred print_cfg_case(int::in, cfg(S)::in, cfg_case_stmt(S)::in, |
1948 | - io::di, io::uo) is det. |
1949 | -print_cfg_case(Indent, CFG, cfg_case_stmt(Pattern, Target, _), !IO) :- |
1950 | - indent(Indent, !IO), |
1951 | - io.write_string("case ", !IO), |
1952 | - io.write_string(string_pattern(Pattern), !IO), |
1953 | - io.write_string(": goto ", !IO), |
1954 | - io.write(cfg.ref_id(Target, CFG), !IO), |
1955 | - io.nl(!IO). |
1956 | - |
1957 | string_pattern(pattern(P,_)) = string_pattern_(P). |
1958 | |
1959 | :- func string_pattern_(pattern_) = string. |
1960 | @@ -447,6 +451,20 @@ |
1961 | ). |
1962 | string_pattern_(pat_intlit(I)) = integer.to_string(I). |
1963 | |
1964 | +:- pred print_cfg_case(int::in, cfg(S)::in, pair(case_value, bbref(S))::in, |
1965 | + io::di, io::uo) is det. |
1966 | +print_cfg_case(Indent, CFG, CaseVal - Target, !IO) :- |
1967 | + indent(Indent, !IO), |
1968 | + io.write_string("case ", !IO), |
1969 | + io.write_string(string_caseval(CaseVal), !IO), |
1970 | + io.write_string(": goto ", !IO), |
1971 | + io.write(cfg.ref_id(Target, CFG), !IO), |
1972 | + io.nl(!IO). |
1973 | + |
1974 | +:- func string_caseval(case_value) = string. |
1975 | +string_caseval(case_ctor(Name)) = Name. |
1976 | +string_caseval(case_intlit(I)) = integer.to_string(I). |
1977 | + |
1978 | print_function_head(Func, !IO) :- |
1979 | io.write_string(Func^func_name, !IO), |
1980 | ( |
1981 | |
1982 | === modified file 'test/cases/compiler/switch.mar' |
1983 | --- test/cases/compiler/switch.mar 2010-01-25 05:33:55 +0000 |
1984 | +++ test/cases/compiler/switch.mar 2010-01-25 07:28:13 +0000 |
1985 | @@ -13,6 +13,9 @@ |
1986 | B |
1987 | C |
1988 | |
1989 | +type Box(a): |
1990 | + Box(a) |
1991 | + |
1992 | # Test cases with multiple levels of pattern matching |
1993 | def switchtest(x :: Foo) :: Int: |
1994 | switch x: |
1995 | @@ -56,6 +59,46 @@ |
1996 | x = a |
1997 | return add(x, a) |
1998 | |
1999 | +# Test trivial (unconditional) binding of a variable |
2000 | +# Although the test suite won't test for this, it should be compiled to code |
2001 | +# without a switch, because there is no actual condition. |
2002 | +def trivialbind(x :: Int) :: Int: |
2003 | + switch x: |
2004 | + case n: |
2005 | + return n |
2006 | + |
2007 | +# Test simple binding of a variable |
2008 | +def simplebind(x :: Int) :: Int: |
2009 | + switch x: |
2010 | + case 1: |
2011 | + return 2 |
2012 | + case n: |
2013 | + return n |
2014 | + |
2015 | +# Test binding of a variable on multiple paths |
2016 | +# Fully-worked example in doc/dev/switchfactor. |
2017 | +def trickybind(x :: Pair(Int, Box(Int))) :: Int: |
2018 | + switch x: |
2019 | + case Pair(49, Box(3)): |
2020 | + return -1 |
2021 | + case Pair(m, Box(n)): |
2022 | + return add(m, n) |
2023 | + |
2024 | +# From fully-worked example in doc/dev/switchfactor. |
2025 | +# Test that factoring doesn't break the precedence of the cases |
2026 | +def factor(x :: Pair(Bar, Int)) :: Int: |
2027 | + switch x: |
2028 | + case Pair(B, 1): |
2029 | + return 1 |
2030 | + case Pair(C, 2): |
2031 | + return 2 |
2032 | + case Pair(B, 3): |
2033 | + return 3 |
2034 | + case Pair(q, _): |
2035 | + return 4 |
2036 | + case Pair(B, 5): # Redundant rule: Should never be chosen |
2037 | + return 5 |
2038 | + |
2039 | # Unit test for switchtest function |
2040 | def test_switch(_x :: Int) :: Int: |
2041 | begin_case("switch") |
2042 | @@ -88,3 +131,69 @@ |
2043 | assert_eq(scopes(B), 8, "scopes(B) != 8") |
2044 | end_case("switch (scopes)") |
2045 | return 0 |
2046 | + |
2047 | +def test_trivialbind(_x :: Int) :: Int: |
2048 | + begin_case("switch (trivialbind)") |
2049 | + assert_eq(trivialbind(7), 7, "trivialbind(7) != 7") |
2050 | + end_case("switch (trivialbind)") |
2051 | + return 0 |
2052 | + |
2053 | +def test_simplebind(_x :: Int) :: Int: |
2054 | + begin_case("switch (simplebind)") |
2055 | + assert_eq(simplebind(1), 2, "simplebind(1) != 2") |
2056 | + end_case("switch (simplebind)") |
2057 | + |
2058 | + begin_case("switch (simplebind)") |
2059 | + assert_eq(simplebind(7), 7, "simplebind(7) != 7") |
2060 | + end_case("switch (simplebind)") |
2061 | + return 0 |
2062 | + |
2063 | +def test_trickybind(_x :: Int) :: Int: |
2064 | + begin_case("switch (trickybind)") |
2065 | + assert_eq(trickybind(Pair(49, Box(3))), -1, "trickybind(Pair(49, Box(3))) != -1") |
2066 | + end_case("switch (trickybind)") |
2067 | + |
2068 | + begin_case("switch (trickybind)") |
2069 | + assert_eq(trickybind(Pair(49, Box(2))), 51, "trickybind(Pair(49, Box(2))) != 51") |
2070 | + end_case("switch (trickybind)") |
2071 | + |
2072 | + begin_case("switch (trickybind)") |
2073 | + assert_eq(trickybind(Pair(7, Box(3))), 10, "trickybind(Pair(7, Box(3))) != 10") |
2074 | + end_case("switch (trickybind)") |
2075 | + |
2076 | + begin_case("switch (trickybind)") |
2077 | + assert_eq(trickybind(Pair(2, Box(2))), 4, "trickybind(Pair(2, Box(2))) != 4") |
2078 | + end_case("switch (trickybind)") |
2079 | + return 0 |
2080 | + |
2081 | +def test_factor() :: Int: |
2082 | + begin_case("switch (factoring)") |
2083 | + assert_eq(factor(Pair(B, 1)), 1, "factor(Pair(B, 1)) != 1") |
2084 | + end_case("switch (factoring)") |
2085 | + begin_case("switch (factoring)") |
2086 | + assert_eq(factor(Pair(B, 2)), 4, "factor(Pair(B, 2)) != 4") |
2087 | + end_case("switch (factoring)") |
2088 | + begin_case("switch (factoring)") |
2089 | + assert_eq(factor(Pair(B, 3)), 3, "factor(Pair(B, 3)) != 3") |
2090 | + end_case("switch (factoring)") |
2091 | + begin_case("switch (factoring)") |
2092 | + assert_eq(factor(Pair(B, 4)), 4, "factor(Pair(B, 4)) != 4") |
2093 | + end_case("switch (factoring)") |
2094 | + # Try to trick compiler -- should pick default case, not B,5 case |
2095 | + begin_case("switch (factoring)") |
2096 | + assert_eq(factor(Pair(B, 5)), 4, "factor(Pair(B, 5)) != 4") |
2097 | + end_case("switch (factoring)") |
2098 | + |
2099 | + begin_case("switch (factoring)") |
2100 | + assert_eq(factor(Pair(C, 2)), 2, "factor(Pair(C, 2)) != 2") |
2101 | + end_case("switch (factoring)") |
2102 | + begin_case("switch (factoring)") |
2103 | + assert_eq(factor(Pair(C, 3)), 4, "factor(Pair(C, 3)) != 4") |
2104 | + end_case("switch (factoring)") |
2105 | + |
2106 | + # Make sure default case also works when initial constructor has no |
2107 | + # possible match |
2108 | + begin_case("switch (factoring)") |
2109 | + assert_eq(factor(Pair(A(7), 2)), 4, "factor(Pair(A(7), 2)) != 4") |
2110 | + end_case("switch (factoring)") |
2111 | + return 0 |
2112 | |
2113 | === modified file 'test/cases/compiler/switchfail.mar' |
2114 | --- test/cases/compiler/switchfail.mar 2010-01-08 01:36:52 +0000 |
2115 | +++ test/cases/compiler/switchfail.mar 2010-01-25 07:28:13 +0000 |
2116 | @@ -1,7 +1,5 @@ |
2117 | # Test failing switch statements (which do not cover all cases) |
2118 | |
2119 | -import unittest |
2120 | - |
2121 | type Foo: |
2122 | X(a :: Bar) |
2123 | Y(x :: Int) |
2124 | |
2125 | === modified file 'test/cases/compiler/switchfail.mtc' |
2126 | --- test/cases/compiler/switchfail.mtc 2010-01-08 01:36:52 +0000 |
2127 | +++ test/cases/compiler/switchfail.mtc 2010-01-25 07:28:13 +0000 |
2128 | @@ -1,9 +1,5 @@ |
2129 | # vim: filetype=yaml |
2130 | - |
2131 | -# Runtime tests |
2132 | -# Note: Not including unit tests; this just tests some runtime errors |
2133 | - |
2134 | -run: |
2135 | - - call: matchfail(X(C)) |
2136 | - stderr: | |
2137 | - Runtime Error: Pattern match failure |
2138 | +compile: |
2139 | + outcome: compile_error |
2140 | + errors: | |
2141 | + switchfail.mar:15: (matchfail) Incomplete switch: requires case(s) for C. |
2142 | |
2143 | === added file 'test/cases/compiler/switchfail2.mar' |
2144 | --- test/cases/compiler/switchfail2.mar 1970-01-01 00:00:00 +0000 |
2145 | +++ test/cases/compiler/switchfail2.mar 2010-01-25 07:28:13 +0000 |
2146 | @@ -0,0 +1,13 @@ |
2147 | +# Test failing switch statements (which do not cover all cases) |
2148 | + |
2149 | +type Foo: |
2150 | + A(x :: Int) |
2151 | + B(x :: Int) |
2152 | + C |
2153 | + |
2154 | +# Test failing cases with multiple levels of pattern matching |
2155 | +def matchfail(x :: Foo) :: Int: |
2156 | + switch x: |
2157 | + # Matches A, but not B or C |
2158 | + case A(i): |
2159 | + return i |
2160 | |
2161 | === added file 'test/cases/compiler/switchfail2.mtc' |
2162 | --- test/cases/compiler/switchfail2.mtc 1970-01-01 00:00:00 +0000 |
2163 | +++ test/cases/compiler/switchfail2.mtc 2010-01-25 07:28:13 +0000 |
2164 | @@ -0,0 +1,5 @@ |
2165 | +# vim: filetype=yaml |
2166 | +compile: |
2167 | + outcome: compile_error |
2168 | + errors: | |
2169 | + switchfail2.mar:10: (matchfail) Incomplete switch: requires case(s) for B, C. |
2170 | |
2171 | === added file 'test/cases/compiler/switchfail3.mar' |
2172 | --- test/cases/compiler/switchfail3.mar 1970-01-01 00:00:00 +0000 |
2173 | +++ test/cases/compiler/switchfail3.mar 2010-01-25 07:28:13 +0000 |
2174 | @@ -0,0 +1,10 @@ |
2175 | +# Test failing switch statements (which do not cover all cases) |
2176 | + |
2177 | +# Test failing cases with multiple levels of pattern matching |
2178 | +def matchfail(x :: Int) :: Int: |
2179 | + switch x: |
2180 | + # Matches some integer values, but has no default case |
2181 | + case 0: |
2182 | + return 1 |
2183 | + case 13: |
2184 | + return x |
2185 | |
2186 | === added file 'test/cases/compiler/switchfail3.mtc' |
2187 | --- test/cases/compiler/switchfail3.mtc 1970-01-01 00:00:00 +0000 |
2188 | +++ test/cases/compiler/switchfail3.mtc 2010-01-25 07:28:14 +0000 |
2189 | @@ -0,0 +1,5 @@ |
2190 | +# vim: filetype=yaml |
2191 | +compile: |
2192 | + outcome: compile_error |
2193 | + errors: | |
2194 | + switchfail3.mar:5: (matchfail) Incomplete switch: requires default case for switch on integer. |
Branch is complete. It has two new capabilities over trunk:
- CFG code for switches is completely different and much simpler.
- Errors are produced for incomplete switches.
Test cases and documentation are up to date and correct.
This has two ramifications for merging:
- Other branches of trunk may need to be modified, perhaps heavily, if they deal with the CFG representation of switches.
- Some Mars code may break due to the additional error. (But it shouldn't -- the manual already specified that this was an error). All official Mars code in the trunk has been tested and works.