Merge lp:~mgiuca/mars/switchfactor into lp:mars

Proposed by Matt Giuca
Status: Merged
Merged at revision: not available
Proposed branch: lp:~mgiuca/mars/switchfactor
Merge into: lp:mars
Diff against target: 2194 lines (+1570/-166)
17 files modified
doc/dev/index.rst (+31/-0)
doc/dev/switchfactor.rst (+795/-0)
doc/index.rst (+1/-0)
doc/ref/stmts.rst (+0/-8)
src/ast_cfg.m (+494/-72)
src/interpret.m (+46/-53)
src/ir.m (+25/-9)
src/mars.m (+1/-1)
src/marsc.m (+1/-1)
src/pretty.m (+30/-12)
test/cases/compiler/switch.mar (+109/-0)
test/cases/compiler/switchfail.mar (+0/-2)
test/cases/compiler/switchfail.mtc (+4/-8)
test/cases/compiler/switchfail2.mar (+13/-0)
test/cases/compiler/switchfail2.mtc (+5/-0)
test/cases/compiler/switchfail3.mar (+10/-0)
test/cases/compiler/switchfail3.mtc (+5/-0)
To merge this branch: bzr merge lp:~mgiuca/mars/switchfactor
Reviewer Review Type Date Requested Status
Matt Giuca Approve
Review via email: mp+17991@code.launchpad.net
To post a comment you must log in.
Revision history for this message
Matt Giuca (mgiuca) wrote :

Branch is complete. It has two new capabilities over trunk:
- CFG code for switches is completely different and much simpler.
- Errors are produced for incomplete switches.

Test cases and documentation are up to date and correct.

This has two ramifications for merging:
- Other branches of trunk may need to be modified, perhaps heavily, if they deal with the CFG representation of switches.
- Some Mars code may break due to the additional error. (But it shouldn't -- the manual already specified that this was an error). All official Mars code in the trunk has been tested and works.

lp:~mgiuca/mars/switchfactor updated
981. By Matt Giuca

Merged from trunk -- Repeated variable in pattern test.

982. By Matt Giuca

Test suite: switch: Added 3 new test cases from other switch examples.

983. By Matt Giuca

doc/dev/switchfactor: Added another fully-worked example, from a test case (note in the test case where the full working is).

Revision history for this message
Matt Giuca (mgiuca) wrote :

All good.

review: Approve

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== added directory 'doc/dev'
2=== added file 'doc/dev/index.rst'
3--- doc/dev/index.rst 1970-01-01 00:00:00 +0000
4+++ doc/dev/index.rst 2010-01-25 07:28:13 +0000
5@@ -0,0 +1,31 @@
6+.. Mars Documentation
7+ Copyright (C) 2010 Matt Giuca
8+ 19/1/2010
9+
10+.. This program is free software: you can redistribute it and/or modify
11+ it under the terms of the GNU General Public License as published by
12+ the Free Software Foundation, either version 3 of the License, or
13+ (at your option) any later version.
14+
15+.. This program is distributed in the hope that it will be useful,
16+ but WITHOUT ANY WARRANTY; without even the implied warranty of
17+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
18+ GNU General Public License for more details.
19+
20+.. You should have received a copy of the GNU General Public License
21+ along with this program. If not, see <http://www.gnu.org/licenses/>.
22+
23+.. _ref-dev:
24+
25+Developer Documentation
26+=======================
27+
28+.. sectionauthor:: Matt Giuca <matt.giuca@gmail.com>
29+
30+This section contains information about the workings of *Mars*, for developers
31+and other interested parties.
32+
33+.. toctree::
34+ :maxdepth: 2
35+
36+ switchfactor.rst
37
38=== added file 'doc/dev/switchfactor.rst'
39--- doc/dev/switchfactor.rst 1970-01-01 00:00:00 +0000
40+++ doc/dev/switchfactor.rst 2010-01-25 07:28:13 +0000
41@@ -0,0 +1,795 @@
42+.. Mars Documentation
43+ Copyright (C) 2010 Matt Giuca
44+ 19/1/2010
45+
46+.. This program is free software: you can redistribute it and/or modify
47+ it under the terms of the GNU General Public License as published by
48+ the Free Software Foundation, either version 3 of the License, or
49+ (at your option) any later version.
50+
51+.. This program is distributed in the hope that it will be useful,
52+ but WITHOUT ANY WARRANTY; without even the implied warranty of
53+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
54+ GNU General Public License for more details.
55+
56+.. You should have received a copy of the GNU General Public License
57+ along with this program. If not, see <http://www.gnu.org/licenses/>.
58+
59+.. _ref-switchfactor:
60+
61+**************************
62+Switch factoring algorithm
63+**************************
64+
65+.. sectionauthor:: Matt Giuca <matt.giuca@gmail.com>
66+
67+This is the algorithm in Mars for transforming an AST-level switch statement
68+to a CFG-level one. This flattens patterns out into multi-level switch
69+statements, so it is a non-recursive data structure (like the rest of CFG),
70+and represents the actual execution paths taken by the machine. It also
71+generates errors and warnings for malformed switch statements in the source.
72+
73+First, we transform each case into a ":token:`matcher`", a temporary data
74+structure representing a single case in an equivalent, but flattened, form.
75+For each case, recurse through the pattern structure, and generate a flat
76+"matcher". This results in a list of matchers, corresponding to the original
77+list of cases.
78+
79+.. productionlist::
80+ matcher: `pair`(`list`(`match`), `stmt_block`)
81+
82+A matcher list is an assoc list associating sequences of ":token:`match`"es
83+with blocks, just like a switch is an assoc list associating patterns with
84+blocks. A ":token:`match`" is a pattern without arguments (ctors have arities
85+rather than arguments) -- unlike patterns, matches don't recurse.
86+
87+.. productionlist::
88+ match: `match_any` | `match_var` | `match_ctor`(Name, Arity) | `match_intlit`(Val)
89+
90+A key operation in this algorithm is **generate_matcher_code**, which
91+generates low-level code from a matcher list. This recursive operation
92+performs most of the work, once the matcher list is generated. It is called
93+as::
94+
95+ generate_matcher_code(arg_list, matcher_list)
96+
97+Where `arg_list` is a list of atoms (variable names), denoting the variables
98+holding the switch values for the match list. The `arg_list` is always shorter
99+or equal in length to the `matcher_list`, as arg_list only has the arguments
100+for the current level, while `matcher_list` is fully expanded out.
101+
102+Example::
103+
104+ switch x:
105+ case Pair(A, 1)
106+ case Pair(B, 2)
107+ case Pair(A, 3)
108+ case Pair(q, _)
109+ case Pair(A, 5)
110+
111+We want the final output to be::
112+
113+ switch x:
114+ case Pair:
115+ a = x.Pair(0)
116+ n = x.Pair(1)
117+ switch a:
118+ case A:
119+ switch n:
120+ case 1
121+ case 3
122+ default: q = a
123+ case B:
124+ switch n:
125+ case 2
126+ default: q = a
127+ default: q = a
128+
129+The arguments to pair are patterns, so this data structure is recursive. We
130+want to flatten it out. Therefore, the matchers we generate are:
131+
132+ [(Pair/2,A,1)-block1, (Pair/2,B,2)-block2, (Pair/2,A,3)-block3, (Pair/2,q,_)-block4, (Pair/2,A,5)-block5]
133+
134+(All the match arguments would have an arity, such as A/0 and 1/0, but we
135+don't show /0 in the list above. But not the default arguments, such as q.)
136+
137+Now the constructors have no arguments -- we have a flat list of patterns. Note
138+that we recursively flatten this data, so if A had two arguments, we could see
139+a matcher like:
140+
141+ (Pair/2,A/2,J,K,1)-block1
142+
143+This reads operationally, "match a Pair, match the Pair's first argument with
144+A, match the A's first argument with J and its second with K, then match the
145+Pair's second argument with 1." So matchers are just a flat data structure for
146+representing the same information as the case statement, but with an
147+additional operational reading. Not all matchers have the same number of
148+steps, as different ctors have different arities.
149+
150+Now before we generate code for the matchers, we add a layer of abstraction
151+over the blocks, to avoid duplicating blocks when we duplicate matchers.
152+Hence, our matcher list becomes:
153+
154+ [(Pair/2,A,1)-"block1", (Pair/2,B,2)-"block2", (Pair/2,A,3)-"block3", (Pair/2,q,_)-"block4", (Pair/2,A,5)-"block5"]
155+
156+ "block1": block1
157+
158+ "block2": block2
159+
160+ "block3": block3
161+
162+ "block4": block4
163+
164+ "block5": block5
165+
166+(The unquoted block names represent the full block code; the quoted ones
167+represent a thin reference to the block name).
168+
169+The overall algorithm, outside of generating matcher code, goes, for
170+switch(`switchvar`, `cases`):
171+
172+1. Construct the matcher list, `matcher_list`, from cases, by flattening out
173+ the switch statement.
174+
175+ * Generate the code for each target block, and store the Block Ids in the
176+ matcher list, not the blocks themselves.
177+2. Call :func:`generate_matcher_code` ([`switchvar`], `matcher_list`).
178+
179+The result is a CFG-level switch statement. CFG-level switch statements are
180+very simple jump tables. They switch over an atom, and map ints and ctor names
181+to block IDs. They have an optional default clause, which is required to be
182+last. It is up to the backend to decide what to do if there is no default
183+clause, but the backend is free to exhibit undefined behaviour if a switch
184+doesn't match and there is no default clause, as the code generator will never
185+generate such an instruction due to exhaustive check, described below.
186+
187+In this example, we call::
188+
189+ generate_matcher_code(["x"], [(Pair/2,A,1)-"block1", (Pair/2,B,2)-"block2",
190+ (Pair/2,A,3)-"block3", (Pair/2,q,_)-"block4",
191+ (Pair/2,A,5)-"block5"])
192+
193+======================================
194+Generating code for a list of matchers
195+======================================
196+
197+.. function:: generate_matcher_code(arg_list, matcher_list)
198+
199+Summary of the algorithm: Focus only on the first argument for each matcher --
200+the rest will be handled by recursion. Convert variable-binding matchers into
201+wildcard/default matchers which bind a variable in their code. For every
202+relevant head value, group together every possible matcher (those with an
203+exact match, and wildcard/default matchers). Make these groupings into switch
204+statements, and recurse to generate the body.
205+
206+Note that we assume we will switch on arguments from left to right. We could
207+also pick a more intelligent order, such as choosing the argument with the
208+most distinct cases first (which means earlier switches eliminate the most
209+cases). In this case, the example below would choose the 2nd argument first,
210+but we'll assume left-to-right for now. Note that the flattening means we have
211+to make this decision *before* generating the matcher list.
212+
213+Each call to :func:`generate_matcher_code` is given a block to write to (this
214+block will be completed by the end of the call). When a choice is required,
215+the block will be finished with a switch statement, and new blocks will be
216+created for each case (and finished off by recursive calls). When no choice is
217+required, some code may be appended to the block, and then the (single)
218+recursive call will complete the same block.
219+
220+Calls to :func:`generate_matcher_code` may also append phi instructions to the
221+target blocks (which is the mechanism used to make explicit variable
222+bindings).
223+
224+The function also needs to track two tables, for keeping variable bindings.
225+Ultimately, variable bindings are stored in phi instructions in the target
226+blocks, but during the code generation, we track them separately, because we
227+can't build phi nodes piece-by-piece. The **binding table** only moves down
228+through the stack, so any changes made to it are undone on the way up. This
229+maps :class:`blockid` to :class:`varname` to :class:`varname`, such that if
230+BB-BoundVar maps to SourceVar, it says "if BB wants BoundVar, it is currently
231+available in SourceVar." The **phi table** moves down and up through the
232+stack, so changes made to it are permanent throughout the algorithm. This is a
233+multimap mapping :class:`blockid` to :class:`varname` to
234+:class:`pair(blockid,varname)`, such that if TargetBlock-BoundVar maps to
235+SourceBlock-SourceVar, it says "TargetBlock should have a phi node for
236+BoundVar, where one entry maps SourceBlock to SourceVar." Both tables start
237+out empty.
238+
239+Base case
240+---------
241+
242+When `arg_list` is empty. It should always be the case that
243+`matcher_list` contains exactly one element (but assume it has multiple and
244+pick the first), whose match list is empty. In this case, generate a branch to
245+the block named by the first element of `matcher_list`.
246+
247+Also add entries to the phi table for the target block. Get the target block
248+out of the binding table. For each entry in this table (BoundVar-SourceVar),
249+create a phi table entry (TargetBlock-BoundVar) = (CurrentBlock-SourceVar).
250+
251+Step 1 -- Update binding table with explicit variable bindings
252+--------------------------------------------------------------
253+
254+The first step of the operation (which in the current example will do nothing,
255+but we'll see it working later) is to update the binding table for explicit
256+variable bindings (as distinct from the implicit temporary variables, which
257+are created during switch generation). For each matcher whose first argument
258+is a variable name:
259+
260+1. Add an entry in the binding table mapping (that matcher's block - the first
261+ variable in `arg_list`) to the given variable name. For example, if
262+ `arg_list` [1] is "a" and the first entry in a matcher is "q", then the
263+ variable "q" must be bound to the first switch value. Therefore, add to the
264+ binding table (``<matcher's block>`` - `q`) -> `a`.
265+2. Replace the first argument of the matcher with match_any ("_"). This is
266+ sound, as variables match anything, and we no longer need to worry about
267+ the variable binding, as it will be taken care of as soon as we branch to
268+ the code.
269+
270+Step 2 -- Create association list
271+---------------------------------
272+
273+The next step is to create an assoc list for every head value of interest, for
274+the first argument. The assoc list maps "match"es onto match lists. "Head
275+value" means the ctor without its arguments, or the integer. "Of interest"
276+means the head values referred to by any matcher. Head values which are never
277+referred to (e.g., all the integers which aren't explicitly mentioned) are all
278+the same as far as we're concerned.
279+
280+Now we have to be *very careful* to preserve order, because default matchers
281+which are before more specific matchers need to supersede the specific ones.
282+(This assumes a language like Haskell, where the first match is chosen, not
283+Prolog/Mercury, where the order technically doesn't matter.) The grouping of
284+matchers with the same head will conceptually change the order, but this
285+*never* matters, because when we make a grouping, it is a committed choice. In
286+other words, once a match is made, we cover all possible subsequent cases
287+inside the match, and never fall back out to a default case. So we don't
288+re-order within a group, but once we have selected the groupings, their order
289+doesn't matter (though we keep them in order for the sake of predictable code
290+generation).
291+
292+Let's return to our example, with the matcher list:
293+
294+ [(Pair/2,A,1)-"block1", (Pair/2,B,2)-"block2", (Pair/2,A,3)-"block3", (Pair/2,q,_)-"block4", (Pair/2,A,5)-"block5"]
295+
296+The resulting assoc list is very basic at this level, since all of the
297+matchers have the same head, Pair/2:
298+
299+ Pair/2: [(A,1)-"block1", (B,2)-"block2", (A,3)-"block3", (q,_)-"block4", (A,5)-"block5"]
300+
301+It's difficult to see at this stage, but the operation is to take each head
302+value of interest, and give it its own entry in the assoc list. Then, place in
303+the assoc list any matcher which begins with either the head value, or a "_".
304+
305+* For matchers which begin with the head value, remove the first column of the
306+ matcher (so the matched head value no longer appears in the match list).
307+* For matchers which begin with a "_", replace the first column of the
308+ matcher (the "_") with one "_" for each argument of the head value (0 or
309+ more). For example, if placing a default matcher in the Cons/2 row, replace
310+ its front "_" with "_", "_" -- one for each argument. If placing a default
311+ matcher in the Nil row, remove its front "_".
312+
313+If there are any default cases, also create an assoc list entry for "_", and
314+only place in the assoc list any matcher which begins with a "_". Remove the
315+first column of such matchers, as if matching a 0-arity head value.
316+
317+**Horizontal rule**: For any matcher (looking at the remaining matches), if it
318+is *covered* by all of the matchers already on a row, remove it, as it is
319+redundant. By "covered", I mean for all of the remaining matches, if it is an
320+exact match, it already appears in a previous matcher, or if it is a default,
321+there is already a default in the previous matchers or between them they
322+include all constructors of that type. When removing a matcher, if the first
323+argument is not a default match, give a warning, as it is a useless clause.
324+
325+**Vertical rule**: For any assoc list entry, if its head is *covered* by all
326+of the entries already in the table, remove it, as it is redundant. By
327+"covered", I mean if it is an exact match, that head already appeared in the
328+assoc list. If it is a default, there is already a default in the assoc list
329+or between them, all entries include all constructors of that type. When
330+removing an assoc list entry, give a warning for each matcher, as they are all
331+useless. (Perhaps do not give a warning for a default matcher covered by all
332+constructors, as that probably represents defensive programming. For example,
333+the programmer might just be using the default rule as an assert because he
334+doesn't trust the compiler to generate "not exhaustive" errors -- then again,
335+maybe it should be a warning because he should trust the compiler!) When a
336+default entry is placed in the assoc list, it will always be the last (this is
337+a structural requirement of the CFG-level switch statement).
338+
339+Once this grouping is done (at each level), test that there is *either* a
340+default case, or that all ctors of the switch type have a case. If not, it is
341+a compiler error ("switch not exhaustive"). The error message should include
342+all of the missing cases ("requires default case or cases for A, B, C"), if it
343+is an ADT. If it is an integer type, it must have a default case, so the error
344+message should just read "switch not exhaustive: default case required for
345+switch on integer".
346+
347+Step 3 -- Generate switch statement and implicit unpacking instructions
348+-----------------------------------------------------------------------
349+
350+This enables us to generate the top-level switch statement. Switch over the
351+first variable of `arg_list`, and generate a case statement for each assoc
352+list entry. If the assoc list entry is default, generate the "default" clause
353+of the switch statement (which must be last). Within each case statement, we
354+unpack all arguments, giving them a temporary name.
355+
356+**Unnecessary optimisation**: If the variable's original pattern is _ in all
357+matchers, don't unpack it -- note that to know this, you actually shouldn't
358+erase the variable name in step 1, or you won't know which are needed and
359+which aren't. Also, if their pattern is a variable name in all matchers, we
360+can use that variable name and avoid the mov instruction on a later recursion
361+of step 1, but for an optimising backend like LLVM, that won't save anything.
362+
363+::
364+
365+ <continue current block>
366+ switch x:
367+ case Pair: goto casepair
368+
369+ casepair:
370+ a = x.Pair(0)
371+ n = x.Pair(1)
372+ <generate_matcher_code([a,n], [(A,1)-"block1", (B,2)-"block2",
373+ (A,3)-"block3", (q,_)-"block4",
374+ (A,5)-"block5"])>
375+
376+**Optimisation**: If `arg_list` has a single element, and there were no
377+variables unpacked, then all recursive calls will be base cases with a simple
378+goto. Avoid generating a bunch of new goto-only blocks by making the case
379+statements directly branch to the first target block in each case, as the base
380+case would. Must also update the phi table as the base case does.
381+
382+**Optimisation**: I naively generated a single-case switch statement. Instead,
383+we should avoid generating such a thing. If there is exactly one entry in the
384+assoc list, just generate the unpacking code directly::
385+
386+ <continue current block>
387+ a = x.Pair(0)
388+ n = x.Pair(1)
389+ <generate_matcher_code([a,n], [(A,1)-"block1", (B,2)-"block2",
390+ (A,3)-"block3", (q,_)-"block4",
391+ (A,5)-"block5"])>
392+
393+Step 4 -- Recurse
394+-----------------
395+
396+Now we recurse on :func:`generate_matcher_code`. The recursive calls are shown
397+in the example above. Note that we recurse once for each assoc list entry (so
398+we iterate, then recurse).
399+
400+The `arg_list` on the recursive calls is the current `arg_list` with the first
401+element removed, and the unpacked temporary variable names prepended. These
402+new arguments correspond to the first entries in the new matcher lists. The
403+`matcher_list` on the recursive calls is simply the corresponding value in the
404+assoc list, for this head.
405+
406+================================================================
407+Afterwards -- Generating phi code for explicit variable bindings
408+================================================================
409+
410+After all calls to :func:`generate_matcher_code` have ended, the phi table has
411+been built. This describes exactly where each target block should read the
412+variable bindings from, conditionally dependent on the block it came from.
413+
414+These should be converted into phi instructions for each block, in the obvious
415+way. **Unnecessary optimisation**: Any entries in the phi table which have the
416+same source variable name for every source block can be converted into mov
417+instructions instead.
418+
419+======================
420+Completing the example
421+======================
422+
423+I now present the rest of the worked example, stepping through the recursive
424+calls. ::
425+
426+ generate_matcher_code([a,n], [(A,1)-"block1", (B,2)-"block2", (A,3)-"block3",
427+ (q,_)-"block4", (A,5)-"block5"])
428+
429+First, we apply step 1, binding q for the matcher for (q,_).
430+
431+1. prepend to the actual code of block 4, the instruction "q = a" (where a is
432+ the first variable we are matching against).
433+2. Replace q with _ in the matchers list. Hence:
434+
435+ [(A,1)-"block1", (B,2)-"block2", (A,3)-"block3", (_,_)-"block4", (A,5)-"block5"]
436+
437+Now we again group together the matchers by their first argument -- this time
438+we have multiple ctors and default matchers in the first argument. For every
439+matched ctor, we duplicate all of the default matchers. Thus, the groups are:
440+
441+ A: [(1)-"block1", (3)-"block3", (_)-"block4", (5)-"block5"]
442+
443+ B: [(2)-"block2", (_)-"block4"]
444+
445+ _: [(_)-"block4"]
446+
447+Note that block4 was duplicated for the A group, even though it did not
448+explicitly match A. This preserves the semantics of the switch statement -- if
449+we had not duplicated it, then matching A would discount that default case. As
450+explained above, when duplicating a default value, first test to see if it is
451+*covered* by all of the matchers before it, and if so, do not duplicate.
452+
453+This allows us to generate the next switch statement::
454+
455+ switch a:
456+ case A:
457+ <generate_matcher_code([n], [(1)-"block1", (3)-"block3",
458+ (_)-"block4", (5)-"block5"])>
459+ case B:
460+ <generate_matcher_code([n], [(2)-"block2", (_)-"block4"])>
461+ default:
462+ <generate_matcher_code([n], [(_)-"block4"])>
463+
464+Now that we have three cases, we iterate over all of them, and recurse three
465+times.
466+
467+The assoc list for case A:
468+
469+ 1: [()-"block1"]
470+
471+ 3: [()-"block3"]
472+
473+ _: [()-"block4"]
474+
475+(Applying the vertical rule to remove the entry "5: [()-"block5"]").
476+
477+The assoc list for case B:
478+
479+ 2: [()-"block2"]
480+
481+ _: [()-"block4"]
482+
483+The assoc list for the default case:
484+
485+ _: [()-"block4"]
486+
487+This brings us to seven base cases, of <generate_matcher_code([],
488+[()-"blockn"])>, which just generates the goto statement to the given block.
489+
490+We end up with this code::
491+
492+ switch a:
493+ case A:
494+ switch n:
495+ case 1: goto "block1"
496+ case 3: goto "block3"
497+ default: goto "block4"
498+ case B:
499+ switch n:
500+ case 2: goto "block2"
501+ default: goto "block4"
502+ default: goto "block4"
503+
504+Then we generate all of the code for each block. Note that each block only has
505+code generated once, but may be targeted by multiple case statements, such as
506+block4.
507+
508+================
509+Further examples
510+================
511+
512+Above example, continued
513+------------------------
514+
515+Now, let us imagine a slightly different scenario, where we had these
516+matchers:
517+
518+ A: [(1,_)-"block1", (3,7)-"block3", (_,_)-"block4", (5,_)-"block5"]
519+
520+(Same as above for case A, but with an extra argument.) This means, by the
521+rule above, we need to copy the default match for each concrete match. Hence,
522+we generate the following grouping for case A:
523+
524+ 1: [(_)-"block1", (_)-"block4"]
525+
526+ 3: [(7)-"block3", (_)-"block4"]
527+
528+ _: [(_)-"block4"]
529+
530+ 5: [(_)-"block5", (_)-"block4"]
531+
532+This is inefficient, but perfectly correct. Operationally, it reads: "If you
533+match 1, match all and go to block1, and if that fails, match all and go to
534+block4." (Obviously, we will pick block1). Then, "If you match 3, match 7 and
535+go to block3, or match all and go to block4". Then it says "Match all and go
536+to block4." Finally, "If you match 5, match all and go to block5, and if that
537+fails, match all and go to block4." There are two forms of redundancy here.
538+
539+First, once a default case occurs, all subsequent cases can be removed
540+(preferably with a warning):
541+
542+ 1: [(_)-"block1", (_)-"block4"]
543+
544+ 3: [(7)-"block3", (_)-"block4"]
545+
546+ _: [(_)-"block4"]
547+
548+Second, once a matcher which *cannot fail* is seen, all subsequent matchers
549+can be removed:
550+
551+ 1: [(_)-"block1"]
552+
553+ 3: [(7)-"block3", (_)-"block4"]
554+
555+ _: [(_)-"block4"]
556+
557+This still perfectly preserves the semantics of the above matcher list --
558+there is no possibility of going to block5, or block4 from a match of 1.
559+
560+However, we would like to generate a warning in this last case as well, which
561+would be spurious if we duplicated any defaults then eliminated them again. So
562+we need to avoid duplicating in the first place.
563+
564+Therefore: When duplicating a default value, first test to see if any of the
565+matchers before it will always succeed, and if so, do not duplicate.
566+
567+Now, we can be even more judicious in preventing duplication. First, don't
568+just assume a matcher set cannot fail if it has a default case -- also check
569+for exhaustive ctor coverage. Secondly, it's not necessary for the matchers to
570+be unable to fail, it is merely sufficient for them to cover the duplicated
571+patterns. So: When duplicating a default value, first test to see if it is
572+covered by all of the matchers before it, and if so, do not duplicate.
573+
574+Also: Do not think of it as "duplicating" the default values. Simpler to think
575+of it as "projecting" away all values which don't match. So, include all exact
576+matches and also default values. Then, eliminate all later matches which are
577+covered by earlier matches, giving a warning iff the eliminated matches are
578+exact (and therefore are useless). If they are default matches, then they
579+aren't useless unless covered by all constructors on the outside -- they
580+should have already been checked by the vertical rule.
581+
582+An example of this:
583+
584+ [(1,A)-"block1", (1,B)-"block2", (2,A)-"block3", (_,B)-"block4", (_,_)-"block5"]
585+
586+Naively transformed:
587+
588+ 1: [(A)-"block1", (B)-"block2", (B)-"block4", (_)-"block5"]
589+
590+ 2: [(A)-"block3", (B)-"block4", (_)-"block5"]
591+
592+ _: [(B)-"block4", (_)-"block5"]
593+
594+However, we didn't need to duplicate the (_,B) case to case 1, as it is
595+covered by the (1,B) case. So:
596+
597+ 1: [(A)-"block1", (B)-"block2", (_)-"block5"]
598+
599+ 2: [(A)-"block3", (B)-"block4", (_)-"block5"]
600+
601+ _: [(B)-"block4", (_)-"block5"]
602+
603+If A and B are the only ctors for the switch type, then it would not be
604+necessary to copy the (_,_) case either if 1 or 2 is matched, as it is covered
605+in both cases:
606+
607+ 1: [(A)-"block1", (B)-"block2"]
608+
609+ 2: [(A)-"block3", (B)-"block4"]
610+
611+ _: [(B)-"block4", (_)-"block5"]
612+
613+Tricky variable binding example
614+-------------------------------
615+
616+Consider this code::
617+
618+ switch x:
619+ case Pair(49, Box(3))
620+ case Pair(m, Box(n))
621+
622+This example is important because in this case, the binding for `n` depends on
623+the path taken to the second block. This explains the need for Phi
624+instructions when making variable bindings. In this example, the second case's
625+`n` variable needs to be copied from a temporary unpack variable -- which
626+variable will depend on whether the first number was 49.
627+
628+Args: [x]
629+
630+Binds: []
631+
632+Matchers::
633+
634+ Pair, 49, Box, 3 - block1
635+ Pair, m, Box, n - block2
636+
637+Assoc::
638+
639+ Pair: [49, Box, 3]-block1, [m, Box, n]-block2
640+
641+Codegen::
642+
643+ entry:
644+ a = x.Pair(0)
645+ b = x.Pair(1)
646+ <continue>
647+
648+----
649+
650+Args: [a,b]
651+
652+Binds: []
653+
654+Matchers::
655+
656+ 49, Box, 3 - block1
657+ m, Box, n - block2
658+
659+Bind m=a in block2's binding table.
660+
661+::
662+
663+ 49, Box, 3 - block1
664+ _, Box, n - block2
665+
666+Assoc list::
667+
668+ 49: [Box, 3]-block1, [Box, n]-block2
669+ _: [Box, n]-block2
670+
671+Codegen::
672+
673+ <continue entry>
674+
675+ switch a:
676+ case 49: goto case49
677+ default: goto default
678+ case49:
679+ <continue>
680+ default:
681+ <continue>
682+
683+----
684+
685+**(case 49)**
686+
687+Args: [b]
688+
689+Binds: [(block2-m)=a]
690+
691+Matchers::
692+
693+ Box, 3 - block1
694+ Box, n - block2
695+
696+Assoc list::
697+
698+ Box: [3]-block1, [n]-block2
699+
700+Codegen::
701+
702+ <continue case49>
703+ c = b.Box(0)
704+ <continue>
705+
706+----
707+
708+**(case 49, Box)**
709+
710+Args: [c]
711+
712+Binds: [(block2-m)=a]
713+
714+Matchers::
715+
716+ 3 - block1
717+ n - block2
718+
719+Bind n=c in block2's binding table.
720+
721+.. note:: A phi is required, as the binding is different to that below.
722+
723+::
724+
725+ 3 - block1
726+ _ - block2
727+
728+Codegen::
729+
730+ <continue case49>
731+ switch c:
732+ case 3: goto block1
733+ default: goto block2
734+
735+.. note:: This isn't quite the base case. We could recurse once more,
736+ generating two new blocks, and then each block would consist
737+ entirely of a goto. However, this pre-base-case allows more
738+ efficient code generation.
739+
740+Write to phi table: (block2-m)=(case49-a)
741+
742+Write to phi table: (block2-n)=(case49-c)
743+
744+----
745+
746+**(case default)**
747+
748+Args: [b]
749+
750+Binds: [(block2-m)=a]
751+
752+Matchers::
753+
754+ Box, n - block2
755+
756+Assoc list::
757+
758+ Box: [n]-block2
759+
760+Codegen::
761+
762+ <continue default>
763+ d = b.Box(0)
764+ <continue>
765+
766+----
767+
768+**(case default, Box)**
769+
770+Args: [d]
771+
772+Binds: [(block2-m)=a]
773+
774+Matchers::
775+
776+ n - block2
777+
778+Bind n=d in block2's binding table.
779+
780+.. note:: A phi is required, as the binding is different to that above.
781+
782+::
783+
784+ _ - block2
785+
786+Assoc list::
787+
788+ _: []-block2
789+
790+Codegen::
791+
792+ <continue default>
793+ goto block2
794+
795+Write to phi table: (block2-m)=(default-a)
796+
797+Write to phi table: (block2-n)=(default-d)
798+
799+----
800+
801+**(Final generated code)**
802+
803+Final phi table::
804+
805+ (block2-m)=[(case49-a), (default-a)]
806+ (block2-n)=[(case49-c), (default-d)]
807+
808+::
809+
810+ entry:
811+ a = x.Pair(0)
812+ b = x.Pair(1)
813+ switch a:
814+ case 49: goto case49
815+ default: goto default
816+
817+ case49:
818+ c = b.Box(0)
819+ switch c:
820+ case 3: goto block1
821+ default: goto block2
822+
823+ default:
824+ d = b.Box(0)
825+ goto block2
826+
827+ block1:
828+ ...
829+
830+ block2:
831+ n = phi(case49 = c, default = d)
832+ m = phi(case49 = a, default = a)
833+ ...
834+
835+.. note:: The phi node for m in block2 may be converted into a mov
836+ instruction, because the binding is the same for all predecessors.
837
838=== modified file 'doc/index.rst'
839--- doc/index.rst 2009-10-02 06:21:23 +0000
840+++ doc/index.rst 2010-01-25 07:28:13 +0000
841@@ -35,6 +35,7 @@
842 ref/index.rst
843 lib/index.rst
844 faq.rst
845+ dev/index.rst
846 testing.rst
847
848 Indices and tables
849
850=== modified file 'doc/ref/stmts.rst'
851--- doc/ref/stmts.rst 2009-02-05 07:36:52 +0000
852+++ doc/ref/stmts.rst 2010-01-25 07:28:13 +0000
853@@ -227,14 +227,6 @@
854 proven either by having a wildcard pattern (underscore or variable), or by
855 matching all alternatives of a user-defined data type.
856
857-.. warning::
858- The current implementation does not obey this last rule. Currently, this
859- check is made at runtime, not compile time. If no case of a switch
860- statement matches, the program will terminate with a runtime error,
861- "pattern match failure".
862- Programmers should ensure that the above rule holds statically for all
863- switch statements, as a future version may enforce this rule.
864-
865 Note that the effects of variable binding outlast the :keyword:`switch`
866 statement. Their binding is like an
867 :ref:`assignment statement <ref-assignment-statements>`.
868
869=== modified file 'src/ast_cfg.m'
870--- src/ast_cfg.m 2010-01-25 01:23:14 +0000
871+++ src/ast_cfg.m 2010-01-25 07:28:13 +0000
872@@ -29,12 +29,13 @@
873 :- interface.
874
875 :- import_module ir.
876+:- import_module tables.
877
878 :- import_module list.
879
880 % prog_to_cfg(!Program) converts all functions in a program to CFG form,
881 % calling func_to_cfg on each of them.
882-:- pred prog_to_cfg(program::in, program::out) is det.
883+:- pred prog_to_cfg(progtable::in, program::in, program::out) is det.
884
885 % func_to_cfg(!Function) converts a function to CFG form if possible.
886 % Built-in functions and CFG functions remain the same.
887@@ -44,7 +45,7 @@
888 % - There exists a viable code path which does not return a value.
889 % - There is an attempt to read from a variable which may not have been
890 % assigned on some or all code paths.
891-:- pred func_to_cfg(function::in, function::out) is det.
892+:- pred func_to_cfg(progtable::in, function::in, function::out) is det.
893
894 % expr_to_instrs(+Expr, -VarName, -Instrs).
895 % Converts a single expression into a sequence of low-level instructions
896@@ -68,28 +69,35 @@
897 :- implementation.
898
899 :- import_module pair.
900+:- import_module assoc_list.
901 :- import_module map.
902+:- import_module multi_map.
903+:- import_module svmap.
904+:- import_module svmulti_map.
905 :- import_module maybe.
906 :- import_module string.
907 :- import_module int.
908+:- import_module integer.
909 :- import_module exception.
910
911 :- import_module cfg.
912 :- import_module context.
913-
914-prog_to_cfg(program(Nodes0), program(Nodes)) :-
915- prog_nodes_to_cfg(Nodes0, Nodes).
916-
917-:- pred prog_nodes_to_cfg(list(program_node)::in, list(program_node)::out).
918-prog_nodes_to_cfg([], []).
919-prog_nodes_to_cfg([Node0|Nodes0], [Node|Nodes]) :-
920+:- import_module util.
921+
922+prog_to_cfg(PT, program(Nodes0), program(Nodes)) :-
923+ prog_nodes_to_cfg(PT, Nodes0, Nodes).
924+
925+:- pred prog_nodes_to_cfg(progtable::in,
926+ list(program_node)::in, list(program_node)::out).
927+prog_nodes_to_cfg(_, [], []).
928+prog_nodes_to_cfg(PT, [Node0|Nodes0], [Node|Nodes]) :-
929 ( Node0 = pfunction(Func0) ->
930- func_to_cfg(Func0, Func),
931+ func_to_cfg(PT, Func0, Func),
932 Node = pfunction(Func)
933 ;
934 Node = Node0
935 ),
936- prog_nodes_to_cfg(Nodes0, Nodes).
937+ prog_nodes_to_cfg(PT, Nodes0, Nodes).
938
939 :- pred new_def_map(def_map::out) is det.
940 new_def_map(Map) :-
941@@ -111,7 +119,7 @@
942 def_map_insert_local(Name-_, !DefMap) :-
943 map.det_insert(!.DefMap, Name, unbound, !:DefMap).
944
945-func_to_cfg(!Func) :-
946+func_to_cfg(PT, !Func) :-
947 % Create a new defmap with all of the params.
948 new_def_map(DefMap0),
949 Params = !.Func ^ func_params,
950@@ -123,28 +131,28 @@
951 Params = yes(YesParams),
952 foldl(def_map_insert_param, YesParams, DefMap0, DefMap)
953 ),
954- func_body_to_cfg(DefMap, !.Func ^ func_context, !.Func ^ func_body,
955+ func_body_to_cfg(PT, DefMap, !.Func ^ func_context, !.Func ^ func_body,
956 CFGBody),
957 !:Func = !.Func ^ func_body := CFGBody.
958
959 % Note: def_map argument should have all of the function's parameters already
960 % bound in the map.
961-:- pred func_body_to_cfg(def_map::in, context::in,
962+:- pred func_body_to_cfg(progtable::in, def_map::in, context::in,
963 func_body::in, func_body::out) is det.
964-func_body_to_cfg(_DefMap, _Ctx, B@func_builtin, B).
965-func_body_to_cfg(_DefMap, _Ctx, B@func_body_cfg(_,_), B).
966-func_body_to_cfg(DefMap0, Ctx, func_body_ast(Decls,Stmts), NewFuncBody) :-
967+func_body_to_cfg(_PT, _DefMap, _Ctx, B@func_builtin, B).
968+func_body_to_cfg(_PT, _DefMap, _Ctx, B@func_body_cfg(_,_), B).
969+func_body_to_cfg(PT, DefMap0, Ctx, func_body_ast(Decls,Stmts), NewFuncBody) :-
970 NewFuncBody = 'new func_body_cfg'(Decls, CFG),
971 % Insert all of the function's locals into the defmap, unbound
972 foldl(def_map_insert_local, Decls, DefMap0, DefMap),
973- ast_to_cfg(DefMap, Ctx, Stmts, CFG).
974+ ast_to_cfg(PT, DefMap, Ctx, Stmts, CFG).
975
976 % ast_to_cfg(DefMap, Stmts, CFG) converts a statement block (AST form) to a
977 % control flow graph, translating the code.
978 % This also performs error checking (see func_to_cfg).
979-:- some [S] pred ast_to_cfg(def_map::in, context::in, stmt_block::in,
980- cfg(S)::out) is det.
981-ast_to_cfg(DefMap, Ctx, Stmts, CFG) :-
982+:- some [S] pred ast_to_cfg(progtable::in, def_map::in, context::in,
983+ stmt_block::in, cfg(S)::out) is det.
984+ast_to_cfg(PT, DefMap, Ctx, Stmts, CFG) :-
985 some [!CFG]
986 (
987 % Need to create the entry block and exit block so that
988@@ -153,8 +161,8 @@
989 !:CFG = cfg.new_cfg,
990 Entry = cfg.get_entry(!.CFG),
991 Exit = cfg.get_exit(!.CFG),
992- stmt_block_to_cfg(Stmts, Entry, Exit, DefMap, map.init, AfterPredMap,
993- map.init, ExitPredMap, !CFG),
994+ stmt_block_to_cfg(PT, Stmts, Entry, Exit, DefMap,
995+ map.init,AfterPredMap, map.init, ExitPredMap, !CFG),
996 % The union of AfterPredMap and ExitPredMap contains information about
997 % all predecessors to Exit. Use this to compute phi instructions for
998 % Exit.
999@@ -464,38 +472,39 @@
1000 % for the purpose of computing phis.
1001 % Throws a context error if there is an attempt to read from a variable which
1002 % may not have been assigned on some or all code paths.
1003-:- pred stmt_block_to_cfg(stmt_block::in, bbref(S)::in, bbref(S)::in,
1004- def_map::in, pred_map(S)::in, pred_map(S)::out,
1005+:- pred stmt_block_to_cfg(progtable::in, stmt_block::in, bbref(S)::in,
1006+ bbref(S)::in, def_map::in, pred_map(S)::in, pred_map(S)::out,
1007 pred_map(S)::in, pred_map(S)::out, cfg(S)::in, cfg(S)::out)
1008 is det.
1009-stmt_block_to_cfg(Stmts, BBCurrent, BBAfter, DefMap, !AfterPredMap,
1010+stmt_block_to_cfg(PT, Stmts, BBCurrent, BBAfter, DefMap, !AfterPredMap,
1011 !ExitPredMap, !CFG) :-
1012 map.init(SubscriptMap),
1013- stmt_block_to_cfg(Stmts, BBCurrent, BBAfter, SubscriptMap, DefMap,
1014+ stmt_block_to_cfg(PT, Stmts, BBCurrent, BBAfter, SubscriptMap, DefMap,
1015 !AfterPredMap, !ExitPredMap, !CFG).
1016
1017-:- pred stmt_block_to_cfg(stmt_block::in, bbref(S)::in, bbref(S)::in,
1018+:- pred stmt_block_to_cfg(progtable::in, stmt_block::in, bbref(S)::in,
1019+ bbref(S)::in,
1020 subscript_map::in, def_map::in, pred_map(S)::in, pred_map(S)::out,
1021 pred_map(S)::in, pred_map(S)::out, cfg(S)::in, cfg(S)::out) is det.
1022-stmt_block_to_cfg([], BBCurrent, BBAfter, _SubscriptMap, DefMap,
1023+stmt_block_to_cfg(_PT, [], BBCurrent, BBAfter, _SubscriptMap, DefMap,
1024 !AfterPredMap, !ExitPredMap, !CFG) :-
1025 % Special case for empty block - just branch to BBAfter.
1026 add_to_pred_map(BBCurrent, DefMap, BBAfter, !AfterPredMap, !CFG),
1027 cfg.set_terminator(BBCurrent, branch(BBAfter, blank_context), !CFG).
1028-stmt_block_to_cfg([S], BBCurrent, BBAfter, SubscriptMap, DefMap,
1029+stmt_block_to_cfg(PT, [S], BBCurrent, BBAfter, SubscriptMap, DefMap,
1030 !AfterPredMap, !ExitPredMap, !CFG) :-
1031 % Base-case for the last statement in a block - compile statement into
1032 % BBCurrent, but end up by branching to BBAfter.
1033- stmt_to_cfg_last(S, BBCurrent, BBAfter, SubscriptMap, DefMap,
1034+ stmt_to_cfg_last(PT, S, BBCurrent, BBAfter, SubscriptMap, DefMap,
1035 !AfterPredMap, !ExitPredMap, !CFG).
1036-stmt_block_to_cfg([S|Ss@[_|_]], BBCurrent, BBAfter, SubscriptMap0, DefMap0,
1037+stmt_block_to_cfg(PT, [S|Ss@[_|_]], BBCurrent, BBAfter, SubscriptMap0,DefMap0,
1038 !AfterPredMap, !ExitPredMap, !CFG) :-
1039 % Compile a single statement into BBCurrent. BBContinue is the block to
1040 % compile subsequent statements into (may be equal to BBCurrent).
1041- stmt_to_cfg(S, BBCurrent, BBContinue, SubscriptMap0, SubscriptMap,
1042+ stmt_to_cfg(PT, S, BBCurrent, BBContinue, SubscriptMap0, SubscriptMap,
1043 DefMap0, DefMap, !ExitPredMap, !CFG),
1044 % Compile subsequent statements into BBContinue.
1045- stmt_block_to_cfg(Ss, BBContinue, BBAfter, SubscriptMap, DefMap,
1046+ stmt_block_to_cfg(PT, Ss, BBContinue, BBAfter, SubscriptMap, DefMap,
1047 !AfterPredMap, !ExitPredMap, !CFG).
1048
1049 %%% HIGH-LEVEL STATEMENT-COMPILATION PREDICATES %%%
1050@@ -514,18 +523,18 @@
1051 % will be completed (have a terminator set), and SubscriptMap will be reset
1052 % to refer to the new block (and typically be empty).
1053 % May also update def_map with any new variable definitions.
1054-:- pred stmt_to_cfg(stmt::in, bbref(S)::in, bbref(S)::out,
1055+:- pred stmt_to_cfg(progtable::in, stmt::in, bbref(S)::in, bbref(S)::out,
1056 subscript_map::in, subscript_map::out, def_map::in, def_map::out,
1057 pred_map(S)::in, pred_map(S)::out, cfg(S)::in, cfg(S)::out) is det.
1058-stmt_to_cfg(basic_stmt(S,Ctx), !BBRef, !SubscriptMap, !DefMap, !ExitPredMap,
1059- !CFG) :-
1060+stmt_to_cfg(_PT, basic_stmt(S,Ctx), !BBRef, !SubscriptMap, !DefMap,
1061+ !ExitPredMap, !CFG) :-
1062 basic_stmt_to_cfg(S, Ctx, !.BBRef, !SubscriptMap, !DefMap, !CFG).
1063-stmt_to_cfg(compound_stmt(S,Ctx), BBCurrent, BBAfter, !SubscriptMap,
1064+stmt_to_cfg(PT, compound_stmt(S,Ctx), BBCurrent, BBAfter, !SubscriptMap,
1065 !DefMap, !ExitPredMap, !CFG) :-
1066 % Generate a target block for the compound statement to branch to
1067 cfg.new_basic_block(BBAfter, !CFG),
1068- compound_stmt_to_cfg(S, Ctx, BBCurrent, BBAfter, !.SubscriptMap, !.DefMap,
1069- map.init, AfterPredMap, !ExitPredMap, !CFG),
1070+ compound_stmt_to_cfg(PT, S, Ctx, BBCurrent, BBAfter, !.SubscriptMap,
1071+ !.DefMap, map.init, AfterPredMap, !ExitPredMap, !CFG),
1072 % AfterPredMap contains information about all predecessors to BBAfter.
1073 % Use this to compute phi instructions for BBAfter and update DefMap.
1074 AfterID = ref_id(BBAfter, !.CFG),
1075@@ -544,19 +553,19 @@
1076 % for the purpose of computing phis.
1077 % May update DefMap with new bindings, but SubscriptMap is read-only (since it
1078 % will definitely branch to a new block).
1079-:- pred stmt_to_cfg_last(stmt::in, bbref(S)::in, bbref(S)::in,
1080+:- pred stmt_to_cfg_last(progtable::in, stmt::in, bbref(S)::in, bbref(S)::in,
1081 subscript_map::in, def_map::in, pred_map(S)::in, pred_map(S)::out,
1082 pred_map(S)::in, pred_map(S)::out, cfg(S)::in, cfg(S)::out) is det.
1083-stmt_to_cfg_last(basic_stmt(S,Ctx), BBCurrent, BBAfter, SubscriptMap,
1084+stmt_to_cfg_last(_PT, basic_stmt(S,Ctx), BBCurrent, BBAfter, SubscriptMap,
1085 DefMap, !AfterPredMap, !ExitPredMap, !CFG) :-
1086 basic_stmt_to_cfg(S, Ctx, BBCurrent, SubscriptMap, _, DefMap, NewDefMap,
1087 !CFG),
1088 % Branch to the provided BBAfter
1089 add_to_pred_map(BBCurrent, NewDefMap, BBAfter, !AfterPredMap, !CFG),
1090 cfg.set_terminator(BBCurrent, branch(BBAfter, Ctx), !CFG).
1091-stmt_to_cfg_last(compound_stmt(S,Ctx), BBCurrent, BBAfter, SubscriptMap,
1092+stmt_to_cfg_last(PT, compound_stmt(S,Ctx), BBCurrent, BBAfter, SubscriptMap,
1093 DefMap, !AfterPredMap, !ExitPredMap, !CFG) :-
1094- compound_stmt_to_cfg(S, Ctx, BBCurrent, BBAfter, SubscriptMap,
1095+ compound_stmt_to_cfg(PT, S, Ctx, BBCurrent, BBAfter, SubscriptMap,
1096 DefMap, !AfterPredMap, !ExitPredMap, !CFG).
1097
1098 %%% LOW-LEVEL STATEMENT-COMPILATION PREDICATES %%%
1099@@ -729,11 +738,11 @@
1100 % SubscriptMap is read-only (since it will definitely branch to a new block).
1101 % DefMap is read-only (the resulting def-maps should be appended to
1102 % AfterPredMap).
1103-:- pred compound_stmt_to_cfg(compound_stmt::in, context::in, bbref(S)::in,
1104- bbref(S)::in, subscript_map::in, def_map::in,
1105+:- pred compound_stmt_to_cfg(progtable::in, compound_stmt::in, context::in,
1106+ bbref(S)::in, bbref(S)::in, subscript_map::in, def_map::in,
1107 pred_map(S)::in, pred_map(S)::out, pred_map(S)::in, pred_map(S)::out,
1108 cfg(S)::in, cfg(S)::out) is det.
1109-compound_stmt_to_cfg(return(Expr), Ctx, BBCurrent, _BBAfter, SubscriptMap,
1110+compound_stmt_to_cfg(_PT,return(Expr), Ctx, BBCurrent, _BBAfter, SubscriptMap,
1111 DefMap0, !AfterPredMap, !ExitPredMap, !CFG) :-
1112 % Return is expressed in CFG as $RET = expr, followed by a branch to the
1113 % exit block.
1114@@ -742,17 +751,26 @@
1115 ExitBlock = cfg.get_exit(!.CFG),
1116 add_to_pred_map(BBCurrent, DefMap, ExitBlock, !ExitPredMap, !CFG),
1117 cfg.set_terminator(BBCurrent, branch(ExitBlock, Ctx), !CFG).
1118-compound_stmt_to_cfg(switch(Ctrl0, Cases0), Ctx, BBCurrent, BBAfter,
1119+compound_stmt_to_cfg(PT, switch(Ctrl0, Cases0), Ctx, BBCurrent, BBAfter,
1120 SubscriptMap, DefMap, !AfterPredMap, !ExitPredMap, !CFG) :-
1121 % Calculate the condition expression at the end of BBCurrent
1122 apply_def_map_to_expr(DefMap, Ctx, Ctrl0, Ctrl),
1123 BlockID = ref_id(BBCurrent, !.CFG),
1124 expr_to_instrs(BlockID, Ctx, Ctrl, CtrlVar, CtrlInstrs, SubscriptMap, _),
1125 cfg.append_instrs(BBCurrent, CtrlInstrs, !CFG),
1126- list.map_foldl3(case_stmt_to_cfg(BBCurrent,BBAfter,SubscriptMap,DefMap),
1127+ % Convert all patterns to use SSA variables, and generate CFG code for the
1128+ % bodies of all case statements. The resulting Cases contains goto
1129+ % statements for each generated block.
1130+ list.map_foldl3(case_stmt_to_jmp(PT,BBCurrent,BBAfter,SubscriptMap,DefMap),
1131 Cases0, Cases, !AfterPredMap, !ExitPredMap, !CFG),
1132- cfg.set_terminator(BBCurrent, t_switch(CtrlVar, Cases, Ctx), !CFG).
1133-compound_stmt_to_cfg(if_then_else(Cond0,ThenPart,ElsePart), Ctx, BBCurrent,
1134+ % Now begin switch factoring transformation
1135+ % Convert each case statement to a matcher (flatten the patterns)
1136+ Matchers = list.map(case_to_matcher, Cases),
1137+ % Generate all switch code. This terminates the current block.
1138+ generate_matcher_code(PT, [CtrlVar], Matchers, map.init, BBCurrent,
1139+ SubscriptMap, Ctx, map.init, PhiTable, !CFG),
1140+ map.foldl(generate_case_block_phis, PhiTable, !CFG).
1141+compound_stmt_to_cfg(PT, if_then_else(Cond0,ThenPart,ElsePart), Ctx,BBCurrent,
1142 BBAfter, SubscriptMap, DefMap, !AfterPredMap, !ExitPredMap, !CFG) :-
1143 % Calculate the condition expression at the end of BBCurrent
1144 apply_def_map_to_expr(DefMap, Ctx, Cond0, Cond),
1145@@ -762,7 +780,7 @@
1146 % Create a new basic block for the "then" part.
1147 cfg.new_basic_block(ThenEnter, !CFG),
1148 cfg.append_predecessor(ThenEnter, BBCurrent, !CFG),
1149- stmt_block_to_cfg(ThenPart, ThenEnter, BBAfter, DefMap, !AfterPredMap,
1150+ stmt_block_to_cfg(PT, ThenPart, ThenEnter, BBAfter, DefMap, !AfterPredMap,
1151 !ExitPredMap, !CFG),
1152 ( list.is_empty(ElsePart) ->
1153 % Branch straight to the "after" block if the condition fails
1154@@ -773,14 +791,14 @@
1155 % Create a new basic block for the "else" part.
1156 cfg.new_basic_block(ElseEnter, !CFG),
1157 cfg.append_predecessor(ElseEnter, BBCurrent, !CFG),
1158- stmt_block_to_cfg(ElsePart, ElseEnter, BBAfter, DefMap, !AfterPredMap,
1159- !ExitPredMap, !CFG),
1160+ stmt_block_to_cfg(PT, ElsePart, ElseEnter, BBAfter, DefMap,
1161+ !AfterPredMap, !ExitPredMap, !CFG),
1162 % Have the current block terminate by branching into the then/else
1163 % blocks
1164 cfg.set_terminator(BBCurrent,
1165 cond_branch(CondVar,ThenEnter,ElseEnter,Ctx), !CFG)
1166 ).
1167-compound_stmt_to_cfg(while(Cond0, Body), Ctx, BBCurrent, BBAfter,
1168+compound_stmt_to_cfg(PT, while(Cond0, Body), Ctx, BBCurrent, BBAfter,
1169 SubscriptMap, DefMap, !AfterPredMap, !ExitPredMap, !CFG) :-
1170 % Create a new basic block for the condition and body
1171 cfg.new_basic_block(CondBlock, !CFG),
1172@@ -800,7 +818,7 @@
1173
1174 % Do the whole fixpoint. Updates CFG by setting CondBlock's phis, and
1175 % compiling Body.
1176- while_fixpoint(CondBlock, BodyEnter, Body, CondBlockPredMap,
1177+ while_fixpoint(PT, CondBlock, BodyEnter, Body, CondBlockPredMap,
1178 !ExitPredMap, DefMap, CondBlockDefMap, !CFG),
1179
1180 % CondBlock branches to BodyBlock or BBAfter, based on the condition
1181@@ -825,22 +843,24 @@
1182 % Updates the CFG as follows:
1183 % - Sets the phis of CondBlock to the computed phi nodes.
1184 % - Compiles the body into BodyEnter, branching to CondBlock.
1185-:- pred while_fixpoint(bbref(S)::in, bbref(S)::in, stmt_block::in,
1186+:- pred while_fixpoint(progtable::in, bbref(S)::in, bbref(S)::in,
1187+ stmt_block::in,
1188 pred_map(S)::in, pred_map(S)::in, pred_map(S)::out,
1189 def_map::in, def_map::out, cfg(S)::in, cfg(S)::out) is det.
1190-while_fixpoint(CondBlock, BodyEnter, Body, CondPredMap, !ExitPredMap,
1191+while_fixpoint(PT, CondBlock, BodyEnter, Body, CondPredMap, !ExitPredMap,
1192 !DefMap, !CFG) :-
1193 % First iteration, assume there are no phis
1194- while_fixpoint_(CondBlock, BodyEnter, Body, CondPredMap, [], !ExitPredMap,
1195- !DefMap, !CFG).
1196+ while_fixpoint_(PT, CondBlock, BodyEnter, Body, CondPredMap, [],
1197+ !ExitPredMap, !DefMap, !CFG).
1198
1199-:- pred while_fixpoint_(bbref(S)::in, bbref(S)::in, stmt_block::in,
1200+:- pred while_fixpoint_(progtable::in, bbref(S)::in, bbref(S)::in,
1201+ stmt_block::in,
1202 pred_map(S)::in, list(phi(S))::in, pred_map(S)::in, pred_map(S)::out,
1203 def_map::in, def_map::out, cfg(S)::in, cfg(S)::out) is det.
1204-while_fixpoint_(CondBlock, BodyEnter, Body, CondPredMap0, Phis0,
1205+while_fixpoint_(PT, CondBlock, BodyEnter, Body, CondPredMap0, Phis0,
1206 !ExitPredMap, !CondDefMap, !CFG) :-
1207 % Compute CondPredMap (after another iteration of the body)
1208- stmt_block_to_cfg(Body, BodyEnter, CondBlock, !.CondDefMap,
1209+ stmt_block_to_cfg(PT, Body, BodyEnter, CondBlock, !.CondDefMap,
1210 CondPredMap0, CondPredMap, !ExitPredMap, !.CFG, PotentialFinalCFG),
1211 % Now we can compute the phis for the cond block
1212 CondBlockID = ref_id(CondBlock, !.CFG),
1213@@ -853,13 +873,416 @@
1214 !:CondDefMap = CondDefMap1
1215 ;
1216 % Now iterate till fixpoint
1217- while_fixpoint_(CondBlock, BodyEnter, Body, CondPredMap, Phis,
1218+ while_fixpoint_(PT, CondBlock, BodyEnter, Body, CondPredMap, Phis,
1219 !ExitPredMap, CondDefMap1, !:CondDefMap, !CFG)
1220 ).
1221
1222-% case_stmt_to_cfg(+BBAfter, +SubscriptMap, +DefMap, +CaseStmt, -CFGCaseStmt,
1223+% Intermediate representation of a case statement -- still contains recursive
1224+% patterns, but branches to block references rather than containing AST code.
1225+% (This is half-way between a case_stmt and a matcher.)
1226+:- type jmp_case_stmt(S)
1227+ ---> jmp_case_stmt(pattern, cfg.bbref(S), context.context).
1228+
1229+% Intermediate representation for switch statements being factored
1230+% A "matcher" is like a flattened case statement. Rather than having just a
1231+% single pattern, it has a list of matches, which represent the pattern heads
1232+% which need to be matched, in order, in a depth-first left-to-right traversal
1233+% of the pattern tree.
1234+:- type matcher(S)
1235+ ---> matcher(list(match), cfg.bbref(S)).
1236+% A match is the head of a pattern -- same as a pattern but ctors aren't
1237+% recursive.
1238+:- type match
1239+ % "_" pattern (don't care / wildcard)
1240+ ---> match_any
1241+ % Variable. Matches anything, and binds a new variable of the
1242+ % given name.
1243+ ; match_var(varname)
1244+ % Constructor match. Matches only values with the given ctor
1245+ % name and arity.
1246+ ; match_ctor(string, int)
1247+ % Integer literal. Matches exactly this value.
1248+ ; match_intlit(integer).
1249+
1250+% Binding table for switch generation. Determines, for a given target block
1251+% and bound variable name, which source variable holds the value of the bound
1252+% variable.
1253+:- type binding_table(S) == map(cfg.bbref(S), map(varname, varname)).
1254+
1255+% Phi table for switch generation. Dynamically-constructed table of phi nodes
1256+% for blocks which are targets of case statements.
1257+:- type phi_table(S) == map(cfg.bbref(S),
1258+ multi_map(varname, pair(cfg.bbref(S), varname))).
1259+
1260+% Create a new entry in the phi table.
1261+:- pred create_phi_table_entry(bbref(S)::in, varname::in, bbref(S)::in,
1262+ varname::in, phi_table(S)::in, phi_table(S)::out) is det.
1263+create_phi_table_entry(Target, BoundVar, SourceBlock, SourceVar, !PhiTable) :-
1264+ ( map.search(!.PhiTable, Target, TPM) ->
1265+ TargetPhiMap0 = TPM
1266+ ;
1267+ TargetPhiMap0 = multi_map.init
1268+ ),
1269+ Source = SourceBlock-SourceVar,
1270+ multi_map.add(TargetPhiMap0, BoundVar, Source, TargetPhiMap),
1271+ svmap.set(Target, TargetPhiMap, !PhiTable).
1272+
1273+% Flatten a case statement out to a "matcher"
1274+:- func case_to_matcher(jmp_case_stmt(S)) = matcher(S).
1275+case_to_matcher(jmp_case_stmt(Pat, BBRef, _)) = matcher(Matches, BBRef) :-
1276+ Matches = pattern_to_matches(Pat).
1277+
1278+% Flatten a pattern out to a list of matches
1279+:- func pattern_to_matches(pattern) = list(match).
1280+pattern_to_matches(Pat0) = Matches :-
1281+ Pat = Pat0 ^ pat_pattern,
1282+ % Convert the pattern Pat into a flat list of matches, Matches
1283+ (
1284+ Pat = pat_any,
1285+ Matches = [match_any]
1286+ ;
1287+ Pat = pat_var(Varname),
1288+ Matches = [match_var(Varname)]
1289+ ;
1290+ % Nullary constructor -- just treat as if 0-ary
1291+ Pat = pat_ctor(Name, no),
1292+ Matches = [match_ctor(Name, 0)]
1293+ ;
1294+ Pat = pat_ctor(Name, yes(Patterns)),
1295+ FirstMatch = match_ctor(Name, list.length(Patterns)),
1296+ % Need to recurse, and subsequently match all sub-patterns
1297+ % Generate matches for each sub-pattern, then condense all into one
1298+ % continuous list.
1299+ SubMatches = list.condense(list.map(pattern_to_matches, Patterns)),
1300+ % Place the first match at the front, then the subsequent matches.
1301+ Matches = [FirstMatch|SubMatches]
1302+ ;
1303+ Pat = pat_intlit(Val),
1304+ Matches = [match_intlit(Val)]
1305+ ).
1306+
1307+% generate_matcher_code(ArgList, MatcherList, BindingTable, BBCurrent,
1308+% SubscriptMap, SwitchCtx, !PhiTable, !CFG).
1309+% Generates switch code for a list of matchers. ArgList is a list of variable
1310+% names which will be matched in order against the patterns in each matcher of
1311+% MatcherList.
1312+% The code is appended to block BBCurrent (may also generate new blocks).
1313+% (The matches in each matcher of MatcherList may be longer than ArgList, as a
1314+% top-level variable may occupy many elements of MatcherList).
1315+% This algorithm is explained in detail in doc/dev/switchfactor.
1316+:- pred generate_matcher_code(progtable::in, list(varname)::in,
1317+ list(matcher(S))::in,
1318+ binding_table(S)::in, bbref(S)::in, subscript_map::in, context::in,
1319+ phi_table(S)::in, phi_table(S)::out, cfg(S)::in, cfg(S)::out) is det.
1320+generate_matcher_code(_PT, [], MatcherList, BindingTable, BBCurrent,
1321+ _SubscriptMap, Ctx, !PhiTable, !CFG) :-
1322+ % Base case -- No more args to switch on.
1323+ % This generates a branch, which is inefficient code. This should usually
1324+ % be handled on the recursive call, meaning that you should usually not
1325+ % get here at all.
1326+ % Expect that the first item in MatcherList has 0 matches, so just branch
1327+ % to it.
1328+ (
1329+ MatcherList = [],
1330+ throw("ast_cfg.generate_matcher_code: No matchers remaining.")
1331+ ;
1332+ MatcherList = [matcher([_|_], _)|_],
1333+ throw("ast_cfg.generate_matcher_code: "
1334+ ++ "ArgList empty, matcher not empty.")
1335+ ;
1336+ MatcherList = [matcher([], Target)|_]
1337+ ),
1338+ % Just generate a branch to the Target of the first matcher
1339+ cfg.append_predecessor(Target, BBCurrent, !CFG),
1340+ cfg.set_terminator(BBCurrent, branch(Target, Ctx), !CFG),
1341+ % Create entries in the phi table for each entry in the binding table
1342+ ( map.search(BindingTable, Target, TargetBinding) ->
1343+ map.foldl((pred(BoundVar::in,SourceVar::in,Phi0::in,Phi::out) is det:-
1344+ % For each binding table entry (BoundVar-SourceVar), create a
1345+ % phi table entry (Target-BoundVar) = (BBCurrent-SourceVar).
1346+ create_phi_table_entry(Target, BoundVar, BBCurrent, SourceVar,
1347+ Phi0, Phi)
1348+ ), TargetBinding, !PhiTable)
1349+ ;
1350+ true
1351+ ).
1352+generate_matcher_code(PT, [MatchVar|RestVars], MatcherList0, BindingTable0,
1353+ BBCurrent, SubscriptMap0, Ctx, !PhiTable, !CFG) :-
1354+ % Generate code for matching MatchVar against the first pattern in each
1355+ % matcher of MatcherList, then recurse.
1356+
1357+ % Step 1 - Store explicit variable bindings
1358+ % If the first argument of a matcher is match_var, add a variable binding
1359+ % instruction to its target block.
1360+ % Modify MatcherList, replacing all initial match_vars with match_any.
1361+ list.map_foldl(matcher_variable_binding(MatchVar),
1362+ MatcherList0, MatcherList, BindingTable0, BindingTable),
1363+
1364+ % Step 2 - Create association list
1365+ % The values we're going to switch on
1366+ ValuesOfInterest = list.remove_dups(list.map(matcher_first_match,
1367+ MatcherList)),
1368+ AssocList = list.map((func(K) = K-select_matchers(K, MatcherList)),
1369+ ValuesOfInterest),
1370+
1371+ % Test for incomplete switch (error case)
1372+ switch_exhaustion_check(PT, ValuesOfInterest, Ctx),
1373+
1374+ % Step 3 - Generate switch statement
1375+ % We create exactly one case (incl. default) for each value of interest
1376+ % This also gets AssocListBlocks, which has a third column, the block ID,
1377+ % and also has any entries after the first "_" removed.
1378+ ( (list.length(AssocList) = 1 ; AssocList = [match_any-_|_]) ->
1379+ % Special case -- Only one case or first case is default
1380+ % Do not generate a switch statement. Just continue on from the end of
1381+ % the current block.
1382+ Match-Matchers = list.det_head(AssocList),
1383+ AssocListBlocks = [{Match, Matchers, BBCurrent}],
1384+ % The target block is the same as the current one -- keep the same
1385+ % subscript map
1386+ SubscriptMap = SubscriptMap0
1387+ ;
1388+ % Generate the switch statement
1389+ generate_case_stmts(BBCurrent, AssocList, Cases, Default,
1390+ AssocListBlocks, !CFG),
1391+ % Write the switch statement to the current block
1392+ cfg.set_terminator(BBCurrent, t_switch(MatchVar, Cases, Default, Ctx),
1393+ !CFG),
1394+ % All of the new blocks are fresh, so the target block's subscript map
1395+ % is empty
1396+ SubscriptMap = map.init
1397+ ),
1398+
1399+ % Step 4 - Write implicit unpacking instructions to each block and recurse
1400+ % Start the new block with a fresh subscript map
1401+ % (This calls generate_matcher_code for each target block).
1402+ list.foldl2(write_unpack_instructions(PT,MatchVar, RestVars, BindingTable,
1403+ SubscriptMap, Ctx),
1404+ AssocListBlocks, !PhiTable, !CFG),
1405+
1406+ true.
1407+
1408+% matcher_variable_binding(MatchVar, !Matcher, !BindingTable).
1409+% If Matcher's first match is a match_var, creates a binding from MatchVar to
1410+% the given variable name, in the Matcher's target block. Also updates
1411+% Matcher, replacing the match_var with match_any.
1412+:- pred matcher_variable_binding(varname::in, matcher(S)::in, matcher(S)::out,
1413+ binding_table(S)::in, binding_table(S)::out) is det.
1414+matcher_variable_binding(MatchVar, !Matcher, !BindingTable) :-
1415+ ( !.Matcher = matcher([match_var(BindVar)|Rest], TargetBB) ->
1416+ % Add a binding table entry for TargetBB
1417+ ( map.search(!.BindingTable, TargetBB, BlockTable_) ->
1418+ BlockTable0 = BlockTable_
1419+ ;
1420+ BlockTable0 = map.init
1421+ ),
1422+ svmap.det_insert(BindVar, MatchVar, BlockTable0, BlockTable),
1423+ svmap.set(TargetBB, BlockTable, !BindingTable),
1424+ % Replace match_var with match_any
1425+ !:Matcher = matcher([match_any|Rest], TargetBB)
1426+ ;
1427+ true
1428+ ).
1429+
1430+% Gets the first match of a matcher.
1431+% Aborts if there is none.
1432+:- func matcher_first_match(matcher(S)) = match.
1433+matcher_first_match(matcher([First|_],_)) = First.
1434+matcher_first_match(matcher([],_)) = throw("Matcher is empty").
1435+
1436+% Selects all matchers which begin with match.
1437+% Removes the match from the beginning, and adjusts wildcard matches so they
1438+% have the correct number of start entries.
1439+:- func select_matchers(match, list(matcher(S))) = list(matcher(S)).
1440+select_matchers(StartMatch, Matchers) = FilteredMatchers :-
1441+ list.filter_map(select_matcher(StartMatch), Matchers, FilteredMatchers).
1442+
1443+:- pred select_matcher(match::in, matcher(S)::in, matcher(S)::out) is semidet.
1444+select_matcher(StartMatch, Matcher0, Matcher) :-
1445+ Matcher0 = matcher([First|Rest], Target),
1446+ ( First = StartMatch ->
1447+ % First match is correct (ctor, intlit or _). Just remove it.
1448+ Matcher = matcher(Rest, Target)
1449+ ; First = match_any ->
1450+ % First match is _. Always match, but carefully adjust.
1451+ ( StartMatch = match_ctor(_, Arity) ->
1452+ % The wildcard match is matching a ctor, so we need to perform
1453+ % sub-matches (which would all be wildcards)
1454+ Matcher = matcher(list.duplicate(Arity, match_any) ++ Rest,
1455+ Target)
1456+ ;
1457+ Matcher = matcher(Rest, Target)
1458+ )
1459+ ;
1460+ fail
1461+ ).
1462+
1463+% Given a list of matches, representing the cases at one level of a switch
1464+% statement, test that the switch (at that level) cannot possibly fail.
1465+% This means it has a default, or all cases are covered for the switched type.
1466+% If a switch fails this test, throws a context error. Otherwise, succeeds.
1467+:- pred switch_exhaustion_check(progtable::in, list(match)::in, context::in)
1468+ is det.
1469+switch_exhaustion_check(PT, Matches, Ctx) :-
1470+ ( list.member(match_any, Matches) ->
1471+ % Fine -- The switch has a default case
1472+ true
1473+ % Block ends with a switch with no default. Closely analyse its cases
1474+ % to determine whether it is exhaustive.
1475+ % Note: List must be entirely match_intlit or match_ctor
1476+ ; SomeMatch = list.head(Matches) ->
1477+ % Any match will do, since they all (must) have the same type
1478+ % We are only interested in the match's type
1479+ (
1480+ SomeMatch = match_ctor(SomeCtor, _),
1481+ % Use SomeCtor to retrieve the typedef we are switching over
1482+ % (Any Ctor will do). This is easier than finding the typedef from
1483+ % a type value.
1484+ ( tables.lookup_ctor(PT, SomeCtor, SwitchTypeDef, _) ->
1485+ % Get the names of all ctors for the type
1486+ AllCtors = map(func(Ctor) = Ctor ^ ctor_name,
1487+ SwitchTypeDef ^ typedef_ctors),
1488+ % Get the names of all ctors named in a case
1489+ MatchCtors = map((func(Match) = CtorName :-
1490+ ( Match = match_ctor(C, _) ->
1491+ CtorName = C
1492+ ;
1493+ throw("ast_cfg: Not all matches are ctors")
1494+ )),
1495+ Matches),
1496+ % The two sets should be equal
1497+ % Delete each element of CaseCtor from AllCtors
1498+ % UnhandledCtors = AllCtors / CaseCtors (should be empty)
1499+ list.foldl((pred(MatchCtor::in, AC0::in, AC::out) is det :-
1500+ ( list.delete_first(AC0, MatchCtor, AC_) ->
1501+ AC = AC_
1502+ ;
1503+ throw("ast_cfg: Match " ++ MatchCtor ++
1504+ " not a member of switch type")
1505+ )
1506+ ),
1507+ MatchCtors, AllCtors, UnhandledCtors),
1508+ ( list.is_empty(UnhandledCtors) ->
1509+ % Fine -- All cases handled in switch
1510+ true
1511+ ;
1512+ context.throw_error(
1513+ "Incomplete switch: requires case(s) for "
1514+ ++ string.join_list(", ", UnhandledCtors), Ctx)
1515+ )
1516+ ;
1517+ throw("ast_cfg: Ctor name not defined")
1518+ )
1519+ ;
1520+ SomeMatch = match_intlit(_),
1521+ % It is ALWAYS an error to have an int switch without a default
1522+ % case, since you can't cover all the ints.
1523+ context.throw_error("Incomplete switch: requires default case "
1524+ ++ "for switch on integer", Ctx)
1525+ ;
1526+ SomeMatch = match_any,
1527+ throw("ast_cfg: match_any in match list")
1528+ ;
1529+ SomeMatch = match_var(_),
1530+ throw("ast_cfg: match_var in match list")
1531+ )
1532+ ;
1533+ throw("ast_cfg: Switch check on empty match list")
1534+ ).
1535+
1536+% generate_case_stmts(+BBCurrent, +AssocList, -Cases, -Default,
1537+% -AssocListBlocks, !CFG).
1538+% Given an assoc list (a match for each case, mapped to a list of matchers),
1539+% generate a case statement for each matcher, and create a new target block in
1540+% the CFG for the case statement.
1541+% Returns AssocListBlocks, which is AssocList augmented with a target block
1542+% for each AssocList entry.
1543+% Will ignore any matches after a match_any. Aborts on match_var.
1544+:- pred generate_case_stmts(bbref(S)::in,
1545+ assoc_list(match, list(matcher(S)))::in,
1546+ assoc_list(case_value, bbref(S))::out, maybe(bbref(S))::out,
1547+ list({match, list(matcher(S)), bbref(S)})::out,
1548+ cfg(S)::in, cfg(S)::out) is det.
1549+% Base case: No matches. Therefore no cases and no default.
1550+generate_case_stmts(_, [], [], no, [], !CFG).
1551+% Base case: Match is "_". Therefore, generate a default case and stop.
1552+generate_case_stmts(BBCurrent, [match_any-M|_], [], yes(Default), [ALB],
1553+ !CFG) :-
1554+ cfg.new_basic_block(Default, !CFG),
1555+ cfg.append_predecessor(Default, BBCurrent, !CFG),
1556+ ALB = {match_any, M, Default}.
1557+% Error case: Match is a var. Should have been taken out by now.
1558+generate_case_stmts(_, [match_var(_)-_|_], _, _, _, !CFG) :-
1559+ throw("ast_cfg.generate_case_stmts: Found match_var").
1560+generate_case_stmts(BBCurrent, [match_ctor(Name, Arity)-M|Matches],
1561+ [Case|Cases], Default, [ALB|ALBs], !CFG) :-
1562+ cfg.new_basic_block(Target, !CFG),
1563+ cfg.append_predecessor(Target, BBCurrent, !CFG),
1564+ Case = case_ctor(Name) - Target,
1565+ ALB = {match_ctor(Name, Arity), M, Target},
1566+ % Recurse
1567+ generate_case_stmts(BBCurrent, Matches, Cases, Default, ALBs, !CFG).
1568+generate_case_stmts(BBCurrent, [match_intlit(Val)-M|Matches],
1569+ [Case|Cases], Default, [ALB|ALBs], !CFG) :-
1570+ cfg.new_basic_block(Target, !CFG),
1571+ cfg.append_predecessor(Target, BBCurrent, !CFG),
1572+ Case = case_intlit(Val) - Target,
1573+ ALB = {match_intlit(Val), M, Target},
1574+ % Recurse
1575+ generate_case_stmts(BBCurrent, Matches, Cases, Default, ALBs, !CFG).
1576+
1577+% write_unpack_instructions(SourceVar, RestVars, BindingTable, SubscriptMap,
1578+% Ctx, {Match, Matchers, Block}, !PhiTable, !CFG).
1579+% Writes implicit record unpacking instructions for the constructor mentioned
1580+% by Match to the end of Block, and then calls generate_matcher_code to
1581+% recursively generate the rest of the switch statement for that block.
1582+:- pred write_unpack_instructions(progtable::in, varname::in,
1583+ list(varname)::in, binding_table(S)::in, subscript_map::in, context::in,
1584+ {match, list(matcher(S)), bbref(S)}::in,
1585+ phi_table(S)::in, phi_table(S)::out, cfg(S)::in, cfg(S)::out) is det.
1586+write_unpack_instructions(PT, SourceVar, RestVars, BindingTable, SubscriptMap,
1587+ Ctx, AssocEntry, !PhiTable, !CFG) :-
1588+ {Match, Matchers, Block} = AssocEntry,
1589+ ( Match = match_ctor(Name, Arity) ->
1590+ % Write Arity unpack instructions to Block
1591+ % Store the names of the unpacked vars in UnpackedVars
1592+ list.map_foldl2((pred(VarID::in, TempVName::out, SM0::in, SM::out,
1593+ C0::in, C::out) is det :-
1594+ % Write one, then recurse
1595+ new_temp_variable(ref_id(Block, C0), TempVName, SM0, SM),
1596+ Instr = instr(ld_field_idx(TempVName, SourceVar, Name, VarID),
1597+ context.blank_context),
1598+ cfg.append_instr(Block, Instr, C0, C)
1599+ ), util.range(Arity), UnpackedVars, SubscriptMap, _, !CFG)
1600+ ;
1601+ UnpackedVars = []
1602+ ),
1603+ % Construct the new ArgList.
1604+ % First, need to switch on all of the unpacked variables, then the
1605+ % remaining ArgList variables.
1606+ ArgList = UnpackedVars ++ RestVars,
1607+ % Recursively call generate_matcher_code for this block
1608+ generate_matcher_code(PT, ArgList, Matchers, BindingTable, Block,
1609+ SubscriptMap, Ctx, !PhiTable, !CFG).
1610+
1611+% generate_case_block_phis(TargetBlock, Phis, !CFG).
1612+% Generates all phis for case block TargetBlock. Phis is the phi table entry
1613+% for TargetBlock, a multimap mapping bound varnames to source block/varname
1614+% pairs.
1615+:- pred generate_case_block_phis(bbref(S)::in,
1616+ multi_map(varname, pair(bbref(S), varname))::in, cfg(S)::in, cfg(S)::out)
1617+ is det.
1618+generate_case_block_phis(TargetBlock, Phis, !CFG) :-
1619+ map.foldl((pred(BoundVar::in, Bindings::in, CFG0::in, CFG::out) is det :-
1620+ % For each phi
1621+ Preds = map.from_assoc_list(Bindings),
1622+ PhiInstr = phi(BoundVar, Preds),
1623+ cfg.append_phi(TargetBlock, PhiInstr, CFG0, CFG)
1624+ ), Phis, !CFG).
1625+
1626+% case_stmt_to_jmp(+BBAfter, +SubscriptMap, +DefMap, +CaseStmt, -CFGCaseStmt,
1627 % !AfterPredMap, !ExitPredMap, !CFG).
1628-% Compiles a case statement into a CFG case statement (which is
1629+% Compiles a case statement into a jmp case statement (which is
1630 % branch-oriented rather than sub-statement-block-oriented).
1631 % Compiles the sub-statement-block in the case statement to new basic blocks,
1632 % which will ultimately branch to BBAfter.
1633@@ -870,12 +1293,12 @@
1634 % SubscriptMap is read-only (since it will definitely branch to a new block).
1635 % DefMap is read-only (the resulting def-maps should be appended to
1636 % AfterPredMap).
1637-:- pred case_stmt_to_cfg(bbref(S)::in, bbref(S)::in,
1638- subscript_map::in, def_map::in, case_stmt::in, cfg_case_stmt(S)::out,
1639+:- pred case_stmt_to_jmp(progtable::in, bbref(S)::in, bbref(S)::in,
1640+ subscript_map::in, def_map::in, case_stmt::in, jmp_case_stmt(S)::out,
1641 pred_map(S)::in, pred_map(S)::out, pred_map(S)::in, pred_map(S)::out,
1642 cfg(S)::in, cfg(S)::out) is det.
1643-case_stmt_to_cfg(BBCurrent, BBAfter, SubscriptMap, DefMap,
1644- case_stmt(Pattern0, Stmts, Ctx), cfg_case_stmt(Pattern, BBFirst, Ctx),
1645+case_stmt_to_jmp(PT, BBCurrent, BBAfter, SubscriptMap, DefMap,
1646+ case_stmt(Pattern0, Stmts, Ctx), jmp_case_stmt(Pattern, BBFirst, Ctx),
1647 !AfterPredMap, !ExitPredMap, !CFG) :-
1648 % Traverse the pattern and uniquely label each newly-introduced
1649 % variable, as well as adding them to DefMap.
1650@@ -884,8 +1307,7 @@
1651 InnerScopeDefMap),
1652
1653 cfg.new_basic_block(BBFirst, !CFG),
1654- cfg.append_predecessor(BBFirst, BBCurrent, !CFG),
1655- stmt_block_to_cfg(Stmts, BBFirst, BBAfter, InnerScopeDefMap,
1656+ stmt_block_to_cfg(PT, Stmts, BBFirst, BBAfter, InnerScopeDefMap,
1657 !AfterPredMap, !ExitPredMap, !CFG).
1658
1659 % pattern_to_ssa(BlockID, !Pattern, !SubscriptMap, !DefMap).
1660
1661=== modified file 'src/interpret.m'
1662--- src/interpret.m 2010-01-25 01:23:14 +0000
1663+++ src/interpret.m 2010-01-25 07:28:13 +0000
1664@@ -128,6 +128,7 @@
1665 :- import_module svmap.
1666 :- import_module set.
1667 :- import_module pair.
1668+:- import_module assoc_list.
1669 :- import_module array.
1670 :- import_module exception.
1671 :- import_module store.
1672@@ -468,6 +469,20 @@
1673 env_assign_local(Dst, Result, !Env).
1674 exec_instr_(ld_field(_Dst, _Val, _FieldName), _Ctx, !Env, !IO) :-
1675 throw("Not Implemented: Field reference").
1676+exec_instr_(ld_field_idx(Dst, ValName, Ctor, Idx), _Ctx, !Env, !IO) :-
1677+ read_var(!.Env, ValName, Val, !IO),
1678+ ( Val = val_term(Ctor, Fields) -> % Must match supplied Ctor name
1679+ ( list.index0(Fields, Idx, FieldVal) -> % Fail if bounds error
1680+ Result = FieldVal
1681+ ;
1682+ throw("Interpreter Error: Field index out of range")
1683+ )
1684+ ; Val = val_term(_, _) ->
1685+ throw("Interpreter Error: Field reference with wrong ctor")
1686+ ;
1687+ throw("Interpreter Error: Field reference to something not an ADT")
1688+ ),
1689+ env_assign_local(Dst, Result, !Env).
1690 exec_instr_(call(Dst, Func, Args), _Ctx, !Env, !IO) :-
1691 % Evaluate Func and Arg (eagerly)
1692 read_var(!.Env, Func, FuncVal, !IO),
1693@@ -627,9 +642,9 @@
1694 % Execute the terminator
1695 Term = Block ^ bb_terminator,
1696 (
1697- Term = t_switch(Control, Cases, _),
1698+ Term = t_switch(Control, Cases, Default, _),
1699 read_var(!.Env, Control, ControlVal, !IO),
1700- exec_cases(CFG, BBRef, !Env, ControlVal, Cases, !IO)
1701+ exec_cases(CFG, BBRef, !Env, ControlVal, Cases, Default, !IO)
1702 ;
1703 Term = cond_branch(Cond, Then, Else, _),
1704 read_var(!.Env, Cond, CondVal, !IO),
1705@@ -668,69 +683,47 @@
1706 throw("Interpreter Error: Phi missing prececessor in CFG")
1707 ).
1708
1709-% exec_cases(CFG, Predecessor, !Env, ControlValue, Cases, !IO).
1710+% exec_cases(CFG, Predecessor, !Env, ControlValue, Cases, Default, !IO).
1711 % Performs a switch over Cases, and executes the first one which matches.
1712 :- pred exec_cases(cfg(S)::in, bbref(S)::in, env::in, env::out,
1713- value::in, list(cfg_case_stmt(S))::in, io::di, io::uo) is det.
1714-exec_cases(_CFG, _Predecessor, !Env, _ControlValue, [], !IO) :-
1715- % No pattern matched
1716- % XXX Should be checked statically so this never occurs, and therefore, we
1717- % raise an interpreter error at this point.
1718- % Currently, no check is made, and ast_cfg goes nuts if a switch
1719- % statement isn't handled by default. Therefore, we need to raise a
1720- % friendly runtime error for the time being.
1721- throw_mars_error("Pattern match failure").
1722- %throw("Interpreter Error: Pattern match failure").
1723-exec_cases(CFG, Predecessor, !Env, ControlValue, [Case|Cases], !IO) :-
1724- Case = cfg_case_stmt(Pattern, Target, _Ctx),
1725- ( pattern_match(Pattern, ControlValue, !Env) ->
1726- % Succeeded. Branch to Target (with the new bindings) and ignore the
1727- % remaining Cases.
1728+ value::in, assoc_list(case_value, bbref(S))::in, maybe(bbref(S))::in,
1729+ io::di, io::uo) is det.
1730+exec_cases(CFG, Predecessor, !Env, _ControlValue, [], Default, !IO) :-
1731+ % No case matched, execute the default
1732+ (
1733+ Default = yes(DefTarget),
1734+ exec_block(CFG, Predecessor, DefTarget, !Env, !IO)
1735+ ;
1736+ Default = no,
1737+ % Should never occur -- by the invariant of t_switch, there MUST be a
1738+ % default case if the case statements are not exhaustive.
1739+ % (This should be guaranteed by ast_cfg). Therefore, internal error.
1740+ throw("Interpreter Error: No case matched and no default case")
1741+ ).
1742+exec_cases(CFG, Predecessor, !Env, ControlValue, [Case|Cases], Default,!IO) :-
1743+ Case = CaseVal - Target,
1744+ ( check_case_value(CaseVal, ControlValue) ->
1745+ % Succeeded. Branch to Target and ignore the remaining Cases.
1746 exec_block(CFG, Predecessor, Target, !Env, !IO)
1747 ;
1748 % That match failed. Try the rest.
1749- exec_cases(CFG, Predecessor, !Env, ControlValue, Cases, !IO)
1750+ exec_cases(CFG, Predecessor, !Env, ControlValue, Cases, Default, !IO)
1751 ).
1752
1753-% pattern_match(Pattern, Value, !Env).
1754-% Matches Value against Pattern, making bindings in !Env if successful.
1755-% Fails if the pattern match fails.
1756-:- pred pattern_match(pattern::in, value::in, env::in, env::out) is semidet.
1757-pattern_match(pattern(Pattern, _PatType), Value, !Env) :-
1758+% check_case_value(CaseVal, Value).
1759+% Succeeds if Value matches the case value CaseVal.
1760+:- pred check_case_value(case_value::in, value::in) is semidet.
1761+check_case_value(CaseVal, Value) :-
1762 (
1763- Pattern = pat_any
1764- % Done, succeed with no bindings
1765- ;
1766- Pattern = pat_var(VarName),
1767- % Succeed, and bind
1768- env_assign_local(VarName, Value, !Env)
1769- ;
1770- Pattern = pat_ctor(CtorName, SubPatterns),
1771- Value = val_term(CtorName, SubValues), % Can fail
1772- % Recursively match all sub-terms with sub-patterns
1773- (
1774- SubPatterns = yes(YesSubPatterns),
1775- pattern_match_list(YesSubPatterns, SubValues, !Env) % Can fail
1776- ;
1777- SubPatterns = no
1778- )
1779- ;
1780- Pattern = pat_intlit(Int),
1781+ CaseVal = case_ctor(CtorName),
1782+ Value = val_term(CtorName, _) % Can fail
1783+ % Note: Do not recurse; succeed regardless of the term's arguments
1784+ ;
1785+ CaseVal = case_intlit(Int),
1786 % Succeed if the value is a matching int
1787 Value = val_int(Int) % Can fail
1788 ).
1789
1790-% pattern_match_list(Patterns, Values, !Env).
1791-% Matches Values against Patterns, making bindings in !Env if successful.
1792-% Succeeds only if each value matches its corresponding pattern, and the lists
1793-% are exactly the same length.
1794-:- pred pattern_match_list(list(pattern)::in, list(value)::in,
1795- env::in, env::out) is semidet.
1796-pattern_match_list([], [], !Env).
1797-pattern_match_list([P|Ps], [V|Vs], !Env) :-
1798- pattern_match(P, V, !Env), % Can fail
1799- pattern_match_list(Ps, Vs, !Env). % Can fail
1800-
1801 % -------------------------------------------------------------------------- %
1802 % BUILTINS
1803 % -------------------------------------------------------------------------- %
1804
1805=== modified file 'src/ir.m'
1806--- src/ir.m 2010-01-25 01:23:14 +0000
1807+++ src/ir.m 2010-01-25 07:28:13 +0000
1808@@ -51,6 +51,7 @@
1809 :- import_module string.
1810 :- import_module bool.
1811 :- import_module list.
1812+:- import_module assoc_list.
1813 :- import_module maybe.
1814 :- import_module pair.
1815
1816@@ -207,6 +208,9 @@
1817 % Load a field of a variable into a variable.
1818 % (R, S, F) <=> R := S.F
1819 ; ld_field(varname, varname, string)
1820+ % Load a field of a variable, by constructor(index)
1821+ % (R, S, C, I) <=> R := S.C(I)
1822+ ; ld_field_idx(varname, varname, string, int)
1823 % Function call. (Result, Function, Args).
1824 ; call(varname, varname, list(varname))
1825 % Partial function application. (Result, Function, Args).
1826@@ -219,10 +223,17 @@
1827 % type" used to prevent terminators from referencing basic blocks in the wrong
1828 % CFG.
1829 :- type terminator_instr(S)
1830- % Switch instruction. Performs pattern matching of algebraic data
1831- % types. Branches to the first basic block to match the pattern
1832- % (binding variables from the pattern over the block's scope).
1833- ---> t_switch(varname, list(cfg_case_stmt(S)), context.context)
1834+ % t_switch(ControlVar, Cases, Default).
1835+ % Simplified switch instruction. Performs matching of algebraic
1836+ % data types' head values, or integers.
1837+ % If ControlVar matches one of the Cases, branches to that basic
1838+ % block. Otherwise, branches to Default.
1839+ % Default is optional, but may ONLY be omitted if the cases cover
1840+ % all possible values of an ADT, and is mandatory for switches
1841+ % over integers.
1842+ % Two cases may not have the same case value.
1843+ ---> t_switch(varname, assoc_list(case_value, bbref(S)),
1844+ maybe(bbref(S)), context.context)
1845 % Conditional branch instruction. Expr must be of type Int.
1846 % Evaluates Expr, then branches to the first basic block if it is
1847 % nonzero, or the second basic block if it is zero.
1848@@ -241,11 +252,7 @@
1849 % An AST representation's case statement.
1850 :- type case_stmt
1851 ---> case_stmt(pattern, stmt_block, context.context).
1852-
1853-% A CFG representation's case statement.
1854-:- type cfg_case_stmt(S)
1855- ---> cfg_case_stmt(pattern, cfg.bbref(S), context.context).
1856-% A pattern (part of a case statement).
1857+% An AST representation's pattern (part of a case statement).
1858 :- type pattern
1859 ---> pattern(
1860 pat_pattern :: pattern_,
1861@@ -263,6 +270,15 @@
1862 % Integer literal. Matches exactly this value.
1863 ; pat_intlit(integer).
1864
1865+% A value which is matched by a case of a t_switch instruction in CFG
1866+% representation.
1867+:- type case_value
1868+ % Constructor match. Matches only values with the given ctor
1869+ % name, without checking arity or arguments.
1870+ ---> case_ctor(string)
1871+ % Integer literal. Matches exactly this value.
1872+ ; case_intlit(integer).
1873+
1874 % A uniqueness annotation for a particular argument of a function
1875 :- type argmode
1876 ---> argmode(
1877
1878=== modified file 'src/mars.m'
1879--- src/mars.m 2010-01-22 01:54:37 +0000
1880+++ src/mars.m 2010-01-25 07:28:13 +0000
1881@@ -155,7 +155,7 @@
1882 % Typecheck the program
1883 typecheck.check_program(ProgTable, Prog, Prog_Checked),
1884 % Convert the program from AST representation into CFG representation
1885- ast_cfg.prog_to_cfg(Prog_Checked, ProgCFG),
1886+ ast_cfg.prog_to_cfg(ProgTable, Prog_Checked, ProgCFG),
1887 % Build another ProgTable
1888 tables.build_progtable(ProgCFG, builtins.builtins_progtable,
1889 ProgTableCFG),
1890
1891=== modified file 'src/marsc.m'
1892--- src/marsc.m 2010-01-22 01:54:37 +0000
1893+++ src/marsc.m 2010-01-25 07:28:13 +0000
1894@@ -155,7 +155,7 @@
1895 check_main_func(ProgTable, MainFunc),
1896 ( backend_use_cfg(BackendName) ->
1897 % Convert the program from AST representation into CFG representation
1898- ast_cfg.prog_to_cfg(Prog_Checked, Prog2),
1899+ ast_cfg.prog_to_cfg(ProgTable, Prog_Checked, Prog2),
1900 % Build another ProgTable
1901 tables.build_progtable(Prog2, builtins.builtins_progtable,
1902 ProgTable2)
1903
1904=== modified file 'src/pretty.m'
1905--- src/pretty.m 2010-01-25 01:23:14 +0000
1906+++ src/pretty.m 2010-01-25 07:28:13 +0000
1907@@ -367,6 +367,10 @@
1908 io.write_string(ir.varname_to_string_noescape(Dst), !IO),
1909 io.write_string(" = ", !IO),
1910 io.write_string(ir.varname_to_string_noescape(Obj) ++ "." ++ Field, !IO).
1911+print_instr_(Indent, ld_field_idx(Dst, Obj, Ctor, Index), !IO) :-
1912+ indent(Indent, !IO),
1913+ io.format("%s = %s.%s(%d)", [s(ir.varname_to_string_noescape(Dst)),
1914+ s(ir.varname_to_string_noescape(Obj)), s(Ctor), i(Index)], !IO).
1915 print_instr_(Indent, call(Dst, Func, Args), !IO) :-
1916 indent(Indent, !IO),
1917 io.write_string(ir.varname_to_string_noescape(Dst), !IO),
1918@@ -386,12 +390,22 @@
1919
1920 :- pred print_terminator_instr(int::in, cfg(S)::in, terminator_instr(S)::in,
1921 io::di,io::uo) is det.
1922-print_terminator_instr(Indent, CFG, t_switch(Control, Cases, _), !IO) :-
1923+print_terminator_instr(Indent, CFG, t_switch(Control,Cases,Default,_), !IO) :-
1924 indent(Indent, !IO),
1925 io.write_string("switch ", !IO),
1926 io.write_string(ir.varname_to_string_noescape(Control), !IO),
1927 io.write_string(":\n", !IO),
1928- foldl(print_cfg_case(Indent+1, CFG), Cases, !IO).
1929+ foldl(print_cfg_case(Indent+1, CFG), Cases, !IO),
1930+ % Print the default case, if any
1931+ (
1932+ Default = yes(DefTarget),
1933+ indent(Indent+1, !IO),
1934+ io.write_string("default: goto ", !IO),
1935+ io.write(cfg.ref_id(DefTarget, CFG), !IO),
1936+ io.nl(!IO)
1937+ ;
1938+ Default = no
1939+ ).
1940 print_terminator_instr(Indent, CFG, cond_branch(Cond, ThenPart, ElsePart, _),
1941 !IO) :-
1942 indent(Indent, !IO),
1943@@ -419,16 +433,6 @@
1944 io.write_string(":\n", !IO),
1945 foldl(print_stmt(Indent+1), Stmts, !IO).
1946
1947-:- pred print_cfg_case(int::in, cfg(S)::in, cfg_case_stmt(S)::in,
1948- io::di, io::uo) is det.
1949-print_cfg_case(Indent, CFG, cfg_case_stmt(Pattern, Target, _), !IO) :-
1950- indent(Indent, !IO),
1951- io.write_string("case ", !IO),
1952- io.write_string(string_pattern(Pattern), !IO),
1953- io.write_string(": goto ", !IO),
1954- io.write(cfg.ref_id(Target, CFG), !IO),
1955- io.nl(!IO).
1956-
1957 string_pattern(pattern(P,_)) = string_pattern_(P).
1958
1959 :- func string_pattern_(pattern_) = string.
1960@@ -447,6 +451,20 @@
1961 ).
1962 string_pattern_(pat_intlit(I)) = integer.to_string(I).
1963
1964+:- pred print_cfg_case(int::in, cfg(S)::in, pair(case_value, bbref(S))::in,
1965+ io::di, io::uo) is det.
1966+print_cfg_case(Indent, CFG, CaseVal - Target, !IO) :-
1967+ indent(Indent, !IO),
1968+ io.write_string("case ", !IO),
1969+ io.write_string(string_caseval(CaseVal), !IO),
1970+ io.write_string(": goto ", !IO),
1971+ io.write(cfg.ref_id(Target, CFG), !IO),
1972+ io.nl(!IO).
1973+
1974+:- func string_caseval(case_value) = string.
1975+string_caseval(case_ctor(Name)) = Name.
1976+string_caseval(case_intlit(I)) = integer.to_string(I).
1977+
1978 print_function_head(Func, !IO) :-
1979 io.write_string(Func^func_name, !IO),
1980 (
1981
1982=== modified file 'test/cases/compiler/switch.mar'
1983--- test/cases/compiler/switch.mar 2010-01-25 05:33:55 +0000
1984+++ test/cases/compiler/switch.mar 2010-01-25 07:28:13 +0000
1985@@ -13,6 +13,9 @@
1986 B
1987 C
1988
1989+type Box(a):
1990+ Box(a)
1991+
1992 # Test cases with multiple levels of pattern matching
1993 def switchtest(x :: Foo) :: Int:
1994 switch x:
1995@@ -56,6 +59,46 @@
1996 x = a
1997 return add(x, a)
1998
1999+# Test trivial (unconditional) binding of a variable
2000+# Although the test suite won't test for this, it should be compiled to code
2001+# without a switch, because there is no actual condition.
2002+def trivialbind(x :: Int) :: Int:
2003+ switch x:
2004+ case n:
2005+ return n
2006+
2007+# Test simple binding of a variable
2008+def simplebind(x :: Int) :: Int:
2009+ switch x:
2010+ case 1:
2011+ return 2
2012+ case n:
2013+ return n
2014+
2015+# Test binding of a variable on multiple paths
2016+# Fully-worked example in doc/dev/switchfactor.
2017+def trickybind(x :: Pair(Int, Box(Int))) :: Int:
2018+ switch x:
2019+ case Pair(49, Box(3)):
2020+ return -1
2021+ case Pair(m, Box(n)):
2022+ return add(m, n)
2023+
2024+# From fully-worked example in doc/dev/switchfactor.
2025+# Test that factoring doesn't break the precedence of the cases
2026+def factor(x :: Pair(Bar, Int)) :: Int:
2027+ switch x:
2028+ case Pair(B, 1):
2029+ return 1
2030+ case Pair(C, 2):
2031+ return 2
2032+ case Pair(B, 3):
2033+ return 3
2034+ case Pair(q, _):
2035+ return 4
2036+ case Pair(B, 5): # Redundant rule: Should never be chosen
2037+ return 5
2038+
2039 # Unit test for switchtest function
2040 def test_switch(_x :: Int) :: Int:
2041 begin_case("switch")
2042@@ -88,3 +131,69 @@
2043 assert_eq(scopes(B), 8, "scopes(B) != 8")
2044 end_case("switch (scopes)")
2045 return 0
2046+
2047+def test_trivialbind(_x :: Int) :: Int:
2048+ begin_case("switch (trivialbind)")
2049+ assert_eq(trivialbind(7), 7, "trivialbind(7) != 7")
2050+ end_case("switch (trivialbind)")
2051+ return 0
2052+
2053+def test_simplebind(_x :: Int) :: Int:
2054+ begin_case("switch (simplebind)")
2055+ assert_eq(simplebind(1), 2, "simplebind(1) != 2")
2056+ end_case("switch (simplebind)")
2057+
2058+ begin_case("switch (simplebind)")
2059+ assert_eq(simplebind(7), 7, "simplebind(7) != 7")
2060+ end_case("switch (simplebind)")
2061+ return 0
2062+
2063+def test_trickybind(_x :: Int) :: Int:
2064+ begin_case("switch (trickybind)")
2065+ assert_eq(trickybind(Pair(49, Box(3))), -1, "trickybind(Pair(49, Box(3))) != -1")
2066+ end_case("switch (trickybind)")
2067+
2068+ begin_case("switch (trickybind)")
2069+ assert_eq(trickybind(Pair(49, Box(2))), 51, "trickybind(Pair(49, Box(2))) != 51")
2070+ end_case("switch (trickybind)")
2071+
2072+ begin_case("switch (trickybind)")
2073+ assert_eq(trickybind(Pair(7, Box(3))), 10, "trickybind(Pair(7, Box(3))) != 10")
2074+ end_case("switch (trickybind)")
2075+
2076+ begin_case("switch (trickybind)")
2077+ assert_eq(trickybind(Pair(2, Box(2))), 4, "trickybind(Pair(2, Box(2))) != 4")
2078+ end_case("switch (trickybind)")
2079+ return 0
2080+
2081+def test_factor() :: Int:
2082+ begin_case("switch (factoring)")
2083+ assert_eq(factor(Pair(B, 1)), 1, "factor(Pair(B, 1)) != 1")
2084+ end_case("switch (factoring)")
2085+ begin_case("switch (factoring)")
2086+ assert_eq(factor(Pair(B, 2)), 4, "factor(Pair(B, 2)) != 4")
2087+ end_case("switch (factoring)")
2088+ begin_case("switch (factoring)")
2089+ assert_eq(factor(Pair(B, 3)), 3, "factor(Pair(B, 3)) != 3")
2090+ end_case("switch (factoring)")
2091+ begin_case("switch (factoring)")
2092+ assert_eq(factor(Pair(B, 4)), 4, "factor(Pair(B, 4)) != 4")
2093+ end_case("switch (factoring)")
2094+ # Try to trick compiler -- should pick default case, not B,5 case
2095+ begin_case("switch (factoring)")
2096+ assert_eq(factor(Pair(B, 5)), 4, "factor(Pair(B, 5)) != 4")
2097+ end_case("switch (factoring)")
2098+
2099+ begin_case("switch (factoring)")
2100+ assert_eq(factor(Pair(C, 2)), 2, "factor(Pair(C, 2)) != 2")
2101+ end_case("switch (factoring)")
2102+ begin_case("switch (factoring)")
2103+ assert_eq(factor(Pair(C, 3)), 4, "factor(Pair(C, 3)) != 4")
2104+ end_case("switch (factoring)")
2105+
2106+ # Make sure default case also works when initial constructor has no
2107+ # possible match
2108+ begin_case("switch (factoring)")
2109+ assert_eq(factor(Pair(A(7), 2)), 4, "factor(Pair(A(7), 2)) != 4")
2110+ end_case("switch (factoring)")
2111+ return 0
2112
2113=== modified file 'test/cases/compiler/switchfail.mar'
2114--- test/cases/compiler/switchfail.mar 2010-01-08 01:36:52 +0000
2115+++ test/cases/compiler/switchfail.mar 2010-01-25 07:28:13 +0000
2116@@ -1,7 +1,5 @@
2117 # Test failing switch statements (which do not cover all cases)
2118
2119-import unittest
2120-
2121 type Foo:
2122 X(a :: Bar)
2123 Y(x :: Int)
2124
2125=== modified file 'test/cases/compiler/switchfail.mtc'
2126--- test/cases/compiler/switchfail.mtc 2010-01-08 01:36:52 +0000
2127+++ test/cases/compiler/switchfail.mtc 2010-01-25 07:28:13 +0000
2128@@ -1,9 +1,5 @@
2129 # vim: filetype=yaml
2130-
2131-# Runtime tests
2132-# Note: Not including unit tests; this just tests some runtime errors
2133-
2134-run:
2135- - call: matchfail(X(C))
2136- stderr: |
2137- Runtime Error: Pattern match failure
2138+compile:
2139+ outcome: compile_error
2140+ errors: |
2141+ switchfail.mar:15: (matchfail) Incomplete switch: requires case(s) for C.
2142
2143=== added file 'test/cases/compiler/switchfail2.mar'
2144--- test/cases/compiler/switchfail2.mar 1970-01-01 00:00:00 +0000
2145+++ test/cases/compiler/switchfail2.mar 2010-01-25 07:28:13 +0000
2146@@ -0,0 +1,13 @@
2147+# Test failing switch statements (which do not cover all cases)
2148+
2149+type Foo:
2150+ A(x :: Int)
2151+ B(x :: Int)
2152+ C
2153+
2154+# Test failing cases with multiple levels of pattern matching
2155+def matchfail(x :: Foo) :: Int:
2156+ switch x:
2157+ # Matches A, but not B or C
2158+ case A(i):
2159+ return i
2160
2161=== added file 'test/cases/compiler/switchfail2.mtc'
2162--- test/cases/compiler/switchfail2.mtc 1970-01-01 00:00:00 +0000
2163+++ test/cases/compiler/switchfail2.mtc 2010-01-25 07:28:13 +0000
2164@@ -0,0 +1,5 @@
2165+# vim: filetype=yaml
2166+compile:
2167+ outcome: compile_error
2168+ errors: |
2169+ switchfail2.mar:10: (matchfail) Incomplete switch: requires case(s) for B, C.
2170
2171=== added file 'test/cases/compiler/switchfail3.mar'
2172--- test/cases/compiler/switchfail3.mar 1970-01-01 00:00:00 +0000
2173+++ test/cases/compiler/switchfail3.mar 2010-01-25 07:28:13 +0000
2174@@ -0,0 +1,10 @@
2175+# Test failing switch statements (which do not cover all cases)
2176+
2177+# Test failing cases with multiple levels of pattern matching
2178+def matchfail(x :: Int) :: Int:
2179+ switch x:
2180+ # Matches some integer values, but has no default case
2181+ case 0:
2182+ return 1
2183+ case 13:
2184+ return x
2185
2186=== added file 'test/cases/compiler/switchfail3.mtc'
2187--- test/cases/compiler/switchfail3.mtc 1970-01-01 00:00:00 +0000
2188+++ test/cases/compiler/switchfail3.mtc 2010-01-25 07:28:14 +0000
2189@@ -0,0 +1,5 @@
2190+# vim: filetype=yaml
2191+compile:
2192+ outcome: compile_error
2193+ errors: |
2194+ switchfail3.mar:5: (matchfail) Incomplete switch: requires default case for switch on integer.

Subscribers

People subscribed via source and target branches

to all changes: