Commit | Line | Data |
---|---|---|
96ad2fee SK |
1 | Tiger.ml |
2 | ======== | |
188ac950 | 3 | A Tiger-compiler implementation in (OCa)ML |
96ad2fee | 4 | |
6a560b4a SK |
5 | Status |
6 | ------ | |
7 | ||
e6e82c08 | 8 | ![screenshot-tests-head](screenshots/tests-head.jpg) |
3206cc89 | 9 | ... |
e6e82c08 | 10 | ![screenshot-tests-tail](screenshots/tests-tail.jpg) |
188ac950 SK |
11 | |
12 | ### Features | |
13 | #### Done | |
6a560b4a SK |
14 | - [x] ch 1: Warm-up AST |
15 | - [x] ch 2: Lexer | |
16 | - [x] ch 3: Parser | |
17 | - [x] ch 4: AST | |
3206cc89 | 18 | - [x] ch 5: Semantic Analysis (type checking) |
cc540a7e | 19 | - [x] ch 6: Activation Records |
188ac950 | 20 | #### In-progress |
6a560b4a | 21 | - [ ] ch 7: Translation to Intermediate Code |
cc540a7e | 22 | #### TODO (short-term) |
6a560b4a SK |
23 | - [ ] ch 08: Basic Blocks and Traces |
24 | - [ ] ch 09: Instruction Selection | |
25 | - [ ] ch 10: Liveness Analysis | |
26 | - [ ] ch 11: Register Allocation | |
27 | - [ ] ch 12: Putting It All Together | |
188ac950 | 28 | #### TODO (long-term) |
6a560b4a SK |
29 | - [ ] ch 13: Garbage Collection |
30 | - [ ] ch 15: Functional Programming Languages | |
31 | - [ ] ch 16: Polymorphic Types | |
32 | - [ ] ch 17: Dataflow Analysis | |
33 | - [ ] ch 18: Loop Optimizations | |
34 | - [ ] ch 19: Static Single-Assignment Form | |
35 | - [ ] ch 20: Pipelining and Scheduling | |
36 | - [ ] ch 21: The Memory Hierarchy | |
188ac950 | 37 | #### Maybe |
6a560b4a SK |
38 | - [ ] ch 14: Object-Oriented Languages |
39 | ||
3206cc89 SK |
40 | ### Technical issues |
41 | - [-] testing framework | |
42 | - [x] run arbitrary code snippets | |
43 | - [x] check non-failures | |
44 | - [x] check expected output | |
45 | - [-] check expected exceptions | |
46 | - [x] semant stage | |
47 | - [ ] generalized expect `Output ('a option) | Exception of (exn -> bool)` | |
48 | - [x] run all book test case files | |
49 | - [-] grid view (cols: lex, pars, semant, etc.; rows: test cases.) | |
50 | - [x] implementation | |
51 | - [ ] refactoring | |
e6e82c08 SK |
52 | - [ ] test time-outs (motive: cycle non-detection caused an infinite loop) |
53 | - [ ] parallel test execution | |
3206cc89 SK |
54 | - [ ] Travis CI |
55 | ||
5f295d04 SK |
56 | Implementation Notes |
57 | -------------------- | |
58 | ||
59 | ### Parser | |
60 | ||
61 | #### shift/reduce conflicts | |
62 | ##### grouping consecutive declarations | |
299dd054 SK |
63 | In order to support mutual recursion, we need to group consecutive |
64 | type and function declarations (see Tiger-book pages 97-99). | |
65 | ||
66 | Initially, I defined the rules to do so as: | |
67 | ||
68 | decs: | |
69 | | dec { $1 :: [] } | |
70 | | dec decs { $1 :: $2 } | |
71 | ; | |
72 | dec: | |
73 | | var_dec { $1 } | |
74 | | typ_decs { Ast.TypeDecs $1 } | |
75 | | fun_decs { Ast.FunDecs $1 } | |
76 | ; | |
77 | ||
78 | which, while straightforward (and working, because `ocamlyacc` defaults to | |
79 | shift in case of a conflict), nonetheless caused a shift/reduce conflict in | |
80 | each of: `typ_decs` and `fun_decs`; where the parser did not know whether to | |
81 | shift and stay in `(typ|fun_)_dec` state or to reduce and get back to `dec` | |
82 | state. | |
83 | ||
84 | Sadly, tagging the rules with a lower precedence (to explicitly favor | |
85 | shifting) - does not help :( | |
86 | ||
87 | %nonassoc LOWEST | |
88 | ... | |
89 | dec: | |
90 | | var_dec { $1 } | |
91 | | typ_decs %prec LOWEST { Ast.TypeDecs $1 } | |
92 | | fun_decs %prec LOWEST { Ast.FunDecs $1 } | |
93 | ; | |
94 | ||
95 | The difficulty seems to be in the lack of a separator token which would be | |
96 | able to definitively mark the end of each sequence of consecutive | |
97 | `(typ_|fun_)` declarations. | |
98 | ||
99 | Keeping this in mind, another alternative is to manually capture the possible | |
100 | interspersion patterns in the rules like: | |
101 | ||
102 | (N * foo) followed-by (N * not-foo) | |
103 | ||
104 | for the exception of `var_dec`, which, since we do not need to group its | |
105 | consecutive sequences, can be reduced upon first sighting. | |
106 | ||
c7cdbbb6 SK |
107 | The final rules I ended-up with are: |
108 | ||
109 | decs: | |
110 | | var_dec decs_any { $1 :: $2 } | |
111 | | fun_decs decs_any_but_fun { (Ast.FunDecs $1) :: $2 } | |
112 | | typ_decs decs_any_but_typ { (Ast.TypeDecs $1) :: $2 } | |
113 | ; | |
114 | ||
115 | decs_any: | |
116 | | { [] } | |
117 | | var_dec decs_any { $1 :: $2 } | |
118 | | fun_decs decs_any_but_fun { (Ast.FunDecs $1) :: $2 } | |
119 | | typ_decs decs_any_but_typ { (Ast.TypeDecs $1) :: $2 } | |
120 | ; | |
121 | ||
122 | decs_any_but_fun: | |
123 | | { [] } | |
124 | | var_dec decs_any { $1 :: $2 } | |
125 | | typ_decs decs_any_but_typ { (Ast.TypeDecs $1) :: $2 } | |
126 | ; | |
127 | ||
128 | decs_any_but_typ: | |
129 | | { [] } | |
130 | | var_dec decs_any { $1 :: $2 } | |
131 | | fun_decs decs_any_but_fun { (Ast.FunDecs $1) :: $2 } | |
132 | ; | |
133 | ||
5f295d04 SK |
134 | ##### lval |
135 | ||
136 | ### AST | |
137 | ||
138 | #### print as M-exp | |
188ac950 | 139 | |
299dd054 SK |
140 | I chose to pretty-print AST as an (indented) |
141 | [M-expression](https://en.wikipedia.org/wiki/M-expression) - an underrated | |
142 | format, used in Mathematica and was intended for Lisp by McCarthy himself; it | |
143 | is nearly as flexible as S-expressions, but significantly more readable (IMO). | |
144 | ||
145 | As an example, here is what `test28.tig` looks like after parsing and | |
146 | pretty-printing: | |
147 | ||
148 | LetExp[ | |
149 | [ | |
150 | TypeDecs[ | |
151 | TypeDec[ | |
152 | arrtype1, | |
153 | ArrayTy[ | |
154 | int]], | |
155 | TypeDec[ | |
156 | arrtype2, | |
157 | ArrayTy[ | |
158 | int]]], | |
159 | VarDec[ | |
160 | arr1, | |
161 | arrtype1, | |
162 | ArrayExp[ | |
163 | arrtype2, | |
164 | IntExp[ | |
165 | 10], | |
166 | IntExp[ | |
167 | 0]]]], | |
168 | SeqExp[ | |
169 | VarExp[ | |
170 | SimpleVar[ | |
171 | arr1]]]] | |
172 | ||
188ac950 SK |
173 | ### Machine |
174 | Will most-likely compile to RISC and execute using SPIM (as favored by Appel) |