Begin translation module
[tiger.ml.git] / README.md
CommitLineData
96ad2fee
SK
1Tiger.ml
2========
188ac950 3A Tiger-compiler implementation in (OCa)ML
96ad2fee 4
6a560b4a
SK
5Status
6------
7
e6e82c08 8![screenshot-tests-head](screenshots/tests-head.jpg)
3206cc89 9...
e6e82c08 10![screenshot-tests-tail](screenshots/tests-tail.jpg)
188ac950
SK
11
12### Features
13#### Done
6a560b4a
SK
14- [x] ch 1: Warm-up AST
15- [x] ch 2: Lexer
16- [x] ch 3: Parser
17- [x] ch 4: AST
3206cc89 18- [x] ch 5: Semantic Analysis (type checking)
cc540a7e 19- [x] ch 6: Activation Records
188ac950 20#### In-progress
6a560b4a 21- [ ] ch 7: Translation to Intermediate Code
cc540a7e 22#### TODO (short-term)
6a560b4a
SK
23- [ ] ch 08: Basic Blocks and Traces
24- [ ] ch 09: Instruction Selection
25- [ ] ch 10: Liveness Analysis
26- [ ] ch 11: Register Allocation
27- [ ] ch 12: Putting It All Together
188ac950 28#### TODO (long-term)
6a560b4a
SK
29- [ ] ch 13: Garbage Collection
30- [ ] ch 15: Functional Programming Languages
31- [ ] ch 16: Polymorphic Types
32- [ ] ch 17: Dataflow Analysis
33- [ ] ch 18: Loop Optimizations
34- [ ] ch 19: Static Single-Assignment Form
35- [ ] ch 20: Pipelining and Scheduling
36- [ ] ch 21: The Memory Hierarchy
188ac950 37#### Maybe
6a560b4a
SK
38- [ ] ch 14: Object-Oriented Languages
39
3206cc89
SK
40### Technical issues
41- [-] testing framework
42 - [x] run arbitrary code snippets
43 - [x] check non-failures
44 - [x] check expected output
45 - [-] check expected exceptions
46 - [x] semant stage
47 - [ ] generalized expect `Output ('a option) | Exception of (exn -> bool)`
48 - [x] run all book test case files
49 - [-] grid view (cols: lex, pars, semant, etc.; rows: test cases.)
50 - [x] implementation
51 - [ ] refactoring
e6e82c08
SK
52 - [ ] test time-outs (motive: cycle non-detection caused an infinite loop)
53 - [ ] parallel test execution
3206cc89
SK
54- [ ] Travis CI
55
5f295d04
SK
56Implementation Notes
57--------------------
58
59### Parser
60
61#### shift/reduce conflicts
62##### grouping consecutive declarations
299dd054
SK
63In order to support mutual recursion, we need to group consecutive
64type and function declarations (see Tiger-book pages 97-99).
65
66Initially, I defined the rules to do so as:
67
68 decs:
69 | dec { $1 :: [] }
70 | dec decs { $1 :: $2 }
71 ;
72 dec:
73 | var_dec { $1 }
74 | typ_decs { Ast.TypeDecs $1 }
75 | fun_decs { Ast.FunDecs $1 }
76 ;
77
78which, while straightforward (and working, because `ocamlyacc` defaults to
79shift in case of a conflict), nonetheless caused a shift/reduce conflict in
80each of: `typ_decs` and `fun_decs`; where the parser did not know whether to
81shift and stay in `(typ|fun_)_dec` state or to reduce and get back to `dec`
82state.
83
84Sadly, tagging the rules with a lower precedence (to explicitly favor
85shifting) - does not help :(
86
87 %nonassoc LOWEST
88 ...
89 dec:
90 | var_dec { $1 }
91 | typ_decs %prec LOWEST { Ast.TypeDecs $1 }
92 | fun_decs %prec LOWEST { Ast.FunDecs $1 }
93 ;
94
95The difficulty seems to be in the lack of a separator token which would be
96able to definitively mark the end of each sequence of consecutive
97`(typ_|fun_)` declarations.
98
99Keeping this in mind, another alternative is to manually capture the possible
100interspersion patterns in the rules like:
101
102 (N * foo) followed-by (N * not-foo)
103
104for the exception of `var_dec`, which, since we do not need to group its
105consecutive sequences, can be reduced upon first sighting.
106
c7cdbbb6
SK
107The final rules I ended-up with are:
108
109 decs:
110 | var_dec decs_any { $1 :: $2 }
111 | fun_decs decs_any_but_fun { (Ast.FunDecs $1) :: $2 }
112 | typ_decs decs_any_but_typ { (Ast.TypeDecs $1) :: $2 }
113 ;
114
115 decs_any:
116 | { [] }
117 | var_dec decs_any { $1 :: $2 }
118 | fun_decs decs_any_but_fun { (Ast.FunDecs $1) :: $2 }
119 | typ_decs decs_any_but_typ { (Ast.TypeDecs $1) :: $2 }
120 ;
121
122 decs_any_but_fun:
123 | { [] }
124 | var_dec decs_any { $1 :: $2 }
125 | typ_decs decs_any_but_typ { (Ast.TypeDecs $1) :: $2 }
126 ;
127
128 decs_any_but_typ:
129 | { [] }
130 | var_dec decs_any { $1 :: $2 }
131 | fun_decs decs_any_but_fun { (Ast.FunDecs $1) :: $2 }
132 ;
133
5f295d04
SK
134##### lval
135
136### AST
137
138#### print as M-exp
188ac950 139
299dd054
SK
140I chose to pretty-print AST as an (indented)
141[M-expression](https://en.wikipedia.org/wiki/M-expression) - an underrated
142format, used in Mathematica and was intended for Lisp by McCarthy himself; it
143is nearly as flexible as S-expressions, but significantly more readable (IMO).
144
145As an example, here is what `test28.tig` looks like after parsing and
146pretty-printing:
147
148 LetExp[
149 [
150 TypeDecs[
151 TypeDec[
152 arrtype1,
153 ArrayTy[
154 int]],
155 TypeDec[
156 arrtype2,
157 ArrayTy[
158 int]]],
159 VarDec[
160 arr1,
161 arrtype1,
162 ArrayExp[
163 arrtype2,
164 IntExp[
165 10],
166 IntExp[
167 0]]]],
168 SeqExp[
169 VarExp[
170 SimpleVar[
171 arr1]]]]
172
188ac950
SK
173### Machine
174Will most-likely compile to RISC and execute using SPIM (as favored by Appel)
This page took 0.031254 seconds and 4 git commands to generate.