X-Git-Url: https://git.xandkar.net/?p=tiger.ml.git;a=blobdiff_plain;f=README.md;h=6127cf991010a65848ce5f8173d98db7a4512445;hp=4f0cd80ffe2001d55f2da193dbe84ede3808f5a7;hb=HEAD;hpb=6a560b4aedc04a0ca1c0a6514c4f66aaddff78b2 diff --git a/README.md b/README.md index 4f0cd80..6127cf9 100644 --- a/README.md +++ b/README.md @@ -1,24 +1,31 @@ Tiger.ml ======== +A Tiger-compiler implementation in (OCa)ML Status ------ -### Done +![screenshot-tests-head](screenshots/tests-head.jpg) +... +![screenshot-tests-tail](screenshots/tests-tail.jpg) + +### Features +#### Done - [x] ch 1: Warm-up AST - [x] ch 2: Lexer - [x] ch 3: Parser - [x] ch 4: AST -### TODO (short-term) -- [-] ch 5: Semantic Analysis (type checking) -- [ ] ch 6: Activation Records +- [x] ch 5: Semantic Analysis (type checking) +- [x] ch 6: Activation Records +#### In-progress - [ ] ch 7: Translation to Intermediate Code +#### TODO (short-term) - [ ] ch 08: Basic Blocks and Traces - [ ] ch 09: Instruction Selection - [ ] ch 10: Liveness Analysis - [ ] ch 11: Register Allocation - [ ] ch 12: Putting It All Together -### TODO (long-term) +#### TODO (long-term) - [ ] ch 13: Garbage Collection - [ ] ch 15: Functional Programming Languages - [ ] ch 16: Polymorphic Types @@ -27,9 +34,25 @@ Status - [ ] ch 19: Static Single-Assignment Form - [ ] ch 20: Pipelining and Scheduling - [ ] ch 21: The Memory Hierarchy -### Maybe +#### Maybe - [ ] ch 14: Object-Oriented Languages +### Technical issues +- [-] testing framework + - [x] run arbitrary code snippets + - [x] check non-failures + - [x] check expected output + - [-] check expected exceptions + - [x] semant stage + - [ ] generalized expect `Output ('a option) | Exception of (exn -> bool)` + - [x] run all book test case files + - [-] grid view (cols: lex, pars, semant, etc.; rows: test cases.) + - [x] implementation + - [ ] refactoring + - [ ] test time-outs (motive: cycle non-detection caused an infinite loop) + - [ ] parallel test execution +- [ ] Travis CI + Implementation Notes -------------------- @@ -37,8 +60,115 @@ Implementation Notes #### shift/reduce conflicts ##### grouping consecutive declarations +In order to support mutual recursion, we need to group consecutive +type and function declarations (see Tiger-book pages 97-99). + +Initially, I defined the rules to do so as: + + decs: + | dec { $1 :: [] } + | dec decs { $1 :: $2 } + ; + dec: + | var_dec { $1 } + | typ_decs { Ast.TypeDecs $1 } + | fun_decs { Ast.FunDecs $1 } + ; + +which, while straightforward (and working, because `ocamlyacc` defaults to +shift in case of a conflict), nonetheless caused a shift/reduce conflict in +each of: `typ_decs` and `fun_decs`; where the parser did not know whether to +shift and stay in `(typ|fun_)_dec` state or to reduce and get back to `dec` +state. + +Sadly, tagging the rules with a lower precedence (to explicitly favor +shifting) - does not help :( + + %nonassoc LOWEST + ... + dec: + | var_dec { $1 } + | typ_decs %prec LOWEST { Ast.TypeDecs $1 } + | fun_decs %prec LOWEST { Ast.FunDecs $1 } + ; + +The difficulty seems to be in the lack of a separator token which would be +able to definitively mark the end of each sequence of consecutive +`(typ_|fun_)` declarations. + +Keeping this in mind, another alternative is to manually capture the possible +interspersion patterns in the rules like: + + (N * foo) followed-by (N * not-foo) + +for the exception of `var_dec`, which, since we do not need to group its +consecutive sequences, can be reduced upon first sighting. + +The final rules I ended-up with are: + + decs: + | var_dec decs_any { $1 :: $2 } + | fun_decs decs_any_but_fun { (Ast.FunDecs $1) :: $2 } + | typ_decs decs_any_but_typ { (Ast.TypeDecs $1) :: $2 } + ; + + decs_any: + | { [] } + | var_dec decs_any { $1 :: $2 } + | fun_decs decs_any_but_fun { (Ast.FunDecs $1) :: $2 } + | typ_decs decs_any_but_typ { (Ast.TypeDecs $1) :: $2 } + ; + + decs_any_but_fun: + | { [] } + | var_dec decs_any { $1 :: $2 } + | typ_decs decs_any_but_typ { (Ast.TypeDecs $1) :: $2 } + ; + + decs_any_but_typ: + | { [] } + | var_dec decs_any { $1 :: $2 } + | fun_decs decs_any_but_fun { (Ast.FunDecs $1) :: $2 } + ; + ##### lval ### AST #### print as M-exp + +I chose to pretty-print AST as an (indented) +[M-expression](https://en.wikipedia.org/wiki/M-expression) - an underrated +format, used in Mathematica and was intended for Lisp by McCarthy himself; it +is nearly as flexible as S-expressions, but significantly more readable (IMO). + +As an example, here is what `test28.tig` looks like after parsing and +pretty-printing: + + LetExp[ + [ + TypeDecs[ + TypeDec[ + arrtype1, + ArrayTy[ + int]], + TypeDec[ + arrtype2, + ArrayTy[ + int]]], + VarDec[ + arr1, + arrtype1, + ArrayExp[ + arrtype2, + IntExp[ + 10], + IntExp[ + 0]]]], + SeqExp[ + VarExp[ + SimpleVar[ + arr1]]]] + +### Machine +Will most-likely compile to RISC and execute using SPIM (as favored by Appel)