Assignment 2, csc431, Spring 2018

7.0.0.7

Assignment 2, csc431, Spring 2018

1 Goal

Extend Milestone 1 to support booleans, functions, and branching.

2 Grammar of Language

Here’s the grammar for Consieuten 2.

‹program›	::=	‹function›*
‹type›	::=	int \| bool \| ( ‹type›* -> ‹type› )
‹declaration›	::=	‹type› ‹id-list› ;
‹id-list›	::=	‹id› {, ‹id›}*
‹type-id-list›	::=	‹type› ‹id› {, ‹type› ‹id›}*
‹function›	::=	fun ‹id› ( [‹type-id-list›] ) ‹type› { ‹function contents› }
‹function-contents›	::=	‹declaration›* ‹function›* ‹statement›*
‹block›	::=	{ ‹statement›* }
‹statement›	::=	‹assignment› \| ‹conditional› \| ‹ret›
‹assignment›	::=	‹lvalue› = {‹expression›} ;
‹conditional›	::=	if ( ‹expression› ) ‹block› [else ‹block›]
‹ret›	::=	return ‹expression› ;
‹invocation›	::=	‹id› ‹arguments› ;
‹lvalue›	::=	‹id›
‹expression›	::=	‹eqterm› {{&& \| \|\|} ‹eqterm›}*
‹eqterm›	::=	‹boolterm› {{== \| !=} ‹boolterm›}*
‹boolterm›	::=	‹simple› {{< \| > \| <= \| >=} ‹simple›}*
‹simple›	::=	‹term› {{+ \| -} ‹term›}*
‹term›	::=	‹unary› {{* \| /} ‹unary›}*
‹unary›	::=	{! \| -}* ‹factor›
‹factor›	::=	( ‹expression› ) \| ‹id› [‹arguments›] \| ‹number›
	\|	true \| false
‹arguments›	::=	( )
	\|	( ‹expression› {, ‹expression›}* )

3 How To Do It

Following the outline presented in class, this milestone consists of a whole bunch of discrete steps.

3.1 Parsing

Parsing should be similar to Assignment 1, except that the grammar has been extended to include additional forms: functions are allowed inside functions, and users may specify arrow types, using parentheses and an arrow.

I’ve extended the Antlr parser, along with the JSON and AST visitors, to handle these two extensions. I have verified that they compile, and that they handle at least one example; this is by no means a guarantee that they work for all examples. Please let me know if you find problems with them.

I’ve pushed the changes to the

https://github.com/jbclements/2184-csc431

repo. This repo is also the one that is the basis for your compiler project, so I believe that it should be possible to run

git remote add upstream https://github.com/jbclements/2184-csc431

to add a remote named "upstream". You should then be able to

git fetch upstream

and then either

git rebase upstream/master

... if you’re a rebase kind of person, or

git merge upstream/master

otherwise.

EDIT: I’ve also added another directory, "racket-path", containing the current Racket parser.

3.2 IR Conversion

Convert your AST to our intermediate representation. This IR should include the following forms:

integer literals (NumC),
boolean literals (BoolC),
binary operations (BinopC),
if expressions (IfC),
let bindings (LetC),
recursive value bindings (LetrecC),
function definitions (LamC), and
function calls (CallC).

Make the most direct translation possible; don’t try to combine any other steps with this one.

EDIT: Along the way, you need to combine the declarations with the assignments. That is, any variable assignments in the body of the function whose name is one of those declared needs to have the given type associated with it, and you must ensure that none of the variables declared in the function is used without having been initialized.

You should be able to serialize this structure into text; this will make debugging the rest of your tool chain vastly easier.

Another helpful tool may be an interpreter, that can directly evaluate the terms in this IR language. Having this can help you debug the remaining stages in the compiler, and help to convince you that each stage in the pipeline is preserving the meaning of the program.

3.3 Type Checking

Type checking should ensure that every expression in the program is well-typed, following standard rules: function calls must match number and types of arguments, the test expression of an if must be a boolean, etc. etc.

Please let me know if you want more detail on type checking.

EDIT: Also, you should check to make sure there’s a single main function that accepts no arguments and returns an integer.

EDIT 2: Some people have asked whether type checking should be performed on the AST or on the IR. Either of these can work fine. The IR has slightly fewer language forms, but whichever one works best for you is fine.

3.4 Lambda Lifting

Lambda lifting should accept a program written in IR, and return an equivalent one that is transformed so that all functions (LamCs) are lifted to a top-level LetrecC. The original locations of the LamCs should now contain variables referring to the new bindings.

You can be clever about using the original names bound to the functions to give them good names, or you can just use the labels on the AST nodes to generate guaranteed-fresh names (relabel before this step).

3.5 ANF conversion

ANF Conversion should accept a program written in IR, and return an equivalent one that is transformed so that all binops (BinopCs) and calls (CallCs) have only “argable” arguments. Use the labels on the AST notes to generate temporary names (again, relabel before this step).

3.6 Code generation

Code generation should accept a program written in IR which has been lambda-lifted and is in ANF, and produce a list of assembly language instructions.

3.7 Output Code

Finally, you should write these instructions to an assembly file, and link the program!

4 Testing your Code

You should have some kind of automated testing framework. This will help you to stay sane. Try to make sure that your tests cover your whole implementation, to avoid ugly surprises.

5 Running on the Reference Machine

Each language will be associated with one of the reference machines.

To spin up a VM to test your code on a reference machine, you’ll need to install Vagrant and VirtualBox. Then, check out the repo that matches your language, and in the resulting directory, make a copy of your project, in a directory called consieuten.

Next, run

vagrant up

to provision and boot the machine. Change to the "/vagrant" directory, where you’ll find your "consieuten" directory. Go ahead and build and test your project.Hopefully, this succeeds.

(Of course, it won’t actually succeed. Let me know what goes wrong.)

6 Testing Plan

For the purposes of evaluation, we’re breaking down the functionality into six separate "tiers". Each one is worth a fraction of the overall points for correctness, as follows:

(4 pts) Straight-line Code
(4 pts) Function Calls
(4 pts) If Blocks
(2 pts) Inner Functions
(2 pts) Type Checking
(1 pt) Functions as first-class values

1	Goal
2	Grammar of Language
3	How To Do It
4	Testing your Code
5	Running on the Reference Machine
6	Testing Plan