EECS 322: Compiler Construction

Overview

In this course you will build a compiler for a simple (but illustrative) programming language, one that takes programs all the way down to running on an x86 processor. The course explains the standard structure of a compiler, with a front end (parsing, type-checking) and a back end (code generation).


Lecture

Searle 2407; MW 2-3:20pm

Recommended Texts (none are required)

Modern Compiler Implementation in ML by Andrew W. Appel

Engineering a Compiler by Keith Cooper & Linda Torczon

Packrat parsing. Probably best to start with Brian Ford's master's thesis.

Producing Wrong Data Without Doing Anything Obviously Wrong! by Mytkowicz et al.
This paper shows why performance evaluation can be super tricky.

Piazza

Piazza signup
EECS 322 on Piazza


Racket

Racket is available online. We are using version 6.4.

Racket is also available on the t-lab machines, in /home/software/racket.


Interpreters

The file 322-interps.tar.gz contains an interpreter for each of the various languages you'll be compiling this quarter. Racket must be in your path to use them.

It also contains run-test-fests, a script to make it easier to run the tests in your compiler.


Runtime System

The file runtime.c contains the implementation of our GC and printing routines, as well as a main() function your compiler could use.


Lc Speed Test

The actual speed test will be during the last meeting of class. Read the assignment spec for more detail.

The benchmarks.tar.gz file contains the programs, inputs, expected outputs, and the amount of time the interpreter took.


Lecture notes

HELIX.pdf
lecture11.txt Some L5 optimizations
lecture10.txt L5 to L4
lecture09.pdf L4 to L3
the missing pieces of the back-end
lecture08.txt L3 to L2
puzzle solving
lecture07.txt tail calls
lecture06b.pdf an impossible to allocate function
lecture06.pdf register allocation, iii
lecture05.pdf register allocation, ii
lecture04.pdf intro to register allocation & spilling
lecture03.pdf from L1 to x86
lecture02.pdf L1
lecture01.pdf introduction


Homework assignments

Week 1                             Thu 3/31 noon  1-test
Week 2   Mon 4/4 noon   1          Wed 4/6 noon   spill-test
Week 3   Sun 4/10 noon  spill      Wed 4/13 noon  liveness-test
Week 4   Sun 4/17 noon  liveness   Wed 4/20 noon  graph-test
Week 5   Sun 4/24 noon  graph      Wed 4/27 noon  2-test
Week 6   Sun 5/1 noon   2          Wed 5/4 noon   3-test
Week 7   Sun 5/8 noon   3          Wed 5/11 noon  4-test
Week 8   Sun 5/15 noon  4          Wed 5/18 noon  5-test
Week 9   Sun 5/22 noon  5
Week 10  Mon 5/30 noon  Lc

Test Fests

To run the test cases yourself, download tests.tar.gz and use the run-test-fests script.

1 first submission          1 later submissions
spill first submission      spill later submissions
liveness first submission   liveness later submissions
graph first submission      graph later submissions
2 first submission          2 later submissions
3 first submission          3 later submissions
4 first submission          4 later submissions
5 first submission          5 later submissions


Pair programming

Students are encouraged (but not required) to work in pairs. Pair programming is not team programming, however. That is, each member of a pair must promise (in writing) that they will always sit together when working on the assignments, never separately. If this is too much of a burden, work alone.

Pair assignments will not be accepted until both members of the pair have handed in their written promise.


Codewalks

Presenter(s): When presenting a codewalk:

  1. Concisely restate the task, and which parts you got done (if not all of them)
  2. Provide an overview of your solution. Depending on what is going on in your code, this should consist of some diagrams: diagrams explaining data structures (class hierarchies if you used classes, or similar data-definition diagrams), and/or diagrams explaining common example interactions between components in your code (interaction diagrams).
  3. Present the components in a top-down manner, no matter how you designed and implemented them. Be prepared to defend your code's organization and explain how it matches (or why it fails to match) your organization. When presenting the code, you should, in general, be able to refer to some spot in an earlier diagram to explain the context of how the code is used.
In general, be prepared to figure out in real time how changes to the data structures and organization of your program would affect the code.

When evaluating a codewalk, we look at three things:

  1. The quality of your presentation
  2. The ability to focus in on specific lines of code in response to a comment
  3. The ability to think through specific issues brought up by the class or the panel

Panelists: As a panelist, you will have one of three different roles:

  • Manager: the first reader/analyst of the code, with the responsibility to keep the codewalk on track
  • Second manager: the second reader, who helps the first reader
  • Secretary: keeps notes on the codewalk: weaknesses in the code, questions that came up, etc.
The secretary is responsible for supplying a copy of the written notes to Robby by 5am the morning after the codewalk. If the notes are acceptable, they will be forwarded to the presenter(s); if not, edits will be requested.

When evaluating the managers, we are looking for the ability to identify solid problems with the code and articulate them well. While this may appear to be dependent on the quality of the code, in practice, all code has issues and places that it could be improved. The secretary is evaluated on the quality of the notes produced.


Programming Language

Students are free to use any programming language. As a general guideline, I recommend a programming language that is both safe and garbage-collected. These two features make building software easier. Note that you will have to build a simple parser for a parenthesis-based language, something that comes for free in Racket. Building that kind of parser is easy, however, and using other safe, garbage-collected languages is encouraged.
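
As a rough illustration of how small such a parser can be, here is a minimal sketch in Python. It is not the course's reference reader: the function names (tokenize, parse, read) and the sample input are made up for this example, and it handles only parentheses, integers, and bare symbols (no comments, strings, or other bracket shapes).

```python
def tokenize(text):
    """Split source text into parentheses and whitespace-separated atoms."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens):
    """Parse one expression from the front of `tokens`, consuming them in place."""
    if not tokens:
        raise SyntaxError("unexpected end of input")
    token = tokens.pop(0)
    if token == "(":
        items = []
        while tokens and tokens[0] != ")":
            items.append(parse(tokens))
        if not tokens:
            raise SyntaxError("missing ')'")
        tokens.pop(0)  # discard the closing ')'
        return items
    if token == ")":
        raise SyntaxError("unexpected ')'")
    try:
        return int(token)  # integer literal
    except ValueError:
        return token  # anything else is kept as a symbol (a plain string)


def read(text):
    """Parse exactly one expression from `text` into nested Python lists."""
    tokens = tokenize(text)
    expr = parse(tokens)
    if tokens:
        raise SyntaxError("trailing input after expression")
    return expr


# Hypothetical input, not taken from any of the course languages.
print(read("(a (b 1) -2)"))
```

Nested lists of integers and strings are enough to pattern-match on in most languages, which is roughly what Racket's built-in reader hands you for free.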

Grades

Grades in the course are based on passing each of the programming assignments, the speed test (on the final day of class, when Lc is due), and your codewalks, for a total of up to 13 opportunities to pass.

Grade               A  A-  B+  B  C  D  F
Assignments passed  13 12 11 10 9 8 7 6 5 4 3 2 1 0

To pass one of the programming assignments (1, spill, liveness, graph, 2, 3, 4, or 5), you must either pass 75% of the test cases in the initial test fest, submit a test suite that finds a bug in every (other) submission in the initial test fest, or pass 98% of the test cases in a later test fest.
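
The three alternatives above can be sketched as a small check. This is only an illustration, not the course's grading script: the function and parameter names are hypothetical, and it assumes that "pass 75%" and "pass 98%" mean "at least" that percentage.

```python
def meets_threshold(passed, total, percent):
    """True if passed/total is at least `percent` percent (integer arithmetic)."""
    return total > 0 and passed * 100 >= percent * total


def passes_assignment(initial=None, later=None, tests_broke_all_others=False):
    """Decide whether an assignment counts as passed.

    initial, later: (passed, total) test-case counts from the initial and a
        later test fest, or None if there is no such result.
    tests_broke_all_others: True if the submitted test suite found a bug in
        every other submission in the initial test fest.
    """
    if initial is not None and meets_threshold(*initial, 75):
        return True
    if tests_broke_all_others:
        return True
    if later is not None and meets_threshold(*later, 98):
        return True
    return False


# Misses the 75% bar initially, but clears the 98% bar in a later test fest.
print(passes_assignment(initial=(60, 100), later=(99, 100)))
```

Any one of the three routes is sufficient, so a weak first submission can still be recovered in a later test fest.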

To pass the speed test (Lc), your compiler must generate a binary that produces the correct output for each of the submitted speed test programs.

The winner of the speed test, and anyone who beats Racket on all programs, gets a free pass to be used on any one assignment. Note that while Racket has had 15 years of continuous development, which gives it a fair edge over your 10 or so weeks' worth of effort, it is at a significant disadvantage because its versions of the primitive operations are more complex and have more error checking. Overall, this should make it a fair fight. (Put another way, getting performance in the face of all the details that go into a full-fledged, safe language is not easy.)

You may resubmit any version of any assignment at any time up to the last day of finals. If you do not pass your codewalk, you may request a private codewalk (on the same assignment or a different one). Note that the bonus pass earned in the speed test is available only to compilers handed in by the Lc due date.


Cheating

Your code will be inspected for plagiarism. The work you hand in must be your own. If there is any doubt about whether you produced the work entirely yourself (or yourselves, if you are pair programming), explain it as part of your homework submission.


Staff

Simone Campanoni
Robby Findler