Programming Assignment 6 - Extra Credit - The Code Optimizer
Project Overview
Programming assignments 2 through 4 involved the construction of the
front-end (lexer, parser) and gatekeeping (semantic analyzer) stages of a
compiler. Programming assignment 5 involved the correct generation of
assembly instructions. In this assignment you will produce efficient
assembly instructions with respect to a fixed cost model.
You may do this assignment in OCaml, Python or Ruby. Almost invariably
this assignment is done by modifying your PA5 submission.
You may work in a team of two people for this assignment. You may work in a
team for any or all subsequent programming assignments. You do not need to
keep the same teammate (but it may be tricky to work things out if you
switch between PA5 and PA6). The course staff are not responsible for
finding you a willing teammate.
Goal
For this assignment you will write an optimizing code generator.
This typically involves transforming the abstract syntax tree into a
control-flow graph and applying a number of dataflow analyses and
optimizations. In addition, students typically apply pattern-matching
optimizations at the AST level and peephole optimizations at the assembly
level.
In all cases, the optimized assembly code you emit must produce exactly
the same result as the reference interpreter. The first rule of
optimization is "don't break the program".
You do not have to worry about "malformed input" because the semantic
analyzer (from PA4) has already ruled out bad programs.
The Specification
You must create three artifacts:
- A program that takes a single command-line argument (e.g.,
file.cl-type). That argument will be an ASCII text
Cool class map, implementation map, parent map and annotated AST
file (as described in PA4).
Your program must emit file.cl-asm, a Cool Assembly Language
program. Executing file.cl-asm in a Cool CPU Simulator
must produce the correct output for file.cl according to Cool's
operational semantics. Your program will consist of a number of OCaml
files, a number of Python files, or a number of Ruby files.
- You will only be given .cl-type files from programs that
pass the semantic analysis phase of the reference compiler. You are
not responsible for correctly handling (1+"hello") programs.
- A plain ASCII text file called readme.txt describing your
optimization design decisions and choice of benchmarks. See the grading
rubric. A few paragraphs should suffice.
- Two benchmarks test1.cl and test2.cl. Each benchmark
should be a valid Cool program that takes 50,000 instructions or fewer to
execute (as measured via the reference compiler with --asm and
--profile). Optimizers will be graded on the average of their
performance on a pool of programs: that pool contains some baseline Cool
programs as well as all benchmarks submitted by participants. You can thus
"stack the deck" in your favor by constructing programs that you are very
good at optimizing but that others may not be able to handle.
Scoring
The Optimizer assignment is an explicit competition. You will receive more
extra credit if your optimizer is "better" than that of other students.
Each optimizer will be evaluated on the same set of benchmark programs.
Your optimizer's "score" for a program is the speedup factor it
obtains over the reference non-optimizing compiler. For example, if
cool --asm on benchmark1 produces
a program that takes 722564 cycles and your optimizer on benchmark1
produces a program that takes 551855 cycles, your speedup on benchmark1 is
1.31x. Note that scores below 1.00x are possible, but unlikely.
Your final score over the benchmark suite is the average of your scores on
the individual benchmarks.
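The scoring arithmetic above can be sketched in a few lines; this is an illustration only (the helper names are invented here, not part of any course tooling), using the benchmark1 cycle counts from the example:

```python
# Illustrative sketch of the scoring arithmetic described above.
# Cycle counts would come from Cool CPU Simulator runs with --profile.

def speedup(baseline_cycles, optimized_cycles):
    """Score on one benchmark: the factor reduction in cycle count."""
    return baseline_cycles / optimized_cycles

def suite_score(results):
    """Final score: the average speedup over all benchmarks.
    `results` maps benchmark name -> (baseline, optimized) cycle counts."""
    scores = [speedup(base, opt) for base, opt in results.values()]
    return sum(scores) / len(scores)

# The benchmark1 example from the text: 722564 / 551855 is about 1.31x.
print(round(speedup(722564, 551855), 2))
```

Note that a score below 1.00x on one benchmark drags down the whole average, which is why "don't break (or slow down) the program" comes first.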
If your optimizer "breaks" any benchmark program, your overall score
immediately becomes zero. If we deem your README to be inadequate, your
overall score is cut in half. If an inspection of your code reveals
that you are hard-coding the benchmark files (e.g., by including pattern
matching that works for the baseline Cool benchmark programs but few
others), your overall score will be cut in half.
Baseline Benchmarks
A baseline set of indicative benchmarks is available here:
Your program will be judged on the baseline set as well as programs
submitted by you and other students. The baseline benchmarks will probably
count for more than benchmarks submitted by students. The baseline
benchmarks were chosen so that reasonable optimizations, such as those
discussed in class, would yield non-trivial improvements. Some of them are
"realistic" (e.g., matrix multiplication, calculating pi, ...) while others
are clearly "synthetic".
You can test your optimizer's performance (and see its average speedup)
by submitting it to our testing server.
Proposed Extra Credit
This section is subject to change.
We currently propose that any optimizer that is at least twice as good as cool
--asm --opt will earn a 2% overall course grade improvement. When you
submit your optimizer for testing, we will show you the reference
optimizer's performance as a comparison.
We
currently propose that the best optimizer in the class (or best two
optimizers) will earn a 5% overall course grade improvement. Doing a
stellar job on the optimizer also makes it easier for me to write you a
strong letter of recommendation, since it is a largely independent project
where you must synthesize and apply concepts from class on your own. For
letters of recommendation, the complexity of the optimizations you attempt
matters more than the final numerical speedup.
Commentary and Suggestions
Different optimizations work best on different representations.
You may want to consider:
- Constructing a CFG and performing dataflow analyses, such as:
- Constant folding and propagation
- Dead code elimination
- Common subexpression elimination
- General AST pattern-matching optimizations, such as:
- Loop unrolling
- Loop-invariant code motion
- Inlining method or constructor calls
- Constant string merging
- Object-oriented optimizations, such as:
- Constant constructors
- "Unboxing" Integers
- "Unboxing" Bools
- Jump threading for conditionals
- Removing unused methods and fields
- Assembly-level optimizations, such as:
- Peephole optimization
- Strength reduction
- Code layout optimizations
- Register optimizations, such as:
- Using R4 through R7 as the first four stack slots
- Making your own calling convention (e.g., for constructors)
- Full-blown register allocation
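As a concrete illustration of the AST-level techniques above, constant folding can be a simple bottom-up rewrite. This sketch uses an invented tuple-based expression shape for clarity; a real PA5 AST carries line numbers, static types, and many more node kinds, and real Cool arithmetic wraps at 32 bits, which a careful folder must respect:

```python
# Minimal AST constant-folding sketch. Expressions are tuples:
#   ('int', n), ('plus', e1, e2), ('times', e1, e2).
# (Illustrative node shapes only; not any particular PA5 representation.)

def fold(expr):
    """Bottom-up constant folding: fold children first, then this node."""
    tag = expr[0]
    if tag == 'int':
        return expr
    if tag in ('plus', 'times'):
        lhs, rhs = fold(expr[1]), fold(expr[2])
        if lhs[0] == 'int' and rhs[0] == 'int':
            op = (lambda a, b: a + b) if tag == 'plus' else (lambda a, b: a * b)
            return ('int', op(lhs[1], rhs[1]))  # NB: should wrap at 32 bits for Cool
        return (tag, lhs, rhs)
    return expr  # leave other node kinds untouched

# (1 + 2) * 5 folds to the single constant 15.
print(fold(('times', ('plus', ('int', 1), ('int', 2)), ('int', 5))))
```

Folding a subexpression to a constant often exposes further opportunities (dead branches, loop bounds), which is why these rewrites pay off most when combined with propagation.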
If you haven't heard of an optimization, search around for it on the Internet.
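At the assembly level, a peephole pass just slides a small window over the emitted instructions and rewrites wasteful patterns. The sketch below uses generic push/pop mnemonics as stand-ins, not actual Cool Assembly Language syntax:

```python
# Peephole sketch: scan a 2-instruction window over the emitted assembly.
# Mnemonics here are generic illustrations, not Cool Assembly syntax.

def peephole(lines):
    out = []
    i = 0
    while i < len(lines):
        # Pattern: a push immediately followed by a pop of the same
        # register restores exactly what was saved, so the pair is a no-op.
        if (i + 1 < len(lines)
                and lines[i].startswith('push ')
                and lines[i + 1] == 'pop ' + lines[i][len('push '):]):
            i += 2
            continue
        out.append(lines[i])
        i += 1
    return out

print(peephole(['push r1', 'pop r1', 'add r2 <- r2 r3']))
```

Deleting one pair can bring another pattern into the window, so peephole passes are usually run repeatedly until a fixpoint is reached.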
What To Turn In For PA6
You must turn in a zip file containing these files:
- readme.txt -- your README file
- benchmark1.cl -- a testcase
- benchmark2.cl -- a testcase
- source_files -- your implementation, including
- main.rb or
- main.py or
- main.ml
Your zip file may also contain:
- team.txt -- an optional file listing your other team member
(see below -- if you are not working in a team, do not include this file)
Submit the file as you did for PA1.
Working In Pairs
You may complete this project in a team of two. Same rules as before.
Autograding
We will use scripts to run your program on various testcases. Same rules as
before.