You may do this assignment in OCaml, Python or Ruby. You must use each language at least once (over the course of PA2 - PA5); you will use one language (presumably your favorite) twice.
You may work in a team of two people for this assignment. You may work in a team for any or all subsequent programming assignments. You do not need to keep the same teammate. The course staff are not responsible for finding you a willing teammate. However, you must still satisfy the language breadth requirement (i.e., you must be graded on at least one OCaml program, at least one Ruby program, and at least one Python program).
Backslash not allowed \
Example error report output:
ERROR: 1: Lexer: invalid character: \
Example input:
Backslash not
allowed
Example .cl-lex output:
1
type
Backslash
1
not
2
identifier
allowed
The official list of token names is:
The .cl-lex file format is exactly the same as the one generated by the reference compiler when you specify --lex. In addition, the reference compiler (and your upcoming PA3 parser!) will read .cl-lex files instead of .cl files.
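As a rough illustration of the layout in the example above, the sketch below writes one field per line: the line number, the token name, and then the lexeme for tokens that carry one. The names `Token`, `LEXEME_TOKENS` and `serialize_tokens` are hypothetical, and the assumption that exactly `identifier`, `type`, `integer` and `string` are followed by a lexeme should be checked against the reference compiler's --lex output.

```python
from typing import List, NamedTuple, Optional


class Token(NamedTuple):
    """Hypothetical token record: source line, token name, optional lexeme."""
    line: int
    name: str
    lexeme: Optional[str] = None


# Assumption: only these token names are followed by a lexeme line.
LEXEME_TOKENS = {"identifier", "type", "integer", "string"}


def serialize_tokens(tokens: List[Token]) -> str:
    """Render tokens in a .cl-lex-style layout, one field per line."""
    fields = []
    for tok in tokens:
        fields.append(str(tok.line))
        fields.append(tok.name)
        if tok.name in LEXEME_TOKENS:
            fields.append(tok.lexeme if tok.lexeme is not None else "")
    return "".join(field + "\n" for field in fields)


example = [
    Token(1, "type", "Backslash"),
    Token(1, "not"),
    Token(2, "identifier", "allowed"),
]
print(serialize_tokens(example), end="")
```

Keeping serialization separate from tokenizing makes it easy to diff your output against the reference without touching the lexer rules.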
A Ruby lexical analyzer generator called ruby-lex is available, but you must download it yourself.
A Python lexical analyzer generator called ply is available, but you must download it yourself.
All of these lexical analyzer generators are derived from lex (or flex), the original lexical analyzer generator for C. Thus you may find it handy to refer to the Lex paper or the Flex manual. When you're reading, mentally translate the C code references into the language of your choice.
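To give a feel for the lex-style "ordered list of pattern rules" idea, here is a minimal sketch that uses only Python's standard `re` module rather than ply. It covers a tiny, hypothetical subset of Cool (whitespace, integers, type and object identifiers, and the keyword `not`) and is not a substitute for the full token list. The names `RULES`, `KEYWORDS`, `LexError` and `tokenize` are made up for illustration.

```python
import re

# Ordered (name, pattern) rules, in the spirit of a lex specification.
RULES = [
    ("newline", r"\n"),
    ("skip", r"[ \t\r\f\v]+"),
    ("integer", r"[0-9]+"),
    ("type", r"[A-Z][A-Za-z0-9_]*"),
    ("identifier", r"[a-z][A-Za-z0-9_]*"),
]
KEYWORDS = {"not"}  # illustrative subset only

# One master pattern with a named group per rule.
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in RULES))


class LexError(Exception):
    """Raised on the first character that no rule accepts."""

    def __init__(self, line, char):
        super().__init__(f"ERROR: {line}: Lexer: invalid character: {char}")
        self.line = line


def tokenize(source):
    """Return a list of (line, token_name, lexeme_or_None) tuples."""
    tokens, line, pos = [], 1, 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise LexError(line, source[pos])
        kind, text = match.lastgroup, match.group()
        pos = match.end()
        if kind == "newline":
            line += 1
        elif kind == "skip":
            continue
        elif kind in ("identifier", "type") and text.lower() in KEYWORDS:
            tokens.append((line, text.lower(), None))  # keyword: no lexeme
        else:
            tokens.append((line, kind, text))
    return tokens
```

One difference to keep in mind: Python's regex alternation takes the first alternative that matches, whereas lex-derived tools pick the longest match, so rule order matters more in this sketch than it would in a real generator.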
My personal opinion is that the OCaml and Python tools are a bit more mature (i.e., easier to use) than the Ruby tools for this particular project, but feel free to prove me wrong. In addition, this is the programming project that will involve the least amount of "native coding", so if you have a least favorite language of the three you might consider using it for this project.
$ cool --out reference --lex file.cl
$ my-lexer file.cl
$ diff -b -B -E -w file.cl-lex reference.cl-lex
$ cool --out reference --lex file.cl
$ ocamllex my-lexer.mll
$ ocaml my-lexer.ml file.cl
$ diff file.cl-lex reference.cl-lex
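If you would rather script this comparison, the following sketch approximates a whitespace-insensitive diff using Python's standard `difflib`. The helper names `normalize` and `compare_lex_output` are hypothetical, and the normalization (dropping all whitespace and blank lines) is only a loose imitation of the diff flags shown above, so it can hide differences that a plain diff would catch.

```python
import difflib


def normalize(text):
    """Remove all whitespace inside each line and drop lines left empty."""
    lines = []
    for raw in text.splitlines():
        squeezed = "".join(raw.split())
        if squeezed:
            lines.append(squeezed)
    return lines


def compare_lex_output(mine, reference):
    """Return unified-diff lines; an empty list means the outputs agree."""
    return list(
        difflib.unified_diff(
            normalize(reference),
            normalize(mine),
            fromfile="reference.cl-lex",
            tofile="file.cl-lex",
            lineterm="",
        )
    )
```

You would read both files into strings yourself and pass their contents in; an empty result means the two outputs agree up to whitespace.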
You may find the reference compiler's --unlex option useful for debugging your .cl-lex files.
Need more testcases? Any Cool file you have (including the one you wrote for PA1) works fine. The ten in the cool-examples.zip file should be a good start. There's also one among the PA1 hints. You'll want to make more complicated test cases -- in particular, you'll want to make negative testcases (e.g., testcases with malformed string constants).
Students on a team are expected to participate equally in the effort and to be thoroughly familiar with all aspects of the joint work. Both members bear full responsibility for the completion of assignments. Partners turn in one solution for each programming assignment; each member receives the same grade for the assignment. If a partnership is not going well, the teaching assistants will help to negotiate new partnerships. Teams may not be dissolved in the middle of an assignment.
If you are working in a team, exactly one team member should submit a PA2 zipfile. That submission should include the file team.txt, a one-line, one-word flat ASCII text file that contains the email address of your teammate. Don't include the @virginia.edu bit. Example: If ph4u and wrw6y are working together, ph4u would submit ph4u-pa2.zip with a team.txt file that contains the word wrw6y. Then ph4u and wrw6y will both receive the same grade for that submission.
This seems picayune, but in the past we've had students fail to correctly format this one-word file. Thus you now get a point on this assignment for either formatting this file correctly (i.e., including only a single word that is equal to your partner's UVA email ID) or not including it (and thus not working in a pair).
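A quick self-check before you submit might look like the sketch below. The function name `is_valid_team_file` is hypothetical, and two assumptions are baked in: that an email ID consists only of ASCII letters and digits (as in ph4u and wrw6y), and that a trailing newline after the word is acceptable. The actual grading check may differ.

```python
import re


def is_valid_team_file(contents):
    """True if contents is exactly one word that looks like an email ID."""
    words = contents.split()
    if len(words) != 1:
        return False
    # Assumption: IDs use only ASCII letters and digits, with no '@' part.
    return re.fullmatch(r"[A-Za-z0-9]+", words[0]) is not None


print(is_valid_team_file("wrw6y\n"))
print(is_valid_team_file("wrw6y@virginia.edu\n"))
```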
In each case we will then compare your output to the correct answer:
If your answer is not the same as the reference answer you get 0 points for that testcase. Otherwise you get 1 point for that testcase.
For error messages and negative testcases we will compare your output but not the particular error message. Basically, your lexer need only correctly identify that there is an error on line X. You do not have to faithfully duplicate our English error messages. Many people choose to (because it makes testing easier) -- but it's not required.
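When testing negative testcases yourself, you can mimic this looser comparison by checking only the reported line number. The sketch below assumes the error report is the first line of the output and starts with `ERROR: <line>: Lexer: `, as in the example error report; the names `ERROR_RE`, `error_line` and `same_error_line` are hypothetical, and the real autograder may compare differently.

```python
import re

# Assumed prefix, taken from the example error report format.
ERROR_RE = re.compile(r"^ERROR: (\d+): Lexer: ")


def error_line(output):
    """Return the line number from a leading error report, or None."""
    first = output.splitlines()[0] if output.strip() else ""
    match = ERROR_RE.match(first)
    return int(match.group(1)) if match else None


def same_error_line(mine, reference):
    """True if both outputs report a lexer error on the same line."""
    a, b = error_line(mine), error_line(reference)
    return a is not None and a == b
```

With this, a report such as `ERROR: 1: Lexer: bad char` counts as agreeing with `ERROR: 1: Lexer: invalid character: \`, since only the line number is compared.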
We will perform the autograding on some unspecified test system. It is likely to be Solaris/UltraSPARC, Cygwin/x86 or Linux/x86. However, your submissions must officially be platform-independent (not that hard with a scripting language). You cannot depend on running on any particular platform.
There is more to your grade than autograder results. See the Programming Assignment page for a point breakdown.
Your submission may not create any temporary files. Your submission may not read or write any files beyond its input and output. We may test your submission in a special "jail" or "sandbox".