Programming Assignment 1 - The Rosetta Stone

Instructions

The Rosetta Stone aided linguistic understanding by providing the same text in three different languages. In this project you will implement the same simple program in five separate languages. Each of your five implementations will have exactly the same interface, will otherwise adhere to the same specification, and should behave exactly the same way.

The five languages are Ruby, OCaml, Python, C and Cool. You will use Ruby, OCaml and Python for PA2 through PA5 (you must use each one at least once; you may use your favorite one twice). In PA2 through PA5 you will write an interpreter for Cool, so it is important that you understand how Cool works now. The C (not C++) implementation is so that you can gain a better appreciation for time-saving features of higher-level languages (e.g., automatic memory management, first-class higher-order functions, object-oriented programming). No assignment in this class after PA1 will use C.

You can find tutorials, manuals and other resources for these languages on the main project webpage.

The Specification

Your program must take in a list of tasks and either output an order in which to perform them or the single word cycle.

Your program will accept a number of lines of textual input (via standard input). That text input will contain a non-zero but even number of lines. Every two lines represent a pair of tasks. The first line gives the name of a task, the second line gives the name of a task that it depends on. This text input is also called the task list.

The task list will contain only standard ASCII characters (no UTF/8 Unicode or special accents). The goal is to test programming and program language concepts, not your internationalization abilities.

Each task name starts at the beginning of the line and extends all the way up to (but not including) the end of that line. So the newline or carriage return characters \r or \n are not part of the task name. Each task name is at most 60 characters long. (This limit is to make the C implementation easier; if you can support longer strings natively, feel free to do so.)

Example task list:

The interpretation is that in order to learn C one must first read the C tutorial and that in order to do PA1 one must first learn C. Desired output for this example: If the task list containts a cycle of any size, your program should output exactly and only the word cycle. Example cyclic input: Even if the task list contains a few non-cyclic parts, any single cycle forces you to output only the word cycle.

There is no fixed limit on the number of lines in the task list (although it is not zero and it is even).

Two tasks with the same name are really just the same task. Use standard string equality.

Duplicated pairs of tasks are not allowed. For example:

... that task list is not valid input because the pair learn C/read the C tutorial appears twice. Program behavior if the task list contains a duplicate pair is undefined. You will not be tested on it.

Your program may not perform any other file I/O, such as creating a temporary file to keep track of some intermediate sorting results. You do not need to create any such temporary files to solve this problem.

Choosing Among Unconstrained Tasks

If there are multiple outstanding unconstrained tasks, your program should output them in ascending ASCII alphabetical order. That is, if you ever have two or more tasks, each of which has no remaining dependencies, output the one that comes first ASCII-alphabetically. (This constraint makes your program deterministic; for any given input there is only one correct output.) Example: Because r comes before u, your output should be: To put it another way, consider this task list: Which yields a dependency graph like this: The proper ordering for this set of tasks is A B D E C. Note that B comes before D and E, even though B depends on A. This is because, once A is finished, B is free to go and it comes first alphabetically. You may want to consider this requirement when you pick your sorting algorithm. Given this requirement the answer A D E B C is incorrect and will receive no credit.

Commentary

This problem is just topological sort not-so-cleverly disguised. Feel free to look up how to do toposort on the internet (but remember that you must turn in your own work; you may not copy someone else's code and claim it as your own).

Take a look at the files in pa1-hint.zip. You could do worse than using them as starting points.

If you're having trouble writing anything reasonable in Cool, don't forget to look at the example Cool programs that are distributed near the interpreter.

Building and maintaining explicit graph structure is probably overkill.

Since you do not know the number of tasks in advance you will need some sort of dynamic memory allocation.

I recommend solving the problem in the Python, Ruby or OCaml first (depending on where you are most comfortable) and then just translating your solution into the others. Translating into Cool and C will not be as easy, however.

Use this as an opportunity to see what you like about various languages. What do you think of Python's enforced tabbing? How about ML's abysmal error reporting? Ruby's below-par execution speed? C's general hideousness?

For bonus brownie points, try to make your OCaml, Ruby and Python implementations as functional as possible (e.g., eschew side effects).

Don't know which language to pick for PA0? Look at the pa1-hint.zip file and pick a language that looks like it would be easy to debug. Your first attempt (for PA0) will probably contain the most algorithmic errors. After that it's more of a matter of translating it into other languages.

What To Turn In For PA1

You must turn in a zip file containing exactly these seven files:
  1. rosetta.c -- C implementation
  2. rosetta.cl -- Cool implementation
  3. rosetta.ml -- OCaml implementation
  4. rosetta.py -- Python implementation
  5. rosetta.rb -- Ruby implementation
  6. testcase.list -- A valid task list that you made up.
    and
  7. readme.txt -- your README file
Your zip file must be named your_email-pa1.zip. For example, if your University email address is wrw6y you must call your zip file wrw6y-pa1.zip. Do not use your gmail address or whatnot -- we really want your university ID here.

The testcase.list file should contain a valid task list (it may or may not contain a cycle -- your choice). It will be used as one of the test cases to grade all of the submissions, so you have the chance to be certain of getting at least one right.

The readme.txt file should be a plain ASCII text file (not a Word file, not an RTF file, not an HTML file) describing your design decisions. Which language did you start with? How did you store the (implicit) graph? Which language was the hardest? One or two English paragraphs should suffice. Spelling, grammar, capitalization and punctuation count.

Submit your zip file to Toolkit.

What To Turn In For PA0

PA0 is a checkpoint to make sure that you do not fall behind on PA1. You have more than two weeks to complete all of PA1, but if you try to do it all at the last minute you're probably doomed to failure.

For the one-week PA0 checkpoint, you must have at least two of the implementations done. This separates the project into two phases: (0) Can I write this program at all? and (1) Can I write this program in three other languages?

For PA0 you must turn in two source files. Use the following names:

  1. rosetta.c -- C implementation
  2. rosetta.cl -- Cool implementation
  3. rosetta.ml -- OCaml implementation
  4. rosetta.py -- Python implementation
    and/or
  5. rosetta.rb -- Ruby implementation
Each of your two source files should have one of those five names (e.g., rosetta.whatever). Submit a single zipfile containing exactly those two files. As above, your zip file must be named your_email-pa0.zip. For example, if your University email address is wrw6y you must call your zip file wrw6y-pa0.zip. Do not use your gmail address or whatnot -- we really want your university ID here. Submit using Toolkit, as above.

Autograding

We will use scripts to run your program on various testcases. The testcases will come from the testcase.list files you and your classsmates submit as well as held-out testcases used only for grading. Your programs cannot use any special libraries (aside from the OCaml unix and str libraries, which are not necessary for this assignment). We will use (loosely) the following commands to execute them: In each case we will then compare your output to the correct answer: If your answer is not the same as the reference answer you get 0 points for that testcase. Otherwise you get 1 point for that testcase.

We will perform the autograding on some unspecified test system. It is likely to be Solaris/UltraSPARC, Cygwin/x86 or Linux/x86. However, your submissions must officialy be platform-independent (not that hard with a scripting language). You cannot depend on running on any particular platform.

There is more to your grade than autograder results. See the Programming Assignment page for a point breakdown.

Notes

There was going to be an option to use lolcode as a sixth language, but none of the available lolcode interpreters or compilers are both stable enough to be used in a classroom setting and also sufficiently-featured to handle this assignment. Oh noes!