cs1120: Problem Set 7 - Implementing Interpreters

Turn-in Checklist:

Bring in to class a stapled turn-in containing your written answers to all the questions. If you worked with a partner, you and your partner should submit a single writeup with both of your names and UVA Email Ids on it. You must clearly indicate both names and UVA IDs in big block letters if you are working in a team. Each partner should use our automatic adjudicator service to separately submit just the YourCode.java file. In the YourCode.java file, be sure to add the UVA Email Id of each member of your team to the authors array. Each partner must separately submit a copy of this file.

Collaboration Policy - Read Carefully

For this assignment, you make work either alone or with a partner of your choice.

Remember to follow the pledge you read and signed at the beginning of the semester. For this assignment, you may consult any outside resources, including books, papers, web sites and people, you wish except for materials from previous cs1120, cs150, and cs200 courses. You may consult an outside person (e.g., another friend who is a CS major but is not in this class) who is not a member of the course staff, but that person cannot type anything in for you and all work must remain your own. That is, you can ask general questions such as "can you explain recursion to me?" or "how do lists work in Python?", but outside sources should never give you specific answers to problem set questions. If you use resources other than the class materials, lectures and course staff, explain what you used in your turn-in.

You are strongly encouraged to take advantage of the scheduled help hours and office hours for this course.

Purpose

Understand how language interpreters work
Understand the meta-circular evaluator
Ruminate on the nature of univeral programming languages
Learn how changing the evaluation rules changes programming

Background

Languages are powerful tools for thinking. One way to solve problems is to think of a language in which a solution to the problem can be easily expressed, and then to implement that language. This is the "great Hop forward" that Grace Hopper made in the 1950s: we can produce programs that implement languages. The input to the program is an expression specification in some language. If the program is an interpreter, the result is the value of that expression.

In this problem set, we provide a working interpreter for a mini-Python language, which is approximately a subset of Python. The interpreter implements the Python evaluation rules with state. Your assignment involves understanding the interpreter and making some additions and changes to it. If you are successful, you will produce an interpreter for our mini-Python language.

Reading: Read Chapter 11 (Interpreters) of the course book.
You may also want to refer to the Java Guide (used for PS6).

Download: Download ps7.zip to your machine and import the project with Eclipse (same way you did for PS6). Go back and look at the screenshots if you get stuck. This project contains a Java implementation of an interpreter for the Python-like language. The implementation closely matches the description in Chapter 11.

For the questions in this assignment, you will modify YourCode.java which you will submit electronically when you are finished.

Getting Started With mini-Python

Interpreting arbitrary Python code would be quite a challenge. In order to make our lives easier, we will work with mini-Python, which is missing some of features we're used to from Python. Here are some of the main differences between mini-Python and full Python:

What is in mini-Python?	What is not in mini-Python?
Variables and variable assignment Procedures (including recursive procedures and procedures defined inside of other procedures) Integers and Basic Integer Arithmetic print statements Comparison functions like `<`, `>`, and `==` `true` and `false` comparable with `==`	for loops lambda expressions Strings and decimal numbers Negative numbers (unless obtained by subtraction) The keywords `not`, `in`, `or`, `and`, etc. Comments Built in functions like `map`, `filter`, `len`, `str`, etc. List Comprehensions

The complete grammar for mini-Python is specified in Grammar in the src folder and can also be found by clicking here. In this problem set, we will be improving mini-Python to have support for lists and while loops, which will not work at first.

Interpreter Overview

Our interpreter is divided into three different phases: tokenization, parsing, and evaluation. These roughly correspond to the main activities involved in reading English or Text Messages. For example, to understand the text message I<3U Alan you must:

Break it up into "words": "I", "<3", "U" and "Alan". This is called tokenization, and each "word" (or bit of non-space punctuation) is called a token or terminal.
Determine if the "words" form a "sentence" or not. This is called parsing, and involves checking the words against a grammar (e.g., Sentence can be Subject Verb Object, etc.).
Finally, understand or interpret the resulting "sentence". This is called evaluation or interpretation. (This is the stage where we know that "<3" means "love", etc.)

To see how these different phases help us interpret a program, let's consider the following simple program in mini-Python:

x = 5
if x < 6:
    print 1
else:
    print 0

In order to interpret this program, we need to read it in from a file as one long String. We can turn a file into a Java String using the loadFile function provided in Main.java:

String myProgram = loadFile("code/Example1");

In tokenization stage, the interpreter breaks up the input string into smaller parts called "tokens". To tokenize a String, we use the tokenize method defined in the Tokenizer class. For this particular program, we would get the following list of tokens:

System.out.println(Tokenizer.tokenize(myProgram)));
// [x, =, 5, NEWLINE, if, x, <, 6, :, NEWLINE, 
//         TAB, print, 1, NEWLINE, else, :, NEWLINE, TAB, print, 0, NEWLINE]

Breaking the input into pieces is a good first step, but it doesn't tell us much about the program we are trying to run. In the next phase, parsing, we use the grammar for our mini-Python language in order to build a more helpful structure: an abstract syntax tree (AST). We'll discuss more about abstract syntax trees later in the problem set. For now, an abstract syntax tree is just a fancy computer representation of those sentence diagrams you may have seen in elementary school.

In order to parse an input String, we use the parse method defined in the Parser class:

System.out.println(Parser.parse(myProgram));
/* [[assign, x, 5], 
    [if, 
        [<, x, 6], 
        [[print, 1]], 
        [[print, 0]]]]  */

Now we're starting to see the different components of the program take shape. The AST helps us understand the structure of the input program. The if statement, for example, is now represented as a list with four elements: the String "if", the condition, the code to execute if the condition is True, and the code to execute if the condition is False.

With the program parsed, we are now ready to run the program. Given an AST and an Environment object, the meval method in the Evaluator class will run (i.e., evaluate) the input program in the given environment:

// Create a new Environment to run our code
Environment env = Environment.createGlobalEnvironment();

// Parse the program
ArrayList<Object> ast = Parser.parse(myProgram);

// Run the program
Evaluator.meval(ast.get(0),env); // Evaluate x=5
Evaluator.meval(ast.get(1),env); // Evaluate the if statement
// => 1

Our example program printed out 1, just as we expected. Since meval only works on one mini-Python statement at a time, we had to break our AST in to two parts and evaluate them separately.

Breaking the AST into pieces is a bit of a pain. To help out, Evaluator defines another helpful function evalSequence, which automatically evaluates all of the AST:

Environment env = Environment.createGlobalEnvironment();
ArrayList<Object> ast = Parser.parse(myProgram);
Evaluator.evalSequence(ast, env);
// => 1

We will be running programs in this way quite frequently, so Main.java defines another helper function runRile, which does all of this for us:

runFile("code/Example1");
// => 1

It may be challenging to understand all of the parts of our interpreter the first time you see them. We will clarify each of the parts of the interpreter and provide more detail throughout the problem set.

Practicing with mini-Python

Look at ExampleSquare in the code folder:

def square(x):
	return x*x
print square(4)
print square

We can run this mini-Python code in Main.java by typing:

runFile("code/ExampleSquare")
// 16
// Interpreter.Procedure@6ba8fb1b

If you run this code on your own machine, the second printed line may be slightly different but should generally look the same.

Question 1: Explain why Interpreter.Procedure@6ba8fb1b (or something similar) was printed when we ran ExampleSquare. Write this answer on paper and turn it in with the written part of your assignment.

Hint: To answer this question you may need to read through the PS7 code and/or try out that same program in PyCharm/Python.

Question 2: In the text file "YourFactorial" in the code folder, use mini-Python to write a recursive factorial function, and have it evaluate the expression 5!. Start with:

def factorial(x):
	#Your code here
print factorial(5)

When you've done it correctly, you should get the following output (when run in Main.java):

runFile("code/YourFactorial")
// => 120

Copy down (or print out) the YourFactorial code you wrote and turn it in with the written part of this assignment.

Hint: Review the mini-Python grammar. For example, mini-Python requires an else: for every if.

The Parser

The Parser is a big chunk of our Interpreter. The Parser does not actually run any of the code in our program, but rather helps us prepare to run the program by turning a sequence of tokens into a much more helpful structure: an Abstract Syntax Tree (AST).

The Parser takes a sequence of tokens from the tokenizer (as a TokenStream, which is really just an ArrayList of strings), and attempts to derive them using the rules of the grammar for our mini-Python language (just like you did in Problem Set 3). If the sequence of tokens is in the language of the grammar, our Parser produces a nice tree structure, which exactly diagrams the structure of our program.

To see how this works, let's see how our interpreter parses if statements (recall that mini-Python requires an else: branch for each if). In the Parser class, we define the method parseIf as follows:

public static ArrayList<Object> parseIf(TokenStream tokens) {
		// Make a new ArrayList to store our parsed If statement
		ArrayList<Object> retval = new ArrayList<Object>();
		
		// Make "if" the first element of our output list
		retval.add("if");
		tokens.munchAssert("if");
		
		// Parse the condition
		retval.add(parseExpression(tokens));
		
		// Parse the true clause
		tokens.munchAssert(":");
		retval.add(parseSequence(tokens));
		
		// Parse the false clause
		tokens.munchAssert("else");
		tokens.munchAssert(":");
		retval.add(parseSequence(tokens));
		return retval;
	}

The goal of parseIf is to take a TokenStream tokens and transform it into an ArrayList<Object> where the elements have the following format:

["if", condition, true-clause, false-clause]

This is done by reading off strings from the TokenStream. A TokenStream is a subclass of ArrayList<String>, which means that any methods you can use on an ArrayList, you can also use on a TokenStream. The TokenStream class also defines a few extra helpful functions:

String munch() - Returns the first token in the TokenStream and removes it from the list. For example, if we call munch on the list ["a","b","c"], it will return "a" and change the list to ["b","c"].
void munchAssert(String val) - Also removes the first token in the TokenStream, but instead of returning it it checks that it has the value val and throws an error if it does not. This is used to check for tokens that must be present, like : in Python.

Now that we know how TokenStream works, let's see what happens when we run parseIf with a subset of the tokens from our earlier example:

["if", "x", "<", "5", ":", "INDENT", "print", "1", 
    "DEDENT", "else", ":", "INDENT", "print", "0", "DEDENT"]

First, parseIf makes a new ArrayList<String> to store the parsed if statement. It then uses the line tokens.munchAssert("if"); to ensure that the current token is indeed "if". Since munchAssert removes the first token, our list of tokens now has the value:

["x", "<", "5", ":", "INDENT", "print", "1", 
    "DEDENT", "else", ":", "INDENT", "print", "0", "DEDENT"]

We now need to consult the grammar rule for if statements to know what to do next. In mini-Python, if statements have the following rule:

IfStatement := if Expression : Sequence else : Sequence

According to this rule, since we've just seen the "if" token, we should now expect an Expression representing the condition of the if statement. An Expression is just a snippet of mini-Python code that returns a value like 1+1, (1+x)<y, etc. Since we expect an Expression, we call the parseExpression method. In this case, parseExpression will return ["<","x","5"] representing our original mini-Python expression x<5. At this point, tokens has the value:

[":", "INDENT", "print", "1", "DEDENT", "else", ":", 
    "INDENT", "print", "0", "DEDENT"]

Next, the line tokens.munchAssert(":"); ensures that : is the next token and removes it. Our grammar rule now tells us that we expect a Sequence, so we call the appropriately named parseSequence method, which will munch the next four tokens and return the list [["print", "1"]]. We then repeat this process for the false-clause, which is also a Sequence.

After performing all of these calculations, evalIf returns the following list:

["if", ["<", "x", "5"], [["print", "5"]], [["print", "6"]]]]

which represents the abstract syntax tree for our if statement. Note that you do not need to understand how parseSequence and parseExpression work, but you may refer to their commented definitions in Parser if you are curious.

Adding Loops

Our mini-Python language does not support for loops, but we would like it to support while loops. While loops have a condition and a body, and will continue to run the body as long as the condition is true. Here is an example that prints out the numbers between 1 and 10:

i = 1
while i <=10:
	print i
	i = i + 1

Here is the same code in Java:

int i = 1;
while(i <= 10) {
	System.out.println(i);
	i = i + 1; // or i++;
}

In our code, we initialize an integer i to 1. Our condition is i <= 10 and in our body we print out i and then increments it. It is very important that we increment i at the end of the body. If we don't, then we will have an "infinite loop" where mini-Python will continue evaluating the while loop without stopping. It would be bad to cause such an infinite loop.

In the next few problems, we will add support for while loops to mini-Python. We will do this in two parts. We first have to add the while loop to the Parser, then we will add the code that actually runs the while loop in the Evaluator.

Question 3: Complete the parseWhile method in YourCode.java. To parse a WhileStatement, we'll need to parse both the loop condition and the loop bodyThe condition to keep looping should be an Expression, while the body should be a Sequence. More formally, you should expect while loops to follow the following grammar rule:

WhileStatement := while Expression : Sequence

Your implementation of parseWhile should produce an ArrayList<Object> with the following format: ["while", condition, body].

Hint: Your code for parseWhile will likely look very similar to the code parseIf, so that may be a good place to start.

Once completed, you should be able to get the following results by running this code in Main.java:

System.out.println(Parser.parse(loadFile("code/TestQuestion3")));
// => [[while, [<, i, [-, max, 1]], [[print, i], [assign, i, [+, i, 1]]]]]

where code/TestQuestion3 has the following mini-Python code:

while i < max-1:
	print i
	i = i + 1

The Evaluator

The Evaluator is where we actually run the code. The Evaluator takes in parsed code from the Parser as input and simulates the Python program according to the rules defined in Evaluator.java. Understanding Evaluator.java is very important for your success in PS7. Make sure you read all of this class.

meval

The key to the Evaluator is the meval method, which is partially reproduced below:

  
	public static Object meval(Object expr, Environment env) {
		    ...
		    ArrayList<Object> exp = (ArrayList<Object>) expr;		

            // To keep things simple, the first element of 
            // the list is always the operation. That's why we
            // went to the trouble of representing "1<2" as ["<","1","2"]
			Object operation = exp.get(0);

			// Function Definitions
			if (operation.equals("def")) {
				return evalDef(exp,env); 
			}
			// If Statements
			else if (operation.equals("if")) {				
				return evalIf(exp,env);
			}        
            ...
            // Multiplication
			else if (operation.equals("*")) {
				return evalMult(exp,env);
			}
            ...
            // Less Than
			else if (operation.equals("<")) {
				return evalLessThan(exp, env);
			}
            ...
	}

meval takes in a statement or expression as an ArrayList, determines what kind of statement or expression it is, and then calls the appropriate "eval" function to evaluate that statement or expression. In this way, meval is like a switchboard for all of our methods in Evaluator.

Environments

Environment objects are used to represent different program environments used throughout Evaluator. The environment contains a HashMap called frame that maps variable and procedure names to the corresponding actual values. For example, if we've just run the line "x=5", then frame.get("x") will return 5. You should take a moment and read through the Environment class.

Errors

Sometimes when we write invalid Python code, we get an exception. For example, if we try to index incorrectly into a list, we would get some sort of index out of bounds error. In our interpreter, we want to let programmers know if they wrote buggy code. We have provided the ParseError and EvalError classes to allow you to do this. When the interpreter should raise an error (e.g., to indicate a problem in the input mini-Python program), use:

throw new EvalError("error_message");

and replace "error_message" with a good description of the error.

Evaluating Simple Expressions

Let's see how our Interpreter evaluates certain kinds of expressions. One of the simplest types of Expressions in mini-Python is a ComparisonExpression, which represent any operation using >, <, or ==. Example ComparisonExpressions include 1<2, x>y, and (x*y/2)==(getNum(z)).

Note that for these last two examples, we cannot immediately compare two numbers, since we do not know the value of x or (x*y/2) right away. To determine the value of these expressions, we need to first evaluate them using meval. After evaluating both expressions, we should have two integers, which we can then compare and return True or False accordingly.

This is demonstrated in the evalLessThan method included below, which evaluates ComparisonExpressions that have the format ["<", expression1, expression2] :

	public static Boolean evalLessThan(ArrayList<Object> exp, Environment env) {
		// Evaluate the two expressions we'll be comparing
		Object v1 = meval(exp.get(1),env);
		Object v2 = meval(exp.get(2),env);
		
		// Check that both values are indeed integers
		if (v1 instanceof Integer && v2 instanceof Integer) {
			
			// Perform the actual calculation
			return (Integer)v1<(Integer)v2;
		}
		
		// Throw an error if we try to compare non-Integers
		throw new EvalError("Cannot compare " + v1 + " and " + v2);
	}

Even though we are expect that both expressions that we are comparing should be Integers, we double check their type anyway, and throw a convenient error message in the event that they are not both Integers.

Question 4: In YourCode.java, complete the evalLessThanEqual method. This method is called by meval in Evaluator.java and takes in an ArrayList and an Environment object. You can assume the ArrayList has the format ["<=", expression1, expression2].

Your method should evaluate the two expressions in the ArrayList, check to make sure they are integers, and then return true if the first integer is less than or equal to the second integer (and false otherwise). If the two expressions do not both evaluate to Integers, you should throw an EvalError.

Note that since your code is in the YourCode class rather than the Evaluator class, methods like meval(...) need to be written as Evaluator.meval(...).

Once completed, you should be able to get the following results by running code in Main.java:

runFile("code/TestQuestion4");
// => True, True, False, False

where code/TestQuestion4 has the following mini-Python code:

def return6():
	return 6
x = 5
print (x <= 5)
print (x <= (return6()))
print (x <= (return6()-2))
print (10 <= (10-20))

Evaluating Loops

Now that we've had some practice writing "eval" methods, we can now write the code that will actually evaluate the while loops we parsed in Question 3.

Question 5: Complete the evalWhile method in YourCode.java. evalWhile should take as input an ArrayList formatted in the same way as the output of parseWhile: ["while", condition, body].

Using this information, you should write code that runs the body of the while loop for as long as condition evaluates to true (this means you will have to re-evaluate the condition each time you run the body). If the condition does not evaluate to a Boolean at any point, you should throw an EvalError.

Note that when testing this, you can assume that loop bodies contain no mini-Python return statements. This means that your evalWhile method should always return null.

Hints: You will probably have to use a Java while loop for this (although you could also use Java recursion). Also, recall that Evaluator.meval works only on a single statement or expression, while Evaluator.evalSequence works on a list of statements. Looking at evalIf might also help you write this method.

Once completed, you should be able to get the following results by running code in Main.java:

runFile("code/TestQuestion5");
// => 0, 0, 1, 0, 1, 2

where code/TestQuestion5 has the following mini-Python code:

def looptest(x):
	i = 0
	sum = 0
	while (i < x):
		j = 0
		while j < i:
			print j
			j = j+1
		i=i+1
looptest(4)

Lists

Since we had so much fun with lists in Python, it would be a shame not to support them in mini-Python. Fortunately, much of the code needed to work with lists has already been written for you. In the Parser class, you'll find the methods parseList and parseElement, which parse lists and list element accesses (e.g., myList[1]) respectively.

Our implementation for parseElement is based on the mini-Python grammar rule for list element accesses:

ElementExpression := PrimitiveExpression ElementAccess
ElementAccess := [ Expression ] ElementAccess | ε

Note that this rule allows you to perform element accesses on nested lists. For example, the following is valid mini-Python code according to this production rule:

x = [[1,[3,4,5]],[2,3],[7,9]]
print x[0][1][1]
# => 4

Question 6: The grammar for the mini-Python language is currently missing the production rule for ListExpression. On paper, write an appropriate rule that support lists like [], [1,2,3], and [2+2,func()], but does not support lists like [return 5, print 7] and [def f(x): ..., if(x > y): ...]. You may create an extra production rule in addition to ListExpression, but you should not need more than two rules.

Hint: You may want to look at parseList in Parser to get a feel for what your rule should look like.

Evaluating Lists

While the appropriate list "parse" methods have been provided for you, you'll need to write the code to evaluate lists and list element accesses yourself.

Question 7: Complete the evalList method in YourCode.java. Your function evalList should take in an ArrayList formatted in the following way: ["list", expression1, expression2, expression3, ...] It should evaluate each of the expressions and build a new ArrayList<Object> with the result.

Hint: You may want to use some sort of loop for this problem.

Once completed, you should be able to get the following results by running code in Main.java:

runFile("code/TestQuestion7");
// => [1,2,3], [2,4,3,-5], 6

where code/TestQuestion7 has the following mini-Python code:

print [1,2,3]
print [1+1,2*2,9/3,5-10]
print |[1,2,3,4,5,6]|

Note that the vertical bar syntax |x| in mini-Python is equivalent to len(x) in normal Python. You will not need to implement anything related to this syntax in this problem set.

Question 8: Complete the evalGetListElement method in YourCode.java. This method will get the value at a certain index in a list. Your evalGetListElement function should take in an ArrayList formatted in the following way: ["get", list_expression, index_expression].

You should evaluate both the list_expression and index_expression, which should return an ArrayList and Integer, respectively. Your code should then return the appropriate element from the ArrayList. You should produce an EvalError in the following three cases:

list_expression does not evaluate to a list
index_expression does not evaluate to an Integer
The index is either less than 0 or greater than the size of the list-1

Note that mini-Python does not support the list[1:] syntax.

Hint: In order to check if an object obj is an ArrayList, use the syntax: (obj instanceof ArrayList<?>). To make Java happy you actually do need to type a question mark ? there, not the word Object or anything else.

Once completed, you should be able to get the following results by running code in Main.java:

runFile("code/TestQuestion8");
// => 9, 1, 7, 3, 2, 5

where code/TestQuestion8 has the following mini-Python code:

myList = [9,1,7,3,2,5]
i = 0
while i<|myList|:
	print myList[i]
	i = i+1

Checking for List Equality

mini-Python currently allows us to compare Integers and Booleans using ==. We would like to extend == so that it can also compare two Lists.

Question 9: Extend the evalEquals method in YourCode.java so that it can compare two lists. Note that this function should not throw any errors; if there is a case that seems like it should be an error (for example, one expression is a list while the other is an integer), you should return false.

Once completed, you should be able to get the following results by running code in Main.java:

runFile("code/TestQuestion9");
// => True, False, True, False, True

where code/TestQuestion9 has the following mini-Python code:

print [1, 2, 3] == [1, 2, 3]
print [1, 2] == [1,2,3]
print [[1,2], True] == [[1,2], True]
print [5] == False
print [[]] == [[]]

Question 10: As part of your written turn-in, print out (or copy and paste) the result of your interpreter evaluating the ExampleOldFavorites code. Indicate whether your interpreter's output is correct or incorrect.

Question 11: Fill in the missing text for the variables defined for Question 10 with your commands:

pleased = "What are you most pleased about regarding this course?"

displeased = "What are you most displeased about regarding this course?"

hopetolearn = "What did you most hope to learn in this course that we have not yet
covered?"

tacomments = "Who is your favorite TA, and why? What have the TAs done well? What could
the TAs do better?"

As usual, we appreciate your comments on any subject, but especially any concrete suggestions for improving the class and for making the rest of the class as worthwhile as possible.

Submitting Your Assignment

Since we are only submitting one file (YourCode.java), the way you Export your project is slightly different than before. Once again, start by going to File->Export. This time, however, select File System instead of Archive File. In the Export dialog, press Deselect All, then on the left, find the and select the folder PS7->src->Interpreter. Then, on the right side, check only the box for YourCode.java. Then use Browse... to select a folder to export your file to (your Desktop, for example). Now press Finish. Your YourCode.java file should now be in the directory you specified.

Turn-in Reminder: Submit your YourCode.java file containing all of your answers using the automatic adjudicator. Also make sure to turn in your written answers in class.

cs1120 Problem Set 7: mini-Python Implementing Interpreters

Collaboration Policy - Read Carefully

Purpose

Background

Getting Started With mini-Python

Interpreter Overview

Practicing with mini-Python

The Parser

Adding Loops

The Evaluator

meval

Environments

Errors

Evaluating Simple Expressions

Evaluating Loops

Lists

Evaluating Lists

Checking for List Equality

Submitting Your Assignment

cs1120 Problem Set 7: mini-Python
Implementing Interpreters