Java(TM) Compiler Compiler(TM)

Error Reporting and Recovery



This is a rough document describing the new error recovery features in

Version 0.7.1.  This document also describes how features have changed

since Version 0.6.



The first change (from 0.6) is that we have two new exceptions:



    . ParseException

    . TokenMgrError



Whenever the token manager detects a problem, it throws the exception

TokenMgrError.  Previously, it used to print the message:



  Lexical Error ...



following which it used to throw the exception ParseError.



Whenever the parser detects a problem, it throws the exception

ParseException.  Previously, it used to print the message:



  Encountered ... Was expecting one of ...



following which it used to throw the exception ParseError.



In Version 0.7.1, error messages are never printed explicitly;

instead, this information is stored inside the exception objects that

are thrown.  Please see the classes ParseException.java and

TokenMgrError.java (that get generated by JavaCC during parser

generation) for more details.



If the thrown exceptions are never caught, then a standard action is

taken by the virtual machine which normally includes printing the

stack trace and also the result of the "toString" method in the

exception.  So if you do not catch the JavaCC exceptions, a message

quite similar to the ones in Version 0.6 is displayed.



But if you catch the exception, you must print the message yourself.
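
As a minimal sketch of what catching and printing can look like: the
nested classes below are simplified stand-ins for the generated
ParseException and TokenMgrError (they are not the generated files), and
"parse" is a hypothetical placeholder for a call into a generated parser.

```java
final class CatchAndReport {
    // Simplified stand-ins for the classes JavaCC generates.
    static class ParseException extends Exception {
        ParseException(String message) { super(message); }
    }

    static class TokenMgrError extends Error {
        TokenMgrError(String message) { super(message); }
    }

    // Hypothetical placeholder for invoking a generated parser.
    static void parse(String input) throws ParseException {
        if (input.contains("#")) {
            throw new TokenMgrError("Lexical error: unexpected character '#'");
        }
        if (!input.endsWith(";")) {
            throw new ParseException("Encountered <EOF>. Was expecting: \";\"");
        }
    }

    // Returns the text that was reported, or null if the input parsed cleanly.
    static String parseAndReport(String input) {
        try {
            parse(input);
            return null;
        } catch (ParseException e) {
            // Once the exception is caught, nothing is printed unless we do it.
            System.err.println(e.getMessage());
            return e.getMessage();
        } catch (TokenMgrError e) {
            System.err.println(e.getMessage());
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        parseAndReport("x = 1");   // reports a parse problem
        parseAndReport("x = #;");  // reports a lexical problem
        parseAndReport("x = 1;");  // reports nothing
    }
}
```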



Exceptions in Java are all subclasses of type Throwable.  Furthermore,

exceptions are divided into two broad categories - ERRORS and other

exceptions.



Errors are exceptions that one is not expected to recover from -

examples of these are ThreadDeath or OutOfMemoryError.  Errors are

indicated by subclassing the exception "Error".  Exceptions subclassed

from Error need not be specified in the "throws" clause of method

declarations.



Exceptions other than errors are typically defined by subclassing the

exception "Exception".  These exceptions are typically handled by the

user program and must be declared in throws clauses of method

declarations (if it is possible for the method to throw that

exception).



The exception TokenMgrError is a subclass of Error, while the

exception ParseException is a subclass of Exception.  The reasoning

here is that the token manager is never expected to throw an exception

- you must be careful in defining your token specifications such that

you cover all cases.  Hence the suffix "Error" in TokenMgrError.  You

do not have to worry about this exception - if you have designed your

tokens well, it should never get thrown.  Whereas it is typical to

attempt recovery from Parser errors - hence the name "ParseException".

(Although if you still want to recover from token manager errors, you

can do it - it's just that you are not forced to catch them.)
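
The difference between the two kinds of exceptions can be sketched in
plain Java as follows.  The two nested classes are stand-ins that only
mirror the superclasses of the generated ones, and "scan" and "parse" are
hypothetical method names chosen for this illustration.

```java
final class CheckedVsUnchecked {
    static class ParseException extends Exception {}
    static class TokenMgrError extends Error {}

    // Compiles without a "throws" clause because TokenMgrError is an Error.
    static void scan(boolean bad) {
        if (bad) throw new TokenMgrError();
    }

    // Would not compile without "throws ParseException".
    static void parse(boolean bad) throws ParseException {
        if (bad) throw new ParseException();
    }

    static String run(boolean badToken, boolean badSyntax) {
        try {
            scan(badToken);
            parse(badSyntax);
            return "ok";
        } catch (ParseException e) {
            // The compiler requires that this be caught or declared.
            return "recovered from parse error";
        } catch (TokenMgrError e) {
            // Optional: nothing forces this handler to exist.
            return "recovered from lexical error";
        }
    }

    public static void main(String[] args) {
        System.out.println(run(false, true));
        System.out.println(run(true, false));
    }
}
```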



In Version 0.7.1, we have added a syntax to specify additional exceptions

that may be thrown by methods corresponding to non-terminals.  This

syntax is identical to the Java "throws ..." syntax.  Here's an

example of how you use this:



  void VariableDeclaration() throws SymbolTableException, IOException :

  {...}

  {

    ...

  }



Here, VariableDeclaration is defined to throw exceptions

SymbolTableException and IOException in addition to ParseException.
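
From the caller's side, this means that every listed exception has to be
handled or declared.  The sketch below is hand-written Java, not JavaCC
output: SymbolTableException is a hypothetical user-defined class, the
"name" parameter stands in for input the parser would read, and the
symbol table is a plain set.

```java
import java.io.IOException;
import java.util.HashSet;
import java.util.Set;

final class ThrowsClauseSketch {
    static class ParseException extends Exception {
        ParseException(String m) { super(m); }
    }

    // Hypothetical user-defined exception named in the grammar's throws clause.
    static class SymbolTableException extends Exception {
        SymbolTableException(String m) { super(m); }
    }

    private final Set<String> symbols = new HashSet<>();

    // Mirrors the declaration in the grammar: ParseException plus the listed ones.
    void VariableDeclaration(String name)
            throws ParseException, SymbolTableException, IOException {
        if (name.isEmpty()) {
            throw new ParseException("Was expecting an identifier");
        }
        if (!symbols.add(name)) {
            throw new SymbolTableException("Duplicate declaration: " + name);
        }
    }

    static String declareAll(String... names) {
        ThrowsClauseSketch parser = new ThrowsClauseSketch();
        try {
            for (String n : names) {
                parser.VariableDeclaration(n);
            }
            return "ok";
        } catch (ParseException e) {
            return "syntax: " + e.getMessage();
        } catch (SymbolTableException e) {
            return "symbols: " + e.getMessage();
        } catch (IOException e) {
            return "io: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(declareAll("a", "b", "a"));
    }
}
```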



Error Reporting:



The scheme for error reporting is simpler in Version 0.7.1 (as compared

to Version 0.6) - simply modify the file ParseException.java to do

what you want it to do.  Typically, you would modify the getMessage

method to do your own customized error reporting.  All information

regarding these methods can be obtained from the comments in the

generated files ParseException.java and TokenMgrError.java.  It will

also help to understand the functionality of the class Throwable (read

a Java book for this).
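
A customized getMessage might look roughly like the sketch below.  The
class is a simplified stand-in written for this illustration; its fields
(line, column, found, expected) are assumptions and do not match the
fields of the generated ParseException.java, so consult the comments in
your generated file for the real ones.

```java
import java.util.Arrays;
import java.util.List;

final class CustomMessageSketch {
    // Stand-in exception; the field names are illustrative only.
    static class ParseException extends Exception {
        final int line;
        final int column;
        final String found;
        final List<String> expected;

        ParseException(int line, int column, String found, List<String> expected) {
            this.line = line;
            this.column = column;
            this.found = found;
            this.expected = expected;
        }

        // Customized report: position first, then what was found and expected.
        @Override
        public String getMessage() {
            return "line " + line + ", column " + column + ": found \"" + found
                + "\", expected one of: " + String.join(", ", expected);
        }
    }

    public static void main(String[] args) {
        ParseException e = new ParseException(3, 7, "foo", Arrays.asList("if", "while"));
        System.out.println(e.getMessage());
    }
}
```

Because Throwable's toString method builds its result from the message,
overriding getMessage also changes what is shown for an uncaught
exception.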



There is a method in the generated parser called

"generateParseException".  You can call this method anytime you wish

to generate an object of type ParseException.  This object will

contain all the choices that the parser has attempted since the last

successfully consumed token.



Error Recovery:



JavaCC offers two kinds of error recovery - shallow recovery and deep

recovery.  Shallow recovery applies when none of the current choices

can be selected, while deep recovery applies when a choice

is selected, but an error then occurs sometime during the parsing of

this choice.



Shallow Error Recovery:



We shall explain shallow error recovery using the following example:



void Stm() :

{}

{

  IfStm()

|

  WhileStm()

}



Let's assume that IfStm starts with the reserved word "if" and WhileStm

starts with the reserved word "while".  Suppose you want to recover by

skipping all the way to the next semicolon when neither IfStm nor WhileStm

can be matched by the next input token (assuming a lookahead of 1).  That

is, the next token is neither "if" nor "while".



What you do is write the following:



void Stm() :

{}

{

  IfStm()

|

  WhileStm()

|

  error_skipto(SEMICOLON)

}



But you have to define "error_skipto" first.  So far as JavaCC is concerned,

"error_skipto" is just like any other non-terminal.  The following is one

way to define "error_skipto" (here we use the standard JAVACODE production):



JAVACODE

void error_skipto(int kind) {

  ParseException e = generateParseException();  // generate the exception object.

  System.out.println(e.toString());  // print the error message

  Token t;

  do {

    t = getNextToken();

  } while (t.kind != kind);

    // The above loop consumes tokens all the way up to a token of

    // "kind".  We use a do-while loop rather than a while because the

    // current token is the one immediately before the erroneous token

    // (in our case the token immediately before what should have been

    // "if"/"while").

}
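
The skipping loop can be tried out in isolation with the plain-Java
sketch below.  It is not generated code: the token kinds are hypothetical
constants, and getNextTokenKind is a stand-in for getNextToken that
returns EOF once the input is exhausted.  The sketch also stops at EOF,
which is an addition made here so that the loop ends when no token of the
requested kind remains.

```java
final class SkipToSketch {
    // Hypothetical token kinds for this illustration.
    static final int EOF = 0, IF = 1, WHILE = 2, ID = 3, SEMICOLON = 4;

    private final int[] kinds;
    private int next = 0;

    SkipToSketch(int... kinds) {
        this.kinds = kinds;
    }

    // Stand-in for getNextToken(): keeps returning EOF after the last token.
    int getNextTokenKind() {
        return next < kinds.length ? kinds[next++] : EOF;
    }

    // Consumes tokens up to and including the first token of the given
    // kind (or EOF) and returns how many tokens were consumed.
    int skipTo(int kind) {
        int consumed = 0;
        int k;
        do {
            k = getNextTokenKind();
            consumed++;
        } while (k != kind && k != EOF);
        return consumed;
    }

    public static void main(String[] args) {
        SkipToSketch s = new SkipToSketch(ID, ID, SEMICOLON, IF);
        System.out.println("consumed " + s.skipTo(SEMICOLON) + " tokens");
    }
}
```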



That's it for shallow error recovery.  In a future version of JavaCC

we will have support for modular composition of grammars.  When this

happens, one can place all these error recovery routines into a

separate module that can be "imported" into the main grammar module.

We intend to supply a library of useful routines (for error recovery

and otherwise) when we implement this capability.



Deep Error Recovery:



Let's use the same example that we did for shallow recovery:



void Stm() :

{}

{

  IfStm()

|

  WhileStm()

}



In this case we wish to recover in the same way.  However, we wish to

recover even when there is an error deeper into the parse.  For

example, suppose the next token was "while" - therefore the choice

"WhileStm" was taken.  But suppose that during the parse of WhileStm

some error is encountered - say one has "while (foo { stm; }" - i.e., the

closing parenthesis is missing.  Shallow recovery will not work

for this situation.  You need deep recovery to achieve this.  For this,

we offer a new syntactic entity in JavaCC - the try-catch-finally block.



First, let us rewrite the above example for deep error recovery and then

explain the try-catch-finally block in more detail:



void Stm() :

{}

{

  try {

    (

      IfStm()

    |

      WhileStm()

    )

  }

  catch (ParseException e) {

    error_skipto(SEMICOLON);

  }

}



That's all you need to do.  If there is any unrecovered error during the

parse of IfStm or WhileStm, then the catch block takes over.  You can

have any number of catch blocks and also optionally a finally block

(just as in Java).  What goes into the catch blocks is *Java code*,

not JavaCC expansions.  For example, the above example could have been

rewritten as:



void Stm() :

{}

{

  try {

    (

      IfStm()

    |

      WhileStm()

    )

  }

  catch (ParseException e) {

    System.out.println(e.toString());

    Token t;

    do {

      t = getNextToken();

    } while (t.kind != SEMICOLON);

  }

}



Our belief is that it's best to avoid placing too much Java code in the

catch and finally blocks since it overwhelms the grammar reader.  It's best

to define methods that you can then call from the catch blocks.



Note that in the second writing of the example, we essentially copied

the code out of the implementation of error_skipto.  But we left out the

first statement - the call to generateParseException.  That's because in

this case, the catch block already provides us with the exception.  But

even if you did call this method, you would get back an identical object.
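
To see the effect of deep recovery end to end, here is a small
hand-written recursive descent sketch in plain Java.  It is not JavaCC
output: the toy statement form, the string tokens, and the helper names
(expect, errorSkipTo, program) are all assumptions made for this
illustration.  The catch block only calls a helper, in keeping with the
advice above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class DeepRecoverySketch {
    static class ParseException extends Exception {
        ParseException(String m) { super(m); }
    }

    private final List<String> tokens;
    private int pos = 0;
    final List<String> errors = new ArrayList<>();
    int parsed = 0;

    DeepRecoverySketch(String... tokens) {
        this.tokens = Arrays.asList(tokens);
    }

    private String peek() {
        return pos < tokens.size() ? tokens.get(pos) : "<EOF>";
    }

    private void expect(String image) throws ParseException {
        if (!peek().equals(image)) {
            throw new ParseException(
                "Encountered \"" + peek() + "\". Was expecting \"" + image + "\"");
        }
        pos++;
    }

    // Toy statement: ("if" | "while") "(" "x" ")" ";"
    private void ifOrWhileStm() throws ParseException {
        if (peek().equals("if")) expect("if"); else expect("while");
        expect("(");
        expect("x");
        expect(")");
        expect(";");
        parsed++;
    }

    // Helper called from the catch block: record the error, then skip
    // up to and including the next token with the given image.
    private void errorSkipTo(ParseException e, String image) {
        errors.add(e.getMessage());
        while (pos < tokens.size() && !tokens.get(pos++).equals(image)) { }
    }

    void stm() {
        try {
            ifOrWhileStm();
        } catch (ParseException e) {
            // The error may have occurred anywhere inside the statement.
            errorSkipTo(e, ";");
        }
    }

    void program() {
        while (pos < tokens.size()) stm();
    }

    public static void main(String[] args) {
        DeepRecoverySketch p = new DeepRecoverySketch(
            "while", "(", "x", ";", "if", "(", "x", ")", ";");
        p.program();
        System.out.println(p.parsed + " parsed, errors: " + p.errors);
    }
}
```

In this sketch the first statement fails deep inside its parse, the
helper skips past its semicolon, and the second statement is then parsed
normally.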



