Both assignments are mandatory. The first is on parsing, and the second on classes, objects, and inheritance.

Assignment 1

The first assignment is on tools for lexical analysis and parsing. The Asterix compiler from which we start, is implemented using the Flex (Lex-like) lexical analyzer generator and the Bison (YACC-like) LALR parser generator. Once they were very innovative tools, but nowadays there are much more modern parser generating tools, which are easier to use, faster, and much better at error correction.

In this assignment the Bison implementation has to be replaced by an LLnextgen implementation. LLnextgen is a rewrite of LLgen, hence the name, and provides better development support (error reporting) than the original LLgen implementation, which is a modern LL(1) parser generator, with static and dynamic facilities to resolve non-LL(1) ambiguities. The behavior of the new compiler must be the same as the old compiler for correct input. For incorrect input, the compiler with the LLnextgen parser must have better support for error reporting and error correction.

The fact that the original parser generator is LALR and the new one is LL(1) has a number of implications:

The sources for the Asterix compiler can be found in /usr/local/asterix-compiler. You should create your own source directory and copy all files in /usr/local/asterix-compiler to that directory.


Assignment 2

[Grading assignment 1 may take some time, so do not wait for the result and continue with this assignment immediately. However, we do expect you to fix the bugs, if any, discovered by your instructor during the grading of assignment 1, so do make a snap shot of your compiler now.]

In this assignment, you are to extend the Asterix language with objects and classes, inheritance, and dynamic binding. A complete description of the changes of the language specification is given below. You must modify your LLnextgen-based compiler such that it implements the extended language.


In addition to var declarations and subprogram declarations, we introduce class declarations.

"class" identifier$_1$ [ "inherits" identifier$_2$ ] "is" declaration * "end"

This rule declares a new class with the name identifier$_1$ in the global scope. It is an error to declare a class other than at the global level. identifier$_1$ may not be re-declared in the global scope.

Optionally the class may inherit from one superclass identifier$_2$. identifier$_2$ must be declared as a class before identifier$_1$.

Each class introduces a new scope for its member declarations. These can be either var declarations or subprogram declarations (methods). If the class does not inherit from another class, the immediate parent scope of the member scope is the global scope. If the class does inherit from another class, the immediate parent scope of the class is the member scope of the class from which it inherits (see Fig. 1).

Figure 1: Scopes with classes.
\begin{figure}\leavevmode \centering\epsfysize =8.7cm \epsffile{scope.eps}\end{figure}

A method in a superclass can be overridden by a method in a subclass with the same name. Method overriding is only allowed if the types of the arguments are exactly the same, including the way arguments are passed (by value or by reference). Furthermore, both methods must either be a function returning the same type, or both be a procedure. Member variables cannot be overridden, however they can be redeclared in a subclass as a different variable.


A new class type is introduced. identifier must be declared by a preceding, or the current, class declaration. A variable of this type is a reference to an object of (a subclass of) class identifier. Object references must be initialized to null.

There is a distinction between a static type and a dynamic type. The static type of a variable is the type used in its declaration. The dynamic type of a variable is the type used in the "new" expression.

There is also a distinction between assignment compatibility and being of exactly the same types. Two expressions are assignment compatible if the static type of the lvalue is (a superclass (recursively)) of the static type of the rvalue. Assignment compatibility is used for the assignment statement, call-by-value parameter passing, and function value returning. For call-by-reference parameter passing, the types must be exactly the same (as well as for arrays of objects.)

"new" identifier

This expression creates an object of static and dynamic type identifier. identifier must be declared by a class declaration. If there is no heap space for a new object, a runtime error occurs. The newly created object should be properly initialized, with all member variables set to their default initialization value.

"delete" expression ";"

The semantics of a delete-statement has been redefined as deleting a class object or an array object. The reference must have been obtained by a new expression, although deleting a "null" value is allowed. Runtime behavior is undefined if an object is deleted more than once.


"self" is a reference to the object currently being invoked. It may not be used outside a method. Its static type is a reference to the class that contains the method. Its dynamic type is determined at object creation time. It is not an lvalue.


The meaning of the "null" expression has been redefined. "null" denotes a null reference to either an array of any type or any class. It is not an lvalue.

expression$_1$ "." qualident

[ identifier$_1$ "::" ] identifier$_2$

The dot-operator is used to select a member of an object. The precedence level and associativity are the same as for the index operator "[]" and function call "()". expression$_1$ must be of any class type. The program behavior is undefined if expression$_1$ equals "null" or has been deleted.

If identifier$_1$ is specified, it must be the same class as the static type of expression$_1$, or one of its superclasses (recursively). The qualification operator "::" binds stronger than the dot-operator. Name resolution for identifier$_2$ starts in identifier$_1$, or in the static type of expression$_1$ if identifier$_1$ is not specified. If identifier$_2$ is not declared in that class, and the class has a superclass, identifier$_2$ is searched for in its superclass (recursively). It is a compile-time error if identifier$_2$ is not declared in any of its (super)classes.

Binding is dynamic if identifier$_2$ denotes a method and is not qualified with identifier$_1$. If identifier$_2$ denotes a member variable, or is qualified with identifier$_1$, binding is static. The difference between static and dynamic binding has consequences for the method being invoked. If B is a subclass of A and overrides method m, object obj has static type A and dynamic type B, then if obj.m() is specified, the method defined in class B is invoked; if obj.A::m() is specified, the method defined in class A is invoked.

For convenience the following short-hand notation may be used within the context of a method:


it is semantically equivalent to "self""."qualident.

An example program is shown below.

(* Test function overriding *)
class A is
    function f() : int is return 1;
class B inherits A is
    function f() : int is return 2;
class C inherits B is
    function f() : int is return 3;
function main(argv : array of string) : int is
var obj : B;
    obj := new C;
    WriteString("Expect 3 : ");
    WriteInt(obj.f());            (* Note: obj.C::f() is not allowed *)
    WriteString("Expect 2 : ");
    WriteString("Expect 1 : ");
    delete obj;

    return 0;

K.G. Langendoen 2007-11-09