Next: General remarks Up: Compiler construction assignments for Previous: Introduction



Both assignments are mandatory. The first is on records, and the second on code generation.

Assignment 1

The first assignment is designed to get you acquainted with all aspects of the reference compiler. In this assignment, you are to extend the Asterix language with records (structs in C). A complete description of the changes to the language specification is given below. You must modify the reference compiler such that it implements the extended language.


In addition to variable declarations and subprogram declarations, we introduce record declarations.

"record" identifier$_1$ "is" var_declarations "end"

This rule declares a new record with the name identifier$_1$ in the global scope. It is an error to declare a record other than at the global level. Identifier$_1$ may not be re-declared. Each record introduces a new scope for its member declarations. These can only be variable declarations. A record has to be defined before its use. It may be used in other records, except for recursive definitions.
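One way to enforce the define-before-use rule and the ban on recursive definitions together is to enter a record's name into the symbol table only after its members have been checked. The C sketch below illustrates the idea under stated assumptions: `declare_record`, the `decl_status` codes, and the fixed-size table are hypothetical names, not part of the reference compiler, and only record names are tracked (a real check for re-declaration would also have to consider other global identifiers).

```c
#include <assert.h>
#include <string.h>

#define MAX_RECORDS 32   /* arbitrary limit for this sketch */

enum decl_status {
    DECL_OK,
    DECL_REDECLARED,            /* record name already in use */
    DECL_UNKNOWN_MEMBER_TYPE,   /* member uses a record not yet defined */
    DECL_TABLE_FULL
};

static const char *declared[MAX_RECORDS];
static int n_declared = 0;

static int is_declared(const char *name)
{
    for (int i = 0; i < n_declared; i++)
        if (strcmp(declared[i], name) == 0)
            return 1;
    return 0;
}

/* member_records lists the type names of the record-typed members only;
   members of built-in types such as int need no check here. */
enum decl_status declare_record(const char *name,
                                const char *member_records[],
                                int n_members)
{
    assert(name != NULL);
    if (is_declared(name))
        return DECL_REDECLARED;
    for (int i = 0; i < n_members; i++)
        if (!is_declared(member_records[i]))
            return DECL_UNKNOWN_MEMBER_TYPE;
    if (n_declared == MAX_RECORDS)
        return DECL_TABLE_FULL;
    /* The name becomes visible only now, so a record can never
       contain itself, directly or through another record. */
    declared[n_declared++] = name;
    return DECL_OK;
}
```

Because the name is registered last, a record that mentions itself is rejected as using an undefined type, and mutual recursion cannot arise since one of the two records must come first.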


A new record type is introduced. Identifier must be declared by a record declaration. A variable of this type is an object storing the members of record identifier. All members of the record have to be initialized with their appropriate values (see Table 1 in the Asterix reference manual).

Record types with the same type identifier are equal (name equivalence). Assignment to a record is allowed: all members of the source record (rvalue) are copied to the destination record (lvalue). It is not allowed to test for the equality of two records.
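Name equivalence keeps the type checker simple: two record types are compared by their identifiers, never by their member lists. A minimal sketch in C, assuming a hypothetical `struct type` descriptor (the reference compiler's actual type representation may differ):

```c
#include <assert.h>
#include <string.h>

enum type_kind { TYPE_INT, TYPE_RECORD };

/* Hypothetical type descriptor used only for this illustration. */
struct type {
    enum type_kind kind;
    const char *record_name;   /* meaningful for TYPE_RECORD only */
};

/* Returns 1 if a and b denote the same type, 0 otherwise.
   Records are equal exactly when their identifiers match. */
int types_equal(const struct type *a, const struct type *b)
{
    assert(a != NULL && b != NULL);
    if (a->kind != b->kind)
        return 0;
    if (a->kind == TYPE_RECORD)
        return strcmp(a->record_name, b->record_name) == 0;
    return 1;
}
```

Such a check could guard record assignment, while the equality operators would reject operands of kind `TYPE_RECORD` outright.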

expression$_1$ "." identifier

The dot-operator is used to select a member of a record. Its precedence level and associativity are the same as for the index operator "[]" and function call "()". expression$_1$ must be of a record type, and identifier has to be a member of that record.
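Since the dot-operator associates like the other postfix operators, a chain such as x.b.a is read as (x.b).a, and each selection is resolved in the scope of the record type on its left. The C sketch below shows one possible lookup; `struct member`, `struct record`, and `member_type` are hypothetical names chosen for this illustration.

```c
#include <assert.h>
#include <string.h>

struct member {
    const char *name;
    const char *type;   /* "int" or the name of a record */
};

struct record {
    const char *name;
    const struct member *members;
    int n_members;
};

/* Return the type name of rec.member, or NULL when rec has no member
   of that name (which the compiler should report as an error). */
const char *member_type(const struct record *rec, const char *member)
{
    assert(rec != NULL && member != NULL);
    for (int i = 0; i < rec->n_members; i++)
        if (strcmp(rec->members[i].name, member) == 0)
            return rec->members[i].type;
    return NULL;
}
```

For x.b.a with x of type bar, the checker would first look up b in bar, obtain the record type foo, and then look up a in foo.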

An example program is shown below.


record foo is
    var a: int;
end

record bar is
    var a: int;
        b: foo;
end

function f(x: bar): int is
        x.b.a := x.b.a + 1;
        return x.b.a;

function main(argv: array of string): int is
    var h: bar;
        i: int;
        h.a := 2;
        h.b.a := 3;
        i := f(h);

        WriteString("This should be 4: ");

        WriteString("This should be 3: ");

        return 0;

The sources for the Asterix compiler can be found in /home/in4020tu/asterix-compiler/src. You should create your own source directory and copy all files from /home/in4020tu/asterix-compiler/src into it.


Assignment 2

In this assignment, you are to change the code generation part of the Asterix compiler you developed in assignment 1 so that it emits IA-32 assembly instead of C code. To keep this task manageable, you may use the simple code generation scheme that considers one AST node at a time; see Section 4.2.4 of the Modern Compiler Design book. You are advised to treat the IA-32 as a stack machine (it supports push and pop instructions) to avoid the complexity of register allocation. We will grade your compiler on the correctness of the generated assembly code, not on its efficiency (i.e., speed).
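The stack-machine scheme can be sketched as follows: every expression node emits code that leaves its value on top of the stack, and a binary operator pops its two operands, combines them, and pushes the result. The C code below is a minimal illustration, not the reference compiler's code generator; `struct node`, `struct code`, `emit`, and `gen_expr` are hypothetical names, the output uses the AT&T syntax that gcc -S produces, and %ecx is chosen as the second scratch register because the C calling convention does not require it to be preserved.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

enum node_kind { NODE_CONST, NODE_ADD, NODE_SUB, NODE_MUL };

struct node {
    enum node_kind kind;
    int value;                         /* NODE_CONST only */
    const struct node *left, *right;   /* binary operators only */
};

/* Fixed-size output buffer; a real compiler would write to a file. */
struct code { char text[512]; size_t len; };

static void emit(struct code *c, const char *fmt, ...)
{
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vsnprintf(c->text + c->len, sizeof c->text - c->len, fmt, ap);
    va_end(ap);
    assert(n >= 0 && (size_t)n < sizeof c->text - c->len);
    c->len += (size_t)n;
}

/* Emit code that leaves the value of expression n on top of the stack. */
void gen_expr(const struct node *n, struct code *c)
{
    assert(n != NULL);
    if (n->kind == NODE_CONST) {
        emit(c, "\tpushl $%d\n", n->value);
        return;
    }
    gen_expr(n->left, c);
    gen_expr(n->right, c);
    emit(c, "\tpopl %%ecx\n");   /* right operand */
    emit(c, "\tpopl %%eax\n");   /* left operand */
    switch (n->kind) {
    case NODE_ADD: emit(c, "\taddl %%ecx, %%eax\n"); break;
    case NODE_SUB: emit(c, "\tsubl %%ecx, %%eax\n"); break;
    case NODE_MUL: emit(c, "\timull %%ecx, %%eax\n"); break;
    default: assert(0 && "unexpected node kind");
    }
    emit(c, "\tpushl %%eax\n");
}
```

The generated code is far from efficient, since every intermediate value travels through memory, but each node can be translated without knowing anything about its context.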

It is important to understand the syntax and semantics of the IA-32 assembly language. You can use GCC to obtain this understanding by example: the -S option will cause it to generate assembly. For instance, the command gcc -S cb.c produces the file cb.s. Start by compiling a simple 'hello world' program and move on from there. If you prefer a more gentle introduction to assembly programming, you may consider the following on-line sources:

The original reference compiler, as well as your extended compiler supporting records, emits C code (i.e., the cb.c file) that uses the standard C, rts, and cblib libraries. You can also take advantage of these libraries by following the calling convention of the C compiler:

This calling convention is illustrated by Figure 1, which shows the assembly code as well as the activation records involved in making a call to function foo. The %ebp register (base pointer) keeps track of the current activation record; it is saved on function entry (dynamic link) and restored on exit.
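The entry and exit sequences described above could be emitted by helpers like the ones below. This is a sketch under assumptions: it follows the convention GCC commonly uses on IA-32 (arguments pushed right to left, the caller removes them after the call, an int result is returned in %eax), it assumes every argument occupies one 4-byte stack slot, and the names `arg_offset`, `gen_prologue`, `gen_epilogue`, and `gen_call` are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Offset from %ebp of the index-th argument (0-based): 0(%ebp) holds
   the saved %ebp (dynamic link) and 4(%ebp) the return address. */
int arg_offset(int index)
{
    assert(index >= 0);
    return 8 + 4 * index;
}

/* Function entry: save the caller's %ebp, start a new activation
   record, and reserve local_bytes for local variables. */
int gen_prologue(char *buf, size_t size, const char *name, int local_bytes)
{
    assert(name != NULL && strlen(name) > 0);
    return snprintf(buf, size,
                    "%s:\n"
                    "\tpushl %%ebp\n"
                    "\tmovl %%esp, %%ebp\n"
                    "\tsubl $%d, %%esp\n",
                    name, local_bytes);
}

/* Function exit: discard the locals, restore the caller's %ebp, return. */
int gen_epilogue(char *buf, size_t size)
{
    return snprintf(buf, size,
                    "\tmovl %%ebp, %%esp\n"
                    "\tpopl %%ebp\n"
                    "\tret\n");
}

/* Call site, emitted after the n_args arguments have been pushed:
   call, pop the arguments, and push the result for the stack machine. */
int gen_call(char *buf, size_t size, const char *callee, int n_args)
{
    assert(callee != NULL && n_args >= 0);
    return snprintf(buf, size,
                    "\tcall %s\n"
                    "\taddl $%d, %%esp\n"
                    "\tpushl %%eax\n",
                    callee, 4 * n_args);
}
```

Records passed by value occupy more than one slot, so a complete implementation would have to compute argument offsets from the sizes of the preceding arguments rather than from their index.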

Figure 1: Various snapshots of the call stack when invoking function foo. [Image function_call.eps not reproduced.]

Koen Langendoen 2003-11-14