Fall 2009 Assignment 2 Prasad
Compiling and Evaluating Simple Expressions
(20 pts) (Due: November 9)
An EBNF (Extended Backus-Naur Form) grammar for arithmetic expressions containing variables (a, i) and binary operators (+, *) is given below.
<expr> -> <term> { + <term> }
<term> -> <factor> { * <factor> }
<factor> -> <var> | ( <expr> )
<var> -> a | i
{<expr>, <term>, <factor>, <var>} are the non-terminals. {a,i,+,*,(,)} are the terminals. <expr> is the start symbol. The meta-symbol "->" separates the lhs and the rhs of a production rule, the meta-symbol "|" represents alternatives on the rhs, and the paired curly braces "{...}" stand for the Kleene-star operator (that is, 0 or more iterations of the enclosed regular expression).
Some example arithmetic expressions derivable in the grammar are "i", "( a + a ) * i", " (i * a) + (a)", etc. Some illegal expressions are "(b)", "6", etc. (Note that the double quotes are not part of the expression.)
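The grammar maps directly onto a recursive-descent recognizer, with one method per non-terminal and one loop per "{...}". The following is a minimal sketch only: the class and method names are hypothetical, and it assumes that whitespace between symbols is insignificant.

```java
// Hypothetical recognizer for the grammar above; all names are illustrative.
final class ExprRecognizer {
    private final String src;  // input with whitespace removed (an assumption)
    private int pos = 0;       // index of the next unread character

    private ExprRecognizer(String s) { this.src = s.replaceAll("\\s+", ""); }

    /** Returns true if the whole input is derivable from <expr>. */
    static boolean accepts(String input) {
        ExprRecognizer r = new ExprRecognizer(input);
        return r.expr() && r.pos == r.src.length();
    }

    private boolean peekIs(char c) { return pos < src.length() && src.charAt(pos) == c; }

    // <expr> -> <term> { + <term> }
    private boolean expr() {
        if (!term()) return false;
        while (peekIs('+')) { pos++; if (!term()) return false; }
        return true;
    }

    // <term> -> <factor> { * <factor> }
    private boolean term() {
        if (!factor()) return false;
        while (peekIs('*')) { pos++; if (!factor()) return false; }
        return true;
    }

    // <factor> -> <var> | ( <expr> ), with <var> -> a | i
    private boolean factor() {
        if (peekIs('a') || peekIs('i')) { pos++; return true; }
        if (peekIs('(')) {
            pos++;
            if (!expr() || !peekIs(')')) return false;
            pos++;
            return true;
        }
        return false;
    }
}
```

With this sketch, ExprRecognizer.accepts("( a + a ) * i") is true, while ExprRecognizer.accepts("(b)") and ExprRecognizer.accepts("6") are false.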
Now consider the following template for a collection of C# programs.
class Test {
static double f(double a, int i) {
return <expr>;
}
public static void Main(string[] args) {
System.Console.WriteLine( f(2.0,1));
}
}
To obtain a valid C# program (that is, a valid function body), replace <expr> with an expression derived from the above grammar.
A C# compiler (csc) takes the source code and generates MSIL code, which resembles assembly language instructions for a stack machine.
The translation of the return-expression "(i + a * i)" into MSIL code is:
ldarg.1
conv.r8
ldarg.0
ldarg.1
conv.r8
mul
add
The translation of the return-expression "(i + i)" into MSIL code is:
ldarg.1
ldarg.1
add
conv.r8
The formal arguments a and i are encoded as variables in locations 0 and 1, respectively. ldarg.0 (ldarg.1) stands for pushing the value of the variable a (i) on top of the stack; add (mul) stands for popping the top two appropriate values from the stack, adding (multiplying) them, and pushing the result on top of the stack; and conv.r8 stands for coercing an integer value to a double value.
On the right is an applet that performs the required translation of the expression into MSIL code, for your reference. (If you have difficulty viewing the Java applet in Internet Explorer, try Firefox.) To generate more examples of such translations on a PC, instantiate the above template by replacing <expr> with a legal arithmetic expression in "Test.cs", compile it using "%csc Test.cs", and reverse engineer the resulting executable using "%ildasm Test.exe", focussing on "Method static double f(double, int)".
PART I: Write a Java program Exprc.java that (1) reads in an expression from a file called expr.dat, (2) determines its equivalent MSIL code (when legal), and (3) writes this compiled form to the output file called msil.code, one instruction per line. Note that you are expected to detect errors in the expression, if any.
asg2.ppt illustrates the basics of code generation, and ExprcEg.java gives an incomplete program that you need to understand and then modify to get a working solution. It already provides code illustrating file I/O, scanning, and abstract syntax tree construction. (Specifically, it uses java.io.StreamTokenizer for scanning. Feel free to change it, if necessary.)
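One way to think about where conv.r8 belongs is to have the code generator report, for each subtree, whether it leaves a double or an int on the stack. The sketch below is not taken from ExprcEg.java; the Node class and the gen and compile methods are hypothetical. The widening rule it uses (convert an int operand only when the other operand is a double, and convert an int result before returning) is an assumption inferred from the two translations shown earlier, so check it against the applet or ildasm output.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical AST and code generator; none of these names come from ExprcEg.java.
final class MsilSketch {
    /** A node is either a variable leaf ('a' or 'i') or a binary operator ('+' or '*'). */
    static final class Node {
        final char label;
        final Node left, right;
        Node(char var) { this(var, null, null); }
        Node(char op, Node left, Node right) {
            this.label = op; this.left = left; this.right = right;
        }
        boolean isLeaf() { return left == null; }
    }

    /** Appends the code for n to out; returns true if the value left on the stack is a double. */
    static boolean gen(Node n, List<String> out) {
        if (n.isLeaf()) {
            out.add(n.label == 'a' ? "ldarg.0" : "ldarg.1");
            return n.label == 'a';                       // a is a double, i is an int
        }
        List<String> l = new ArrayList<String>();
        List<String> r = new ArrayList<String>();
        boolean lDouble = gen(n.left, l);
        boolean rDouble = gen(n.right, r);
        boolean resultDouble = lDouble || rDouble;
        out.addAll(l);
        if (resultDouble && !lDouble) out.add("conv.r8"); // widen an int left operand
        out.addAll(r);
        if (resultDouble && !rDouble) out.add("conv.r8"); // widen an int right operand
        out.add(n.label == '+' ? "add" : "mul");
        return resultDouble;
    }

    /** Code for the whole return-expression; f returns double, so an int result is widened last. */
    static List<String> compile(Node root) {
        List<String> out = new ArrayList<String>();
        if (!gen(root, out)) out.add("conv.r8");
        return out;
    }
}
```

Applied to the tree for "(i + a * i)", compile yields the seven instructions listed earlier; applied to the tree for "(i + i)", it yields ldarg.1, ldarg.1, add, conv.r8. The shape of the tree for inputs such as "a + a + a" depends on the associativity you determine from the applet.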
Determine the associativity of "+" and "*" from the code generated by the "compiler" applet, and document it in your code.
Even though the file name expr.dat has been fixed for uniformity, for generality, make the file name an optional command-line argument that defaults to expr.dat when the input file name is not explicitly provided on the command line.
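The optional argument can be handled with a single check on the length of the argument array. This is a minimal sketch; the class name InputName and the method resolve are hypothetical, and only the default name expr.dat comes from the text above.

```java
// Hypothetical helper; only the default file name expr.dat comes from the assignment.
final class InputName {
    static final String DEFAULT_INPUT = "expr.dat";

    /** Returns the first command-line argument if present, otherwise the default file name. */
    static String resolve(String[] args) {
        return (args.length > 0) ? args[0] : DEFAULT_INPUT;
    }

    public static void main(String[] args) {
        System.out.println("Reading expression from " + resolve(args));
    }
}
```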
PART II: Write a Java program Exprv.java that simulates a stack machine in order to evaluate the MSIL code file output in Part I. For simplicity, you may explicitly define a stack of doubles for this purpose. The program should be capable of taking the initial values for the variables (that is, a and i) as command line arguments, and output the value returned by the corresponding function call. For example, if the file "msil.code" contains the following code
ldarg.1
conv.r8
ldarg.0
ldarg.1
conv.r8
mul
add
then the result of executing the command
%java Exprv msil.code 2.0 1
should be 3.0.
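The evaluation above can be traced with a small stack machine. The following is a minimal sketch, not a complete Exprv: the class name StackMachineSketch and the method run are hypothetical, the instructions are passed as a list rather than read from a file, and every value is stored as a double, so conv.r8 has nothing to do (a simplification that ignores int overflow).

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical evaluator; the class and method names are illustrative.
final class StackMachineSketch {
    /** Runs the instructions with arguments a and i; returns the single value left on the stack. */
    static double run(List<String> code, double a, int i) {
        double[] stack = new double[code.size() + 1]; // each instruction pushes at most one value
        int top = 0;                                  // number of values currently on the stack
        for (String line : code) {
            String op = line.trim();
            if (op.length() == 0) continue;           // skip blank lines
            if (op.equals("ldarg.0")) {
                stack[top++] = a;
            } else if (op.equals("ldarg.1")) {
                stack[top++] = i;                     // the int is widened to double on storage
            } else if (op.equals("conv.r8")) {
                if (top < 1) throw new IllegalStateException("stack underflow at " + op);
                // values are already doubles, so there is nothing to convert
            } else if (op.equals("add") || op.equals("mul")) {
                if (top < 2) throw new IllegalStateException("stack underflow at " + op);
                double right = stack[--top];
                double left = stack[--top];
                stack[top++] = op.equals("add") ? left + right : left * right;
            } else {
                throw new IllegalArgumentException("unknown instruction: " + op);
            }
        }
        if (top != 1) throw new IllegalStateException("expected one value on the stack, found " + top);
        return stack[0];
    }

    public static void main(String[] args) {
        List<String> code = Arrays.asList(
            "ldarg.1", "conv.r8", "ldarg.0", "ldarg.1", "conv.r8", "mul", "add");
        System.out.println(run(code, 2.0, 1)); // prints 3.0
    }
}
```

For the seven-instruction example with a = 2.0 and i = 1, the stack holds 1.0, then 1.0 2.0, then 1.0 2.0 1.0; mul reduces it to 1.0 2.0 and add to 3.0.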
CS680 Students Only: Repeat Part I and Part II in either C++ or C#, naming the submitted files Exprc.cpp and Exprv.cpp, or Exprc.cs and Exprv.cs, respectively.
What to hand in?
Submit your solution files Exprc.java and Exprv.java by running the following turn-in command on unixapps1.wright.edu.
%/common/public/tkprasad/cs480/turnin-pa2 Exprc.java Exprv.java README.txt
CS680 students can modify the command as follows:
%/common/public/tkprasad/cs480/turnin-pa2 Exprc.java Exprv.java Exprc.cpp Exprv.cpp README.txt
or
%/common/public/tkprasad/cs480/turnin-pa2 Exprc.java Exprv.java Exprc.cs Exprv.cs README.txt
Prior to submission, make sure that your code compiles and runs on unixapps1 using the following commands if coding in Java 5:
%javac5 Exprc.java
%java5 Exprc expr.dat
%javac5 Exprv.java
%java5 Exprv msil.code 2.0 1