Wright State University
Department of Computer Science and Engineering
CS 480/680 Comparative Languages

     Fall 2009                                Assignment  2                                   Prasad



Compiling and Evaluating Simple Expressions  (20 pts)   (Due:  November 9)

          
        An EBNF (Extended Backus-Naur Form) grammar for arithmetic expressions containing the variables (a, i) and the binary operators (+, *) is given below.

    <expr>   ->  <term>  { + <term> }

    <term>   ->  <factor>  { * <factor> }

    <factor> ->  <var>  |  ( <expr> )

    <var>    ->  a | i
 

{<expr>, <term>, <factor>, <var>} are the non-terminals. {a,i,+,*,(,)} are the terminals. <expr> is the start symbol. The meta-symbol "->" separates the lhs and the rhs of a production rule, the meta-symbol "|" separates alternatives on the rhs, and the paired curly braces "{...}" stand for the Kleene-star operator (that is, 0 or more iterations of the enclosed regular expression).

        Some example arithmetic expressions derivable in the grammar are "i", "( a + a ) * i", "(i * a) + (a)", etc. Some illegal expressions are "(b)", "6", etc. (Note that the double quotes are not part of the expression.)
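
        As an illustration of how the grammar maps onto code, here is a minimal recursive-descent recognizer sketch with one method per non-terminal. The class name ExprRecognizer and its method names are hypothetical and are not taken from ExprcEg.java. The sketch only accepts or rejects an input; it builds no syntax tree and generates no code.

```java
/** Hypothetical recognizer for the assignment grammar (illustrative names only). */
class ExprRecognizer {
    private final String src; // input with all whitespace removed
    private int pos = 0;      // index of the next unread character

    private ExprRecognizer(String input) {
        this.src = input.replaceAll("\\s+", "");
    }

    /** Returns true when the whole input is derivable from <expr>. */
    static boolean accepts(String input) {
        ExprRecognizer r = new ExprRecognizer(input);
        return r.expr() && r.pos == r.src.length();
    }

    /** Consumes c if it is the next character; reports whether it did. */
    private boolean eat(char c) {
        if (pos < src.length() && src.charAt(pos) == c) {
            pos++;
            return true;
        }
        return false;
    }

    // <expr> -> <term> { + <term> }
    private boolean expr() {
        if (!term()) return false;
        while (eat('+')) {
            if (!term()) return false;
        }
        return true;
    }

    // <term> -> <factor> { * <factor> }
    private boolean term() {
        if (!factor()) return false;
        while (eat('*')) {
            if (!factor()) return false;
        }
        return true;
    }

    // <factor> -> <var> | ( <expr> )
    private boolean factor() {
        if (eat('a') || eat('i')) return true; // <var> -> a | i
        if (eat('(')) return expr() && eat(')');
        return false;
    }
}
```

        Each "{...}" in the grammar becomes a while loop, and each "|" becomes a sequence of alternatives tried in order. With this sketch, ExprRecognizer.accepts("( a + a ) * i") is true, while ExprRecognizer.accepts("(b)") and ExprRecognizer.accepts("6") are false.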


        Now consider the following template for a collection of C# programs. 

class Test {
    static double f(double a, int i) {

       return  <expr>;
    }
    public static void Main(string[] args) {

       System.Console.WriteLine( f(2.0,1));
    }
}

      To obtain a valid C# program (that is, a valid function body), replace <expr> with an expression derived from the above grammar.


      A C# compiler (csc) takes the source code and generates MSIL code, which resembles assembly-language instructions for a stack machine.

    The translation of the return-expression "(i + a * i)" into MSIL code is:

        ldarg.1
        conv.r8
        ldarg.0
        ldarg.1
        conv.r8
        mul
        add

The translation of the return-expression "(i + i)" into MSIL code is:

        ldarg.1
        ldarg.1
        add
        conv.r8

        The formal arguments a and i are encoded as arguments 0 and 1, respectively. ldarg.0 (ldarg.1) pushes the value of the variable a (i) on top of the stack; add (mul) pops the top two values from the stack, adds (multiplies) them, and pushes the result on top of the stack; and conv.r8 coerces the integer value on top of the stack to a double value.
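
        The two translations above suggest a simple rule for placing conv.r8: an int operand is widened immediately after its code whenever the other operand is a double, and an int result is widened once at the end because f returns a double. The sketch below applies that rule to a hand-built syntax tree. The rule is inferred from these two listings only, and the names MsilSketch, Node, emit, and compile are hypothetical; check further cases against the output of csc before relying on it.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical syntax tree and emitter (illustrative names only). */
class MsilSketch {
    /** An expression node: a variable leaf ('a' or 'i') or a binary operation ('+' or '*'). */
    static final class Node {
        final char op;
        final Node left, right; // both null for a variable leaf

        Node(char var) { this(var, null, null); }

        Node(char op, Node left, Node right) {
            this.op = op;
            this.left = left;
            this.right = right;
        }

        boolean isLeaf() { return left == null; }
    }

    /** Appends code for n to out; returns true when the value it leaves on the stack is a double. */
    static boolean emit(Node n, List<String> out) {
        if (n.isLeaf()) {
            out.add(n.op == 'a' ? "ldarg.0" : "ldarg.1");
            return n.op == 'a'; // a is a double, i is an int
        }
        boolean leftDouble = emit(n.left, out);
        int afterLeft = out.size(); // position just past the left operand's code
        boolean rightDouble = emit(n.right, out);
        if (leftDouble != rightDouble) {
            if (leftDouble) {
                out.add("conv.r8");            // widen the int right operand
            } else {
                out.add(afterLeft, "conv.r8"); // widen the int left operand
            }
        }
        out.add(n.op == '+' ? "add" : "mul");
        return leftDouble || rightDouble;
    }

    /** Emits code for a whole return-expression, widening an int result to double. */
    static List<String> compile(Node root) {
        List<String> out = new ArrayList<String>();
        if (!emit(root, out)) {
            out.add("conv.r8"); // assumption: f returns double, so an int result is widened last
        }
        return out;
    }
}
```

        For the tree of "(i + a * i)", that is, new Node('+', new Node('i'), new Node('*', new Node('a'), new Node('i'))), compile produces the seven instructions listed above for that expression, and for the tree of "(i + i)" it produces ldarg.1, ldarg.1, add, conv.r8.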

On the right is an applet that performs the required translation of an expression into MSIL code, for your reference. (If you have difficulty viewing the Java applet in Internet Explorer, try Firefox.) To generate more examples of such translations on a PC, instantiate the above template by replacing <expr> with a legal arithmetic expression in "Test.cs", compile it using "%csc Test.cs", and disassemble the resulting executable using "%ildasm Test.exe", focusing on "Method static double f(double, int)".
 


PART I: Write a Java program Exprc.java that (1) reads in an expression from a file called expr.dat, (2) determines its equivalent MSIL code (when legal), and (3) writes this compiled form to an output file called msil.code, one instruction per line. Note that you are expected to detect errors in the expression, if any.

asg2.ppt illustrates the basics of code generation, and ExprcEg.java gives an incomplete program that you need to understand and then modify to get a working solution. It already provides code illustrating file I/O, scanning, and abstract syntax tree construction. (Specifically, it uses java.io.StreamTokenizer for scanning. Feel free to change it, if necessary.)
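
For orientation, here is one way java.io.StreamTokenizer can be configured so that every non-blank character comes back as its own token. This is a sketch only: the class name ScanSketch and its methods are hypothetical, and the configuration used in ExprcEg.java may well differ.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical scanner helper (illustrative names only). */
class ScanSketch {
    /** Returns every non-whitespace character of the input as a separate token. */
    static List<Character> tokens(Reader in) throws IOException {
        StreamTokenizer st = new StreamTokenizer(in);
        st.resetSyntax();           // make every character an "ordinary" one-character token
        st.whitespaceChars(0, ' '); // skip blanks, tabs, and line breaks
        List<Character> out = new ArrayList<Character>();
        while (st.nextToken() != StreamTokenizer.TT_EOF) {
            out.add((char) st.ttype); // for ordinary characters, ttype is the character itself
        }
        return out;
    }

    /** Convenience overload for in-memory text. */
    static List<Character> tokens(String text) {
        try {
            return tokens(new StringReader(text));
        } catch (IOException e) {
            throw new IllegalStateException(e); // not expected for a StringReader
        }
    }
}
```

Calling resetSyntax() matters here: with the default syntax table, adjacent letters such as "ai" would be returned as a single word token, and digits would be parsed as numbers. With the configuration above, an illegal character such as "b" or "6" still arrives as a token, so rejecting it is left to the parser.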

Determine the associativity of "+" and "*" from the code generated by the "compiler" applet, and document it in your code.

Even though the file name expr.dat has been fixed for uniformity, for generality, make the file name an optional command-line argument that defaults to expr.dat when no input file name is provided on the command line.
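
A minimal sketch of such a default follows; the class name InputFileName and the method name resolve are hypothetical.

```java
/** Hypothetical helper for choosing the input file name (illustrative names only). */
class InputFileName {
    /** Returns the first command-line argument, or "expr.dat" when none is given. */
    static String resolve(String[] args) {
        return args.length > 0 ? args[0] : "expr.dat";
    }
}
```

Under this sketch, "%java Exprc" would read expr.dat, and "%java Exprc other.dat" would read other.dat, assuming main passes its args array to resolve.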


PART II: Write a Java program Exprv.java that simulates a stack machine in order to evaluate the MSIL code file output in Part I. For simplicity, you may explicitly define a stack of doubles for this purpose. The program should take the name of the code file and the initial values of the variables (that is, a and i) as command-line arguments, and output the value returned by the corresponding function call. For example, if the file "msil.code" contains the following code

        ldarg.1
        conv.r8
        ldarg.0
        ldarg.1
        conv.r8
        mul
        add

then the result of executing the command   

   %java Exprv msil.code 2.0  1

should be 3.0.
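
The evaluation loop can be sketched as follows, using a stack of doubles as suggested above. The class name StackMachineSketch and the method name run are hypothetical, and reading the instructions from the file and parsing the command-line values are omitted. Because every value is held as a double, conv.r8 is treated as a no-op here; this is a simplifying assumption that ignores int overflow.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

/** Hypothetical stack-machine evaluator (illustrative names only). */
class StackMachineSketch {
    /** Executes the instructions in order and returns the single value left on the stack. */
    static double run(List<String> code, double a, int i) {
        Deque<Double> stack = new ArrayDeque<Double>();
        for (String line : code) {
            String instr = line.trim();
            if (instr.isEmpty()) {
                continue; // tolerate blank lines in the code file
            }
            if (instr.equals("ldarg.0")) {
                stack.push(a);
            } else if (instr.equals("ldarg.1")) {
                stack.push((double) i);
            } else if (instr.equals("conv.r8")) {
                // no-op: every value on this stack is already a double
            } else if (instr.equals("add")) {
                stack.push(stack.pop() + stack.pop());
            } else if (instr.equals("mul")) {
                stack.push(stack.pop() * stack.pop());
            } else {
                throw new IllegalArgumentException("unknown instruction: " + instr);
            }
        }
        if (stack.size() != 1) {
            throw new IllegalStateException("malformed code: stack holds " + stack.size() + " values");
        }
        return stack.pop();
    }
}
```

For the seven instructions above with a = 2.0 and i = 1, the stack evolves as [1.0], [1.0], [1.0, 2.0], [1.0, 2.0, 1.0], [1.0, 2.0, 1.0], [1.0, 2.0], [3.0], so run returns 3.0.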


CS680 Students Only: Repeat Part I and Part II in either C++ or C#, naming the submitted files Exprc.cpp and Exprv.cpp, or Exprc.cs and Exprv.cs, respectively.


What to hand in?    
     
        Submit your solution files Exprc.java and Exprv.java, together with README.txt, by running the following turn-in command on unixapps1.wright.edu:

%/common/public/tkprasad/cs480/turnin-pa2  Exprc.java Exprv.java README.txt

CS680 students can modify the command as follows:

%/common/public/tkprasad/cs480/turnin-pa2  Exprc.java Exprv.java Exprc.cpp Exprv.cpp README.txt

or

%/common/public/tkprasad/cs480/turnin-pa2  Exprc.java Exprv.java Exprc.cs Exprv.cs README.txt

        Prior to submission, make sure that your code compiles and runs on unixapps1 using the following commands if coding in Java 5:

%javac5  Exprc.java 

%java5   Exprc   expr.dat   

%javac5  Exprv.java

%java5   Exprv   msil.code   2.0   1


T. K. Prasad ( 08/26/2009 )