DEPARTMENT OF COMPUTER SCIENCE
CS762 COMPILER CONSTRUCTION II
Spring 2003

A quarter-long research project will account for a significant portion of your grade for this course. Since there will be no final exam, I expect the final report to be of high quality.

The purpose of this research project is to give you some insight into the issues associated with compilers and more hands-on experience with the internals of a compiler. In addition, the project is intended to expose you to the current state of the art and to give you the opportunity to implement those ideas and try to come up with a better solution than currently exists. If your idea turns out not to be better than a current technique, that is OK; you will not be penalized.

I am providing you with a list of potential topics. If you want to explore another idea, you need to come talk to me or write a somewhat detailed description of what you are thinking of pursuing. This can be the first part of your research proposal, but more details are required for your research proposal (see research proposal details).

Due dates associated with the project

April 3: Potential research project topic descriptions are handed out.

April 10: You must select a topic before class. Assignments will be made on a first-come, first-served basis.

April 21: Research proposals are due.

May 12: Research update 1 is due.

May 26: Research update 2 is due.

June 9: Final report and projects are due.

Potential Topics

The following is a list of topics. They are intended to be vague descriptions of potential ideas. It is your job to fill in the details in your research proposal. You need to select a topic as soon as possible. To select a topic, read through the following list and come up with a few potential topics that you think sound interesting. Then do a little bit of research before selecting the topic. Determine whether you can quickly find enough information to help you with the background and related work. Then think briefly about what you intend to propose. You need to contemplate this decision carefully, since the project is a significant portion of your final grade. Don't select a topic that you later find out you cannot handle. You must send me email before the next class.

Each project is based on performing an empirical evaluation. Some of the topics require more programming than others, but they all require utilizing existing frameworks, VMs, and compilers. The Soot compiler and the Research Virtual Machine are both high-quality open source compiler infrastructures that you can utilize.

  1. Space Optimizations: Optimizations that reduce the runtime of programs are important. However, as small devices with restricted memories become more prevalent and are relied on more heavily, optimizations that reduce the size of code are also important. In particular, Java virtual machines are being implemented on small devices, and the size of the bytecode that runs on these VMs needs to be considered. The purpose of this project will be to try to reduce the size of Java bytecode by reducing the number of local variables, by reducing the stack height, by reducing the constant pool size, by removing useless fields and useless methods, by applying reverse inlining, and so on. Static size measurements can be used to evaluate the effectiveness of the optimizations.

  2. Finding Rectangular Arrays: Determining when an array is rectangular is useful in order to eliminate some array bounds checks. A simple algorithm is described in [QHV00] for determining if an array is rectangular. This project would implement this algorithm in order to eliminate array bounds checks. In order to show the effectiveness of this optimization, comparisons with and without the optimization would be performed. It would also be interesting to find some other approaches that eliminate array bounds checks, implement them, and compare the results.

    [QHV00] Feng Qian, Laurie Hendren and Clark Verbrugge. A Comprehensive Approach to Array Bounds Check Elimination for Java. Compiler Construction 2002.

  3. Comparing intermediate representations: The Soot infrastructure is a research infrastructure from McGill. While it has proven to be a useful research infrastructure utilized by many research groups, there are some deficiencies in the bytecode produced by Soot. This project would evaluate bytecode produced by javac, Soot, and Jikes (http://oss.software.ibm.com/developerworks/opensource/jikes). What is different about the bytecode? How could the Soot bytecode be improved? Are there optimizations performed by Jikes and not by Soot?

  4. Refactoring Java based on execution traces: Not all methods are equal, and not all parts of a method are equal: some portions of a method may be hot (executed frequently) while other portions are cold (executed rarely). Since it takes time to compile cold parts, it has been suggested that such methods should be refactored so that only the hot parts remain in the method body and the cold parts are moved to other methods. The purpose of this project would be to use Soot or AdaptJ to profile the execution frequencies of important basic blocks and then, based on the profile, factor out the cold code into new methods.

    John Whaley. Partial Method Compilation using Dynamic Profiling, OOPSLA 2001.

  5. Field Analysis in Soot: In [GRS00], simple analyses were presented based on object fields. Implementing such an analysis in Soot should be relatively simple. Once the analysis is implemented, bytecode optimizations could be performed.

    [GRS00] Sanjay Ghemawat, Keith H. Randall, and Daniel J. Scales. Field analysis: getting useful and low-cost interprocedural information. PLDI 2000.

  6. Getter and Setter methods: One object-oriented practice that is utilized quite frequently is based on accessing fields of an object through getter and setter methods. However, potentially there is overhead associated with all these method calls.

    This project would have two parts. First, refactor code to create getter and setter methods, and compare the results to the original code utilizing several virtual machines. Is there any overhead associated with getter and setter methods? The second task would be to perform the reverse analysis in order to locate getter and setter methods and replace them with inlined references to the fields. Is there any speed-up? Do standard VMs already inline these methods, or is there room for improvement?

  7. AdaptJ: The AdaptJ tool allows one to collect a variety of different metrics about the runtime behavior of programs. This project would determine the current metrics calculated by AdaptJ and then come up with a couple of new metrics and incorporate them into AdaptJ. Using a large number of benchmarks, demonstrate the usefulness of the new metrics. For example, there are many static metrics that are useful in the software engineering community, and there are dynamic equivalents of such metrics. Metrics could also be connected to optimization opportunities. Is the 80/20 rule for real?

  8. Recognizing Aspects: What aspects exist in software? How can aspects be detected in Java programs? Robillard and Murphy [RM00] utilize an approach based on the concern graph for detecting concerns; however, parts of the approach could be automated to eliminate some work for a software maintainer. This project would first explore existing work on aspect/concern detection and then develop a technique for automating (or semi-automating) the process. Part of the project requires determining which parts of the process can be automated and which cannot.

    [RM00] Robillard and Murphy. Concern Graphs: Finding and Describing Concerns Using Structural Program Dependencies. Proceedings of ICSE 2002, pages 406-416, May 2002.

    Robillard. A Representation for Describing and Analyzing Concerns in Source Code. Proceedings of ICSE 2002, pages 721-722, May 2002.

    G.C. Murphy, R.J. Walker, E.L.A. Baniassad, M.P. Robillard, A. Lai, and M.A. Kersten. Does Aspect-oriented Programming Work? Communications of the ACM, 44(10):75-77, October 2001.

    M.P. Robillard, G.C. Murphy. Analyzing Concerns Using Class Member Dependencies. Position paper for the ICSE 2001 Workshop on Advanced Separation of Concerns in Software Engineering.

    G.C. Murphy, A. Lai, R.J. Walker, M.P. Robillard. Separating Features in Source Code: An Exploratory Study. Proceedings of ICSE 2001, pages 275-284, May 2001.

  9. More Aspects: A recent paper describes a simple optimization of aspects [SM03]. The paper looks at how to use regular expressions and static analysis to eliminate the runtime overhead associated with certain characteristics of aspects. This project would study their approach and then try to improve on it. You need to implement the approach and run benchmarks to show results. It is OK if your attempt to improve on their approach fails.

    [SM03] D. Sereni and O. de Moor. Static analysis of aspects. International Conference on Aspect-Oriented Software Development. 2003.

  10. Virtual Machine Performance: Not all VMs are created equal. Many factors affect the performance of virtual machines, including the quality of the code produced, the performance of the garbage collector, the quality of the class library utilized, and the success of the inliner, to name a few. This project will study and analyze several existing virtual machines in order to come to some conclusions about the factors that affect VM performance. What are the strengths and weaknesses of each VM?

  11. Observer Design Pattern and Data Flow Analysis: In a recent paper [N02], data flow analyses are implemented utilizing the observer design pattern, and results are presented indicating that such an approach is better than a traditional worklist-based approach.

    This project would entail implementing a data flow analysis utilizing the observer design pattern and comparing it to a data flow analysis implemented in Soot. Is the observer pattern better? If so, in what ways is the observer implementation better?

    [N02] Gleb Naumovich. Using the Observer Design Pattern for Implementation of Data Flow Analyses. PASTE 02.

    If you would like to pursue a different research project please discuss your idea with me as soon as possible.
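
To make topic 11 a little more concrete, here is a minimal sketch of what an observer-style data flow analysis might look like. It is not the implementation from [N02] and it does not use Soot; the class `ObserverDataFlow`, the `Node` type, and the tiny four-node loop graph are all hypothetical names invented for illustration. Each node observes its predecessors and recomputes a reaching-definitions style set `out = gen U (in - kill)` whenever a predecessor's result changes, so no explicit worklist is kept.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: every name below is illustrative, not taken from [N02] or Soot.
class ObserverDataFlow {

    /** A control-flow-graph node that observes its predecessors. */
    static final class Node {
        final String name;
        final Set<String> gen;   // definitions created in this node
        final Set<String> kill;  // definitions overwritten by this node
        final Set<String> out = new TreeSet<>();
        final List<Node> preds = new ArrayList<>();
        final List<Node> observers = new ArrayList<>(); // CFG successors

        Node(String name, Set<String> gen, Set<String> kill) {
            this.name = name;
            this.gen = gen;
            this.kill = kill;
        }

        void addSuccessor(Node succ) {
            observers.add(succ);
            succ.preds.add(this);
        }

        /** Recompute out = gen U (in - kill); notify observers only on change. */
        void update() {
            Set<String> result = new TreeSet<>();
            for (Node p : preds) {
                result.addAll(p.out);   // in = union of the predecessors' out sets
            }
            result.removeAll(kill);
            result.addAll(gen);
            if (!result.equals(out)) {
                out.clear();
                out.addAll(result);
                for (Node o : observers) {
                    o.update();         // push the change to the successors
                }
            }
        }
    }

    static Set<String> setOf(String... items) {
        return new TreeSet<>(Arrays.asList(items));
    }

    /** Builds a four-node loop CFG and returns each node's out set. */
    static Map<String, Set<String>> analyze() {
        Node entry = new Node("entry", setOf("d1"), setOf("d2")); // d1: x = 1
        Node head = new Node("head", setOf(), setOf());           // loop test
        Node body = new Node("body", setOf("d2"), setOf("d1"));   // d2: x = x + 1
        Node exit = new Node("exit", setOf(), setOf());
        entry.addSuccessor(head);
        head.addSuccessor(body);
        head.addSuccessor(exit);
        body.addSuccessor(head);

        List<Node> nodes = Arrays.asList(entry, head, body, exit);
        for (Node n : nodes) {
            n.update();                 // initial kick; later changes propagate themselves
        }
        Map<String, Set<String>> result = new LinkedHashMap<>();
        for (Node n : nodes) {
            result.put(n.name, n.out);
        }
        return result;
    }

    public static void main(String[] args) {
        analyze().forEach((name, out) -> System.out.println(name + ": " + out));
    }
}
```

Because the sets only grow from empty and the transfer function is monotone, the recursive notifications stop once a fixed point is reached. A project along these lines would replace the toy graph with real control flow graphs and compare running time and code complexity against a worklist-based analysis.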

References

AdaptJ http://www.sable.mcgill.ca/~bdufou1/AdaptJ/main.html

Laurie Hendren COMP621 Optimizing Compilers http://www.sable.mcgill.ca/~hendren/
Other page with projects: http://www.sable.mcgill.ca/~olhota/621

IBM Research Virtual Machine http://oss.software.ibm.com/developerworks/oss/jikesrvm/?origin=jikes

Soot Compiler Infrastructure http://www.sable.mcgill.ca/
Amie Lynn Souter
Last modified: Sun Sept 22 10:58:15 EDT 2002