DEPARTMENT OF COMPUTER SCIENCE
SE320 SOFTWARE VERIFICATION & VALIDATION
FALL 2004

Assignment 4: Empirical Evaluation of Test Suites

Due: November 30th. This is a long assignment. Do not wait until the night before to start it.

There are two options for this assignment. You only need to do one of them. If you do both, you can earn up to 6 additional points on your final exam score.

Option 1) JCrasher

  1. Read about, download, and install JCrasher: http://www.cc.gatech.edu/people/home/csallnch/jcrasher/
    JCrasher is an automatic robustness testing tool for Java code. JCrasher examines the type information of a set of Java classes and constructs code fragments that will create instances of different types to test the behavior of public methods under random data. JCrasher attempts to detect bugs by causing the program under test to "crash", that is, to throw an undeclared runtime exception. Although in general the random testing approach has many limitations, it also has the advantage of being completely automatic: no supervision is required except for online inspection of the test cases that have caused a crash. Compared to other similar commercial and research tools, JCrasher offers several novelties: it transitively analyzes methods, determines the size of each tested method's parameter space, and selects parameter combinations (and therefore test cases) at random, taking into account the time allocated for testing; it defines heuristics for determining whether a Java exception should be considered a program bug or whether the JCrasher-supplied inputs have violated the code's preconditions; it includes support for efficiently undoing all the state changes introduced by previous tests; it produces test files for JUnit, a popular Java testing tool; and it can be integrated in the Eclipse IDE.
  2. Select a set of classes to test. Use the application from the previous assignment. To make the experiment interesting, select 5 or more classes, and prefer large classes as measured by lines of code.
  3. Create several test suites of varying sizes, using the depth option to vary the size. I understand that you may not have enough disk space, but experiment and create the largest test suite you can. Try to create at least four test suites (small: 100, medium: 1000, large: 5000, extra-large: 10000+ test cases). Record the number of test cases in each test suite, the time to create the test suite, and the depth you set to create the test suite.
    This is JCrasher 0.27. 
    Usage: java edu.gatech.cc.jcrasher.JCrasher [-depth d] ClassToCrash*
    -depth d .. maximal depth of constructor chaining >0; default =3
    E.g.:
    java edu.gatech.cc.jcrasher.JCrasher -depth 4 my.package.MyClass p2.C2 C3
  4. Run the test suites on the classes. Record the results. How long does it take to execute the test suites? Did you find any faults?
  5. Next, manually insert faults into each of the classes above. Insert 3-5 faults into each class.

    When you seed a fault, you need to know what happens when that fault is encountered. Faults shouldn't be seeded in such a way that they are trivial to uncover (i.e., every test case uncovers the fault), but they shouldn't be so difficult to uncover that no test case uncovers the fault. There needs to be a balance. When you execute the class, only one fault should be seeded at a time. I recommend writing a script that inserts each fault, recompiles, and runs the test suite for that fault (see fault seeding notes below).

  6. Execute the different-size test suites on the application. For each test suite:
    1. Record how many faults were uncovered
    2. Record the code coverage
    3. Record how many faults were detected per test case

  7. Re-run the experiment using the test suite that came with the application. Record the same information as above.
  8. Write up your results in a report. I expect a well-written write-up; please use tables to organize your results. For example, a table like the following would work well (TS: Test Suite, TC: Test Case).

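    Test Suite | Size of TS | Depth | Time to Create TS | Time to Run TS | # of Bugs Detected | # of Seeded Faults Detected | Faults Detected per TC | Code Coverage
    -----------+------------+-------+-------------------+----------------+--------------------+-----------------------------+------------------------+--------------
    JCrasher 1 |            |       |                   |                |                    |                             |                        |
    JCrasher 2 |            |       |                   |                |                    |                             |                        |
    JCrasher 3 |            |       |                   |                |                    |                             |                        |
    JCrasher 4 |            |       |                   |                |                    |                             |                        |
    Original   |            |       |                   |                |                    |                             |                        |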

  9. Answer the following:

Option 2) Ordering Broken Unit Tests for Focused Debugging

The goal of the second project is to try to replicate the results of the paper "Ordering Broken Unit Tests for Focused Debugging" (Galli et al., ICSM 2004).

Start by reading the ICSM 2004 paper on ordering broken unit tests. The abstract:

Current unit test frameworks present broken unit tests in an arbitrary order, but developers want to focus on the most specific ones first. We have therefore inferred a partial order of unit tests corresponding to a coverage hierarchy of their sets of covered method signatures: When several unit tests in this coverage hierarchy break, we can guide the developer to the test calling the smallest number of methods. Our experiments with four case studies indicate that this partial order is semantically meaningful, since faults that cause a unit test to break generally cause less specific unit tests to break as well.

There are several parts to this project:

We will use JUnit instead of SUnit. Use the application from the previous assignments.

First, verify that the problem claimed in the abstract exists: "Current unit test frameworks present broken unit tests in an arbitrary order, but developers want to focus on the most specific ones first."
To do this, use Jester to insert faults, then run the JUnit test suites on the application and verify that the broken tests are indeed reported in an arbitrary order. Note: if Jester does not work, manually seed faults in the application. Seed 10-20 faults.

Record your results: how many of the seeded faults were detected, and how many test cases fail due to each fault?

For one fault that breaks a large number of test cases (10+), manually order the test cases from least specific to most specific, where the most specific test case invokes the smallest number of methods.

Design and implement an approach to automatically select the most specific test case. You can use Clover to record the methods that are executed for each test case. Read the ICSM 2004 paper for ideas.
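
As one possible starting point (a sketch under stated assumptions, not the method from the paper), suppose the coverage data has already been converted into a mapping from test name to the set of method signatures that test executes. That mapping format, the function names, and the sample data below are hypothetical; Clover does not produce this structure directly. A broken test is treated as most specific when no other broken test covers a proper subset of its methods.

```python
def most_specific(broken, coverage):
    """Return the broken tests that have no other broken test strictly below them.

    `coverage` maps a test name to the set of method signatures it executes
    (a hypothetical format). Results are sorted by covered-set size, then name.
    """
    minimal = []
    for test in broken:
        covered = coverage[test]
        # `a < b` on sets is the proper-subset check.
        if not any(coverage[other] < covered for other in broken if other != test):
            minimal.append(test)
    return sorted(minimal, key=lambda t: (len(coverage[t]), t))


def order_least_to_most_specific(broken, coverage):
    """Sort broken tests from most methods covered to fewest (ties by name)."""
    return sorted(broken, key=lambda t: (-len(coverage[t]), t))


if __name__ == "__main__":
    # Illustrative data only; the test and method names are made up.
    coverage = {
        "testAdd": {"List.add"},
        "testAddRemove": {"List.add", "List.remove"},
        "testAll": {"List.add", "List.remove", "List.size"},
    }
    broken = ["testAll", "testAddRemove", "testAdd"]
    print(order_least_to_most_specific(broken, coverage))
    print(most_specific(broken, coverage))
```

Sorting by set size alone gives a total order that is convenient for a report, while the subset check respects the partial order: two broken tests with unrelated coverage sets are both reported as most specific.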

For this project, hand in a report that addresses the following items, along with the code for your implementation.

1. Create a table like the following and add results.


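System | LOC | Test LOC | Coverage | # of Unit Tests | Equivalent Tests | Tests Covered by Tests
-------+-----+----------+----------+-----------------+------------------+-----------------------
       |     |          |          |                 |                  |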

2. Results for the question above: how many test cases fail due to each fault.

3. An illustration of the ordering of the test cases for the one chosen fault.

4. A write-up describing your approach to automatically selecting the most specific test case.

5. Code for your implementation.

6. Results of executing your implementation on the test suite.
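
For the "Equivalent Tests" and "Tests Covered by Tests" columns of the table in item 1, the following Python sketch shows one way the counts might be computed. It assumes that two tests are equivalent when they execute exactly the same set of method signatures, and that a test is covered when its set is a proper subset of another test's set; check both assumptions against the definitions in the paper. The input mapping and the function name are hypothetical.

```python
from collections import defaultdict


def hierarchy_counts(coverage):
    """Count equivalent and covered tests in a test -> method-set mapping."""
    # Group tests that execute exactly the same set of methods.
    groups = defaultdict(list)
    for test, methods in coverage.items():
        groups[frozenset(methods)].append(test)

    # Tests that share their covered set with at least one other test.
    equivalent = sum(len(g) for g in groups.values() if len(g) > 1)

    # Tests whose covered set is a proper subset of some other test's set.
    distinct = list(groups)
    covered = sum(
        len(groups[s]) for s in distinct if any(s < other for other in distinct)
    )
    return {"unit_tests": len(coverage), "equivalent": equivalent, "covered": covered}
```

With four hypothetical tests where two cover only `m1`, one covers `m1` and `m2`, and one covers `m3`, the function reports 4 unit tests, 2 equivalent tests, and 2 covered tests.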

Fault Seeding Notes

Fault seeding is typically performed by creating a variant program version for each fault inserted into the program. Therefore, if there are X faults, we have X versions of the program, and each version differs from the original only by the introduction of one fault. Once faults have been seeded in a program, the goal is to determine the effectiveness of the test suite. This is done by executing each variant program with the test suite and determining whether any test case in the test suite detects the fault. A test case is said to detect a fault if the output of the variant program differs from the output of the original program. The fault detection effectiveness of a test suite for a particular program is the ratio of faults detected by the test suite to all injected faults [wong-icse95]. The Wong paper describes fault seeding in more detail.
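
The bookkeeping described above could be sketched in Python as follows. This is only an illustration under stated assumptions: faults are modeled as single text replacements in the source, test outputs are assumed to be already collected into dictionaries keyed by test name, and all function names are hypothetical. The recompile and test-execution steps are deliberately left out because they depend on your build setup.

```python
def make_variant(source, original, mutated):
    """Return a copy of `source` with one seeded fault.

    The fault site must occur exactly once so that each variant differs
    from the original program by a single change.
    """
    if source.count(original) != 1:
        raise ValueError("fault site must match exactly once")
    return source.replace(original, mutated)


def detecting_tests(original_outputs, variant_outputs):
    """Tests whose output on the variant differs from the original program."""
    return [
        test
        for test, expected in original_outputs.items()
        if variant_outputs.get(test) != expected
    ]


def effectiveness(original_outputs, outputs_per_fault):
    """Ratio of detected faults to all injected faults, plus the detected ones."""
    if not outputs_per_fault:
        raise ValueError("no faults were seeded")
    detected = sorted(
        fault
        for fault, outputs in outputs_per_fault.items()
        if detecting_tests(original_outputs, outputs)
    )
    return len(detected) / len(outputs_per_fault), detected
```

Dividing the number of detecting tests for a fault by the size of the test suite also gives a rough indication of whether that fault is too easy (every test detects it) or too hard (no test detects it) to uncover.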

Notes about Jester

  1. Jester can be downloaded from http://jester.sourceforge.net
  2. "Jester is different than code coverage tools, because it can find code that is executed by the running of tests but not actually tested. However, Jester is not meant as a replacement for code coverage tools, merely as a complementary approach. "

    You need to add jester.jar to your CLASSPATH.

    Read the readme file found in the jester directory for usage directions. IT IS IMPORTANT THAT YOU READ THIS FILE.

    To run the examples in the jester directory, junit.jar must be in your CLASSPATH. I also needed to add mockobjects.jar to my CLASSPATH. I ran test.bat and obtained the same results as the README. For acceptancetest.bat, I had to run each command in the .bat file by hand, but I still obtained the results reported in the README.

    Here is a paper about Jester: Jester Paper

Please create a sensible report that incorporates the above requirements. In the report, clearly indicate where each requirement is addressed.