Homework Assignment #1 — Test Coverage

In this assignment you will create three high-coverage test suites, one for each of three different programs spanning three different application domains and three different programming languages. You will then be asked to write a short report reflecting on the activity.

Two of the key properties found in industrial software engineering, but not in most undergraduate academic coursework, are: the program is large and you did not write the program. If you are hired by Microsoft as a software engineer, they will not ask you to write Office from scratch. Instead, as discussed in class, the vast majority of software engineering is the comprehension and maintenance of large legacy codebases (i.e., old code you did not write).

Thus, there is no particular assumption that you are familiar with any of the subject programs in this assignment. Indeed, that is the point: you will be applying testing concepts to unknown software.

You may work with a partner for this assignment. If you do, you must use the same partner for all sub-components of this assignment. Only one partner needs to submit the report on Canvas, but if you both do, nothing bad happens.

Installing, Compiling and Running Legacy Code

It is your responsibility to download, compile, and run the three subject programs. Getting the code to work is part of the assignment. You can post on the forum for help and compare notes bemoaning various architectures (e.g., Windows vs. Mac vs. Linux). Ultimately, however, it is your responsibility to read the documentation for these programs and utilities and use some elbow grease to make them work.

If you are having trouble (e.g., Windows is often a bit more complicated than Linux for these things), post on the forum. Other students should be able to offer help and advice.

For example, one TA reports that gcc may not easily support static linking on a Mac, but that it took fewer than two minutes to set everything up in Ubuntu.

If relevant, the grading server uses Python 2.7.12, gcc (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609 and javac 1.8.0_151 for this assignment.

The TAs have suggested that some components of this assignment, such as HW1b, are very difficult to set up natively on a Mac. Just use a Linux virtual machine.

Subject Programs and Metrics

There are three subject programs for this assignment. Each program has varying characteristics, input formats, and coverage goals. Your work for each subject program is submitted separately to the grading server (as HW1a, HW1b and HW1c).

HW1a — AVL Trees (Python)

The first subject program is a simple Python implementation of an abstract data type: the AVL tree container data structure.

Test cases for this program take the form of sequences of instructions for manipulating a single global AVL tree:

For example, if the text file test1.avl contains i1 i2 i3 i4 i5 d4 p, then python avl.py test1.avl produces the output:
  2.
 /  \
1   5
/\  /\
   3
   /\

We compute coverage for this program using the coverage package for Python. Both statement and branch coverage are considered.

The reference implementation is available here. The program is about 300 lines long and is a self-contained single file.

A test suite for this application is a collection of .avl files. The reference starter test suite is available here.

Python Build, Installation and Coverage Details

Installing coverage may look like this on your Ubuntu setup:

$ sudo apt-get install python-pip python-dev build-essential
$ sudo pip install --upgrade pip
$ sudo pip install --upgrade virtualenv
$ sudo pip install coverage

Note that apt-get is Debian/Ubuntu-specific. If you have another Linux distribution you may need to use something else (yum on Red Hat, etc.). If you're using a Mac, read this entire page and note the multiple paragraphs about Macs: while HW1a is fairly reasonable on a Mac, HW1b may not be, and you may want to consider setting up a virtual machine now. Check the forum for student hints.

After that, you can use the coverage utility on multiple test cases and view combined coverage results:
$ coverage run --append avl.py simple1.avl
1
/\
 2
 /\

$ coverage report
Name     Stmts   Miss  Cover
----------------------------
avl.py     182     99    46%

$ coverage run --append avl.py simple2.avl
  2.
 /  \
1   5
/\  /\
   3
   /\

$ coverage report
Name     Stmts   Miss  Cover
----------------------------
avl.py     182     54    70%

$ coverage erase

Note how the measured coverage of only the first test is low (46%) but the combined coverage result for the two tests is higher. The --append option is necessary to avoid overwriting the stored coverage data with each new test run.

Now we consider branch coverage. Simply add --branch:

$ coverage run --append --branch avl.py simple1.avl

$ coverage run --append --branch avl.py simple2.avl

$ coverage report
Name     Stmts   Miss Branch BrPart  Cover
------------------------------------------
avl.py     182     54     90     19    66%

$ coverage report -m
Name     Stmts   Miss Branch BrPart  Cover   Missing
----------------------------------------------------
avl.py     182     54     90     19    66%   82-85, 88, 101, 109-112, 121,
123-127, 139-141, 145, 189, 201-202, 211, 216, 223-238, 244-248, 253-254,
284, 286-292, 304-306, 81->82, 87->88, 100->101, 107->109, 120->121,
122->123, 138->139, 144->145, 171->exit, 210->211, 212->214, 215->216,
243->244, 250->253, 283->284, 285->286, 301->303, 303->304, 319->312

$ coverage erase

The additional columns give branch-specific information (e.g., about partial branches; see the manual), but the final column now gives the branch coverage. Note also how the Missing column guides you to the lines and branches that are not exercised by the current suite.

There is external written documentation available for this utility. Finally, note that this utility can also produce XML and HTML reports, which you may find easier to interpret.

HW1b — PNG Graphics (C)

The second subject program is the portable network graphics file format reference library, libpng. It is used for image manipulation and conversion.

We will be using the developer-provided pngtest driver. It takes as input a single .png file.

We compute coverage for this program using the gcov test coverage program for gcc. Only statement coverage is considered.

The reference implementation (version 1.6.34) is available here. It contains about 85,000 lines of code spread over about 70 files.

A test suite for this application is a collection of .png files. The reference implementation comes with over 150 such example files. Feel free to use an image manipulation or conversion program to create your own files.
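If you prefer not to depend on an image editor, you can also write PNG files by hand with the Python standard library. The layout below (an 8-byte signature followed by length/type/data/CRC chunks) follows the PNG specification; varying the IHDR fields is one way to reach different libpng code paths:

```python
import struct
import zlib

def chunk(ctype, data):
    # A PNG chunk is: 4-byte big-endian length, 4-byte type,
    # the data, then a CRC-32 computed over the type and data.
    return (struct.pack(">I", len(data)) + ctype + data +
            struct.pack(">I", zlib.crc32(ctype + data) & 0xffffffff))

def make_png(width, height):
    # Minimal 8-bit truecolor (color type 2), non-interlaced PNG.
    signature = b"\x89PNG\r\n\x1a\n"
    ihdr = struct.pack(">IIBBBBB", width, height,
                       8,         # bit depth
                       2,         # color type: truecolor RGB
                       0, 0, 0)   # compression, filter, interlace
    raw = bytearray()
    for y in range(height):
        raw.append(0)  # filter type 0 (None) for this scanline
        for x in range(width):
            raw += bytes(((7 * x) % 256, (11 * y) % 256,
                          (13 * (x + y)) % 256))
    idat = zlib.compress(bytes(raw))
    return (signature + chunk(b"IHDR", ihdr) +
            chunk(b"IDAT", idat) + chunk(b"IEND", b""))

if __name__ == "__main__":
    with open("gradient.png", "wb") as f:
        f.write(make_png(64, 64))
```

Grayscale, palette, 16-bit, and interlaced variants (different IHDR fields) exercise different decoding routines, as do deliberately truncated or corrupted files: libpng's error-handling paths are also code you can cover.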

The difficulty difference between HW1a and HW1b is large. (For example, students often report that HW1a takes "minutes" while HW1b takes "hours".)

C Build, Installation and Coverage Details

The gcov utility usually comes with gcc, so you should not need to do anything specific to install it. (Unless you are trying to use a Mac natively despite our warnings not to do so. In that case you will likely encounter the issue that gcc static linking may not work on a Mac and that using dynamic linking will give you 0% coverage. You should then install Linux in a VM.) However, our subject program depends on the development version of a compression library:

$ sudo apt-get install libz-dev 
In addition, you will have to compile the project with coverage instrumentation (taking care to use static linking or to otherwise make all of the libpng source files, and not just the test harness, visible to coverage):
$ cd libpng-1.6.34
$ sh ./configure CFLAGS="--coverage -static"
$ make clean ; make 

$ ./pngtest pngtest.png
        Testing libpng version 1.6.34
        ...

$ gcov *.c
        ...
File 'png.c'
Lines executed:37.78% of 1236
Creating 'png.c.gcov'

File 'pngerror.c'
Lines executed:16.67% of 252
Creating 'pngerror.c.gcov'

        ...

Lines executed:28.71% of 10606

$ ./pngtest contrib/gregbook/toucan.png
        ...

$ gcov *.c
        ...

File 'pngwutil.c'
Lines executed:61.90% of 979
Creating 'pngwutil.c.gcov'

Lines executed:30.72% of 10606


$ rm *.gcda pngout.png

Note how gcov gives a separate report for each source file specified on the command line, but also gives a sum total for statement coverage (the final Lines executed reported).

The png.c.gcov (etc.) files created by this process are marked-up source files that include coverage and visitation information. You can view them to determine which lines and branches were covered. Note that lcov, a graphical front-end to gcov, may help you interpret that output for large projects. While it is possible to obtain branch coverage using gcov (e.g., with -b -c), we will not do so for this assignment.

Note that pngtest creates pngout.png. (This can confuse some students: if you collect coverage after running on *.png and then do it again, you may get a different answer the second time because of the newly created file!)

HW1c — Data Visualization (Java)

The final subject program is the JFreeChart chart creation library. It is used to produce quality charts for a variety of applications and files.

We will be writing tests against its programmable API. Each test case will thus be a .java file that invokes methods from the chart-creation library.

We compute coverage for this program using the cobertura bytecode test coverage utility. Both statement and branch coverage are considered.

The reference implementation (version 1.5.0) is available here. It contains about 300,000 lines of code spread over about 640 files.

A test suite for this application is a collection of .java files. The reference starter test suite is available here. Note that our grading server does not support graphical user interfaces, so any tests that create Java/AWT displays or widgets or GUI elements are unlikely to work.

Java Build, Installation and Coverage Details

First, you will need to install cobertura to collect Java coverage and maven to build this project:

$ sudo apt-get install cobertura maven 

Then you will need to build the project:

$ tar zxf jfreechart-1.5.0.tar.gz
$ tar zxf jfreechart-tests.tar.gz
$ cd jfreechart-1.5.0
$ EDIT pom.xml          
	# search for maven-compiler-plugin and 
	# add "debug" and "debuglevel" properties
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>3.7.0</version>
                <configuration>
                    <source>${project.source.level}</source>
                    <target>${project.target.level}</target>
                    <encoding>${project.build.sourceEncoding}</encoding>
                    <debug>true</debug>
                    <debuglevel>lines,vars,source</debuglevel>
                </configuration>
            </plugin>

$ mvn compile
$ cd target/classes
$ cobertura-instrument org/ 
Cobertura 1.9.4.1 - GNU GPL License (NO WARRANTY) - See COPYRIGHT file
Instrumenting 1 file
WARN   visitEnd, No line number information found for class org.jfree.chart.axis.DateAxis$1.  Perhaps you need to compile with debug=true?
Cobertura: Saved information on 661 classes.
Instrument time: 835ms

$ javac -cp /usr/share/java/cobertura.jar:. test/ChartTest1.java
$ java -cp /usr/share/java/cobertura.jar:. test/ChartTest1
output1.png created
Flushing results...
Flushing results done
Cobertura: Loaded information on 661 classes.
Cobertura: Saved information on 661 classes.

$ cobertura-check --totalline 100
Cobertura 1.9.4.1 - GNU GPL License (NO WARRANTY) - See COPYRIGHT file
Cobertura: Loaded information on 661 classes.
Project failed check. Total line coverage rate of 3.3% is below 100.0%

$ cobertura-check --totalbranch 100
Cobertura 1.9.4.1 - GNU GPL License (NO WARRANTY) - See COPYRIGHT file
Cobertura: Loaded information on 661 classes.
Project failed check. Total branch coverage rate of 1.4% is below 100.0%

$ javac -cp /usr/share/java/cobertura.jar:. test/ChartTest2.java
$ java -cp /usr/share/java/cobertura.jar:. test/ChartTest2
$ cobertura-check --totalline 100
Project failed check. Total line coverage rate of 3.5% is below 100.0%
$ cobertura-check --totalbranch 100
Project failed check. Total branch coverage rate of 1.6% is below 100.0%

You can run cobertura-report --destination report to make a report/ subdirectory containing source files annotated with coverage information. You can also run cobertura-check with different arguments to get per-file numerical coverage reports.

Note: If you want to clear information and recalculate coverage, some report that merely deleting cobertura.ser will not work, and you should restart from mvn compile and cobertura-instrument org/ in the instructions above. (If you do not you may either see coverage reports that are too high or coverage reports that are entirely blank.)

Note that there is a much easier way to run cobertura for standard Java development:

$ mvn cobertura:cobertura
This instructs Maven to automatically download the cobertura plugin, run the project tests, and place an HTML report in ${project}/target/site/cobertura/index.html. We will return to this convenient approach in Homework 2, but for now, we want you to gain experience with nitty-gritty command-line approaches to coverage that do not abstract away details.

(If you ignored our warnings and tried to do this on a Mac, you likely ran into a number of issues. You may find that you need to download cobertura yourself and convert all of its shell scripts using something like dos2unix. You may have to handle Java NoClassDefFoundError exceptions by installing slf4j and putting slf4j-api.jar and slf4j-nop.jar in your classpath. You may then have to change all of the commands and paths to work on a Mac, and even if you do all this you'll find that the coverage reported is a little bit off from what is reported above and what is done on the grading server, and the grading server measurements are what count. Or you could install Linux on a virtual machine.)

HW1d — Written Report

You must also write a short two-paragraph report reflecting on your experiences creating high-coverage tests for this assignment. (If you are working with a partner, indicate as much in the text or header. Recall from above that you need only submit one copy of the report, but if you both do, nothing bad happens. Try to use the Groups on Canvas if you can.) Consider addressing points such as:

Rubric:

There is no explicit format (e.g., for headings or citations) required. You may discuss other activities (e.g., validation or automation scripts you wrote, etc.) as well.

The grading staff will select a small number of excerpts from particularly high-quality or instructive reports and share them with the class. If your report is selected you will receive extra credit.

Recommended Approach

You will almost certainly need to inspect the source code of the various programs (i.e., white-box testing) to construct high-coverage test suites.

For each program, start with any test cases we have provided. Because of limits on autograding submissions, and so that you learn how coverage computations work across various systems (a relevant skill for interviews, etc.), you will almost certainly want to install the coverage utilities yourself. As you read each project's source code and expand your test suite, recompute the coverage locally. When you perceive that it has gone up significantly, submit to the grading server for a more official check.

Note the increasing size and complexity of the subject programs. Splitting your time evenly between them is probably not the optimal allocation. The first project is due earlier to prevent students from putting things off until the last minute.

Submission

For each of the three programmatic components (HW1a, HW1b, HW1c), submit your (individual, not zipped) test cases via the autograder.io autograding server. You may submit multiple times (see the grading server for details). For the written component (HW1d), submit your PDF report via Canvas.

The exact grading rubric (e.g., x points for y coverage) can be seen in the submission feedback details. As of 1/7/18, the grading rubric scales roughly between:

However, the grading rubric on the autograder is officially correct and the rough ranges above are provided only as a planning convenience and may be slightly inaccurate.

Note that your local results may differ slightly from the autograder results and that is fine and expected. For example, you may observe 6.8% line coverage "at home" and 6.2% on the autograder. (Different compilers, header files, optimizations, etc., may result in different branch and line counts.) Your grade is based on the autograder results, regardless of what you see locally.

We always use your best autograder submission result (even if your latest result is worse) for your grade.

Note that you are typically limited to a maximum number of test cases (i.e., 50 files totalling 30 megabytes). If you find that you have created more than that, minimize your test suite by prioritizing those that most increase coverage. (You might, as an example, use a for loop to add one test to the suite at a time, noting the new coverage after each addition, and use that information to make your decision.)
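The one-test-at-a-time idea above can be sketched as a greedy selection loop. Here coverage_of is a hypothetical helper standing in for however you measure the combined coverage of a candidate subset locally (e.g., by re-running the relevant coverage tool):

```python
def minimize(tests, coverage_of, budget=50):
    # Greedily add whichever remaining test most increases the
    # combined coverage, stopping at the budget or when nothing helps.
    chosen, best = [], 0
    remaining = list(tests)
    while remaining and len(chosen) < budget:
        gain, pick = max(
            ((coverage_of(chosen + [t]) - best, t) for t in remaining),
            key=lambda g: g[0])
        if gain <= 0:
            break  # no remaining test adds any coverage
        chosen.append(pick)
        best += gain
        remaining.remove(pick)
    return chosen
```

Greedy selection is not guaranteed to be optimal (minimum-cost set cover is NP-hard), but in practice it produces small suites that retain nearly all of the original coverage.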

Note also that the grading server has resource usage (e.g., CPU time) caps. As a result, if you submit a test suite that is very large, we may not be able to evaluate your submission. Consider both quantity and quality.

Feel free to scour the web (e.g., Stack Overflow), this webpage (e.g., the example tests shown above), or the tarballs (e.g., yes, you can submit pngtest.png or toucan.png if you want to) for ideas and example images to use directly as part of your answer (with or without modification); just cite your sources (or URLs) in the report. However, submissions are limited to 50 test cases (so just finding a big repository of two hundred images may not immediately help you without additional work) totalling 30 megabytes. In addition, you may never submit another student's work (images or test selection) as your own.

Commentary

As befitting a 400-level elective, this assignment is somewhat open-ended. There is no "one true path" to manually creating high-coverage test suites.

You may find yourself thinking something like "I know what to do at a high level, but I don't know specifically what to create here. I could try to read and understand the program to make tests that cause it to do different things, but reading and understanding the program takes a long time, especially since I didn't write this code and I don't really care about it and I'm only doing it because someone is telling me to." Such concerns are, in some sense, the point: they are indicative of industrial software engineering.

History suggests that HW1b is the hardest, HW1c is moderate, and HW1a is the easiest. While your results may vary, feel free to use this information when planning your time.

Finally, students sometimes wonder if it is possible to get 100% coverage on 1a. While it should be possible to get 100% coverage for the general AVL tree algorithm, this particular implementation of the algorithm may well have branches that you are unable to reach — especially given this testing setup.

FAQ and Troubleshooting

In this section we detail previous student issues and resolutions:

  1. Question: When I type the command sudo apt-get install python-pip python-dev build-essential I am prompted for a password — which should I use?

    Answer: The password should be the one associated with your user account (the one you used to log in to the OS). If you are using a virtual machine, it is the virtual machine login password, not your "real" login password.

  2. Question: I get

    The program 'make' can be found in the following packages:
     * make
    

    Answer: You need to do sudo apt-get install make or similar. (If you are uncertain about installing and using make, you may want to drop the class this semester and take it again next year after taking a few other courses; this class assumes that background.)

  3. Question: For HW1b:

    ./.libs/libpng16.a(pngrutil.o): In function `png_inflate_claim':
    pngrutil.c:(.text+0xd5b): undefined reference to `inflateValidate'
    collect2: error: ld returned 1 exit status
    Makefile:1005: recipe for target 'pngfix' failed
    
    or
    Note, selecting 'zlib1g-dev' instead of 'libz-dev'

    Answer: You do not have libz-dev installed correctly. One student eventually resolved this by installing an Ubuntu virtual machine.

  4. Question: I get this error

    make: *** No targets specified and no makefile found.  Stop.
    

    Answer: You need to run configure before running make. Double-check the build and installation instructions above.

  5. Question: When I try to run

    gcc -o pngtest pngtest.c
    
    it chokes with undefined reference errors.

    Answer: Yes — pngtest requires a number of libraries to build. You should follow the installation instructions above (run "configure" then "make", etc.).

  6. Question: For HW1c:

    Warning: JAVA_HOME environment variable is not set
    
    or
    [ERROR] COMPILATION ERROR : 
    [INFO] -------------------------------------------------------------
    [ERROR] No compiler is provided in this environment. Perhaps you are
    running on a JRE rather than a JDK?
    [INFO] 1 error
    

    Answer: Try sudo apt-get install openjdk-8-jdk
    See here for more information.

  7. Question: For HW1c, I get decent coverage locally, but when I submit, I get 0.

    Answer 1: Click on the right-facing arrow to expand the item to gain additional information. In one student's case, the line

    test/AreaChartTest.java:43: error: package org.junit does not exist
    
    ... revealed that the student was using a test that required Junit (which is not present on the grading server). This can happen if you try to use a test from JFreeChart or the web unchanged.

    Answer 2: In another student's case, the line (again, click on the triangle thingy to expand it)

    Exception in thread "main" java.lang.IllegalArgumentException: Series index out of bounds
    
    revealed that the student was submitting tests that would raise exceptions before covering many lines.

    Answer 3: In another case, the line:

    No X11 DISPLAY variable was set, but this program performed an operation which requires it.
    
    revealed that the student was submitting a test that made use of a graphical user interface ("X11" is the Linux/Unix graphical UI). Those are not supported on the grading server.

  8. Question: For HW1c, on the generated HTML report, I get Unable to locate org/jfree/chart/ChartColor.java. Have you specified the source directory?

    Answer: Try specifying the source directory when you generate the report, using something like:

    cobertura-report --destination report srcdir="../../src/main/java"
    

  9. Question: For HW1c, I get:

    javac: file not found: test/ChartTest1.java
    

    Answer: Download the test tarball and unpack it.

  10. Question: For HW1c, when I am trying to run my own tests, I get:

     
    Error: could not find or load main class test.insertfilenamehere
    

    Answer: Try adding package test; to your files.

  11. Question: For HW1d, can you post guidelines or points of interest for writing high-quality or instructive reports?

    Answer: I regret that I cannot. At least, not in the way one is likely to desire when posing such a query. You can potentially reverse engineer things a bit, but ultimately there's no magic formula.

    The grading staff may well select a small number of excerpts that are worth sharing with the class. You can infer from that that we'll be favoring excerpts that are "family friendly" and that contribute to pedagogical goals: either by bringing something new to light, or by reinforcing or reinterpreting concepts from class. Given my eclectic tastes and views on liberal arts education, however, this could potentially be anything from a strong link between, say, testing and rhetoric, to some direct point about software maintenance. Usually these sorts of things end up being awarded to students who have clearly spent a bit more time than usual on the assignment. (If you are being very mercenary, it's probably not worth shooting for an ill-defined, uncertain thing.)

Mac OS X Help

Using a Mac is not supported by the course staff; it is your responsibility to get the software up and running. That said, we provide a number of helpful student hints on an "as is" basis.

Some students suggest that you may be able to get things working with a virtual environment:

$ python3 -m venv env
$ source env/bin/activate
(env) $ echo "Now I am in a virtual environment!"
Now I am in a virtual environment!
(env) $ pip install coverage
Now you can run coverage:
(env) $ coverage run --append avl.py simple1.avl 
To exit the virtual environment:
(env) $ deactivate
$ echo "Left virtual environment."
Left virtual environment.
In addition, you can obtain Cobertura (which is not a Python package, and thus cannot be installed via pip) from the central Cobertura repository.

You may also want to install VirtualBox. You can download VirtualBox from here: https://www.virtualbox.org/wiki/Downloads (look under "VirtualBox 5.2.4 platform packages"). For Mac OS you get a .dmg, so installation should be straightforward. You then need an image. One student reported success with an official Ubuntu disk image from https://www.ubuntu.com/download/desktop. You then create a VM within VirtualBox, using these instructions: https://askubuntu.com/questions/142549/how-to-install-ubuntu-on-virtualbox.

If you instead want to use Vagrant (like VirtualBox, but without the GUI overhead), these slides from EECS 485 may help: https://drive.google.com/open?id=0B85QsxlI9S_RVEdHWTVXZlg0YW8. One student's steps:

  1. Download VirtualBox and Vagrant
  2. cd into your 481 directory or any directory you want to use for your project
  3. Run the following commands:
    $ vagrant init bento/ubuntu-16.04
    $ vagrant up
    $ vagrant ssh
    $ cd /vagrant
    $ ls  # you should see all the files from your 481 directory here
      

(Special thanks to Ben Reeves and Zi Yang and Josiah Bruner and Trent Zaranek and Anonymous.)