In this assignment you will create three high-coverage test suites, one for each of three different programs spanning three different application domains and three different programming languages. You will then be asked to write a short report reflecting on the activity.
Two of the key properties found in industrial software engineering, but not in most undergraduate academic coursework, are: the program is large and you did not write the program. If you are hired by Microsoft as a software engineer, they will not ask you to write Office from scratch. Instead, as discussed in class, the vast majority of software engineering is the comprehension and maintenance of large legacy codebases (i.e., old code you did not write).
Thus, there is no particular assumption that you are familiar with any of the subject programs in this assignment. Indeed, that is the point: you will be applying testing concepts to unknown software.
You may work with a partner for this assignment. If you do, you must use the same partner for all sub-components of this assignment. Only one partner needs to submit the report on Gradescope, but you do need to use Gradescope's interface to select your partner. (Here is a video showing Gradescope partner selection.) You may use files, benchmarks or resources from the Internet (unlike in many other classes), provided you cite them in your written report.
A recurring theme in this course is a focus on everything except writing code. Perhaps surprisingly, other elements, such as reading code, testing code, eliciting requirements, debugging code, and planning projects, are usually more important in industry than simply writing software. Your other electives cover programming — this one covers software engineering.
The assignments in this class are often more like "open ended puzzles". You won't be writing much code. Instead, you will be reading and using code written by other people. (Yes, that's annoying — that's the point!)
Among other things, we hope this assignment will give students exposure to: statement (or line) coverage, branch coverage, legacy code bases, white-box testing, black-box testing, writing your own tests from scratch, slightly editing existing tests to improve coverage, thinking about high-coverage inputs from first principles, and using other available resources.
The parts of this assignment may be different from what you are used to. HW1a focuses on white-box testing and looking at the code. In HW1b, there is too much code for you to read it all, so you will have to think of other black-box approaches. HW1c will likely require a combination of efforts.
It is your responsibility to download, compile, and run the three subject programs. Getting the code to work is part of the assignment. You can post on the forum for help and compare notes bemoaning various architectures (e.g., windows vs. mac vs. linux, etc.). Ultimately, however, it is your responsibility to read the documentation for these programs and utilities and use some elbow grease to make them work.
If you are having trouble (e.g., windows is often a bit more complicated than linux for these things), post on the forum. Other students should be able to offer help and advice.
For example, one TA reports that gcc may not easily support static linking on a Mac, but that it took fewer than two minutes to set everything up in Ubuntu.
If relevant, the grading server uses Python 2.7.12, gcc (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609 and javac 1.8.0_151 for this assignment.
The TAs have suggested that some components of this assignment, such as HW1b, are very difficult to get set up natively on a Mac. Just use a linux virtual machine.
There are three subject programs for this assignment. Each program has varying characteristics, input formats, and coverage goals. Your work for each subject program is submitted separately to the grading server (as HW1a, HW1b and HW1c).
The first subject program is a simple python implementation of an abstract data type: the AVL Tree container data structure.
Test cases for this program take the form of sequences of instructions for manipulating a single global AVL tree:
2. / \ 1 5 /\ /\ 3 /\
We compute coverage for this program using the coverage package for Python. Both statement and branch coverage are considered.
The reference implementation is available here. The program is about 300 lines long and is a self-contained single file.
A test suite for this application is a collection of .avl files. The reference starter test suite is available here.
Installing coverage may look like this on your Ubuntu setup:
$ sudo apt-get install python-pip python-dev build-essential
$ sudo pip install --upgrade pip
$ sudo pip install --upgrade virtualenv
$ sudo pip install coverage
Note that apt-get is Debian/Ubuntu-specific. If you have another Linux distribution you may need to use something else (yum on Red Hat, etc.). If you're using a Mac, read this entire page and note the multiple paragraphs about Macs: while HW1A is fairly reasonable on a Mac, HW1B may not be, and you may want to consider setting up a virtual machine now. Check the forum for student hints.
After that, you can use the coverage utility on multiple test cases and view combined coverage results:
$ coverage run --append avl.py simple1.avl
1 /\ 2 /\
$ coverage report
Name Stmts Miss Cover
----------------------------
avl.py 182 99 46%
$ coverage run --append avl.py simple2.avl
2. / \ 1 5 /\ /\ 3 /\
$ coverage report
Name Stmts Miss Cover
----------------------------
avl.py 182 54 70%
$ coverage erase
Note how the measured coverage of only the first test is low (46%) but the combined coverage result for the two tests is higher. The --append option is necessary to avoid overwriting the stored coverage data with each new test run.
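To see why the combined number rises, it may help to treat each run's coverage as a set of executed line numbers, so that appending amounts to taking a union. The following is a minimal sketch of that arithmetic only, not a description of how the coverage package stores its data. The function names and the toy line sets are hypothetical; the 182/99/54 figures come from the transcript above.

```python
def percent_covered(total_statements, missed):
    """Statement coverage as a percentage, like the Cover column."""
    return 100.0 * (total_statements - missed) / total_statements


def combine_runs(*runs):
    """Union the sets of executed line numbers from several runs."""
    combined = set()
    for executed in runs:
        combined |= executed
    return combined


# Figures from the transcript: 182 statements, 99 missed, then 54 missed.
print(round(percent_covered(182, 99)))  # 46
print(round(percent_covered(182, 54)))  # 70

# Hypothetical toy example: two runs over a small program.
run1 = {1, 2, 3}
run2 = {1, 4, 5}
print(sorted(combine_runs(run1, run2)))  # [1, 2, 3, 4, 5]
```

A second test only raises the combined figure by the lines it executes that no earlier test reached, which is why two tests at 46% each need not sum to 92%.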
Now we consider branch coverage. Simply add --branch:
$ coverage run --append --branch avl.py simple1.avl
$ coverage run --append --branch avl.py simple2.avl
$ coverage report
Name Stmts Miss Branch BrPart Cover
------------------------------------------
avl.py 182 54 90 19 66%
$ coverage report -m
Name Stmts Miss Branch BrPart Cover Missing
----------------------------------------------------
avl.py 182 54 90 19 66% 82-85, 88, 101, 109-112, 121, 123-127, 139-141, 145, 189, 201-202, 211, 216, 223-238, 244-248, 253-254, 284, 286-292, 304-306, 81->82, 87->88, 100->101, 107->109, 120->121, 122->123, 138->139, 144->145, 171->exit, 210->211, 212->214, 215->216, 243->244, 250->253, 283->284, 285->286, 301->303, 303->304, 319->312
$ coverage erase
The additional columns give branch-specification information (e.g., about partial branches; see the manual), but the final column now gives the branch coverage. The columns may look different on other setups (e.g., on CAEN machines) — that's fine. Note also how the missing column guides you to line numbers that are not exercised by the current suite.
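If the Missing column gets long, a small script can turn it into a to-do list. The sketch below is a hypothetical helper (parse_missing is not part of the coverage package) that assumes the comma-separated format shown in the transcript above: plain line numbers, line ranges such as 82-85, and branch arcs such as 81->82 or 171->exit.

```python
def parse_missing(missing):
    """Split a Missing column into uncovered lines and untaken branch arcs."""
    lines, arcs = [], []
    for item in (part.strip() for part in missing.split(",")):
        if not item:
            continue
        if "->" in item:
            # A branch arc: source line, then destination (may be "exit").
            source, target = item.split("->")
            arcs.append((int(source), target))
        elif "-" in item:
            # An inclusive range of uncovered lines.
            first, last = item.split("-")
            lines.extend(range(int(first), int(last) + 1))
        else:
            lines.append(int(item))
    return lines, arcs


lines, arcs = parse_missing("82-85, 88, 81->82, 171->exit")
print(lines)  # [82, 83, 84, 85, 88]
print(arcs)   # [(81, '82'), (171, 'exit')]
```

The arc destinations are kept as strings because not every destination is a line number.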
There is external written documentation available for this utility. Finally, note that this utility can also produce XML and HTML reports, which you may find easier to interpret.
The second subject program is the portable network graphics file format reference library, libpng. It is used for image manipulation and conversion.
We will be using the developer-provided pngtest driver. It takes as input a single .png file.
We compute coverage for this program using the gcov test coverage program for gcc. Only statement coverage is considered.
The reference implementation (version 1.6.34) is available here. It contains about 85,000 lines of code spread over about 70 files.
A test suite for this application is a collection of .png files. The reference implementation comes with over 150 such example files. Feel free to use an image manipulation or conversion program to create your own files or to use files or benchmarks you find on the Internet.
The difficulty difference between HW1a and HW1b is large. (For example, students may report that HW1a takes "minutes" while HW1b takes "hours". In addition, many students find that the difficulty lies more in compilation and instrumentation than in high-coverage images.)
The gcov utility usually comes with gcc, so you should not need to do anything specific to install it. (Unless you are trying to use a Mac natively despite our warnings not to do so. In such a world you will likely encounter the issue that gcc static linking may not work on a Mac and that using dynamic linking will give you 0% coverage. You should then install linux in a VM.) However, our subject program depends on the development version of a compression library:
$ sudo apt-get install libz-dev
In addition, you will have to compile the project with coverage instrumentation (taking care to use static linking or to otherwise make all of the libpng source files, and not just the test harness, visible to coverage):
$ cd libpng-1.6.34
$ sh ./configure CFLAGS="--coverage -static"
$ make clean ; make
$ ./pngtest pngtest.png
Testing libpng version 1.6.34
...
$ gcov *.c
...
File 'png.c'
Lines executed:37.78% of 1236 # your numbers may differ!
Creating 'png.c.gcov'
File 'pngerror.c'
Lines executed:16.67% of 252 # your numbers may differ!
Creating 'pngerror.c.gcov'
...
Lines executed:28.71% of 10606 # your numbers may differ!
$ ./pngtest contrib/gregbook/toucan.png
...
$ gcov *.c
...
File 'pngwutil.c'
Lines executed:61.90% of 979 # your numbers may differ!
Creating 'pngwutil.c.gcov'
Lines executed:30.72% of 10606 # your numbers may differ
# and everything is still fine!
$ rm *.gcda pngout.png
Note how gcov gives a separate report for each source file specified on the command line, but also gives a sum total for statement coverage (the final Lines executed reported). Your coverage numbers may be slightly different from the example ones listed above.
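Since the per-file output scrolls by quickly, you may want to capture it and tabulate it. The sketch below is one hypothetical way to do that in Python, assuming the "File '...'" and "Lines executed:P% of N" line shapes shown in the transcript above. The helper names are made up, and the executed-line counts are back-computed from rounded percentages, so they are approximate.

```python
import re

FILE_RE = re.compile(r"^File '(.+)'$")
LINES_RE = re.compile(r"^Lines executed:([\d.]+)% of (\d+)")


def summarize_gcov(output):
    """Map each source file to (approx. executed lines, total lines)."""
    per_file = {}
    current = None
    for raw in output.splitlines():
        line = raw.strip()
        match = FILE_RE.match(line)
        if match:
            current = match.group(1)
            continue
        match = LINES_RE.match(line)
        # The final summary line has no preceding File line, so it is skipped.
        if match and current is not None:
            total = int(match.group(2))
            executed = round(float(match.group(1)) * total / 100.0)
            per_file[current] = (executed, total)
            current = None
    return per_file


def overall_percent(per_file):
    """Line-weighted coverage across all files in the summary."""
    executed = sum(e for e, _ in per_file.values())
    total = sum(t for _, t in per_file.values())
    return 100.0 * executed / total if total else 0.0


sample = """File 'png.c'
Lines executed:37.78% of 1236
Creating 'png.c.gcov'
File 'pngerror.c'
Lines executed:16.67% of 252
Creating 'pngerror.c.gcov'
"""
print(summarize_gcov(sample))
```

Sorting the resulting dictionary by uncovered lines (total minus executed) is one way to decide which source file to read first.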
The png.c.gcov (etc.) files created by this process are marked-up source files that include coverage and visitation information. You can view them to determine which lines and branches were covered. Note that lcov, a graphical front-end to gcov, may help you interpret that output for large projects. While it is possible to obtain branch coverage using gcov (e.g., -b -c) we will not for this assignment.
Note that pngtest creates pngout.png. (This can confuse some students, because if you collect coverage after running on *.png and then do it again, you may get a different answer the second time if you're not careful because of the newly-added file!)
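One way to avoid that pitfall is to build the list of inputs explicitly and leave out the generated file. A minimal sketch, assuming pngout.png is the only generated file in the directory (select_inputs is a hypothetical helper, not part of libpng):

```python
def select_inputs(filenames, generated=("pngout.png",)):
    """Keep .png test inputs, skipping files that pngtest itself writes."""
    return sorted(name for name in filenames
                  if name.endswith(".png") and name not in generated)


print(select_inputs(["toucan.png", "pngout.png", "pngtest.png", "README"]))
# ['pngtest.png', 'toucan.png']
```

In practice the filenames argument might come from os.listdir("."), taken once before each coverage run.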
The final subject program is the JFreeChart chart creation library. It is used to produce quality charts for a variety of applications and files.
We will be writing tests against its programmable API. Each test case will thus be a .java file that invokes methods from the chart-creation library.
We compute coverage for this program using the cobertura bytecode test coverage utility. Both statement and branch coverage are considered.
The reference implementation (version 1.5.0) is available here. It contains about 300,000 lines of code spread over about 640 files.
A test suite for this application is a collection of .java files. The reference starter test suite is available here. Note that our grading server does not support graphical user interfaces, so any tests that create Java/AWT displays or widgets or GUI elements are unlikely to work.
First, you will need to install cobertura to collect Java coverage and maven to build this project:
$ sudo apt-get install cobertura maven
Then you will need to build the project:
$ tar zxf jfreechart-1.5.0.tar.gz
$ tar zxf jfreechart-tests.tar.gz
$ cd jfreechart-1.5.0
$ EDIT pom.xml # search for maven-compiler-plugin and
               # add "debug" and "debuglevel" properties
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-compiler-plugin</artifactId>
  <version>3.7.0</version>
  <configuration>
    <source>${project.source.level}</source>
    <target>${project.target.level}</target>
    <encoding>${project.build.sourceEncoding}</encoding>
    <debug>true</debug>
    <debuglevel>lines,vars,source</debuglevel>
  </configuration>
</plugin>
$ mvn compile
$ cd target/classes
$ cobertura-instrument org/
Cobertura 1.9.4.1 - GNU GPL License (NO WARRANTY) - See COPYRIGHT file
Instrumenting 1 file
WARN visitEnd, No line number information found for class org.jfree.chart.axis.DateAxis$1. Perhaps you need to compile with debug=true?
Cobertura: Saved information on 661 classes.
Instrument time: 835ms
$ javac -cp /usr/share/java/cobertura.jar:. test/ChartTest1.java
$ java -cp /usr/share/java/cobertura.jar:. test/ChartTest1
output1.png created
Flushing results...
Flushing results done
Cobertura: Loaded information on 661 classes.
Cobertura: Saved information on 661 classes.
$ cobertura-check --totalline 100
Cobertura 1.9.4.1 - GNU GPL License (NO WARRANTY) - See COPYRIGHT file
Cobertura: Loaded information on 661 classes.
Project failed check. Total line coverage rate of 3.3% is below 100.0%
$ cobertura-check --totalbranch 100
Cobertura 1.9.4.1 - GNU GPL License (NO WARRANTY) - See COPYRIGHT file
Cobertura: Loaded information on 661 classes.
Project failed check. Total branch coverage rate of 1.4% is below 100.0%
$ javac -cp /usr/share/java/cobertura.jar:. test/ChartTest2.java
$ java -cp /usr/share/java/cobertura.jar:. test/ChartTest2
$ cobertura-check --totalline 100
Project failed check. Total line coverage rate of 3.5% is below 100.0%
$ cobertura-check --totalbranch 100
Project failed check. Total branch coverage rate of 1.6% is below 100.0%
You can run cobertura-report --destination report to make a report/ subdirectory containing source files annotated with coverage information. You can also run cobertura-check with different arguments to get per-file numerical coverage reports.
Note: If you want to clear information and recalculate coverage, some report that merely deleting cobertura.ser will not work, and you should restart from mvn compile and cobertura-instrument org/ in the instructions above. (If you do not you may either see coverage reports that are too high or coverage reports that are entirely blank.)
Note that there is a much easier way to run cobertura for standard Java development:
$ mvn cobertura:cobertura
This instructs maven to automatically download the cobertura plugin, run the project tests, and place an HTML report in ${project}/target/site/cobertura/index.html. We will return to this convenient approach in Homework 2, but for now, we want you to gain experience with nitty-gritty command-line approaches to coverage that do not abstract away details.
(If you ignored our warnings and tried to do this on a Mac, you likely ran into a number of issues. You may find that you need to download cobertura yourself and convert all of its shell scripts using something like dos2unix. You may have to handle the Java NoClassDefFoundError exceptions by installing slf4j and putting slf4j-api.jar and Slf4j-nop.jar in your classpath. You may then have to change all of the commands and paths to work on a Mac, and even if you do this you'll find that the coverage reported is a little bit off from what is reported above and what is done on the grading server — and the grading server measurements count. Or you could install linux on a virtual machine.)
You must also write a short two-paragraph report reflecting on your experiences creating high-coverage tests for this assignment. (If you are working with a partner, indicate as much in the text or header. Recall from above that you need only submit one copy of the report, but if you both do, nothing bad happens. Select your partner on Gradescope.) Consider addressing points such as:
Rubric:
There is no explicit format (e.g., for headings or citations) required. You may discuss other activities (e.g., validation or automation scripts you wrote, etc.) as well.
The grading staff may select a small number of excerpts from particularly high-quality or instructive reports and share them with the class. If your report is selected you will receive extra credit.
You will almost certainly need to inspect the source code of the various programs (i.e., white-box testing) to construct high-coverage test suites.
For each program, start with any test cases we have provided. Many students also start with image files (for HW1b) or coding hints (for HW1c) they find on the Internet. Because of limits on autograding submissions, and so that you learn how coverage computations work across various systems (a relevant skill for interviews, etc.), you will almost certainly want to install the coverage utilities yourself. As you read each project's source code and expand your test suite, recompute the coverage locally. When you perceive that it has gone up significantly, submit to the grading server for a more official check.
Note the increasing size and complexity of the subject programs. Splitting your time evenly between them is probably not the optimal allocation. The first project is due earlier to prevent students from putting things off until the last minute.
For each of the three programmatic components (HW1a, HW1b, HW1c), submit your (individual, not zipped) test cases via the autograder.io autograding server. You may submit multiple times (see the grading server for details). For the written component (HW1d), submit your PDF report via Gradescope.
The exact grading rubric (e.g., x points for y coverage) can be seen in the submission feedback details. As of 2019, the grading rubric scales roughly between:
Note that your local results may differ largely from the autograder results and that is fine and expected. For example, you may observe 6.8% line coverage "at home" and 16.8% on the autograder. (Different compilers, header files, optimizations, etc., may result in different branch and line counts.) Your grade is based on the autograder results, regardless of what you see locally.
We always use your best autograder submission result (even if your latest result is worse) for your grade.
Note that you are typically limited to a maximum number of test cases (i.e., 50 files totalling 30 megabytes). If you find that you have created or found more than that, minimize your test suite by prioritizing those that most increase coverage. (You might, as an example, use a for loop to add one test to the suite at a time, noting the new coverage after each addition, and use that information to make your decision.)
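The loop described above can be sketched as a greedy selection. This is only one possible approach, not a required or official one. It assumes you have already recorded, for each test, the set of lines (or any other coverage items) it executes; greedy_minimize and the example file names are hypothetical.

```python
def greedy_minimize(coverage_by_test, max_tests=50):
    """Repeatedly pick the test that adds the most not-yet-covered items."""
    remaining = dict(coverage_by_test)
    chosen, covered = [], set()
    while remaining and len(chosen) < max_tests:
        # Sorting the names makes tie-breaking deterministic.
        best = max(sorted(remaining),
                   key=lambda name: len(remaining[name] - covered))
        gain = remaining.pop(best) - covered
        if not gain:
            break  # no remaining test adds anything new
        chosen.append(best)
        covered |= gain
    return chosen, covered


# Hypothetical coverage data: test name -> set of covered line numbers.
suite = {
    "a.png": {1, 2, 3},
    "b.png": {3, 4},
    "c.png": {1, 2},
    "d.png": {4},
}
print(greedy_minimize(suite))  # (['a.png', 'b.png'], {1, 2, 3, 4})
```

Greedy selection is not guaranteed to find the smallest possible suite, but it is simple and usually gets close. Remember that the grading server's measurements, not your local ones, determine your grade.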
Note also that the grading server has resource usage (e.g., CPU time) caps. As a result, if you submit a test suite that is very large or long-running, we may not be able to evaluate your submission. Consider both quantity and quality.
Feel free to scour the web (e.g., Stack Overflow, etc.) or this webpage (e.g., the example tests shown above) or the tarballs (e.g., yes, you can submit pngtest.png or toucan.png if you want to) for ideas and example images to use directly as part of your answer (with or without modification) — just cite your sources (or URLs) in the report. However, submissions are limited to 50 test cases (so just finding a big repository of two hundred images may not immediately help you without additional work) totalling 30 megabytes. In addition, you may never submit another student's work (images or test selection) as your own.
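Before submitting, it may be worth checking your suite against those limits locally. The sketch below is a hypothetical pre-submission check; it assumes "30 megabytes" means 30 * 1024 * 1024 bytes, which the grading server may or may not agree with.

```python
MAX_FILES = 50
MAX_TOTAL_BYTES = 30 * 1024 * 1024  # assumption: 1 megabyte = 2**20 bytes


def check_suite_limits(sizes_by_file, max_files=MAX_FILES,
                       max_bytes=MAX_TOTAL_BYTES):
    """Return a list of human-readable problems (empty if within limits)."""
    problems = []
    if len(sizes_by_file) > max_files:
        problems.append("too many files: %d > %d"
                        % (len(sizes_by_file), max_files))
    total = sum(sizes_by_file.values())
    if total > max_bytes:
        problems.append("suite too large: %d > %d bytes" % (total, max_bytes))
    return problems


print(check_suite_limits({"pngtest.png": 8574, "toucan.png": 120000}))  # []
```

The sizes in the example are made up. For real files, the dictionary could be built with os.path.getsize on each path in your suite.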
As befitting a 400-level elective, this assignment is somewhat open-ended. There is no "one true path" to manually creating high-coverage test suites.
You may find yourself thinking something like "I know what to do at a high level, but I don't know specifically what to create here. I could try to read and understand the program to make tests that cause it to do different things, but reading and understanding the program takes a long time, especially since I didn't write this code and I don't really care about it and I'm only doing it because someone is telling me to." Such concerns are, in some sense, the point: they are indicative of industrial software engineering.
History suggests that HW1b is the hardest, HW1c is moderate, and HW1a is the easiest. While your results may vary, feel free to use this information when planning your time.
Finally, students sometimes wonder if it is possible to get 100% coverage on 1a. While it should be possible to get 100% coverage for the general AVL tree algorithm, this particular implementation of the algorithm may well have branches that you are unable to reach — especially given this testing setup.
In this section we detail previous student issues and resolutions:
Question: When I type the command sudo apt-get install python-pip python-dev build-essential I am prompted for a password — which should I use?
Answer: The password should be the one associated with your user account (the one you used to log in to the OS). If you are using a virtual machine, it is the virtual machine login password, not your "real" login password.
Question: I get
The program 'make' can be found in the following packages: * make
Answer: You need to do sudo apt-get install make or similar. (If you are uncertain about installing and using make, you may want to drop the class this semester and take it again next year after taking a few other courses; this class assumes this background.)
Question: Can I really use images or code snippets I found online to help with this assignment?
Answer: Yes, provided that you cite them at the end of your report. This class is about "everything except writing code", in some sense, and in the real world people do everything they can to get high-coverage test suites (including paying people to construct them and using special tools). So you can use image files or benchmarks you find on the Internet if you like — but you'll still have to do the work of paring things down to a small number of high-coverage files. Similarly, you can use Java code snippets if you like — but note that the grading server does not support junit or graphics, so you may have to manually edit them. You'll likely get the most out of the assignment if you use a combination of white-box testing, black-box testing and searching for resources — but there are many ways to complete it for full credit.
Question: For HW1b:
./.libs/libpng16.a(pngrutil.o): In function `png_inflate_claim':
pngrutil.c:(.text+0xd5b): undefined reference to `inflateValidate'
collect2: error: ld returned 1 exit status
Makefile:1005: recipe for target 'pngfix' failed
or
Note, selecting 'zlib1g-dev' instead of 'libz-dev'
Answer: You do not have libz-dev installed correctly. One student eventually resolved this by installing an Ubuntu virtual machine.
Question: Is there a recommended Ubuntu version for use in a virtual machine?
Answer: Students report that Ubuntu 16 (in particular, 16.04) works "flawlessly" while Ubuntu 18 (e.g., 18.10 and 18.04) both give some issues for HW1b and HW1c. From the professor's perspective, there is nothing Ubuntu-specific in the assignment or the learning goals, so any setup you can get to work is fine.
Question: I get this error
make: *** No targets specified and no makefile found. Stop.
Answer: You need to run configure before running make. Double-check the build and installation instructions above.
Question: When I try to run
gcc -o pngtest pngtest.c
It chokes with undefined reference errors.
Answer: Yes — pngtest requires a number of libraries to build. You should follow the installation instructions above (run "configure" then "make", etc.).
Question: For HW1c:
Warning: JAVA_HOME environment variable is not set
or
[ERROR] COMPILATION ERROR :
[INFO] -------------------------------------------------------------
[ERROR] No compiler is provided in this environment. Perhaps you are running on a JRE rather than a JDK?
[INFO] 1 error
Answer: Try sudo apt-get install openjdk-8-jdk
See here for more information.
Question: For HW1c, I get decent coverage locally, but when I submit, I get 0.
Answer 1: Click on the right-facing arrow to expand the item to gain additional information. In one student's case, the line
test/AreaChartTest.java:43: error: package org.junit does not exist
... revealed that the student was using a test that required JUnit (which is not present on the grading server). This can happen if you try to use a test from JFreeChart or the web unchanged.
Answer 2: In another student's case, the line (click on the small triangle icon in your submission report to expand it)
Exception in thread "main" java.lang.IllegalArgumentException: Series index out of bounds
revealed that the student was submitting tests that would raise exceptions before covering many lines.
Answer 3: In another case, the line:
No X11 DISPLAY variable was set, but this program performed an operation which requires it.
revealed that the student was submitting a test that made use of a graphical user interface ("X11" is the Linux/Unix graphical UI). Those are not supported on the grading server.
Question: For HW1c, on the generated HTML report, I get Unable to locate org/jfree/chart/ChartColor.java. Have you specified the source directory?
Answer: Try specifying the source directory when you generate the report, using something like:
cobertura-report --destination report srcdir=" ../../src/main/java"
Question: For HW1c, I get:
javac: file not found: test/ChartTest1.java
Answer: Download the test tarball and unpack it.
Question: For HW1c, when I am trying to run my own tests, I get:
Error: could not find or load main class test.insertfilenamehere
Answer: Try adding package test; to your files.
Question: For HW1d, can you post guidelines or points of interest for writing high-quality or instructive reports?
Answer: I regret — I cannot. At least, not in the way one is likely to desire when posing such a query. You can potentially reverse engineer things a bit, but ultimately there's no magic formula.
The grading staff may well select a small number of excerpts that are worth sharing with the class. You can infer from that that we'll be favoring excerpts that are "family friendly" and that contribute to pedagogical goals: either by bringing something new to light, or by reinforcing or reinterpreting concepts from class. Given my eclectic tastes and views on liberal arts education, however, this could potentially be anything from a strong link between, say, testing and rhetoric, to some direct point about software maintenance. Usually these sorts of things end up being awarded to students who have clearly spent a bit more time than usual on the assignment. (If you are being very mercenary, it's probably not worth shooting for an ill-defined unsure thing.)
Question: How do other students tackle HW1b?
Answer: Here's a public answer from one former student describing the process:
Question: I am running the gcov command gcov *.c and am successfully getting the per-file output, but not the "Lines executed:28.71% of 10606" output. Any ideas why?
Answer: A student reports solving this by using Ubuntu 16 (presumably instead of a different Linux or Ubuntu version).
Using a Mac is not supported by the course staff; it is your responsibility to get the software up and running. That said, we provide a number of helpful student hints on an "as is" basis.
Some students suggest that you may be able to get things working with a virtual environment:
$ python3 -m venv env
$ source env/bin/activate
(env) $ echo "Now I am in a virtual environment!"
(env) Now I am in a virtual environment!
(env) $ pip install coverage
Now you can run coverage:
(env) $ coverage run --append avl.py simple1.avl
To exit the virtual environment:
(env) $ deactivate
$ echo "Left virtual environment."
Left virtual environment.
In addition, you can obtain Cobertura (which is not a Python package, and thus cannot be installed via pip) from the central Cobertura repository.
You may also want to install VirtualBox. You can download VirtualBox from here: https://www.virtualbox.org/wiki/Downloads (look under "VirtualBox 5.2.4 platform packages"). For Mac OS you get a .dmg so installation should be straightforward. You then need an image. One student reported success with an official Ubuntu disk image from https://www.ubuntu.com/download/desktop. You then create a VM within VirtualBox, using these instructions: https://askubuntu.com/questions/142549/how-to-install-ubuntu-on-virtualbox.
If you instead want to use Vagrant (like VirtualBox, but if you don't need the entire GUI overhead), these Slides from EECS 485 may help: https://drive.google.com/open?id=0B85QsxlI9S_RVEdHWTVXZlg0YW8. One student's steps:
$ vagrant init bento/ubuntu-16.04
$ vagrant up
$ vagrant ssh
$ cd /vagrant
$ ls # you should see all the files from your 481 directory here
(Special thanks to Ben Reeves and Zi Yang and Josiah Bruner and Trent Zaranek and Anonymous.)