Lectures and Details

Meetings

Monday/Wednesday, 2:00-3:15pm, Rice 340

Instructor

Westley Weimer

Office Hours

Mondays 11:00-11:30am Rice 423 and Wednesdays 11:00-Noon Rice 423

Breadth Area

This course counts for the Software Systems breadth area.

Prerequisites

Students are expected to have enough background in programming languages, software engineering, compilers, security, and theory to be able to understand research papers from PL and SE conferences. One previous PL or SE course at the undergraduate or graduate level should suffice for the motivated student.

Overview

This special topics course is a research seminar in software quality, analysis and transformation. The course will focus on active research areas in programming languages and software engineering, but the specific topics will be largely determined by a combination of instructor fiat and the interests of the students.

First-Year Graduate Students

This course currently conflicts with CS 6190. If you are a first-year graduate student and would like to take this course instead, you may defer CS 6190 until next year.

Note that CS 6190 is very important for PhD students and Master's Thesis students for finding a research advisor. If you do not yet have an advisor, you should probably take CS 6190 your first semester. However, if you are in our coursework-based professional master's program or if you already have an advisor in mind (or plan to work with one from another class you are taking, etc.), you can defer CS 6190.

Structure and Presentations

Randomized Presentations

For each paper discussion I will choose up to three students at random to give a five-minute presentation.

Each selected student will give a five-minute presentation that at least (1) summarizes the work, (2) lists its strengths, and (3) lists ways in which it might be improved in a subsequent publication. Including other information, such as your opinion about the work or its relation to other projects, is recommended.

The goals of this approach are to encourage all participants to read the material thoroughly in advance, to provide jumping-off points for detailed discussions, and to allow me to evaluate participation.

There is no project component to this course — your primary responsibility is to prepare five minutes of cogent discussion for each paper.

Grading

Grading will be based entirely on participation. One sufficient condition for an "A" grade is to do a good job whenever you are called on to present a paper and to interject insightful comments into the discussions of other papers.

Reading List

Read Two Ahead

You are responsible for reading and preparing at least the next two undiscussed papers for any given class meeting.

On average, we will discuss three papers a week (devoting perhaps 50 minutes to each paper). However, some papers may merit more or less discussion.

The next paper to discuss is shown highlighted. You should read and prepare at least the next two-to-three papers before the next meeting.

General Systems Research

Todd Mytkowicz, Amer Diwan, Matthias Hauswirth, Peter F. Sweeney: Producing wrong data without doing anything obviously wrong! ASPLOS 2009: 265-276
Christian Bird, Adrian Bachmann, Eirik Aune, John Duffy, Abraham Bernstein, Vladimir Filkov, Premkumar T. Devanbu: Fair and balanced?: bias in bug-fix datasets. ESEC/SIGSOFT FSE 2009: 121-130
Static Bug Detection
Cormac Flanagan, K. Rustan M. Leino, Mark Lillibridge, Greg Nelson, James B. Saxe, Raymie Stata: Extended Static Checking for Java. PLDI 2002: 234-245
(Ten-Year Most Influential Paper Award)
Dawson R. Engler, David Yu Chen, Andy Chou: Bugs as Inconsistent Behavior: A General Approach to Inferring Errors in Systems Code. SOSP 2001: 57-72
Nathaniel Ayewah, David Hovemeyer, J. David Morgenthaler, John Penix, William Pugh: Using Static Analysis to Find Bugs. IEEE Software 25(5): 22-29 (2008)
[ Project Website ] [ Unrelated-But-Good Slides ]
- Related but optional:
  Joseph R. Ruthruff, John Penix, J. David Morgenthaler, Sebastian G. Elbaum, Gregg Rothermel: Predicting accurate and actionable static analysis warnings: an experimental approach. ICSE 2008: 341-350
- Al Bessey, Ken Block, Ben Chelf, Andy Chou, Bryan Fulton, Seth Hallem, Charles Henri-Gros, Asya Kamsky, Scott McPeak, Dawson Engler: A Few Billion Lines of Code Later: Using Static Analysis to Find Bugs in the Real World. Communications of the ACM Vol. 53 No. 2, Pages 66-75
Automated Debugging
Andreas Zeller: Yesterday, My Program Worked. Today, It Does Not. Why? ESEC / SIGSOFT FSE 1999: 253-267
(ACM SIGSOFT 10 Year Impact Award) [ Keynote Talk Video ] [ Project Website ] [ Additional Free Implementation ]
James A. Jones, Mary Jean Harrold: Empirical evaluation of the tarantula automatic fault-localization technique. ASE 2005: 273-282
[ Project Website ]
- Related but optional:
  Thomas Ball, Mayur Naik, Sriram K. Rajamani: From symptom to cause: localizing errors in counterexample traces. POPL 2003: 97-105
Chris Parnin, Alessandro Orso: Are automated debugging techniques actually helping programmers? ISSTA 2011: 199-209
Program Repair — Origins
David E. Lowell, Subhachandra Chandra, Peter M. Chen: Exploring Failure Transparency and the Limits of Generic Recovery. OSDI 2000: 289-304
George C. Necula, Scott McPeak, Westley Weimer: CCured: type-safe retrofitting of legacy code. Symposium on Principles of Programming Languages (POPL) 2002: 128-139
(10-Year Most Influential Paper Award)
- Related but optional, journal extension: George C. Necula, Jeremy Condit, Matthew Harren, Scott McPeak, Westley Weimer: CCured: type-safe retrofitting of legacy software. ACM Trans. Program. Lang. Syst. 27(3): 477-526 (2005)
Westley Weimer: Patches as better bug reports. Conference on Generative Programming and Component Engineering (GPCE) 2006: 181-190

Program Repair — Techniques

Claire Le Goues, Michael Dewey-Vogt, Stephanie Forrest, Westley Weimer: A Systematic Study of Automated Program Repair: Fixing 55 out of 105 bugs for $8 Each. International Conference on Software Engineering (ICSE) 2012: 3-13
[ Project Website ]
- Related but optional:
  Westley Weimer, ThanhVu Nguyen, Claire Le Goues, Stephanie Forrest: Automatically finding patches using genetic programming. ICSE 2009: 364-374
  (Best Paper Award)
Dongsun Kim, Jaechang Nam, Jaewoo Song, Sunghun Kim: Automatic patch generation learned from human-written patches. ICSE 2013: 802-811
(Best Paper Award)
Martin Monperrus: A critical review of "automatic patch generation learned from human-written patches": essay on the problem statement and the evaluation of automatic software repair. ICSE 2014: 234-242
Fan Long, Martin Rinard: Staged program repair with condition synthesis. ESEC/SIGSOFT FSE 2015: 166-178
Sergey Mechtaev, Jooyong Yi, Abhik Roychoudhury: Angelix: scalable multiline program patch synthesis via symbolic analysis. ICSE 2016: 691-701
[ Project Website ]

Dataflow Analysis

Thomas W. Reps, Susan Horwitz, Shmuel Sagiv: Precise Interprocedural Dataflow Analysis via Graph Reachability. POPL 1995: 49-61
Glenn Ammons, James R. Larus: Improving data-flow analysis with path profiles (with retrospective). Best of PLDI 1998: 568-582
(Best Paper Award)
Sumit Gulwani, George C. Necula: Precise interprocedural analysis using random interpretation. POPL 2005: 324-337
Ben Hardekopf, Calvin Lin: The ant and the grasshopper: fast and accurate pointer analysis for millions of lines of code. PLDI 2007: 290-299
(Best Paper Award)

Natural Language Analysis

Mark Gabel, Zhendong Su: A study of the uniqueness of source code. SIGSOFT FSE 2010: 147-156
Abram Hindle, Earl T. Barr, Zhendong Su, Mark Gabel, Premkumar T. Devanbu: On the naturalness of software. ICSE 2012: 837-847

Binary Analysis

Peter B. Kessler: Fast breakpoints: design and implementation (with retrospective). Best of PLDI 1990: 390-397
(Best Paper Award)
Amitabh Srivastava, Alan Eustace: ATOM: a system for building customized program analysis tools (with retrospective). Best of PLDI 1994: 528-539
(Best Paper Award)
Chi-Keung Luk, Robert S. Cohn, Robert Muth, Harish Patil, Artur Klauser, P. Geoffrey Lowney, Steven Wallace, Vijay Janapa Reddi, Kim M. Hazelwood: Pin: building customized program analysis tools with dynamic instrumentation. PLDI 2005: 190-200
(Ten-Year Most Influential Paper Award)
Junghee Lim, Thomas W. Reps: A System for Generating Static Analyzers for Machine Instructions. CC 2008: 36-52
(Best Paper Award) [ Slides ]

Test Case Generation

Patrice Godefroid, Nils Klarlund, Koushik Sen: DART: directed automated random testing. PLDI 2005: 213-223
- Related but optional:
  American Fuzzy Lop tool
Cristian Cadar, Daniel Dunbar, Dawson R. Engler: KLEE: Unassisted and Automatic Generation of High-Coverage Tests for Complex Systems Programs. OSDI 2008: 209-224
(Best Paper Award)
Raul A. Santelices, Pavan Kumar Chittimalli, Taweesup Apiwattanapong, Alessandro Orso, Mary Jean Harrold: Test-Suite Augmentation for Evolving Software. ASE 2008: 218-227
(Best Paper Award)
Gordon Fraser, Andreas Zeller: Mutation-Driven Generation of Unit Tests and Oracles. IEEE Trans. Software Eng. 38(2): 278-292 (2012)
[ Project Website ]

Parallel Programs

Charles Edwin Killian, James W. Anderson, Ranjit Jhala, Amin Vahdat: Life, Death, and the Critical Transition: Finding Liveness Bugs in Systems Code NSDI 2007
(Best Paper Award)
Cormac Flanagan, Stephen N. Freund: FastTrack: efficient and precise dynamic race detection. PLDI 2009: 121-133

Program Synthesis

Armando Solar-Lezama, Christopher Grant Jones, Rastislav Bodik: Sketching concurrent data structures. PLDI 2008: 136-148
Saurabh Srivastava, Sumit Gulwani, Jeffrey S. Foster: From program verification to program synthesis. POPL 2010: 313-326

Invariant Generation

Michael D. Ernst, Jake Cockrell, William G. Griswold, David Notkin: Dynamically Discovering Likely Program Invariants to Support Program Evolution. IEEE Trans. Software Eng. 27(2): 99-123 (2001)
[ Slides ] [ Project Website ]
Ashutosh Gupta, Rupak Majumdar, Andrey Rybalchenko: From Tests to Proofs. TACAS 2009: 262-276
(Best Paper Award)
ThanhVu Nguyen, Deepak Kapur, Westley Weimer, Stephanie Forrest: Using Dynamic Analysis to Discover Polynomial and Array Invariants. ICSE 2012: 683-693
(Best Paper Award)

Run-Time Optimization

Peter Lee, Mark Leone: Optimizing ML with run-time code generation (with retrospective). Best of PLDI 1996: 540-553
(Best Paper Award)
Matteo Frigo: A fast Fourier transform compiler (with retrospective). Best of PLDI 1999: 642-655
(Best Paper Award)

Proof-Carrying Code

J. Gregory Morrisett, David Walker, Karl Crary, Neal Glew: From System F to Typed Assembly Language. POPL 1998: 85-97
[ Project Website ]

Concluding Thoughts Slides

George C. Necula, Peter Lee: The design and implementation of a certifying compiler (with retrospective) Best of PLDI 1998: 612-625
(Best Paper Award)
Casey Klein, John Clements, Christos Dimoulas, Carl Eastlund, Matthias Felleisen, Matthew Flatt, Jay A. McCarthy, Jon Rafkind, Sam Tobin-Hochstadt, Robert Bruce Findler: Run your research: on the effectiveness of lightweight mechanization. POPL 2012: 285-296
[ Project Website with tools and video ]
Grab Bag
Optional: Claire Le Goues: Double-blind review at SSBSE. (This essay summarizes research relevant to the arguments for and against double-blind reviewing.)
Luca de Alfaro, Thomas A. Henzinger, Rupak Majumdar: Symbolic Algorithms for Infinite-State Games. CONCUR 2001: 536-550
(Best Paper Award)
David F. Bacon, C. Richard Attanasio, Han Bok Lee, V. T. Rajan, Stephen E. Smith: Java without the Coffee Breaks: A Nonintrusive Multiprocessor Garbage Collector. PLDI 2001: 92-103
Christian Bird, Nachiappan Nagappan, Premkumar T. Devanbu, Harald Gall, Brendan Murphy: Does distributed development affect software quality? An empirical case study of Windows Vista. ICSE 2009: 518-528
(Appears in CACM Research Highlights)
Paul Ammann, John C. Knight: Data Diversity: An Approach to Software Fault Tolerance. IEEE Trans. Computers 37(4): 418-425 (1988)