Lectures and Details

Monday/Wednesday, 3:00-4:30pm, 1303 EECS
Office Hours
To Be Determined (Monday or Wednesday)
Course Forum
Prerequisites
Students are expected to have enough background in software engineering, programming languages, compilers, security, and theory to be able to understand research papers from SE and PL conferences. One previous SE or PL course at the undergraduate or graduate level should suffice for the motivated student.


This special topics course is a research seminar covering the intersection of core software engineering topics (e.g., program comprehension and maintenance) and associated human factors (particularly those revealed by medical imaging). The course will focus on active research areas in software engineering.

Special emphasis will be placed on instructor-led in-person discussions of contextual aspects of papers that may be less evident to students (e.g., based on the instructor's first-hand knowledge of the authors or research situation). These may include:

Such discussions can help students view papers as part of a collaborative, human activity that takes place over a span of time. Conversations about how to construct well-designed follow-on work (e.g., balancing risk and reward, lifting input assumptions), at the level of the graduate Preliminary Exam, will be encouraged.

Software Engineering and Cognition

This class will include papers that explicitly cover human cognitive aspects of software engineering and programming languages, such as studies that make use of eye tracking or medical imaging.

It is not expected that incoming students have any expertise in such cognitive aspects. Instead, relevant cognitive aspects will be discussed (including in the third paper, below) such that students can interpret their relevance to computer science.

Graduate Depth Area

As of January 20, 2023, it is confirmed that this course satisfies the Software Depth area requirement. (It does not satisfy the breadth requirement.)

Structure and Presentations

Randomized Presentations

For each paper discussion I will choose up to three students at random to give a five-minute presentation.

Each selected student presents in person and must, at a minimum, (1) summarize the work, (2) list its strengths, and (3) list ways in which it might be improved in a subsequent publication. Including other information, such as your opinion of the work or its relation to other projects, is recommended.

The goals of this randomized approach are to encourage all participants to read the material thoroughly in advance, to provide jumping-off points for detailed discussions, and to allow me to evaluate participation.
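The random selection described above can be sketched in a few lines of Python. This is only an illustration, not the instructor's actual procedure: the function name `pick_presenters`, the roster format, and the choice to always draw the maximum of three (the text says "up to three") are assumptions.

```python
import random


def pick_presenters(roster, max_presenters=3, rng=None):
    """Draw presenters uniformly at random, without repeats.

    Assumption: we always draw the maximum allowed (three), or the
    whole roster if fewer students are enrolled.
    """
    rng = rng or random.Random()
    k = min(max_presenters, len(roster))
    return rng.sample(list(roster), k)


# Hypothetical roster, for illustration only.
roster = ["student_a", "student_b", "student_c", "student_d", "student_e"]
presenters = pick_presenters(roster)
print(presenters)
```

Because every student is equally likely to be drawn for every paper, the only reliable preparation strategy is to have five minutes ready each time.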

There is no project component to this course — your primary responsibility is to prepare five minutes of cogent discussion for each paper.

Grading Rubric

Grading will be based on in-person participation.

  • 40% Paper Presentations
  • 40% Discussions
  • 20% Professionalism

In more detail, Paper Presentations will be graded on the summarization of the work (20%), the enumeration of strengths (30%), and areas and mechanisms for improvement (50%).
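The weights above combine by simple arithmetic. The following is a minimal sketch of that calculation, not the instructor's actual grading procedure: the function names and the 0.0-to-1.0 score scale are assumptions made for illustration, and no mapping to letter grades is attempted.

```python
# Weights taken from the rubric above.
COMPONENT_WEIGHTS = {
    "presentations": 0.40,
    "discussions": 0.40,
    "professionalism": 0.20,
}
PRESENTATION_WEIGHTS = {
    "summary": 0.20,
    "strengths": 0.30,
    "improvements": 0.50,
}


def presentation_score(summary, strengths, improvements):
    """Weighted score for one presentation; inputs assumed to lie in [0, 1]."""
    parts = {
        "summary": summary,
        "strengths": strengths,
        "improvements": improvements,
    }
    return sum(PRESENTATION_WEIGHTS[name] * value for name, value in parts.items())


def course_score(presentations, discussions, professionalism):
    """Overall weighted score; inputs assumed to lie in [0, 1]."""
    parts = {
        "presentations": presentations,
        "discussions": discussions,
        "professionalism": professionalism,
    }
    return sum(COMPONENT_WEIGHTS[name] * value for name, value in parts.items())


# Example: a strong summary and list of strengths, but only half credit
# on areas for improvement, with full marks elsewhere.
p = presentation_score(summary=1.0, strengths=1.0, improvements=0.5)
total = course_score(presentations=p, discussions=1.0, professionalism=1.0)
print(round(p, 2), round(total, 2))
```

Since areas for improvement carry half of the presentation weight, a weak showing there costs more than a weak summary does.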

The Discussion component will be assessed by noting when students contribute to the conversation and analysis of a paper (outside of their presentations).

The Professionalism component will be assessed in terms of helping to maintain a welcoming environment for everyone and demonstrating our shared values. Participants are assumed to be professionals by default, but may lose points in certain negative circumstances. Examples of less-professional conduct include:

Informally, one sufficient condition for an "A" grade is to do a good job whenever you are called on to present a paper, to interject insightful comments into the discussions of other papers, and to treat others respectfully.

The grading cutoffs are:

Reading List

Read Two Ahead

You are responsible for reading and preparing for the next two not-yet-discussed papers for any given class meeting.

On average, we will discuss three papers a week (devoting perhaps 50 minutes to each paper). However, some papers may merit more or less discussion.

It is often possible to find presentation slides or video recordings associated with a paper.

The next paper to discuss may be shown below a highlighted line. You should read and prepare at least the next two-to-three papers before the next meeting.

    Software Research and Threats to Validity

  1. Todd Mytkowicz, Amer Diwan, Matthias Hauswirth, Peter F. Sweeney: Producing wrong data without doing anything obviously wrong! ASPLOS 2009: 265-276
  2. Janet Siegmund, Norbert Siegmund, Sven Apel: Views on Internal and External Validity in Empirical Software Engineering. ICSE 2015: 9-19 (distinguished paper award)

    Neuroscience and Language

  3. Jingyuan E. Chen, Gary H. Glover: Functional Magnetic Resonance Imaging Methods. Neuropsychology Review, vol. 25, pages 289-313 (2015)
  4. Ioulia Kovelman, Stephanie A. Baker, Laura-Ann Petitto: Bilingual and Monolingual Brains Compared: A Functional Magnetic Resonance Imaging Investigation of Syntactic Processing and a Possible "Neural Signature" of Bilingualism. J. Cogn. Neurosci. 20(1): 153-169 (2008)

    Software Engineering and the Brain

  5. Janet Siegmund, Christian Kästner, Sven Apel, Chris Parnin, Anja Bethmann, Thomas Leich, Gunter Saake, André Brechmann: Understanding understanding source code with functional magnetic resonance imaging. ICSE 2014: 378-389
  6. Benjamin Floyd, Tyler Santander, Westley Weimer: Decoding the representation of code in the brain: an fMRI study of code review and expertise. ICSE 2017: 175-186 (distinguished paper award)
  7. Yu Huang, Xinyu Liu, Ryan Krueger, Tyler Santander, Xiaosu Hu, Kevin Leach, Westley Weimer: Distilling neural representations of data structure manipulation using fMRI and fNIRS. ICSE 2019: 396-407 (distinguished paper award)
  8. Ryan Krueger, Yu Huang, Xinyu Liu, Tyler Santander, Westley Weimer, Kevin Leach: Neurological Divide: An fMRI Study of Prose and Code Writing. ICSE 2020
  9. Zachary Karas, Andrew Jahn, Westley Weimer, Yu Huang: Connecting the Dots: Rethinking the Relationship between Code and Prose Writing with Functional Connectivity. ESEC/SIGSOFT FSE 2021

    Software Expertise and the Brain

  10. Norman Peitek, Annabelle Bergum, Maurice Rekrut, Jonas Mucke, Matthias Nadig, Chris Parnin, Janet Siegmund, Sven Apel: Correlates of programmer efficacy and their link to experience: a combined EEG and eye-tracking study. ESEC/SIGSOFT FSE 2022: 120-131
  11. Ikutani Y, Kubo T, Nishida S, Hata H, Matsumoto K, Ikeda K, Nishimoto S. Expert Programmers Have Fine-Tuned Cortical Representations of Source Code. eNeuro. 2021 Jan 28. 8(1): ENEURO.0405-20.2020.

    Implications: Code Comprehension

  12. Sarah Fakhoury, Devjeet Roy, Yuzhan Ma, Venera Arnaoudova, Olusola O. Adesope: Measuring the impact of lexical and structural inconsistencies on developers' cognitive load during bug localization. Empir. Softw. Eng. 25(3): 2140-2178 (2020)
  13. Norman Peitek, Sven Apel, Chris Parnin, André Brechmann, Janet Siegmund: Program Comprehension and Code Complexity Metrics: An fMRI Study. ICSE 2021: 524-536 (distinguished paper award)

    Implications: Code Review and Trust

  14. Tyler J. Ryan, Gene M. Alarcon, Charles Walter, Rose F. Gamble, Sarah A. Jessup, August A. Capiola, Marc D. Pfahler: Trust in Automated Software Repair - The Effects of Repair Source, Transparency, and Programmer Experience on Perceived Trustworthiness and Trust. HCI (29) 2019: 452-470
  15. Yu Huang, Kevin Leach, Zohreh Sharafi, Nicholas McKay, Tyler Santander, Westley Weimer: Biases and Differences in Code Review using Medical Imaging and Eye-Tracking: Genders, Humans, and Machines. ESEC/SIGSOFT FSE 2020

    Implications: Learning and Teaching

  16. Ryan Shaun Joazeiro de Baker, Sidney K. D'Mello, Ma. Mercedes T. Rodrigo, Arthur C. Graesser: Better to be frustrated than bored: The incidence, persistence, and impact of learners' cognitive-affective states during interactions with three different computer-based learning environments. Int. J. Hum. Comput. Stud. 68(4): 223-241 (2010)
  17. Naser Al Madi, Cole S. Peterson, Bonita Sharif, Jonathan I. Maletic: From Novice to Expert: Analysis of Token Level Effects in a Longitudinal Eye Tracking Study. ICPC 2021: 172-183
  18. Nischal Shrestha, Colton Botta, Titus Barik, Chris Parnin: Here we go again: why is it difficult for developers to learn another programming language? Commun. ACM 65(3): 91-99 (2022) (distinguished paper award)
  19. Madeline Endres, Zachary Karas, Xiaosu Hu, Ioulia Kovelman, Westley Weimer: Relating Reading, Visualization, and Coding for New Programmers: A Neuroimaging Study. ICSE 2021
  20. Chris Parnin, Alessandro Orso: Are automated debugging techniques actually helping programmers? ISSTA 2011: 199-209

    Software Engineering and Deep Learning

  21. Ru Zhang, Wencong Xiao, Hongyu Zhang, Yu Liu, Haoxiang Lin, Mao Yang: An empirical study on program failures of deep learning jobs. ICSE 2020: 1159-1170 (distinguished paper award)
  22. Hung Viet Pham, Shangshu Qian, Jiannan Wang, Thibaud Lutellier, Jonathan Rosenthal, Lin Tan, Yaoliang Yu, Nachiappan Nagappan: Problems and Opportunities in Training Deep Learning Software Systems: An Analysis of Variance. ASE 2020: 771-783 (distinguished paper award)

    Program Repair: Techniques and Criticism

  23. Claire Le Goues, Michael Dewey-Vogt, Stephanie Forrest, Westley Weimer: A Systematic Study of Automated Program Repair: Fixing 55 out of 105 bugs for $8 Each. ICSE 2012: 3-13
  24. Dongsun Kim, Jaechang Nam, Jaewoo Song, Sunghun Kim: Automatic patch generation learned from human-written patches. ICSE 2013: 802-811 (distinguished paper award)
  25. Martin Monperrus: A critical review of "automatic patch generation learned from human-written patches": essay on the problem statement and the evaluation of automatic software repair. ICSE 2014: 234-242
  26. Fan Long, Martin Rinard: Staged program repair with condition synthesis. ESEC/SIGSOFT FSE 2015: 166-178
  27. Sergey Mechtaev, Jooyong Yi, Abhik Roychoudhury: Angelix: scalable multiline program patch synthesis via symbolic analysis. ICSE 2016: 691-701

    OPTIONAL: We are done reading papers formally. You don't have to read any more.

  28. Thomas Durieux, Fernanda Madeiral, Matias Martinez, Rui Abreu: Empirical review of Java program repair tools: a large-scale experiment on 2,141 bugs and 23,551 repair attempts. ESEC/SIGSOFT FSE 2019: 302-313 (distinguished paper award)
  29. Joel Lehman, Jeff Clune, Dusan Misevic, Christoph Adami, Lee Altenberg, Julie Beaulieu, Peter J. Bentley, Samuel Bernard, Guillaume Beslon, David M. Bryson, Nick Cheney, Patryk Chrabaszcz, Antoine Cully, Stéphane Doncieux, Fred C. Dyer, Kai Olav Ellefsen, Robert Feldt, Stephan Fischer, Stephanie Forrest, Antoine Frénoy, Christian Gagné, Léni K. Le Goff, Laura M. Grabowski, Babak Hodjat, Frank Hutter, Laurent Keller, Carole Knibbe, Peter Krcah, Richard E. Lenski, Hod Lipson, Robert MacCurdy, Carlos Maestre, Risto Miikkulainen, Sara Mitri, David E. Moriarty, Jean-Baptiste Mouret, Anh Nguyen, Charles Ofria, Marc Parizeau, David P. Parsons, Robert T. Pennock, William F. Punch, Thomas S. Ray, Marc Schoenauer, Eric Schulte, Karl Sims, Kenneth O. Stanley, François Taddei, Danesh Tarapore, Simon Thibault, Richard A. Watson, Westley Weimer, Jason Yosinski: The Surprising Creativity of Digital Evolution: A Collection of Anecdotes from the Evolutionary Computation and Artificial Life Research Communities. Artif. Life 26(2): 274-306 (2020)