Research Projects

I am interested in the design and implementation of high-performance, power-efficient, bug-free, secure, and cost-effective computing systems. My research interests include computer architecture, robust and secure system design, hardware and software verification, and performance analysis tools and techniques. Below is a selection of my current and recent research projects:

Control-Data Isolated Code - A Code Injection Resistant Compilation Technology

Representative publication: William Arthur, Ben Mehne, Reetuparna Das, and Todd Austin, "Getting in Control of Your Control Flow with Control-Data Isolation," in the 2015 International Symposium on Code Generation and Optimization (CGO-2015), February 2015.

Project Synopsis: Computer security has become a central focus in the information age. Though enormous effort has been expended on ensuring secure computation, software exploitation remains a serious threat. The software attack surface provides many avenues for hijacking; however, most exploits ultimately rely on the successful execution of a control-flow attack. This pervasive diversion of control flow is made possible by the pollution of control-flow structure with attacker-injected runtime data. Many control-flow attacks persist because the root of the problem remains: runtime data is allowed to enter the program counter. In this work, we propose a novel approach: Control-Data Isolation. Our approach provides protection by going to the root of the problem and removing all of the operations that inject runtime data into program control. While previous work relies on CFG edge checking and labeling, those techniques remain vulnerable to attacks such as heap spray, read, or GOT attacks, and in some cases suffer high overheads. Rather than addressing control-flow attacks by layering on additional complexity, our work takes a subtractive approach, removing the primary cause of contemporary control-flow attacks. We demonstrate that control-data isolation can assure the integrity of the programmer's CFG at runtime while incurring average performance overheads of less than 7% across a wide range of benchmarks.
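
The core mechanism can be sketched in a few lines. The fragment below is a conceptual model only: the actual work is a compiler transformation over native code, and all names here are invented for illustration. The idea is that every indirect transfer is replaced by comparisons against the statically known, programmer-intended target set, so runtime data never flows directly into the program counter.

```python
# Conceptual sketch of control-data isolation (not the paper's implementation):
# an indirect call site is rewritten as compare-and-direct-call against the
# valid target set from the programmer's CFG.

def helper_a():
    return "a"

def helper_b():
    return "b"

# The programmer's CFG permits only these targets at this call site.
VALID_TARGETS = (helper_a, helper_b)

def cdi_dispatch(target):
    """Dispatch via direct calls only; targets outside the CFG are rejected."""
    if target is helper_a:      # compare + direct call, never "call *target"
        return helper_a()
    elif target is helper_b:
        return helper_b()
    # Attacker-injected runtime data can never become a new control-flow edge.
    raise RuntimeError("CFG violation: unknown target rejected")
```

Usage-wise, a heap-sprayed or GOT-corrupted pointer falls through every comparison and is rejected rather than executed.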

EVA Vision Architecture

Representative publication: Jason Clemons, Andrea Pellegrini, Silvio Savarese, and Todd Austin, "EVA: An Efficient Vision Architecture for Mobile Systems," in the 2013 International Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES-2013), October 2013.

Project Synopsis: The capabilities of mobile devices have been increasing at a tremendous rate. As better processors have merged with capable cameras in mobile systems, the number of computer vision applications has grown rapidly. However, the computational and energy constraints of mobile devices have forced computer vision application developers to sacrifice accuracy for the sake of meeting timing demands. To increase the computational performance of mobile systems, we present EVA, an application-specific heterogeneous multi-core that combines computationally powerful cores with energy-efficient cores. Each core of EVA has computation and memory architectural enhancements tailored to the traits of vision codes. Using a computer vision benchmarking suite, we evaluate the efficiency and performance of a wide range of EVA designs. We show that EVA can provide speedups of over 9x relative to an embedded processor while reducing energy demands by as much as 3x.
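
The big/little flavor of the architecture can be illustrated with a toy mapping policy. The paper does not describe its scheduler in these terms; the fragment below, including the kernel names and threshold, is an invented sketch of the general idea of steering demanding vision kernels to powerful cores and light tasks to energy-efficient ones.

```python
# Illustrative (hypothetical) mapping of vision kernels onto a heterogeneous
# multi-core: compute-heavy kernels go to "big" cores, light ones to "little".

def map_kernels(kernels, compute_threshold_gops):
    """Assign each (name, gops_demand) kernel to a 'big' or 'little' core."""
    mapping = {}
    for name, gops in kernels:
        mapping[name] = "big" if gops >= compute_threshold_gops else "little"
    return mapping
```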

Brick and Mortar Silicon - A Technology for Assembly-Time Silicon Specialization

Representative publication: Martha Mercaldi, Mojtaba Mehrara, Mark Oskin, and Todd Austin, "Architectural Implications of Brick and Mortar Silicon Manufacturing," in the 34th Annual International Symposium on Computer Architecture (ISCA-2007), June 2007.

Project Synopsis: We introduce a novel chip fabrication technique called "brick and mortar," in which chips are made from small, pre-fabricated ASIC bricks bonded, in a designer-specified arrangement, to an inter-brick communication backbone chip. The goal of brick and mortar assembly is to provide a low-overhead method to produce custom chips, yet with performance that tracks an ASIC more closely than an FPGA. This paper examines the architectural design choices in this chip-design system, including the definition of reasonable bricks, both in functionality and size, as well as the communication interconnect that the backbone chip (the I/O cap) provides. To do this, we synthesize candidate bricks, analyze their area and bandwidth demands, and present an architectural design for the inter-brick communication network. We discuss a sample chip design, a 16-way CMP, and analyze the costs and benefits of designing chips with brick and mortar. We find that this method of producing chips incurs only a small performance loss (8%) compared to a fully custom ASIC, significantly less than the degradation seen with other low-overhead chip options, such as FPGAs. Finally, we measure the effect that architectural design decisions have on the behavior of the proposed physical brick assembly technique, fluidic self-assembly.


MVSS: Michigan Visual Sonification System

Representative publication: Jason Clemons, Sid Yingze Bao, Silvio Savarese, Todd Austin, and Vinay Sharma, "MVSS: Michigan Visual Sonification System," in the 2012 International Conference on Emerging Signal Processing Applications (ESPA-2012), January 2012.

Project Synopsis: Visual sonification is the process of converting visual properties of objects into sound signals. This paper describes the Michigan Visual Sonification System (MVSS), which utilizes this process to assist the visually impaired in distinguishing different objects in their surroundings. MVSS uses depth information to first segment and localize salient objects, and then represents an object's appearance using histograms of visual features. A dictionary of invariant visual features (or words) is created in an off-line, a priori learning phase using Bag-of-Words modeling. The histogram of a segmented object is then converted to a sound signal, whose volume and 3D placement are determined by the relative position of the object with respect to the user. The system then relies on the considerable discriminating power of the human brain to localize and "classify" the sound, thus enabling the user to distinguish between visually distinct object classes. This paper describes the different components of MVSS in detail and presents some promising initial experimental results.
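
The final pipeline stage can be sketched as a simple mapping. The particular frequency assignment, volume law, and pan formula below are invented for illustration; they are not MVSS's actual mapping, only a minimal model of turning a visual-word histogram plus an object position into sound parameters.

```python
import math

# Hypothetical sketch of a visual-sonification stage: the object's visual-word
# histogram shapes the tone mixture (its "timbre"), while its position sets
# volume (distance) and stereo placement (azimuth).

BASE_FREQS_HZ = [220.0, 330.0, 440.0, 550.0]   # one tone per dictionary word

def sonify(histogram, distance_m, azimuth_rad):
    """Map a visual-word histogram and object position to sound parameters."""
    total = sum(histogram) or 1
    amplitudes = [count / total for count in histogram]  # normalized weights
    volume = 1.0 / (1.0 + distance_m)           # nearer objects sound louder
    pan = (math.sin(azimuth_rad) + 1.0) / 2.0   # 0 = hard left, 1 = hard right
    return {"freqs_hz": BASE_FREQS_HZ, "amplitudes": amplitudes,
            "volume": volume, "pan": pan}
```

Two visually distinct objects produce distinct amplitude mixtures, which the listener's brain can learn to tell apart.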

Testudo: Dynamic Distributed Debug

Representative publication: Joseph Greathouse, Chelsea LeBlanc, Valeria Bertacco, and Todd Austin, "Highly Scalable Distributed Dataflow Analysis," in the 2011 International Symposium on Code Generation and Optimization (CGO-2011), April 2011.

Project Synopsis: Dynamic dataflow analyses find software errors by tracking meta-values associated with a program's runtime data. Despite their benefits, the orders-of-magnitude slowdowns that accompany these systems limit their use to the development stage; few users would tolerate such overheads. This work extends dynamic dataflow analyses with a novel sampling system that ensures runtime slowdowns do not exceed a user-defined threshold. While previous sampling methods are inadequate for dataflow analyses, our technique efficiently reduces the number and size of analyzed dataflows. In doing so, it allows individual users to test large, stochastically chosen sets of a process's dataflows. Large populations can therefore, in aggregate, analyze a larger portion of the program than is possible for any single user running a complete, but slow, analysis. In our experimental evaluation, we show that 1 out of every 10 users exposes a number of security exploits while experiencing only a 10% performance slowdown, in contrast with the 100x overhead of a complete analysis that exposes the same problems.
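
The sampling idea can be illustrated with a toy taint tracker. The class and policy below are invented for illustration and are not Testudo's implementation: each user's run follows only a random subset of dataflows, bounding that user's overhead, while many seeded runs together cover most of the program's dataflows.

```python
import random

# Illustrative sketch of sampled dynamic dataflow (taint) analysis: each run
# begins tracking a new dataflow only with some probability, so no single user
# pays for a complete analysis. Names and the policy are hypothetical.

class SampledTaintTracker:
    def __init__(self, sample_rate, seed):
        self.sample_rate = sample_rate   # fraction of new dataflows followed
        self.rng = random.Random(seed)   # differs per user/run
        self.tainted = set()             # variables currently tracked

    def on_untrusted_input(self, var):
        # Begin tracking this dataflow only with probability sample_rate.
        if self.rng.random() < self.sample_rate:
            self.tainted.add(var)

    def on_assign(self, dst, src):
        # Propagate the meta-value along a sampled dataflow.
        if src in self.tainted:
            self.tainted.add(dst)
        else:
            self.tainted.discard(dst)

    def on_dangerous_use(self, var):
        # e.g. var reaches a jump target or a system() argument.
        return var in self.tainted   # True => report a potential exploit
```

A run with a low sample rate misses most flows but runs nearly at full speed; across a large population, some user's sample will include the exploitable flow.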

BulletProof: Defect-Tolerant Architectures

Representative publication: Kypros Constantinides, Smitha Shyam, Sujay Phadke, Valeria Bertacco, and Todd Austin, "Ultra Low-Cost Defect Protection for Microprocessor Pipelines," in the International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), San Jose, October 2006.

Project Synopsis: The sustained push toward smaller and smaller technology sizes has reached a point where device reliability has moved to the forefront of concerns for next-generation designs. PI Austin and his research team at the University of Michigan are addressing these challenges through the BulletProof project, which is developing ultra low-cost mechanisms to protect a microprocessor pipeline and on-chip memory system from silicon defects. While traditional defect-tolerance techniques require at least 100% overhead due to duplication of critical resources, BulletProof exploits on-line testing-based approaches, which provide the same level of protection with overheads of less than 5%. To achieve this goal, the team combines area-frugal on-line testing techniques and system-level checkpointing to provide the reliability guarantees of traditional solutions at much lower cost. The approach utilizes a microarchitectural checkpointing mechanism to create coarse-grained epochs of execution, during which distributed on-line built-in self-test (BIST) mechanisms validate the integrity of the underlying hardware. If a failure is detected, the design relies on the natural redundancy of instruction-level parallel (ILP) processors to repair the system such that it can still operate in a degraded performance mode.
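
The epoch-based protection scheme can be sketched at a high level. The control loop below is a simplified, hypothetical model (all function names are invented): execution proceeds in checkpointed epochs, and an epoch's results are committed only after the on-line self-test validates the hardware; otherwise the epoch is rolled back and the faulty unit is disabled.

```python
# High-level sketch of BulletProof-style epochs (hypothetical, simplified):
# checkpoint, execute speculatively, validate with BIST, then commit or roll
# back and repair via the processor's natural ILP redundancy.

def run_epochs(epochs, bist_ok, checkpoint, restore, disable_faulty_unit):
    """Generic epoch loop: commit work only after BIST validates the hardware."""
    committed = []
    for epoch in epochs:
        snapshot = checkpoint()
        results = [op() for op in epoch]   # speculative execution of the epoch
        if bist_ok():                      # on-line self-test passed:
            committed.extend(results)      #   commit the epoch's results
        else:                              # defect detected:
            restore(snapshot)              #   roll back to the checkpoint
            disable_faulty_unit()          #   degrade gracefully and continue
    return committed
```

The key property is that a defect surfacing mid-epoch can never corrupt committed state, because commitment is gated on the self-test.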

Razor: Low-Power Processor Designs based on Timing Speculation

Representative publication: Todd Austin, David Blaauw, Trevor Mudge, and Krisztian Flautner, "Making Typical Silicon Matter with Razor", IEEE Computer, March 2004.

Project Synopsis: The Razor team has been engaged in the development of Razor Logic and the computer system simulation infrastructure used to evaluate it. Razor is an error-tolerant dynamic voltage scaling technology capable of shaving away voltage margins, resulting in more energy-efficient designs with little performance impact. The key observation underlying the design of Razor is that the worst-case conditions that drive traditional design are improbable conditions. Thus, by building error detection and correction mechanisms into the Razor design, it becomes possible to tune voltage to typical energy requirements, rather than worst case. The resulting design has significantly lower energy requirements, even in the presence of the added energy cost of occasional error recoveries. The Razor design utilizes an in-situ timing error detection and correction mechanism implemented within the Razor flip-flop. Razor flip-flops double-sample pipeline stage values, once with an aggressive fast clock and again with a delayed clock that guarantees a reliable second sample. A metastability-tolerant error detection circuit checks the validity of all values latched on the fast Razor clock. In the event of a timing error, a modified pipeline flush mechanism restores the correct stage value into the pipeline, flushes earlier instructions, and restarts the next instruction after the errant computation.
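
The energy trade-off Razor exploits can be shown with a toy model. The quadratic energy law, error-rate function, and recovery penalty below are invented for illustration, not taken from the paper: lowering supply voltage saves dynamic energy until timing errors appear, and since each error costs a recovery, the best operating point sits just above the error cliff rather than at the worst-case margin.

```python
# Toy model (hypothetical numbers) of Razor's voltage tuning: expected energy
# per operation includes the cost of occasional error recoveries, so the
# optimum shaves the worst-case margin down to typical-case conditions.

def energy_per_op(vdd, error_rate, recovery_penalty=10.0):
    """Expected energy per operation, including occasional error recoveries."""
    base = vdd ** 2   # dynamic energy scales roughly with Vdd^2
    return base * (1.0 + error_rate(vdd) * recovery_penalty)

def tune_voltage(error_rate, vdds):
    """Pick the supply voltage minimizing expected energy per operation."""
    return min(vdds, key=lambda v: energy_per_op(v, error_rate))
```

With an error rate that is zero above some critical voltage and rises sharply below it, the tuner settles at the lowest error-free voltage, exactly the margin-shaving behavior described above.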

Subliminal Systems: Ultra Low-Power Processing with Subthreshold Circuits

Representative publication: Leyla Nazhandali, Bo Zhai, Ryan Helfand, Michael Minuth, Javin Olson, Sanjay Pant, Anna Reeves, Todd Austin, and David Blaauw, "Energy Optimization of Subthreshold-Voltage Sensor Processors," in the 32nd Annual International Symposium on Computer Architecture (ISCA-2005), June 2005.

Project Synopsis: Sensor network processors and their applications are a growing area of focus in computer system research and design. Inherent to this design space are reduced processing performance requirements and extremely tight energy constraints: sensor network processors must execute low-performance tasks for long durations on small energy supplies. In the Subliminal Systems project, we are demonstrating that subthreshold-voltage circuit design (400 mV and below) lends itself well to the performance and energy demands of sensor network processors. Moreover, we have shown that the landscape for microarchitectural energy optimization changes dramatically in the subthreshold domain. The dominance of leakage power in the subthreshold regime demands architectures that i) reduce overall area, ii) increase the utility of transistors, and iii) maintain acceptable CPI efficiency. Our best sensor platform, implemented in 130nm CMOS and operating at 235 mV, consumes only 1.38 pJ/instruction, nearly an order of magnitude less energy than previously published sensor network processor results. This design, accompanied by bulk-silicon solar cells for energy scavenging, has been manufactured by IBM.
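
Why leakage reshapes the trade-offs can be seen in a back-of-the-envelope model. The formula is standard (energy per instruction = dynamic energy + leakage power times execution time), but all the numbers below are illustrative assumptions, not measurements from the project: at subthreshold clock rates, instructions take so long that the leakage term dominates, favoring small, well-utilized designs even at somewhat worse CPI.

```python
# Back-of-the-envelope model of subthreshold energy per instruction:
# total pJ/instruction = dynamic energy + leakage power over the instruction's
# (long, slow-clock) runtime. Units: pJ, nW, cycles/instruction, Hz.

def energy_per_instruction(e_dyn_pj, p_leak_nw, cpi, clock_hz):
    """Dynamic energy plus leakage energy accumulated while executing."""
    seconds_per_instr = cpi / clock_hz
    leak_pj = p_leak_nw * 1e-9 * seconds_per_instr * 1e12   # nW * s -> pJ
    return e_dyn_pj + leak_pj
```

At an assumed 500 kHz subthreshold clock, a hypothetical small core with a quarter of the leakage beats a larger, lower-CPI core on total energy, which is the shift in the optimization landscape described above.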

Past Projects

  • DIVA - Dynamic Instruction Verification Architectures
  • Cyclone - Decentralized Dynamic Scheduler Architectures
  • CryptoManiac - Application-Specific Processor Design
  • The SimpleScalar Tool Set - Architectural Performance Analysis Tools