I am interested in the design and implementation of
high-performance, power-efficient, bug-free, secure, and cost-effective computing systems. My
research interests include computer architecture, robust and secure system
design,
hardware and software verification, and performance analysis tools and
techniques. Below is a selection of current (and recent) research projects I
am working on:

Control-Data Isolated Code - A Code Injection Resistant Compilation Technology

Representative publication:
William Arthur, Ben Mehne, Reetuparna Das, and Todd Austin, "Getting
in Control of Your Control Flow with Control-Data Isolation,"
in the 2015 International Symposium on Code Generation and Optimization
(CGO-2015), February 2015.
Project Synopsis: Computer security has become a central focus in
the information age. Though enormous effort has been expended on ensuring
secure computation, software exploitation remains a serious threat. The
software attack surface provides many avenues for hijacking; however, most
exploits ultimately rely on the successful execution of a control-flow
attack. This pervasive diversion of control flow is made possible by the
pollution of control flow structure with attacker-injected runtime data.
Many control-flow attacks persist because the root of the problem remains:
runtime data is allowed to enter the program counter. In this paper, we
propose a novel approach: Control-Data Isolation. Our approach provides
protection by going to the root of the problem and removing all of the
operations that inject runtime data into program control. While previous
work relies on CFG edge checking and labeling, these techniques remain
vulnerable to heap spray, read, and GOT attacks, and in some cases suffer
high overheads. Rather than addressing control-flow attacks by layering
additional complexity, our work takes a subtractive approach, removing the
primary cause of contemporary control-flow attacks. We
demonstrate that control-data isolation can assure the integrity of the
programmer's CFG at runtime, while incurring average performance overheads
of less than 7% for a wide range of benchmarks.
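The core idea above, removing every operation that lets runtime data enter the program counter, can be illustrated with a small sketch. This is my own simplification, not code from the paper: an indirect call through a runtime value is rewritten as a sled of direct branches, so only edges in the programmer's CFG are reachable. The handler names are hypothetical.

```python
# Hypothetical sketch of control-data isolation: replace an indirect
# call through runtime data with direct branches over the CFG's
# legitimate targets, rejecting anything else outright.

def handler_a():
    return "a"

def handler_b():
    return "b"

def unsafe_dispatch(func_ptr):
    # Indirect call: attacker-injected data could redirect execution.
    return func_ptr()

def cdi_dispatch(target_id):
    # CDI-style sled: every reachable target is a direct call on a
    # programmer-defined CFG edge; runtime data never becomes a target.
    if target_id == "a":
        return handler_a()
    elif target_id == "b":
        return handler_b()
    raise ValueError("target not in programmer's CFG")

print(cdi_dispatch("a"))  # -> a
```

The point of the sled is that the set of possible successors is fixed at compile time; an attacker who corrupts `target_id` can at worst select a different legitimate edge or trigger the rejection path.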
EVA Vision Architecture

Representative publication: Jason Clemons, Andrea Pellegrini,
Silvio Savarese and Todd Austin, "EVA:
An Efficient Vision Architecture for Mobile Systems," in the 2013
International Conference on Compilers Architecture and Synthesis for
Embedded Systems (CASES-2013), October 2013.
Project Synopsis: The capabilities of mobile devices have been
increasing at a tremendous rate. As better processors have merged with
capable cameras in mobile systems, the number of computer vision
applications has grown rapidly. However, the computational and energy
constraints of mobile devices have forced computer vision application
developers to sacrifice accuracy for the sake of meeting timing demands.
To increase the computational performance of mobile systems, we present
EVA, an application-specific heterogeneous multi-core that mixes
computationally powerful cores with energy-efficient cores. Each
core of EVA has computation and memory architectural enhancements
tailored to the application traits of vision codes. Using a computer
vision benchmarking suite, we evaluate the efficiency and performance of
a wide range of EVA designs. We show that EVA can provide speedups of
over 9x relative to an embedded processor while reducing energy demands
by as much as 3x.
Brick and Mortar Silicon - A Technology for
Assembly-Time Silicon Specialization

Representative publication:
Martha Mercaldi, Mojtaba Mehrara, Mark Oskin and Todd Austin, "Architectural
Implications of Brick and Mortar Silicon Manufacturing",
in the 34th Annual International Symposium on Computer Architecture
(ISCA-2007), June 2007.
Project Synopsis: We introduce a novel chip fabrication technique
called "brick and mortar", in which chips are made from small,
pre-fabricated ASIC bricks and bonded in a designer-specified arrangement to
an interbrick communication backbone chip. The goal of brick and mortar
assembly is to provide a low-overhead method to produce custom chips, yet
with performance that tracks an ASIC more closely than an FPGA. This paper
examines the architectural design choices in this chip-design system. These
choices include the definition of reasonable bricks, both in functionality
and size, as well as the communication interconnect that the I/O cap
provides. To do this we synthesize candidate bricks, analyze their area and
bandwidth demands, and present an architectural design for the inter-brick
communication network. We discuss a sample chip design, a 16-way CMP, and
analyze the costs and benefits of designing chips with brick and mortar. We
find that this method of producing chips incurs only a small performance
loss (8%) compared to a fully custom ASIC, which is significantly less than
the degradation seen from other low-overhead chip options, such as FPGAs.
Finally, we measure the effect that architectural design decisions have on
the behavior of the proposed physical brick assembly technique, fluidic
self-assembly.
MVSS: Michigan Visual Sonification
System

Representative publication: Jason Clemons, Sid Yingze Bao,
Silvio Savarese, Todd Austin, and Vinay Sharma, "MVSS: Michigan Visual Sonification System," in the 2012
International Conference on Emerging Signal Processing Applications
(ESPA-2012), January 2012.
Project Synopsis: Visual Sonification is the process of
converting visual properties of objects into sound signals. This paper
describes the Michigan Visual Sonification System (MVSS) that utilizes
this process to assist the visually impaired in distinguishing different
objects in their surroundings. MVSS uses depth information to first
segment and localize salient objects and then represents an object's
appearance using histograms of visual features. A dictionary of
invariant visual features (or words) is created a priori in an off-line
learning phase using Bag-of-Words modeling. The histogram of a segmented
object is then converted to a sound signal, whose volume and 3D placement
are determined by the position of the object relative to the user. The
system then relies on the considerable
discriminating power of the human brain to localize and "classify" the
sound, thus enabling the user to distinguish between visually distinct
object classes. This paper describes the different components of MVSS in
detail and presents some promising initial experimental results.
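The mapping described above, histogram to timbre and object position to volume and stereo placement, can be sketched in a few lines. This is an illustrative model under my own assumptions (the function name, the inverse-distance volume law, and the pan formula are mine, not the MVSS paper's):

```python
import math

# Illustrative sketch of visual sonification: a segmented object's
# visual-word histogram selects a sound timbre, while its position
# relative to the user sets volume (distance) and stereo pan (azimuth).

def sonify(histogram, position):
    """histogram: visual-word counts; position: (x, z) in metres,
    x = lateral offset, z = distance in front of the user."""
    x, z = position
    distance = math.hypot(x, z)
    volume = 1.0 / (1.0 + distance)              # nearer objects sound louder
    pan = max(-1.0, min(1.0, x / max(z, 1e-6)))  # left (-1) to right (+1)
    timbre = max(range(len(histogram)), key=histogram.__getitem__)
    return {"timbre": timbre, "volume": round(volume, 3), "pan": round(pan, 3)}

print(sonify([3, 9, 1], (1.0, 2.0)))
```

The heavy lifting in the real system is the segmentation and feature extraction; the sonification stage itself is deliberately simple so the user's brain can do the classification.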
Testudo: Dynamic
Distributed Debug

Representative publication: Joseph Greathouse, Chelsea
LeBlanc, Valeria Bertacco and Todd Austin, "Highly Scalable Distributed Dataflow Analysis," in the 2011 International
Symposium on Code Generation and Optimization (CGO-2011), April 2011.
Project Synopsis: Dynamic dataflow analyses find software
errors by tracking meta-values associated with a program's runtime data.
Despite their benefits, the orders-of-magnitude slowdowns that accompany
these systems limit their use to the development stage; few users would
tolerate such overheads. This work extends dynamic dataflow analyses
with a novel sampling system which ensures that runtime slowdowns do not
exceed a user-defined threshold. While previous sampling methods are
inadequate for dataflow analyses, our technique efficiently reduces the
number and size of analyzed dataflows. In doing so, it allows individual
users to test large, stochastically chosen sets of a process's dataflows.
Large populations can therefore, in aggregate, analyze a larger portion
of the program than is possible by any single user running a complete,
but slow, analysis. In our experimental evaluation, we show that 1 out
of every 10 users exposes a number of security exploits while
experiencing only a 10% performance slowdown, in contrast with the 100x
overhead of a complete analysis that exposes the same problems.
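A minimal sketch of the sampling idea follows. The key property is that the tracking decision is made once, at a dataflow's source, so an entire flow is either followed or skipped; the budget model here is my own simplification, not the paper's exact mechanism, and all names are illustrative.

```python
import random

# Sketch of sampled dynamic dataflow (taint) analysis: each new dataflow
# is tracked with some probability, capping per-user overhead, while many
# users together cover most of the program's flows.

class SampledTaint:
    def __init__(self, sample_rate, seed=0):
        self.rng = random.Random(seed)
        self.sample_rate = sample_rate
        self.tainted = set()

    def source(self, var):
        # Decide once, at the dataflow's source, whether to follow it.
        if self.rng.random() < self.sample_rate:
            self.tainted.add(var)

    def propagate(self, dst, *srcs):
        # Propagation is only paid for flows we chose to follow.
        if any(s in self.tainted for s in srcs):
            self.tainted.add(dst)

    def check_sink(self, var):
        return var in self.tainted  # True -> report potential exploit

t = SampledTaint(sample_rate=0.1)
t.source("user_input")
t.propagate("buf", "user_input")
if t.check_sink("buf"):
    print("flagged tainted dataflow reaching sink")
```

With `sample_rate=1.0` this degrades to a complete (slow) analysis; lowering the rate trades per-user coverage for bounded slowdown, which aggregation across users then recovers.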
BulletProof: Defect-Tolerant Architectures

Representative publication: Kypros Constantinides, Smitha Shyam,
Sujay Phadke, Valeria Bertacco and Todd Austin, "Ultra Low-Cost
Defect Protection for Microprocessor Pipelines", International
Conference on Architectural Support for Programming Languages and Operating
Systems (ASPLOS), San Jose, October 2006.
Project Synopsis: The sustained push toward smaller and smaller
technology sizes has reached a point where device reliability has moved to
the forefront of concerns for next-generation designs. The BulletProof
project at the University of Michigan is addressing these challenges by
developing ultra low-cost mechanisms to protect a microprocessor pipeline
and on-chip memory system from silicon defects. While traditional
defect-tolerance techniques require at least 100% overhead due to
duplication of critical resources, BulletProof exploits on-line
testing-based approaches, which provide the same level of protection with
overheads of less than 5%. The approach combines area-frugal on-line
testing techniques with system-level checkpointing to provide the same
reliability guarantees as traditional solutions, but at much lower cost. A
microarchitectural checkpointing mechanism creates coarse-grained epochs of
execution, during which distributed on-line built-in self-test (BIST)
mechanisms validate the integrity of the underlying hardware. If a failure
is detected, the system relies on the natural redundancy of
instruction-level parallel (ILP) processors to repair itself such that it
can still operate in a degraded performance mode.
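The epoch-based protection scheme can be sketched abstractly: execute an epoch speculatively from a checkpoint, run BIST at the epoch boundary, then either commit or roll back and retry with the faulty unit disabled. This toy model (names, the callable `bist` interface, and summation-as-execution are all mine) only captures the control flow, not the hardware:

```python
# Toy model of epoch-based defect tolerance: commit an epoch only after
# on-line BIST validates the hardware; on a detected defect, roll back
# to the checkpoint, disable the faulty unit, and re-run the epoch.

def run_epochs(instructions, bist, epoch_len=4):
    checkpoint = 0          # architectural state at last validated epoch
    disabled = set()        # units mapped out after a defect is found
    i = 0
    while i < len(instructions):
        epoch = instructions[i:i + epoch_len]
        state = checkpoint + sum(epoch)   # speculative epoch execution
        faulty_unit = bist(disabled)      # distributed BIST at boundary
        if faulty_unit is None:
            checkpoint = state            # hardware validated: commit
            i += epoch_len
        else:
            disabled.add(faulty_unit)     # defect: discard state, degrade,
                                          # and re-run the epoch
    return checkpoint, disabled

# A hypothetical defect in "alu1" is caught once, then mapped out.
bist = lambda disabled: "alu1" if "alu1" not in disabled else None
print(run_epochs([1, 2, 3, 4, 5], bist))  # -> (15, {'alu1'})
```

The essential invariant is that no epoch's results become architecturally visible until BIST has vouched for the hardware that produced them.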
Razor: Low-Power
Processor Designs based on Timing Speculation


Representative publication: Todd Austin, David Blaauw, Trevor
Mudge, and Krisztian Flautner, "Making
Typical Silicon Matter with Razor", IEEE Computer, March 2004.
Project Synopsis: The Razor team has been engaged in the
development of Razor Logic and the computer system simulation
infrastructure used to evaluate it. Razor is an error-tolerant dynamic
voltage scaling technology capable of shaving away voltage margins,
resulting in more energy-efficient designs with little performance
impact. The key
observation underlying the design of Razor is that the worst-case
conditions that drive traditional design are improbable conditions.
Thus, by building error detection and correction mechanisms into the
Razor design, it becomes possible to tune voltage to typical energy
requirements, rather than worst case. The resulting design has
significantly lower energy requirements, even in the presence of added
energy processing requirements due to occasional error recoveries. The
Razor design utilizes an in-situ timing error detection and correction
mechanism implemented within the Razor flip-flop. Razor flip-flops
double-sample pipeline stage values, once with an aggressive fast clock
and again with a delayed clock that guarantees a reliable second sample.
A metastability-tolerant error detection circuit is employed to check
the validity of all values latched on the fast Razor clock. In the event
of a timing error, a modified pipeline flush mechanism restores the
correct stage value into the pipeline, flushes earlier instructions, and
restarts the next instruction after the errant computation.
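The double-sampling mechanism can be modeled abstractly. In this toy model (my simplification, not the circuit itself), the main flop misses data that arrives after the aggressive clock edge, the shadow sample is always correct by construction, and a mismatch flags a timing error whose recovery reuses the shadow value:

```python
# Toy model of a Razor flip-flop: the main sample uses the aggressive
# fast clock; the shadow latch samples on a delayed clock guaranteed to
# capture correct data. A mismatch signals a timing error.

def razor_sample(combinational_delay, clock_period, value):
    # Data arriving after the fast clock edge is missed (modeled as None,
    # standing in for a stale or metastable value).
    main_sample = value if combinational_delay <= clock_period else None
    shadow_sample = value          # delayed clock: always-correct sample
    error = main_sample != shadow_sample
    # On a timing error, the pipeline recovery mechanism restores the
    # shadow latch's value; otherwise the main sample proceeds normally.
    return (shadow_sample if error else main_sample), error

print(razor_sample(0.8, 1.0, 42))  # -> (42, False): met timing, no error
print(razor_sample(1.2, 1.0, 42))  # -> (42, True): error caught, corrected
```

Because errors are detected and repaired rather than prevented, the supply voltage can be tuned until the energy saved on typical cycles outweighs the recovery cost of the occasional violation.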
Subliminal Systems:
Ultra Low-Power Processing with Subthreshold Circuits


Representative publication: Leyla Nazhandali, Bo Zhai, Ryan
Helfand, Michael Minuth, Javin Olson, Sanjay Pant, Anna Reeves, Todd
Austin, and David Blaauw, "Energy
Optimization of Subthreshold-Voltage Sensor Processors," in the 32nd
Annual International Symposium on Computer Architecture (ISCA-2005), June
2005.
Project Synopsis: Sensor network processors and their
applications are a growing area of focus in computer system research and
design. Inherent to this design space is a reduced processing
performance requirement and extremely high energy constraints, such that
sensor network processors must execute low-performance tasks for long
durations on small energy supplies. In the Subliminal Systems project,
we are demonstrating that subthreshold-voltage circuit design (400 mV
and below) lends itself well to the performance and energy demands of
sensor network processors. Moreover, we have shown that the landscape
for microarchitectural energy optimization dramatically changes in the
subthreshold domain. The dominance of leakage power in the subthreshold
regime demands architectures that i) reduce overall area, ii) increase
the utility of transistors, and iii) maintain acceptable CPI
efficiency. Our best sensor platform, implemented in 130nm CMOS and
operating at 235 mV, only consumes 1.38 pJ/instruction, nearly an order
of magnitude less energy than previously published sensor network
processor results. This design, accompanied by bulk-silicon solar cells
for energy scavenging, has been manufactured by IBM.
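To put the 1.38 pJ/instruction figure in perspective, a back-of-the-envelope calculation shows the resulting power draw; the instruction rate below is an assumed sensor-class workload, not a number from the paper:

```python
# Back-of-the-envelope power estimate for a subthreshold sensor processor.
ENERGY_PER_INSTR = 1.38e-12   # joules/instruction (reported figure)
instr_rate = 100_000          # instructions/second (assumed workload)

power_watts = ENERGY_PER_INSTR * instr_rate
print(f"compute power draw: {power_watts * 1e9:.0f} nW")  # -> 138 nW
```

At nanowatt-scale draw like this, even modest energy scavenging (such as the bulk-silicon solar cells mentioned above) can plausibly sustain continuous operation.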
Past Projects
- DIVA - Dynamic Instruction Verification Architectures
- Cyclone - Decentralized Dynamic Scheduler Architectures
- CryptoManiac - Application-Specific Processor Design
- The SimpleScalar Tool Set - Architectural Performance Analysis
Tools