I am an Associate Professor in the Computer Science & Engineering
department at the University of Michigan, where I lead the Order Lab.
Prior to joining U-M, I was an Assistant Professor in the Johns Hopkins CS department from 2017 to 2022.
I have broad research interests in computer systems, including operating systems, cloud computing,
and distributed systems. I am particularly interested in designing principled
techniques to enable reliable, efficient, and secure systems from large
data centers to small mobile devices.
My lab has openings for postdocs as well as graduate and undergraduate student
interns. I’m looking for students who are self-motivated and have strong
interests in systems building and research. Prospective students, please read this page.
News
Dec. 2025
Yuzhuo successfully defended his PhD thesis. Congrats Dr. Jing!
Jul. 2025
TrainVerify is accepted to SOSP '25. Congrats Yunchi!
Jul. 2025
Phoenix is accepted to SOSP '25. Congrats Yuzhuo!
Jul. 2025
Atropos is accepted to SOSP '25. Congrats Yigong!
Mar. 2025
TrainCheck is accepted to OSDI '25. Congrats Yuxuan, Ziming!
Mar. 2025
T2C is accepted to OSDI '25. Congrats Chang, Dimas!
Dec. 2024
Congrats to Chang on receiving the NSF CAREER award in the earliest stage!
pBox is accepted to SOSP '23. Congrats Yigong, Gongqi!
Jun. 2023
Yigong passed his PhD defense and will join University of Washington CSE as a postdoc!
May 2023
Chang passed his PhD defense and will join University of Virginia CS as an Assistant Professor!
Jan. 2023
vProf is accepted to EuroSys '23. Congrats Lingmei!
Research
A major focus of my recent research is to push for higher availability and
observability of next-generation cloud systems. This includes a series of
projects in multiple thrusts:
Understanding failures beyond the fail-stop model
Gray failure: We advocate the importance of the gray failure problem
in cloud systems and discuss its differential observability traits.
Partial failure: We study and analyze real-world
partial failures in popular distributed systems.
Principled detection and localization of complex failures
Panorama: We design a solution to capture and enhance
inherent observability in cloud systems for the detection of gray failures.
Watchdog: We propose the intrinsic watchdog abstraction
for comprehensive runtime checking in system software.
OmegaGen: We design a program analysis and
instrumentation tool to generate custom watchdogs to localize partial failures. (Best Paper Award)
Data-driven approach to transform traditional reliability activities
Narya: a holistic system to predict failures and adaptively mitigate them through online experimentation.
Gandalf: an analytics service for safe deployments in the cloud.
AIOps: a short position paper on the real-world challenges and research
opportunities in AIOps.
I also work on energy-efficient mobile systems (e.g., LeaseOS, DefDroid,
eDoctor) and preventing system misconfigurations (e.g., Violet,
ConfValley).
I received my Ph.D. from UCSD, advised by
Prof. Yuanyuan Zhou. Before joining Hopkins,
I spent one year at the MSR Redmond Systems Group
to gain exposure to real-world system challenges in a state-of-the-art cloud service, Microsoft
Azure. I received a B.S. (Computer Science) and a B.A. (Economics) from Peking University.
Note: Ryan is my English name. For legal documents and publications, Peng Huang is used.