My work focuses on systems, networking, and security. I'm particularly drawn to research problems that require a cross-disciplinary approach and produce practical impact.
Runtime programmable networks:
We envision that the end-to-end network infrastructure, vertically from the host kernels to the NICs and horizontally across switches to the other end of the network,
can be reprogrammed on the fly without packet loss and with strong consistency guarantees. See our vision paper and joint talk;
the runtime programmable switch project, which implements runtime reconfiguration for a commercially available switch ASIC; a program synthesis tool that generates runtime update plans; our $3M NSF Large project; and Nvidia's HotChips'23 talk.
Programmable in-network security:
Our vision is to transform a programmable network into a "programmable defense infrastructure," which supports security as naturally as it does routing. In this design, a switch not only forwards traffic,
but also applies a wide range of defenses to it. The network not only routes traffic end-to-end, but also swaps defenses in and out along the paths as needed to mitigate attacks.
Recent projects include:
ML for systems software:
Systems software (e.g., OS kernels) needs to support different applications and multiplex different types of hardware platforms; no one-size-fits-all optimizations exist.
Neural networks are effective at generalizing to unseen scenarios, but their blackbox nature is a poor fit for low-level systems software, which must make safety-critical decisions.
In this project, we are pursuing two approaches: a) creating systems-level mechanisms that are constrained by symbolic logic while making them reconfigurable with learning-derived policies,
and b) applying learning to analyze systems-level code to identify optimization opportunities.
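As a purely illustrative sketch of approach (a), and not the project's actual design, a learned policy can be wrapped in a symbolic guard so that its proposal takes effect only when an explicitly stated invariant holds. All names here (`guarded`, `learned_policy`, `within_bounds`, `conservative_default`) and the readahead scenario are hypothetical.

```python
from typing import Callable

State = dict  # hypothetical snapshot of system state


def guarded(policy: Callable[[State], int],
            invariant: Callable[[State, int], bool],
            fallback: Callable[[State], int]) -> Callable[[State], int]:
    """Use the learned policy's proposal only if the invariant accepts it."""
    def decide(state: State) -> int:
        proposal = policy(state)
        if invariant(state, proposal):
            return proposal
        return fallback(state)
    return decide


# Hypothetical scenario: choosing a readahead size, in pages.
def learned_policy(state: State) -> int:
    # Stand-in for a model's prediction (assumption: precomputed in the state).
    return state["predicted_pages"]


def within_bounds(state: State, pages: int) -> bool:
    # Symbolic constraint: never request more pages than are free.
    return 1 <= pages <= state["free_pages"]


def conservative_default(state: State) -> int:
    # Hand-written fallback used when the proposal is rejected.
    return min(4, state["free_pages"])


decide = guarded(learned_policy, within_bounds, conservative_default)
```

The mechanism (the guard and the fallback) stays fixed and checkable, while the policy behind it can be retrained or swapped without touching the invariant.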
For examples, see this paper
and the Clara project.
Causality in distributed systems:
Diagnosing problems in large systems has always been challenging due to their complexity. Our project uses data provenance to track causal relationships
between system states and their changes. It then uses these relationships to enable automated reasoning for fault diagnosis, repair, and prevention, e.g., using a Datalog-like logical model.
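To make the idea concrete, here is a minimal sketch, under assumed names and a toy network-reachability example, of how a Datalog-like model can record provenance: each derived fact remembers the rule and premises that produced it, so a diagnosis query can trace a fact back to the base facts it depends on. The relations `link` and `reach` and the functions `derive` and `explain` are hypothetical and do not come from the project's code.

```python
# Base facts are tuples of the form (relation, arg1, arg2).
base = {("link", "a", "b"), ("link", "b", "c")}


def derive(facts):
    """Evaluate two rules to a fixpoint, recording provenance:
         r1: reach(X, Y) :- link(X, Y).
         r2: reach(X, Z) :- link(X, Y), reach(Y, Z).
    Returns (all facts, provenance map: derived fact -> (rule, premises)).
    """
    facts = set(facts)
    prov = {}
    changed = True
    while changed:
        changed = False
        candidates = []
        for f in facts:
            if f[0] != "link":
                continue
            candidates.append((("reach", f[1], f[2]), "r1", [f]))
            for g in facts:
                if g[0] == "reach" and g[1] == f[2]:
                    candidates.append((("reach", f[1], g[2]), "r2", [f, g]))
        for head, rule, premises in candidates:
            if head not in facts:
                facts.add(head)
                prov[head] = (rule, premises)  # keep the first derivation only
                changed = True
    return facts, prov


def explain(fact, prov):
    """Return the set of base facts that a fact causally depends on."""
    if fact not in prov:
        return {fact}  # base fact: it is its own explanation
    _, premises = prov[fact]
    causes = set()
    for p in premises:
        causes |= explain(p, prov)
    return causes


facts, prov = derive(base)
causes = explain(("reach", "a", "c"), prov)
```

Here `causes` contains the two `link` facts, which is the kind of answer a "why is this state present?" query would return. A fuller system would also track alternative derivations and missing facts.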
See the individual projects:
Infrastructure optimizations for data-intensive systems:
This project aims for tighter vertical integration between data-intensive systems and the cloud infrastructure, improving performance
through whole-stack optimizations that span the network layer, the OS, distributed frameworks, and the applications themselves.
One overarching theme for these optimizations is to reconfigure or rearchitect various parts of the infrastructure for data-intensive workloads, as further
informed by modern hardware technologies available in the cloud.