My work focuses on making computing systems more energy efficient. Toward this goal, I explore different levels of the computing stack. At the computer architecture level, I design machine-learning accelerators for fully-connected, convolutional, and graph neural networks. These designs exploit computation- and data-reuse techniques such as weight sparsity and repetition, differential computation, and compression schemes.
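As a rough software analogy, and not a description of any specific accelerator of mine, the sketch below shows the core idea behind exploiting weight sparsity: multiply-accumulate operations whose weight is zero can simply be skipped, so a pruned layer performs far fewer operations. The function names and the roughly 90% sparsity level are illustrative assumptions.

```python
import numpy as np

def dense_fc(weights, activations):
    """Baseline fully-connected layer: one multiply-accumulate per weight."""
    return weights @ activations

def sparsity_aware_fc(weights, activations):
    """Skip multiply-accumulates whose weight is zero, as a
    sparsity-exploiting accelerator would skip them in hardware."""
    out = np.zeros(weights.shape[0])
    rows, cols = np.nonzero(weights)  # only the weights that survived pruning
    for r, c in zip(rows, cols):
        out[r] += weights[r, c] * activations[c]
    return out

# A pruned layer: about 90% of the weights are zero, so about 90% of the MACs are skipped.
rng = np.random.default_rng(0)
w = rng.standard_normal((256, 256)) * (rng.random((256, 256)) > 0.9)
x = rng.standard_normal(256)
assert np.allclose(dense_fc(w, x), sparsity_aware_fc(w, x))
```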
My current research lies at the memory-system level. Data movement between memory and processing cores consumes a substantial portion of a system's energy. Moreover, memory access cannot keep up with computation speed because of the limited bandwidth of the processor-memory interface and the high latency of the lower levels of the memory hierarchy. This problem is known as the memory wall. A well-known remedy is in-memory computing, which repurposes memory arrays to perform computation in situ, breaking the memory wall by eliminating data movement through the memory system. Although various domain-specific in-memory architectures have been proposed, I aim to make in-memory computing a general-purpose programming paradigm flexible enough to express a wide range of application domains.
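To make the in-situ idea concrete, here is a minimal, heavily simplified software model, in the spirit of bulk-bitwise proposals such as Ambit rather than any particular design of mine: an operation over two memory rows is produced inside the array, so no operands cross the processor-memory interface. The class and its methods are hypothetical illustrations, not a real device interface.

```python
import numpy as np

class InSituMemory:
    """Toy model of a memory array that supports in-situ bulk bitwise operations."""

    def __init__(self, rows, row_bits):
        self.array = np.zeros((rows, row_bits), dtype=np.uint8)

    def write_row(self, r, bits):
        self.array[r] = bits

    def read_row(self, r):
        # In a conventional system, every read moves an entire row
        # across the processor-memory interface.
        return self.array[r].copy()

    def in_situ_and(self, r_a, r_b, r_out):
        # The AND of two rows is produced inside the array and stored in a
        # third row; no data crosses the memory bus.
        self.array[r_out] = self.array[r_a] & self.array[r_b]

mem = InSituMemory(rows=8, row_bits=1024)
rng = np.random.default_rng(1)
a = rng.integers(0, 2, 1024, dtype=np.uint8)
b = rng.integers(0, 2, 1024, dtype=np.uint8)
mem.write_row(0, a)
mem.write_row(1, b)

# Conventional path: two row reads plus one write cross the memory bus.
core_result = mem.read_row(0) & mem.read_row(1)

# In-memory path: the result is computed where the data already lives.
mem.in_situ_and(0, 1, 2)
assert np.array_equal(core_result, mem.read_row(2))
```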
You can find my contributions to the field of computer architecture and systems on my Google Scholar profile.