
About Me

I am a final-year Ph.D. candidate at the University of Michigan. I am part of the Circuits and Architecture Design Research (CADRe) group. My advisor is Prof. Ronald Dreslinski.

My dissertation research aims to build efficient computer architectures and systems for emerging non-Si technologies by developing solutions across the compute stack. My research work has been published in top computer architecture and EDA conferences (ISLPED, HPCA, DAC). My work was nominated for the best paper award at ISLPED 2017.

I am currently applying for full-time and postdoctoral positions in industry research starting Fall 2021. My interests lie at the intersection of computer architecture and memory systems for emerging technologies and applications. I am interested in positions both in the US and abroad. Please reach out to me if you are hiring!

Work Experience

IBM Corporation, May - Aug 2020
Graduate Research Intern
Developed a real-time scheduler library for a heterogeneous SoC as part of DARPA’s DSSoC project. Designed a learning-agent-based scheduling policy to improve quality-of-mission for autonomous vehicles.

IBM Corporation, May - Aug 2019
Graduate Research Intern
Developed an open-source, real-time smart scheduler for heterogeneous accelerator-rich architectures. The scheduler targets tasks from the autonomous-vehicle application domain.

Advanced Micro Devices, May - Aug 2018
Architecture Research Co-op Engineer
Worked on the PathForward project with the goal of enhancing performance, reducing energy cost per instruction, and lowering per-thread performance variability for exascale workloads. Optimized the power efficiency and memory-level parallelism of the Load-Store Unit (LS) design. Targeted performance improvements by partitioning the load queue based on the state of load instructions. Designed an architectural mechanism that reduces dynamic power by cutting the number of associative searches in the load and store queues.

NVIDIA Corporation, Jul 2014 - Jun 2015
Memory Design Co-op Intern
Using Monte Carlo simulations in HSPICE, designed an Advanced On-Chip Variation (AOCV) characterization module for timing enclosure and incorporated it into NanoTime reports for the SRAM design team. Using Solido, published a keeper-to-pull-down rule for FinFETs based on the writability of the local read bitline and on the leakage and charge sharing supported by the keeper.

Teaching Experience

EECS 470 (Computer Architecture), University of Michigan
  • Graduate student instructor for a 45-student course.
  • Responsible for conducting lab sessions and office hours and assisting the lecturer, Dr. Mark Brehob.
  • Responsible for grading students' final projects, which involve building a complete out-of-order processor with performance-enhancing features such as superscalar execution, multi-core support, and simultaneous multithreading.

Research Projects

Device-aware scheduling: Improving SoC throughput using workload scheduling on multi-technology accelerators
Ongoing work to design a device-level heterogeneous SoC model using multi-technology accelerators and to develop intelligent workload scheduling that improves SoC throughput and performance.

Real-time scheduling on heterogeneous SoC for Autonomous Vehicles
Designed multiple scheduling policies for real-time constrained applications. Developed hierarchical scheduling policies and the infrastructure for evaluating them.

Transmuter: A Reconfigurable Architecture for General Purpose Acceleration
Application development for a highly parallel, SPMD-based reconfigurable architecture. The architecture can reconfigure for multiple memory systems and dataflows.

Firmware development for DARPA's DSSoC and SDH programs
Responsible for the development of Linux firmware for device trees, boot sequence and workload scheduling. Developed a QEMU-SystemC platform that emulates a Zynq MPSoC to develop software and evaluate the final chip prototype.

A Reconfigurable Architecture for Addressing the Reliability Concerns of 3D Multi-Core Processors and Low Yield Rates of Future Technologies
Designed a fine-grained reconfigurable 3D architecture policy that detects faults online, repairs the system, and slows NBTI-induced aging at a marginal clock-cycle and area overhead.

OuterSPACE: An Outer product based SPArse matrix multiplication acCElerator
Designed an energy-efficient, outer-product-based sparse matrix multiplication accelerator that minimizes data movement and maximizes reuse to efficiently process real-world matrices with billions of edges.

A new design framework for high-variation carbon-nanotube based transistor technology
Designed a multigranular-reconfigurable 3D architecture, used along with a CNT density variation model to improve yield and performance of high-variation CNTFETs.

DARPA's Circuit Realization At Faster Timescales (CRAFT) Project
Built an advanced-node chip as part of a multi-university group for DARPA's CRAFT project, which aims to minimize chip design time.

A Carbon Nanotube Transistor based RISC-V Processor using Pass Transistor Logic
Explored circuit and architectural design choices using carbon nanotube field-effect transistors in pass-transistor logic to create an energy-efficient RISC-V based processor.


Publications

Transmuter: Bridging the Efficiency Gap using Memory and Dataflow Reconfiguration
S. Pal, S. Feng, D.-h. Park, S. Kim, A. Amarnath, C.-S. Yang, X. He, J. Beaumont, K. May, Y. Xiong, K. Kaszyk, J. Morton, J. Sun, M. O'Boyle, M. Cole, C. Chakrabarti, D. Blaauw, H.-S. Kim, T. Mudge and R. Dreslinski
ACM International Conference on Parallel Architectures and Compilation Techniques (PACT), Sept 2020.

R2D3: A Reliability Engine for 3D Parallel Systems
Javad Bagherzadeh, Aporva Amarnath, Jielun Tan, Subhankar Pal, Ronald Dreslinski
International Symposium on Low Power Electronics and Design (ISLPED), Jul 2019.

Sparse-TPU: Adapting Systolic Arrays for Sparse Matrices
Xin He, Subhankar Pal, Aporva Amarnath, Siying Feng, Dong-Hyeon Park, Austin Rovinski, Haojie Ye, Yuhan Chen, Ronald Dreslinski, Trevor Mudge
International Conference on Supercomputing (ICS), Jul 2020.

3DTUBE: A Design Framework for High-Variation Carbon Nanotube-based Transistor Technology
Aporva Amarnath, Javad Bagherzadeh, Jielun Tan, Ronald Dreslinski
International Symposium on Low Power Electronics and Design (ISLPED), Jul 2019.

A 1.4 GHz 695 Giga RISC-V Inst/s 496-core Manycore Processor with Mesh On-Chip Network and an All-Digital Synthesized PLL in 16nm CMOS
Austin Rovinski, Chun Zhao, Khalid Al-Hawaj, Paul Gao, Shaolin Xie, Christopher Torng, Scott Davidson, Aporva Amarnath, et al.
IEEE Symposium on VLSI Technology (VLSI), Jun 2019.

A 7.3M Output Non-Zeros/J Sparse Matrix-Matrix Multiplication Accelerator using Memory Reconfiguration in 40nm
Subhankar Pal, Dong-Hyeon Park, Siying Feng, Paul Gao, Jielun Tan, Austin Rovinski, S. Xie, C. Zhao, Aporva Amarnath, et al.
IEEE Symposium on VLSI Technology (VLSI), Jun 2019.

The Celerity Open-Source 511-Core RISC-V Tiered Accelerator Fabric: Fast Architectures and Design Methodologies for Fast Chips
Scott Davidson, Shaolin Xie, Christopher Torng, Khalid Al-Hawaj, Austin Rovinski, Tutu Ajayi, Luis Vega, Chun Zhao, Ritchie Zhao, Steve Dai, Aporva Amarnath, Bandhav Veluri, Paul Gao, Anuj Rao, Gai Liu, Rajesh K Gupta, Zhiru Zhang, Ronald Dreslinski, Christopher Batten, Michael Bedford Taylor
IEEE MICRO Journal, 2018

OuterSPACE: An Outer Product based Sparse Matrix Multiplication Accelerator
Subhankar Pal, Jonathan Beaumont, Dong-Hyeon Park, Aporva Amarnath, Siying Feng, Chaitali Chakrabarti, Hun-Seok Kim, David Blaauw, Trevor Mudge, Ronald Dreslinski
HPCA 2018

Experiences Using the RISC-V Ecosystem to Design an Accelerator-Centric SoC in TSMC 16nm
Tutu Ajayi, Khalid Al-Hawaj, Aporva Amarnath, Steve Dai, Scott Davidson, Paul Gao, Gai Liu, Anuj Rao, Austin Rovinski, Ningxiao Sun, Christopher Torng, Luis Vega, Bandhav Veluri, Shaolin Xie, Chun Zhao, Ritchie Zhao
CARRV, Workshop in MICRO 2017

Celerity: An Open Source RISC-V Tiered Accelerator Fabric
Tutu Ajayi, Khalid Al-Hawaj, Aporva Amarnath, Steve Dai, Scott Davidson, Paul Gao, Gai Liu, Atieh Lotfi, Julian Puscar, Anuj Rao, Austin Rovinski, Loai Salem, Ningxiao Sun, Christopher Torng, Luis Vega, Bandhav Veluri, Xiaoyang Wang, Shaolin Xie, Chun Zhao, Ritchie Zhao, Christopher Batten, Ronald G Dreslinski, Ian Galton, Rajesh K Gupta, Patrick P Mercier, Mani Srivastava, Michael B Taylor, Zhiru Zhang
Hot Chips 2017

A Carbon Nanotube Transistor based RISC-V Processor using Pass Transistor Logic
Aporva Amarnath, Siying Feng, Subhankar Pal, Tutu Ajayi, Austin Rovinski, Ronald G Dreslinski
