Research

I've graduated from the University of Michigan and am currently working at Parakinetics, Inc.

The remainder of this page concerns research I did as a graduate student at the University of Michigan.

Research Interests

Current Projects

  • Development of an Architecture Description Framework to describe and synthesize custom processors.
  • Automated design of non-programmable and programmable loop accelerators.
  • Operation partitioning for heterogeneous clustered VLIW architectures.
  • Design space exploration and compilation for machines with an incomplete register bypass network.

Publications (First Author)

  • [ pdf ] Automatic Design of Efficient Application-centric Architectures. K. Fan. Ph.D. Thesis, August 2008.

  • [ pdf ] Bridging the Computation Gap Between Programmable Processors and Hardwired Accelerators. K. Fan, M. Kudlur, G. Dasika, S. Mahlke. International Symposium on High-Performance Computer Architecture (HPCA), February 2009.

  • [ pdf ] Modulo Scheduling for Highly Customized Datapaths to Increase Hardware Reusability. K. Fan, H. Park, M. Kudlur, S. Mahlke. International Symposium on Code Generation and Optimization (CGO), April 2008.

  • [ pdf ] Increasing Hardware Efficiency with Multifunction Loop Accelerators. K. Fan, M. Kudlur, H. Park, S. Mahlke. International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS), October 2006.

  • [ pdf ] Cost Sensitive Modulo Scheduling in a Loop Accelerator Synthesis System. K. Fan, M. Kudlur, H. Park, S. Mahlke. 38th International Symposium on Microarchitecture (MICRO), November 2005.

  • [ pdf ] Compiler-directed Synthesis of Multifunction Loop Accelerators. K. Fan, M. Kudlur, H. Park, S. Mahlke. Workshop on Application Specific Processors (WASP), in conjunction with CODES+ISSS, September 2005.

  • [ pdf ] Systematic Register Bypass Customization for Application-Specific Processors. K. Fan, N. Clark, M. Chu, K. V. Manjunath, R. Ravindran, M. Smelyanskiy, S. Mahlke. IEEE 14th International Conference on Application-specific Systems, Architectures and Processors (ASAP), June 2003.

Other Publications

  • [ pdf ] Power-Efficient Medical Image Processing using PUMA. G. Dasika, K. Fan, S. Mahlke. IEEE Symposium on Application Specific Processors (SASP), July 2009.

  • [ pdf ] Edge-centric Modulo Scheduling for Coarse-Grained Reconfigurable Architectures. H. Park, K. Fan, S. Mahlke, T. Oh, H. Kim, H.-S. Kim. International Conference on Parallel Architectures and Compilation Techniques (PACT), October 2008.

  • [ pdf ] DVFS in Loop Accelerators Using BLADES. G. Dasika, S. Das, K. Fan, S. Mahlke, D. Bull. Design Automation Conference (DAC), June 2008.

  • [ pdf ] Streamroller: Automatic Synthesis of Prescribed Throughput Accelerator Pipelines. M. Kudlur, K. Fan, S. Mahlke. International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS), October 2006.

  • [ pdf ] Modulo Graph Embedding: Mapping Applications onto Coarse-Grained Reconfigurable Architectures. H. Park, K. Fan, M. Kudlur, S. Mahlke. International Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES), October 2006.

  • [ pdf ] A Distributed Control Path Architecture for VLIW Processors. H. Zhong, K. Fan, S. Mahlke, M. Schlansker. 14th International Conference on Parallel Architectures and Compilation Techniques (PACT), September 2005.

  • [ pdf ] Automatic Synthesis of Customized Local Memories for Multicluster Application Accelerators. M. Kudlur, K. Fan, M. Chu, S. Mahlke. IEEE 15th International Conference on Application-specific Systems, Architectures and Processors (ASAP), September 2004.

  • Cost-Sensitive Partitioning in an Architecture Synthesis System for Multicluster Processors. M. Chu, K. Fan, R. Ravindran, S. Mahlke. IEEE Micro, 24(3):10-20, May/June 2004.

  • [ pdf ] FLASH: Foresighted Latency-Aware Scheduling Heuristic for Processors with Customized Datapaths. M. Kudlur, K. Fan, M. Chu, R. Ravindran, N. Clark, S. Mahlke. International Symposium on Code Generation and Optimization (CGO), March 2004.

  • [ pdf ] Cost-Sensitive Operation Partitioning for Synthesizing Custom Multicluster Datapath Architectures. M. Chu, K. Fan, R. Ravindran, S. Mahlke. Workshop on Application Specific Processors (WASP), in conjunction with MICRO-36, December 2003.

  • [ pdf ] Region-based Hierarchical Operation Partitioning for Multicluster Processors. M. Chu, K. Fan, S. Mahlke. ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), June 2003.

Presentations

  • [ ppt ] Streamroller: Compiler Orchestrated Synthesis of Accelerator Pipelines. M. Kudlur, K. Fan, G. Dasika, and S. Mahlke. Workshop on Compiler Assisted SoC Assembly (CASA), October 2006.

  • [ ppt ] Compiler-directed Synthesis of Programmable Loop Accelerators. K. Fan, H. Park, and S. Mahlke. Workshop on Emerging Directions in Electronic Design Automation, in conjunction with CASES, September 2004.

  • [ ppt ] OptimoDE: Programmable Accelerator Engines Through Retargetable Customization. N. Clark, H. Zhong, K. Fan, S. Mahlke, K. Flautner, and K. Van Nieuwenhove. Hot Chips 16, August 2004.

Classes Taken

  • EECS 586: Design and Analysis of Algorithms
  • EECS 570: Parallel Computer Architecture
  • EECS 598-2: Virtual Machine Technologies and Applications
  • EECS 573: Microarchitecture
  • EECS 583: Advanced Compilers
  • EECS 500: Communications and Networks (Seminar)
  • EECS 470: Computer Architecture
  • EECS 492: Intro to Artificial Intelligence
  • Math 419: Linear Spaces and Matrix Theory