I've graduated from the University of Michigan and am currently working at
Parakinetics, Inc.
The remainder of this page concerns research I did as a graduate student at
the University of Michigan.
Research Interests
Current Projects
- Development of an Architecture Description Framework to describe and
synthesize custom processors.
- Automated design of non-programmable and programmable loop accelerators.
- Operation partitioning for heterogeneous clustered VLIW architectures.
- Design space exploration and compilation for machines with
an incomplete register bypass network.
Publications (First Author)
- [ pdf ]
Automatic Design of Efficient Application-centric Architectures. K. Fan.
Ph.D. Thesis, August 2008.
- [ pdf ]
Bridging the Computation Gap Between Programmable Processors and Hardwired
Accelerators. K. Fan, M. Kudlur, G. Dasika, S. Mahlke. International
Symposium on High-Performance Computer Architecture (HPCA), February
2009.
- [ pdf ]
Modulo Scheduling for Highly Customized Datapaths to Increase Hardware
Reusability. K. Fan, H. Park, M. Kudlur, S. Mahlke. International
Symposium on Code Generation and Optimization (CGO), April 2008.
- [ pdf ]
Increasing Hardware Efficiency with Multifunction Loop Accelerators.
K. Fan, M. Kudlur, H. Park, S. Mahlke. International Conference on
Hardware/Software Codesign and System Synthesis (CODES+ISSS), October
2006.
- [ pdf ]
Cost Sensitive Modulo Scheduling in a Loop Accelerator Synthesis System.
K. Fan, M. Kudlur, H. Park, S. Mahlke. 38th International Symposium on
Microarchitecture (MICRO), November 2005.
- [ pdf ]
Compiler-directed Synthesis of Multifunction Loop Accelerators.
K. Fan, M. Kudlur, H. Park, S. Mahlke. Workshop on Application
Specific Processors (WASP), in conjunction with CODES+ISSS, September
2005.
- [ pdf ]
Systematic Register Bypass Customization for Application-Specific
Processors. K. Fan, N. Clark, M. Chu, K. V. Manjunath, R. Ravindran,
M. Smelyanskiy, S. Mahlke. IEEE 14th International Conference on
Application-specific Systems, Architectures and Processors (ASAP), June
2003.
Other Publications
- [ pdf ]
Power-Efficient Medical Image Processing using PUMA. G. Dasika, K. Fan,
S. Mahlke. IEEE Symposium on Application Specific Processors (SASP), July
2009.
- [ pdf ]
Edge-centric Modulo Scheduling for Coarse-Grained Reconfigurable
Architectures. H. Park, K. Fan, S. Mahlke, T. Oh, H. Kim, H.-S. Kim.
International Conference on Parallel Architectures and Compilation Techniques
(PACT), October 2008.
- [ pdf ]
DVFS in Loop Accelerators Using BLADES. G. Dasika, S. Das, K. Fan, S.
Mahlke, D. Bull. Design Automation Conference (DAC), June 2008.
- [ pdf ]
Streamroller: Automatic Synthesis of Prescribed Throughput Accelerator
Pipelines. M. Kudlur, K. Fan, S. Mahlke. International Conference on
Hardware/Software Codesign and System Synthesis (CODES+ISSS), October
2006.
- [ pdf ]
Modulo Graph Embedding: Mapping Applications onto Coarse-Grained
Reconfigurable Architectures. H. Park, K. Fan, M. Kudlur, S. Mahlke.
International Conference on Compilers, Architecture, and Synthesis for Embedded
Systems (CASES), October 2006.
- [ pdf ]
A Distributed Control Path Architecture for VLIW Processors.
H. Zhong, K. Fan, S. Mahlke, M. Schlansker. 14th International
Conference on Parallel Architectures and Compilation Techniques
(PACT), September 2005.
- [ pdf ]
Automatic Synthesis of Customized Local Memories for Multicluster
Application Accelerators. M. Kudlur, K. Fan, M. Chu, S. Mahlke.
IEEE 15th International Conference on Application-specific
Systems, Architectures and Processors (ASAP), September 2004.
- Cost-Sensitive Partitioning in an Architecture Synthesis System
for Multicluster Processors. M. Chu, K. Fan, R. Ravindran, S.
Mahlke. IEEE Micro, 24(3):10-20, May/June 2004.
- [ pdf ]
FLASH: Foresighted Latency-Aware Scheduling Heuristic for Processors
with Customized Datapaths. M. Kudlur, K. Fan, M. Chu, R. Ravindran,
N. Clark, S. Mahlke. International Symposium on Code Generation and
Optimization (CGO), March 2004.
- [ pdf ]
Cost-Sensitive Operation Partitioning for Synthesizing Custom
Multicluster Datapath Architectures. M. Chu, K. Fan, R. Ravindran,
S. Mahlke. Workshop on Application Specific Processors (WASP), in
conjunction with MICRO-36, December 2003.
- [ pdf ]
Region-based Hierarchical Operation Partitioning for Multicluster
Processors. M. Chu, K. Fan, S. Mahlke. ACM SIGPLAN Conference on
Programming Language Design and Implementation (PLDI), June 2003.
Presentations
- [ ppt ]
Streamroller: Compiler Orchestrated Synthesis of Accelerator Pipelines.
M. Kudlur, K. Fan, G. Dasika, and S. Mahlke. Workshop on Compiler Assisted SoC
Assembly (CASA), October 2006.
- [ ppt ]
Compiler-directed Synthesis of Programmable Loop Accelerators. K. Fan,
H. Park, and S. Mahlke. Workshop on Emerging Directions in Electronic Design
Automation, in conjunction with CASES, September 2004.
- [ ppt ]
OptimoDE: Programmable Accelerator Engines Through Retargetable
Customization. N. Clark, H. Zhong, K. Fan, S. Mahlke, K. Flautner, and K.
Van Nieuwenhove. Hot Chips 16, August 2004.
Classes Taken
- EECS 586: Design and Analysis of Algorithms
- EECS 570: Parallel Computer Architecture
- EECS 598-2: Virtual Machine Technologies and Applications
- EECS 573: Microarchitecture
- EECS 583: Advanced Compilers
- EECS 500: Communications and Networks (Seminar)
- EECS 470: Computer Architecture
- EECS 492: Intro to Artificial Intelligence
- Math 419: Linear Spaces and Matrix Theory