machine learning

Efficient Management of Scratch-Pad Memories in Deep Learning Accelerators

We propose a compiler extension to efficiently manage the scratch-pad memories in modern deep learning accelerators.

Accelerating Deep Neural Network Computation on a Low Power Reconfigurable Architecture

We evaluate popular deep neural networks on a massively parallel, reconfigurable architecture called Transformer.