DLAU: A Scalable Deep Learning Accelerator Unit on FPGA - 2017


As an emerging field of machine learning, deep learning shows excellent ability in solving complex learning problems. However, the size of the networks is growing rapidly to meet the demands of practical applications, which poses a significant challenge to building high-performance implementations of deep learning neural networks. To improve performance while keeping power consumption low, in this paper we design the deep learning accelerator unit (DLAU), a scalable accelerator architecture for large-scale deep learning networks that uses a field-programmable gate array (FPGA) as the hardware prototype. The DLAU accelerator employs three pipelined processing units to improve throughput and uses tiling techniques to exploit locality in deep learning applications. Experimental results on a state-of-the-art Xilinx FPGA board demonstrate that the DLAU accelerator achieves up to 30× speedup compared with Intel Core2 processors, with a power consumption of 234 mW.
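
The tiling idea mentioned in the abstract can be illustrated with a minimal software sketch. This is not the DLAU hardware design: the function name `tiled_matvec`, the `tile` parameter, and the choice of a matrix-vector product as the workload are all assumptions made for illustration. The sketch only shows the general principle that a small block of inputs is loaded once and reused across every output before the next block is fetched, which is what lets a fixed-size on-chip buffer serve a network of arbitrary width.

```python
def tiled_matvec(weights, x, tile=4):
    """Compute y = W x by walking the input vector in fixed-size tiles.

    Hypothetical illustration of tiling for locality; names and the
    default tile size are assumptions, not taken from the paper.
    """
    rows = len(weights)
    y = [0.0] * rows  # partial sums, one per output neuron
    for start in range(0, len(x), tile):
        # Load one tile of inputs; it is reused for every output row below.
        x_tile = x[start:start + tile]
        for r in range(rows):
            w_tile = weights[r][start:start + tile]
            # Accumulate this tile's contribution to output r.
            y[r] += sum(w * v for w, v in zip(w_tile, x_tile))
    return y


if __name__ == "__main__":
    W = [[1, 2, 3, 4, 5],
         [0, 1, 0, 1, 0]]
    x = [1, 1, 1, 1, 1]
    print(tiled_matvec(W, x, tile=2))
```

Because each output is accumulated tile by tile, the result does not depend on the tile size; only the amount of data held at once changes. The last tile may be shorter than `tile` when the input length is not a multiple of it, which the slicing handles without padding.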

