EMBEDDING DEEP NEURAL NETWORKS FOR URBAN SOUND RECOGNITION ON AN FPGA

Student thesis: Master's Thesis

Abstract

Recent developments show that Convolutional Neural Networks (CNNs) are the most
accurate approach to urban sound recognition. However, CNNs can be
slow because they involve a large number of computations. Since the
sounds have to be classified locally in real time for some applications, the
CNN classifiers have to be deployed on edge devices that are fast enough.
This thesis therefore aims to embed two types of CNNs on an FPGA by
using the hls4ml tool flow. An FPGA is a hardware architecture
that can be reconfigured after production. Because of their fine-grained
nature, FPGAs are well suited to performing many computations
simultaneously, and they can exploit the parallelism of CNNs, thereby
accelerating sound classification. The results on the FPGA are also
compared with other embedded platforms, including solutions dedicated to
neural network acceleration and more general-purpose platforms. Although
a cost is paid in accuracy, the final FPGA solution
outperforms all other platforms in speed for the first CNN architecture.
For the second CNN architecture, the dedicated CNN accelerators
turn out to be faster, although the FPGA results remain competitive. This shows that,
with the use of model compression methods and the right configuration,
FPGAs can be faster than other embedded platforms and can be used for
real-time urban sound recognition.
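One of the model compression methods alluded to above is fixed-point quantization: hls4ml maps a network's floating-point weights and activations onto fixed-point types (in HLS, `ap_fixed<W,I>` with `W` total bits and `I` integer bits), trading a small amount of accuracy for much cheaper FPGA arithmetic. The following is a minimal illustrative sketch of that rounding-and-saturation step in plain Python; it is not the thesis code, and the function name and parameters are chosen here for illustration only.

```python
def quantize(x, total_bits=8, frac_bits=4):
    """Round x to the nearest representable fixed-point value.

    Mimics (illustratively) an ap_fixed<total_bits, total_bits - frac_bits>
    value: signed, with `frac_bits` bits after the binary point, and
    saturation at the representable range instead of wrap-around.
    """
    scale = 1 << frac_bits                        # step size is 1/scale
    lo = -(1 << (total_bits - 1)) / scale         # most negative value
    hi = ((1 << (total_bits - 1)) - 1) / scale    # most positive value
    q = round(x * scale) / scale                  # round to nearest step
    return max(lo, min(hi, q))                    # saturate out-of-range values
```

For example, with 8 total bits and 4 fractional bits, `quantize(0.37)` returns `0.375` (the nearest multiple of 1/16), while `quantize(10.0)` saturates to `7.9375`. Choosing `W` and `I` per layer is exactly the kind of configuration trade-off, precision versus resource usage and speed, that the abstract refers to.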
Date of Award: 23 Jun 2021
Original language: English
Supervisors: Abdellah Touhafi (Promotor) & Bruno Tiago da Silva Gomes (Promotor)
