Energy-Efficient DNNs for Inference at the Edge
There is growing interest in performing DNN inference at the edge (i.e. on or close to end devices), for reasons of latency, privacy/security, bandwidth, and power. However, such embedded devices are typically highly constrained, featuring low-power microcontrollers with limited memory, processing power, and energy reserves. The aim of this research project is to enable DNN inference on such devices while achieving acceptable performance in terms of power consumption, execution time, and accuracy.
To achieve this, existing approaches have typically involved quantizing the model coefficients or pruning them altogether, resulting in sparse matrices that are difficult to execute efficiently on today's hardware. Other approaches have focused on searching for model architectures using reinforcement learning or evolutionary algorithms, which can be inefficient. Additionally, the building blocks used in the search to compose the networks are limited, which can inhibit the discovery of efficient models.
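To make these two compression techniques concrete, the following is a minimal, self-contained sketch (not drawn from any particular project codebase) of post-training 8-bit affine quantization and magnitude pruning applied to a toy weight matrix; the layer size, pruning ratio, and variable names are illustrative assumptions. Note how pruning produces an irregular sparsity pattern, which is precisely what makes efficient execution hard on commodity hardware.

```python
import numpy as np

rng = np.random.default_rng(0)
weights = rng.standard_normal((64, 64)).astype(np.float32)  # toy layer weights

# --- Post-training affine quantization to uint8 (illustrative sketch) ---
w_min, w_max = float(weights.min()), float(weights.max())
scale = (w_max - w_min) / 255.0
zero_point = int(round(-w_min / scale))
q = np.clip(np.round(weights / scale) + zero_point, 0, 255).astype(np.uint8)

# Dequantize to estimate the error introduced by quantization.
w_hat = (q.astype(np.float32) - zero_point) * scale
print("max abs quantization error:", float(np.abs(weights - w_hat).max()))

# --- Magnitude pruning: zero out the smallest 80% of coefficients ---
threshold = np.quantile(np.abs(weights), 0.8)
pruned = np.where(np.abs(weights) < threshold, 0.0, weights)
print("sparsity:", float((pruned == 0).mean()))  # ~0.8, but the nonzero pattern is irregular
```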
This research project will investigate the development of models using efficient differentiable programming techniques, with training objectives that force the models to be energy efficient from the outset, exploiting hardware properties and capabilities, and/or conducting design-space exploration (DSE) of the ML model to best suit the hardware architecture and maximize performance (energy/speed/accuracy). The research will inherently explore the intersection between hardware and software, optimizing ML models and algorithms so that they execute efficiently on current (and future) microcontroller hardware, taking advantage of specific optimizations and accelerators while avoiding known bottlenecks.
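As an illustration of the kind of differentiable, multi-objective formulation such a project might investigate, the sketch below adds a differentiable energy proxy (an expected MAC count, weighted by learnable gates over candidate blocks) to an ordinary task loss. The network, the per-block costs, and the trade-off weight `lam` are hypothetical and chosen purely for demonstration; they do not represent the project's actual method.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedBlock(nn.Module):
    """Candidate block whose contribution is scaled by a learnable gate."""
    def __init__(self, dim, mac_cost):
        super().__init__()
        self.body = nn.Linear(dim, dim)
        self.alpha = nn.Parameter(torch.zeros(1))  # gate logit (architecture parameter)
        self.mac_cost = mac_cost                   # assumed relative cost of this block

    def forward(self, x):
        gate = torch.sigmoid(self.alpha)
        # Return the gated output and this block's contribution to the cost proxy.
        return x + gate * F.relu(self.body(x)), gate * self.mac_cost

class TinyNet(nn.Module):
    """Toy network built from gated candidate blocks (illustrative only)."""
    def __init__(self, dim=32, n_classes=10):
        super().__init__()
        self.blocks = nn.ModuleList(
            [GatedBlock(dim, mac_cost=c) for c in (1.0, 2.0, 4.0)]
        )
        self.head = nn.Linear(dim, n_classes)

    def forward(self, x):
        expected_cost = x.new_zeros(1)
        for blk in self.blocks:
            x, cost = blk(x)
            expected_cost = expected_cost + cost
        return self.head(x), expected_cost

model = TinyNet()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.randn(8, 32)                 # dummy batch
y = torch.randint(0, 10, (8,))         # dummy labels

logits, expected_cost = model(x)
lam = 0.01  # accuracy vs. energy-proxy trade-off weight (assumed)
loss = F.cross_entropy(logits, y) + lam * expected_cost.squeeze()
loss.backward()
opt.step()
```

The point of this formulation is that the cost term is differentiable with respect to the gate parameters, so the optimizer can trade accuracy against the energy proxy within a single training loop, rather than relying on a separate, expensive reinforcement-learning or evolutionary search.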