Entropy-evolution guided architectural CNN design
Master
Semester programme: Master of Applied IT
Stijn Schellekens
Project description
This project explores a new way to design energy-efficient Convolutional Neural Networks (CNNs) by tracking how information spreads through the model during training, and how this spread evolves over time.
Context
As deep learning models grow larger, they also become more computationally expensive and more power-hungry to run. This is a problem especially for smaller, more specific tasks, such as classifying two types of images. Pre-trained models and general-purpose architectures can be overkill for these tasks, and while methods like model pruning and quantization exist, they are only applied after training.
This project aims to improve model efficiency from the start by examining the information flow through the network as it learns, aiming for a model that compresses data progressively the further into the model it goes. Instead of focusing only on performance, this approach also provides insight into information flow and efficiency.
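To illustrate what tracking information flow through a network could look like in practice, the sketch below estimates the entropy of each convolutional block's activations during a forward pass. The small network, the block names, and the histogram-based entropy estimator are illustrative assumptions, not the project's actual implementation.

```python
# A minimal sketch of per-layer activation entropy tracking.
# The model, block names, and estimator are illustrative assumptions.
import torch
import torch.nn as nn

def activation_entropy(x: torch.Tensor, bins: int = 64) -> float:
    """Estimate Shannon entropy (in bits) of a layer's activations
    via a histogram over all activation values in the batch."""
    x = x.detach().flatten().float()
    hist = torch.histc(x, bins=bins, min=x.min().item(), max=x.max().item())
    p = hist / hist.sum()
    p = p[p > 0]  # drop empty bins to avoid log(0)
    return -(p * p.log2()).sum().item()

class SmallCNN(nn.Module):
    def __init__(self, num_classes: int = 2):
        super().__init__()
        self.block1 = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2))
        self.block2 = nn.Sequential(nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2))
        self.head = nn.Sequential(nn.Flatten(), nn.Linear(32 * 8 * 8, num_classes))

    def forward(self, x):
        return self.head(self.block2(self.block1(x)))

model = SmallCNN()
entropies = {}

# Forward hooks record each block's output entropy on every forward pass.
for name, module in [("block1", model.block1), ("block2", model.block2)]:
    module.register_forward_hook(
        lambda _m, _inp, out, name=name: entropies.__setitem__(name, activation_entropy(out))
    )

# One illustrative forward pass on 32x32 inputs (e.g. CIFAR-sized images).
x = torch.randn(8, 3, 32, 32)
model(x)
print(entropies)
```

If an architecture compresses information as intended, the estimated entropy should fall from earlier to later blocks, and logging these values every epoch shows how that pattern evolves over training.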
Results
This project produced two main results. First, CNN architectures designed with information compression in mind trained using fewer kWh than models that were even smaller in size. Second, architectures can in theory be designed to compress information over time but may not actually do so in practice; the approach developed in this project makes such mismatches visible.
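Comparing architectures by kWh requires measuring each training run's energy consumption. One way this could be done is with the open-source codecarbon package, as in the sketch below; the train() function is a placeholder for the actual training loop, and this setup is an assumption rather than the tooling used in this project.

```python
# A minimal sketch of per-run energy measurement, assuming the codecarbon
# package. train() is a placeholder for the real training loop.
from codecarbon import EmissionsTracker

def train():
    ...  # placeholder: the actual training loop goes here

tracker = EmissionsTracker(output_file="emissions.csv")
tracker.start()
try:
    train()
finally:
    # Writes the run's energy consumption (kWh) and an emissions
    # estimate to emissions.csv for comparison across architectures.
    tracker.stop()
```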