ESCEPE: Early-Exit Network Section-Wise Model Compression Using Self-distillation and Weight Clustering

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer review

2 citations (Scopus)

Abstract

Deploying deep learning models on resource-constrained (edge) devices is challenging due to their high computational demands and large model sizes. Early-exit neural networks are one approach to making deep learning models more efficient for resource-constrained devices by reducing computational cost and latency. However, even with early-exit neural networks, the model size may remain a problem when deploying them on edge devices. To address this problem, we propose a section-wise model compression technique for compressing an early-exit neural network with intermediate classifiers. Our approach divides the model into a few sections and uses a different setting of the weight clustering-based compression for each section to prevent accuracy loss in the intermediate sections. We demonstrate that knowledge distillation can be used in the retraining phase to transfer knowledge from uncompressed to compressed sections and to accelerate recovery from the performance drop caused by the weight clustering stages. The performance evaluation of our proposed method on the CIFAR10 and CIFAR100 datasets using ResNet and WideResNet architectures demonstrates that the proposed technique can compress an early-exit neural network at a high compression ratio with minimal impact on the accuracy of the intermediate classifiers. The proposed method achieves compression ratios of more than 36 and 22 times for ResNet18 with three shallow classifiers on CIFAR10 and CIFAR100, respectively, with an ensemble accuracy loss of less than 1%. By eliminating the shallow classifiers from the early-exit model, the static model can achieve compression ratios of up to 64 and 52 times for ResNet18 and WideResNet50, respectively, on the CIFAR10 dataset with an accuracy loss of less than 2.5%.
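To make the core compression step concrete, the following is a minimal sketch of weight clustering on a single layer, assuming a simple 1-D k-means over the layer's weights: each weight is replaced by its nearest cluster centroid, so the layer can be stored as a small float codebook plus low-bit per-weight indices. The function name `cluster_weights` and the parameter `n_clusters` are illustrative, not the paper's API; the paper applies such clustering with a different setting per model section.

```python
def cluster_weights(weights, n_clusters, n_iters=25):
    """Replace each weight with the centroid of its 1-D k-means cluster.

    Returns (quantized_weights, codebook): the quantized weights take at
    most `n_clusters` distinct values, all drawn from the codebook.
    """
    lo, hi = min(weights), max(weights)
    # Spread the initial centroids evenly across the weight range.
    centroids = [lo + (hi - lo) * i / (n_clusters - 1) for i in range(n_clusters)]
    for _ in range(n_iters):
        # Assignment step: index of the nearest centroid for every weight.
        assign = [min(range(n_clusters), key=lambda c: abs(w - centroids[c]))
                  for w in weights]
        # Update step: move each centroid to the mean of its assigned weights.
        for c in range(n_clusters):
            members = [w for w, a in zip(weights, assign) if a == c]
            if members:
                centroids[c] = sum(members) / len(members)
    assign = [min(range(n_clusters), key=lambda c: abs(w - centroids[c]))
              for w in weights]
    return [centroids[a] for a in assign], centroids
```

The compression comes from storage: instead of one 32-bit float per weight, each weight needs only a log2(n_clusters)-bit index into the shared codebook. Using more clusters in the sections feeding the intermediate (early-exit) classifiers, and fewer in later sections, is one way to trade compression ratio against the accuracy of the shallow classifiers, which is the trade-off the section-wise scheme targets.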
Original language: English
Title: EdgeSys '23: Proceedings of the 6th International Workshop on Edge Systems, Analytics and Networking
Publisher: Association for Computing Machinery, Inc
Pages: 48–53
Number of pages: 6
Electronic ISBN: 9798400700828
DOIs
Status: Published - 8 May 2023
