Block-Level Surrogate Models for Inference Time Estimation in Hardware Aware Neural Architecture Search

Kurt Stolle, Sebastian Vogel, Fons van der Sommen, Willem P. Sanberg

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review


Abstract

Hardware-Aware Neural Architecture Search (HA-NAS) is an attractive approach for discovering network architectures that balance task accuracy and deployment efficiency. In an iterative search algorithm, inference time is typically determined at every step by directly profiling architectures on hardware. This imposes limitations on the scalability of search processes because access to specialized devices for profiling is required. As such, the ability to assess inference time without hardware access is an important aspect to enable deep learning on resource-constrained embedded devices. Previous work estimates inference time by summing individual contributions of the architecture’s parts. In this work, we propose using block-level inference time estimators to find the network-level inference time. Individual estimators are trained on collected datasets of independently sampled and profiled architecture block instances. Our experiments on isolated blocks commonly found in classification architectures show that gradient boosted decision trees serve as an accurate surrogate for inference time. More specifically, their Spearman correlation coefficient exceeds 0.98 on all tested platforms. When such blocks are connected in sequence, the sum of all block estimations correlates with the measured network inference time, having Spearman correlation coefficients above 0.71 on evaluated CPUs and an accelerator platform. Furthermore, we demonstrate the applicability of our Surrogate Model (SM) methodology in its intended HA-NAS context. To this end, we evaluate and compare two HA-NAS processes: one that relies on profiling via hardware-in-the-loop and one that leverages block-level surrogate models. We find that both processes yield similar Pareto-optimal architectures. This shows that our method facilitates a similar task-performance outcome without relying on hardware access for profiling during architecture search.
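For illustration only, the following minimal Python sketch (not the authors' implementation; the data layout, feature names, and model hyperparameters are assumptions) shows the block-level surrogate idea described above: one gradient boosted regressor is fitted per block type on profiled samples, a network-level estimate is obtained by summing the block predictions, and the result can be compared to measured network latencies via Spearman's rank correlation.

# Minimal sketch (assumed names and hyperparameters, not the paper's code):
# one gradient boosted surrogate per block type; the network latency estimate
# is the sum of the block-level predictions.
import numpy as np
from scipy.stats import spearmanr
from sklearn.ensemble import GradientBoostingRegressor

def train_block_surrogates(block_datasets):
    # block_datasets: {block_name: (X, y)} where X holds block configuration
    # features (e.g. channel width, kernel size, input resolution) and y the
    # inference times measured by profiling that block on the target device.
    surrogates = {}
    for name, (X, y) in block_datasets.items():
        model = GradientBoostingRegressor(n_estimators=200, max_depth=4)
        model.fit(X, y)
        surrogates[name] = model
    return surrogates

def estimate_network_latency(surrogates, network):
    # network: list of (block_name, feature_vector) pairs in execution order;
    # returns the sum of the per-block surrogate predictions.
    return sum(
        surrogates[name].predict(np.asarray(feats, dtype=float).reshape(1, -1))[0]
        for name, feats in network
    )

# Rank agreement between summed estimates and measured network latencies:
# rho, _ = spearmanr(estimated_latencies, measured_latencies)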
Original language: English
Title of host publication: Machine Learning and Knowledge Discovery in Databases - European Conference, ECML PKDD 2022, Proceedings
Subtitle of host publication: European Conference, ECML PKDD 2022, Grenoble, France, September 19–23, 2022, Proceedings, Part V
Editors: Massih-Reza Amini, Stéphane Canu, Asja Fischer, Tias Guns, Petra Kralj Novak, Grigorios Tsoumakas
Place of Publication: Cham
Publisher: Springer
Pages: 463-479
Number of pages: 17
ISBN (Electronic): 978-3-031-26419-1
ISBN (Print): 978-3-031-26418-4
DOIs
Publication status: Published - 17 Mar 2023
Event: 2022 European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2022 - World Trade Center, Grenoble, France
Duration: 19 Sept 2022 – 23 Sept 2022
https://2022.ecmlpkdd.org/

Publication series

Name: Lecture Notes in Computer Science
Volume: 13717
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Name: Lecture Notes in Artificial Intelligence
Volume: 13717
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 2022 European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2022
Abbreviated title: ECML PKDD
Country/Territory: France
City: Grenoble
Period: 19/09/22 – 23/09/22
Internet address: https://2022.ecmlpkdd.org/

Bibliographical note

ID 737

Keywords

  • AutoML
  • Inference time estimation
  • Neural network design
