Reviewing inference performance of state-of-the-art deep learning frameworks

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, Academic, peer-review

16 Citations (Scopus)

Abstract

Deep learning models have replaced conventional methods for machine learning tasks. Efficient inference on edge devices with limited resources is key for broader deployment. In this work, we focus on the tool selection challenge for inference deployment. We present an extensive evaluation of the inference performance of deep learning software tools using state-of-the-art CNN architectures for multiple hardware platforms. We benchmark these hardware-software pairs for a broad range of network architectures, inference batch sizes, and floating-point precision, focusing on latency and throughput. Our results reveal interesting combinations for optimal tool selection, resulting in different optima when considering minimum latency and maximum throughput.

Original language: English
Title of host publication: SCOPES '20: Proceedings of the 23th International Workshop on Software and Compilers for Embedded Systems
Editors: Sander Stuijk
Publisher: Association for Computing Machinery, Inc
Pages: 48-53
Number of pages: 6
ISBN (Electronic): 9781450371315
Publication status: Published - 25 May 2020
Event: 23rd International Workshop on Software and Compilers for Embedded Systems (SCOPES 2020) - St. Goar, Germany
Duration: 25 May 2020 to 26 May 2020

Conference

Conference: 23rd International Workshop on Software and Compilers for Embedded Systems (SCOPES 2020)
Country/Territory: Germany
City: St. Goar
Period: 25/05/20 to 26/05/20
