Exception Analysis of Running Complex and Computation-intensive Deep Learning Models

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review

Abstract

Advances in GPUs have facilitated the design and execution of complex, computation-intensive deep learning models. As model complexity increases, so does the risk of encountering problems caused by very large model and individual tensor sizes, Not a Number (NaN) values, and memory leaks. Left untreated, these problems lead to substantially longer execution times, unpredictable results, and memory leak exceptions. In this paper, we address these problems, focusing in particular on large tensor support, C++ kernel changes, and recompilation of the TensorFlow framework. In addition, we explore issues related to debugging NaN values with existing debugging toolkits, as well as solutions to alleviate memory leaks. Based on the experience gained from our analysis, we propose solutions involving better tensor dimension sanity checks, alternative tensor loop procedures, different ways of applying kernels to tensors, a method for filtering debug trace files, and ways to resolve memory leak exceptions. While these problems and solutions may apply to running any complex, computation-intensive deep learning model, we describe how we encountered them in a use case in which we designed a deep learning model for activity and gesture recognition from radio data, with the aim of mitigating the domain shift problem.
Original language: English
Title: 2023 10th International Conference on Wireless Networks and Mobile Communications (WINCOM)
Editors: Khalil Ibrahimi, Mohamed El Kamili, Abdellatif Kobbane, Ibraheem Shayea
Publisher: Institute of Electrical and Electronics Engineers
Number of pages: 7
ISBN (electronic): 979-8-3503-2967-4
ISBN (print): 979-8-3503-2968-1
Status: Published - 22 Nov 2023
Event: International Conference on Wireless Networks and Mobile Communications - Istanbul, Turkey
Duration: 26 Oct 2023 to 28 Oct 2023
Conference number: 10
https://www.wincom-conf.org/WINCOM_2023/

Conference

Conference: International Conference on Wireless Networks and Mobile Communications
Abbreviated title: WINCOM
Country/Territory: Turkey
City: Istanbul
Period: 26/10/23 to 28/10/23
