TY - JOUR
T1 - Exploring the clinical features of narcolepsy type 1 versus narcolepsy type 2 from European Narcolepsy Network database with machine learning
AU - Zhang, Zhongxing
AU - Mayer, Geert
AU - Dauvilliers, Yves
AU - Plazzi, Giuseppe
AU - Pizza, Fabio
AU - Fronczek, Rolf
AU - Santamaria, Joan
AU - Partinen, Markku
AU - Overeem, Sebastiaan
AU - Peraita-Adrados, Rosa
AU - Da Silva, Antonio Martins
AU - Sonka, Karel
AU - Rio-Villegas, Rafael Del
AU - Heinzer, Raphael
AU - Wierzbicka, Aleksandra
AU - Young, Peter
AU - Högl, Birgit
AU - Bassetti, Claudio L.
AU - Manconi, Mauro
AU - Feketeova, Eva
AU - Mathis, Johannes
AU - Paiva, Teresa
AU - Canellas, Francesca
AU - Lecendreux, Michel
AU - Baumann, Christian R.
AU - Barateau, Lucie
AU - Pesenti, Carole
AU - Antelmi, Elena
AU - Gaig, Carles
AU - Iranzo, Alex
AU - Lillo-Triguero, Laura
AU - Medrano-Martínez, Pablo
AU - Haba-Rubio, José
AU - Gorban, Corina
AU - Luca, Gianina
AU - Lammers, Gert Jan
AU - Khatami, Ramin
PY - 2018/12/1
Y1 - 2018/12/1
N2 - Narcolepsy is a rare, life-long disease that exists in two forms, narcolepsy type-1 (NT1) and type-2 (NT2), but only NT1 is accepted as a clearly defined entity. Both types of narcolepsy belong to the group of central hypersomnias (CH), a spectrum of poorly defined diseases with excessive daytime sleepiness as a core feature. Due to the considerable overlap of symptoms and the rarity of the diseases, it is difficult to identify distinct phenotypes of CH. Machine learning (ML) can help to identify phenotypes because it learns to recognize clinical features that are invisible to humans. Here we apply ML to data from the large European Narcolepsy Network (EU-NN) database, which contains hundreds of mixed features of narcolepsy, making it difficult to analyze with classical statistics. Stochastic gradient boosting, a supervised learning model with built-in feature selection, achieves high performance on the testing set. While cataplexy features are recognized as the most influential predictors, the machine finds additional features, e.g. mean rapid-eye-movement sleep latency in the multiple sleep latency test, that contribute to classifying NT1 and NT2, as confirmed by classical statistical analysis. Our results suggest that ML can identify features of CH on a machine scale from complex databases, thus providing 'ideas' and promising candidates for future diagnostic classifications.
AB - Narcolepsy is a rare, life-long disease that exists in two forms, narcolepsy type-1 (NT1) and type-2 (NT2), but only NT1 is accepted as a clearly defined entity. Both types of narcolepsy belong to the group of central hypersomnias (CH), a spectrum of poorly defined diseases with excessive daytime sleepiness as a core feature. Due to the considerable overlap of symptoms and the rarity of the diseases, it is difficult to identify distinct phenotypes of CH. Machine learning (ML) can help to identify phenotypes because it learns to recognize clinical features that are invisible to humans. Here we apply ML to data from the large European Narcolepsy Network (EU-NN) database, which contains hundreds of mixed features of narcolepsy, making it difficult to analyze with classical statistics. Stochastic gradient boosting, a supervised learning model with built-in feature selection, achieves high performance on the testing set. While cataplexy features are recognized as the most influential predictors, the machine finds additional features, e.g. mean rapid-eye-movement sleep latency in the multiple sleep latency test, that contribute to classifying NT1 and NT2, as confirmed by classical statistical analysis. Our results suggest that ML can identify features of CH on a machine scale from complex databases, thus providing 'ideas' and promising candidates for future diagnostic classifications.
UR - http://www.scopus.com/inward/record.url?scp=85049967571&partnerID=8YFLogxK
U2 - 10.1038/s41598-018-28840-w
DO - 10.1038/s41598-018-28840-w
M3 - Article
C2 - 30006563
SN - 2045-2322
VL - 8
SP - 11
JO - Scientific Reports
JF - Scientific Reports
IS - 1
M1 - 10628
ER -