TY - JOUR
T1 - Deep Transfer Learning for Automated Single-Lead EEG Sleep Staging with Channel and Population Mismatches
AU - van der Aar, Jaap
AU - van den Ende, Daan
AU - Fonseca, Pedro
AU - van Meulen, Fokke
AU - Overeem, Sebastiaan
AU - van Gilst, Merel M.
AU - Peri, Elisabetta
PY - 2024/1/5
Y1 - 2024/1/5
AB - Automated sleep staging using deep learning models typically requires training on hundreds of sleep recordings, and pre-training on public databases is therefore common practice. However, suboptimal sleep staging performance may result from mismatches between source and target datasets, such as differences in population characteristics (e.g., an unrepresented sleep disorder) or sensors (e.g., alternative channel locations for wearable EEG). We investigated three strategies for training an automated single-channel EEG sleep stager: pre-training (i.e., training on the original source dataset), training-from-scratch (i.e., training on the new target dataset), and fine-tuning (i.e., training on the original source dataset, then fine-tuning on the new target dataset). As the source dataset, we used the F3-M2 channel of healthy subjects (N=94). Performance of the different training strategies was evaluated using Cohen's Kappa (κ) in eight smaller target datasets consisting of healthy subjects (N=60), patients with obstructive sleep apnea (OSA, N=60), insomnia (N=60), and REM sleep behavior disorder (RBD, N=22), combined with two EEG channels, F3-M2 and F3-F4. No differences in performance between the training strategies were observed in the age-matched F3-M2 datasets, with an average performance across strategies of κ = .83 in healthy, κ = .77 in insomnia, and κ = .74 in OSA subjects. However, in the RBD set, where data availability was limited, fine-tuning was the preferred method (κ = .67), with an average increase in κ of .15 over pre-training and training-from-scratch. In the presence of channel mismatches, targeted training is required, either through training-from-scratch or fine-tuning, increasing performance by κ = .17 on average. We found that, when channel and/or population mismatches cause suboptimal sleep staging performance, a fine-tuning approach can yield similar or superior performance compared to building a model from scratch, while requiring a smaller sample size. In contrast to insomnia and OSA, RBD data contains characteristics, either inherent to the pathology or age-related, that apparently demand targeted training.
KW - Polysomnography
KW - Sleep staging
KW - Single-channel
KW - Wearable EEG
KW - Fine-tuning
KW - Deep learning
UR - http://www.scopus.com/inward/record.url?scp=85182673808&partnerID=8YFLogxK
U2 - 10.3389/fphys.2023.1287342
DO - 10.3389/fphys.2023.1287342
M3 - Article
C2 - 38250654
SN - 1664-042X
VL - 14
JO - Frontiers in Physiology
JF - Frontiers in Physiology
M1 - 1287342
ER -