# TinySleepNet (2020)
> [source](https://pubmed.ncbi.nlm.nih.gov/33018069/)

### Abstract
- Most EEG-based models are overengineered, with many layers or additional processing steps (e.g. converting EEG to spectrogram-based images).
- They also require training on large datasets to avoid overfitting.
- This paper proposes TinySleepNet, a smaller deep-learning model that takes raw single-channel EEG as input.
- The model is an improved version of DeepSleepNet ([notes](https://hackmd.io/cW_57wdUQ7-2dVyblDbrnQ)) and uses data augmentation to overcome the overfitting problem.

### Model
- **Architecture** (a minimal PyTorch-style sketch is included at the end of these notes):
- ![](https://i.imgur.com/IhpuA70.png)
- Representation Learning:
    - The CNN has 4 conv layers with 2 max-pool and dropout layers.
    - It extracts time-invariant features from raw EEG signals.
    - Instead of the two CNNs in DeepSleepNet, only one is used, since it has been shown that a stack of conv layers has a similar effect to a single conv layer with a large filter (referencing VGGNet).
- Sequence Learning:
    - This part consists of a single unidirectional LSTM layer followed by dropout.
    - It learns temporal information such as sleep-stage transition rules.
    - A unidirectional LSTM is used instead of the bidirectional one in DeepSleepNet to reduce computation.
    - As fewer layers are needed, DeepSleepNet's residual connection (used to mitigate the vanishing gradient problem) is removed.
- Training:
    - The model is trained end-to-end with minibatch gradient descent and employs signal and sequence augmentation (sketched at the end of these notes) along with weighted cross-entropy to address class imbalance (weights in favor of N1).
    - Signal augmentation:
        - Shifts the EEG signal along the time axis.
        - The shift duration is uniformly sampled from a range of ±B<sub>sig</sub>% of the EEG epoch duration.
        - This synthesizes new signal patterns for each training epoch.
    - Sequence augmentation:
        - The starting point of the sequence of EEG epochs from each sleep recording is chosen randomly.
        - The number of EEG epochs skipped at the beginning is uniformly sampled from 0 to B<sub>seq</sub> (0 for no skipping, B<sub>seq</sub> for maximum skipping).
        - This generates new batches of multiple sequences of EEG epochs for minibatch gradient descent.
    - With this scheme, training does not require pretraining the network on an oversampled, class-balanced dataset (as in DeepSleepNet).
    - The Adam optimizer is used for 200 epochs with lr, beta1 and beta2 set to 10<sup>-4</sup>, 0.9 and 0.999.
    - Minibatch size is 20 and sequence length is 15.
    - L2 weight decay with lambda 10<sup>-3</sup> and gradient clipping with threshold 5 are also used.
    - B<sub>seq</sub> = 5 and B<sub>sig</sub> = 10.

### Evaluation
- Datasets used are MASS and Sleep-EDF.
- K-fold cross-validation was used to evaluate model performance.
- Metrics calculated are precision, recall, F1 score, macro-averaged F1 score, overall accuracy and Cohen's kappa coefficient.
- The N1 score is lower than those of the other classes but is comparable to SOTA scores (50-60).
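
### Sketches
To make the CNN-plus-unidirectional-LSTM layout concrete, here is a minimal PyTorch-style sketch. The class name `TinySleepNetSketch`, the filter sizes, strides, channel counts and the 128-unit LSTM width are illustrative assumptions, not values confirmed from the paper; only the overall structure (4 conv layers, 2 max-pools, dropout, one unidirectional LSTM) follows the notes above.

```python
import torch
import torch.nn as nn

class TinySleepNetSketch(nn.Module):
    """Sketch of TinySleepNet's layout: CNN feature extractor + 1 unidirectional LSTM.

    Kernel sizes, strides, channel counts and LSTM width are placeholders, not
    values taken from the paper.
    """

    def __init__(self, sampling_rate: int = 100, n_classes: int = 5,
                 lstm_hidden: int = 128, dropout: float = 0.5):
        super().__init__()
        fs = sampling_rate
        # Representation learning: 4 conv layers with 2 max-pool and dropout layers.
        self.cnn = nn.Sequential(
            nn.Conv1d(1, 128, kernel_size=fs // 2, stride=fs // 4, padding=fs // 4),
            nn.ReLU(),
            nn.MaxPool1d(kernel_size=8, stride=8),
            nn.Dropout(dropout),
            nn.Conv1d(128, 128, kernel_size=8, stride=1, padding=4), nn.ReLU(),
            nn.Conv1d(128, 128, kernel_size=8, stride=1, padding=4), nn.ReLU(),
            nn.Conv1d(128, 128, kernel_size=8, stride=1, padding=4), nn.ReLU(),
            nn.MaxPool1d(kernel_size=4, stride=4),
            nn.Dropout(dropout),
        )
        # Sequence learning: single unidirectional LSTM over the sequence of epochs.
        self.lstm = nn.LSTM(input_size=self._cnn_out_dim(fs),
                            hidden_size=lstm_hidden, batch_first=True)
        self.drop = nn.Dropout(dropout)
        self.fc = nn.Linear(lstm_hidden, n_classes)

    def _cnn_out_dim(self, fs: int) -> int:
        # Pass a dummy 30-second epoch through the CNN to infer the flattened size.
        with torch.no_grad():
            return self.cnn(torch.zeros(1, 1, 30 * fs)).flatten(1).shape[1]

    def forward(self, x, state=None):
        # x: (batch, seq_len, 1, 30 * sampling_rate) -- one 30 s EEG epoch per step.
        b, s = x.shape[:2]
        feats = self.cnn(x.reshape(b * s, *x.shape[2:])).flatten(1)
        feats = feats.reshape(b, s, -1)
        out, state = self.lstm(feats, state)   # learns transition rules across epochs
        logits = self.fc(self.drop(out))       # per-epoch sleep-stage logits
        return logits, state
```

The LSTM state is returned so it can be carried across consecutive minibatches from the same recording, which is a common way to train such sequence models; the notes do not state how the paper handles this, so treat it as an assumption.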
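Below is a minimal NumPy sketch of the two augmentations described under Training. The helper names `signal_augment` and `sequence_augment` are hypothetical, and the zero-padding after the time shift and the fixed-length chunking are assumptions; the notes only specify the sampling ranges (±B<sub>sig</sub>% of the epoch duration and 0 to B<sub>seq</sub> skipped epochs).

```python
import numpy as np

def signal_augment(epoch_signal: np.ndarray, b_sig: float = 10.0) -> np.ndarray:
    """Shift one EEG epoch along the time axis by a random amount.

    The shift is uniformly sampled from +/- b_sig percent of the epoch length.
    Vacated samples are zero-padded here; the paper's exact border handling is
    not given in these notes.
    """
    n = epoch_signal.shape[-1]
    max_shift = int(n * b_sig / 100.0)
    shift = np.random.randint(-max_shift, max_shift + 1)
    out = np.zeros_like(epoch_signal)
    if shift > 0:
        out[..., shift:] = epoch_signal[..., :n - shift]
    elif shift < 0:
        out[..., :n + shift] = epoch_signal[..., -shift:]
    else:
        out = epoch_signal.copy()
    return out

def sequence_augment(epochs: np.ndarray, labels: np.ndarray,
                     seq_len: int = 15, b_seq: int = 5):
    """Skip 0..b_seq epochs at the start of a recording, then chunk the rest
    into fixed-length sequences for minibatch training."""
    skip = np.random.randint(0, b_seq + 1)
    epochs, labels = epochs[skip:], labels[skip:]
    n_seq = len(epochs) // seq_len
    epochs = epochs[: n_seq * seq_len].reshape(n_seq, seq_len, *epochs.shape[1:])
    labels = labels[: n_seq * seq_len].reshape(n_seq, seq_len)
    return epochs, labels
```

Applying both per training epoch means the network sees slightly different signal patterns and differently aligned epoch sequences every pass, which is what lets the authors drop DeepSleepNet's pretraining on an oversampled, class-balanced dataset.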