# Improved error analysis

Baseline model error analysis: an in-depth error analysis of our baseline model (gap analysis, confusion matrices, fold scores, visual error analysis), an interpretation of the findings, and conclusions for next steps.

## Confusion matrix analysis

## Gap analysis

Why is our train-validation gap so large?

![](https://i.imgur.com/sKcKkIN.png)
![](https://i.imgur.com/r8OAbww.png)
![](https://i.imgur.com/4zxwCYX.png)
![](https://i.imgur.com/XpXQZP9.png)

In addition to the large gap, we see a high deviation across folds.

- Model too complex, too many features
    - Definitely the case
- Subset selection does not work very well
    - Information is spread out across features, so SelectKBest selects similar ones
- Features highly correlated
    - We saw that the features are highly correlated
    - SelectKBest does not take inter-feature correlation into account

Is it overfitting, or a problem in the distributions?

- Feature distributions might vary between signers, e.g. left- vs. right-handed signers
- Differences in features per signer? Problematic for the distributions -> see learning curve
- Difference in distribution between the train set and the validation set
    - The learning curves saturate and stay far apart
- Next step: plot feature differences between signers

## Fold analysis

Scores per fold for the baseline (the per-fold +/- printout simply repeated the mean, so only the point estimates are shown):

| Fold | Training accuracy | Validation accuracy |
| --- | --- | --- |
| 0 | 0.827 | 0.690 |
| 1 | 0.843 | 0.748 |
| 2 | 0.858 | 0.613 |
| 3 | 0.887 | 0.667 |
| 4 | 0.868 | 0.618 |

Original train data label distribution:

![](https://i.imgur.com/GhVwEZ7.png)
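A +/- deviation belongs to the aggregate over folds, not to a single fold's score. A minimal sketch of how per-fold train/validation accuracy and a proper mean +/- std could be computed; the classifier and the synthetic data are stand-ins for our actual pipeline, not the real setup:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold

# Synthetic stand-in for the real feature matrix and sign labels.
X, y = make_classification(n_samples=500, n_features=20, n_informative=5,
                           n_classes=3, random_state=0)

train_scores, val_scores = [], []
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for fold, (train_idx, val_idx) in enumerate(skf.split(X, y)):
    clf = LogisticRegression(max_iter=1000).fit(X[train_idx], y[train_idx])
    tr = clf.score(X[train_idx], y[train_idx])   # training accuracy, this fold
    va = clf.score(X[val_idx], y[val_idx])       # validation accuracy, this fold
    train_scores.append(tr)
    val_scores.append(va)
    print(f"Fold {fold}: train {tr:.3f}, validation {va:.3f}")

# The +/- is the std across folds, reported once for the aggregate.
print(f"Validation: {np.mean(val_scores):.3f} +/- {np.std(val_scores):.3f}")
```

The std across validation folds directly quantifies the "high deviation on folds" observed above.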
### Label analysis for folds

Label distribution per fold:

![](https://i.imgur.com/AmxzKbn.png)
![](https://i.imgur.com/o7STBr5.png)
![](https://i.imgur.com/GtFYFi4.png)
![](https://i.imgur.com/kW4A4Gz.png)
![](https://i.imgur.com/U2JYNxq.png)

### Signer analysis for folds

Signer distribution per fold:

Fold 0:

![](https://i.imgur.com/cnoPED7.png)
![](https://i.imgur.com/HUTm8iO.png)

Fold 1:

![](https://i.imgur.com/1gKoy3M.png)
![](https://i.imgur.com/PShPTrF.png)

Fold 2:

![](https://i.imgur.com/8h7FV9k.png)
![](https://i.imgur.com/vJfMElU.png)

Fold 3:

![](https://i.imgur.com/6axSohJ.png)
![](https://i.imgur.com/w0ux8SA.png)

Fold 4:

![](https://i.imgur.com/iPaj1Pr.png)
![](https://i.imgur.com/TsX2i5u.png)

### Feature analysis for folds

- Seating positions and left- or right-handedness might differ per fold

## Confusion matrices per fold

Fold 0:

![](https://i.imgur.com/9Fin70E.png)
![](https://i.imgur.com/duoTJ9L.png)

Fold 1:

![](https://i.imgur.com/vZVj1If.png)
![](https://i.imgur.com/enCXN4F.png)

Fold 2:

![](https://i.imgur.com/CECGydu.png)
![](https://i.imgur.com/3kqrbn7.png)

Fold 3:

![](https://i.imgur.com/qhyRqSf.png)
![](https://i.imgur.com/CMCZhzk.png)

Fold 4:

![](https://i.imgur.com/5GMTdPD.png)
![](https://i.imgur.com/1n8mFVE.png)

## Detailed analysis per fold

Why are the validation scores of fold 2 and fold 4 low?

Fold 0:

- C:1 has a very high prediction accuracy. Why?
    - All except one signer in the validation set use their right hand
    - A large fraction of the signers in the training set also use their right hand

Fold 2:

- c-AF occurs the most in the test set and is often confused with NAAR-A and SCHILDPAD
- It is confused with SCHILDPAD HANDEN more than in other folds. Why? -> look at samples of the signer

Fold 4:

- ZELFDE and AF have many samples in the validation set, HAAS-OOR fewer
- HAAS-OOR is confused with NAAR-A for samples of this fold
- ZELFDE-A is often confused with samples of c.OOK for this fold
    - For fold 3 this is not the case

## Visual error analysis
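To select samples worth inspecting visually, the most frequent confusions in a fold can be ranked from its confusion matrix. A minimal sketch; the label names come from our data, but the `y_true`/`y_pred` values below are made up purely for illustration:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical labels and predictions for one fold (illustrative only).
labels = ["AF", "NAAR-A", "SCHILDPAD", "ZELFDE-A", "OOK"]
y_true = ["AF", "AF", "NAAR-A", "SCHILDPAD", "ZELFDE-A", "ZELFDE-A", "OOK", "AF"]
y_pred = ["NAAR-A", "AF", "NAAR-A", "AF", "OOK", "ZELFDE-A", "OOK", "SCHILDPAD"]

cm = confusion_matrix(y_true, y_pred, labels=labels)

# Rank the off-diagonal cells: each one is a (true, predicted) confusion.
pairs = [(cm[i, j], labels[i], labels[j])
         for i in range(len(labels)) for j in range(len(labels)) if i != j]
for count, true_lbl, pred_lbl in sorted(pairs, reverse=True)[:3]:
    if count > 0:
        print(f"{true_lbl} predicted as {pred_lbl}: {count}x")
```

Running this per fold would surface pairs like HAAS-OOR vs. NAAR-A automatically, so the visual inspection can focus on exactly those samples.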