# Questions

- Start with bad scores on data analysis and error analysis
- Briefly discuss old data analysis
- New data analysis
  - Shoulder to shoulder distance
  - Angles between shoulders
    - Tried rotation -> no camera -> did not work
    - Rotation-independent features?
    - Better way of rotating?
  - Missing data analysis? Why only C :'(
- More noise removal?
  - Current noise removal: shoulder to shoulder distance normalization
  - Binning? Rounding?
- Large gap between train and validation, and the learning curve still flattens (even with new features)
  - Hypothesis 1: Splits not identically distributed?
    - Kind of, but this is also the case for fewer folds because of the grouped split
    - Still some large variations in fold scores
  - Hypothesis 2: Too many features, bad feature selection (SelectKBest) for old feature set (tried PCA as well)
    - Still use a lot of the old features
    - From data analysis we learned that face features are least important, so we removed these already
    - Still using too many old features from stage 1?
    - How to do better feature selection for features from stage 1? Highly correlated - better alternatives than SelectKBest?
- Current model going in a good direction? (double column transformer model)
- How to deal with left- and right-handed signers -> influence on folds

Solutions:

- Dominant hands: data augmentation, mirror the signs
- More features, keep SelectKBest?
- Binning
- Fingers too noisy
- Shoulder distance normalization is good
- Rotating the shoulders: not convinced
- ROBUST features
- Use more than 5 folds?

## Feedback

- Just augment the data for left-right signers
- Increase resolution in time of new features
- SelectKBest is not the greatest, but is probably not the problem here
- Finger distance probably shaky!
- Try running averages over time series maybe?
- Use binning?
- Maybe use denser grids and finetune more?
- Maybe use more folds
- Keep using the old features but try to include all the new features we can get our hands on

## Session 02 December Feedback

- Do not check all classes at the same time for robustness (scatter plots). Check the classes for which the feature is intended.
- Mutual Information (but less informative)
- Calculate things from the keypoints, not just statistics. Trajectories / speeds
- Spatial features: areas or intersections
- Prof. Dambre post is about new features
- For 2 part splitting: Maybe not try two halves of the same length, but different lengths?
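
To make a few of the suggestions above concrete (shoulder distance normalization, mirroring for left/right-handed signers, running averages, speeds from keypoints), here is a minimal sketch. It is not our pipeline: it assumes a clip is a NumPy array of shape `(frames, joints, 2)` with `(x, y)` per joint, and the shoulder indices `L_SHOULDER` / `R_SHOULDER` and all function names are hypothetical placeholders to adapt to the real keypoint format.

```python
import numpy as np

# Assumed layout: kp has shape (frames, joints, 2) with (x, y) per joint.
# Hypothetical shoulder indices, adjust to the actual keypoint format.
L_SHOULDER, R_SHOULDER = 5, 2


def normalize_by_shoulders(kp):
    """Centre each frame on the shoulder midpoint, scale by median shoulder width."""
    mid = (kp[:, L_SHOULDER] + kp[:, R_SHOULDER]) / 2.0            # (frames, 2)
    width = np.linalg.norm(kp[:, L_SHOULDER] - kp[:, R_SHOULDER], axis=1)
    scale = np.nanmedian(width)                                     # one scale per clip
    return (kp - mid[:, None, :]) / scale


def mirror(kp, swap_pairs):
    """Mirror a centred clip left-right: negate x, then swap left/right joints.

    Assumes x = 0 is the body midline (e.g. after normalize_by_shoulders).
    """
    out = kp.copy()
    out[..., 0] *= -1.0
    for left, right in swap_pairs:
        out[:, [left, right]] = out[:, [right, left]]
    return out


def running_average(kp, window=5):
    """Moving average over time per coordinate (assumes frames >= window).

    mode="same" zero-pads, so the first and last frames are pulled towards 0.
    """
    kernel = np.ones(window) / window
    flat = kp.reshape(kp.shape[0], -1)
    smooth = np.apply_along_axis(
        lambda s: np.convolve(s, kernel, mode="same"), 0, flat
    )
    return smooth.reshape(kp.shape)


def speeds(kp):
    """Per-joint speed between consecutive frames, shape (frames - 1, joints)."""
    return np.linalg.norm(np.diff(kp, axis=0), axis=2)
```

If mirrored clips are added as augmentation, they should probably stay in the same group as their original clip in the grouped split, otherwise near-duplicates end up on both sides of a fold. Summary statistics of `speeds(...)` (mean, max, per clip half) would be one way to get trajectory-style features on top of the existing ones.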