# Questions
- Start with bad scores on data analysis and error analysis
- Briefly discuss old data analysis
- New data analysis
- Shoulder to shoulder distance
- Angles between shoulders
- Tried rotation -> no camera -> does not work
- Rotation-independent features?
- Better way of rotating?
- Missing data analysis? Why only C :'(
- More noise removal?
- Current noise removal: shoulder to shoulder distance normalization
- Binning? Rounding?
- Large gap between train and validation and learning curve still flattens (even with new features)
- Hypothesis 1: Splits not identically distributed?
- Kind of, but this is also the case with fewer folds because of the grouped split
- Still some large variations in fold scores
- Hypothesis 2: Too many features, bad feature selection (SelectKBest) for old feature set (tried PCA as well)
- Still use a lot of the old features
- From data analysis we learned that face features least important so we removed these already
- Still using too many old features from stage 1?
- How to do better feature selection for the features from stage 1? They are highly correlated; are there better alternatives than SelectKBest?
- Current model going in good direction? (double column transformer model)
- How to deal with left and right handed signers -> influence on folds
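The shoulder-based features and the current noise removal mentioned above could look roughly like the sketch below. It assumes each sample is a NumPy array of shape `(n_frames, n_keypoints, 2)` with `(x, y)` coordinates. The indices `LEFT_SHOULDER` and `RIGHT_SHOULDER` are hypothetical and depend on the keypoint layout actually used, and scaling by the median shoulder distance is one possible choice, not necessarily what the current pipeline does.

```python
import numpy as np

# Hypothetical keypoint indices; the real layout depends on the pose estimator used.
LEFT_SHOULDER = 0
RIGHT_SHOULDER = 1


def shoulder_distance(frames: np.ndarray) -> np.ndarray:
    """Per-frame Euclidean distance between the two shoulders.

    frames has shape (n_frames, n_keypoints, 2) holding (x, y) coordinates.
    """
    diff = frames[:, RIGHT_SHOULDER] - frames[:, LEFT_SHOULDER]
    return np.linalg.norm(diff, axis=1)


def shoulder_angle(frames: np.ndarray) -> np.ndarray:
    """Per-frame angle (in radians) of the shoulder line relative to the x-axis."""
    diff = frames[:, RIGHT_SHOULDER] - frames[:, LEFT_SHOULDER]
    return np.arctan2(diff[:, 1], diff[:, 0])


def normalize_by_shoulders(frames: np.ndarray) -> np.ndarray:
    """Center every frame on its shoulder midpoint and divide by the
    median shoulder distance of the whole sample (assumed choice of scale)."""
    mid = (frames[:, LEFT_SHOULDER] + frames[:, RIGHT_SHOULDER]) / 2.0
    scale = np.nanmedian(shoulder_distance(frames))
    return (frames - mid[:, None, :]) / scale
```

After this normalization, a signer standing closer to or further from the camera gives the same coordinates, which makes distance-based features comparable across samples. It does not correct for rotation.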
Solutions:
- Dominant hands: data augmentation, mirror the signs
- More features
- Keep SelectKBest?
- Binning
- Fingers too noisy
- Normalizing by shoulder distance is good
- Not convinced by rotating the shoulders
- ROBUST features
- Use more than 5 folds?
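
Mirroring the signs for dominant-hand augmentation could be sketched as follows. The sketch assumes samples of shape `(n_frames, n_keypoints, 2)`. The `SWAP_PAIRS` list of (left, right) keypoint indices is hypothetical and has to be adapted to the real keypoint layout. Mirroring around the mean x-coordinate of the sample is also an assumption; if the data is already centered, mirroring around `x = 0` works the same way.

```python
import numpy as np

# Hypothetical (left, right) index pairs, e.g. shoulders and wrists.
# These must be adapted to the actual keypoint layout.
SWAP_PAIRS = [(0, 1), (2, 3)]


def mirror_sample(frames: np.ndarray, swap_pairs=SWAP_PAIRS) -> np.ndarray:
    """Return a horizontally mirrored copy of one sample.

    The x-coordinates are reflected around the mean x of the sample, and
    left/right keypoints are swapped so that index semantics stay correct.
    """
    mirrored = frames.copy()
    center_x = np.nanmean(frames[..., 0])
    mirrored[..., 0] = 2.0 * center_x - mirrored[..., 0]
    for left, right in swap_pairs:
        # The right-hand side is a copy (fancy indexing), so this swap is safe.
        mirrored[:, [left, right]] = mirrored[:, [right, left]]
    return mirrored
```

When the folds are built with a grouped split, it is probably safest to give a mirrored copy the same group as its original, so that the two never end up on different sides of a train/validation split.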
## Feedback
- Just augment the data for left-right signers
- Increase the temporal resolution of the new features
- SelectKBest is not the greatest, but is probably not the problem here
- Finger distance probably shaky!
- Try running averages over time series maybe?
- Use binning?
- Maybe use denser grids and finetune more? Maybe use more folds
- Keep using the old features, but try to include all the new features we can get our hands on
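
The running-average and binning suggestions above can be tried out with a few lines of NumPy. This is a minimal sketch for a single 1-D feature series (for example one finger distance over time). The window size, the number of bins, and the value range are placeholder values, not tuned settings.

```python
import numpy as np


def running_average(values, window: int = 3) -> np.ndarray:
    """Moving average over a 1-D time series.

    Only full windows are used, so the output has len(values) - window + 1 entries.
    """
    values = np.asarray(values, dtype=float)
    kernel = np.ones(window) / window
    return np.convolve(values, kernel, mode="valid")


def bin_values(values, n_bins: int = 4, low: float = 0.0, high: float = 1.0) -> np.ndarray:
    """Map continuous values to integer bins 0 .. n_bins - 1.

    The range [low, high] is cut into n_bins equal-width bins; values outside
    the range fall into the first or last bin.
    """
    inner_edges = np.linspace(low, high, n_bins + 1)[1:-1]
    return np.digitize(values, inner_edges)
```

For example, `running_average([1, 2, 3, 4, 5], window=3)` gives `[2., 3., 4.]`. Smoothing before computing statistics and binning afterwards are two independent ways to reduce the influence of shaky keypoints, so they can be evaluated separately.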
## Session 02 December Feedback
- Do not check all classes at the same time for robustness (scatter plots). Check the classes for which the feature is intended.
- Mutual Information (but less informative)
- Calculate things from the keypoints, not just statistics. Trajectories / speeds
- Spatial features: areas or intersections
- Prof. Dambre's post is about new features
- For two-part splitting: maybe do not use two halves of the same length, but parts of different lengths?
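
The feedback on computing things from the keypoints themselves (speeds, areas) and on splitting a sequence into parts of unequal length can be illustrated with the sketch below. It assumes a single keypoint trajectory of shape `(n_frames, 2)` and a polygon given as ordered `(x, y)` points, for example both shoulders and both wrists. The function names and the default split fraction of 0.3 are hypothetical.

```python
import numpy as np


def keypoint_speeds(trajectory: np.ndarray) -> np.ndarray:
    """Frame-to-frame speed of one keypoint.

    trajectory has shape (n_frames, 2); the result has n_frames - 1 entries,
    in coordinate units per frame.
    """
    steps = np.diff(trajectory, axis=0)
    return np.linalg.norm(steps, axis=1)


def polygon_area(points: np.ndarray) -> float:
    """Area of a simple polygon via the shoelace formula.

    points has shape (n_points, 2) and must list the corners in order.
    """
    x, y = points[:, 0], points[:, 1]
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))


def split_means(values, fraction: float = 0.3):
    """Mean of the first part and of the remainder of a 1-D series.

    The series (at least two values) is cut at `fraction` of its length,
    so the two parts do not need to be equally long.
    """
    values = np.asarray(values, dtype=float)
    cut = int(round(fraction * len(values)))
    cut = min(max(cut, 1), len(values) - 1)  # keep both parts non-empty
    return values[:cut].mean(), values[cut:].mean()
```

Statistics such as the mean or maximum of `keypoint_speeds` can then be used as features, and `split_means` can be applied to any per-frame feature to compare different split points.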