# Machine Learning Assignment 1
### Mustafin Timur, B&S
## Description of the results.
### Logistic Regression

**Score:** 96.13165931455717
### SVM

**Score:** 95.79233118425518
### SVM with SGD

**Score:** 95.35120461486257
### Results
The data is almost linearly separable, but there is a small overlap between the `Sitting` and `Standing` classes: all three models made roughly the same errors there.
## Model comparison — effectiveness
The best accuracy score was achieved by `Logistic Regression`. This is likely because SVM (and SVM with SGD) tried to fit non-existent dependencies in the data.
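The original training code is not shown, so the following is a minimal sketch of how the three compared models might be set up in scikit-learn; the synthetic dataset and model hyperparameters are assumptions, not the assignment's actual data or configuration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression, SGDClassifier
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Toy stand-in for the assignment's activity-recognition dataset.
X, y = make_classification(n_samples=500, n_features=10, n_informative=6,
                           n_classes=3, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "SVM": SVC(),  # kernel SVM (RBF by default)
    "SVM with SGD": SGDClassifier(loss="hinge", random_state=0),  # linear SVM trained via SGD
}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    print(f"{name}: {model.score(X_te, y_te) * 100:.2f}")
```

Note that `SGDClassifier(loss="hinge")` optimizes a linear SVM objective stochastically, which explains why it can behave differently from the exact `SVC` solver on the same data.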
## Model comparison — running time
I measured how long `evaluate_model` takes for each model; the results are:
1. **SVM with SGD**: 53.765860080718994 seconds
1. **SVM**: 62.17709183692932 seconds
1. **Logistic Regression**: 129.93274903297424 seconds
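The `evaluate_model` function itself is not listed in this report; a minimal timing wrapper in the same spirit could look like the sketch below (the function name, toy dataset, and fit-then-score behavior are assumptions).

```python
import time

from sklearn.datasets import make_classification
from sklearn.svm import SVC


def timed_evaluate(model, X, y):
    """Fit the model, score it, and return (accuracy, elapsed seconds)."""
    start = time.perf_counter()
    model.fit(X, y)
    acc = model.score(X, y)
    return acc, time.perf_counter() - start


X, y = make_classification(n_samples=300, n_classes=3, n_informative=5,
                           random_state=0)
acc, elapsed = timed_evaluate(SVC(), X, y)
print(f"accuracy={acc:.3f}, time={elapsed:.3f}s")
```

`time.perf_counter()` is preferred over `time.time()` here because it is monotonic and has higher resolution, which matters for short measurements.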
## Applicability of precision and recall
We can't just call `sklearn.metrics.recall_score` with its defaults in this case, because classically these classifiers were binary, and we calculated `recall` as

$$\mathrm{recall} = \frac{TP}{TP + FN}$$

but this works only for the binary case (again, as it was designed). For our multiclass case we should redefine the metric to handle multiple classes. There are a few ways, selected via the `average` parameter (from the sklearn docs):
* 'micro':
Calculate metrics globally by counting the total true positives, false negatives and false positives.
* 'macro':
Calculate metrics for each label, and find their unweighted mean. This does not take label imbalance into account.
* 'weighted':
Calculate metrics for each label, and find their average weighted by support (the number of true instances for each label). This alters ‘macro’ to account for label imbalance; it can result in an F-score that is not between precision and recall.
* 'samples':
Calculate metrics for each instance, and find their average (only meaningful for multilabel classification where this differs from accuracy_score).
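The averaging strategies above can be illustrated with a small example; the toy labels are made up for demonstration ('samples' is omitted because it only applies to multilabel targets).

```python
from sklearn.metrics import recall_score

y_true = [0, 1, 2, 2, 1, 0, 2]
y_pred = [0, 2, 2, 2, 1, 0, 1]

# The default average='binary' raises a ValueError on multiclass labels,
# so an averaging strategy must be chosen explicitly:
print(recall_score(y_true, y_pred, average="micro"))     # global TP / (TP + FN)
print(recall_score(y_true, y_pred, average="macro"))     # unweighted mean over classes
print(recall_score(y_true, y_pred, average="weighted"))  # mean weighted by class support
```

Per-class recall here is 1.0 for class 0, 0.5 for class 1, and 2/3 for class 2, so 'macro' averages those equally while 'micro' and 'weighted' give more influence to the larger classes.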