# Processing with AI
## Exploration: AI & Ethics
Name:
> NARAYANA Thibault
>
Subject:
> Detect dermatological problems using Computer Vision

[TOC]
## Design brief
### Bias
If we don't source our dataset with enough rigour, the following biases might appear:
>1. We might **fail to detect conditions on minority skin tones**, depending on the set we use. E.g.: a model trained only on light-skinned subjects may be unable to detect anything on people with darker skin.
>2. If rare cases are not represented in our dataset, the **AI may not work at all** when it encounters them in reality. E.g. albino people, burnt skin...
>3. The same skin problem must be represented on different skins, otherwise the **AI could flag a skin problem that is merely a peculiarity of the subject's skin**.
We will ensure that our model is not biased by:
>1. Training a separate model, independently and in the same way, for each major skin colour type. This includes having a distinct dataset for each skin colour.
>2. Making sure that the same skin problems are represented for each model.
>3. Studying the real-life distribution of the diseases we want to detect and representing them accordingly in our datasets (see the sketch below).
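
As a rough sketch of that last check, the snippet below tabulates how each diagnosis is distributed within each skin-tone group. The file name `dataset_metadata.csv` and the columns `skin_tone` and `diagnosis` are assumptions made for illustration, not part of any real dataset we have:

```python
import pandas as pd

# Hypothetical metadata file: one row per image, with assumed columns
# "skin_tone" (e.g. a Fitzpatrick group) and "diagnosis" (disease label).
meta = pd.read_csv("dataset_metadata.csv")

# Share of each diagnosis within each skin-tone group.
per_group = (
    meta.groupby("skin_tone")["diagnosis"]
        .value_counts(normalize=True)
        .unstack(fill_value=0.0)
)
print(per_group)

# Flag diagnoses that are missing (or nearly missing) in some group:
# the model trained for that group could not learn to detect them.
under_represented = per_group.columns[(per_group < 0.01).any()]
print("Under-represented diagnoses:", list(under_represented))
```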
### Overfitting
We will make sure our model does not overfit by:
> Training our algorithms and verifying their performance on two separate subsets of the same dataset. As stated before, we'll have a different model for each major skin colour, so each dataset can focus purely on the representativeness of the skin diseases.
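
As a minimal sketch of that idea (with random placeholder features and labels standing in for real image data), scikit-learn's `train_test_split` can hold out a validation subset, stratified so each disease keeps the same proportion in both parts; a large gap between training and validation accuracy is the classic symptom of overfitting:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Placeholder data standing in for extracted image features and labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 32))
y = rng.integers(0, 3, size=1000)

# Hold out 20% as a validation set, stratified on the disease label.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# If training accuracy is far above validation accuracy, the model
# has memorised the training subset instead of generalising.
print("train accuracy:", accuracy_score(y_train, model.predict(X_train)))
print("val accuracy:  ", accuracy_score(y_val, model.predict(X_val)))
```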
### Misuse
>We have to keep in mind that our application could be misused, for instance by racists, to perform skin-colour detection.
### Data leakage
>In a catastrophic scenario where our entire training dataset was stolen, or reconstructed from our model, the risk would be that the data could be used to lend apparent scientific support to racial theories. E.g., a more pronounced prevalence of a particular disease among people of a particular skin colour.
### Hacking
> If someone found a way to "cheat" our model and force it to output any prediction the attacker wants instead of the real one, the risk would be that a person could be made to look like they have a skin disease when they don't. If such medical data were then shared, for example with insurance companies, this could have a negative impact on the data subject's life.
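
To make the threat concrete, here is a small sketch of one well-known way to "cheat" an image classifier, the Fast Gradient Sign Method (FGSM), in its targeted form; the tiny linear model, the input size, and the class labels are placeholders, not our actual skin-disease classifier:

```python
import torch
import torch.nn as nn

# Placeholder classifier standing in for a skin-lesion model
# (3-channel 64x64 input, 2 classes: 0 = healthy, 1 = diseased).
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 2))
model.eval()

image = torch.rand(1, 3, 64, 64, requires_grad=True)  # placeholder "photo"
target = torch.tensor([1])  # the class the attacker wants: "diseased"

# Targeted FGSM: take a tiny step on every pixel in the direction that
# lowers the loss towards the attacker's target class. The perturbation
# is small enough to be invisible to a human.
loss = nn.functional.cross_entropy(model(image), target)
loss.backward()
epsilon = 0.03
adversarial = (image - epsilon * image.grad.sign()).clamp(0.0, 1.0).detach()

print("prediction on perturbed image:",
      model(adversarial).argmax(dim=1).item())
```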