# Processing with AI
## Exploration: AI & Ethics
Name:
> Paul Buhr
>
Subject:
> Detect dermatological problems using Computer Vision
[TOC]
## Design brief
### Bias
If we don't source our dataset with enough rigor, the following biases might appear:
>First and foremost, Computer Vision might fail to recognize a given dermatological problem (like eczema), or a rarer one, simply because there is not enough data for it. This is a **non-recognition bias**.
>Second, Computer Vision might fail to recognize dermatological problems on certain types of skin if we don't have an adequate dataset (covering different skin colors and compositions). This is a **skin-diversity bias**.
>Third, an incomplete dataset could also mean that Computer Vision recognizes dermatological problems only on humans and not on animals, for example. This is a **race bias**, where our tool would be effective only for humans.
We will ensure that our model is not biased by:
>1. Sourcing our dataset from recognized dermatologists around the world, with pictures, videos, reports, testimonies...
>2. Making sure our dataset accounts for:
>- the de-identification process, so that no sensitive information is revealed.
>- the race and skin-diversity biases, through careful sourcing (see the audit sketch after this list).
>- broad coverage of every dermatological problem that exists. We want an **explainable AI** here.
>3. Releasing the Computer Vision program as open source, so anyone can inspect which data it takes into account and how.
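
To make point 2 concrete, here is a minimal sketch of a dataset audit, assuming a hypothetical metadata CSV with `condition` and `skin_type` columns (e.g., Fitzpatrick types I-VI); the file name and the column names are illustrative assumptions, not part of the brief.

```python
# A minimal dataset-audit sketch: count images per condition and per skin
# type to surface coverage gaps before training. The metadata file and its
# columns are hypothetical.
from collections import Counter
import csv

def audit_dataset(metadata_path: str) -> None:
    """Print per-condition and per-skin-type counts to surface coverage gaps."""
    conditions, skin_types = Counter(), Counter()
    with open(metadata_path, newline="") as f:
        for row in csv.DictReader(f):
            conditions[row["condition"]] += 1
            skin_types[row["skin_type"]] += 1

    print("Images per condition (rare conditions risk a non-recognition bias):")
    for condition, count in conditions.most_common():
        print(f"  {condition}: {count}")

    print("Images per skin type (gaps here signal a skin-diversity bias):")
    for skin_type, count in sorted(skin_types.items()):
        print(f"  {skin_type}: {count}")

audit_dataset("dermatology_metadata.csv")
```

Any condition or skin type with very few images is a candidate for additional sourcing before the model is trained.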
### Overfitting
We will make sure our model does not overfit by:
>- Limiting the number of parameters, so that each one can always be justified by the data (a simple train/validation gap check is sketched after this list).
>- Reducing **noise** as much as possible by filtering out what we do not want our detector to consider.
>- Checking that our model resists adversarial tricks like the One Pixel Attack.
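
As a minimal sketch of the train/validation gap check mentioned above, assuming image features have already been extracted into arrays `X` and `y`; the classifier choice and the 10% gap threshold are illustrative assumptions, not a prescription.

```python
# A minimal overfitting check: compare training and validation accuracy and
# warn when the gap is large. X and y are assumed to be pre-extracted
# feature arrays and labels.
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

def check_overfit(X, y, gap_threshold: float = 0.10) -> None:
    """Flag the model if training accuracy far exceeds validation accuracy."""
    X_train, X_val, y_train, y_val = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=42
    )
    # Keep the model small so every parameter is justified by the data.
    model = RandomForestClassifier(n_estimators=50, max_depth=8)
    model.fit(X_train, y_train)

    train_acc = accuracy_score(y_train, model.predict(X_train))
    val_acc = accuracy_score(y_val, model.predict(X_val))
    print(f"train={train_acc:.3f}  val={val_acc:.3f}")
    if train_acc - val_acc > gap_threshold:
        print("Warning: large train/validation gap, the model may be overfitting.")
```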
### Misuse
>We have to keep in mind that our application could be misused by dermatologists **for commercial research**, extracting money from such detections in one way or another.
>Our application could also be misused by bioterrorists, who could try to spread chemical products in order to **aggravate serious dermatological problems**.
### Data leakage
>We have decided that our training dataset will be fully open-sourced, but before releasing it we made sure that every record went through a careful de-identification process (anonymization or pseudonymization, removal of racial and contextual information, suppression of quasi-identifiers...).
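
As a minimal sketch of such a de-identification step, assuming hypothetical record fields; the quasi-identifier list and the salted-hash pseudonymization are illustrative choices, not a prescription.

```python
# A minimal de-identification sketch: drop quasi-identifiers and replace the
# patient ID with a salted hash, so the same patient maps to the same token
# without being reversible. Field names are hypothetical.
import hashlib

QUASI_IDENTIFIERS = {"name", "birth_date", "zip_code", "ethnicity", "hospital"}
SALT = "replace-with-a-secret-salt"  # kept out of the published dataset

def de_identify(record: dict) -> dict:
    """Return a copy of the record that is safe to publish in the open dataset."""
    clean = {k: v for k, v in record.items() if k not in QUASI_IDENTIFIERS}
    clean["patient_id"] = hashlib.sha256(
        (SALT + str(record["patient_id"])).encode()
    ).hexdigest()[:16]
    return clean

record = {"patient_id": 1042, "name": "Jane Doe", "zip_code": "75001",
          "condition": "eczema", "image": "img_1042.png"}
print(de_identify(record))
```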
### Hacking
> If someone found a way to "cheat" our model and make it output any prediction the attacker wants instead of the real one, the risks would be that:
> - Computer Vision may report wrong dermatological problems, or whatever the cheater wants, to patients
> - Basic vital information may be stolen from current and former clients
> - Feelings of panic and anger against Artificial Intelligence may spread further in the medical world
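
One cheap mitigation sketch against pixel-level tampering such as the One Pixel Attack: refuse to diagnose inputs whose prediction is not stable under small random noise. Here `model.predict` is a placeholder for whatever classifier the application actually uses; the noise scale and trial count are assumptions.

```python
# A minimal robustness-check sketch: if tiny random perturbations flip the
# model's prediction, the input is flagged instead of being diagnosed.
import numpy as np

def robust_predict(model, image: np.ndarray, n_trials: int = 10,
                   noise_scale: float = 2.0):
    """Return the prediction only if it is stable under small pixel noise."""
    base = model.predict(image)
    for _ in range(n_trials):
        noisy = np.clip(image + np.random.normal(0, noise_scale, image.shape),
                        0, 255)
        if model.predict(noisy) != base:
            return None  # unstable input: route to a human dermatologist
    return base
```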