# Processing with AI

## Exploration: AI & Ethics

Name:

> Paul Buhr

Subject:

> Detect dermatological problems using Computer Vision

[TOC]

## Design brief

### Bias

If we don't source our dataset with enough rigor, the following biases might appear:

> First and foremost, Computer Vision might fail to recognize one dermatological problem (like eczema), or a rarer one, because there is not enough data for it. This is a **non-recognition bias**.
>
> Second, Computer Vision might fail to recognize dermatological problems on some skin types if we don't have an adequate dataset (covering different skin colors and compositions). This is a **lack of skin diversity bias**.
>
> Third, an incomplete dataset also means that Computer Vision might recognize dermatological problems only on humans and not on animals, for example. This is a **species bias**, where our tool would be effective only for humans.

We will ensure that our model is not biased by:

> 1. Sourcing our dataset from recognized dermatologists around the world, with pictures, videos, reports, testimonies...
> 2. Making sure our dataset takes into account:
>    - a de-identification process, so as not to reveal any sensitive information;
>    - the species and lack-of-diversity biases, thanks to careful sourcing (a minimal audit sketch appears in the Sketches section at the end of this brief);
>    - broad coverage of every dermatological problem that exists. We want an **explainable AI** here.
> 3. Building the Computer Vision system as an open-source program, so that we can understand which data it takes into account and how.

### Overfitting

We will make sure our model does not overfit by:

> - Checking the relevance of our parameters, and not using too many of them, so that each one can always be justified by the data.
> - Reducing the **noise** as much as possible, by filtering out what we do not want our detection to consider.
> - Checking that our model resists attacks such as the One Pixel Attack.

A minimal sketch of a train/validation overfitting check appears in the Sketches section at the end of this brief.

### Misuse

> We have to remind ourselves that our application could be misused by dermatologists **for commercial research**, in order to extract money from such detection in any way, shape, or form.
>
> Our application could also be misused by bioterrorists, who could try to spread chemical products to **aggravate serious dermatological problems**.

### Data leakage

> We have decided that our training dataset will be fully open-sourced, but only after making sure that every item in it has gone through a careful de-identification process (anonymization or pseudonymization, no racial or contextual information, suppression of quasi-identifiers...). One step of that process is sketched in the Sketches section at the end of this brief.

### Hacking

> If someone found a way to "cheat" our model and make it output any prediction the attacker wants instead of the real one, the risks would be that:
>
> - Computer Vision reports wrong dermatological problems, or whatever the attacker wants, to patients;
> - basic vital information about current and former patients may be stolen;
> - feelings of panic and anger against Artificial Intelligence spread further through the medical world.
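### Sketches

The dataset diversity audit promised in the Bias section, as a minimal sketch: count samples per (condition, skin type) group and flag groups that are too small. The field names (`condition`, `skin_type`, e.g. a Fitzpatrick scale group) and the threshold are illustrative assumptions, not part of the brief.

```python
# Minimal dataset diversity audit (sketch). Field names ("condition",
# "skin_type") and the 100-sample threshold are hypothetical illustrations.
from collections import Counter

def audit_diversity(records: list[dict], min_per_group: int = 100) -> list[tuple]:
    """Return (condition, skin_type) pairs with fewer than min_per_group samples."""
    counts = Counter((r["condition"], r["skin_type"]) for r in records)
    # Any pair below the threshold is a candidate for the
    # "lack of skin diversity" bias described in the Bias section.
    return sorted((pair, n) for pair, n in counts.items() if n < min_per_group)

# Usage:
# records = [{"condition": "eczema", "skin_type": "V"}, ...]
# for (condition, skin_type), n in audit_diversity(records):
#     print(f"under-represented: {condition} on skin type {skin_type} ({n} samples)")
```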
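For the Overfitting section, a minimal sketch of the train/validation check, assuming a PyTorch classifier and the usual DataLoaders; the 0.05 gap threshold is an arbitrary illustration, not a recommendation.

```python
# Train/validation gap check (sketch): a large accuracy gap suggests the
# model has memorised noise instead of learning general features.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader

@torch.no_grad()
def accuracy(model: nn.Module, loader: DataLoader) -> float:
    model.eval()
    correct = total = 0
    for images, labels in loader:
        preds = model(images).argmax(dim=1)
        correct += (preds == labels).sum().item()
        total += labels.size(0)
    return correct / total

def is_overfitting(model, train_loader, val_loader, threshold: float = 0.05) -> bool:
    gap = accuracy(model, train_loader) - accuracy(model, val_loader)
    print(f"train/validation accuracy gap: {gap:.3f}")
    # True -> consider fewer parameters, more data, or stronger regularisation.
    return gap > threshold
```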
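One concrete step of the de-identification process from the Data leakage section, sketched with Pillow: re-saving only the raw pixels drops EXIF blocks (which can contain GPS coordinates, timestamps, and device serial numbers, i.e. quasi-identifiers), and a salted hash replaces the patient-derived filename. The salt value and paths are placeholders.

```python
# De-identification sketch: strip embedded metadata and pseudonymise the
# filename. SALT is a placeholder and must be kept secret in practice.
import hashlib
from pathlib import Path
from PIL import Image

SALT = b"replace-with-a-secret-salt"

def deidentify(src: Path, out_dir: Path) -> Path:
    img = Image.open(src)
    # Copying only the pixel data into a fresh image drops EXIF and any
    # other metadata blocks embedded in the original file.
    clean = Image.new(img.mode, img.size)
    clean.putdata(list(img.getdata()))
    # A salted hash breaks the link between the file and the patient's name.
    pseudonym = hashlib.sha256(SALT + src.stem.encode()).hexdigest()[:16]
    dst = out_dir / f"{pseudonym}.png"
    clean.save(dst)
    return dst
```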
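Finally, to show how the "cheating" described in the Hacking section can work in practice, a minimal sketch of the Fast Gradient Sign Method (FGSM), a classic adversarial attack related to the One Pixel Attack mentioned earlier: it nudges every pixel slightly in the direction that increases the loss, often flipping the prediction while the image looks unchanged to a human. `model` stands for any trained PyTorch classifier; the epsilon value is illustrative.

```python
# FGSM adversarial-example sketch: perturb an image to change the prediction.
import torch
import torch.nn as nn

def fgsm_attack(model: nn.Module, image: torch.Tensor, label: torch.Tensor,
                epsilon: float = 0.01) -> torch.Tensor:
    """Return an adversarially perturbed copy of `image` (shape [N, C, H, W])."""
    image = image.clone().detach().requires_grad_(True)
    loss = nn.functional.cross_entropy(model(image), label)
    loss.backward()
    # Step each pixel in the direction that increases the loss, then clamp
    # back to the valid [0, 1] range so the image stays displayable.
    perturbed = image + epsilon * image.grad.sign()
    return perturbed.clamp(0.0, 1.0).detach()

# If model(adv).argmax(dim=1) differs from the original prediction while `adv`
# looks identical to `image`, the model is vulnerable and needs adversarial
# training or input-filtering defenses.
```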