# Learning Differentially Private Models
If you are not familiar with machine learning, you can watch this short video first: [link](https://developers.google.com/machine-learning/practica/image-classification)
In a practical setting, an image classifier is normally trained on a fixed set of images (called the training data). After training, the model should be able to classify images that are similar to the training data.
One attack on machine learning models is called a "membership inference attack".
Such an attack predicts whether a given image was part of the training data.

### Differential Privacy in ML models
Background reading: [differential privacy](https://en.wikipedia.org/wiki/Differential_privacy).
If we remove a single image from the training data, will the model behave differently?
The more the model's behavior changes, the less differentially private we say it is.
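For reference, the standard $\varepsilon$-DP definition formalizes this intuition: a randomized training algorithm $M$ is $\varepsilon$-differentially private if, for any two datasets $D$ and $D'$ that differ in a single example,

$$\Pr[\,M(D) \in S\,] \;\le\; e^{\varepsilon}\,\Pr[\,M(D') \in S\,] \quad \text{for every set of outputs } S.$$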
### What does that have to do with differential privacy (DP)?
The process of training a model eventually causes it to memorize the training data.
Therefore, when the model sees an image from the training data again, not only will it predict it correctly, it will also predict it with extremely high confidence (since it basically knows it).
Thus, one easy attack is to directly observe the model's confidence.
This is closely related to DP because the model behaves differently (assigning high confidence) when an image is in the training data.
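As a rough illustration, here is a minimal sketch of a confidence-threshold membership inference attack. The classifier `model`, the input tensor `image`, and the threshold `tau` are hypothetical placeholders, not part of this write-up.

```python
# Minimal sketch of a confidence-threshold membership inference attack,
# assuming a trained PyTorch classifier and a single image tensor (C, H, W).
import torch
import torch.nn.functional as F

def predict_membership(model, image, tau=0.95):
    """Guess 'member' if the model's top softmax confidence exceeds tau."""
    model.eval()
    with torch.no_grad():
        logits = model(image.unsqueeze(0))            # add a batch dimension
        confidence = F.softmax(logits, dim=1).max().item()
    return confidence > tau                           # True -> likely in the training data
```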
***Hint: we want to train a model that does not assign high confidence to the training data, yet keeps the same predictive power.***
### My Method
What if we prepare a synthetic (actually half-synthetic) training dataset that is not too similar to the original training data but still preserves the model's predictive power?
The simple idea is that the model never sees the actual training data, so the core assumption of a membership inference attack (that models predict with high confidence on training data) no longer holds.
### How do we create such a dataset?
Goals of this dataset:
* A model trained on it should have the same predictive power as one trained on the original training data.
* It cannot be too similar to the original training data.
*Applying heavy data augmentations can do the trick (see the sketch below).*
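A hedged sketch of how "heavy" augmentation might be used to build such a half-synthetic training set. The specific transforms and their magnitudes are illustrative assumptions, not the exact recipe of this method.

```python
# Illustrative "heavy" augmentation pipeline (torchvision); the choice and
# strength of transforms here are assumptions for demonstration only.
from torchvision import transforms

heavy_augment = transforms.Compose([
    transforms.RandomResizedCrop(32, scale=(0.3, 1.0)),   # aggressive cropping
    transforms.RandomHorizontalFlip(),
    transforms.ColorJitter(brightness=0.8, contrast=0.8,
                           saturation=0.8, hue=0.2),       # strong color shifts
    transforms.RandomRotation(30),
    transforms.ToTensor(),
    transforms.RandomErasing(p=0.5),                       # occlude random patches
])

# Each original image would be replaced by one (or more) heavily augmented
# copies, so the model never trains on the untouched originals.
```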
I'll probably explain it in more detail when we discuss it. This is basically the whole background of my idea.