# Data Set

The WIKI data set from Section X was further subset by target label (age, gender) and data balancing method (low, high). Each of the four data subsets was assigned a data tag (e.g., `wiki-age-low`); these define the four data sets that each model in Section X was run on for each experiment.

[insert: table of target label counts]

name | original values | binary condition | binary label | favourable label
--- | --- | --- | --- | ---
age* | {-75..321} | x > 0, x < 30 | {0, 1} | 1
gender | {0, 1} | x = 1 | {0, 1} | 1

The two target labels correspond to two different tasks:

1. infer whether or not an image corresponds to a favourable age
2. infer whether or not an image corresponds to a favourable gender

The favourable age is defined as a valid age below 30 years old (y=1) at the time the image was taken. Valid ages are positive, i.e., non-negative and non-zero (see Table X). The age label was derived from the person's date of birth and the year the photo was taken (i.e., `age = photo_year - date_of_birth`).

The favourable gender is defined to be male (y=1) in the data set. No further processing was done to this target label.

### Protected Attributes

The data set provides only two variables that can serve as protected attributes: age and gender. Because these two variables also double as target variables for the inference tasks described in Section X, the free variable, i.e., the non-target label, is used as the protected attribute for that task. Target labels and protected attributes are never assigned to the same variable, to avoid exposing the model to those values. For example, if the target label is "gender", then "age" becomes the protected attribute, using the original values per Table X.

Favourable and unfavourable labels for a protected attribute are used to train the base model only. Privileged and unprivileged groups for a protected attribute are not used to train the base model; both groups are used to train the bias mitigation model and to calculate statistical parity scores.

#### Favourable Labels

From protected attributes, favourable labels are the binary labels that meet the binary condition. These labels correspond to general favourability defined by societal and cultural norms. For age, the favourable label covers ages below 30 (i.e., 1) and the unfavourable label covers ages equal to or greater than 30 (i.e., 0) (see Table X). Similarly, for gender, the favourable label is male (i.e., 1) and the unfavourable label is female (i.e., 0). The favourable label relates to the target label that a model predicts in either task.

#### Privileged Group

The privileged group comprises the original values in Table X that meet the binary condition for the favourable label. These groups correspond to the latent protected attribute values that a model can show bias for or against. For age, the privileged group contains ages below 30 (e.g., 21, 23, 25) and the unprivileged group contains ages equal to or greater than 30 (e.g., 33, 45, 67). For gender, the privileged group consists of male observations (i.e., 1) and the unprivileged group consists of female observations (i.e., 0).

### Balancing Labels

The WIKI data set's target labels are unbalanced (see Table X). Specifically, the favourable label outnumbers the unfavourable label for both tasks. To address this, multiple subsets were created by oversampling the underrepresented label (high sampling) and undersampling the overrepresented label (low sampling) (see Table X), as sketched below.
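To make the two sampling methods concrete, the following is a minimal sketch, assuming labels are balanced to exactly equal counts and that the data are held in a pandas DataFrame; `balance_labels` and the `wiki` frame are hypothetical names, and the exact procedure behind the counts in Table X may differ.

```python
import pandas as pd

def balance_labels(df: pd.DataFrame, label_col: str, method: str, seed: int = 0) -> pd.DataFrame:
    """Balance a binary label column by over- or undersampling whole rows,
    so each duplicated or dropped label keeps its associated protected attributes.

    method="high": oversample the underrepresented label (duplicate rows).
    method="low":  undersample the overrepresented label (drop rows).
    """
    counts = df[label_col].value_counts()
    minority, majority = counts.idxmin(), counts.idxmax()

    if method == "high":
        # Duplicate randomly chosen minority rows until the label counts match.
        extra = df[df[label_col] == minority].sample(
            n=counts[majority] - counts[minority], replace=True, random_state=seed
        )
        return pd.concat([df, extra], ignore_index=True)
    if method == "low":
        # Keep a random majority subsample of the same size as the minority.
        kept = df[df[label_col] == majority].sample(n=counts[minority], random_state=seed)
        return pd.concat([df[df[label_col] == minority], kept], ignore_index=True)
    raise ValueError(f"unknown sampling method: {method!r}")

# e.g., a low-sampled age subset: balance_labels(wiki, "age", method="low")
```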
These data balancing methods create unique scenarios that challenge the model differently during fine-tuning. Underrepresented labels and their associated protected attributes are duplicated; overrepresented labels and their associated protected attributes are dropped from the training data.

[insert: table with counts]

name | count(0) | count(1)
--- | --- | ---
gender | 12,260 (0.21) | 47,040 (0.79)
age* | 29,120 (0.48) | 31,781 (0.52)

[insert: table of training data]

Data Tag | Prot. Attribute | Sampling | Size
--- | --- | --- | ---
wiki | Gender | High | 76,381
wiki | Gender | Low | 24,838
wiki | Age | High | 68,332
wiki | Age | Low | 32,867

# Base Model

Models from both transformer and convolutional neural network (CNN) architectures were selected for this project (see Table X). These models were selected for their popularity (e.g., ViT), novelty (e.g., ConvNeXt), and unique features (e.g., CLIP). This diverse set of architectures helps determine whether bias is learnt and introduced by the model, and to what extent.

[insert: table of models]

### Fine-Tuning

Each model (n=7) was fine-tuned on each of the four subsets described in Section X. Training was performed according to the schedule in Table X. The final sets of fine-tuned transformers (n=16) and CNNs (n=12) were used for all experiments.

The only model modification was made to CLIP, a transformer model that takes both image and text as input. Because the data set has no text associated with each image, the text input branch of the CLIP model was disabled.

[insert: table of training information]

[insert: table of performance metrics]

parameter | value
--- | ---
Epochs | 20
Batch size | 8 train / 8 validation
Learning rate | 1e-06
Epsilon | 1e-08
Weight decay | 1e-02
GPU | V100

Table X shows the model F<sub>1</sub> scores at the end of training. Bias mitigation has been reported to lower model performance (cite), and the same effect is observed here. The decrease in score occurs because debiasing the model's output directly counters bias that the model implicitly learns and leverages during inference.

[insert: model scores]

| model_tag | wiki-age-high | wiki-age-low | wiki-gender-high | wiki-gender-low |
|:--------------|----------------:|---------------:|-------------------:|------------------:|
| vit | 0.73 | 0.71 | 0.87 | 0.83 |
| swin | 0.72 | 0.70 | 0.83 | 0.81 |
| deit | 0.76 | 0.72 | 0.89 | 0.84 |
| clip | 0.87 | 0.78 | 0.94 | 0.88 |
| resnet | 0.66 | 0.65 | 0.75 | 0.72 |
| densenet | 0.66 | 0.66 | 0.78 | 0.74 |
| convnext_tiny | 0.72 | 0.70 | 0.85 | 0.81 |

# Mitigation Model

The reject option classifier (ROC) from the AIF360 package (cite X) was selected as the bias mitigation strategy. It is one of the simpler postprocessing methods available in that package. Each fine-tuned model had a dedicated ROC model trained for it; that is, ROC models were not shared across different base models.

### Training

To train the ROC, the fine-tuned base model was used to generate logits from a smaller set of training data than that used to fine-tune the model. This smaller data set was not included in the base model's training data. The ROC model was then trained on these logits using _k_-fold cross-validation. Two parameters are learnt during training: the classification threshold and the ROC margin. The parameter values learnt across folds are averaged to create the final set of parameters assigned to the saved ROC model, as sketched below.
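As a sketch of this training loop, assuming the held-out data are wrapped as AIF360 `BinaryLabelDataset`s (`dataset_true` with the true labels, `dataset_pred` with the base model's scores) and using gender as the illustrative protected attribute, the per-fold parameters could be collected and averaged as follows; `classification_threshold` and `ROC_margin` are the attributes AIF360's implementation sets during `fit`.

```python
import numpy as np
from sklearn.model_selection import KFold
from aif360.algorithms.postprocessing import RejectOptionClassification

# Placeholders: dataset_true / dataset_pred are AIF360 BinaryLabelDatasets over
# the held-out split, with the base model's scores in dataset_pred.scores.
privileged_groups = [{"gender": 1}]    # illustrative: the free (non-target) variable
unprivileged_groups = [{"gender": 0}]

thresholds, margins = [], []
for fold_idx, _ in KFold(n_splits=5, shuffle=True, random_state=0).split(dataset_true.features):
    roc = RejectOptionClassification(
        unprivileged_groups=unprivileged_groups,
        privileged_groups=privileged_groups,
        metric_name="Statistical parity difference",
    )
    # Fit on this fold: AIF360 grid-searches classification thresholds and
    # ROC margins to satisfy the fairness metric's bounds.
    roc.fit(dataset_true.subset(fold_idx), dataset_pred.subset(fold_idx))
    thresholds.append(roc.classification_threshold)
    margins.append(roc.ROC_margin)

# Average the per-fold parameters and assign them to the final, saved ROC model.
final_roc = RejectOptionClassification(
    unprivileged_groups=unprivileged_groups,
    privileged_groups=privileged_groups,
)
final_roc.classification_threshold = float(np.mean(thresholds))
final_roc.ROC_margin = float(np.mean(margins))
```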