# Workshop 1 Agenda
## Running the SAD Workflows
Command: `python -m gtc_sad WORKFLOW`
### `prepare_predict`
Retrieves data from GitHub and PSQL, generates a feature dataset, and stores all of it on the data lake.
### `prepare_training`
Retrieves data from the human evaluation sheet and generates labels from it and from any heuristics on the data lake.
### `fit`
Trains an ML model on the features and label set, then stores the pickled model on the data lake.
### `predict`
Generates a prediction dataset based on the fitted model and the existing features.
### `prepare_human_eval`
Generates human evaluation sheets based on the existing dataset of predictions and labels.
### `push_endpoint`
Generates a "clean" dataset for marking users as sybil / non-sybil on the Gitcoin dashboard.
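The workflows above can be chained from a small Python driver. The following is a minimal sketch, not part of the SAD codebase: the helper names `build_command` and `run_workflows` are hypothetical, and running the workflows in the order they are listed here is an assumption. It only relies on the documented `python -m gtc_sad WORKFLOW` command.

```python
import subprocess
import sys

# Workflow names as listed in this agenda.
# Assumption: they are meant to run in this order.
WORKFLOWS = [
    "prepare_predict",
    "prepare_training",
    "fit",
    "predict",
    "prepare_human_eval",
    "push_endpoint",
]


def build_command(workflow):
    """Return the argv list equivalent to `python -m gtc_sad WORKFLOW`."""
    if workflow not in WORKFLOWS:
        raise ValueError(f"unknown workflow: {workflow!r}")
    return [sys.executable, "-m", "gtc_sad", workflow]


def run_workflows(workflows=WORKFLOWS, dry_run=True):
    """Build the command for each workflow and optionally execute them.

    With dry_run=True (the default) the commands are only returned.
    With dry_run=False each command runs in turn; check=True makes
    subprocess raise CalledProcessError at the first failing workflow.
    """
    commands = [build_command(name) for name in workflows]
    if not dry_run:
        for command in commands:
            subprocess.run(command, check=True)
    return commands
```

Calling `run_workflows(["fit", "predict"], dry_run=False)` would retrain and re-predict only, assuming the features and labels already exist on the data lake.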
## Workflow outputs
(just browse the GCS bucket for R12 / R13)
## Setting up the environment for SAD
### Creating the data sinks: GCS bucket and human evaluation sheet
- Create a GCS bucket
- Clone a Human Evaluation Sheet
### Getting credentials: GCP and GitHub
- Create a GCP service account (`config/gcp_credentials.json`)
- Enable the Google Sheets API (`config/credentials_human_eval_sheets.json`)
- Create a GitHub token (`config/github_token.txt`)
- Set up the parameters (`config/params.json`)
- Set up the cloud definitions (`gtc_sad/cloud/definitions.py`)
- Connect to a secure VPN (`config/psql_conn_string.txt`)
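Before running any workflow, it can help to confirm that every file named in the checklist above is in place. The sketch below is an illustration only: the `missing_config_files` helper is hypothetical, and it assumes the paths are relative to the repository root. It checks that the files exist, not that their contents are valid.

```python
from pathlib import Path

# Paths taken from the setup checklist.
# Assumption: they are relative to the repository root.
REQUIRED_FILES = [
    "config/gcp_credentials.json",
    "config/credentials_human_eval_sheets.json",
    "config/github_token.txt",
    "config/params.json",
    "gtc_sad/cloud/definitions.py",
    "config/psql_conn_string.txt",
]


def missing_config_files(repo_root="."):
    """Return the required paths that are not regular files under repo_root."""
    root = Path(repo_root)
    return [path for path in REQUIRED_FILES if not (root / path).is_file()]
```

An empty list from `missing_config_files()` means all six files were found; any returned path points at a setup step that still needs to be done.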
### Testing the SAD
- Visual Studio Code
## Homework
### Required
Without these items, we cannot run workshop #2 unless
we deactivate certain features.
- Get access to a trusted static IP for running the SAD (quick solution: PureVPN)
- Get PostgreSQL credentials
### Desirable
Having these items will make workshop #2 run more smoothly.
- A clean human evaluation sheet
- GitHub token
- GCP project
- GCS bucket
## Pointers
- R13 sheet: https://docs.google.com/spreadsheets/d/1P-plAeFmChHgKwnS2hk3TxpjZNH6xmncu3hCt9WUHfs
- GCS buckets: https://console.cloud.google.com/storage/browser?authuser=1&project=gitcoin-322518&prefix=
- R13 bucket: https://console.cloud.google.com/storage/browser/round_13;tab=objects?forceOnBucketsSortingFiltering=false&authuser=1&project=gitcoin-322518&prefix=&forceOnObjectsSortingFiltering=false
- Cloud IAM: https://console.cloud.google.com/iam-admin/iam?authuser=1&project=gitcoin-322518