# Xendit Reliability ETL
This project organizes the data we need to analyze how reliably we operate as an engineering organization. The current data sources we look at are:
1. Incidents
2. PagerDuty
3. Sentry
A primer on the above can be found in our [RFC documents](https://drive.google.com/drive/u/1/folders/1ltxy0JbLY53PX7mtwow7Depc2FqflfJ-), so it will not be covered in this document.
## Extension
There are a few areas you need to understand before you can contribute to this project:
1. Source Data
2. Data Schemas
### Source Data
As mentioned above, the data sources that currently inform our reliability picture are (this list will expand):
1. Incidents
2. PagerDuty
3. Sentry
4. Github
5. DataDog
#### Incidents
Our incidents data comes from [postmortem docs](https://drive.google.com/drive/u/1/folders/0ANGugt3gnJvqUk9PVA). We have automated this. Follow the steps below to perform ETL on the incidents data:
##### Steps
- Set `google_client_id`, `google_client_secret`, and `google_client_redirect_uri` as environment variables.
- Run `node scripts/extractors/postmortem.js`. This script downloads all postmortem docs from the drive and stores them as HTML files in `raw_data/postmortems`.
- Run `node scripts/extractors/incidents.js`. This script runs the extract stage and writes the parsed incident data to `clean_data/incidents.json`.
##### Structures
```
[
  ...
  {
    "incident_name": "<INCIDENT_NAME>",
    "postmortem_link": "<POSTMORTEM_LINK>",
    "statuspage_link": "<STATUSPAGE_LINK>",
    "incident_date": "<INCIDENT_DATE>",
    "affected_entities": [
      {
        "entity_name": "<AFFECTED_ENTITY_NAME>",
        "products": [
          "<AFFECTED_PRODUCT>",
          ...
        ]
      },
      ...
    ],
    "root_causes": [
      "<ROOT_CAUSE>",
      ...
    ],
    "stats": {
      "time_of_first_trigger": "<TIME_OF_FIRST_TRIGGER>",
      "time_of_customer_detect": "<TIME_OF_CUSTOMER_DETECT>",
      "time_of_internal_detect": "<TIME_OF_INTERNAL_DETECT>",
      "time_of_recovery": "<TIME_OF_RECOVERY>",
      "time_of_reconcile": "<TIME_OF_RECONCILE>",
      "time_of_rca": "<TIME_OF_RCA>",
      "number_of_impacted_customers": "<NUMBER_OF_IMPACTED_CUSTOMERS>",
      "number_of_failed_requests": "<NUMBER_OF_FAILED_REQUESTS>",
      "severity_level": "<SEVERITY_LEVEL>"
    },
    "reliability_gaps": "<RELIABILITY_GAPS>"
  }
  ...
]
```
#### PagerDuty
The process for getting performance data from PagerDuty is as follows:
1. Run the extractor script with `PAGERDUTY_SECRET=<YOUR_PAGERDUTY_SECRET> node scripts/extractors/pagerduty.js`
2. Make sure the raw data gets written to:
1. `/raw_data/pagerduty_services/`
2. `/raw_data/pagerduty_incidents/`
3. `/raw_data/pagerduty_log_entries/`
4. `/raw_data/pagerduty_teams/`
5. `/raw_data/pagerduty_users/`
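The PagerDuty REST API paginates its list endpoints with `offset`/`limit` parameters and a `more` flag, so an extractor like the one above typically loops until `more` is false. A minimal sketch of that loop, where `fetchPage` is a stand-in for the real HTTP call (which would send the `Authorization: Token token=<YOUR_PAGERDUTY_SECRET>` header):

```javascript
// Hypothetical pagination loop for PagerDuty's offset-based pagination.
// `fetchPage(offset, limit)` stands in for the real HTTP request and must
// resolve to an object of the shape { resources: [...], more: boolean }.
async function fetchAll(fetchPage, limit = 100) {
  const results = [];
  let offset = 0;
  for (;;) {
    const page = await fetchPage(offset, limit);
    results.push(...page.resources);
    // The API sets more=false on the final page.
    if (!page.more) break;
    offset += limit;
  }
  return results;
}
```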
#### Sentry
Sentry offers us an API to read the information about our usage and statistics. In order to pull this data, you must have access to a Sentry API secret key (we're currently using personal tokens to avoid having to go through the OAuth flow).
1. Run the Sentry extractor with `SENTRY_API_KEY=<YOUR_API_KEY> node scripts/extractors/sentry.js`
2. Check to make sure that the data has been written to:
1. `/raw_data/sentry_projects/`
2. `/raw_data/sentry_issues/`
3. `/raw_data/sentry_teams/`
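One common question the raw Sentry data can answer is which projects are generating the most issues. The helper below is a hypothetical aggregation over the dumped issue objects, assuming each carries a `project` object with a `slug` (as Sentry issue payloads do):

```javascript
// Hypothetical aggregation: count raw Sentry issues per project so the
// load stage can compare error volume across services.
function issuesPerProject(issues) {
  const counts = {};
  for (const issue of issues) {
    const slug = issue.project && issue.project.slug;
    // Skip malformed records with no project reference.
    if (!slug) continue;
    counts[slug] = (counts[slug] || 0) + 1;
  }
  return counts;
}
```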
#### Github
Our Github data lets us check whether we've implemented our RFCs correctly. To pull the Github data, run the following:
1. Run the extractor by calling `GITHUB_API_KEY=<YOUR_API_KEY> node scripts/extractors/github.js`
2. Check to make sure the data has been written to:
1. `/raw_data/github_teams/`
2. `/raw_data/github_repos/`
3. `/raw_data/github_repo_teams/`
4. `/raw_data/github_repo_hooks/`
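One example of an RFC-compliance check this data enables: flagging repos that have no webhook configured. The sketch below is hypothetical (not one of the extractor scripts) and assumes `repoHooks` maps a repo's `full_name` to its array of hook objects from `/raw_data/github_repo_hooks/`:

```javascript
// Hypothetical check: list repos with no webhook configured, a likely
// signal that an RFC-mandated integration was never set up.
function reposMissingHooks(repos, repoHooks) {
  return repos
    .filter((repo) => !(repoHooks[repo.full_name] || []).length)
    .map((repo) => repo.full_name);
}
```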
#### DataDog
To pull DataDog data, run the following:
1. Run the extractor by calling `DATADOG_API_KEY=<YOUR_API_KEY> DATADOG_APPLICATION_KEY=<YOUR_APP_KEY> node scripts/extractors/datadog.js`
2. Check to make sure the data has been written to:
1. `/raw_data/datadog_monitors/`
2. `/raw_data/datadog_synthetics/`
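A simple summary one might build from the raw monitor dumps is a tally by monitor `type` (DataDog monitor objects carry a `type` field such as `"metric alert"`). This is a hypothetical helper, not part of the extractor:

```javascript
// Hypothetical summary: tally DataDog monitors by their `type` field
// from the raw monitor dumps in /raw_data/datadog_monitors/.
function monitorsByType(monitors) {
  return monitors.reduce((tally, monitor) => {
    tally[monitor.type] = (tally[monitor.type] || 0) + 1;
    return tally;
  }, {});
}
```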
### Data Schemas
Below is an outline of the data schemas that the source data is loaded into, which lets us run complex queries more efficiently.
#### Incidents

#### PagerDuty

#### Sentry

#### Github

#### DataDog

#### Developer Survey

1. Incidents
   1. Incidents
   2. AffectedProducts
   3. RootCauses
2. PagerDuty
   1. PagerDutyTeams
   2. PagerDutyUsers
   3. PagerDutyUserTeams
   4. PagerDutyUserContactMethods
   5. PagerDutyServices
   6. PagerDutyIncidents
   7. PagerDutyLogEntries
3. Sentry
   1. SentryProjects
   2. SentryTeams
   3. SentryIssues
   4. SentryIssueEvents
4. Github
   1. GithubRepos
   2. GithubTeams
   3. GithubRepoTeams
   4. GithubRepoHooks
5. DataDog
   1. DataDogMonitors
   2. DataDogSynthetics