# Interview
### SESSIONS
<br>
<br>
| id | when | doctor_id | patient_id |
|----|---------------------|-----------|------------|
| 1 | 2023-03-01 20:00:00 | 11 | 111 |
| 2 | 2023-04-01 15:00:00 | 11 | 112 |
| 3 | 2023-04-01 15:00:00 | 12 | 221 |
| 4 | 2023-08-01 15:00:00 | 12 | 221 |
<br>
<br>
If we assume that a week is complete if a patient has a session at least once during that week we can imagine a logical table like the following:
| patient_id | 28/11 | 5/12 | 12/12 | 19/12 | 26/12 |
|------------|-------|------|-------|-------|-------|
| 111 | X | O | X | X | X |
| 112 | O | X | X | O | O |
| 113 | X | X | X | X | X |
| 114 | O | O | O | X | O |
<br>
<br>
### Problem #0:
How can you count all the sessions that took place in the last month?
<br>
```sql=
SELECT count(*)
FROM sessions
WHERE "when" < DATE_ADD(day, current_date(), -30)
```
### Problem #1:
What is the % of completeness for every pair doctor/patient in the last month?
| id | when | doctor_id | patient_id |
|----|---------------------|-----------|------------|
| 1 | 2023-03-01 20:00:00 | 11 | 111 |
| 2 | 2023-04-01 15:00:00 | 11 | 112 |
| 3 | 2023-04-01 15:00:00 | 12 | 221 |
| 4 | 2023-08-01 15:00:00 | 12 | 221 |
<br>
**OUTPUT**:
| doctor_id | patient_id | completeness |
|-----------|------------|--------------|
| 11 | 111 | 80% |
| 11 | 112 | 50% |
| 12 | 221 | 30% |
<br>
<br>
```sql=
WITH calendar AS (
select
date_series(date) AS date
from date_range(trunc_date(week, DATE_ADD(day, current_date(), -30)), trunc_date(week, current_date()), 7) AS date_series
),
patients AS (select distinct patient_id as patient_id from sessions)
,
calendar_x_patients AS (
select
calendar.date,
patient.id
from calendar
cross join patients
),
etl AS (
SELECT
calendar.date,
patient_id,
trunc_date(week, "when") is not null as visited
FROM calendar_x_patients
LEFT JOIN sessions
ON calendare_x_patients.date = trunc_date(week, sessions."when")
)
SELECT
patient_id
count_if(visited = TRUE)/count(*)
FROM etl
GROUP BY patient_id
```
<br>
<br>
### Problem #2:
What if we drop the assumption that "no patient starts in the middle of the period" -> a patient can start in the middle of the period?
<br>
<br>
| patient_id | 28/11 | 5/12 | 12/12 | 19/12 | 26/12 |
|------------|-------|------|-------|-------|-------|
| 111 | X | O | X | X | X |
| 112 | - | X | X | O | O |
| 113 | X | X | X | X | X |
| 114 | O | O | X | X | O |
<br>
<br>
```sql=
WITH calendar AS (
select
date_series(date) AS date
from date_range(trunc_date(week, DATE_ADD(day, current_date(), -30)), trunc_date(week, current_date()), 7) AS date_series
),
patients AS (
select
patient_id,
min("when") AS first_visit_date
from sessions
group by patient_id
)
,
calendar_x_patients AS (
select
calendar.date,
patient.id
from calendar
cross join patients
where calendare.date <= trunc_date(week, patient.first_visit_date)
),
etl AS (
SELECT
calendar.date,
patient_id,
trunc_date(week, "when") is not null as visited
FROM calendar_x_patients
LEFT JOIN sessions
ON calendare_x_patients.date = trunc_date(week, sessions."when")
)
SELECT
patient_id
count_if(visited = TRUE)/count(*)
FROM etl
GROUP BY patient_id
```
<br>
<br>
## Assumptions
<br>
<br>
- Every day there is at least one session (not necessarily for a single doctor).
- A week is complete for a certain patient if there is at least one session for that patient for that week
- We assume no therapy is stopped during this time
- We assume that we can drop from the calculation the patients that did not have any sessions during this time
- All doctors started working well before the period considered
- No doctor stopped working during this time
- All therapies are ongoing (i.e. no patient starts in the middle of the period)