# DHS StreetSmart Data Documentation
This is documentation and contextual information for frequently used data tables at the Department of Homeless Services. It is compiled by the Street Homesless Solutions Lab.
NOTE: most of the code snippets here are written for python, but hopefully they are simple enough to translate into SQL if need be.
# Servers
There are two main servers from which we look at data.
- `VPROD01DW`
- `BKODS01SQL16.DHS.NYCNET`
These two servers are distinguished between front and back end purposes and have slightly different versions of the same tables because of this.
## `VPROD01DWVPROD01DW`
Primarily used for back-end (i.e. "raw" data) calculations and is the server of record. However, this also means that many of the column values are encoded as category numbers rather than more human readable values. As an example, this may mean that genders are are labeled as `1`,`2`,`3`,`4`, etc rather than `Male`, `Female`, `Unknown`,`Transgender`, etc. Futher, many of the tables are joins from various individual tables, which allows for access to more information per table.
The main database used in this server is `StreetSmart_Datamart`.
Importantly,`VPROD02SQL` has full SSNs (`BKODS01SQL16` only has last four digits)
### How to access
From python, you can access tables in the Datamart using the following code snippet:
```python
import pyodc
server1 = 'VPROD02SQL.DHS.NYCNET'
database1 = 'StreetSmart_Datamart'
username = 'wenfeixu'
port='1433'
conn1 = pyodbc.connect('DRIVER={SQL Server};SERVER='+server1+\
';DATABASE='+database1+\
';UID='+username+\
';PORT='+port+';Trusted_Connection=yes')
```
## `BKODS01SQL16`
Primarily used for front-end data display purposes. This is the database that links to the Streetsmart dashboard. Because this is used for the front-end display, the column values are human readable values. However, note that this table is not always updated synchronously with the back-end database.
The main databases used in this server are:
- `StreetSmart_prod`
- `Cares_ODS`
Importantly, `BKODS01SQL16` only has last four digits of SSN while `VPROD02SQL` has full SSNs.
### How to access
From python, you can access tables in the `StreetSmart_prod` using the following code snippet:
```python
import pyodc
server2 = 'BKODS01SQL16.DHS.NYCNET'
database2 = 'StreetSmart_prod'
username = 'wenfeixu'
port='1433'
conn2 = pyodbc.connect('DRIVER={SQL Server};SERVER='+server2+\
';DATABASE='+database2+\
';UID='+username+\
';PORT='+port+';Trusted_Connection=yes')
```
From python, you can access tables in `CARES_ODS` using the following code snippet:
```python
import pyodc
server2 = 'BKODS01SQL16.DHS.NYCNET'
database3 = 'Cares_ODS'
username = 'wenfeixu'
port='1433'
conn3 = pyodbc.connect('DRIVER={SQL Server};SERVER='+server2+\
';DATABASE='+database3+\
';UID='+username+\
';PORT='+port+';Trusted_Connection=yes')
```
# Tables
The tables listed here are not exhaustive but a catalogue of the most commonly used tables for analysis purposes.
---
#### Note about StreetSmart Dashboard Tables
- These tables are updated by the 15th of the month **after** the reporting month. For ex, this means that data for May 2020 will only be reported as of June 15h 2020.
- As such, all statuses about clients (i.e. placements, case status) should probably be considered as of the 15th of the next month. (I've noticed this most obviously in the `CASE_STATUS` column, which contain a large number of "Closed" cases. In theory, all cases should be "Open" for that Month if a client was on caseload.)
- The `CreatedDate`, which seems to be when `CASE_STATUS` is updated. Thus, you'll find some discrepancy between the client's status that month, and what their `CASE_STATUS` shows. Somtimes, the `CreatedDate` is more than a month after the date indicated on `Month`
- The `CASE_NUMBER` sometimes refers to CARES case number (when there is no "C" in front of the number) and sometimes refers to the StreetSmart case number (when there is a "C").
---
All the case information referred to by tables used in the StreetsSmart dashboard (for instance `CASE_START_DATE`, `CASE_END_DATE`,`CASE_STATUS`) refer to the client's latest case by start date. Also NOTE: All `SS` tables are referring to the `CASE_NUMBER` mean either the CARES case number when it contains only digits and the StreetSmart case number when it has a `C` in the beginning.
## `SSMonthlyCensus`
This table is created by DHS ITS and is used in the Streetsmart dashboard. It reflects the total client caseload for a particular `Month`. As such, the records should indicate the status of a client for that time period, e.g. if a client's case was open then for April 2019, their `CASE_STATUS` should say "Open" even if it were closed later on.
This table is useful because it gives a by-month historical snapshot of caseload. It should reflect the client's current case at the time.
### Timeframe
June 1, 2016 to present.
### Columns
Below are all the columns available in the table and their intended data type. Note: these will likely all be read as `obj` when read into a pandas dataframe. I have provided some code for the essential datatype adjustments.
```
SSMonthlyCensusId int64
Month datetime64[ns]
CASE_NUMBER str
ENROLLMENT_START_DATE datetime64[ns]
ENROLLMENT_END_DATE datetime64[ns]
CASE_START_DATE datetime64[ns]
CASE_END_DATE datetime64[ns]
CASE_STATUS str
CARES_ID float64
CLIENT_NAME str
GENDER str
DATEOFBIRTH datetime64[ns]
VETERAN_INDICATOR str
PLACED_ON_STREET str
FACILITY_NAME str
PROVIDER_GROUP str
SSProviderName str
PLACEMENTDATE_Latest datetime64[ns]
PLACEMENT_RECORD_TYPE_Latest str
PLACEMENT_LEVEL_Latest str
PLACEMENT_CLASS_Latest str
SSPlacementDescription str
SSPlacementLevel str
SSPlacementType str
CreatedBy int64
CreatedDate datetime64[ns]
Streetsmartid str
CaseEndReason str
```
### How to access
```python
query = `select * from SSMonthlyCensus`
ss_census = pd.read_sql(query,conn2)
ss_census['Month'] = pd.to_datetime(ss_census['Month'])
ss_census['CASE_START_DATE'] = pd.to_datetime(ss_census['CASE_START_DATE'])
ss_census['CASE_END_DATE'] = pd.to_datetime(ss_census['CASE_END_DATE'])
```
### Useful columns
- `Month`: To filter for the desired historical snapshot
- `Streetsmartid`: To link back to other client information
- `CASE_STATUS`: To estimate open caseload (TO DO: Figure out which cases are being used here)
- `SSPlacementType`: Commonly used in order to distinguish between transitional, permanent placement, and other settings. Note that this field will not be populated for a client if they are currently on street.
- `CaseEndReason`: If a client's case has been ended, then provider needs to give an explanation for why this is the case. Refer to Streetsmart for a full list of reasons.
- `PLACEMENTDATE_Latest`: The client's latest known placement.
- `SSProviderName`
## `SSMonthlyPlacement`
This table is created by DHS ITS and is used in the Streetsmart dashboard. It reflects client placements into transitional housing, permanent placement, or other placements (e.g. hospital) for a particular `Month`.
This table is useful because it gives a by-month historical snapshot of placements for that month. TO DO/NOTE: We are not sure which dates are being referred to by the `CASE_START_DATE` and `CASE_END_DATE`. Theoretically, it should reflect the client's current case at the time.
### Timeframe
June 1, 2016 to present.
### Columns
Below are all the columns available in the table and their intended data type. Note: these will likely all be read as `obj` when read into a pandas dataframe.
```
SSMonthlyPlacementId int64
Month datetime64[ns]
CASE_NUMBER str
CaseType str
CASE_START_DATE str
CASE_STATUS str
CARES_ID float64
CLIENT_NAME str
PLACEMENT_RECORD_TYPE str
PLACEMENTLEVEL str
PLACEMENT_CLASSIFICATION str
PLACEMENTDATE datetime64[ns]
SSPlacementDescription str
SSPlacementLevel str
SSPlacementType str
CreatedBy int64
CreatedDate datetime64[ns]
Streetsmartid str
ProviderName str
PlacementEndDate datetime64[ns]
LengthOfStay float64
```
### How to access
```python
query = `select * from SSMontlyPlacement`
ss_placement = pd.read_sql(query,conn2)
ss_placement['Month'] = pd.to_datetime(ss_placement['Month'])
ss_placement['PLACEMENTDATE'] = pd.to_datetime(ss_placement['PLACEMENTDATE'])
```
### Useful columns
- `Month`: To filter for the desired historical snapshot
- `Streetsmartid`: To link back to other client information
- `PLACEMENTDATE`: The actual date associated with the placement. This should be in the same month as `Month`
- `CASE_STATUS`: To estimate open caseload (TO DO: Figure out which cases are being used here) UPDATE: This is the latest case for the client
- `SSPlacementType`: Commonly used in order to distinguish between transitional, permanent placement, and other settings. Note that this field will not be populated for a client if they are currently on street.
- `PLACEMENT_CLASSIFICATION`: Slightly more detailed than `SSPlacementType` in that disguishes `Stabilization bed` in `Trasitional Housing`
- `ProviderName`
## `SSMonthlyClientsPlaced`
This table is created by DHS ITS and is used in the Streetsmart dashboard. It reflects unique clients placed into transitional housing, permanent placement, or other placements (e.g. hospital) for a particular `Month`. It is slightly different than `SSMonthlyPlacements` in that each row represents a client, and not a placement.
This table is useful because it gives a by-month historical snapshot of clients placed in a particula month.
### Timeframe
June 1, 2016 to present.
### Columns
Below are all the columns available in the table and their intended data type. Note: these will likely all be read as `obj` when read into a pandas dataframe.
```
SSMonthlyPlacementId int64
Month datetime64[ns]
CASE_NUMBER str
CaseType str
CASE_START_DATE str
CASE_STATUS str
CARES_ID float64
CLIENT_NAME str
PLACEMENT_RECORD_TYPE str
PLACEMENTLEVEL str
PLACEMENT_CLASSIFICATION str
SSPlacementDescription str
SSPlacementLevel str
SSPlacementType str
CreatedBy int64
CreatedDate datetime64[ns]
Streetsmartid str
ProviderName str
PlacementEndDate datetime64[ns]
LengthOfStay float64
```
### How to access
```python
query = `select * from SSMontlyClientsPlacement`
ss_clientsplaced = pd.read_sql(query,conn2)
ss_clientsplaced['Month'] = pd.to_datetime(ss_clientsplaced['Month'])
```
### Useful columns
- `Month`: To filter for the desired historical snapshot
- `Streetsmartid`: To link back to other client information
- `CASE_STATUS`: To estimate open caseload (TO DO: Figure out which cases are being used here)
- `SSPlacementType`: Commonly used in order to distinguish between transitional, permanent placement, and other settings. Note that this field will not be populated for a client if they are currently on street.
- `PLACEMENT_CLASSIFICATION`: Slightly more detailed than `SSPlacementType` in that disguishes `Stabilization bed` in `Trasitional Housing`
- `ProviderName`
## `Cases`
This table is the table of record (TO DO: True??) for all open and closed cases. Each client in this table has one or more cases in the table and, in theory, the case dates should not overlap. Clients often have multiple cases because clients may return to caseload (e.g. if they leave permanent placements, if they are rediscovered or return to the city). Some clients may have cases that are provisionally closed if they cannot be located after XX number of days (TODO: check this).
When we use the phrase "clients on caseload", we typically mean clients who still have an open case based on this data.
### Timeframe
- Case start dates range from June 21, 1967 to present.
- Earliest created date is Jun 14, 2016.
- NOTE: Case information is less reliable before 2017. (TO DO: Is this true for cases?)
### Columns
Below are all the columns available in the table and their intended data type. Note: these will likely all be read as `obj` when read into a pandas dataframe.
```
CaseId int64
CaresCaseId float64
ClientId int64
CaseNumber str
CaresCaseNumber float64
CaseType str
CaseStartDate datetime64[ns]
CaseEndDate datetime64[ns]
HHSize int64
CaseStatus str
ServiceType str
CaseRepresentative str
CreatedBy str
CreatedDate datetime64[ns]
ModifiedBy str
ModifiedDate datetime64[ns]
ProviderName object
IsActive bool
RegistrationDate datetime64[ns]
CaseRepresentativeNetworkId str
ClientEnrollmentId float64
ProvisionallyCaseClosedDate datetime64[ns]
CaseEndReason str
Borough str
```
### How to access
```python
query = `select * from [Case] where isactive=1`
cases = pd.read_sql(query,conn1)
```
### Useful columns
- `CaseStartDate`: This tells us when a client's caseload started. This is columns is sometimes updated after the fact.
- `CaseEndDate`: If this client is still on caseload, the `CaseEndDate` will be `NaT`, i.e. not available since the case is still open. Along with `CaseStartDate`, this can be used to calculate case duration.
- `CaseStatus`: To see whether the case status to open, closed, or provisionally closed.
- `CaseEndReason`: This can be instructive in helping us understand permanent placement rentention. If a client's been put into permanent placement and their case is closed, this column can be used to understand whether a client has successfully been placed, whether they left and cannot be found, or potentially found housing in another city, or with family.
- `ProviderName`
## `Placements`
This table is the table of record (TO DO: True??) for all open and closed placements, though the Streetsmart front-end will display a narrower range of dates. All DHS client placements in either transitional, permanent, or other placement type will be recorded here. Note, this table is often joined with `Case` since it does not have any direct link to client information, only the `CaseId` (yes, this is as annoying as it sounds.)
When we use the phrase "clients on caseload", we typically mean clients who still have an open case based on this data.
### Timeframe
- Placement dates range from Jan 22, 2007 to present.
- Earliest created date is Jun 17, 2016.
- NOTE: CInformation is less reliable before 2017. (TO DO: Is this true for cases?)
### Columns
Below are all the columns available in the table and their intended data type. Note: these will likely all be read as `obj` when read into a pandas dataframe.
```
PlacementId int64
CaresPlacementId float64
CaseId int64
FacilityName str
PlacementDate datetime64[ns]
PlacementExitDate datetime64[ns]
ExitReason str
FacilityAddress str
Address str
PlacementType str
PlacementLevel str
PlacementDescription str
DetoxCenter str
DetoxCenterProgram str
Hospital str
IsActive bool
CreatedBy str
CreatedDate datetime64[ns]
ModifiedBy str
ModifiedDate datetime64[ns]
FacilityId float64
FacilityAddressId float64
TravelAssistanceProgramId str
ClientStatusType str
IsPermanentAddressKnown str
PlacementTypeId int64
PlacementTypeCode str
```
### How to access
```python
query = `select * from [Placement] where isactive=1`
placements = pd.read_sql(query,conn1)
```
### Useful columns
- `PlacementDate`
- `PlacementExitDate`
- `CaseId`
- `PlacementType`
- `ProviderName`
- `PlacementId`: Note, this is NOT the same ID as the `SSMontlyPlaceId` and should be joined base on that.
### Useful joins
Use this join placements information with cases.
```python
placements_new = pd.merge(placements,
pd.merge(cases[['CaseId','CaseNumber','ClientId']],
clients[['ClientId','StreetsmartId']],
on='ClientId'),
on='CaseId')
```
## `ClientEngagements`
This table is the table of record (TO DO: True??) for all prospective and client engagements. Each time a provider has an service-related interaction, the interaction will be recorded in the table (TO TOD: Is this true?).
Note that not all engagements are with clients on caseload. If someone is not a client, we put them on caseload after **three engagements** within **thirty days** of each other.
### Timeframe
- Engagement dates range from Aug 02, 1951 (which seems like it could be wrong?) to present.
- Earliest created date is Mar 14, 2016.
- NOTE: Information is less reliable before 2017. (TO DO: Is this true for cases?)
### Columns
Below are all the columns available in the table.
```
ClientEngagementID', 'ClientId', 'ProviderName', 'Area', 'ShiftName', 'TeamType', 'TeamName', 'DispatcherName', 'JointOutreachName', 'JointOutreachParticipants', 'Weather', 'Temperature', 'EngagementRecordNumber', 'ServiceRequestNumber', 'ContactDateTime', 'ContactReason', 'VisitReason', 'ApproximateHeightFt', 'ApproximateHeightInch', 'GlassesStatus', 'TattoosStatus', 'HairColor', 'EyeColor', 'HairType', 'IsDistinguishingGarment', 'PiercingStatus', 'ApproxWeight', 'ServiceNeeds', 'OutreachFollowUp', 'OutreachFollowUpDate', 'EngagementOutcome', 'Borough', 'LocationType', 'AddressType', 'LocationName', 'AdditionalDetails', 'IsCurrentSleepLocation', 'EngagementNotes', 'CreatedBy', 'CreatedDate', 'ModifiedBy', 'ModifiedDate', 'PlacementId', 'IsActive', 'Latitude', 'Longitude', 'Caseid', 'PlacementDescription', 'FacilityId', 'Sector', 'SubwayDiversionNo', 'SubwayDiversionTabLogNo', 'EngagementOccured'
```
### How to access
```python
query = '''
select
ClientId,ContactDateTime,ProviderName,Borough,CreatedBy,
CreatedDate,ModifiedBy, ModifiedDate,PlacementId, IsActive,
PlacementDescription,FacilityId
from ClientEngagement
where isactive = 1
'''
engagements = pd.read_sql(query, conn1)
```
### Useful columns
These columns seem more useful for day to day analysis purpose:
```
ClientId int64
ContactDateTime datetime64[ns]
ProviderName str
Borough str
CreatedBy str
CreatedDate datetime64[ns]
ModifiedBy str
ModifiedDate datetime64[ns]
PlacementId float64
IsActive bool
PlacementDescription str
FacilityId float64
```
### Useful joins
Use this get the `StreetsmartId` for each client.
```python
engagements_new = pd.merge(engagements,
clients[['ClientId','StreetsmartId']],
on='ClientId'),
on='CaseId',
how='left')
```
## `Clients`
This table is the table of record (TO DO: True??) for all prospective, current, and previous clients. It has client relevant personal information, demographics, and service status (like SSN, veteran status, criminal history), and various kinds of verifications (whether they have their birth certificate, code blue vulnerabiilities, DoB verification).
Note: Here we typically use `VPROD02SQL` because the column entries have not be encoded. However, the `SSN` column here has been partially hidden. In order to get full social-security numbers, we have to use the `BKODS01SQL16` server.
### Timeframe
- Earliest created date is Nov 04, 1983.
- NOTE: Case information is less reliable before 2017. (TO DO: Is this true for cases?)
### Columns
Below are all the columns available in the table.
```
'ClientId', 'CaresPersonConcernRoleId', 'CaresId', 'AlternateCaresId', 'FirstName', 'MiddleName', 'LastName', 'Alias', 'SSN', 'DateOfBirth', 'Race', 'Gender', 'MaritalStatus', 'EmploymentStatus', 'VeteranIndicator', 'VeteranStatus', 'VeteranDischargeStatus', 'ImmigrationStatus', 'Ethnicity', 'ClientCategory', 'PreviousLengthOfHomelessnessDays', 'SSNCard', 'PhotoID', 'BirthCertificate', 'IsDeceased', 'DeceasedDate', 'CreatedBy', 'CreatedDate', 'ModifiedBy', 'ModifiedDate', 'ApproximateAge', 'StreetsmartId', 'SexOffender', 'SexOffenderDescription', 'ClientHasPet', 'MemberOfAnAdultFamily', 'IsFirstNameVerified', 'IsLastNameVerified', 'IsDOBVerified', 'IsSSNVerified', 'IsActive', 'VulnerabilityStatus', 'VulnerabilityReason', 'IsStartDateOnStreetVerified', 'IsConsentObtained', 'CodeBlueCodeRedVulnerability', 'VulnerableClientInvoluntaryRemoval', 'PendingStatusStartDate', 'IsAliasVerified', 'IsAliasChangeMailSent', 'NameSuffix', 'ServiceEra', 'Area', 'TotalLengthOfHomelessNessDays', 'CriminalHistoryStatus', 'CriminalHistoryFromDate', 'CriminalHistoryToDate', 'State', 'County', 'City', 'CaresServiceEraCode', 'IsJCCClient', 'JCCClientStartDate', 'JCCClientEndDate'
```
### How to access
To access the table with more detailed information but **no** SSN:
```python
query = '''
select * from [Client]
where isactive = 1
''' ## Because "Case" is a condtional in SQL, put brackets around to indicate that this is a table name
clients = pd.read_sql(query, conn1)
```
To access the table with less detailed information but **with** SSN:
```python
query = '''
select * from [Client]
where isactive = 1
''' ## Because "Case" is a condtional in SQL, put brackets around to indicate that this is a table name
clients = pd.read_sql(query, conn2)
```
# CARES
CARES is an older sytem that tracks shelter activities. I am less familiar with the data and table details from this system, but have detailed some basic information here that may be useful.
There are two main databases associated with CARES data:
- `Cares_ODS` : This is an older system in which practically no tables have been joined to be more user-friendly.
- `EDW_RPT`: This is a newer database that has more human-readable columns as well as more pre-joined talbes.
Here, I highlight the main tables that are used in the `EDW_RPT`. At the end of an Intranet page called [Enterprise Data Warehouse](http://sundat4.ux.hra.nycnet:7778/portal/pls/portal/APPPRD.DOCUMENTPAGE.show?p_arg_names=an_page_id&p_arg_values=1&p_arg_names=_title&p_arg_values=Library%20%20%20%EE%9C%92%20WX%20%EE%BE%80) in the "DHS" section is a more thorough set of metadata tables. This can be especially useful if there are certain columns whose purpose can be unclear.
## `Case_Fct`
This the table of client cases.
**Warning**
There are about 6.5 million records. It's advised filter the initial SQL query using datebounds or in some other way to reduce the number of records analyzed in the table.
### Timeframe
- Earliest record creation date is June 20, 2011.
- The earliest record date is June 21, 1967.
### Columns
Below are some useful columns available in the table. Note that some of the records are from Streetsmart and joined to CARES. This because there are clients who participate in both outreach and shelter programs.
```
Case_ID str
Case_Number int64
HOC_CARES_ID int64
Case_Outcome str
CaresCaseNumber int64
Case_Status str
Case_Type str
Service_type str
Start_Date datetime64[ns]
Recent_Record_FLG str
StreetSmart_Case_Number str
Source_DB str
CASE_End_Reason str
```
### How to access
To access the table:
```python
query = '''
select * from [Case_Fct]
where isactive = 1
cares_case_fct = pd.read_sql(query, conn1)
```
# Useful functions
```python=
def open_per(df,case_startdate_col='PlacementStartDate', case_enddate_col='PlacementEndDate',start_date=startdate,end_date=enddate):
'''
This function returns whether a case is still open based on start end dates of the case and lookup period
'''
df['open_case'] = False
df.loc[(df[case_startdate_col]<=end_date)& ((df[case_enddate_col].isna()) | ((df[case_enddate_col]>=start_date))),'open_case']=True
return df['open_case']
def open_cur(df,case_startdate_col='PlacementStartDate', case_enddate_col='PlacementEndDate',start_date=startdate,end_date=enddate):
'''
This function returns whether a case is still open based on start end dates of the case and lookup period
'''
df['open_case'] = False
df.loc[(df[case_startdate_col]<=end_date)& ((df[case_enddate_col].isna()) | ((df[case_enddate_col]>=end_date))),'open_case']=True
return df['open_case']
def placement_timeframe(df,clientid,service_start_date='PlacementStartDate',start_date=startdate,end_date=enddate):
'''
Return placements within the timeframe
'''
return (df[service_start_date] >=start_date) & (df[service_start_date] <=end_date) & (df[service_start_date].isna()==False)
def latest_placement(df,clientid,service_start_date='PlacementStartDate',ascending=True):
'''
This function just does the sort + groupby to get each client's latest placement
'''
if ascending ==False:
df_new = df.sort_values([clientid,service_start_date],ascending=False)\
.groupby(clientid).head(1)
else:
df_new = df.sort_values([clientid,service_start_date],ascending=True)\
.groupby(clientid).tail(1)
return df_new
```
# Common Joins
The universal link between all client-level data should be `StreetsmartId` or `Streetsmartid` and the link between all placement-level data should be `CaseId`, as all placements should have a case attached to them.
## Case and Placements
Join to get case information for client placements.
```python
case_placements = pd.merge(cases,
placements,
on='CaseId')
```
## SS Census and Client data
Join to get more information for clients in the SS Monthyl Census.
```python
pd.merge(ss_census,
clients,
left_on='Streetsmartid',
right_on='StreetsmartId')
```
## Placements
The placements table does not include the `StreetsmartId`. Use this join to get `StreetsmartId`
```python
placements = pd.merge(placements,
pd.merge(cases[['CaseId','ClientId']],
clients[['ClientId','StreetsmartId']],
on='ClientId'),
on='CaseId')
```
## Clients and Cases
The cases table does not include the `StreetsmartId`. Use this join to get `StreetsmartId`
```python
cases = pd.merge(cases,
clients[['StreetSmartid','ClientId']],
on='ClientId')
```
# Data Discrepancies to QA
This is a space to keep track of some known data discrepancies, probably primarily for Streetsmart.
- `C111424` = Latest placement is transitional (06/13/2017), but change log shows permanent (10/20/2017). Client cannot be located. This might be SS front-end, `ss_case_placement` back-end discrepancy issue.
# OPS_RPT
The `OPS_RPT` database contains line-level client information for all SHS clients and accompanying reference information. It contains the following tables:
- `Master_Client_List_WEEKLY`:A table updated **weekly** on **Monday at 6pm** consisting of a snapshot all SHS clients across the Streetsmart and CARES systems. This includes clients currently from the following services:
- Safe Haven
- Stabilization Beds
- Drop-in Centers
- Shelter (with an open outreach case)
- On Street (with an open outreach case)
- Placed into Permanent Placement during this period
- In Other Transitional settings
- In Other settings
- `Master_Client_List_MONTHLY`: A table updated **monthly on the 16th** consisting of a snapshot of all SHS clients for the previous month (for ex, if it's Jul 16th now, the most recent monthly table will be for clients in the month of June). This table has the same columns as the weekly, except that the `Report_Run_Datetime` column is now called the `SS_Month`.
- `Stabilization_Facilities`: This a table of current and past stabilization facilities. Note: Updating this list will automatically change which facilities are included in the final count in future runs of the MCL (TO DO). There's probably no need to remove old facililties as they will likely just not be counted. I believe facilities that used to be stabilization beds and are currently a different type will be coded differently in CARES/Streesmart.
- `SS_and_Cares_Clients`: This is a table that is updated weekly (at the same time as `Master_Client_List_WEEKLY`). It allows us to search by Streetsmart ID, Cares ID, SSN, and name, for clients that might have multiple IDs (SS or Cares).
## Master Client List
The MCL is meant, amongst other tasks, to produce the weekly Mayor's Dashboard of placements and locations of current outreach clients.
The tables are stored on the server `BKODS01SQL16.DHS.NYCNET` in the database `OPS_RPT`.
### Versions
There are two versions of the MCL in the `OPS_RPT` database on the `BKODS01SQL16.DHS.NYCNET` server:
- `Master_Client_List_WEEKLY`: Updated weekly Mondays to reflect numbers from the Monday to Sunday midnight of the previous week. **Importantly**, for the Streetsmart counts this table pulls data from `Case` and `Placement` tables in `StreetSmart_Datamart` (i.e. not the monthly Streetsmart frontend counts).
- `Master_Client_List_MONTHLY`: Updated monthly after the 15th of the following month, i.e. on the same schedule as the StreetSmart monthly census. This table is based off the monthly census numbers and tables (`SSMonthlyCensus`, `SSMonthlyPlacement`). It is a reflection of monthly numbers and should match the StreetSmart front end numbers in terms of placements and current locations.
Two versions of the MCL provide a weekly and monthly snapshot of **both** the **current clients** and **anyone who was an active client during the period**. The
- For columns that end with `In_Timeframe`, means that this client had this status or service at some point during the time period. For ex: `CARES_SH_In_Timeframe=True` means that the client had a safe haven service.
- - For columns that end with `End_Timeframe`, means that this client was *currently* in this service or status at the end of the time period.
### Data Dictionary
Each MCL table is a reflection of client numbers over the course of the time period and snapshot of clients on the end date of the time period.
A formatting note:
- Most dates are in a `str` format because SQL does not allow more than one column to have a datetime format (though there are several date columns in the data.)
- Booleans have also been converted to strings.
| Column | Description| Format|
| ------ | ---------|---------|
|`StreetsmartId`| StreetSmartID from the StreetSmart system|`str`|
|`CaresId`| CARES ID from both a combination of the CARES and StreetSmart system (if clients' CaresId is listed in StreetSmart, that is the one we use).|`int`|
|`ClientId`| Unique ID created by us for each client. If a client has an existing `ClientId` associated with their Streetsmart ID, that is used. Otherwise, this is newly generated |`int`|
|`FacilityName_current`| If a client is currently in a placement, this is the name of the Facility they are in.|`str`|
|`CaseStartDate_current`|This is the start date of the client's most recent active SS case. This case might still be open if the client is a current SHS client.|`str`|
|`CaseEndDate_current`|This is the end date of the client's most recent active SS case. This case might still be open if the client is a current SHS client, in which case there will be no end date.|`str`|
|`ProviderName_current`|If a client is currently in a placement, this is the name of the Provider responsible for this placement.|`str`|
|`PlacementType_current`|If a client is currently in a placement, this is the high-level SS category (`Permanent Placement`,`Transitional Housing`, or `Other`) of the client.|`str`|
|`PlacementDescription_current`|If a client is currently in a placement, this is the more detailed SS category of the client. This level has the Safe Haven, Stabilization Bed, Shelter, Other, etc distinctions. Note: if you don't filter out Permanent Placement clients in the SQL, the various types of permanent housing will appear here. |`str`|
|`PlacementType_current_SS`|If a client is currently in a placement **according to Streetsmart placement data**, this is the high-level SS category.|`str`|
|`PlacementDescription_current_SS`|If a client is currently in a placement **according to Streetsmart placement data**, this is the more detailed SS category.|`str`|
|`PlacementType_current_CARES`|If a client is currently in a placement **according to CARES placement data**, this is the high-level SS category.|`str`|
|`PlacementDescription_current_CARES`|If a client is currently in a placement **according to CARES placement data**, this is the more detailed SS category.|`str`|
|`Primary_System`|This is the system from which most of the Client's information is populated. Note: The client's location and placement information could be from a different system.|`str`|
|`SS_Client_In_Timeframe`| Client had an open Outreach case at some point in the timeframe (True/False) |`str`|
|`SS_Client_End_Timeframe`| Client had an open Outreach case at the end of the timeframe (True/false) |`str`|
|`SS_Stab_In_Timeframe`| Individual had an active Stabilization Bed service at some point in the timeframe, according to Streetsmart data. (True/false) |`str`|
|`SS_Stab_End_Timeframe`|Individual had an active Stabilization Bed service at the end of the timeframe, according to Streetsmart data. (True/false) |`str`|
|`CARES_Shelter_In_Timeframe`| Individual had an active Shelter service at some point in the timeframe, according to CARES data (True/false) |`str`|
|`CARES_Shelter_End_Timeframe`| Individual had an active Shelter service at the end of the timeframe, according to CARES data (True/false) |`str`|
|`CARES_SH_In_Timeframe`| Individual had an active Safe Haven service at some point in the timeframe, according to CARES data (True/false) |`str`|
|`CARES_SH_End_Timeframe`|Individual had an active Safe Haven service at the end of the timeframe, according to CARES data (True/false) |`str`|
|`CARES_Stab_In_Timeframe`| Individual had an active Stabilization Bed service at some point in the timeframe, according to CARES data (True/false) |`str`|
|`CARES_Stab_End_Timeframe`|Individual had an active Stabilization Bed service at the end of the timeframe, according to CARES data (True/false) |`str`|
|`CARES_DIC_In_Timeframe`| Individual had an active Drop-In service at some point in the timeframe, according to CARES data (True/false) |`str`|
|`CARES_DIC_End_Timeframe`|Individual had an active Drop-In Bed service at the end of the timeframe, according to CARES data (True/false) |`str`|
|`Placed_In_Timeframe_CARES`| Individual had a placement in CARES during the timeframe, according to CARES data (True/false) |`str`|
|`Placed_In_Timeframe_SS`|Flag to indicate if the client had a placement recorded in SS during the timeframe, according to Streetsmart data (True/false) |`str`|
|`Latest_Placement_Type_In_Timeframe_SS`| Latest placement type for a placement during the timeframe, according to Streetsmart data (True/false) |`str`|
|`Latest_Placement_SubType_In_Timeframe_SS`| Latest placement "subtype", which is a **placement description** in StreetSmart system, for a placement during the timeframe, according to Streetsmart data (True/false) |`str`|
|`Latest_Placement_StartDate_In_Timeframe_SS`| Latest placement start date for a placement during the timeframe if there is a placement, according to Streetsmart data |`str`|
|`Latest_Placement_EndDate_In_Timeframe_SS`| Latest placement end date for a placement during the timeframe, if there is a placement and a placement end date, according to Streetsmart data |`str`|
|`Latest_Placement_Type_In_Timeframe_CARES`| Latest placement type for a placement during the timeframe, according to Cares data |`str`|
|`Latest_Placement_SubType_In_Timeframe_CARES`| Latest placement "subtype", which is a **placement description** in CARES system, for a placement during the timeframe, according to Cares data |`str`|
|`Latest_Placement_StartDate_In_Timeframe_CARES`| Latest placement start date for a placement during the timeframe, if there is a placement in the CARES system, if there is a placement|
|`Latest_Placement_EndDate_In_Timeframe_CARES`| Latest placement end date for a placement during the timeframe in the CARES system, according to Cares data |`str`|
|`Report_Run_DateTime`| [ONLY IN `Master_Client_List_WEEKLY`] Datetime of when the report was run|`datetime`|
|`SS_Month`| [ONLY IN `Master_Client_List_MONTHLY`] Datetime of the report month|`datetime`|
### Example SQL queries
#### Outreach Clients
The majority of clients are exclusively in CARES who have used some kind of SHS service. For many of the reports and calculations that are just based on outreach clients, you'll need to first filte.
For clients who are still active outreach clients (i.e. this have an *Open* or *Provisionally Closed*) clients at **the end of** the query period:
```sql=
select * from Master_Client_List_WEEKLY where
SS_Client_End_Timeframe = True
```
For clients who were active clients (i.e. this have an *Open* or *Provisionally Closed*) clients at **any point during** the query period:
```sql=
select * from Master_Client_List_WEEKLY where
SS_Client_In_Timeframe = True
```
#### Mayor's Dashboard
To match numbers with Mayor's Dashboard in "Current Location" transitional settings, run this SQL:
```sql=
select * from Master_Client_List_WEEKLY where
SS_Client_End_Timeframe='True' AND
(PlacementType_current !='Permanent Placement')
```
Katrina's code for reproducing counts for Mayor's Dashboard:
```sql=
select
a.Report_Run_DateTime,
a.snapshot_look,
count (distinct a.clientid) clients
from
(select
case
when PlacementDescription_current = 'Shelter' then '1 - Shelter'
when PlacementDescription_current = 'Safe Haven' then '2 - Safe Haven'
when PlacementDescription_current = 'Stabilization Bed' then '3 - Stab Beds'
when PlacementType_current = 'Transitional Housing' then '4 - Other Transitional'
when PlacementType_Current != 'Location Unknown' then '5 - Other Settings'
else '6 - On-Street'
end as snapshot_look,
Report_Run_DateTime,
clientid
from ops_rpt.dbo.Master_Client_List_WEEKLY
where SS_Client_End_Timeframe='True'
AND (PlacementType_current !='Permanent Placement')
AND Report_Run_DateTime >= '2021-04-12'
) a
group by Report_Run_DateTime, snapshot_look
order by report_run_datetime desc, snapshot_look asc;
```
### List of planned updates
**Next update:**
- [x] ~~In order to get more client placements and case information for current clients, the following are proposed columns to add~~:
- ~~`Latest_Placement_ID_In_Timeframe_SS`~~
- ~~`Latest_Placement_ID_In_Timeframe_CARES`~~
- ~~`Latest_Case_ID_SS`~~
- [x] Because there are some (mostly recent, emergency) stabilization beds that are only Streetsmart and not CARES, the following are proposed to be able to distinguish just the stabilization bed clients in SS:
- `SS_Stab_In_TimeFrame`
- `SS_Stab_End_Timeframe`
- [x] Make sure the CaresId is an int.
- [x] Rename the `ClientId` column to be `GeneratedID`
- [x] Hard code a link to the data documentation in the database.
- [x] Move everything to `BKODS` server. **note: make sure to update the documentation to reflect this.**
**Future updates:**
- [x] Update script so it reads stabilization facilities from the `OPS_RPT` table.
## Stabilization Bed List
The `Stabilization_Facilities` table contains all the n
The tables are stored on the server `BKODS01SQL16.DHS.NYCNET` in the database `OPS_RPT`.