# NHI OMOP
## 服務IP:Port
- https://bdcode.vghtpe.gov.tw/
- [10.221.252.51/atlas](http://10.221.252.51/atlas) Atlas 分析版
- 10.221.252.51:5436 Atlas NHI資料源 (Postgres)
- 10.221.252.50:5432 Atlas ICU資料源 (Postgres)
- 10.221.252.51:5438 Atlas WebAPI (Postgres)
- 10.221.252.51:8787 Hades (RStudio)
## Rabbit報告
- [連結](https://drive.google.com/drive/folders/1RFzS1mWo9Tcu-gDhvLIg7iEeViowF5YN?usp=drive_link)
# 林鴻睿
## 參考連結
[急診OMOP轉換流程手冊](https://hackmd.io/-ykODPJiRXWuWZCIh9lP6Q)
[OMOP v5.4規範](https://ohdsi.github.io/CommonDataModel/cdm54.html)
[Athena id查詢](https://athena.ohdsi.org/search-terms/start)
[Athena API](https://hackmd.io/QBKU9ZTkStqJk6TgcY3-CA#API-Test)
[Athena API帶參數範例](https://athena.ohdsi.org/api/v1/concepts?pageSize=15&domain=Measurement&standardConcept=Standard&invalidReason=Valid&page=1&query=sodium)
[北榮大數據](https://wd.vghtpe.gov.tw/bdc/Fpage.action?muid=19745&fid=17871)
[OHDSI White Rabbit](https://github.com/OHDSI/WhiteRabbit)
[GoogleSheet](https://docs.google.com/spreadsheets/d/1CcmLsikAMBxu3yTH5Hy17NmVtoOQkRQHxZCh1_o5LPU/edit?usp=sharing)
[OMOP健保資料庫前人整理GoogleSheet](https://docs.google.com/spreadsheets/d/1Ur_qzITI2Rz8a6NJyDAFsseYK5vlp01L/edit?gid=147600560#gid=147600560)
[OHDSI官方文檔](https://ohdsi.github.io/TheBookOfOhdsi/)
[OHDSI日文社群文檔](https://rwd-data-environment-in-hospital.github.io/Documents/)
## OMOP Table SQL Queries
### measurement
- 進度: 完成
- 缺:
- 其他:
- `measurement_id`, `measurement_type_concept_id`, 目前流水號
- `measurement_concept_id`目前以`Athena API`腳本方式大量自動取得
```sql=
WITH measurement_base AS (
SELECT
CAST("PERSON"."person_id" AS INTEGER) AS "person_id",
"LAB_ITEM_MAPPING"."id" AS "measurement_concept_id",
CAST("LAB_20231003"."sdate" AS DATE) AS "measurement_date",
CASE
WHEN "LAB_20231003"."outcome" ~ '^[+-]?[0-9]*\.?[0-9]+$' THEN CAST("NEPHR"."LAB_20231003"."outcome" AS NUMERIC)
ELSE NULL
END AS "value_as_number",
"VISIT_OCCURRENCE"."visit_occurrence_id"
FROM
"LAB_20231003"
LEFT JOIN "PERSON"
ON "LAB_20231003"."uuid" = "PERSON"."person_source_value"
LEFT JOIN "LAB_ITEM_MAPPING"
ON "LAB_ITEM_MAPPING"."item" = "NEPHR"."LAB_20231003"."item"
LEFT JOIN "VISIT_OCCURRENCE"
ON "PERSON"."person_id" = "VISIT_OCCURRENCE"."person_id" AND "VISIT_OCCURRENCE"."visit_start_date" = "LAB_20231003"."sdate" AND "VISIT_OCCURRENCE"."visit_end_date" = "NEPHR"."LAB_20231003"."sdate"
)
INSERT INTO "measurement" (
"measurement_id",
"person_id",
"measurement_concept_id",
"measurement_date",
"measurement_type_concept_id",
"value_as_number",
"visit_occurrence_id"
)
SELECT
ROW_NUMBER() OVER () + 375050204 AS "measurement_id",
"person_id",
"measurement_concept_id",
"measurement_date",
1 AS "measurement_type_concept_id",
"value_as_number",
"visit_occurrence_id"
FROM measurement_base
WHERE "person_id" IS NOT NULL
AND "measurement_concept_id" IS NOT NULL
AND "measurement_date" IS NOT NULL
AND "value_as_number" IS NOT NULL
AND "visit_occurrence_id" IS NOT NULL
```
- "item"欄位補充說明
HTN(CD or DD - icd) / 共病症
BUN / 血尿素氮 (Blood Urea Nitrogen)
HDL(HDLC) / 高密度脂蛋白 (High-Density Lipoprotein)
LDL(LDLC) / 低密度脂蛋白 (Low-Density Lipoprotein)
TG / 三酸甘油酯 (Triglycerides)
WBC / 白血球 (White Blood Cells)
Hb(HGB) / 血紅素 (Hemoglobin)
CREAT / 血清肌酐 (Creatinine)
eGFR / 預測腎小管過濾率 (estimated Glomerular Filtration Rate)
NA / 鈉 (Sodium)
K / 鉀 (Potassium)
Cl / 氯 (Chloride)
Ca / 鈣 (Calcium)
IP / 無機磷 (Inorganic Phosphate)
CHOL / 總膽固醇 (Cholesterol)
UA / 尿酸 (Uric Acid)
HCO3 / 重碳酸根 (Bicarbonate)
HbA1c / 糖化血紅素 (Hemoglobin A1c)
UPCR / 尿蛋白對肌酐比值 (Urine Protein to Creatinine Ratio)
UACR / 尿白蛋白對肌酐比值 (Urine Albumin to Creatinine Ratio)
- [Broadsea Atlas](https://github.com/OHDSI/Broadsea/blob/7e61aaeff305d1ee621d278dc643e06d5793a2b6/docker-compose.yml#L95)示範`table`

### observation
- 進度: 完成
- 缺:
- 其他:
- `observatin_id`目前流水號
- `observation_type_concept_id`, 目前全部帶`44814721`(`EHR/Electronic Health Record`的`concept_id`碼)
- `observation_date`目前暫帶`ID.firstvisit`
```sql=
WITH observation_base AS (
SELECT
"PERSON"."person_id" as "person_id",
CASE
WHEN "ID"."bloodtype" = 'A' THEN 1576146
WHEN "ID"."bloodtype" = 'A -' THEN 45910926
WHEN "ID"."bloodtype" = 'A +' THEN 45930334
WHEN "ID"."bloodtype" = 'AB' THEN 1576148
WHEN "ID"."bloodtype" = 'AB -' THEN 45930337
WHEN "ID"."bloodtype" = 'AB +' THEN 45934055
WHEN "ID"."bloodtype" = 'B' THEN 1576147
WHEN "ID"."bloodtype" = 'B-' THEN 45917006
WHEN "ID"."bloodtype" = 'B +' THEN 45930336
WHEN "ID"."bloodtype" = 'O' THEN 1576149
WHEN "ID"."bloodtype" = 'O -' THEN 45917005
WHEN "ID"."bloodtype" = 'O +' THEN 45930335
ELSE NULL
END AS "value_as_concept_id",
"ID"."bloodtype" AS "value_as_string",
CASE
WHEN "ID"."firstvisit" = '' OR "ID"."firstvisit" IS NULL OR "ID"."firstvisit" = '20201920' THEN NULL
ELSE CAST("ID"."firstvisit" AS DATE)
END AS "observation_date"
FROM "ID"
LEFT JOIN "PERSON" ON "ID"."uuid" = "PERSON"."person_source_value"
)
INSERT INTO "observation" (
"observation_id",
"person_id",
"observation_concept_id",
"observation_date",
"observation_type_concept_id",
"value_as_concept_id",
"value_as_string"
)
SELECT
ROW_NUMBER() OVER () + 475050204 AS "observation_id",
"person_id",
4246053 AS "observation_concept_id",
"observation_date",
44814721 AS "observation_type_concept_id",
"value_as_concept_id",
"value_as_string"
FROM observation_base
WHERE "person_id" IS NOT NULL
AND "observation_date" IS NOT NULL
AND "value_as_concept_id" IS NOT NULL
--LIMIT 10000;
```
- [Broadsea Atlas](https://github.com/OHDSI/Broadsea/blob/7e61aaeff305d1ee621d278dc643e06d5793a2b6/docker-compose.yml#L95)示範`table`

### observatin_period (Atlas Data Source顯示圖必要table, 不能空, [參考1](https://forums.ohdsi.org/t/broadsea-atlas-why-are-most-data-source-reports-empty-when-using-a-local-data-source/18858/3), [參考2](https://forums.ohdsi.org/t/atlas-now-load-info-properly/16522/2), [參考3](https://github.com/OHDSI/Achilles/blob/main/inst/sql/sql_server/analyses/505.sql)(Atlas顯示也用此Table做日期filter))
```sql=
WITH observation_period_base AS (
SELECT
"PERSON"."person_id" as "person_id",
CAST("ID"."firstvisit" AS DATE) AS "observation_period_start_date",
CAST("ID"."lastvisit" AS DATE) AS "observation_period_end_date"
FROM "ID"
LEFT JOIN "PERSON" ON "ID"."uuid" = "PERSON"."person_source_value"
WHERE "ID"."firstvisit" != ''
AND "ID"."lastvisit" != ''
AND "ID"."lastvisit" NOT IN ('20182332', '20201920')
),
ranked_periods AS (
SELECT
"person_id",
"observation_period_start_date",
"observation_period_end_date",
ROW_NUMBER() OVER (PARTITION BY "person_id" ORDER BY "observation_period_start_date") AS row_num
FROM observation_period_base
)
INSERT INTO "observation_period" (
"observation_period_id",
"person_id",
"observation_period_start_date",
"observation_period_end_date",
"period_type_concept_id"
)
SELECT
ROW_NUMBER() OVER () + 29755 AS "observation_period_id", -- Create unique ID
"person_id",
"observation_period_start_date",
"observation_period_end_date",
44814724 AS "period_type_concept_id"
FROM ranked_periods
WHERE row_num = 1
AND "person_id" IS NOT NULL
AND "observation_period_start_date" IS NOT NULL
AND "observation_period_end_date" IS NOT NULL;
```
- [Broadsea Atlas](https://github.com/OHDSI/Broadsea/blob/7e61aaeff305d1ee621d278dc643e06d5793a2b6/docker-compose.yml#L95)示範`table`

### cost
- 進度: 完成
- 缺: `cost_event_id `, ` cost_domain_id`,
- 其他:
- `cost_type_concept_id`目前帶`5032:Covered Charge`
- `cost_event_id`,來自於`CD`帶`1`,來自於`DO`帶`2`
- `cost_domain_id`目前暫帶`visit_temp`
```sql=
WITH cost_CD_base AS (
SELECT
CAST("CD"."t_amt" AS NUMERIC) AS "total_charge",
CAST("CD"."t_appl_amt" AS NUMERIC) AS "total_cost",
CAST("CD"."part_amt" AS NUMERIC) AS "paid_patient_copay",
"CD"."case_pay_code" AS "drg_source_value",
1 AS "cost_event_id"
FROM "CD"
),
cost_DO_base AS (
SELECT
CAST("DO"."order_price" AS NUMERIC) AS "total_charge",
CAST("DO"."order_amt" AS NUMERIC) AS "total_cost",
CAST(NULL AS NUMERIC) AS "paid_patient_copay",
CAST(NULL AS VARCHAR) AS "drg_source_value",
2 AS "cost_event_id"
FROM "DO"
)
INSERT INTO "COST" (
"cost_id",
"cost_event_id",
"cost_domain_id",
"cost_type_concept_id",
"total_charge",
"total_cost",
"paid_patient_copay",
"drg_source_value"
)
SELECT
ROW_NUMBER() OVER () AS "cost_id",
"cost_event_id",
'visit_temp' AS "cost_domain_id",
5032 AS "cost_type_concept_id",
"total_charge",
"total_cost",
"paid_patient_copay",
"drg_source_value"
FROM (
SELECT * FROM cost_CD_base
UNION ALL
SELECT * FROM cost_DO_base
) AS combined_cost_data
LIMIT 10000;
```
- [Broadsea Atlas](https://github.com/OHDSI/Broadsea/blob/7e61aaeff305d1ee621d278dc643e06d5793a2b6/docker-compose.yml#L95)示範`table`

### death
```sql=
WITH death_base AS (
SELECT
"PERSON"."person_id" AS "person_id",
CAST("ID"."lastvisit" AS DATE) AS "death_date",
32815 AS "death_type_concept_id"
FROM
"ID"
JOIN
"PERSON" ON "ID"."uuid" = "PERSON"."person_source_value"
WHERE
"ID"."lastvisit" != ''
AND "ID"."lastvisit" NOT IN ('20182332', '20201920')
AND "ID"."isdead" = 'true'
),
ranked_deaths AS (
SELECT
"person_id",
"death_date",
"death_type_concept_id",
ROW_NUMBER() OVER (PARTITION BY "person_id" ORDER BY "death_date" ASC) AS row_num
FROM death_base
)
INSERT INTO "death" (
"person_id",
"death_date",
"death_type_concept_id"
)
SELECT
"person_id",
"death_date",
"death_type_concept_id"
FROM ranked_deaths
WHERE row_num = 1;
```
- [Broadsea Atlas](https://github.com/OHDSI/Broadsea/blob/7e61aaeff305d1ee621d278dc643e06d5793a2b6/docker-compose.yml#L95)示範`table`

### condition era & drug era
- [官方網站SQL](https://ohdsi.github.io/CommonDataModel/sqlScripts.html#drug_eras)
- [官方GithubSQL](https://github.com/OHDSI/ETL-CMS/tree/master/SQL)
## Broadsea Atlas, Postgres, Achilles, R Studio
- [Broadsea Atlas docker-compose.yaml](https://github.com/OHDSI/Broadsea/blob/7e61aaeff305d1ee621d278dc643e06d5793a2b6/docker-compose.yml#L95) (`L95`, 改`FLYWAY_DATASOURCE_PASSWORD=mypass`)
- `Broadsea Postgres`建立`Achilles`用`schema`準備
- `cdmDatabaseSchema`
- [Github](https://github.com/OHDSI/CommonDataModel/releases/tag/v5.4.1)下載解包,跑`ddl.sql`,`indices.sql`,`primary_keys.sql`。記得改`@cdmDatabaseSchema`至自己的`schema`名,此`schema`將為`Achilles`的`cdmDatabaseSchema`
- `schema`建立完成將自己的`cdm tables: person, measurement, observation, etc`上傳
- `vocabulary`的`table`內容於[Athena官網](https://athena.ohdsi.org/)建帳戶下載後上傳
- `resultsDatabaseSchema`準備
- [Github](https://gist.github.com/asd1245dss/bdc0770e089677ff3c3ef1a052d89df3)或[WebAPI](http://127.0.0.1/WebAPI/ddl/results?dialect=postgresql&schema=results&vocabSchema=cdm&tempSchema=temp&initConceptHierarchy=true),複製`sql`,`schema`記得改自己`schema`名稱,此`schema`將為`Achilles`的`resultsDatabaseSchema`
- `Achilles`生成`results`
- 登入(`127.0.0.1:8787, username:ohdsi, password:mypass`)
```r=
if (!require("remotes")) install.packages("remotes")
remotes::install_github("OHDSI/Achilles")
library(Achilles)
connectionDetails <- createConnectionDetails(
dbms="postgresql",
server="broadsea-atlasdb/postgres",
user="postgres",
password='mypass',
port="5432"
)
achilles(connectionDetails,
cdmDatabaseSchema = "cdm",
resultsDatabaseSchema = "cdm_results",
vocabDatabaseSchema = "cdm",
numThreads = 4,
sourceName = "CDM",
cdmVersion = "5.4",
createTable = TRUE,
createIndices = TRUE,
sqlDialect = "postgresql"
)
```
- `Achilles`執行約需`1`小時(依資料量)
- 執行完`results schema`會有`achilles_results`等相關`tables`產生
- 
- `Broadsea postgres`修改`webapi`的`tables`
- `Achilles`生成完`results`需要於`Broadsea postgres`中`webapi.source`和`webapi.source_daimon`下`sql`進行修改來讓`Atlas`知道顯示來源
- `webapi.source`的`sql`範例
```sql=
INSERT INTO "webapi"."source" (
"source_id",
"source_name",
"source_key",
"source_connection",
"source_dialect",
"username",
"password",
"krb_auth_method",
"keytab_name",
"krb_keytab",
"krb_admin_server",
"deleted_date",
"created_by_id",
"created_date",
"modified_by_id",
"modified_date" ,
"is_cache_enabled"
)
VALUES
(
2,
'CDM',
'OHDSI',
'jdbc:postgresql://ip:port/postgres?user=postgres&password=postgres',
'postgresql',
NULL,
NULL,
'PASSWORD',
NULL,
NULL,
NULL,
NULL,
NULL,
NULL,
NULL,
NULL,
true
);
```
- `webapi.source_daimon`的`sql`範例
```sql=
INSERT INTO "webapi"."source_daimon"("source_daimon_id", "source_id", "daimon_type", "table_qualifier", "priority") VALUES (4, 2, 0, 'cdm', 2);
INSERT INTO "webapi"."source_daimon"("source_daimon_id", "source_id", "daimon_type", "table_qualifier", "priority") VALUES (5, 2, 1, 'cdm', 2);
INSERT INTO "webapi"."source_daimon"("source_daimon_id", "source_id", "daimon_type", "table_qualifier", "priority") VALUES (6, 2, 2, 'cdm_results', 2);
```
- 刷新添增
- [http://127.0.0.1/WebAPI/source/refresh](http://127.0.0.1/WebAPI/source/refresh)刷新`WebAPI` (記得改自己`ip`)
- 
- [http://127.0.0.1/atlas/#/configure](http://127.0.0.1/atlas/#/configure)檢視 (記得改自己`ip`)
- 
- 顯示圖
- 
- 
- 
- 
- 
- 
- 
- 
- 
- [Atlas顯示SQL參考](https://github.com/OHDSI/Achilles/tree/main/inst/sql/sql_server/export)
### Concept Sets, Cohort Definition, Cohort Pathways
1. `Search`用`concept_id`搜,綠色`Add To New Concept Set`

2. `Cohort Sets`,選`concept_id`,綠色`Add`,右上綠色`Save`

3. `Cohort Definition`設定條件,性別,年齡,日期等,右上綠色`Save`

4. `Cohort Pathways`, `target`設大範圍(有高血壓),小範圍設細條件(女性,50歲+, 高血壓),設好後`Execution`生圖和報告



## [DQD (DataQualityDashboard)](https://github.com/OHDSI/DataQualityDashboard) 分析
```r=
remotes::install_github("OHDSI/SqlRender") #如有可跳過
install.packages("remotes")
remotes::install_github("OHDSI/DataQualityDashboard")
connectionDetails <- DatabaseConnector::createConnectionDetails(
dbms = "postgresql",
user = "postgres",
password = "mypass",
server = "ip/databasename",
port = "port")
DataQualityDashboard::executeDqChecks(
connectionDetails = connectionDetails,
cdmDatabaseSchema = "cdm",
resultsDatabaseSchema = "dqd_results",
cdmSourceName = "CDM",
cdmVersion = "5.4", # 根據 CDM 版本修改
numThreads = 4,
outputFolder = "output",
outputFile = "results.json"
)
jsonFilePath <- "/home/ohdsi/output/results.json" #要full-path, 沒有會跳error
DataQualityDashboard::viewDqDashboard(jsonFilePath)
```
- `cdm_source`的`table`記得查一下放相關值,不然`DQD`會跳`error`
- 
- 範例
- `DQD`結果
- 
- 
## White Rabbit & Rabbit In A Hat
- [White Rabbit & Rabbit In A Hat下載](https://github.com/OHDSI/WhiteRabbit/releases/)
- `White Rabbit Location Database連結`
- `Postgres RESEARCH`讀取`table`問題


#### Windows

- [postgres bug](https://github.com/OHDSI/WhiteRabbit/issues/416)
- 原始碼判斷有`bug`,[v1.0.0](https://github.com/OHDSI/WhiteRabbit/releases/tag/v1.0.0), [v0.10.6](https://github.com/OHDSI/WhiteRabbit/releases/tag/v0.10.6)可連結數據庫但無法正常讀取`RESEARCH database`的`table`內容,目前放棄資料庫方式。
- `White Rabbit Scan 手動上傳csv`
- 除連結資料庫外,也可以手動上傳`csv`生成`scan report`, `csv`來源為資料庫`SELECT FROM TABLE LIMIT 100`,有包含到想要欄位即可。
- 
- `White Rabbit`生成的`scan report`將會餵入下一工具`Rabbit In A Hat`
- `Rabbit In A Hat`
- 
- 上傳剛剛`White Rabbit`生成之`scan report`進行欄位連連看
- 欄位連完按`Generate`,選擇產出檔案,有`word`, `xlsx`, `sql`, 等
- 
- 生成`SQL`範例
## Broadsea 3.5 & HTTPS
- Clone
- https://github.com/OHDSI/Broadsea/releases/tag/v3.5.0
- `git clone https://github.com/OHDSI/Broadsea`
- 拿憑證`*.crt`,`ssl.key`
```
cat STAR_vghtpe_gov_tw.crt USERTrustRSAAAACA.crt SectigoRSADomainValidationSecureServerCA.crt AAACertificateServices.crt > ./certs/broadsea.crt
mv ssl.key ./certs/broadsea.key
```
- 改`.env`,相對應`HOST`
- 
- `Up & Down`
- `docker compose --profile default up -d`
- `docker compose --profile default down -v`

## Postgrest API
- Postgres
```sql=
CREATE ROLE web_anon NOLOGIN;
GRANT USAGE ON SCHEMA nhi TO web_anon;
GRANT SELECT ON ALL TABLES IN SCHEMA nhi TO web_anon;
```
- https://github.com/PostgREST/postgrest
```yaml=
version: '3.8'
services:
postgrest:
image: postgrest/postgrest:latest
container_name: postgrest
environment:
PGRST_DB_URI: postgres://user:password@postgres_ip:port/database
PGRST_DB_SCHEMA: nhi
PGRST_DB_ANON_ROLE: web_anon
PGRST_SERVER_PORT: 3000
ports:
- "3000:3000"
restart: always
```
- curl
- `curl "http://postgrest_ip:3000/person?limit=10"`
- 
## 10.221.252.51 硬碟塞爆
- `2024/12/16`,`10.221.252.51`運行的`docker-compose Broadsea`中`Atlas postgres`突然斷線,`DB`子凱來電,通知硬碟被塞爆,判斷是`docker-compose up`的工程師`postgres volume`放錯地方,沒放進`120T`的空間,放錯進`50G /root`
- `ssh`進去`10.221.252.51`, `docker-compose.yml`所有`volumes`的`path`發現為`docker`默認,所有`volume`被放到`/root`, 應該改為`/mnt/bgcdb/void/...`連接到`120T bgcdb`

# 鄭嘉祥
## 匯入Athena VOCAB資料
- 從官方[GitHub](https://github.com/OHDSI/CommonDataModel/releases/tag/v5.4.2)下載並執行SQL檔案就可以在資料庫建立OMOP CDM相關表格
- 於[Athena](https://athena.ohdsi.org/search-terms/start)官網下載VOCAB壓縮檔,其中為收錄相關字彙的CSV檔案(`ICD10PCS`選項須打勾才能處理`DD`表格中ICD10的手術代碼)

- 於pgAdmin的PSQL Tool中執行以下指令(需根據檔案位置變更路徑) 以匯入VOCAB資料至相對應的表格
```cmd=
\COPY cdm.DRUG_STRENGTH FROM 'C:\Users\vghuser\Downloads\VOCAB\DRUG_STRENGTH.csv' WITH DELIMITER E'\t' CSV HEADER QUOTE E'\b' ENCODING 'UTF8';
\COPY cdm.CONCEPT FROM 'C:\Users\vghuser\Downloads\VOCAB\CONCEPT.csv' WITH DELIMITER E'\t' CSV HEADER QUOTE E'\b' ENCODING 'UTF8';
\COPY cdm.CONCEPT_RELATIONSHIP FROM 'C:\Users\vghuser\Downloads\VOCAB\CONCEPT_RELATIONSHIP.csv' WITH DELIMITER E'\t' CSV HEADER QUOTE E'\b' ENCODING 'UTF8';
\COPY cdm.CONCEPT_ANCESTOR FROM 'C:\Users\vghuser\Downloads\VOCAB\CONCEPT_ANCESTOR.csv' WITH DELIMITER E'\t' CSV HEADER QUOTE E'\b' ENCODING 'UTF8';
\COPY cdm.CONCEPT_SYNONYM FROM 'C:\Users\vghuser\Downloads\VOCAB\CONCEPT_SYNONYM.csv' WITH DELIMITER E'\t' CSV HEADER QUOTE E'\b' ENCODING 'UTF8';
\COPY cdm.VOCABULARY FROM 'C:\Users\vghuser\Downloads\VOCAB\VOCABULARY.csv' WITH DELIMITER E'\t' CSV HEADER QUOTE E'\b' ENCODING 'UTF8';
\COPY cdm.RELATIONSHIP FROM 'C:\Users\vghuser\Downloads\VOCAB\RELATIONSHIP.csv' WITH DELIMITER E'\t' CSV HEADER QUOTE E'\b' ENCODING 'UTF8';
\COPY cdm.CONCEPT_CLASS FROM 'C:\Users\vghuser\Downloads\VOCAB\CONCEPT_CLASS.csv' WITH DELIMITER E'\t' CSV HEADER QUOTE E'\b' ENCODING 'UTF8';
\COPY cdm.DOMAIN FROM 'C:\Users\vghuser\Downloads\VOCAB\DOMAIN.csv' WITH DELIMITER E'\t' CSV HEADER QUOTE E'\b' ENCODING 'UTF8';
```
## 原始資料清理
### ICD疾病代碼
```sql=
DROP TABLE IF EXISTS patient_visits;
CREATE TABLE patient_visits AS
SELECT *
FROM dblink(
'dbname=NHIRaw host=10.221.252.51 port=5436 user=postgres password=mypass',
$$WITH outpatient_icd_10 AS (
SELECT
uuid,
func_date::DATE AS start_date,
COALESCE(CAST(treat_end_date AS DATE), func_date::DATE) AS end_date,
UNNEST(STRING_TO_ARRAY(acode_icd9_1, ' ')) AS icd_code,
func_type
FROM
"NEPHR"."CD"
WHERE
func_date::DATE >= '2016-01-01'::DATE
GROUP BY
uuid,
func_date::DATE,
COALESCE(CAST(treat_end_date AS DATE), func_date::DATE),
UNNEST(STRING_TO_ARRAY(acode_icd9_1, ' ')),
func_type
UNION ALL
SELECT
uuid,
func_date::DATE,
COALESCE(CAST(treat_end_date AS DATE), func_date::DATE),
UNNEST(STRING_TO_ARRAY(acode_icd9_2, ' ')),
func_type
FROM
"NEPHR"."CD"
WHERE
func_date::DATE >= '2016-01-01'::DATE
GROUP BY
uuid,
func_date::DATE,
COALESCE(CAST(treat_end_date AS DATE), func_date::DATE),
UNNEST(STRING_TO_ARRAY(acode_icd9_2, ' ')),
func_type
UNION ALL
SELECT
uuid,
func_date::DATE,
COALESCE(CAST(treat_end_date AS DATE), func_date::DATE),
UNNEST(STRING_TO_ARRAY(acode_icd9_3, ' ')),
func_type
FROM
"NEPHR"."CD"
WHERE
func_date::DATE >= '2016-01-01'::DATE
GROUP BY
uuid,
func_date::DATE,
COALESCE(CAST(treat_end_date AS DATE), func_date::DATE),
UNNEST(STRING_TO_ARRAY(acode_icd9_3, ' ')),
func_type
),
outpatient_icd_9 AS (
SELECT
uuid,
func_date::DATE AS start_date,
COALESCE(CAST(treat_end_date AS DATE), func_date::DATE) AS end_date,
UNNEST(STRING_TO_ARRAY(acode_icd9_1, ' ')) AS icd_code,
func_type
FROM
"NEPHR"."CD"
WHERE
func_date::DATE < '2016-01-01'::DATE
GROUP BY
uuid,
func_date::DATE,
COALESCE(CAST(treat_end_date AS DATE), func_date::DATE),
UNNEST(STRING_TO_ARRAY(acode_icd9_1, ' ')),
func_type
UNION ALL
SELECT
uuid,
func_date::DATE,
COALESCE(CAST(treat_end_date AS DATE), func_date::DATE),
UNNEST(STRING_TO_ARRAY(acode_icd9_2, ' ')),
func_type
FROM
"NEPHR"."CD"
WHERE
func_date::DATE < '2016-01-01'::DATE
GROUP BY
uuid,
func_date::DATE,
COALESCE(CAST(treat_end_date AS DATE), func_date::DATE),
UNNEST(STRING_TO_ARRAY(acode_icd9_2, ' ')),
func_type
UNION ALL
SELECT
uuid,
func_date::DATE,
COALESCE(CAST(treat_end_date AS DATE), func_date::DATE),
UNNEST(STRING_TO_ARRAY(acode_icd9_3, ' ')),
func_type
FROM
"NEPHR"."CD"
WHERE
func_date::DATE < '2016-01-01'::DATE
GROUP BY
uuid,
func_date::DATE,
COALESCE(CAST(treat_end_date AS DATE), func_date::DATE),
UNNEST(STRING_TO_ARRAY(acode_icd9_3, ' ')),
func_type
),
inpatient_icd_10 AS (
SELECT
uuid,
in_date::DATE AS start_date,
COALESCE(CAST(out_date AS DATE), in_date::DATE) AS end_date,
UNNEST(STRING_TO_ARRAY(icd9cm_code, ' ')) AS icd_code,
func_type
FROM
"NEPHR"."DD"
WHERE
in_date::DATE >= '2016-01-01'::DATE
GROUP BY
uuid,
in_date::DATE,
COALESCE(CAST(out_date AS DATE), in_date::DATE),
UNNEST(STRING_TO_ARRAY(icd9cm_code, ' ')),
func_type
UNION ALL
SELECT
uuid,
in_date::DATE,
COALESCE(CAST(out_date AS DATE), in_date::DATE),
UNNEST(STRING_TO_ARRAY(icd9cm_code_1, ' ')),
func_type
FROM
"NEPHR"."DD"
WHERE
in_date::DATE >= '2016-01-01'::DATE
GROUP BY
uuid,
in_date::DATE,
COALESCE(CAST(out_date AS DATE), in_date::DATE),
UNNEST(STRING_TO_ARRAY(icd9cm_code_1, ' ')),
func_type
UNION ALL
SELECT
uuid,
in_date::DATE,
COALESCE(CAST(out_date AS DATE), in_date::DATE),
UNNEST(STRING_TO_ARRAY(icd9cm_code_2, ' ')),
func_type
FROM
"NEPHR"."DD"
WHERE
in_date::DATE >= '2016-01-01'::DATE
GROUP BY
uuid,
in_date::DATE,
COALESCE(CAST(out_date AS DATE), in_date::DATE),
UNNEST(STRING_TO_ARRAY(icd9cm_code_2, ' ')),
func_type
UNION ALL
SELECT
uuid,
in_date::DATE,
COALESCE(CAST(out_date AS DATE), in_date::DATE),
UNNEST(STRING_TO_ARRAY(icd9cm_code_3, ' ')),
func_type
FROM
"NEPHR"."DD"
WHERE
in_date::DATE >= '2016-01-01'::DATE
GROUP BY
uuid,
in_date::DATE,
COALESCE(CAST(out_date AS DATE), in_date::DATE),
UNNEST(STRING_TO_ARRAY(icd9cm_code_3, ' ')),
func_type
UNION ALL
SELECT
uuid,
in_date::DATE,
COALESCE(CAST(out_date AS DATE), in_date::DATE),
UNNEST(STRING_TO_ARRAY(icd9cm_code_4, ' ')),
func_type
FROM
"NEPHR"."DD"
WHERE
in_date::DATE >= '2016-01-01'::DATE
GROUP BY
uuid,
in_date::DATE,
COALESCE(CAST(out_date AS DATE), in_date::DATE),
UNNEST(STRING_TO_ARRAY(icd9cm_code_4, ' ')),
func_type
),
inpatient_icd_9 AS (
SELECT
uuid,
in_date::DATE AS start_date,
COALESCE(CAST(out_date AS DATE), in_date::DATE) AS end_date,
UNNEST(STRING_TO_ARRAY(icd9cm_code, ' ')) AS icd_code,
func_type
FROM
"NEPHR"."DD"
WHERE
in_date::DATE < '2016-01-01'::DATE
GROUP BY
uuid,
in_date::DATE,
COALESCE(CAST(out_date AS DATE), in_date::DATE),
UNNEST(STRING_TO_ARRAY(icd9cm_code, ' ')),
func_type
UNION ALL
SELECT
uuid,
in_date::DATE,
COALESCE(CAST(out_date AS DATE), in_date::DATE),
UNNEST(STRING_TO_ARRAY(icd9cm_code_1, ' ')),
func_type
FROM
"NEPHR"."DD"
WHERE
in_date::DATE < '2016-01-01'::DATE
GROUP BY
uuid,
in_date::DATE,
COALESCE(CAST(out_date AS DATE), in_date::DATE),
UNNEST(STRING_TO_ARRAY(icd9cm_code_1, ' ')),
func_type
UNION ALL
SELECT
uuid,
in_date::DATE,
COALESCE(CAST(out_date AS DATE), in_date::DATE),
UNNEST(STRING_TO_ARRAY(icd9cm_code_2, ' ')),
func_type
FROM
"NEPHR"."DD"
WHERE
in_date::DATE < '2016-01-01'::DATE
GROUP BY
uuid,
in_date::DATE,
COALESCE(CAST(out_date AS DATE), in_date::DATE),
UNNEST(STRING_TO_ARRAY(icd9cm_code_2, ' ')),
func_type
UNION ALL
SELECT
uuid,
in_date::DATE,
COALESCE(CAST(out_date AS DATE), in_date::DATE),
UNNEST(STRING_TO_ARRAY(icd9cm_code_3, ' ')),
func_type
FROM
"NEPHR"."DD"
WHERE
in_date::DATE < '2016-01-01'::DATE
GROUP BY
uuid,
in_date::DATE,
COALESCE(CAST(out_date AS DATE), in_date::DATE),
UNNEST(STRING_TO_ARRAY(icd9cm_code_3, ' ')),
func_type
UNION ALL
SELECT
uuid,
in_date::DATE,
COALESCE(CAST(out_date AS DATE), in_date::DATE),
UNNEST(STRING_TO_ARRAY(icd9cm_code_4, ' ')),
func_type
FROM
"NEPHR"."DD"
WHERE
in_date::DATE < '2016-01-01'::DATE
GROUP BY
uuid,
in_date::DATE,
COALESCE(CAST(out_date AS DATE), in_date::DATE),
UNNEST(STRING_TO_ARRAY(icd9cm_code_4, ' ')),
func_type
)
SELECT
uuid,
start_date,
end_date,
func_type,
icd_code,
CASE
WHEN icd_type = 9 AND icd_code ~ '^[A-Z]\d+' AND SUBSTRING(icd_code, 1, 1) NOT IN ('E', 'V')
THEN 10
WHEN icd_type = 10 AND icd_code ~ '^\d+'
THEN 9
ELSE icd_type
END AS icd_type
FROM (
SELECT *, 9 AS icd_type FROM outpatient_icd_9
UNION ALL
SELECT *, 10 AS icd_type FROM outpatient_icd_10
UNION ALL
SELECT *, 9 AS icd_type FROM inpatient_icd_9
UNION ALL
SELECT *, 10 AS icd_type FROM inpatient_icd_10
)
WHERE LENGTH(icd_code) <> 0$$) AS t (
uuid TEXT,
start_date DATE,
end_date DATE,
func_type TEXT,
icd_code TEXT,
icd_type INTEGER
);
CREATE INDEX idx_patient_visits_uuid ON patient_visits (uuid);
CREATE INDEX idx_patient_visits_icd_code ON patient_visits (icd_code);
```
## OMOP Table SQL Queries
### care_site
- 紀錄醫療院所之識別碼與名稱及其醫療部門

- [醫事機構代碼等資料來源](https://info.nhi.gov.tw/IODE0000/IODE0000S09?id=1120)

- `care_site`之間為階層結構的關係,例如北榮之下有不同的主科別,主科別之下又有次科別。階層結構的關係必須記錄於`fact_relationship`表格中。
- 目前`care_site_source_value`欄位用於紀錄醫院或科別的代號。

#### 整理醫療院所數據
```python=
def retrieve_care_site_data(update_data: bool = False) -> DataFrame:
care_site_df = (
pd.read_csv(CARE_SITE_DATA_SRC_URL)
if update_data
else pd.read_csv("./csv/CARE_SITE_SOURCE_DATA.csv")
)
care_site_df = care_site_df[["醫事機構名稱", "醫事機構代碼"]]
care_site_df.columns = ["care_site_name", "care_site_source_value"]
specialties_data = pd.read_csv("./csv/SPECIALTIES.csv")
specialties_data = specialties_data[["科別", "代號"]]
specialties_data.columns = ["care_site_name", "care_site_source_value"]
subspecialties_data = pd.read_csv("./csv/SUBSPECIALTIES.csv")
subspecialties_data = subspecialties_data[["次分科", "次分科別代號"]]
subspecialties_data.columns = ["care_site_name", "care_site_source_value"]
return pd.concat(
[care_site_df, specialties_data, subspecialties_data], axis=0
).reset_index(inplace=False)
```
#### 插入醫療院所數據
```python=
def load_care_site_data(
conn: connection, schema: str, update_data: bool = False
) -> None:
cursor = conn.cursor()
cursor.execute(f"TRUNCATE {schema}.care_site;")
cursor.close()
df = retrieve_care_site_data(update_data)
data = [
(index + 1, row["care_site_name"], row["care_site_source_value"])
for index, row in df.iterrows()
]
insert_data_into_target_db(
conn,
schema,
"care_site",
["care_site_id", "care_site_name", "care_site_source_value"],
data,
)
print(f"Successfully copied data into {schema}.care_site.")
```
#### 建立care_site(醫院與主次科別)之間的關聯
```sql=
WITH src AS (
SELECT
(SELECT concept_id FROM @target_schema.concept WHERE concept_name = 'Care site' AND vocabulary_id = 'Domain') AS domain_concept_id_1,
care_site_id AS fact_id_1,
(SELECT concept_id FROM @target_schema.concept WHERE concept_name = 'Care site' AND vocabulary_id = 'Domain') AS domain_concept_id_2,
(SELECT care_site_id FROM @target_schema.care_site WHERE care_site_source_value = '02') AS fact_id_2,
(SELECT concept_id FROM @target_schema.concept WHERE concept_name = 'Care Site is part of Care Site') AS relationship_concept_id
FROM @target_schema.care_site
WHERE care_site_source_value IN ('AA', 'AB', 'AC', 'AD', 'AE', 'AF', 'AG', 'AH', 'AI', 'AJ')
UNION ALL
SELECT
(SELECT concept_id FROM @target_schema.concept WHERE concept_name = 'Care site' AND vocabulary_id = 'Domain') AS domain_concept_id_1,
care_site_id AS fact_id_1,
(SELECT concept_id FROM @target_schema.concept WHERE concept_name = 'Care site' AND vocabulary_id = 'Domain') AS domain_concept_id_2,
(SELECT care_site_id FROM @target_schema.care_site WHERE care_site_source_value = '03') AS fact_id_2,
(SELECT concept_id FROM @target_schema.concept WHERE concept_name = 'Care Site is part of Care Site') AS relationship_concept_id
FROM @target_schema.care_site
WHERE care_site_source_value IN ('BA', 'BB', 'BC', 'BD')
UNION ALL
SELECT
(SELECT concept_id FROM @target_schema.concept WHERE concept_name = 'Care site' AND vocabulary_id = 'Domain') AS domain_concept_id_1,
care_site_id AS fact_id_1,
(SELECT concept_id FROM @target_schema.concept WHERE concept_name = 'Care site' AND vocabulary_id = 'Domain') AS domain_concept_id_2,
(SELECT care_site_id FROM @target_schema.care_site WHERE care_site_source_value = '04') AS fact_id_2,
(SELECT concept_id FROM @target_schema.concept WHERE concept_name = 'Care Site is part of Care Site') AS relationship_concept_id
FROM @target_schema.care_site
WHERE care_site_source_value IN ('CA', 'CB')
UNION ALL
SELECT
(SELECT concept_id FROM @target_schema.concept WHERE concept_name = 'Care site' AND vocabulary_id = 'Domain') AS domain_concept_id_1,
care_site_id AS fact_id_1,
(SELECT concept_id FROM @target_schema.concept WHERE concept_name = 'Care site' AND vocabulary_id = 'Domain') AS domain_concept_id_2,
(SELECT care_site_id FROM @target_schema.care_site WHERE care_site_source_value = '06') AS fact_id_2,
(SELECT concept_id FROM @target_schema.concept WHERE concept_name = 'Care Site is part of Care Site') AS relationship_concept_id
FROM @target_schema.care_site
WHERE care_site_source_value IN ('HA')
UNION ALL
SELECT
(SELECT concept_id FROM @target_schema.concept WHERE concept_name = 'Care site' AND vocabulary_id = 'Domain') AS domain_concept_id_1,
care_site_id AS fact_id_1,
(SELECT concept_id FROM @target_schema.concept WHERE concept_name = 'Care site' AND vocabulary_id = 'Domain') AS domain_concept_id_2,
(SELECT care_site_id FROM @target_schema.care_site WHERE care_site_source_value = '12') AS fact_id_2,
(SELECT concept_id FROM @target_schema.concept WHERE concept_name = 'Care Site is part of Care Site') AS relationship_concept_id
FROM @target_schema.care_site
WHERE care_site_source_value IN ('DA')
UNION ALL
SELECT
(SELECT concept_id FROM @target_schema.concept WHERE concept_name = 'Care site' AND vocabulary_id = 'Domain') AS domain_concept_id_1,
care_site_id AS fact_id_1,
(SELECT concept_id FROM @target_schema.concept WHERE concept_name = 'Care site' AND vocabulary_id = 'Domain') AS domain_concept_id_2,
(SELECT care_site_id FROM @target_schema.care_site WHERE care_site_source_value = '40') AS fact_id_2,
(SELECT concept_id FROM @target_schema.concept WHERE concept_name = 'Care Site is part of Care Site') AS relationship_concept_id
FROM @target_schema.care_site
WHERE care_site_source_value IN ('GA')
UNION ALL
SELECT
(SELECT concept_id FROM @target_schema.concept WHERE concept_name = 'Care site' AND vocabulary_id = 'Domain') AS domain_concept_id_1,
care_site_id AS fact_id_1,
(SELECT concept_id FROM @target_schema.concept WHERE concept_name = 'Care site' AND vocabulary_id = 'Domain') AS domain_concept_id_2,
(SELECT care_site_id FROM @target_schema.care_site WHERE care_site_source_value = '82') AS fact_id_2,
(SELECT concept_id FROM @target_schema.concept WHERE concept_name = 'Care Site is part of Care Site') AS relationship_concept_id
FROM @target_schema.care_site
WHERE care_site_source_value IN ('FA', 'FB')
UNION ALL
SELECT
(SELECT concept_id FROM @target_schema.concept WHERE concept_name = 'Care site' AND vocabulary_id = 'Domain') AS domain_concept_id_1,
care_site_id AS fact_id_1,
(SELECT concept_id FROM @target_schema.concept WHERE concept_name = 'Care site' AND vocabulary_id = 'Domain') AS domain_concept_id_2,
(SELECT care_site_id FROM @target_schema.care_site WHERE care_site_name = '臺北榮民總醫院') AS fact_id_2,
(SELECT concept_id FROM @target_schema.concept WHERE concept_name = 'Care Site is part of Care Site') AS relationship_concept_id
FROM
@target_schema.care_site
WHERE
care_site_source_value IN (
'00', '01', '02', '03', '04', '05', '06', '07', '08', '09',
'10', '11', '12', '13', '14', '15', '22', '23', '40', '41',
'42', '43', '44', '45', '46', '47', '48', '49', '60', '61',
'62', '63', '64', '65', '66', '67', '68', '69', '81', '82',
'83', '84', '2A', '2B'
)
)
INSERT INTO @target_schema.fact_relationship (
domain_concept_id_1,
fact_id_1,
domain_concept_id_2,
fact_id_2,
relationship_concept_id
)
SELECT
domain_concept_id_1,
fact_id_1,
domain_concept_id_2,
fact_id_2,
relationship_concept_id
FROM
src;
```
### location
- 可紀錄地址、郵遞區號、經緯度等
- 目前主要紀錄院內`ID`表格中的個人居住之郵遞區號
- 此表格主要用於OMOP中`person`的`location_id`欄位

##### 插入郵遞區號數據
```sql=
INSERT INTO @target_schema.location (
location_id,
zip
)
SELECT
ROW_NUMBER() OVER(ORDER BY zipcode),
zipcode
FROM (
SELECT
id.zipcode
FROM dblink(
'dbname=@source_database host=@source_host port=@source_port user=@source_username password=@source_password',
$$SELECT
DISTINCT REPLACE(zipcode, ' ', '')
FROM
@source_schema."ID"
WHERE
zipcode IS NOT NULL AND
LENGTH(REPLACE(zipcode, ' ', '')) <> 0$$
) AS id(
zipcode VARCHAR(9)
)
UNION
-- 若有些紀錄遺失郵遞區號,可用此值代替。
SELECT 'UNKNOWN'
);
```
- [Broadsea Atlas](https://github.com/OHDSI/Broadsea/blob/7e61aaeff305d1ee621d278dc643e06d5793a2b6/docker-compose.yml#L95)示範`table`

### person
- 紀錄個人資訊,包含生日、種族、族群等資料
- 資料來源為院內`ID`表格
#### SQL範例
```sql=
WITH unknown_zipcode AS (
SELECT location_id
FROM @target_schema.location
WHERE zip = 'UNKNOWN'
),
gender AS (
SELECT concept_id, concept_code
FROM @target_schema.concept
WHERE domain_id = 'Gender'
),
race AS (
SELECT concept_id, concept_name
FROM @target_schema.concept
WHERE domain_id = 'Race'
),
ethnicity AS (
SELECT concept_id, concept_code
FROM @target_schema.concept
WHERE domain_id = 'Ethnicity'
),
person_source AS (
SELECT *
FROM dblink(
'dbname=@source_database host=@source_host port=@source_port user=@source_username password=@source_password',
$$SELECT
uuid,
sex,
birthday,
REPLACE(COALESCE(zipcode, ''), ' ', '')
FROM
@source_schema."ID"$$
) AS t(
uuid VARCHAR,
sex VARCHAR,
birthday VARCHAR,
zipcode VARCHAR
)
)
INSERT INTO @target_schema.person (
person_id,
gender_concept_id,
year_of_birth,
month_of_birth,
day_of_birth,
race_concept_id,
ethnicity_concept_id,
location_id,
care_site_id,
person_source_value,
gender_source_value
)
SELECT
ROW_NUMBER() OVER(),
CASE sex
WHEN '1' THEN (SELECT concept_id FROM gender WHERE concept_code = 'M') -- Male concept ID
WHEN '2' THEN (SELECT concept_id FROM gender WHERE concept_code = 'F') -- Female concept ID
ELSE (SELECT concept_id FROM gender WHERE concept_code = 'U') -- UNKNOWN concept ID
END,
CAST(
CASE
WHEN omop_is_valid_date(birthday, 'YYYYMMDD') THEN SUBSTRING(birthday, 1, 4)
ELSE '0001'
END AS INT
),
CAST(
CASE
WHEN omop_is_valid_date(birthday, 'YYYYMMDD') THEN SUBSTRING(birthday, 5, 2)
ELSE '01'
END AS INT
),
CAST(
CASE
WHEN omop_is_valid_date(birthday, 'YYYYMMDD') THEN SUBSTRING(birthday, 7, 2)
ELSE '01'
END AS INT
),
(SELECT concept_id FROM race WHERE concept_name = 'Unknown'),
-- From OMOP offical website: Only use this field (ethnicity_concept_id) if you have US-based data and a source of this information.
(SELECT concept_id FROM ethnicity WHERE concept_code = 'Not Hispanic'),
COALESCE(l.location_id, (SELECT location_id FROM unknown_zipcode)),
(
SELECT care_site_id
FROM @target_schema.care_site
WHERE care_site_name = '臺北榮民總醫院'
),
uuid,
sex
FROM
person_source
LEFT JOIN @target_schema.location l ON person_source.zipcode = l.zip
```
- [Broadsea Atlas](https://github.com/OHDSI/Broadsea/blob/7e61aaeff305d1ee621d278dc643e06d5793a2b6/docker-compose.yml#L95)示範`table`

### visit_occurrence
- 紀錄就診開始和結束時間
- 資料來源為院內`CD`和`DD`表格
- 目前主要紀錄門診與住院
#### CD表格

筆記:
- 可能的`visit_concept_id`值:
1. 9201: Inpatient Visit(住院訪問)
2. 9202: Outpatient Visit(門診訪問)
3. 9203: Emergency Room Visit(急診訪問)
- `visit_type_concept_id`可設為32817 - EHR(電子健康紀錄)
- [參考資料](https://ohdsi.github.io/ETL-LambdaBuilder/CPRD/CPRD_Visit_Occurrence.html)
- [Broadsea Atlas](https://github.com/OHDSI/Broadsea/blob/7e61aaeff305d1ee621d278dc643e06d5793a2b6/docker-compose.yml#L95)示範`table`

#### DD表格

- [Broadsea Atlas](https://github.com/OHDSI/Broadsea/blob/7e61aaeff305d1ee621d278dc643e06d5793a2b6/docker-compose.yml#L95)示範`table`

#### 輸入資料
```sql=
WITH source AS (
SELECT
t.uuid,
t.visit_concept_id,
t.visit_start_date,
t.visit_end_date,
t.visit_type_concept_id,
t.visit_source_value,
t.visit_source_concept_id
FROM
dblink(
'dbname=@source_database host=@source_host port=@source_port user=@source_username password=@source_password',
'SELECT DISTINCT
uuid,
9202,
CAST(func_date AS DATE),
COALESCE(CAST(treat_end_date AS DATE), CAST(func_date AS DATE)),
32834, -- EHR outpatient note
acode_icd9_1,
45880088 -- ICD-10
FROM @source_schema."CD"
WHERE CAST(func_date AS DATE) >= ''2016-01-01''
GROUP BY uuid, func_date, treat_end_date, acode_icd9_1
UNION ALL
SELECT DISTINCT
uuid,
9202,
CAST(func_date AS DATE),
COALESCE(CAST(treat_end_date AS DATE), CAST(func_date AS DATE)),
32834, -- EHR outpatient note
acode_icd9_1,
45880957 -- ICD-9
FROM @source_schema."CD"
WHERE CAST(func_date AS DATE) < ''2016-01-01''
GROUP BY uuid, func_date, treat_end_date, acode_icd9_1
UNION ALL
SELECT DISTINCT
uuid,
9201,
CAST(in_date AS DATE),
COALESCE(CAST(out_date AS DATE), CAST(in_date AS DATE)),
32829, -- EHR inpatient note
icd9cm_code,
45880088 -- ICD-10
FROM @source_schema."DD"
WHERE CAST(in_date AS DATE) >= ''2016-01-01''
GROUP BY uuid, in_date, out_date, icd9cm_code
UNION ALL
SELECT DISTINCT
uuid,
9201,
CAST(in_date AS DATE),
COALESCE(CAST(out_date AS DATE), CAST(in_date AS DATE)),
32829, -- EHR inpatient note
icd9cm_code,
45880957 -- ICD-9
FROM @source_schema."DD"
WHERE CAST(in_date AS DATE) < ''2016-01-01''
GROUP BY uuid, in_date, out_date, icd9cm_code'
) AS t(
uuid character varying(100),
visit_concept_id integer,
visit_start_date date,
visit_end_date date,
visit_type_concept_id integer,
visit_source_value character varying(50),
visit_source_concept_id integer
)
)
INSERT INTO
@target_schema.visit_occurrence (
visit_occurrence_id,
person_id,
visit_concept_id,
visit_start_date,
visit_end_date,
visit_type_concept_id,
care_site_id,
visit_source_value,
visit_source_concept_id
)
SELECT
ROW_NUMBER() OVER (),
p.person_id,
r.visit_concept_id,
r.visit_start_date,
r.visit_end_date,
r.visit_type_concept_id,
(
SELECT
care_site_id
FROM
@target_schema.care_site
WHERE
care_site_name = '臺北榮民總醫院'
),
r.visit_source_value,
r.visit_source_concept_id
FROM
source r
LEFT JOIN @target_schema.person p ON r.uuid = p.person_source_value;
```
### condition_occurrence
- 紀錄病人患有之疾病與其發病時間(定為就診時間)
- 資料來源為CD與DD表格
- 筆記:須注意疾病分類號是否為ICD-9或是ICD-10

```sql=
INSERT INTO
@target_schema.condition_occurrence (
condition_occurrence_id,
person_id,
condition_concept_id,
condition_start_date,
condition_end_date,
condition_type_concept_id,
visit_occurrence_id,
condition_source_value,
condition_source_concept_id
)
SELECT
ROW_NUMBER() OVER(),
v.person_id,
c.concept_id,
v.visit_start_date,
v.visit_end_date,
v.visit_type_concept_id,
v.visit_occurrence_id,
v.visit_source_value,
v.visit_source_concept_id
FROM
@target_schema.visit_occurrence v
LEFT JOIN @target_schema.concept c ON (
CASE
-- ICD-10 Code
WHEN v.visit_source_concept_id = 45880088 THEN (
CASE
WHEN LENGTH(v.visit_source_value) >= 4 THEN SUBSTRING(
v.visit_source_value
FROM
1 FOR 3
) || '.' || SUBSTRING(
v.visit_source_value
FROM
4
)
ELSE v.visit_source_value
END
) -- ICD-9 Code
ELSE (
CASE
WHEN v.visit_source_value LIKE 'E%'
AND LENGTH(v.visit_source_value) > 4 THEN SUBSTRING(
v.visit_source_value
FROM
1 FOR 4
) || '.' || SUBSTRING(
v.visit_source_value
FROM
5
)
WHEN v.visit_source_value LIKE 'V%'
AND LENGTH(v.visit_source_value) > 3 THEN SUBSTRING(
v.visit_source_value
FROM
1 FOR 3
) || '.' || SUBSTRING(
v.visit_source_value
FROM
4
)
WHEN LENGTH(v.visit_source_value) > 3 THEN SUBSTRING(
v.visit_source_value
FROM
1 FOR 3
) || '.' || SUBSTRING(
v.visit_source_value
FROM
4
)
ELSE v.visit_source_value
END
)
END
) = c.concept_code
WHERE
c.concept_id IS NOT NULL;
```
- 疾病統計

- 手術統計

### 表格欄位定義
[Google Sheet](https://docs.google.com/spreadsheets/d/1T2PqYeDPAvSULc3IQC1zA2Eifj7ykavq5tIVxsGu4lY/edit?usp=sharing)
## 血液透析

- 血液透析資料來源為`BDC_BASE`中`DW_Detect`與`DW_DetectA`表格。
- `DW_DetectA`資料為2021-08-05至2023-10-31。
- `DW_Detect`資料起始日期為2024-06-01。
- 資料將插入於`procedure_occurrence`中,需要`PTIMESTAMP`與`MEDICALID`欄位作為`procedure_date`與`person_source_value`。
- 忽略`PTIMESTAMP`中時間之部分
- 以`PTIMESTAMP`與`MEDICALID`作為`GROUP BY`的要件,取出每個病人做血液透析之日期
- [血液透析之Concept](https://athena.ohdsi.org/search-terms/terms/43018324)。
### SQL範例
```sql=
SELECT
TO_DATE(SUBSTRING("PTIMESTAMP", 1, 10), 'YYYY-MM-DD') AS procedure_date,
"MEDICALID" AS person_source_value
FROM (
SELECT * FROM "BDC_BASE"."DW_Detect"
UNION ALL
SELECT * FROM "BDC_BASE"."DW_DetectA"
)
WHERE
TO_DATE(SUBSTRING("PTIMESTAMP", 1, 10), 'YYYY-MM-DD') >= '2010-01-01'::DATE AND
"MEDICALID" ~ '^[0-9]+$'
GROUP BY
procedure_date, person_source_value;
```
## 錯誤排除
### Achilles錯誤
#### 錯誤訊息:`ERROR: index "idx_ar_aid" does not exist`
- 根據[此討論](https://forums.ohdsi.org/t/error-index-idx-ar-aid-does-not-exist-sql-drop-index-ohdsi-idx-ar-aid/10641),此錯誤並非為嚴重錯誤。
#### 錯誤訊息:`ERROR: could not resize shared memory`
- 解決方式為於`docker-compose.yml`中增加`shm_size`的值(預設為`"64m"`)。例如:
```yaml=
services:
postgres:
image: postgres:latest
environment:
POSTGRES_PASSWORD: your_password
ports:
- "5432:5432"
shm_size: "8g" # Set shared memory to 8GB
# ... other configurations
```
## 自動化
利用Python腳本自動初始化資料庫並執行SQL檔案來匯入資料。

## Broadsea 3.5



## omoponfhir-main
- [GitHub](https://github.com/omoponfhir/omoponfhir-main)
### 解決無Patient問題
- [解決方式](https://github.com/omoponfhir/omoponfhir-main/issues/3#issuecomment-569881201)
- omoponfhir執行後,需要自行載入`person_id`至`f_person`表格中,而其他欄位可為NULL。
- 範例:
```sql=
INSERT INTO sas.f_person (
person_id
)
SELECT
person_id
FROM
sas.person;
```
```sql=
INSERT INTO sas.f_observation_view (
observation_id,
observation_date,
observation_datetime,
observation_source_value,
qualifier_source_value,
unit_source_value,
value_as_number,
value_as_string,
value_source_value,
person_id,
observation_concept_id,
observation_source_concept_id,
observation_type_concept_id,
provider_id,
qualifier_concept_id,
unit_concept_id,
value_as_concept_id,
visit_occurrence_id
)
SELECT
observation_id,
observation_date,
observation_datetime,
observation_source_value,
qualifier_source_value,
unit_source_value,
value_as_number,
value_as_string,
value_source_value,
person_id,
observation_concept_id,
observation_source_concept_id,
observation_type_concept_id,
provider_id,
qualifier_concept_id,
unit_concept_id,
value_as_concept_id,
visit_occurrence_id
FROM
sas.observation;
```
### 範例:取得病人資料
- http://localhost:8080/omoponfhir4/fhir/Patient/1

## 參考連結
[Entity-relationship Diagram of OMOP CDM](https://ohdsi.github.io/CommonDataModel/cdm54erd.html)
[OHDSI Software Tools](https://www.ohdsi.org/software-tools/)
[WhiteRabbit](https://ohdsi.github.io/WhiteRabbit/WhiteRabbit.html)
[健保特約醫療院所名冊(特約醫事機構代碼)](https://info.nhi.gov.tw/IODE0000/IODE0000S09?id=1120)
[健保用藥品項114年1月查詢檔](https://www.nhi.gov.tw/ch/cp-5622-8092a-2466-1.html)
[支付標準代碼](https://info.nhi.gov.tw/INAE5000/INAE5001S01)
[健保用藥品項查詢項目檔](https://data.gov.tw/dataset/23715)
[健保資料庫(NHIRD)之資料庫串聯](https://linhsingfen.blogspot.com/2019/01/nhird_2.html)
[OHDSI Tool Documents](https://rwd-data-environment-in-hospital.github.io/Documents/Usagi_setup.html)
[Atlas準備作業](https://github.com/OHDSI/WebAPI/wiki/CDM-Configuration)