# [PAPER] BEHRT: Transformer for Electronic Health Records

:::info
**Authors** : Yikuan Li, Shishir Rao, José Roberto Ayala Solares, Abdelaali Hassaine, Rema Ramakrishnan, Dexter Canoy, Yajie Zhu, Kazem Rahimi & Gholamreza Salimi-Khorshidi
**Paper Link** : https://www.nature.com/articles/s41598-020-62922-y
**Code** : https://github.com/deepmedicine/BEHRT
:::
## Contributions
* **BERT + EHR = BEHRT** (the name nests EHR inside BERT: B-EHR-T)

* Proposes a pre-trained model that predicts the likelihood of each of 301 conditions appearing in a patient's future visits.
* Embeds the tabular sequence data by summing, on top of each diagnosis-code embedding, an age embedding (carrying both etiology and visit-interval information), a positional encoding (visit order), and a segment embedding (delimiting adjacent visits), as shown in the sketch below.

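A minimal PyTorch sketch of this four-way embedding sum; the vocabulary sizes and hidden width below are illustrative assumptions, not the paper's exact configuration:

```python
import torch
import torch.nn as nn

class BEHRTEmbedding(nn.Module):
    """Input layer in the spirit of BEHRT: each token's representation is
    the sum of four embeddings. All sizes here are illustrative assumptions."""

    def __init__(self, n_codes=301, n_ages=112, n_segments=2,
                 max_len=512, hidden=288):
        super().__init__()
        self.code = nn.Embedding(n_codes, hidden)        # diagnosis code
        self.age = nn.Embedding(n_ages, hidden)          # patient age at the visit
        self.segment = nn.Embedding(n_segments, hidden)  # alternating A/B marker per visit
        self.position = nn.Embedding(max_len, hidden)    # order within the sequence
        self.norm = nn.LayerNorm(hidden)

    def forward(self, codes, ages, segments):
        # codes, ages, segments: (batch, seq_len) integer tensors
        pos = torch.arange(codes.size(1), device=codes.device).unsqueeze(0)
        x = (self.code(codes) + self.age(ages)
             + self.segment(segments) + self.position(pos))
        return self.norm(x)
```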
* Pre-training uses masked language modelling (MLM) as in BERT: disease codes are randomly masked within a patient's time-series data, and the model learns to predict them (see the masking sketch after this list).
* To verify that the model has learned overall disease progression, it is evaluated on three downstream tasks: predicting diagnoses 1) at the next visit, 2) within the next 6 months, and 3) within the next 12 months.
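A rough sketch of the BERT-style masking step over disease codes; the 80/10/10 corruption split follows BERT's recipe, while the function name and the `-100` ignore-index convention are assumptions here:

```python
import torch

def mask_disease_codes(codes, mask_id, vocab_size, mask_prob=0.15):
    """BERT-style MLM masking over a batch of disease-code sequences.
    Returns corrupted inputs and labels; -100 marks positions that are
    not predicted (the default ignore index of nn.CrossEntropyLoss)."""
    labels = codes.clone()
    # Select ~15% of positions as prediction targets.
    selected = torch.rand(codes.shape, device=codes.device) < mask_prob
    labels[~selected] = -100

    corrupted = codes.clone()
    r = torch.rand(codes.shape, device=codes.device)
    # Of the selected positions: 80% -> [MASK], 10% -> random code, 10% unchanged.
    corrupted[selected & (r < 0.8)] = mask_id
    random_codes = torch.randint(vocab_size, codes.shape, device=codes.device)
    replace = selected & (r >= 0.8) & (r < 0.9)
    corrupted[replace] = random_codes[replace]
    return corrupted, labels
```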
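For these downstream tasks, prediction is multi-label over the 301 conditions, so one sigmoid output per condition trained with binary cross-entropy is a natural head; the pooling choice and names below are assumptions, not necessarily the paper's exact design:

```python
import torch.nn as nn

class DiseasePredictionHead(nn.Module):
    """Multi-label head: one logit per condition, trained with
    nn.BCEWithLogitsLoss. Pooling the first position is an assumption."""

    def __init__(self, hidden=288, n_conditions=301):
        super().__init__()
        self.classifier = nn.Linear(hidden, n_conditions)

    def forward(self, sequence_output):
        # sequence_output: (batch, seq_len, hidden) from the encoder
        pooled = sequence_output[:, 0]   # first-position pooling
        return self.classifier(pooled)   # raw logits, one per condition
```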