# Life GPT Review literature

> Example questions to review a paper:
> - What do they do with the input data?
> - What model do they use (what's the underlying self-supervised learning task)?
> - What do they obtain? (It seems they often obtain a BERT model, which is encoder-only, so you get a bunch of embeddings.)
> - What do they test this stuff on?

## Shaky foundations

The review examines 84 foundation models trained on non-imaging EHR data, creating a taxonomy of their architectures, training data, and potential use cases. Most models are trained on small clinical datasets like MIMIC-III or broad biomedical corpora like PubMed, and are evaluated on tasks that do not necessarily reflect their utility in health systems.

### Categories of Clinical FMs

- Clinical Language Models (CLaMs): specialized in clinical text; capable of extracting information, summarizing medical dialogues, and predicting mechanical ventilation needs.
- Foundation Models for EMRs (FEMRs): trained on a patient's entire medical history, outputting machine-understandable representations for downstream prediction tasks.

![Screenshot 2024-06-19 at 11.55.53](https://hackmd.io/_uploads/H1e0K4xUC.png)

https://www.nature.com/articles/s41746-023-00879-8

## Med-BERT

### Input data

Med-BERT uses structured electronic health records (EHRs) from a dataset containing 28,490,650 patient records. These records include various types of structured diagnosis data. Med-BERT uses code embeddings, visit embeddings, and serialization embeddings to represent EHR data, capturing the interrelations between clinical codes within visits (a toy sketch of this embedding scheme follows at the end of this section).

### Data processing

"Given the semantic differences between EHR and text, adapting the BERT methodology to structured EHR is non-trivial. For example, while the input modality of the original BERT was a 1-D sequence of words, our input modality is structured EHR, which is recorded in a multilayer and multi-relational style. There are no clear rules on how to flatten the structured EHR into a 1-D sequence and how to encode the “structures” of the structured EHR in the BERT transformer architecture."

### Evaluation

Med-BERT was evaluated through fine-tuning on two disease prediction tasks: predicting heart failure among patients with diabetes, and the onset of pancreatic cancer. The model used pretraining tasks such as the Masked Language Model (Masked LM) and prediction of prolonged length of stay in the hospital to capture contextual semantics in EHR data.

![Screenshot 2024-06-19 at 11.36.03](https://hackmd.io/_uploads/rJo2VVlIR.png)
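To make the Med-BERT input representation concrete, here is a minimal PyTorch-style sketch in which each token's embedding is the sum of a code embedding, a visit embedding, and a serialization embedding (interpreted here as the order of a code within its visit). All names, sizes, and the within-visit interpretation of "serialization" are illustrative assumptions, not the paper's actual configuration.

```python
import torch
import torch.nn as nn

class MedBertStyleEmbedding(nn.Module):
    """Toy sketch of a Med-BERT-like input layer: sum of three embeddings.

    Vocabulary size and dimensions are hypothetical placeholders.
    """
    def __init__(self, n_codes=1000, max_visits=50, max_serial=64, dim=128):
        super().__init__()
        self.code_emb = nn.Embedding(n_codes, dim)       # which clinical code
        self.visit_emb = nn.Embedding(max_visits, dim)   # which visit the code belongs to
        self.serial_emb = nn.Embedding(max_serial, dim)  # order of the code within its visit (assumed)

    def forward(self, code_ids, visit_ids, serial_ids):
        # all inputs: (batch, seq_len) integer tensors over the flattened EHR sequence
        return self.code_emb(code_ids) + self.visit_emb(visit_ids) + self.serial_emb(serial_ids)

emb = MedBertStyleEmbedding()
codes = torch.tensor([[12, 40, 7]])   # hypothetical code ids, flattened across visits
visits = torch.tensor([[0, 0, 1]])    # first two codes from visit 0, third from visit 1
serial = torch.tensor([[0, 1, 0]])    # position of each code within its own visit
print(emb(codes, visits, serial).shape)  # torch.Size([1, 3, 128])
```

The summed embeddings would then feed a standard BERT-style encoder trained with the masked-LM and prolonged length-of-stay objectives noted above.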
## Large language models encode clinical knowledge

~~https://www.sciencedirect.com/science/article/pii/S1532046420302653~~
https://www.nature.com/articles/s41586-023-06291-2

Explores the use of large language models (LLMs) in clinical settings, focusing on their capabilities and limitations.

Objective: the paper presents MultiMedQA, a benchmark combining six existing medical question-answering datasets and a new dataset, HealthSearchQA.

Evaluation: it proposes a human evaluation framework to assess model answers on multiple axes, including factuality, comprehension, reasoning, possible harm, and bias.

Models evaluated:
- PaLM: a 540-billion-parameter LLM.
- Flan-PaLM: an instruction-tuned variant achieving state-of-the-art accuracy on multiple medical question-answering datasets.

## CPLLM: Clinical Prediction with Large Language Models

https://openreview.net/forum?id=fnBYPL5Ged

"We present Clinical Prediction with Large Language Models (CPLLM), a method that involves fine-tuning a pre-trained Large Language Model (LLM) for clinical disease prediction. We utilized quantization and fine-tuned the LLM using prompts, with the task of predicting whether patients will be diagnosed with a target disease during their next visit or in the subsequent diagnosis, leveraging their historical diagnosis records. We compared our results to various baselines, including Logistic Regression, RETAIN, and Med-BERT, which is the current state-of-the-art model for disease prediction using temporal structured EHR data. Our experiments have shown that CPLLM surpasses all the tested models in terms of PR-AUC and ROC-AUC metrics, displaying noteworthy enhancements compared to the baseline models. ... However, we want to harness the power of LLMs in understanding sequences of tokens derived from structured EHR data, specifically to train prediction models. We represent the structured data as text, where each medical concept corresponds to a word, admissions are treated as visits, and patient history is considered a document. The objectives of this study are to develop a novel method for using LLMs to train clinical predictors and to evaluate the performance of this method on real-world datasets. We used two different LLMs: Llama2, which is a general LLM (Touvron et al., 2023b), and BioMedLM, which was trained on biological and clinical text (Venigalla et al., 2022). We used three prediction tasks and two datasets and compared the performance to three baseline models."
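A minimal sketch of the kind of text serialization the quote above describes: turning a structured diagnosis history into a prompt for a causal LLM. The code-to-phrase mapping, prompt template, and target disease are invented for illustration; the paper's actual prompts and fine-tuning setup are not reproduced here.

```python
# Hypothetical example of serializing a structured diagnosis history into a
# text prompt, in the spirit of CPLLM. The mapping and template are made up.
ICD_TO_TEXT = {
    "E11": "type 2 diabetes",
    "I10": "essential hypertension",
    "N18": "chronic kidney disease",
}

def history_to_prompt(visits, target_disease):
    """visits: list of visits, each a list of ICD code strings, oldest first."""
    lines = []
    for i, codes in enumerate(visits, start=1):
        concepts = ", ".join(ICD_TO_TEXT.get(c, c) for c in codes)
        lines.append(f"Visit {i}: {concepts}.")
    return (
        "The patient has the following diagnosis history. "
        + " ".join(lines)
        + f" Will the patient be diagnosed with {target_disease}"
        " at the next visit? Answer yes or no:"
    )

print(history_to_prompt([["E11"], ["E11", "I10"]], "chronic kidney disease"))
# The resulting prompts, paired with binary labels, would then be used to
# fine-tune a (quantized) causal LLM such as Llama2 or BioMedLM.
```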
## Language models are an effective representation learning technique for electronic health record data

https://www.sciencedirect.com/science/article/pii/S1532046420302653

The aim is to demonstrate that patient representation schemes inspired by natural language processing (NLP) techniques can improve the accuracy of clinical prediction models by transferring information from a larger patient population to a smaller, relevant subset. Proposes using clinical language model-based representations (CLMBR) to leverage the structure and sequence of EHR data. Empirically evaluates the effectiveness of CLMBR on five prediction tasks and compares it with standard baselines and other representation learning techniques.

"The study was done with approval by Stanford University's Institutional Review Board. We treated each patient's record as a sequence of days $d_1, \ldots, d_N$, ordered by time. Each day consists of a set of medical codes for diagnoses, procedures, medication orders, and laboratory test orders (ICD10, CPT or HCPCS, RXCUI, and LOINC codes respectively) recorded on that day. Fig. 3 illustrates an example patient record annotated with our notation. In this study, we did not use quantitative information such as laboratory test results or vital sign measurements. We also did not use clinical notes (i.e. textual documents), images, or explicit linkages between codes, as they were not available in our de-identified EHR data due to logistical and IRB-related issues."

![Screenshot 2024-06-20 at 09.58.45](https://hackmd.io/_uploads/S1c-eu-80.png)

"As described in Section 2.1, our EHR data consisted of sequences of days $d_1, \ldots, d_N$, where each day is comprised of a set of medical codes that represents the events of that particular day. The goal of building a clinical language model is to construct a model that can predict the probability of these sequences of days, $p(d_1, \ldots, d_N)$. As is standard for many other sequence models, we factorized the probability distribution over the sequence into a series of predictions where only a single element of the sequence is predicted at a given time. In EHR data, this corresponds to predicting the next day in a patient record given the previous days, i.e., $p(d_i \mid d_1, \ldots, d_{i-1})$. Because each day $d_i$ consists of a set of medical codes, this is a set prediction problem, also known as multi-label prediction [44]. We solved this set prediction problem in two steps: first, we constructed a model for computing a fixed-length patient representation given days of history, and second, we constructed a set predictor that predicts the set of codes for the following day given that patient representation."

![Screenshot 2024-06-20 at 10.19.32](https://hackmd.io/_uploads/SkdNE_WLR.png)
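A minimal sketch of the factorization quoted above: an autoregressive encoder summarizes the days $d_1, \ldots, d_{i-1}$ into a fixed-length patient representation, and a multi-label head scores every code in the vocabulary for day $d_i$. The GRU encoder, mean-pooled day embeddings, and all sizes are assumptions for illustration, not the paper's actual CLMBR architecture.

```python
import torch
import torch.nn as nn

N_CODES, DIM = 500, 64  # hypothetical vocabulary size and hidden size

class NextDayCodePredictor(nn.Module):
    """Toy sketch of p(d_i | d_1, ..., d_{i-1}) as multi-label prediction."""
    def __init__(self):
        super().__init__()
        self.code_emb = nn.EmbeddingBag(N_CODES, DIM, mode="mean")  # pool each day's set of codes
        self.encoder = nn.GRU(DIM, DIM, batch_first=True)           # runs over days, not individual codes
        self.head = nn.Linear(DIM, N_CODES)                         # one logit per code in the vocabulary

    def forward(self, days):
        # days: list of 1-D LongTensors, each holding the codes recorded on one day
        day_vecs = torch.stack([self.code_emb(d.unsqueeze(0)).squeeze(0) for d in days])
        hidden, _ = self.encoder(day_vecs.unsqueeze(0))  # (1, n_days, DIM) patient representations
        return self.head(hidden.squeeze(0))              # logits for the day following each prefix

model = NextDayCodePredictor()
days = [torch.tensor([3, 17]), torch.tensor([42])]  # two days of history
logits = model(days)                                # (2, N_CODES)

# Training target: a multi-hot vector of the codes actually recorded on the next day.
target = torch.zeros(N_CODES)
target[7] = 1.0
loss = nn.BCEWithLogitsLoss()(logits[-1], target)
```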
## Predicting Risk of Alzheimer's Diseases and Related Dementias with AI Foundation Model on Electronic Health Records

https://www.medrxiv.org/content/10.1101/2024.04.26.24306180v1

A large-scale EHR dataset contains rich information on patients. To enable the model to understand EHR best, we designed a prediction framework with two stages (Figure 1a): (1) pretraining, where we pretrained a foundation model for EHR with a Transformer architecture on the pretraining cohort; the model was trained without labels, merely by reconstructing randomly masked information from the EHR; (2) fine-tuning, where we fine-tuned the model with the medical history and AD/ADRD/MCI outcomes in the fine-tuning cohort to identify the high-risk patients (see the Method section for more details). We conducted pretraining only on the pretraining cohort. We fine-tuned the model with the training set in the AD/ADRD/MCI fine-tuning cohort and used the validation set to examine the performance for different hyperparameter settings, which guided model selection. The performance of the selected models was evaluated on the fully held-out validation set of patients and reported as an estimate of performance in new patient cohorts.

![Screenshot 2024-06-20 at 11.30.53](https://hackmd.io/_uploads/SkjZrY-IR.png)

## CEHR-GPT: Generating Electronic Health Records with Chronological Patient Timelines

https://arxiv.org/abs/2402.04400

**"To our knowledge, this is the first attempt to utilize GPT for generating time-series EHR data."**

- We design a novel patient representation that captures visit types, discharge facilities for inpatient visits, and all temporal data, such as starting year, age, intervals between visits, and inpatient visit duration. To our knowledge, this is the first instance of fully preserving such temporal information (introducing artificial time tokens (ATT) between two neighboring visits; a toy sketch of this idea appears at the end of this note).
- We treat patient sequence generation as a language modeling problem, which allowed us to use the state-of-the-art language model, Generative Pre-trained Transformers (GPT), to learn the distribution of patient sequences and generate new synthetic sequences [8, 9].

![Screenshot 2024-06-24 at 15.05.21](https://hackmd.io/_uploads/BJ686eP8R.png)

"Because the patient representation encodes all the temporal information in the sequence, the trained GPT model could potentially be used for time-sensitive forecasting. We could prompt the trained GPT model with a patient history and estimate the time of the next visit via a Monte Carlo sampling approach."

Link to the baseline embedding model: https://arxiv.org/pdf/2111.08585

## Foresight—a generative pretrained transformer for modelling of patient timelines using electronic health records: a retrospective modelling study

https://www.thelancet.com/journals/landig/article/PIIS2589-7500(24)00025-6/fulltext

## Foundation Model for Advancing Healthcare: Challenges, Opportunities and Future Directions

https://arxiv.org/abs/2404.03264

## BLOOM

https://arxiv.org/abs/2211.05100

## Forward citation review articles for the "shaky foundations" paper

Levan: searched for articles citing the "shaky foundations" review paper, then manually selected review articles among those. Sorry for the overload; will sift through these.

* Augmented non-hallucinating large language models as medical information curators #review
* A Systematic Review of Testing and Evaluation of Healthcare Applications of Large Language Models (LLMs) #review
* Recent Advances in Large Language Models for Healthcare #review
* Generative AI and large language models in health care: pathways to implementation #review
* Large language models in medical and healthcare fields: applications, advances, and challenges #review
* Leveraging foundation and large language models in medical artificial intelligence #review
* Generative artificial intelligence and ethical considerations in health care: a scoping review and ethics checklist #review
* Generative Large Language Models in Electronic Health Records for Patient Care Since 2023: A Systematic Review #review
* Advancing healthcare: the role and impact of AI and foundation models #review
* Potential of Large Language Models in Health Care: Delphi Study #review
* Unlocking the potential of large language models in healthcare: navigating the opportunities and challenges #review

## 2024 Moor et al - Foundation models for generalist medical artificial intelligence
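As promised in the CEHR-GPT notes above: a minimal sketch of (a) serializing a visit sequence with artificial time tokens between neighboring visits, and (b) a Monte Carlo estimate of time-to-next-visit by repeatedly sampling the next ATT token. The day/month bucketing, special tokens, and the uniform sampler standing in for a trained GPT are all invented for illustration; the paper's actual ATT vocabulary differs.

```python
import random

# Hypothetical ATT scheme: bucket the gap (in days) between two visits into a token.
def att_token(gap_days):
    if gap_days < 28:
        return f"D{gap_days}"        # day-level token, e.g. "D3"
    if gap_days < 365:
        return f"M{gap_days // 30}"  # month-level token, e.g. "M2"
    return "LT"                      # long-term gap

def serialize(visits):
    """visits: list of (gap_days_since_previous_visit, [codes]), oldest first."""
    tokens = ["[START]"]
    for i, (gap, codes) in enumerate(visits):
        if i > 0:
            tokens.append(att_token(gap))            # ATT between neighboring visits
        tokens += ["[VS]"] + list(codes) + ["[VE]"]  # visit start/end markers (assumed)
    return tokens

print(serialize([(0, ["E11"]), (45, ["E11", "I10"])]))
# ['[START]', '[VS]', 'E11', '[VE]', 'M1', '[VS]', 'E11', 'I10', '[VE]']

# Monte Carlo estimate of time-to-next-visit: sample the next ATT token many
# times and average the implied gap. A uniform random choice stands in for the
# conditional next-token distribution of a trained GPT model.
def sample_next_att(history_tokens):
    return random.choice([f"D{d}" for d in range(1, 28)] + ["M2", "M6"])

def att_to_days(tok):
    return int(tok[1:]) if tok.startswith("D") else int(tok[1:]) * 30

history = serialize([(0, ["E11"]), (45, ["E11", "I10"])])
draws = [att_to_days(sample_next_att(history)) for _ in range(1000)]
print(sum(draws) / len(draws))  # estimated days until the next visit
```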
