白奇剛
# Model Zoo

## What are the major problems?

## What are the corresponding datasets?

### Microbiology

* Open Source Repository: NCBI Microbiome Central - A collection of databases and tools designed to support the study of microbiomes.
* Space Experiment Data: NASA GeneLab - Provides datasets from numerous space biology experiments. Some of these experiments have focused on the effects of space on microbial organisms, including bacteria and fungi.

### Cell and Molecular Biology

* Open Source Repository: GEO (Gene Expression Omnibus) - A public functional genomics data repository that archives and freely distributes microarray, next-generation sequencing, and other forms of high-throughput functional genomics data.
* Space Experiment Data: NASA GeneLab - Contains datasets from various cell biology experiments conducted in space. For instance, studies on human cells to understand the impact of microgravity on cellular function.

### Plant Biology

* Open Source Repository: TAIR (The Arabidopsis Information Resource) - Provides a comprehensive collection of data and information on the genetics and molecular biology of the plant Arabidopsis thaliana.
* Space Experiment Data: NASA GeneLab - Includes experiments that investigate the effects of spaceflight on different plant species. For example, how plants grow in microgravity or how space radiation affects plant genetics.

### Animal Biology

* Open Source Repository: Ensembl - Offers high-quality genome-wide sequence and annotation data for vertebrates and key model organisms.
* Space Experiment Data: NASA GeneLab - Houses datasets from experiments on various animals, like rodents, sent to space. These studies can range from understanding bone density loss in microgravity to more complex behavioral studies.

### Developmental, Reproductive and Evolutionary Biology

* Open Source Repository: EvoDevoJ (Evolution & Development Journal) - While not a database in the traditional sense, this is a leading journal in the field of evolutionary developmental biology, and many articles provide supplemental data.
* Space Experiment Data: NASA GeneLab - While it may not have a vast collection in this specific field, there are some datasets that explore how microgravity affects development, reproduction, and potentially evolutionary trajectories. For instance, studies might investigate how animals develop in space from embryo to maturity.

**Microbiology**

* Human Microbiome Project (HMP) Data: A comprehensive resource that has sequences of microbial genomes found in the human body.
* IMG/M: The Integrated Microbial Genomes & Microbiomes system offers tools for the analysis of microbial community genomes.

**Cell and Molecular Biology**

* The Cancer Genome Atlas (TCGA): Detailed genomic information for over 30 types of cancer.
* Gene Expression Omnibus (GEO): A public functional genomics data repository supporting MIAME-compliant data submissions.

**Plant Biology**

* The 1001 Genomes Project for Arabidopsis thaliana: Sequencing of over 1000 different strains of the model plant Arabidopsis.
* Plant PhenomeNET: A dataset connecting phenotypic effect with gene function in plants.

**Animal Biology**

* Mouse Genome Informatics (MGI): A comprehensive database on the genetics and genomics of the laboratory mouse.
* Zebrafish Model Organism Database (ZFIN): Provides integrated access to curated zebrafish genetic and genomic data.

**Developmental, Reproductive and Evolutionary Biology**

* FaceBase: Datasets aimed at studying craniofacial development and disorders.
* TreeBASE: A repository of phylogenetic information, specifically user-submitted phylogenetic trees and the data used to generate them.
* Bgee: A database to retrieve and compare gene expression patterns in multiple animal species, produced from multiple data types such as RNA-seq, microarrays, and in situ hybridization.

## How do users use a pre-trained model?

### Earth-trained models corresponding to GeneLab data

RNA-Seq data from plants to study gene expression changes in space:

* DeepCount: A deep learning model for predicting gene expression levels based on sequence information.
* D-GEX: Uses deep learning to predict gene expression across different conditions.
* Transformer models like BERT and its variations have been adapted for biological sequences in tools like BioBERT or BioTransformers. Though they aren't pretrained on RNA-Seq data per se, they can be fine-tuned on such data.

Microbial gene expression data to study microbial behavior in space:

* DeepMAsED: A deep learning-based method for differential expression analysis.
* DRAGON: A deep learning model that can predict gene expression levels from the gene's regulatory region sequence.
* Again, Transformer models adapted for biological sequences could be fine-tuned on microbial gene expression datasets.

Animal protein expression data to study protein synthesis changes in microgravity:

* DeepProfile: Uses autoencoders to learn embeddings of gene expression profiles, which can be used for various downstream tasks.
* DeepAffinity: Predicts protein-ligand affinity using convolutional neural networks.
* AlphaFold: Though it's a model for protein structure prediction, it shows how deep learning models can be used effectively for protein-related tasks. Fine-tuning a model like AlphaFold on protein expression data can provide meaningful embeddings or predictions.

### Other image data and corresponding pre-trained models

1. Microscopy of Cellular Structures: Observing cells in space can reveal how microgravity affects cellular structure and function.
   For instance, observing changes in the cytoskeleton of cells can provide insights into how cells sense and adapt to microgravity.
2. Bone Densitometry: Astronauts in space undergo bone density loss. Imaging the bone over time using densitometry can help in understanding the rate of bone degradation and the efficacy of countermeasures.
3. MRI Scans of Astronauts' Brains: Some studies have indicated changes in astronauts' brain structures after prolonged spaceflight. MRI scans can help in mapping these changes and understanding their implications.
4. Optical Coherence Tomography (OCT) for Eye Health: Extended space missions can affect eye health. OCT provides detailed images of the retina, helping in monitoring the health of an astronaut's eyes over time.
5. Biofilm Formation: Microorganisms in space have been observed to form biofilms differently than on Earth. Observing these structures can help understand microbial behavior in space.
6. Plant Growth Patterns: Microgravity affects how plants grow. Imaging the growth patterns can provide insights into plant behavior in space, crucial for potential long-term space missions where plants might be used for food and oxygen.

* Convolutional Neural Networks (CNNs):
  * VGG (VGG16, VGG19): These are excellent for basic image classification tasks and can be fine-tuned for specific space biology imaging data.
  * ResNet (ResNet50, ResNet101): These have deeper architectures and can capture more complex patterns in images.
  * InceptionV3: Known for its efficiency and high performance in image classification.
  * U-Nets: Particularly useful for segmentation tasks, such as segmenting specific cellular structures in microscopy images.

### An example

1. Goal & Hypothesis: The space biologist aims to decipher how specific plant genes react to the microgravity conditions in space. She hypothesizes that certain genes play a pivotal role in plant adaptation to space and may be responsible for observed changes in growth or health.
2. Data Collection: She begins with the Arabidopsis thaliana datasets OSD-427 and OSD-480 from NASA GeneLab, which have RNA-Seq data of the plant in microgravity. She also has her own RNA-Seq data from a similar experiment she conducted recently.
3. Pre-trained Model Exploration: On browsing the model zoo, she identifies a promising model from 2022 named scBERT, specifically designed for RNA-Seq data. The model has been pre-trained on a vast array of Earth-based RNA-Seq datasets, making it adept at capturing the nuances of gene expression data.
4. Data Preprocessing: Before utilizing scBERT, she pre-processes the RNA-Seq data to:
   * normalize gene expression values,
   * handle missing data, and
   * align sequences and quantify them.
5. Transfer Learning with scBERT: She loads the scBERT model and fine-tunes it using her space-based RNA-Seq datasets. The model is trained on OSD-427, OSD-480, and her experiment data. During training, she adjusts the model's parameters slightly to adapt its knowledge to the specifics of microgravity-based gene expression.
6. Results & Interpretation: Once training is completed, she uses the fine-tuned scBERT model to:
   * identify genes that have significantly altered expression in space,
   * understand the potential biological pathways impacted by these genes, and
   * determine whether any of these genes are associated with stress responses, growth patterns, or other vital processes in the plant.
7. Contribute: The scientist uploads her model to the model zoo.

## Aim:

1. To **design a comprehensive database** of publicly available biomedical datasets that could be used to pretrain different models for a “model zoo,” and
2. To determine relevant publicly available space biology datasets that could then be used to refine the models to investigate specific space biology questions.

## Website/Database Design

1. To make it convenient for scientists to use: APIs for Developers: Provide APIs for programmatic access, making it easier for developers and platforms to integrate and utilize the models and datasets.
3. Each model should come with usage instructions and pointers to similar models (with tags for easy filtering).
4. Datasets should also be categorized: space or ground, image or gene, species, sex (?!)
5. About preprocessing (explain in the presentation how the data are preprocessed):
   * Normalization: Ensure datasets have consistent scales or distributions, especially when combining them.
   * Tokenization and Encoding for Genomic Data: Convert genomic sequences into a format suitable for ML (e.g., A=0, T=1, G=2, C=3).
   * Data Augmentation for Imaging Data: Generate new training examples by applying various transformations (rotations, zooming, etc.).
   * Handling Missing Values: Ensure that missing data points are either imputed or removed based on the dataset's nature.
   * Feature Extraction: For complex datasets like imaging or time series, extract essential features to reduce dimensionality.

## Topics in Space Biology

1. Data we may use:
   * Images: microscopy images, micro-CT mouse images, etc. (prioritize these when demonstrating the ML part, since we have experience with them)
   * Genes: gene data (use these when demonstrating the ML part if time allows). Ex: Datasets such as The 1000 Genomes Project, NCBI's GenBank, and GWAS Catalog provide genetic and genomic data.
3. [Space biology roadmap...](https://www.nasa.gov/wp-content/uploads/2015/03/16-03-23_sb_plan.pdf) (2016-2025), which mainly covers the following: ![](https://hackmd.io/_uploads/B1qIH5cga.png)
4. NASA missions: Free Flyer? ISS Space Biology?
5. More detailed imaging questions in space biology: Determining the dynamics and roles of various cellular organelles within cells is essential to understanding how the larger organism reacts and responds to microgravity. Dr. Rojas-Pierce and her group (http://dx.doi.org/10.4161/psb.29783; PubMed PMID: 25482812, Dec-2014) are seeking to define the contribution of vacuolar and cytoskeletal dynamics to amyloplast sedimentation and gravitropic responses in shoots. Using an agravitropic mutant, the team has recently reported that the impaired vacuole formation is the result of a mutation in a vacuolar trafficking protein, resulting in multiple organelles instead of a large central vacuole. This protein has also been shown to regulate gravitropism and protein trafficking to the vacuole.
   Using a series of fluorescence microscopy techniques, the team demonstrated that the diffuse vacuoles are independent compartments and not connected to adjacent vacuoles, and that vacuole fusion in plants is dependent on phosphoinositides (Zheng, et al., 2014: PMID 25482812).
2. [paper](https://www.sciencedirect.com/science/article/pii/S221455242200102X?via%3Dihub) "Transfer learning as an AI-based solution to address limited datasets in space medicine" mentions the following, which we can use to handle this:

   For a concrete example of the concept of transfer learning, consider the task of non-invasive detection of anemia in spaceflight with retinal images. With terrestrial training and target data, a deep learning prediction model to detect anemia with retinal images has recently been successfully developed (Mitani et al., 2020). However, in the case of space-derived training and target data, the prediction model results are likely to degrade. Transfer learning is needed when there is a limited supply of target training data, which is encountered in astronaut datasets (Weiss et al., 2016). Using existing, large datasets (terrestrial) that are related to the target domain of interest (space) presents a useful application of transfer learning (E Waisberg et al., 2022). In this example, transfer learning allows a neural network to be a viable option with an astronaut dataset that was previously considered to be too small (Fig. 1).
2. A pretrained model on RNA sequencing can be used as a base model for any space biology RNA sequencing dataset in the OSDR. We will pick a few to demo.

* Genomic Responses in Microgravity: Spaceflight induces changes at the genomic level in various organisms. By leveraging models pretrained on vast genomic datasets, researchers can refine them using spaceflight-specific datasets to understand these unique responses better.
* Bone Density and Muscle Atrophy: The prolonged stay in space causes astronauts to experience bone density loss and muscle atrophy.
  Transfer learning from datasets related to osteoporosis or muscular diseases can provide insights into space-specific conditions.
* Immune System Changes: The immune system undergoes changes in space. Transfer learning can assist in analyzing immune response based on vast immunological datasets to highlight the space-specific deviations.
* Radiation Damage: High radiation in space poses significant risks. Models trained on radiation effects on biological systems on Earth can be adapted using datasets from organisms exposed to space radiation.
* Cardiovascular Alterations: Microgravity affects cardiovascular systems. By using models pretrained on cardiovascular datasets from Earth, transfer learning can help analyze space-specific changes.
* Visual Impairment: Some astronauts experience visual impairments related to intracranial pressure. Transfer learning from datasets on related eye conditions can aid in understanding this space-specific phenomenon.
* Microbial Behavior: The behavior of microbes (like bacteria and fungi) can change in space environments, impacting the spacecraft's health and cleanliness. Transfer learning can assist in predicting microbial behavior using models trained on vast microbial datasets.
* Plant Growth and Behavior: In the bid to grow food in space, understanding how plants respond to microgravity is essential. Transfer learning from extensive plant biology datasets can be pivotal in predicting and optimizing plant growth in space.
* Neurological Effects: Space travel can influence the nervous system and cognitive functions. Transfer learning from neurobiology datasets can offer insights into these alterations.
* Psychological and Behavioral Analysis: Prolonged space missions can affect astronauts' mental health. Transfer learning can help analyze psychological datasets to predict and address potential behavioral issues during extended spaceflights.

## How do users use the model zoo?
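
1. Explore and search: Dr. Smith is running a space biology project that studies the effects of microgravity on plant genes. She needs a model to help classify gene patterns. She visits the space biology model zoo and searches for models related to plant genetics.
2. Review model details: She finds a model named "Plant Genome Sequence Classifier - Microgravity Effects". The model card explains that the model was trained on thousands of plant genome sequences and has been optimized to recognize patterns associated with microgravity effects.
3. Download/import the model: Dr. Smith downloads the model's weights and configuration files. The model zoo provides direct download links, as well as code snippets in popular programming languages to ease the import process.
4. Fine-tune: Although the model looks promising, Dr. Smith has data collected from her own specific experiments. She decides to fine-tune the model on her own data to make sure it fits her experimental conditions.
5. Deploy and use: After fine-tuning, she deploys the model in her genomic analysis pipeline. The model successfully helps her classify gene patterns, accelerating her research.
6. Feedback and contribution: A few months later, Dr. Smith improves the model further. She contributes her version back to the model zoo, together with notes on its enhanced performance on specific plant varieties.

## What should we put in the model zoo?

### Models

Models trained on the TCGA (The Cancer Genome Atlas) could be a good starting point for genomic tasks. For imaging tasks, models trained on large-scale medical imaging datasets, such as those from the RSNA (Radiological Society of North America), could be adapted.

* DeepBind:
  * Description: This deep learning model predicts DNA and RNA binding specificities for different proteins.
  * Potential Application: Understanding protein-DNA/RNA interactions in space biology, especially under microgravity conditions, to study gene regulation.
* AlphaFold:
  * Description: Developed by DeepMind, AlphaFold predicts the 3D structures of proteins based on their amino acid sequences.
  * Potential Application: Predicting the structural changes of proteins that may be induced under space conditions, which can offer insights into functional alterations.
* EpiDeep:
  * Description: This model predicts epigenomic features, such as histone modifications, from DNA sequences.
  * Potential Application: Understanding the epigenetic landscape of organisms in space and how it might differ from terrestrial conditions.
* ResNet for Microscopy:
  * Description: Residual networks (ResNets) that are fine-tuned for high-content microscopy images.
  * Potential Application: Analyzing morphological changes in cells or tissues during spaceflight, including aspects like cell shape, organelle health, and cell-cell interactions.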
* seq2seq for DNA Sequences:
  * Description: Sequence-to-sequence models that can predict, for instance, potential coding sequences within DNA.
  * Potential Application: Discovering new genes or regulatory elements that become active in space environments.

### Collection of Publicly Available Biomedical Datasets

What we can do is organize the Earth datasets from the major public platforms, group them by file format, and add tags to them (like Notion?). The NASA data should be organized as well. For space data, we will also need to list the space-specific conditions and turn them into filters. For example, Environmental Conditions:

* Microgravity
* High Radiation
* Vacuum Exposure

Earth:

* Images: Datasets like NIH's Chest X-ray dataset, The Cancer Imaging Archive (TCIA), and Dermatology Image databases can serve as a rich source for image-based diagnostics.
* Genomics: Datasets such as The 1000 Genomes Project, NCBI's GenBank, and GWAS Catalog provide genetic and genomic data.
* Proteomics and Metabolomics: The Human Protein Atlas and Metabolomics Workbench are great starting points.
* Electronic Health Records (EHR): MIMIC-III and eICU are examples of databases containing EHR data.

### Relevant Publicly Available Space Biology Datasets

* NASA's GeneLab: Houses spaceflight and spaceflight-relevant data which can be leveraged for understanding the impact of space on organisms.
* NASA Open Science Data Repository (OSDR): As mentioned, this can be used for RNA sequencing datasets and other biological data types.

## Related papers

https://www.nature.com/articles/s41586-023-06139-9
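
The tagging and filtering idea described under "Collection of Publicly Available Biomedical Datasets" could look roughly like the sketch below. It is only an illustration under stated assumptions, not a design for the actual site: `DatasetEntry`, `CATALOG`, `filter_datasets`, and the tag vocabulary (`"earth"`/`"space"`, `"image"`/`"gene"`, `"microgravity"`) are hypothetical names. The tags assigned to each entry follow how this note describes those datasets.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DatasetEntry:
    """One catalog record; all field names are hypothetical."""
    name: str
    origin: str      # "earth" or "space"
    modality: str    # "image" or "gene"
    conditions: frozenset = field(default_factory=frozenset)  # space-specific conditions


# Hypothetical catalog; entries and tags follow the descriptions in this note.
CATALOG = [
    DatasetEntry("NIH Chest X-ray dataset", "earth", "image"),
    DatasetEntry("The 1000 Genomes Project", "earth", "gene"),
    DatasetEntry("OSD-427", "space", "gene", frozenset({"microgravity"})),
    DatasetEntry("OSD-480", "space", "gene", frozenset({"microgravity"})),
]


def filter_datasets(catalog, origin=None, modality=None, conditions=()):
    """Return entries matching every given filter; None or empty means 'any'."""
    required = frozenset(conditions)
    return [
        entry for entry in catalog
        if (origin is None or entry.origin == origin)
        and (modality is None or entry.modality == modality)
        and required <= entry.conditions  # entry must carry all requested conditions
    ]


# Example: space datasets recorded under microgravity.
space_hits = filter_datasets(CATALOG, origin="space", conditions={"microgravity"})
print([entry.name for entry in space_hits])
```

Keeping the environmental conditions as a set on each entry means a new condition (for example high radiation) becomes a new filter value without changing the record layout.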
