Fuqi Xu

@7GH6ArIbRnm_7fgcv8mmWw

Joined on Apr 8, 2020

  • Main objectives: This document lists available tools for data content extraction and harmonisation (ETL). This recipe serves as an entry point to the implementation of ETL. Data extraction, transformation and loading is a common practice in the FAIRification of biomedical data. In biomedicine, it means extracting data from different sources and transforming them into a cohesive dataset, which includes enriching the metadata, validating the different data types, etc. Building a scalable and portable ETL system that supports data exchange with different validation and transformation rules on both local and cloud servers is an important step. We identified the essential steps in the ETL process and provide tools for each task. The tools listed below were discovered through FAIRplus interviews with other groups as well as through the tool discovery workflow we proposed. :bulb: The tools listed below should not be considered recommendations. This is a live list. Please let us know if you find any tools deprecated or blahblah Graphical Overview of the FAIRification Recipe Objectives graph TB subgraph Data extraction
  • :woman: UC1: I am a project lead, I want to have a data management plan which supports FAIR-by-design. graph LR Q1[What is the <br> dataset/project status] Q2[What is the <br> data type] Q6[Which FAIR principle <br> do you want to focus] Q10[Do you want <br> to perform FAIR <br> assessment?] A1_1[Prospective data] A1_2[Retrospective <br> data] A2_1[Metabolomics]
  • :bulb: W.I.P FASTQ file specification Format Specification Source Link Note