Main objectives
This recipe lists available tools for data content extraction and harmonisation through extract, transform, load (ETL) processes, and serves as an entry point for implementing ETL.
Data extraction, transformation and loading (ETL) is a common practice in the FAIRification of biomedical data. In biomedicine, it means extracting data from different sources and transforming it into a cohesive dataset, which includes enriching the metadata and validating the different data types. Building a scalable and portable ETL system that supports data exchange under different validation and transformation rules, on both local and cloud servers, is an important step. We identified the essential tasks in the ETL process and provide tools for each of them. The tools listed below were discovered through FAIRplus interviews with other groups and through the tool discovery workflow we proposed.
:bulb: The tools listed below should not be considered recommendations. This is a living list. Please let us know if you find that any of the tools are deprecated.
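To make the extract, transform, validate and load steps described above concrete, here is a minimal Python sketch that uses only the standard library. It is an illustration under stated assumptions, not one of the listed tools: the two CSV sources, their column names, the unit conversion and the function names are all hypothetical.

```python
import csv
import io

# Hypothetical inputs: two sources describing samples with different
# column names and units (kilograms versus grams).
SOURCE_A = "sample_id,weight_kg\nS1,70.5\nS2,82.0\n"
SOURCE_B = "id,weight_g\nS3,65000\nS4,not_recorded\n"


def extract(text):
    """Extract: parse raw CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows, id_field, weight_field, divisor, source):
    """Transform: map source-specific fields onto one shared schema."""
    harmonised, rejected = [], []
    for row in rows:
        try:
            weight_kg = float(row[weight_field]) / divisor
        except ValueError:
            # Validate: set aside rows whose weight is not a number.
            rejected.append(row)
            continue
        harmonised.append(
            {
                "sample_id": row[id_field],
                "weight_kg": weight_kg,
                # Enrich: record which source the row came from.
                "source": source,
            }
        )
    return harmonised, rejected


def load(records, target):
    """Load: append harmonised records to the target store (a list here)."""
    target.extend(records)
    return target


dataset, rejected = [], []
for text, id_field, weight_field, divisor, name in [
    (SOURCE_A, "sample_id", "weight_kg", 1, "source_a"),
    (SOURCE_B, "id", "weight_g", 1000, "source_b"),
]:
    good, bad = transform(extract(text), id_field, weight_field, divisor, name)
    load(good, dataset)
    rejected.extend(bad)
```

Rows that fail validation are collected in `rejected` rather than silently dropped, so they can be reviewed or corrected at the source. In a real pipeline the in-memory list would be replaced by a database or file store, and the validation rules would typically come from a schema.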
Graphical Overview of the FAIRification Recipe Objectives
graph TB
subgraph Data extraction