---
title: Frictionless Source Data
---

# Frictionless Source Data

The standard mode of operation for iSamples is to first retrieve a list of resources (advertised using sitemaps), then retrieve the advertised resources using the locations provided in the sitemaps.

```plantuml
object "Source A" as catalogA
object "iSB A" as iSBa
catalogA ..> iSBa
object "Source B" as catalogB
object "iSB B" as iSBb
catalogB ..> iSBb
object iSC
iSBa ..> iSC
iSBb ..> iSC
```

Here we examine the option of using frictionless data packages as a data source for iSB instances.

```plantuml
object "Source A" as catalogA
object "iSB A" as iSBa
catalogA ..> iSBa
object "Frictionless A" as catalogB
object "iSB B" as iSBb
note right
  Data source registered as a
  git remote repository that contains
  a frictionless data package. iSB
  resources are updated by pulling
  the resource changes for the
  designated branch.
end note
catalogB .. iSBb: git pull
object iSC
iSBa ..> iSC
iSBb ..> iSC
```

In the above diagram, a [frictionless data package](https://frictionlessdata.io/standards/) held in a git repository acts as a data source for iSB instance "B". The package is initially retrieved by cloning, and subsequent updates are retrieved efficiently via `git pull`. The package contains at least one table of records, with each row containing the data for one physical sample.

The data package may also be presented by the source as a zipped set of files, though in that case it would not have the benefit of efficient change propagation.

Note that the intent here is one-way transfer of updates, so a simple HTTP server could be used to present the repository for retrieval.

## Data Package

A frictionless data package consists minimally of two files: `datapackage.json` and a data file, such as `data.csv`. Other files such as notes, metadata, media, and additional data files may be included. The `datapackage.json` file describes the data package, the contained data resources, and the tables within those resources.
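
As an illustration of the two-file layout described above, the following is a minimal sketch that writes a `datapackage.json` descriptor alongside a `data.csv` table and then reads the sample rows back using only the Python standard library. The package name, resource name, field names (`sample_id`, `label`), and the two sample rows are hypothetical placeholders, not an iSamples schema; an actual source would likely validate the package with dedicated frictionless tooling rather than parse it by hand.

```python
import csv
import json
import tempfile
from pathlib import Path


def write_minimal_package(root: Path) -> Path:
    """Write a two-file data package (datapackage.json + data.csv) under root."""
    # Hypothetical sample rows; one row per physical sample.
    rows = [
        {"sample_id": "EX:0001", "label": "Basalt hand sample"},
        {"sample_id": "EX:0002", "label": "Lake sediment core"},
    ]
    with open(root / "data.csv", "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["sample_id", "label"])
        writer.writeheader()
        writer.writerows(rows)

    # The descriptor lists each resource and the schema of its table.
    descriptor = {
        "name": "example-samples",  # hypothetical package name
        "resources": [
            {
                "name": "samples",
                "path": "data.csv",
                "format": "csv",
                "schema": {
                    "fields": [
                        {"name": "sample_id", "type": "string"},
                        {"name": "label", "type": "string"},
                    ],
                    "primaryKey": "sample_id",
                },
            }
        ],
    }
    descriptor_path = root / "datapackage.json"
    descriptor_path.write_text(json.dumps(descriptor, indent=2), encoding="utf-8")
    return descriptor_path


def read_records(descriptor_path: Path):
    """Yield (resource name, row dict) for every row of every CSV resource."""
    descriptor = json.loads(descriptor_path.read_text(encoding="utf-8"))
    for resource in descriptor["resources"]:
        # Resource paths are resolved relative to the descriptor's directory.
        data_path = descriptor_path.parent / resource["path"]
        with open(data_path, newline="", encoding="utf-8") as f:
            for row in csv.DictReader(f):
                yield resource["name"], row


with tempfile.TemporaryDirectory() as tmp:
    path = write_minimal_package(Path(tmp))
    records = list(read_records(path))
```

Because resource paths are resolved relative to `datapackage.json`, the same reader would work unchanged on a working copy obtained by cloning the source repository and refreshed with `git pull`.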