---
title: Session E - Table 1
description: Notes from Table 1, Session E
date: 2018-12-13
author: Susan Brown, Valeria Vitale, Florian Thiery, Andreas Wagner
license: cc-by-sa 4.0
tags: LinkedPasts IV
---
Session E - Table 1
===
# Title: Tools and Workflows #LinkedPipes
## Challenges
* versioning -- if you have only RDF
* graph partitioning (named graphs, VoID, reification etc)
* Aligning ontologies and vocabularies. What are the best tools? For example [Protégé](https://protege.stanford.edu/) and [OntoME](http://ontome.dataforhistory.org/) (by the [Data for History consortium](http://dataforhistory.org/)). Historical vs non-historical, event-based vs resource-based, etc. Interoperability of different ontologies: what inconsistencies may you introduce when merging ontologies and data, and how do you find the right ontology(ies) for aligning existing datasets?
* How do you decide what is the best ontology for your project? For example when trying to choose the best ontology to describe time
* Combining archaeological data from different excavations: provenance, idiosyncratic data, different levels of granularity
* Building a community of users, also on the technical side. For example with the CWRC-Writer
* How to express different levels of academic interpretation around objects, for example around 3D objects
* How people can embed LOD in the research process (esp. Recogito); are there workflows going from Recogito to EpiDoc, GIS, etc.?
* Tool replicability; often there are things that are quite close but not quite right for purpose so we build something new; or building database AND all the tools on top of it rather than just working with the data
* Need intuitive tools for teaching people who cannot code; workflows that use open software (e.g. FromThePage to Voyant to Recogito)
* Not promoting the idea of a closed virtual work environment (one single tool to do everything); instead, have a tool inventory and keep the tools pipeline-able, i.e. modular
* pipelines are long and have many segments, and problems replicate at different stages; as an aggregator, WHG allows contributors to enhance their data (e.g. reconciliation); how to manage an update process?
* In the context of aggregation, provide tools for reconciliation and enrichment of data
* impact of modelling decisions on future work, but it is challenging to make time for this work/consulting
* API design (for exposing text, or whatever is specific about the kind of data at hand)
* Sometimes the models and standards to enable the connections are not there yet, so it is up to the practitioners to find their way around it, possibly connecting different tools
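One common answer to the versioning and graph-partitioning challenges above is to keep triples apart in named graphs. As a toy illustration, the following stdlib-only Python sketch groups quads by graph name; the subjects, labels, and graph IRIs are invented for illustration, and a real workflow would use an RDF library such as rdflib.

```python
from collections import defaultdict

# Quads: (subject, predicate, object, named_graph) -- illustrative data only.
quads = [
    ("ex:place1", "rdfs:label", "Alexandria",  "ex:graph2018"),
    ("ex:place1", "rdfs:label", "Alexandreia", "ex:graph2019"),
    ("ex:place2", "rdfs:label", "Miletus",     "ex:graph2018"),
]

def partition_by_graph(quads):
    """Group triples by the named graph they belong to -- one simple way
    to keep dataset versions (or provenance partitions) apart when all
    you have is RDF."""
    graphs = defaultdict(list)
    for s, p, o, g in quads:
        graphs[g].append((s, p, o))
    return dict(graphs)

parts = partition_by_graph(quads)
# The 2018 "version" and the 2019 "version" can now be compared or diffed.
```

The same idea underlies SPARQL's `GRAPH` keyword: each version lives in its own graph, and queries can target one partition or span all of them.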
### Clusters of challenges
* Ontologies:
    * building (deciding which to use; where to extend; whether to import whole ontologies or just cherry-pick parts)
    * mapping/aligning (keeping free from inconsistencies)
    * inferencing (moving from RDFS to OWL 2)
    * tools for validation and quality control (e.g. Protégé)
    * how to support complex queries, e.g. building up from snippets of SPARQL
* Data complexity
* complex provenance information that needs to stay with object
* choosing ontologies to bring together heterogeneous legacy databases
* partitioning
* versioning, persistence, long-term preservation, also relevant to moving objects from one tool environment to another
* replicability of chain of actions performed on an object
* Tools/flows
* input/output data/serialization formats (json-ld vs rdf+xml) and their conversions; limitations in tools (loss due to transformations performed by tools)
* pipelines for building LOD into researcher workflow
* overlaps/complementarity
    * which fields do the tools absolutely need for interoperability and documentation (e.g. which bits of provenance information, serialization format, ontologies used, location of the respective documentation), etc.
* roundtripping/snakepit of data enhancement
* how and where to track provenance, when and how to refer to the original
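As a toy illustration of the serialization-conversion problem above (JSON-LD vs RDF/XML, and loss introduced by tools), the following stdlib-only Python sketch converts a deliberately minimal RDF/XML fragment into JSON-LD-style node objects. The fragment is invented; anything beyond the simplest shape (nested nodes, datatypes, language tags) is silently dropped, which is exactly the kind of transformation loss noted above. A real pipeline would use a full parser such as rdflib or the X3ML toolkit.

```python
import json
import xml.etree.ElementTree as ET

RDF = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"

# A tiny RDF/XML fragment (hypothetical example data).
rdf_xml = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                      xmlns:dcterms="http://purl.org/dc/terms/">
  <rdf:Description rdf:about="http://example.org/place/1">
    <dcterms:title>Alexandria</dcterms:title>
  </rdf:Description>
</rdf:RDF>"""

def rdfxml_to_jsonld(source: str) -> list:
    """Convert the simplest RDF/XML shape (flat rdf:Description elements
    with literal children) to JSON-LD node objects. Nested resources,
    datatypes, and language tags are lost."""
    root = ET.fromstring(source)
    out = []
    for desc in root.findall(f"{RDF}Description"):
        node = {"@id": desc.get(f"{RDF}about")}
        for child in desc:
            # child.tag is "{namespace}local"; re-join it as a full IRI.
            ns, local = child.tag[1:].split("}", 1)
            node[ns + local] = child.text
        out.append(node)
    return out

print(json.dumps(rdfxml_to_jsonld(rdf_xml), indent=2))
```

Even this tiny example shows why pipelines need to document what each conversion step preserves and what it throws away.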
## Strategies
### Overall Goal
**Strategy on "Tools and Flows" aka Linked Pipes**
**-> build up a LinkedPipes working group**
### About Linked Pipes
* produce a feature matrix specifically related to LOD and promoting pipelines
* fields: name, link to source code; date of entry; entry-level tool?; consumes LOD?; produces LOD?; input formats (a section with individual columns: xml, xml+tei, xml+rdf, csv, json-ld (iiif, ...), djvu+xml, geoJSON, plaintext, html, ttl, n3, graphml, jpg, audio, video, shapefiles, lpif, sql, sparql, shacl); output formats;
* Encourage projects to enter in software registry ([TaPoR](http://tapor.ca/home) or teresah.dariah.eu) and link to that general profile; also advise providing link to tool use case and flows that include that tool; provide link from the main page to other pages (e.g. wiki with list of tools). Focus this on working towards pipelines: Categories like consumption/production/bridging; Data formats consumed/produced;
* More general information about the tools that we suggest should be entered at those other places: license; howtos; also free comments for things to point out about tools individually; project contacts (project director); technical contacts (programmers); institution(s)
* later (?): categorize by type e.g. production, visualization; also bridging tools
* also produce links to the pipelines, wherever they are: another page on github, a programming notebook, blog post, or whatever.
### List of participants in the Linked Pipes working group
(just those whose github ids are not listed below)
|name|
|---|
|Guenther Goerz|
#### github ids etc.
| name | github-id | twitter-id | short |
|------|-----------|------------|-------|
| Ben Brumfield | @benwbrum | @benwbrum | |
| Gimena del Rio | @Gimena | @gimenadelr | |
| Andreas Wagner | @awagner-mainz | @anwagnerdreas | |
| Frank Grieshaber | @wenamun | @wenamun | |
| Florian Thiery | @florianthiery | @fthierygeo | FT |
| Rainer Simon | @rsimon | @aboutgeo | |
| Valeria Vitale | @valeriavitale | @nottinauta | VV |
| Susan Brown | @susanbrown | @susanirenebrown | |
#### Linked Pipes WG
* "Project Manager": Florian
* "Committee": Susan, Valeria, Rainer, Florian, Ben, Andreas, Frank, Gimena, Guenther
### Notes, tools and other links to (maybe) integrate later into the inventory
* In terms of pipelines: Notebooks (Jupyter, R)
* [curl - command line tool and library for transferring data with URLs](https://curl.haxx.se/)
* [xTriples - Web services for extracting rdf from xml](http://xtriples.spatialhumanities.de/index.html)
* [X3ML Toolkit - Extracting rdf from other data formats](https://www.ics.forth.gr/isl/index_main.php?l=e&c=721)
* [jq - a commandline json processor](https://stedolan.github.io/jq/) ([lesson in Programming Historian](https://programminghistorian.org/en/lessons/json-and-jq))
* [SAMOD: an agile methodology for the development of ontologies](http://essepuntato.github.io/samod/)
* [Labeling System - web app for creating and publishing terms with contextual validity as LOD](https://github.com/search?q=topic%3Alabelingsystem+org%3Amainzed+fork%3Atrue)
* [Academic Meta Tool - webapp for modelling vagueness in graphs including reasoning](http://academic-meta-tool.xyz/)
* [Alligator - web app transforming a correspondence analyses to a relative chronology as RDF](https://rgzm.github.io/alligator/)
* [WissKI Virtual Research Environment](http://wiss-ki.eu/) (see also the [Drupal project page](https://www.drupal.org/project/wisski))
* [ResearchSpace environment](https://github.com/researchspace/researchspace)
* [Pipeline RDF+XML to JSON-LD](https://hbz.github.io/swib18-workshop/#/35)
* [Protégé](https://protege.stanford.edu/)
* [OntoME](http://ontome.dataforhistory.org/)
## Commitments
* With regard to *ontologies*, we will try to resume the discussion in ADHO's LOD SIG, and see whether the [Data for History consortium](http://dataforhistory.org/) would be a good community to approach. Andreas will do both (others are very welcome to chime in).
* We nominate Karl :-) to create a [Linked Pipes Repo](https://github.com/LinkedPasts/LinkedPipes) for us in the [Linked Pasts github organization](https://github.com/LinkedPasts) and call it Linked Pipes (short: Linked||). It should have a searchable page with the list of tools, which we will commit individually to documenting; Florian will set it up. *NOTE (FT):* would recommend not an md structure; we should use JSON templates (as single documents, contributed via pull-request files) in order to use nice frameworks like [filter.js](http://jiren.github.io/filter.js/index.html).
* *NOTE (FT): A domain is registered [http://linkedpipes.xyz](http://linkedpipes.xyz/) which will contain the filter.js framework.*
* first logo proposal by FT; VV will do the digital version
![](https://i.imgur.com/GKmooGG.jpg)
*CC BY 4.0 Linked Pipes WG*
![](https://i.imgur.com/EYpgwNp.png)
*CC BY 4.0 Linked Pipes WG*
* Google group for communication inside the Linked Pipes Working Group (Ben will create it, Valeria will collect the email addresses of the people in the group)
* Florian will work with Karl to set up the pages and be project manager of Linked||;
* The following folks will be admins on the repo and collaborate to approve new contributors:
* Will document tool(s):
* Susan
* Valeria and Gimena (Recogito)
* Andreas
* Loïc
* Florian
* Frank
* Guenther (Pointers to WissKi doc)
* Will document at least one workflow:
* Ben
* Valeria and Gimena (Recogito-related workflows)
* Andreas
* Florian (Alligator to AMT)
* Will prepare some kind of summary (blog post/white paper) for reporting at the next Linked Pasts meeting:
* Susan will lead
* Florian will help ;-)
* will be *technical admin(s)* of the LinkedPipes Repo
* Florian
* will be *content admins* of the LinkedPipes Repo
* Susan
* Valeria
* Gimena
* Florian
* Andreas
* Ben
* Rainer
* Frank
* The basic structure of the template:
![](https://i.imgur.com/PwevL2s.jpg) *CC BY 4.0 Linked Pipes WG*
```json
{
  "name": "",
  "links": [],
  "dateOfEntry": "",
  "entryLevel": "{beginner:yes/no}",
  "consumesLOD": "true/false",
  "producesLOD": "true/false",
  "inputFormats": ["JPG", "TIFF", "PNG", "N3", "RDF/XML", "XML-TEI", "CSV", "JSON-LD", "GEOJSON", "IIIF-JSON", "PLAIN-TEXT", "HTML", "TTL", "SHP", "X3D", "any 3D format", "SQL", "SPARQL", "SHACL", "CYPHER", "audio/video"],
  "outputFormats": ["JPG", "TIFF", "PNG", "N3", "RDF/XML", "XML-TEI", "CSV", "JSON-LD", "GEOJSON", "IIIF-JSON", "PLAIN-TEXT", "HTML", "TTL", "SHP", "X3D", "any 3D format", "SQL", "SPARQL", "SHACL", "CYPHER", "audio/video"]
}
```
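A filled-in entry might look like the following Python sketch, together with a minimal completeness check. The Recogito values here are illustrative placeholders, not an agreed record, and `validate_entry` is a hypothetical helper, not part of any agreed tooling.

```python
import json

# Keys required by the template above.
REQUIRED_KEYS = {"name", "links", "dateOfEntry", "entryLevel",
                 "consumesLOD", "producesLOD", "inputFormats", "outputFormats"}

def validate_entry(entry: dict) -> list:
    """Return the sorted list of template keys missing from an entry."""
    return sorted(REQUIRED_KEYS - entry.keys())

# Illustrative entry (values are placeholders, not an official record).
example = json.loads("""{
  "name": "Recogito",
  "links": ["https://recogito.pelagios.org/"],
  "dateOfEntry": "2018-12-13",
  "entryLevel": "beginner:yes",
  "consumesLOD": true,
  "producesLOD": true,
  "inputFormats": ["PLAIN-TEXT", "XML-TEI", "CSV"],
  "outputFormats": ["JSON-LD", "XML-TEI", "CSV", "GEOJSON"]
}""")

print("missing keys:", validate_entry(example))
```

Keeping each entry as a single JSON document like this is what makes the pull-request workflow and filter.js-style browsing mentioned in the commitments straightforward.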