# VAIA/UGain Linked Data and Solid: day 1
An introduction to the world of Linked Data: fostering semantic interoperability
## Agenda
Your hosts for today:
- [Pieter Colpaert](https://pietercolpaert.be/#me)
- [Pieter Heyvaert](https://pieterheyvaert.com/#me)
| Time | Title | By |
| -------- | -------- | -------- |
| 17:30 | Welcome at Technologiepark and networking over a sandwich dinner | |
| 18:30 | First theory class: an introduction to Linked Data (recorded and livestreamed) | Pieter Colpaert |
| 19:30 | First exercises: creating your first bit of Linked Data | Pieter Heyvaert |
| 20:30 | Networking drink (not recorded) | |
## Exercises
See Ufora: open the ZIP file
Description:
## Competences
Check out the recording of the lecture. You must be able to:
* Explain the difference between N-Triples, Turtle, TriG and N-Quads
* Explain RDF triples, named nodes and blank nodes
* Read and write Turtle notation
* Read and write JSON-LD (knowing the functionality from the slides is sufficient)
* Understand which triples are created when given an RDFa example
* Use `curl` to make HTTP requests
Slides: _TODO:Link_
Different people have different ways of learning. The text below also contains everything you need to know after today.
## Further reading
### URI dereferencing and disambiguation
The page identified by, or located at, <https://stad.gent/nl/mobiliteit-openbare-werken/parkeren/parkings-gent/parking-sint-pietersplein> is not the same as the thing identified by <https://stad.gent/id/parking/P10>. The latter identifier points at the parking facility itself, not at a page about it. Nonetheless, if you dereference the parking facility’s identifier, you will be redirected to a page about it. In RDF documents, you can now make statements about both separately:
`parking:P10 foaf:page <https://…sint-pietersplein> .`
`<https://…sint-pietersplein> foaf:primaryTopic parking:P10 .`
Web trivia: this is a long-standing conundrum in Web engineering referred to as HTTPRange-14: https://en.wikipedia.org/wiki/HTTPRange-14
Two common solutions are used to make sure a real-world identifier can be disambiguated: an HTTP 303 redirect, as in the example above, or hash identifiers.
A 303 See Other redirect indicates that the URI itself is not a page you can GET, but that there is another document somewhere else that you can consult to learn about the thing this URI identifies.
```bash
$ curl -I https://stad.gent/id/parking/P10
HTTP/2 303
server: nginx
location: https://mobiliteit.stad.gent/p10-sint-pietersplein
```
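The same check can be scripted. The following is a minimal Python sketch, not a definitive implementation: `head` and `interpret` are hypothetical helper names, `head` only handles `https` URIs, and the classification in `interpret` is a simplified reading of the HTTPRange-14 discussion above.

```python
from http.client import HTTPSConnection
from urllib.parse import urlsplit


def head(uri):
    """Send a HEAD request without following redirects (https URIs only)."""
    parts = urlsplit(uri)
    conn = HTTPSConnection(parts.netloc, timeout=10)
    try:
        target = parts.path or "/"
        if parts.query:
            target += "?" + parts.query
        conn.request("HEAD", target)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def interpret(status, location=None):
    """Classify a dereferencing result (simplified, for illustration only)."""
    if status == 303 and location:
        return f"may identify a real-world thing; a document about it is at {location}"
    if 200 <= status < 300:
        return "identifies the document that was returned"
    if status in (301, 302, 307, 308) and location:
        return f"the same resource has moved to {location}"
    return "nothing can be concluded from this response"


# Interpreting the response shown in the curl output above:
print(interpret(303, "https://mobiliteit.stad.gent/p10-sint-pietersplein"))
```

Calling `interpret(*head("https://stad.gent/id/parking/P10"))` performs the actual request; it needs network access, and the result depends on how the server behaves at that moment.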
Based on the Web Scraping chapter, you should now be able to explain why a 303 is used and not, for example, a 301.
A potential disadvantage for developer experience is that, in a browser, a web developer will not always notice that a redirect happened, and may wrongly assume that the URL currently shown in the browser is the URI of the real-world object. This is a common mistake developers make when using Wikidata, where concept URIs and page URLs differ only slightly: https://www.wikidata.org/wiki/Q800814 vs. http://www.wikidata.org/entity/Q800814 for example.
Hash identifiers identify something that is described in a page. As explained in the URL section of the Web Scraping chapter, the server does not see the `#` and what comes after it: the server returns the page, and the client knows which identifier in that page to look for. No redirection is needed at all. Good examples of this:
* https://pietercolpaert.be/#me – identifying someone on a personal website
* http://www.w3.org/1999/02/22-rdf-syntax-ns#type – publishing a vocabulary in a simple RDF file with multiple terms
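To make the client-side handling of hash identifiers concrete, here is a small Python sketch using the standard library's `urllib.parse.urldefrag`. The helper name `split_hash_uri` is made up for this illustration.

```python
from urllib.parse import urldefrag


def split_hash_uri(uri):
    """Split a hash URI into the document URL to request and the local fragment."""
    document_url, fragment = urldefrag(uri)
    return document_url, fragment


for uri in (
    "https://pietercolpaert.be/#me",
    "http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
):
    document_url, fragment = split_hash_uri(uri)
    # Only document_url is sent to the server; the fragment stays on the client,
    # which then looks up the full URI in the returned document.
    print(f"GET {document_url}  (then look for #{fragment} in the response)")
```

Both terms of a vocabulary such as `rdf:type` and `rdf:Property` therefore resolve to the same document, which is why a single RDF file can describe many terms.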
Finally, it is not uncommon that no disambiguation is done at all and the HTTPRange-14 issue is simply ignored. One identifier is then used for both the document and the real-world object, and you have to infer from the context of a triple which of the two is meant. If a triple says, for example, that `parking:P10` takes 5 minutes to read, you can infer that it is about the page, since only a page can be read, not the parking facility itself.
### Varying content types
URI dereferencing can be executed by various agents: by a browser like Firefox, or by an HTTP library in a programming language. Content negotiation is often offered on URIs so that different types of user agents each get the content type that is best suited to their use case.
#### Option 1: Content negotiation
Firefox will by default send an Accept header that will look something like:
`Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8`
However, a Linked Data user agent may send an Accept header that looks more like this:
`Accept: application/ld+json;q=1,application/n-quads;q=1,application/n-triples;q=1,application/rdf+xml;q=1,application/trig;q=1,text/turtle;q=1,text/n3;q=1,text/html;q=0.95`
A server also keeps a similar priority list of the content types it supports when handling a GET request for a URL. It multiplies the q-values of matching content types with each other and serves the content type with the highest resulting value, stating the selected media type in the Content-Type response header. A user agent cannot rely on the Accept header being consistently honored: the origin server might not implement content negotiation for the requested resource, or might decide that sending a response that does not conform to the user agent’s preferences is better than sending a 406 Not Acceptable response.
Do not forget to also set the Vary header, typically `Vary: Accept`, so that caches keep the different representations apart (see the section on caching).
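The q-value multiplication described above can be sketched in a few lines of Python. This is an illustration under assumptions, not how any particular server implements it: the function names and the `SERVER_PREFERENCES` values are invented for this example, and real Accept parsing has more edge cases (media type parameters, malformed headers) than this handles.

```python
def parse_accept(header):
    """Parse an Accept header into {media_range: q}; q defaults to 1.0."""
    ranges = {}
    for item in header.split(","):
        parts = [part.strip() for part in item.split(";")]
        if not parts[0]:
            continue
        q = 1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                q = float(value)
        ranges[parts[0].lower()] = q
    return ranges


def client_q(media_type, accepted):
    """Find the client's q-value for a concrete type; the most specific range wins."""
    main_type = media_type.split("/")[0]
    for candidate in (media_type, main_type + "/*", "*/*"):
        if candidate in accepted:
            return accepted[candidate]
    return 0.0


def negotiate(accept_header, server_preferences):
    """Return the media type with the highest client q * server q, or None."""
    accepted = parse_accept(accept_header)
    scores = {
        media_type: client_q(media_type, accepted) * server_q
        for media_type, server_q in server_preferences.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


# Hypothetical server-side priorities for one resource.
SERVER_PREFERENCES = {"text/turtle": 1.0, "text/html": 0.9, "application/ld+json": 0.8}

FIREFOX = "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8"
LINKED_DATA_AGENT = "application/ld+json;q=1,text/turtle;q=1,text/html;q=0.95"

print(negotiate(FIREFOX, SERVER_PREFERENCES))            # text/html
print(negotiate(LINKED_DATA_AGENT, SERVER_PREFERENCES))  # text/turtle
```

When `negotiate` returns `None`, the server has to choose between a 406 Not Acceptable response and sending some default representation anyway.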
#### Option 2: embedding RDF in an HTML page
Another way to make URI dereferencing work for both humans and Linked Data clients is to embed the RDF in the HTML page itself: either include a Linked Data snippet in a `<script>` tag (for example JSON-LD in `<script type="application/ld+json">`), or add RDFa annotations to the HTML.
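As an illustration of the `<script>` approach, the Python sketch below pulls JSON-LD snippets out of an HTML page using only the standard library. The `PAGE` content and the `JsonLdExtractor` class are made up for this example, and the code stops at parsed JSON: turning it into RDF triples would still require a JSON-LD processor.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the parsed contents of <script type="application/ld+json"> tags."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.documents = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.documents.append(json.loads("".join(self._buffer)))
            self._in_jsonld = False


# A made-up page that is readable by humans and carries a JSON-LD snippet.
PAGE = """<!DOCTYPE html>
<html>
  <head>
    <title>Parking Sint-Pietersplein</title>
    <script type="application/ld+json">
    {
      "@context": {"foaf": "http://xmlns.com/foaf/0.1/"},
      "@id": "https://stad.gent/id/parking/P10",
      "foaf:page": {"@id": "https://stad.gent/nl/mobiliteit-openbare-werken/parkeren/parkings-gent/parking-sint-pietersplein"}
    }
    </script>
  </head>
  <body><h1>Parking Sint-Pietersplein</h1></body>
</html>"""

extractor = JsonLdExtractor()
extractor.feed(PAGE)
for document in extractor.documents:
    print(document["@id"])
```

A browser ignores the script content when rendering, so the same response serves both audiences without content negotiation.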
You can test getting the RDF triples from various URIs yourself using https://rdf-play.rubensworks.net/. It also comes with a server proxy you can configure in case CORS is not properly configured on the server.
## Other learning resources that may help you
Telling a similar story in a slightly different way:
* An introduction to Linked Data in a video: https://vimeo.com/401026338
* The course by prof. Harald Sack: https://www.youtube.com/playlist?list=PLoOmvuyo5UAcBXlhTti7kzetSsi1PpJGR
* The “Semantic Web” chapter from Web Fundamentals by prof. Ruben Verborgh: https://rubenverborgh.github.io/WebFundamentals/semantic-web/
* The FAIR principles: https://www.go-fair.org/fair-principles/