---
tags: sensors, eit
---
# Someone collects the data and I use it
![](https://i.imgur.com/85J1V6O.png)
## Different types of data?
**Open data**
Data that anyone can access, use, and share, with full permission to use it in any way they like.
**Shared data**
Data that can be shared with a specific group of people for a specific purpose.
**Closed data**
Data that can only be accessed by those who collected it or are accountable for it.
:::info
Another way of putting it
> According to the Open Data Institute, “**Open data is data that anyone can access, use or share. Simple as that. When big companies or governments release non-personal data, it enables small businesses, citizens and medical researchers to develop resources which make crucial improvements to their communities.**”
:::
### Different formats
**Human readable format**
![](https://i.imgur.com/mLCXN2Y.png)
**Machine-readable format**
![](https://i.imgur.com/SFW1Ue8.png)
:::info
https://www.data.govt.nz/toolkit/open-data/formats-for-open-data-machine-readable-and-human-readable/
:::
## Accessing internet data
As we have seen, a lot of information is available on the internet. However, that information can be scattered and disorganised.
There are two common techniques for collecting data from websites:
- Web scraping: writing a script that collects data directly from web pages. This can breach a website's terms and conditions, so it is not always legal.
- API: an interface that the website itself provides for accessing its data (the legal way).
:::warning
Websites are sometimes not happy about being scraped. For instance, [IMDb](https://www.imdb.com/conditions) says in its terms and conditions:
- **Robots and Screen Scraping**: You may not use data mining, robots, screen scraping, or similar data gathering and extraction tools on this site, except with our express written consent as noted below.
:::
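
Besides their terms and conditions, many websites publish a `robots.txt` file stating which paths automated tools may visit. The sketch below shows how a script could check such rules before collecting anything, using Python's standard `urllib.robotparser`. The rules, the bot name `my-course-bot` and the `example.org` URLs are made up for illustration. Passing this check does not replace reading the site's terms and conditions.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, invented for this example.
EXAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Allow: /
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the robots.txt text allows user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# A path under /private/ is disallowed, any other path is allowed.
blocked = may_fetch(EXAMPLE_ROBOTS_TXT, "my-course-bot",
                    "https://example.org/private/data.html")
allowed = may_fetch(EXAMPLE_ROBOTS_TXT, "my-course-bot",
                    "https://example.org/public/page.html")
print(blocked, allowed)
```

For a real website, `RobotFileParser` can also download the file itself with `set_url(...)` followed by `read()`.
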
For this reason, many websites provide another means of interacting with their content, in the form of an API:
- A **Web API** is an [application programming interface](https://en.wikipedia.org/wiki/Application_programming_interface) for either a web server or a web browser. It is a web development concept, usually limited to a web application's client-side (including any web frameworks being used), and thus usually does not include web server or browser implementation details such as SAPIs or APIs unless publicly accessible by a remote web application.
We can connect to an API directly through its endpoints:
- **Endpoints** are important aspects of interacting with server-side web APIs, as they specify where the resources that third-party software can access are located. Access is usually via a URI to which HTTP requests are sent, and from which the response is expected.
An example of an open API is the [SmartCitizen API](https://api.smartcitizen.me/v0/devices/5452/):
![](https://i.imgur.com/aZaFAiZ.png)
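
As a minimal sketch of calling an endpoint from a script, the code below builds the device URL linked above and defines a helper that sends an HTTP GET request and decodes the JSON response. The function names `device_endpoint` and `fetch_json` are hypothetical, and the URL pattern is assumed from the single link above, not taken from the API documentation.

```python
import json
from urllib.request import urlopen

# Base address assumed from the SmartCitizen link above.
BASE_URL = "https://api.smartcitizen.me/v0"


def device_endpoint(device_id: int) -> str:
    """Build the endpoint URL for one device."""
    return f"{BASE_URL}/devices/{device_id}/"


def fetch_json(url: str, timeout: float = 10.0):
    """Send an HTTP GET request to url and decode the JSON response."""
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Only the URL is built here; no request is sent until fetch_json is called.
url = device_endpoint(5452)
print(url)
```

Calling `fetch_json(url)` needs an internet connection. The structure of what it returns depends on the API, so it is worth inspecting the result before relying on any particular field.
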
:::info
**Machine readable format**
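The data is generally available in [JSON format](https://json.org/).
JSON represents an object as a set of `"name": value` pairs enclosed in `{}`. A value can itself be another object, or a list enclosed in `[]`: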
```
{
  "glossary": {
    "title": "example glossary",
    "GlossDiv": {
      "title": "S",
      "GlossList": {
        "GlossEntry": {
          "ID": "SGML",
          "SortAs": "SGML",
          "GlossTerm": "Standard Generalized Markup Language",
          "Acronym": "SGML",
          "Abbrev": "ISO 8879:1986",
          "GlossDef": {
            "para": "A meta-markup language, used to create markup languages such as DocBook.",
            "GlossSeeAlso": ["GML", "XML"]
          },
          "GlossSee": "markup"
        }
      }
    }
  }
}
```
:::
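
To illustrate why this format is called machine readable, here is a small sketch that parses a shortened copy of the glossary example with Python's standard `json` module. Objects become dictionaries and lists become Python lists, so nested values can be reached by chaining keys. The variable names are only for illustration.

```python
import json

# A shortened copy of the glossary example, kept as plain text.
text = """
{
  "glossary": {
    "title": "example glossary",
    "GlossDiv": {
      "title": "S",
      "GlossList": {
        "GlossEntry": {
          "ID": "SGML",
          "GlossTerm": "Standard Generalized Markup Language",
          "GlossDef": {"GlossSeeAlso": ["GML", "XML"]}
        }
      }
    }
  }
}
"""

# json.loads turns the text into nested dictionaries and lists.
data = json.loads(text)

# Walk down the nesting one key at a time.
entry = data["glossary"]["GlossDiv"]["GlossList"]["GlossEntry"]
print(entry["GlossTerm"])
print(entry["GlossDef"]["GlossSeeAlso"])
```
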
:::info
A request to an API generally has the following format:
- Base URL: `http://www.omdbapi.com/`
- Query: `?` + `parameter` + `=` + `value`. The available parameters can be found in the [API documentation](http://www.omdbapi.com/). Several `parameter=value` pairs can be joined with `&`.
An example: `http://www.omdbapi.com/?s=jose&plot=full&apikey=2a31115`
![](https://i.imgur.com/3Fk2sk1.png)
:::
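
The same kind of request URL can be assembled in code instead of by hand. This sketch uses Python's standard `urllib.parse.urlencode`, which joins the parameters with `&` and escapes characters such as spaces. The helper name `build_query_url` and the placeholder `YOUR_API_KEY` are assumptions for illustration; a real key comes from the API provider.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

BASE_URL = "http://www.omdbapi.com/"


def build_query_url(base_url: str, params: dict) -> str:
    """Append the parameters to base_url as a query string."""
    return f"{base_url}?{urlencode(params)}"


# Placeholder key: replace it with your own before sending a request.
url = build_query_url(
    BASE_URL, {"s": "jose", "plot": "full", "apikey": "YOUR_API_KEY"}
)
print(url)

# Going the other way: split a URL back into its parameters.
params = parse_qs(urlsplit(url).query)
print(params)
```

Building the query from a dictionary keeps each parameter separate from its value and avoids mistakes with `?`, `=` and `&`.
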
:::warning
MusicBrainz: https://musicbrainz.org/doc/MusicBrainz_API
:::
### Examples
:::info
Some use cases here
https://opendatahandbook.org/value-stories/en/
:::
- [Open Movie DB](http://www.omdbapi.com)
- [Open Data Barcelona](https://opendata-ajuntament.barcelona.cat/)
- Environmental Data APIs
- [Smart Citizen API](https://api.smartcitizen.me)
- [MINKA](https://minka-sdg.org/)
- [Ictio](https://ictio.cat/)
- [Natusfera](https://spain.inaturalist.org/users/sign_in)
- [OdourCollect](https://odourcollect.eu/)
- AireCiudadano
- [Text analysis](https://orange3-text.readthedocs.io/en/latest/index.html)
- [Twitter](https://developer.twitter.com/en/docs/twitter-api)
- [Wikipedia](https://www.mediawiki.org/wiki/API:Tutorial)
- [The Guardian](https://open-platform.theguardian.com/explore/)
- [NYT](https://developer.nytimes.com/)
- Health
- [Covid](https://github.com/CSSEGISandData/COVID-19)
- [PubMed](https://pubmed.ncbi.nlm.nih.gov/)
- [Socioeconomic Data](https://github.com/biolab/orange3-world-happiness)
- https://worldhappiness.report/
- https://data.worldbank.org/
- https://stats.oecd.org/
### Making use of it
![](https://i.imgur.com/EtsNirn.png)
https://orangedatamining.com/
:::info
**Setup**
https://hackmd.io/LIpX3s4aT4WsloLqqC_7ZQ
**Basic example**
https://hackmd.io/4_4zeo3QQ6C9VEbhqSYddQ
:::