---
title: hslu 2022.09-A
tags: presentations
description: View with "Slide Mode".
slideOptions:
  theme: white
---

<!-- .slide: data-background="#000000" -->

<img src="https://opendata.utou.ch/presentations/digiges%202019.2/_unused/badapple3.png" width="100">

<center><pre>
.-. .-. .-. .-.   .-. . . .-. .   .   .-.
|  )|-|  |  |-|   `-. |<   |  |   |   `-.
`-' ` '  '  ` '   `-' ' ` `-' `-' `-' `-'

Part I with Oleg Lavrovsky

datetime(2022, 09, 05)
Hochschule Luzern - Design & Kunst
</pre></center>

----

# Data Skills

Our objectives for this module are to:

1. improve data research skills & choose good sources
1. learn to work with several different kinds of data
1. learn to interpret and to classify data in context
1. use, analyse and contribute structured information
1. apply the skills meaningfully in your own work

Continues in: [Part II](https://hackmd.io/@oleg/hslu-2022-09-V)

---

# Part I

Introductions, key concepts. Data heroes and 'villains'. Roadmaps for data skills.

----

:wave: _Hello!_ My name is Oleg

- [School of Data](https://schoolofdata.ch) / ...
/ [BFH](https://www.bfh.ch/ti/de/weiterbildung/cas/datenanalyse/) / [HSLU](https://www.hslu.ch/en/lucerne-school-of-art-and-design/degree-programmes/bachelor/data-design-and-art-1/)
- Director / Activist @ Opendata.ch
- Founder / Advisor @ [cividi GmbH](https://cividi.ch)
- Data Geek / Coach @ [dataletsch](https://dat.alets.ch)
- **@loleg** [twt](https://twitter.com/loleg)+[gh](https://github.com/loleg)+[lkdin](https://www.linkedin.com/in/loleg/)+[insta](https://www.instagram.com/loleg/)+[opendata](https://hack.opendata.ch/user/loleg)

----

"I am Canadian" :maple_leaf:

![](https://i.imgur.com/6ezOpI2.jpg)

----

"I am Data Engineer" :bearded_person:

![](https://i.imgur.com/cnZC49A.png)

----

<!-- .slide: data-background="#000000" -->

"I am [hacktivist](https://en.wikipedia.org/wiki/Hacktivism)" :nerd_face:

[![](https://blog.datalets.ch/content/images/size/w1000/2018/09/DSC00818.JPG)](https://blog.datalets.ch/045/)

_Photo by Ernie Deane_

---

## What is Data Literacy?

_The ability to read, understand, create, and communicate with **data**._ -- [Wikipedia](https://en.wikipedia.org/wiki/Data_literacy)

:black_square_button: :black_square_button: :black_square_button: :black_square_button:

----

Data literacy (datalit) encompasses _“the ability to collect, manage, evaluate, and apply data, in a critical manner”_ ([Ridsdale et al 2015](https://www.researchgate.net/figure/Data-Literacy-Competencies-Matrix-by-Ridsdale-et-al-2015-3-reduced-version_fig2_359343406)).

![](https://i.imgur.com/7Fpu3eP.jpg)

> Photo by [LuidmilaKot](https://pixabay.com/de/photos/bengel-notizbuch-computer-lernt-1520705/)

Much of this is already taught in grade school, as part of mathematics and science curriculum. How do you see your own datalit skills?

----

![](https://i.imgur.com/n4TN3zs.png)

People generally see data literacy as a broad field with complementary skills in economics, linguistics, and more.
There are several models, such as the competencies matrix by [Ralph Krüger](https://www.researchgate.net/figure/Data-Literacy-Competencies-Matrix-by-Ridsdale-et-al-2015-3-reduced-version_fig2_359343406), which suggest how data literacy should be taught and applied. A useful mental model is the [DIKW pyramid](https://en.wikipedia.org/wiki/DIKW_Pyramid), which I extended here to differentiate between "structured data" (secondary data, in science) and "raw measurements" (primary data).

----

[![Data Cake](http://make.opendata.ch/wiki/lib/exe/fetch.php?w=400&tok=60aea4&media=http%3A%2F%2Fokcon.org%2Ffiles%2F2013%2F08%2Fdata-cake-graphic.jpg)](http://make.opendata.ch/wiki/information:quickstart)

> Image by [Epic Graphic](https://contenthubble.com/epic-graphic/)

A good metaphor is cooking. If data is the raw ingredient and we manage to produce tasty information, then knowledge is what you gain when you consume it... Wisdom is gained through staying fit - with a healthy information diet!

---

## What is open data? :flag-ch: :globe_with_meridians:

----

[<img src="https://blog.okfn.org/files/2020/02/landscape-colour.png" width="400" style="border:none;background:none" border="0">](https://youtu.be/wS-dTTNaKE8)

_"We are increasingly reliant on data to **make decisions**."_

Open Knowledge is a global non-profit promoting a healthy, well-balanced relationship to data. They are one of many organizations you can turn to for advice on data literacy.

----

[![](https://opendata.utou.ch/presentations/open%20data%202012.2/images/00%20makeopendata1.jpg)](https://opendata.ch)

**Opendata.ch** is the Swiss chapter of Open Knowledge, running events and supporting a community of practitioners and institutional members since 2012. The **Data Café** and **Hackdays** on a variety of topics help to boost data literacy around the country. They also help to promote jobs and opportunities for startups and established companies.

----

[![](https://opendata.utou.ch/presentations/bfh%202019.10/img/opendata-swiss-terms.png)](https://opendata.swiss/en/terms-of-use/)

**[Opendata.swiss](https://opendata.swiss/)** is the central project of the Swiss federal government to support a high standard of open data publication and use. Run by the Federal Statistical Office, the team works closely with other departments, cantons and municipalities, stakeholders in the industry and community, and promotes outstanding examples of data reuse.

----

![](https://blog.datalets.ch/workshops/2018/gv/advent.png)

There are many other interesting organizations deserving of mention, which support data literacy in particular areas such as [OpenStreetMap](https://openstreetmap.ch) (mapping), [Wikimedia](http://wikimedia.ch/) (open content, linked data), [AlgorithmWatch](https://algorithmwatch.ch/) (privacy, ethics), [ONIA](https://opennetworkinfrastructure.org/) (hardware, networks), ...

---

## Who are the "Heroes" of datalit? :chart_with_downwards_trend: 🦸:grey_exclamation:

----

<img src="https://external-content.duckduckgo.com/iu/?u=http%3A%2F%2Fwww.capital.gr%2FContent%2FImagesDatabase%2Fbe%2Fbe2b615d20654bdfa062909d5145f260.jpg&f=1&nofb=1" width="200"><br>

> _"The simple message to governments around the world must be consistent and forceful: **[raw data](https://en.wikipedia.org/wiki/Raw_data), now!** Opening up data is fundamentally about more efficient use of resources and improving service delivery for citizens.
The effects of that are far reaching: innovation, transparency, accountability, better governance and economic growth."_ -- [Tim Berners-Lee](https://www.wired.co.uk/article/raw-data) - photo from [Capital.gr](https://www.capital.gr/forbes/3267755/tim-berners-lee-giati-den-zitisa-to-copyright-gia-ton-pagkomio-isto)

----

![](https://live.staticflickr.com/8068/8152201063_491cb895de_c_d.jpg)

> Photo: [Alex Engel](https://www.flickr.com/photos/nycstreets/8152201063/), CC BY-NC-ND 2.0

[NYT - California Megastorm](https://www.nytimes.com/interactive/2022/08/12/climate/california-rain-storm.html) is a recent example of a stunning work of data journalism on a complex and current subject, presented with meticulously researched arguments and impressive data visualizations. Data journalists, in general, are on the frontlines of making the public more data-aware and -literate.

----

[![](https://s3.tosdr.org/branding/tosdr-logo-128.svg) tosdr.org](https://tosdr.org)

Sites like ToS;DR give us a simple :traffic_light: system to understand the conditions under which data on the Internet is collected and shared. You can even install plugins (e.g. [DuckDuckGo](https://duckduckgo.com/app)) which make it easier to keep track of your basic data footprint: the trail of cookies and IP addresses you leave on the Web, which is picked up by the data analytics of marketers and many others.

----

[![](https://eduwells.files.wordpress.com/2014/10/creative-commons-eduwells.png) creativecommons.org](https://eduwells.com/2014/10/06/safer-schools-with-creative-commons/)

On the other hand, innovative forms of exchange, in part based on "open" standards such as the Creative Commons licenses, allow people to share their data and content freely, without relinquishing their rights, and participate in the new platforms and economies of remix culture.

----

[<img width="100%" src="https://im2punt0.files.wordpress.com/2017/03/open-definition.png"> opendefinition.org](http://opendefinition.org/)

The Open Definition is a document maintained in 45 languages by Open Knowledge, which specifies what we mean by "open content" or "open data". It encompasses ideas which are reflected in a variety of [Open Licenses](https://opendatacommons.org/) and tools.

----

![OKFN Index](https://opendata.utou.ch/presentations/open%20data%202013.2/images/okfncensus.png)

The [Open Data Barometer](https://opendatabarometer.org/) and [Open Knowledge Census](http://global.census.okfn.org/place/ch) track how well countries around the world are implementing policies around data sharing or data protection.

----

[![](https://i.imgur.com/hJHKVot.jpg)](https://data.europa.eu/data/datasets/18845897-bundesamt-fur-statistik-bfs/quality?locale=en)

Portals like [data.europa.eu](https://data.europa.eu) enable people around the world to discover and use data from authoritative sources, such as their governments or universities, more efficiently. They help to raise the quality of public data sources and promote exchange on all sides.

---

## Who are the "Villains", then? :8ball: 🦹

----

[![](https://external-content.duckduckgo.com/iu/?u=https%3A%2F%2Fwww.econlib.org%2Fwp-content%2Fuploads%2F2020%2F03%2Fexpert.jpg&f=1&nofb=1)](https://www.econlib.org/pandemics-and-the-problem-of-expert-failure/)

> Image c/o [Econlib](https://www.econlib.org/pandemics-and-the-problem-of-expert-failure/)

The "expert society" is one way to describe a frame of mind that opposes data literacy by leaving decisions in the hands of specialists, limiting access to public inquiry, and assuming that there is only one correct way to frame a problem and analyse the facts. The flip side of this coin is that we are able to avoid spending time worrying about problems by paying someone else to. No surprise, then, if they act in self-interest.

----

[![Infodemic](https://external-content.duckduckgo.com/iu/?u=https%3A%2F%2Funf.imgix.net%2F2020%2F08%2FWHO-Infodemic-main-graphic-digital-no-text.jpg%3Fauto%3Dcompress%252Cformat%26ixlib%3Dphp-3.3.0&f=1&nofb=1)](https://www.who.int/health-topics/infodemic#tab=tab_1)

> Image c/o [WHO](https://www.who.int/health-topics/infodemic#tab=tab_1)

Information overload is a dominant malaise of our times. We are bombarded with attention-seeking arguments and stats on all sides. Determining the most relevant and truthful parts preoccupies us. _“The abundance of books is distraction!”_ --[Seneca the Younger](https://en.wikipedia.org/wiki/Seneca_the_Younger)

----

[![](https://external-content.duckduckgo.com/iu/?u=https%3A%2F%2Fwww.kindersley.ca%2Fwp-content%2Fuploads%2F2021%2F04%2F174352459_2944701585800155_6217775183610042313_n-768x403.png&f=1&nofb=1)](https://www.kindersley.ca/news-and-notices/stop-the-spread-of-misinformation/)

> Image c/o [Kindersley, Canada](https://www.kindersley.ca/news-and-notices/stop-the-spread-of-misinformation/)

Unreliable sources of data, the virality of "fake news" - arguments masquerading as knowledge - and the speed with which shocking claims circle the globe on digital networks... We urgently need to combat misinformation through diligence, discipline and better data.

----

[![CNBC Dark web](https://external-content.duckduckgo.com/iu/?u=https%3A%2F%2Fsc.cnbcfm.com%2Fapplications%2Fcnbc.com%2Fresources%2Ffiles%2F2018%2F04%2F13%2FScreen%2520Shot%25202018-04-13%2520at%25204.36.34%2520PM.png&f=1&nofb=1)](https://www.youtube.com/watch?v=ngT2Aq1VBFc)

Shady [data brokers](https://www.vice.com/en/article/ne9b3z/how-to-get-off-data-broker-and-people-search-sites-pipl-spokeo) peddling their wares on the "dark web" have created a thriving market for leaked, forged and stolen data. The Tor network (shown above) can protect the rights of honest citizens just as well as the activities of criminals.
From identity theft to blackmail, we have to be constantly vigilant and verify before we trust.

----

[![](https://i.imgur.com/KrRuW4H.jpg)](https://www.digitale-gesellschaft.ch/ratgeber/)

Check out the **Data Privacy** resources from [Electronic Frontier Foundation](https://www.eff.org/issues/privacy) and [Digitale Gesellschaft](https://www.digitale-gesellschaft.ch/ratgeber/) (image above) to learn how to protect yourself and those around you, and to assert your digital rights.

----

_"One may smile, and smile, and be a villain; at least I'm sure it may be so in Denmark."_ ― William Shakespeare, Hamlet

---

## What skills should we aspire to develop now? :female-construction-worker: :male-guard:

----

[![](https://theodi.org/wp-content/uploads/2020/05/Data-Skills-Framework-1.png)](https://theodi.org/article/data-skills-framework/)

The [Open Data Institute](https://theodi.org) in the UK offers "data skills courses to balance technical with non-technical skills to ensure that people can make an impact with data, and help ensure the best social and economic outcomes for everyone." We too can use their [Data Skills Framework](https://theodi.org/article/data-skills-framework/) to plan our studies.

----

[![](https://i.imgur.com/vbPF06P.jpg)](http://toolbox.schoolofdata.ch/overview.html)

The [School of Data Methodology](https://schoolofdata.org/methodology/) is a sequenced approach to working with data from beginning to end. Once you better understand the data cycle and stakeholders, breaking down the process into steps helps to build confidence in your work.

----

![](https://i.imgur.com/T0RZKgN.png)

As an excellent introduction and set of warm up exercises, we use the [DataBasic](https://databasic.io) online tools and methodology to explore simple datasets as a group.

---

# Part II

Where to get more experience. Building a toolbox for data wrangling. Finding questions to ask. How to interpret, verify and reuse a data source properly.

---

## Reading and interpreting data :chart: :heavy_check_mark:

----

[![](https://i.imgur.com/14DlZDP.png) Ihr Kinderlein, warum kommet ihr nicht?](https://www.republik.ch/2022/08/29/als-sag-ihr-kinderlein-warum-kommet-ihr-nicht) (Republik) is an excellent overview of some of the ways that people are misled by visualizations, from some of Switzerland's leading online data journalists.

----

[![](https://external-content.duckduckgo.com/iu/?u=https%3A%2F%2Fiibawards-prod.s3.amazonaws.com%2Fprojects%2Fimages%2F000%2F001%2F476%2Fpage.jpg%3F1474023428&f=1&nofb=1) Information is Beautiful](https://www.informationisbeautifulawards.com/showcase/1476-where-does-data-visualization-come-from) showcases the most talented and most accurate work in data visualization around the world. This chart by Fabrice Sabatier is an overview of how the discipline has developed over the last centuries.

----

[![](https://i.imgur.com/YPQ9sd4.png) seeing-theory](https://seeing-theory.brown.edu/) (Brown) is a highly recommended set of online tutorials that introduce you to the basics of statistics in a fun, interactive way. Must see!

---

## A toolbox for data analysis :female-scientist: 🛠️

----

![](https://i.imgur.com/esOklQL.jpg)

> Image generated with OpenAI

_Charts! Maps! Spreadsheets!_ We are all aware of how important these are, and have probably already learned to use programs in school to make tables and diagrams. We are building on these basic tools. Try the [Paper Spreadsheet](https://databasic.io/en/culture/paper-spreadsheet) exercise from DataBasic.

----

[![](https://i.imgur.com/p0LXCRD.jpg)](http://toolbox.schoolofdata.ch)

We will work with this website, which collects recommendations along the Data Pipeline methodology. Start your own Data Toolbox today!

----

![](https://i.imgur.com/cI9OWPI.png)

(Explore feature in Google Sheets). Your office program packs in a lot of functionality - in this case, a tool that automatically analyses and visualizes your data with just a couple of clicks.

----

![OpenRefine](https://external-content.duckduckgo.com/iu/?u=https%3A%2F%2Ftse4.mm.bing.net%2Fth%3Fid%3DOIP.k_0qGNChjbtpY9k-58fXEQHaEI%26pid%3DApi&f=1)

OpenRefine is a great tool for exploring and cleaning your data, with lots of [online tutorials](https://multimedia.journalism.berkeley.edu/tutorials/openrefine/) (Berkeley). It is an important tool in many data journalists' toolboxes, and helps you to be careful and transparent in how you work with data.

----

## Quo vadis, Machine Learning? :robot_face: :books:

----

[![R Studio screenshot](https://opendata.utou.ch/presentations/bfh%202019.10/img/rstudio-2020-09-28.jpg)](https://rstudio.cloud/)

Data analysis in code is often done in R, here showing the RStudio interface with an open Data Package, graphing tools, and a console for SPARQL queries. If you take an introductory class in Data Science, this is probably what you will learn.

----

![](https://i.imgur.com/4znR7Ls.jpg)

[Jupyter](https://jupyter.org/) notebooks (pictured above) as well as [Observable](https://observablehq.com/) are used with various languages (R, Python, Julia, JavaScript) by professional data scientists to develop models, make small visualizations, and share their results and methods online.

----

[![](https://external-content.duckduckgo.com/iu/?u=https%3A%2F%2Ftse2.mm.bing.net%2Fth%3Fid%3DOIP.hjKOFCoqUhMufJtxua2zLQHaFM%26pid%3DApi&f=1)](https://www.ornl.gov/vis)

Typically data is used in dashboards, such as [ORNL Vista](https://www.ornl.gov/vis) shown above, used in government applications. Compact, dense information displays are useful for trained professionals to make real-time course corrections. For most people, they are overwhelming (even if kind of nice to look at).

----

[![](https://blog.datalets.ch/workshops/2018/foodhackdays/eschernode/snaq_on_eschernode_1.png)](https://towardsdatascience.com/why-weight-the-importance-of-training-on-balanced-datasets-f1e54688e7df)

> Image from [optimizing weights](https://towardsdatascience.com/why-weight-the-importance-of-training-on-balanced-datasets-f1e54688e7df) (TowardsDataScience)

To understand data science, you need to understand that statistics is a process, and there is no "right answer" - just the probability of an answer being correct. Much of data science involves creating models, and improving them with better data or algorithms.

----

[![](https://mitsloan.mit.edu/sites/default/files/styles/2_1_large_1050x525/public/2021-04/machine-learning-infographic_2.jpg?h=e9dd200c&itok=7tbLGyrT)](https://mitsloan.mit.edu/ideas-made-to-matter/machine-learning-explained)

> Image from [machine learning explained](https://mitsloan.mit.edu/ideas-made-to-matter/machine-learning-explained) (MIT)

It is important to understand that while Machine Learning is a very powerful tool, it is not appropriate everywhere and all the time. Right now, we are applying it in many places where we could save time and money with much simpler or more appropriate solutions.

----

## Where to get more experience? :door: :dolphin:

----

[![Web Analytics](https://opendata.utou.ch/presentations/open%20data%202013.2/images/visits-by-location.png)](https://alternativeto.net/software/google-analytics/)

You may find it easy to dig into data that's right on your digital doorstep. Start by collecting your own web and social media metrics. Look up data from your visitor counters, web analytics, and profiles. Consider how it influences your experience today.

----

[A/B Testing](https://en.wikipedia.org/wiki/A/B_testing) is a technique used by many web marketers, where customers are randomly shown one or another design, and the resulting data is analyzed to make decisions.
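
The analysis step of an A/B test can be sketched in a few lines. This is only an illustration, not a description of any particular marketing tool. It assumes a two-proportion z-test as the decision rule, and the visitor and conversion counts are made up.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare the conversion rates of design A and design B.

    Returns the z statistic and the two-sided p-value.
    """
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate under the assumption that both designs perform the same
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Made-up example: 1000 visitors saw each design
z, p = two_proportion_z_test(conv_a=120, n_a=1000, conv_b=150, n_b=1000)
print(f"z = {z:.2f}, p = {p:.3f}")
```

A small p-value suggests that the difference between the two designs is unlikely to be pure chance. It does not say how large or how useful the difference is.
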

![](https://upload.wikimedia.org/wikipedia/commons/2/2e/A-B_testing_example.png)

> Image by [Maxime Lorant](https://commons.wikimedia.org/wiki/File:A-B_testing_simple_example.png), CC BY-SA 4.0

----

[![](https://i.imgur.com/jzujVdE.png)](https://opendata.swiss)

Learn to connect to public data sources such as Open Government Data or online APIs to benefit from a web of historical and real-time information. Take part in courses and events which use open data.

----

[![](https://opendata.utou.ch/presentations/unibern%202021.3/covid19mon.jpg)](https://db.schoolofdata.ch/event/7)

Work with community data projects to learn how online data collection and cleaning works. Contributing to these can be a great way to deepen your skills.

---

## The most important skill is finding questions to ask _with_ data, and _of_ data. :moneybag: :man_in_business_suit_levitating:

----

![](https://github.com/dribdat/design/raw/main/Whitepaper/images/datahackdaysbe.jpg)

The challenges at a Hackday are a great place to meet students and professionals working in data design. We can use the same platform (dribdat) to develop our data skills in this course.

----

<img title="Insurance map" src="https://www.seantis.ch/blog/visualiserung-der-krankenkrassenpraemien/aerzte_pro_10T.png" style="border:0;width:100%">

The results of such [Hackdays](https://blog.datalets.ch/029) are prototypes that illustrate data concepts, and the whole process of obtaining and working with a particular kind of data.

----

[![MakeZurich](https://opendata.utou.ch/presentations/digiges%202019.2/images/170205_mm_MakeZurich_EventSpace.jpg)](https://makezurich.ch)

Cooperating on civic tech initiatives, founding startups, working on research projects and getting crowdfunded are some popular ways to go "beyond the hackday".

----

![](https://i.imgur.com/7cQ2xFC.png)

By [campaigning for transparency](https://www.stadt-zuerich.ch/prd/de/index/stadtentwicklung/smart-city/transparenz.html) (Stadt Zürich) in digital and real-world spaces, we create more opportunities to establish good policies and data sources.

----

[![](https://i.imgur.com/2Uxuu3P.jpg)](https://farming-hackdays.ch/)

Barnraising open data, hardware, source, content does not happen in a void. Governments and universities are these days keen to support such developments.

----

# Part III

Big purpose, small data. Querying the data oceans. Distributing data to the edges. Packaging data beautifully. How to check data quality with validation tools. Opening collaboration. Collecting your own data.

---

*[Small data](https://en.wikipedia.org/wiki/Small_data) is data that is 'small' enough for human comprehension. It is data in a volume and format that makes it accessible, informative and actionable.* (Wikipedia)

![](https://upload.wikimedia.org/wikipedia/commons/a/a4/Jackie_Chan_Cannes_2013.jpg)

> Photo by Georges Biard, [CC BY-SA 3.0](https://creativecommons.org/licenses/by-sa/3.0), via [Wikimedia Commons](https://commons.wikimedia.org/wiki/File:Jackie_Chan_Cannes_2013.jpg)

----

## How does the structure of data extend beyond a dataset? :building_construction:

----

[![](https://janakiev.com/assets/wikidata_mayors_files/wikidata_data_model.png)](https://en.wikibooks.org/wiki/SPARQL/WIKIDATA_Qualifiers,_References_and_Ranks)

> Image by [Charlie Kritschmar](https://commons.wikimedia.org/w/index.php?curid=49616867) (WMDE) - Own work, CC0

When you explore the [Web of Data](https://lod-cloud.net/), you will start to get a better idea of which sources you can rely on.

----

[![](https://opendata.utou.ch/presentations/bfh%202019.10/img/tblfivestars.jpg)](http://5stardata.info/)

> Image from 5stardata.info

Tim Berners-Lee advocates the "5 stars of Linked Open Data" to measure how well your data is published.
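
----

One way to explore the Web of Data in code is a SPARQL query against Wikidata. The sketch below is a minimal illustration under stated assumptions: it uses the public Wikidata Query Service endpoint and the well-known example query for instances of "house cat" (`Q146`). It only builds the request URL and does not touch the network. The helper name `build_query_url` is made up for this example.

```python
from urllib.parse import urlencode

# Assumption: the public Wikidata Query Service SPARQL endpoint
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"

# Example query: ten items that are an instance of (P31) house cat (Q146)
QUERY = """
SELECT ?item ?itemLabel WHERE {
  ?item wdt:P31 wd:Q146 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
LIMIT 10
"""


def build_query_url(sparql, endpoint=WDQS_ENDPOINT):
    """Return a GET URL that asks the endpoint for JSON results."""
    params = {"query": sparql.strip(), "format": "json"}
    return endpoint + "?" + urlencode(params)


url = build_query_url(QUERY)
print(url)
```

Opening the printed URL in a browser, or fetching it with `urllib.request` and a descriptive User-Agent header, should return the matching items as JSON.
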

----

[![](https://i.imgur.com/9pBrh2S.png) Querying Wikidata is a superpower](https://tech-news.wikimedia.de/en/2021/08/23/wikidatas-query-builder-your-new-superpower-in-the-world-of-open-data/)

Tools like Query Builder make it possible to run searches and extract data without programming SPARQL. See also [CH community examples](https://db.schoolofdata.ch/event/1) for various interesting queries, and learn by example.

----

## Scraping and saving the web :cactus: :tractor:

----

![](https://images.squarespace-cdn.com/content/v1/5afdd6b84611a0e5999895fe/1572039482590-V18770WEQY84FHR58UKH/25395805_1575590755858598_8105091291970771884_n.jpg?format=750w)

> Image: [Data Refuge Stories](https://www.datarefugestories.org/our-story-1)

Whether it's saving data from politics or price hunting that you're after, transforming websites into data can be meaningful, rewarding and fun.

----

![](https://i.imgur.com/ySAkXYu.jpg)

[Parsehub](https://help.parsehub.com/hc/en-us/articles/115002659013-Lesson-0-Introduction) is a free tool for visually creating web scrapers that collect web data repeatedly. See also [Morph.io](https://morph.io), [Scrapy](https://scrapy.org/), and [Feedly](https://feedly.com).

----

[![](https://beakerbrowser.com/img/what-is-beaker.svg)](https://beakerbrowser.com/)

> Image from [Beaker Browser](https://beakerbrowser.com/)

Peer-to-peer (P2P) projects like [Dat](https://dat-ecosystem.org/) and [Solid](https://solidproject.org/) are at the forefront of innovation in distributed data storage, searching, sorting and validating. We will talk about the closely related topic of web3 next.
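
----

Coming back to scraping: transforming a web page into data can be illustrated with a few lines of code. This is only a sketch, assuming Python's built-in `html.parser` module and a made-up HTML snippet (`SAMPLE_HTML`, with invented products and prices). A real scraper also needs a download step, and tools such as Parsehub or Scrapy work differently.

```python
from html.parser import HTMLParser


class TableScraper(HTMLParser):
    """Collect the text of every table cell, row by row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a table cell
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# Made-up page fragment standing in for a downloaded website
SAMPLE_HTML = """
<table>
  <tr><th>Product</th><th>Price (CHF)</th></tr>
  <tr><td>Bread</td><td>3.20</td></tr>
  <tr><td>Milk</td><td>1.65</td></tr>
</table>
"""

scraper = TableScraper()
scraper.feed(SAMPLE_HTML)
header, *records = scraper.rows
print(header)
print(records)
```

The resulting rows can be written to a CSV file with the `csv` module, which turns the page into a small dataset.
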

----

## Accurate + Authentic + Appropriate = :muscle: Reliable

----

### Accurate

![](https://i.imgur.com/1HPWWjH.png)

https://opendata.swiss/de/dataset/klimanormwerte-niederschlag-1961-1990

----

### Authentic

![](https://upload.wikimedia.org/wikipedia/commons/b/b8/Omega_Seamaster_Diving_watch.jpg)

> By Naklig - Own work, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=20200181

----

### Appropriate

To the context of their use...

![](https://i.imgur.com/v4BqCdv.jpg)

----

## Package my data beautifully :shopping_bags: :sparkles:

---

![](https://external-content.duckduckgo.com/iu/?u=https%3A%2F%2Fi1.pickpik.com%2Fphotos%2F806%2F744%2F960%2Ftomatoes-ketchup-sad-food-preview.jpg&f=1&nofb=1)

> Photo by [Pixabay](https://pxhere.com/en/photo/937275), CC0

Even the best packaging can't improve the quality of the data itself, but it helps to set expectations and inform the user.

----

![](https://frictionlessdata.io/img/frictionless-color-full-logo.svg)

[Frictionless Data](https://frictionlessdata.io/) aims to solve many of the pitfalls of data reuse on the Web with a set of simple standards and cross-platform tools. It is a set of projects maintained by the Open Knowledge Foundation.

----

![](https://i.imgur.com/Fo6BL3G.png)

The [Data Package Creator](https://create.frictionlessdata.io/) quickly generates descriptive metadata (packaging) based on CSV files, which you can use with other tools.

----

![](https://i.imgur.com/mXJyyr6.png)

Using [GitHub Actions](https://github.com/features/actions) and [Frictionless Repository](https://repository.frictionlessdata.io/), the Data Package can be automatically validated.

----

![](https://i.imgur.com/U5ERDHe.png)

This allows you to use public data sources as well as your own automations reliably.
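
To make the idea of packaging concrete, here is a rough sketch of the kind of descriptive metadata a Data Package carries. It is not the Data Package Creator's own code, and it uses only the Python standard library rather than the Frictionless tools. The CSV content, the file name `precipitation.csv` and the helper names are made up for illustration.

```python
import csv
import io
import json


def infer_type(values):
    """Guess a simple field type: integer, number, or string."""
    for type_name, cast in (("integer", int), ("number", float)):
        try:
            for value in values:
                cast(value)
            return type_name
        except ValueError:
            continue
    return "string"


def describe_csv(name, path, text):
    """Build a minimal Data Package style descriptor for one CSV resource."""
    reader = csv.DictReader(io.StringIO(text))
    rows = list(reader)
    fields = [
        {"name": column, "type": infer_type([row[column] for row in rows])}
        for column in reader.fieldnames
    ]
    return {
        "name": name,
        "resources": [
            {"name": name, "path": path, "format": "csv",
             "schema": {"fields": fields}}
        ],
    }


# Made-up sample data, not real measurements
SAMPLE_CSV = """station,year,precipitation_mm
Luzern,1990,1173.5
Bern,1990,1028.0
"""

descriptor = describe_csv("precipitation", "precipitation.csv", SAMPLE_CSV)
print(json.dumps(descriptor, indent=2))
```

Saved as `datapackage.json` next to the CSV file, a descriptor like this tells other people and tools what each column is expected to contain, which is what validation then checks.
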
You can also pick up a nice badge: [![](https://github.com/frictionlessdata/repository-demo/actions/workflows/frictionless.yaml/badge.svg)](https://repository.frictionlessdata.io/docs/badges.html)

----

![](https://i.imgur.com/BMjbarD.png)

By the way, GitHub has a _lot_ of CSV files lying around, if you care to [have a look](https://github.com/search?q=extension%3Acsv&type=code).

----

[![](https://i.imgur.com/yyufOi3.jpg)](https://frictionlessdata.io/blog/2018/07/16/oleg-lavrovsky/)

Whether you are a Julia programmer (of which there are about 40'000, it seems), or not coding at all - there are many, many instruments to help improve your experience with data. Most of them at least partly exist on GitHub, though there are alternatives like GitLab & Gitea, too.

----

## Collecting your own data :female-artist: :clipboard:

----

![](https://i.imgur.com/ROIntpq.jpg)

Opportunities to observe and collect data are all around us. What data influenced your experience of arriving on campus this morning?

----

[![](https://external-content.duckduckgo.com/iu/?u=https%3A%2F%2Fwww.frizzifrizzi.it%2Fwp-content%2Fuploads%2F2015%2F03%2Fdear_data_7-1155x770.jpg&f=1&nofb=1)](https://www.frizzifrizzi.it/2015/03/20/dear-data-quando-la-mail-art-incontra-linfografica/)

Get inspired by [Giorgia Lupi and Stefanie Posavec](http://www.dear-data.com/) to go out and sketch your experiences. Practice dataful awareness!

----

<img src="https://i.imgur.com/eZU4EBx.jpg" width="30%"><img src="https://i.imgur.com/yM8uBVQ.jpg" width="30%"><img src="https://i.imgur.com/0dDqFmB.jpg" width="30%">

If you'd like to do some mapping, [Street Complete](https://wiki.openstreetmap.org/wiki/StreetComplete) and [Field Papers](https://dda.schoolofdata.ch/project/16) are playful ways to explore your neighbourhood and contribute to OpenStreetMap data.

----

<img src="https://i.imgur.com/qT4jrJ9.jpg" width="28%"><img src="https://i.imgur.com/j9VSS6y.jpg" width="38%"><img width="31.5%" src="https://i.imgur.com/FgTKkBp.jpg">

The [swisstopo app](https://www.swisstopo.admin.ch/de/karten-daten-online/karten-geodaten-online/swisstopo-app.html) gives you a detailed look at official Swiss geodata, and has a good recording feature for making traces with your phone.

----

## Collaboration: ask and thou shalt receive :robot_face: :chart_with_upwards_trend:

----

[![](https://i.imgur.com/bVOrJYn.png) Generated with OpenAI](https://hackmd.io/eYI9e9ihT4SJTUZbSZOj_A)

You've sent me some [DALL-E queries](https://openai.com/) which I was happy to complete for you. Terabytes of imagery + metadata (human and A.I. classification) + years of research and finetuning + a few words + your imagination = Art!

----

[![](https://i.imgur.com/E8g2183.jpg) (C) Digitale Verwaltung Schweiz](https://www.digitale-verwaltung-schweiz.ch/umsetzungsplan/umsetzungsplan-e-government-schweiz/open-government-data)

Similarly, there are people in government whose job it is to provide us with data. We can use data request forms and contacts to get their support.

----

![Open Data Beer](https://us-east-1.linodeobjects.com/dribdat/uploads/upload_83141c555375e049c74234093d639f86.jpg)

We can meet such people at [Community events](https://opendata.ch/events). Or just write them a note by email or an [online forum](https://forum.schoolofdata.ch).

---

# Next week

- ~~Analyse~~
- ~~Visualise~~
- Publish 🚀
- ..
- Profit!

----

[![](https://datavizblog.files.wordpress.com/2013/05/map-full-size1.png) Charles Minard's Flow Map](https://datavizblog.com/2013/05/26/dataviz-history-charles-minards-flow-map-of-napoleons-russian-campaign-of-1812-part-5/)

----

![](https://i.imgur.com/jgJZJcO.jpg)

> [Ginkgo2g](https://de.wikipedia.org/wiki/Junkerngassbrunnen#/media/Datei:Junkerngassbrunnen.JPG) CC BY-SA 4.0

*Keep the data flowing ...*

---

:thumbsup: :thumbsdown: oleg.lavrovsky@hslu.ch

<a rel="license" href="http://creativecommons.org/licenses/by/4.0/"><img alt="Creative Commons Licence" style="border-width:0" src="https://opendata.utou.ch/presentations/digiges%202019.2/88x31.png" align="left" /></a><br>This presentation by Oleg Lavrovsky and, unless otherwise stated, all contents are licensed under a <a rel="license" href="http://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution 4.0 International License</a>.