---
title: ckan 2023.02
tags: presentations
description: View with "Slide Mode".
slideOptions:
  theme: white
---

My slides and notes added after the discussion, from a presentation at the [CKAN Monthly](https://ckan.org/events/ckan-monthly-live-february-how-to-survive-and-thrive-in-the-murky-depths-of-any-data-portal) community meeting today. Thanks a lot for the invitation, to all the people in the room, and congratulations on the 2.10 release! :tada:

https://ckan.org/blog/the-latest-ckan-release-is-here-say-hi-to-ckan-210

_Side note: of the roughly 40 attendees there were at least 6 people who also participate in the CKAN dev calls, and IIRC about 10 who acknowledged oldschool vibes, still: I wonder how many people got the title of my talk, a reference to [Data Expeditions](https://schoolofdata.org/2012/11/14/data-expeditions-at-mozfest/), and [CKAN Hackathons](https://web.archive.org/web/20220514114436/https://irl.okfn.org/2013/10/03/ckan-hackathon-hello-from-okf-ireland/index.html) of a decade past?_

---

# Fear not the data dungeon!

![](https://i.imgur.com/QMc0laQ.jpg)

Image created using [DALL·E 2](https://openai.com/dall-e-2/) by OpenAI, based on data aggregators like [Common Crawl](https://commoncrawl.org/), and the tireless efforts of millions of [content creators](https://www.linkedin.com/pulse/chatgpt-dall-e-2-show-me-data-sources-dennis-layton/) around the Web.

![Screenshot from 2023-02-16 09-51-05|690x462](upload://692VxOqGfrrAEL6ikpEFiGJH6B7.jpeg)

_Screenshot of [commoncrawl.org](https://commoncrawl.org/)_

---

![](https://i.imgur.com/83SGEc3.jpg)

_[Image source](http://search-engine-optimiz.blogspot.com/2013/01/top-10-books-search-engine-optimization.html) (a blog post reviewing a book by Prof. Mario Fischer)_

We use CKAN to search for authoritative sources of data, a friendly and secure page full of metadata guiding us to resources - rather than dubious and troublesome Excel files buried deep in the murky filesystem of a random server.
Finding 'things' online ... it is a question of fear, uncertainty and doubt (FUD) despite - or maybe because of - the valiant efforts of search engine optimization (SEO).

Any other parents here, wondering what garbage search engines spit out at their children's queries? No, ah, ok, back [to your phones](https://journals.sagepub.com/doi/abs/10.1177/2050157919846916?journalCode=mmca) you go.

Anyone else here following the [fun with lies](https://www.howtogeek.com/852769/chatgpt-is-an-impressive-ai-chatbot-that-cant-stop-lying/) (HowToGeek) that is the world's obsession with [ChatGPT](https://www.wired.com/story/openai-chatgpts-most-charming-trick-hides-its-biggest-flaw/) (Wired)? So you know that Access to Data (a.k.a. concise statements whose veracity could be more easily checked by evidence) is a Very Good Thing - but not all doorways are equally welcoming.

---

![](https://i.imgur.com/v8Nw6yP.jpg)

_Screenshot of [DuckDuckGo Image search](https://duckduckgo.com/?t=ffab&q=ckan&atb=v317-1&iax=images&ia=images)_

Oops!.. Even if you use all caps and quotes, `"CKAN"` is mixed up with the Mexican musician [C-Kan](https://en.wikipedia.org/wiki/C-Kan) in DuckDuckGo's search results. Google, by the way, is not much better at this - but at least you see a couple of CKAN logos in the image search results.

The fallacies of search, of SEO, and the roots of many of our complaints with A.I. tools like ChatGPT, should be obvious to anyone by now.

Thanks to CKAN, and the tireless efforts of portal-deployers and catalog-maintainers around the world, we can truly say today: Open Data is great for SEO! And what's great for SEO.. is usually great for A.I., too. So here's to more FUD! I mean SEO 🥶

---

> Soy un francotirador apunto, preparo rimas, soy certero nunca fallo haré que sangre tu autoestima, [...] tengo flow y rimas no me hace falta nada, quieres grabar aquí pues la calidad se paga
> _-- C-Kan ft.
Big Rapper - No Fear, No Mercy (2013)_

> I am a sniper - I aim, I prepare rhymes, I'm so accurate I'll never miss I'll make your self-esteem bleed [...] I got flow and rhymes, I don't need nothing, You want to record here - the quality is paid for

`Translated with DeepL`

It's kind of fun when CKAN gets confused with a rapper, especially one whose lyrics seem to reflect a fondness for "flow", "accuracy" and "quality". A straight-laced marketing approach for an enterprise software product would try to distance itself from this. Cultural appropriation (as opposed to [cultural collaboration](https://www.bbc.com/culture/article/20220513-what-defines-cultural-appropriation) - BBC) would be wrong, if you get my gist.

And if you weren't convinced that CKAN rocks in my previous slide, you are now. Go fight FUD with some "flow y rimas" 🎸

---

![](https://external-content.duckduckgo.com/iu/?u=https%3A%2F%2F50-jahre-hitparade.ch%2Fanalysis%2F5-CH-Songs.png&f=1&nofb=1&ipt=871530dc6560553c55580bb5dfaa9f1c6fd10d320804d839be91d7ddbb0d601e&ipo=images)

Pictured above: a data visualisation from [Audio Analysis](https://hack.glam.opendata.ch/project/132.html), a hackathon project involving audio analysis and automatic transcription of a pirate radio station at [GLAMhack'22](https://infoclio.ch/en/glamhack2022) (Infoclio), the Swiss annual OpenGLAM event. It's the kind of project where open data meets machine learning to empower critical voices, and the potential for public impact is high.
You are warmly invited to [GLAMhack 2023](https://opendata.ch/events/glamhack2023/) in Geneva at the end of September 🤗

---

# hello.world('oleg')

Who is doing the inviting: a ~~freelancer~~ ~~solopreneur~~ coder with a cat's sense for content management glitches - as you would have too, if you had been building websites since your teenage years - dedicated to furthering the art and science of commas; `sharing,data,with,<3`

As [@loleg](http://okfnlabs.org/members/loleg/) you might have seen me active in the [Open Knowledge network](https://network.okfn.org/specialist/oleg-lavrovsky), run [data literacy workshops](https://schoolofdata.ch), consult [renowned institutions](https://forum.opendata.ch/u/oleg/summary), blog [on occasion](https://utou.ch), commit [with pride](https://dat.alets.ch), and - always - try to Pull Request with [deference](https://github.com/search?q=type%3Aissue+author%3Aloleg&type=issues&s=updated&o=asc) (see also: [PR etiquette - Hackernoon](https://hackernoon.com/pull-request-etiquette-20-core-principles-for-handling-prs-as-a-software-developer-a76l3yek)).

What else? Canadian-Swiss [expat](https://blog.datalets.ch/oleg/), citizen of a [climatically](https://datahub.io/collections/climate-change) [destabilised](https://carbonliteracy.com/greater-manchester-green-recovery/) planet, [8-bit](https://www.echtzeitkultur.org/) [space](https://intl.startrek.com/replicateyourself/) nerd, [family](https://ru.wikipedia.org/wiki/%D0%9B%D0%B0%D0%B2%D1%80%D0%BE%D0%B2%D1%81%D0%BA%D0%B8%D0%B9) man, _et cetera._

---

# 10 challenges

My input today: ten ways to 'hack' CKAN for fun and/or profit.
Think of this as a bunch of potential challenge topics for the next `<hint>`CKAN hackathon`</hint>`

---

## (1) Open data is a kind of honey pot

![](https://i.imgur.com/WRr2ykT.jpg)

Pictured above is my [humble submission](https://all.utou.ch/games/LD21/) to Ludum Dare 21 (#54 will take place at the [end of September](https://ludumdare.com/)) - a game where you try to guide some honeybees to the exit with a cube of honey. A bit like herding cats, the bees are wont to ignore your bait and bump stubbornly into walls, wasting time.

This seems to be a passable metaphor for the way open data is used to herd data users (developers, researchers) through more points of engagement with data publishers. Games are just a great format to invite people to [hack open source](https://github.blog/2020-10-27-github-game-off-2020/) (GitHub) ...Bzzzt! 🐝

As another kind of 'honeypot', CKAN might also be used to train IT departments in careful publication of data and metadata, educating them in tech and legal policies, preparing them for leaks and attacks. There is [a lot](https://hackmd.io/@oleg/ask-ti-jean) we could do to make community interactions with open data an opportunity for building capacity in Information Security. Which brings me to ...

---

## (2) Make CKAN more hackable[]

![](https://i.imgur.com/HHfgwSQ.png)

_Screenshot of [ckan.org/features/security](https://ckan.org/features/security)_

Understanding that everything is hackable is the first step of a long journey of Internet-fu. Encouraging pentesting in user communities, spreading learnings and tools to (API) users openly, training extension developers and portal maintainers... [OWASP CRS](https://hacknight.dinacon.ch/project/4) (DINAcon) is an example of how to interact with a security community, and I'd love to hear your own stories 👂

We are part of an ecosystem, and share the common fate of many successful software projects (think WordPress, Windows, Java, Shockwave ..) that have been the worryworms of devops.
[Bounties](https://docs.opencollective.com/help/contributing/development/bounties) (OpenCollective) and [Capture-the-Flag](https://ctftime.org/about/) (ctftime) are the most widespread methods to crowdsource attention to an open source product's footprint. They do not replace, but may well complement, a dedicated professional's evaluation.

Let's keep making CKAN great for developers - with a secure, open, high performance API and transparent security footprint. Check out [ckanext-security](https://github.com/data-govt-nz/ckanext-security) and harden your instance, mate!

---

## (3) Support & champion data (re)users

![](https://i.imgur.com/ghZ0Yb7.jpg)

Screenshot of https://opendata.swiss/de/showcase

The [Showcase extension](https://extensions.ckan.org/extension/showcase/) is probably my favorite page on the portal. Here you can really see how the data connects to applications. This is a place where I would love to see more stories and 'raw' hackathon projects, not just polished apps (or, as my screenshot exhibits, the very Swiss penchant for clock-like dials and maps).

We could make it easier to build user experiences through data publication, storify the legal or technical hurdles that are overcome in the effort to put data online. There is an ongoing [discussion](https://stories.schoolofdata.ch/) about connecting CKAN via DCAT, RSS, ActivityPub, and other protocols to fresh channels, for a new audience. We should hack this for [Open Data Day](https://opendataday.org/).

---

## (4) Induce participation in data workflows

![](https://github.com/ProxeusApp/community/raw/master/handbook/Proxeus%20-%20The%20Complete%20Handbook_html_10299e76126cc024.png)

Screenshot of https://github.com/ProxeusApp

This is a project I've been tinkering with for the past year, with which I would like to make it easier to design workflows around data collection and processing using the Proxeus ['no code' plug-in model](https://forum.opendata.ch/t/no-code-for-open-data-workflows/752).
There are many such business tools used to make digitalisation or data management more visual and accessible. My money is on open data that is [small, self-publish(able), actionable](https://hacknight.dinacon.ch/project/60) - not only because my resources are modest, but because that's how data stays personal.

CKAN's awesome foundations in [federation](https://ckan.org/features/federate/) of portals make it a prime environment for data replication across organisations or whole sectors - or for resilient [data refuges](https://en.wikipedia.org/wiki/Data_refuge) in activism. A [cool hack](https://community.home-assistant.io/t/blocky-style-flow-based-visual-editor-ui-for-automations-scripts-using-graphical-blocks-included-by-default/149902/9) in this vein would also be to combine the plug-n'play design of morph.io with a [Blockly](https://developers.google.com/blockly/guides/overview?hl=en) environment for scrapers.

Let me know if you are also tinkering with such things, and see potential in a CKAN integration via [ckanext-workflow](https://github.com/dpc-sdp/ckanext-workflow) or otherwise.

---

## (5) Data catalogs in the age of misinformation

![](https://i.imgur.com/z49NdKU.jpg)

Screenshot of https://memes.sucho.org/

In general, the better we can recognize participation, the more the whole community will benefit through new incentives and structures. But _why_ would people participate in the first place? I have been thinking a lot about the interfaces between data stewardship, volunteering, and the gig economy - and I think that having the right cause is a big driving factor.

Even though open data is often trumpeted, with perfectly good reason, as a weapon against [misinformation](https://data.bris.ac.uk/data/dataset/23yv276we2mll25fjakkfim2ml) (University of Bristol), we could pay more attention to features that make it easier to validate and compare sources.
See also my ODD'22 page [Cultural Refuge](https://db.schoolofdata.ch/project/177) and [Hack4SocialGood](https://hack4socialgood.ch), a research project culminating in an event in Switzerland at the end of March.

---

## (6) Not all user journeys start at the landing page

![](https://i.imgur.com/VjD7uMa.png)

_Screenshot of [DuckDuckGo](https://duckduckgo.com/?q=votes+in+switzerland+open+data+&t=ffab&atb=v317-1&ia=web)_

Of course, many do. Having access to web portal analytics - like some outstanding [portals I know](https://opendata.swiss/en/dataset/web-analytics-der-open-government-data-des-kantons-zurich) - lets us as a community better understand the 'open data marketplace': which topics are trending, where gaps exist. At least, this was something my friend Konstantin and I suggested to a [bunch of designers](https://www.worldiaday.org/talks/ia-context-open-data) in 2016.

It is interesting how users move between data platforms and websites, especially once a new generation of power-users is willing to go beyond the first or second link in a search result and explore the data in their own app. Going further, understanding data reuse patterns among apps and other downstream users of data takes this to the next level. I hear there is a lot cooking in this kitchen, and am keen to hear more 🍜

Just as a historical side note: before web pages got "rich", having to spend time massaging the content - with the plethora of file formats, and [archive formats](https://en.wikipedia.org/wiki/List_of_archive_formats) (Wikipedia) on top of those - was a normal part of the BBS-era and even early Web experience.

And also - remember these guys gracing the footer of every website in the 90's?
They were sometimes linked to a full-fledged web analytics viewer, no (data protection) questions asked - hard to believe, today:

![](https://external-content.duckduckgo.com/iu/?u=https%3A%2F%2Fwww.hostmysite.com%2Fsupport%2Ffrontpage%2Fcounter%2Fimages%2Ffpcounter2.gif&f=1&nofb=1&ipt=df6b7fe6db3526a7b08ac0f3ad2b174dc9306f34b96d3a12b762b66df704ce8f&ipo=images)

_Screenshot of a Hit Counter generator via [hostmysite.com](https://www.hostmysite.com/support/frontpage/counter/)_

---

## (7) No data is an island

[![](https://cdn.fosstodon.org/media_attachments/files/109/791/868/264/610/688/original/2d17dfc1abc34bd8.jpg)](https://fosstodon.org/@loleg/109791883066353377)

Where data stewards meet data makers. Check out the full, original & delightful [Data Access Map](https://www.theodi.org/project/the-data-access-map/) (CC BY), illustrated by Ian Dutnall for the ODI. I added the letters to invite more discussion of the role of public network infrastructure.

Wide-area networks, from public WiFi to cheap mobile data plans, are enablers of whole classes of the economy. Yet access is not evenly distributed, as proponents of [Digital Rights](https://www.eff.org/) (EFF) or [Information Justice](https://openfuture.eu/paradox-of-open-responses/openness-and-digital-human-rights/) (openfuture) will be keen to remind us.

This is why protocols like [LoRaWAN](https://www.instructables.com/ESP32-Long-Distance-LoRaWan/) (Instructables), which help to democratise the maintenance of sensor networks and edge computing, are important investments. See also [ONIA](https://opennetworkinfrastructure.org/), and look forward to hearing about the intersection of open data, IoT and security in Deborah Mesquita's [csv,conf,v7 talk](https://csvconf.com/speakers/).

---

## (8) Make excellent feedback loops

![](https://raw.githubusercontent.com/loleg/mz-forest-sound-track/master/diagram.png)

_Image by [@afsoonica](https://github.com/afsoonica) from a MakeZurich project, as [discussed here](https://forum.opendata.ch/t/22-30-6-makezurich-2018/372/3?u=oleg)_

[Xeno-canto](https://xeno-canto.org/) is an incredibly cool crowdsourced dataset of wildlife sounds, and is relied upon by thousands of projects like this one. One can easily be inspired by their flexible [search syntax](https://xeno-canto.org/help/search) or the Data [Mysteries](https://xeno-canto.org/mysteries) page.

How could CKAN make it easier to contribute here, or start a new effort? How do we inspire and educate others in data stewardship? At what stage, and in what way, is it best to plug in data collection projects ([Kobo](https://kobotoolbox.org/), [ODK](https://getodk.org/), ..)?

Every good library should have a section dedicated to [how to be a better writer](https://www.wikihow.com/Be-a-Good-Writer) (WikiHow). That's what I'm ~~talkin'~~ gaffing about.

---

## (9) Pinboards? Rules! Constraints? Scores!

![](https://i.imgur.com/XcxV4JE.jpg)

_Screenshot of [dribdat](https://dribdat.cc) for makezurich.ch_

Make _making_ data great again: this is the goal of [dribdat](https://github.com/dribdat), a project I initiated and continue to support, just one of the many [awesome tools](https://github.com/dribdat/awesome-hackathon) available to hackathon organizers to make playful, legal, experimental, inclusive hacking a fun and rewarding activity.

Applied correctly, it is a powerful community building tool. But we are all very much aware of hackathon fatigue, when things go out of balance. It is important to document our ideas, attempts, successes and failures, so that even the smallest contributions count.
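
One way to document which datasets a project builds on is to pull their metadata from a CKAN portal's Action API. The sketch below is only an illustration of that idea, not how dribdat or ckan-embed actually do it: it calls the standard `package_show` action and keeps a few common package fields. The portal URL and dataset name are hypothetical placeholders, and the helper names are my own.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def package_show_url(portal_url: str, dataset_id: str) -> str:
    """Build the Action API URL that returns metadata for one dataset."""
    base = portal_url.rstrip("/")
    return f"{base}/api/3/action/package_show?{urlencode({'id': dataset_id})}"


def summarize_package(response: dict) -> dict:
    """Reduce a package_show response to a few fields worth embedding."""
    if not response.get("success"):
        raise ValueError("CKAN reported an unsuccessful request")
    pkg = response["result"]
    # Collect the distinct, non-empty resource formats (CSV, JSON, ...).
    formats = sorted(
        {r["format"].upper() for r in pkg.get("resources", []) if r.get("format")}
    )
    return {
        "name": pkg.get("name"),
        "title": pkg.get("title"),
        "license": pkg.get("license_title"),
        "formats": formats,
    }


def fetch_summary(portal_url: str, dataset_id: str) -> dict:
    """Fetch and summarize one dataset (performs a network request)."""
    with urlopen(package_show_url(portal_url, dataset_id)) as reply:
        return summarize_package(json.load(reply))


# Hypothetical usage, assuming a portal at this placeholder address:
# fetch_summary("https://ckan.example.org", "my-dataset")
```

Keeping the URL building and the summarizing apart from the network call makes the interesting part easy to test against a saved response, which is handy at a hackathon with flaky WiFi.
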
In the latest release, dribdat automatically tries to [enrich](https://github.com/dribdat/dribdat/releases/tag/v0.6.6) project descriptions with metadata from CKAN's API using [ckan-embed](https://github.com/opendata-swiss/ckan-embed). I would be happy to hear your experiences with documenting open data projects, and to create some playful (or _ludic_) experiences that make friends want to spend time together enjoying some delicious data cake in each other's company. No feathers ruffled 🦚

<img src="https://handbook.opendata.swiss/de/_images/embed-architecture.png" width="250"><br>
_Image from [Opendata.swiss Handbook](https://handbook.opendata.swiss/de/content/glossar/bibliothek/embed.html)_

---

## (10) Ask how does data _really_ get used?

![](https://docs.hoppscotch.io/preview-light.png)

_Screenshot of [Hoppscotch](https://docs.hoppscotch.io/)_

Iteratively, expressively. API first. This is engineering as it is practiced in the so-called industry. Data has never really been just about static files - and today, machines are still connected to each other by the [loving grace](https://www.imdb.com/title/tt1955162/) of humans, with tools like the above.

__

![](https://i.imgur.com/gGHkTKf.png)

_Written by Liyas Thomas in [I created Hoppscotch 👽 - Open source API development ecosystem](https://dev.to/liyasthomas/i-created-postwoman-an-online-open-source-api-request-builder-41md)_

See also: [CKAN OpenAPI viewer](https://extensions.ckan.org/extension/openapiviewer/)

---

![](https://opendata.ch/wordpress/files/2022/12/GoVTech_Hackthon_NL_2048x1150-1024x575.jpg)

_Image courtesy of [The Federal Chancellery](https://www.bk.admin.ch/bk/en/home/digitale-transformation-ikt-lenkung/bundesarchitektur/api-architektur-bund/govtech-hackathon.html)._

:soon: https://opendata.ch/events/govtech-hackathon/

---

![](https://i.imgur.com/vnlbs7b.png)

_Screenshot of [Wolfram Alpha](https://www.wolframalpha.com/input?i=Time+left+until+9%3A12+AM+on+March+23%2C+2023.)._

🕰️ Time is ticking ...

✨ [How do we make hackathons more fair?](https://github.com/dribdat/coop/discussions/1)

---

#### So at the next hackathon: ask not what your data can do for you

<img src="https://i.imgur.com/1iD7Kus.jpg" width="200">

## Ask what you can do for your data!

_Image generated using [pokemon-stable-diffusion](https://huggingface.co/justinpinkney/pokemon-stable-diffusion?text=Uncle+Sam+wants+YOU), trained by Justin Pinkney_

---

## Thanks.

:thumbsup: :thumbsdown: oleg@opendata.ch

<a rel="license" href="http://creativecommons.org/licenses/by/4.0/"><img alt="Creative Commons Licence" style="border-width:0" src="https://opendata.utou.ch/presentations/digiges%202019.2/88x31.png" align="left" /></a><br>This presentation by Oleg Lavrovsky is licensed under a<br> <a rel="license" href="http://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution 4.0 International License</a>.