---
title: "Linking OpenStreetMap and Wikidata: Case study of Taiwan's villages and rivers dataset"
tags: SoTM, SoTM 2022, Talk, OpenStreetMap, Wikidata, State of the Map
description: View the slide with "Slide Mode".
---
# <font color="red" size="56">Linking OpenStreetMap and Wikidata: Case study of Taiwan's villages and rivers dataset</font>
<!-- Put the link to this slide here so people can follow -->
<font color="red">slide: [https://hackmd.io/@osm-tw/BkTR0bmd9](https://hackmd.io/@osm-tw/BkTR0bmd9)</font>
<!-- .slide: data-background="https://i.imgur.com/zIpva9R.jpg" data-background-opacity="0.5"-->
<font color="yellow">State of the Map 2022 Dennis Raylin Chen</font>
Note:
Ta̍k-ke hó, hello everyone. I am Dennis Raylin Chen from Taiwan, and I want to talk about cleaning and managing datasets. My talk is titled "Linking OpenStreetMap and Wikidata: Case study of Taiwan's villages and rivers dataset".
---
## Who am I?
- Supaplex
- OpenStreetMap :heart: Wikidata :heart:
- Wikimedia Taiwan :cat:
Note:
My online ID is Supaplex. I am a member of the OpenStreetMap Taiwan and Wikidata Taiwan communities, and I currently serve on the board of directors of Wikimedia Taiwan.
---
## Outline
* OpenStreetMap and Wikidata
* Villages
* Rivers
* Cemeteries
Note:
Here is the outline of today's talk. I will start with some general information about OpenStreetMap and Wikidata in Taiwan, then cover the villages dataset and the rivers dataset. The final part is about cemeteries.
---
## OpenStreetMap Taiwan
* Monthly meetup in Taipei, co-hosted with Wikidata Taiwan
* Members overlap with Wikidata Taiwan
* Monitoring and mapping major changes, tagging scheme discussion
Note:
I am one of the co-hosts of the monthly meetup in Taiwan, which is co-hosted with the Wikidata Taiwan community. There is a huge overlap of community members between Wikidata and OpenStreetMap in Taiwan. The OpenStreetMap Taiwan community keeps track of major development sites, and sometimes discusses tagging schemes for mapping in Taiwan.
---
## Keeping track of vandalism in Taiwan and recovery
**Easily spotted with QA tools like OSMCha**

Note:
It is quite easy to spot vandalism.
---
## Keeping track of vandalism in Taiwan and recovery

Note:
The most annoying cases are unhappy Chinese users adding notes or edits claiming that China owns Taiwan.
---
## Cross Taiwan Strait Railway planned only by China

Note:
Sometimes Chinese users make unrealistic edits, for example a cross-Taiwan Strait railway.
---
## Taiwan Military Bases

Note:
Recently Taiwan has become an international hotspot for diplomatic and military activity. Chinese users are curious about Taiwan's military camps, but they misuse OpenStreetMap notes.
---
## Last year: rivers, villages and cemeteries
* COSCUP 2021: [OpenStreetMap 佮 Wikidata,敢會當整理台灣所有溪流資料](/d017K1UJTai0QR7s8Jr2bw)
* HOT Summit:[Using OpenStreetMap and Wikidata to arrange river data in Taiwan](https://hackmd.io/@osm-tw/ByEYs5kLY)
* Wikidata Con:[Using OpenStreetMap and Wikidata to arrange river data in Taiwan](https://hackmd.io/@wikidata-tw/rJZbfxYBF)
* Cemetery:[DRGPA 2021第五屆研究記錄亞太墓地研討會](/h1u4K9sZQViud0AP1nyK2Q)、[文化與自然地理記錄工作坊](/dn9E5hS4RXe7tD-_OlJsyw)
Note:
Last year I talked about cleaning, mapping and managing data on rivers, villages and cemeteries in Taiwan at COSCUP, one of the most important open source conferences in Taiwan, and at related conferences such as Wikidata Con and the HOT Summit.
---
## OpenStreetMap Merits in TW
* Power Towers
* Hiking Routes
* Cemetery
* Cross-linked with other databases: Wikidata, Wikipedia
* Multilingual: Tâi-gír (Taiwanese Hokkien), Ha̍k-ka-fa, Formosan Austronesian languages, English
Note:
Compared to commercial maps, OpenStreetMap has much higher coverage of power towers, hiking routes and cemeteries.
OpenStreetMap is also a web map that can easily be linked to other online projects like Wikidata and Wikipedia. The multilingual tagging scheme also allows people to add names in Taiwanese Hokkien, Hakka and Formosan Austronesian languages, and in international languages like English or Japanese.
----
## Hiking map using OpenStreetMap: Rudy Map

Note:
Here is an example of OpenStreetMap in use: Rudy Map is used by the hiking community in Taiwan.
----
## Roads named after [CKS (Zhongzheng Road)](https://en.wikipedia.org/wiki/Chiang_Kai-shek)
* http://overpass-turbo.eu/s/jby

Note:
Chiang Kai-shek was a dictator and former president of Taiwan. We can use OpenStreetMap to analyze the roads named after him.
----
## Power Towers and Substations
http://overpass-turbo.eu/s/kpv

Note:
I have mentioned that we have quite high coverage of power towers, and also of substations.
---
## [Wikidata Taiwan](https://www.wikidata.org/wiki/Wikidata:WikiProject_Taiwan)
* SLY art space
* River Code
* Village dataset
* Taiwan Government Publications Number
* Hakka Theme Data
Note:
For Wikidata, we have worked on art space dataset, river code dataset, village dataset, Taiwan government publications number, and Hakka theme data
---
## The progress in 2022
* Update village data to keep up with the government village dataset (with help from Planetoid's tools)
* Add more river relations
Note:
The Taiwan government added or removed some villages in 2022, so the OpenStreetMap and Wikidata Taiwan communities had to change the corresponding OSM relations and Wikidata items.
The tributaries of the big rivers are being added one by one, by creating waterway relations for the creeks.
---
## Villages in Taiwan
* Inspired by Serv's [Barangay](https://en.wikipedia.org/wiki/Barangay) mapping project, I want to tell Taiwan's story of mapping villages
* Total number is 7,749
* Mapped by community members in 4 years
* Linked with government ID and Wikidata
Note:
Inspired by Serv, also known as Eugene, who started mapping barangays in the Philippines, I want to tell Taiwan's story of village mapping. The OpenStreetMap Taiwan community took four years to map every village in Taiwan. The total number of villages is 7,749 (as of July 1, 2022), and all villages are linked with government IDs and Wikidata items.
---
## Visualization
[](https://overpass-turbo.eu/s/1kR3)
Note:
Here is a visualization of all of Taiwan's nearly eight thousand villages.
---
## Sample of Village
[](https://www.openstreetmap.org/relation/14017883)
Note:
This is one of the new villages established in 2022 in Taoyuan: Xingzhong Village.
---
## Monitoring new or dissolved villages in TW
Tool link: [https://wikidata.planetoid.info/?q=已建立鄉鎮條目](https://wikidata.planetoid.info/?q=%已建立鄉鎮條目)

Note:
The community has set up a tool to monitor the government dataset. If entries are added or removed, we can make the corresponding edits on both Wikidata and OpenStreetMap.
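
The core check behind this kind of monitoring can be sketched in Python as a set comparison. This is only a sketch, not the actual tool: the function name `diff_village_ids` and the village codes are hypothetical placeholders.

```python
def diff_village_ids(government_ids, mapped_ids):
    """Compare the government village list with what is already mapped."""
    government = set(government_ids)
    mapped = set(mapped_ids)
    return {
        # In the government dataset but not yet on OSM/Wikidata
        "to_add": sorted(government - mapped),
        # Still on OSM/Wikidata but gone from the government dataset
        "to_retire": sorted(mapped - government),
    }


# Hypothetical village codes, for illustration only
government = ["V001", "V002", "V004"]
mapped = ["V001", "V002", "V003"]

changes = diff_village_ids(government, mapped)
print(changes)
```

The same comparison would be run once against the OSM relations and once against the Wikidata items, since the two can drift apart independently.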
---
## The Challenge of an Up-to-date Village List
* Keep track of the government dataset
* Community rushes to add the changes
* Much easier to add on Wikidata
* Have to wait for the date the new villages are established, then create the new village relations on OSM
Note:
It is hard work to keep both OpenStreetMap and Wikidata up to date with the newest government dataset. You have to add the new villages in time, and label the revoked villages in time. It is much easier to edit villages on Wikidata, because you do not have to wait for the effective date. On OpenStreetMap, you can only change the villages after the effective date.
---
## Rivers in Taiwan
* Small creeks have little documentation
* Adding OpenStreetMap relations is not easy
* Sometimes a survey is needed
Note:
Now for the second part: the river story. We have to admit that editing OpenStreetMap relations is hard work and not easy for newcomers. Sometimes we have to survey on site to get river data.
---
## List of Rivers
[](https://overpass-turbo.eu/s/1kR6)
Note:
Using Overpass Turbo, we can easily get a list of rivers in Taiwan, with the government river code and the Wikidata reference number.
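
Once the Overpass result is exported as JSON, the list can be post-processed with a few lines of Python. This is a minimal sketch: the tag key passed as `code_key` is a placeholder (the real key for the government river code is not stated here), and the sample IDs, names and QIDs are made up.

```python
def river_rows(overpass_json, code_key="ref:river_code"):
    """Turn Overpass JSON elements into simple rows.

    code_key is a hypothetical tag key for the government river code.
    """
    rows = []
    for element in overpass_json.get("elements", []):
        tags = element.get("tags", {})
        rows.append({
            "osm_id": element["id"],
            "name": tags.get("name"),
            "wikidata": tags.get("wikidata"),
            "river_code": tags.get(code_key),
        })
    return rows


def missing_wikidata(rows):
    """Return OSM ids of rivers that are not linked to Wikidata yet."""
    return [row["osm_id"] for row in rows if not row["wikidata"]]


# Made-up sample in the shape of an Overpass API JSON response
sample = {
    "elements": [
        {"type": "relation", "id": 1,
         "tags": {"name": "River A", "wikidata": "Q0000001",
                  "ref:river_code": "000001"}},
        {"type": "relation", "id": 2,
         "tags": {"name": "River B"}},
    ]
}

rows = river_rows(sample)
print(missing_wikidata(rows))
```

A list like the one from `missing_wikidata` is a convenient to-do list for linking river relations to Wikidata items.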
---
## Wikipedia article on Nanshan Cemetery
[](https://zh.wikipedia.org/wiki/%E8%87%BA%E5%8D%97%E5%8D%97%E5%B1%B1%E5%85%AC%E5%A2%93)
Note:
The Wikipedia article on Nanshan Cemetery has a detailed description and some pictures, but no geodata such as the area.
----
[Nanshan Cemetery on OSM](https://www.openstreetmap.org/relation/6564784)

Note:
Nanshan Cemetery on OpenStreetMap, with the whole area information
---
## Statistics about Taiwan Cemeteries
* National Land Surveying and Mapping Center (NLSC): land use coverage map by the government
* Commercial maps: Google Maps, might only have a point to represent a cemetery
* Shortcomings: no raw data for reuse, might have to pay fees
Note:
Most commercial maps have some cemetery information, but most of it comes without area information. You have to pay to get the government NLSC map in raw data form. If you do not, you have to draw the map yourself.
----
## osmium command
```
osmium tags-filter taiwan-latest.osm.pbf wr/landuse=cemetery wr/amenity=grave_yard -o cemetery.osm.pbf
osmium export cemetery.osm.pbf -o cemetery-areas.geojson
```
Note:
Using osmium to extract and analyze the cemetery data for Taiwan from OpenStreetMap.
---
## Tags of Cemetery
```
osmium tags-count 20220531/cemetery.osm.pbf --output=20220531/cemetery-stat.txt
```
Note:
It is quite handy to use osmium to get tag statistics for the whole dataset.
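
The resulting text file can be read back for further analysis. Here is a minimal Python sketch, assuming the output is tab-separated with the count first, then the key and optionally the value in double quotes. The function name `parse_tags_count` and the sample numbers are made up for illustration.

```python
import csv
import io


def parse_tags_count(text):
    """Parse tab-separated count/key/value lines into tuples."""
    rows = []
    for fields in csv.reader(io.StringIO(text), delimiter="\t"):
        if len(fields) < 2:
            continue  # skip empty or malformed lines
        value = fields[2] if len(fields) > 2 else ""
        rows.append((int(fields[0]), fields[1], value))
    return rows


# Made-up sample in the assumed output format
SAMPLE = '350\t"name"\n120\t"religion"\n40\t"wikidata"\n'

rows = parse_tags_count(SAMPLE)
# Most frequently used keys first
top = sorted(rows, key=lambda row: row[0], reverse=True)[:2]
print(top)
```

In practice the text would come from the file written by `--output`, for example by passing the result of `open(path, encoding="utf-8").read()` to the function.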
---
## Tags of Cemetery

Note:
We found that name is the most used tag, and the other frequently used tag is religion. Some cemeteries also have a wikidata link or an address.
---
## Names of Cemeteries
```
osmium tags-count 20220531/cemetery.osm.pbf name=* --output=20220531/cemetery-name.txt
```
Note:
Let's analyze the cemetery names.
---
## Privately operated Cemeteries

Note:
Several companies manage privately owned cemeteries.
---
## Religion of Cemeteries
* Most cemeteries in Taiwan are government managed and religion-independent
* Some cemeteries are operated by churches or temples
Note:
Most cemeteries in Taiwan are government managed, but there are still some that are managed by churches or temples. We can analyze the religion tag.
----
## Religion of Cemeteries
```
osmium tags-count 20220531/cemetery.osm.pbf religion=* --output=20220531/cemetery-religion.txt
```
Note:
This is the command for counting the objects that have a religion tag, by value.
----

Note:
Some cemeteries have their name in the religion field. We spotted some obvious mistakes.
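
This kind of mistake can also be flagged automatically. Below is a minimal Python sketch that lists religion values outside a small list of expected ones. The `KNOWN_RELIGIONS` set is only an illustrative subset, and the sample counts and the cemetery name are made up.

```python
# Illustrative subset of expected religion=* values, not a complete list
KNOWN_RELIGIONS = {
    "buddhist", "christian", "taoist", "muslim", "chinese_folk", "multifaith",
}


def suspicious_values(value_counts, known=KNOWN_RELIGIONS):
    """Return religion values that are not in the expected list."""
    return sorted(value for value in value_counts if value not in known)


# Made-up counts per religion value
sample = {"christian": 30, "buddhist": 12, "Example Memorial Park": 1}

flagged = suspicious_values(sample)
print(flagged)
```

Anything flagged this way still needs a human look, because a value can be unusual without being wrong.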
---
## Challenges
* Data Quality
* Fundamental platform: hard work that does not easily catch the spotlight
* Multilingual
* Need contributions: ~~fundraising~~, ~~human resources~~
Note:
Data quality is a big problem. Most people see OpenStreetMap and Wikidata as fundamental platforms, so the hard work there rarely gets attention. The multilingual part needs people who speak the languages to help. We need people, and we need funding!
---
## Hard Work on Cleaning and Mapping Items
* OpenStreetMap and Wikidata can link to each other, and to third-party databases, e.g. GNS
* Needs knowledge of both OpenStreetMap and Wikidata, and sometimes programming
* Data quality varies, so we have to keep editing and fixing
Note:
OpenStreetMap and Wikidata can link to each other, and also to third-party databases, e.g. GNS.
We have to deal with both OpenStreetMap and Wikidata, and sometimes have to stop and fix a data problem. It takes patience.
---
## Multilingual
* Lack of people who can write Taiwanese Hokkien and Hakka
* Lack of a written form
* Lack of books in Taiwan's National Languages
Note:
There is a language revival movement for the National Languages of Taiwan. But there is still a lack of people who can use the written forms, and not many books are published in these languages.
----
## Hak-ka-fa
* Which romanization system to use for writing Hakka
* Different language codes for written Hakka: [Pha̍k-và-sṳ̀](https://zh.wikipedia.org/wiki/%E5%AE%A2%E5%AE%B6%E7%99%BD%E8%A9%B1%E5%AD%97), [Taiwanese Hakka Romanization System](https://en.wikipedia.org/wiki/Taiwanese_Hakka_Romanization_System) etc.
* Looking for Hakka-speaking people
Note:
Hakka is in a much more endangered situation because of its low usage in society. We have to choose which romanization system to use for written Hakka, and which language code to use for each written form. We are desperately calling for Hakka speakers to help with the Hakka language labels.
---
## Future Plan
* Themed workshops
* Mapping OpenStreetMap river relations to Wikidata items
* Adding multilingual labels to Wikidata river items, including Taiwanese Hokkien, Taiwanese Hakka and Formosan language names
Note:
Even during the COVID-19 pandemic, we still want to hold themed workshops to teach newcomers how to edit OpenStreetMap, and eventually how to edit rivers and link them to other databases like Wikidata. Taiwan has a new law to preserve its National Languages, including Taiwanese Hokkien, Taiwanese Hakka and the Formosan languages. We want to add these National Languages to both OpenStreetMap and Wikidata.
---
## [To-siā!](https://en.wiktionary.org/wiki/%E5%A4%9A%E8%AC%9D#Chinese) [sṳ̀n-mùng-ǹ!](https://en.wiktionary.org/wiki/%E6%89%BF%E8%92%99%E4%BD%A0) Thank you! :sheep:
- [GitHub](https://github.com/Supaplextw/)
- Supaplex: [Wikidata](https://wikidata.org/wiki/User:Supaplex),[OpenStreetMap](https://www.openstreetmap.org/user/Supaplex)
- Or [email](mailto:dennis@wikimedia.tw)
- Facebook group [Wikidata Taiwan](https://www.facebook.com/groups/2212207218990971/)、[OpenStreetMap Taiwan](https://www.facebook.com/groups/OpenStreetMap.TW/)
Note:
Here is my contact information, To-siā, sṳ̀n-mùng-ǹ! Thank you!