Git scraping is a technique I learned about some months ago; you can read more about it here: https://simonwillison.net/2020/Oct/9/git-scraping/
In general this can be done with any CI system, such as (self-hosted) GitLab or GitHub, but keep in mind that there may be restrictions, e.g. GitHub limits the maximum file size to 100 MB.
First we need to prepare an Overpass query that works with uMap; I have described this in this article.
Once we have the query, a simple shell script using wget or curl can fetch it. Let's call this script umap.sh:
wget -O result.json 'https://overpass-api.de/api/interpreter?data=<our_query>'
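As a rough sketch, umap.sh could also let curl do the URL encoding, so that the query stays readable inside the script. The query below and the helper name fetch_overpass are placeholders of mine, not taken from the original script; --get and --data-urlencode are standard curl options.

```shell
#!/bin/bash
# Sketch of umap.sh: fetch an Overpass query result with curl.
set -euo pipefail

# Assumption: a placeholder query, replace it with your own.
QUERY='[out:json][timeout:60];node["amenity"="recycling"](49.1,16.5,49.3,16.7);out;'

# fetch_overpass QUERY OUTPUT_FILE (hypothetical helper name)
fetch_overpass() {
  local query="$1" output="$2"
  # --get sends the data as a query string; --data-urlencode escapes it.
  # --fail makes curl exit non-zero on HTTP errors, so the workflow step fails too.
  curl --fail --silent --show-error --get \
    --data-urlencode "data=${query}" \
    --output "${output}" \
    'https://overpass-api.de/api/interpreter'
}

# In the real script, finish with:
# fetch_overpass "$QUERY" result.json
```

The function is only defined here; the commented last line shows how the script would call it.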
But how do we execute such a script? GitHub provides so-called "Actions", which are triggered when certain conditions are met, e.g. on a new commit or regularly via cron.
Let's create a "New workflow". A workflow defines these conditions; the example below runs the script called umap.sh on every commit, on manual dispatch, or daily at 4:56 in the morning (GitHub evaluates cron schedules in UTC).
After the run, it commits the files to the git repo if they have changed compared to what is already in the repo.
name: Scrape latest data
on:
  push:
  workflow_dispatch:
  schedule:
    - cron: '56 4 * * *'
jobs:
  scheduled:
    runs-on: ubuntu-latest
    steps:
    - name: Check out this repo
      uses: actions/checkout@v2
    - name: Get the data and analyze
      run: |
        chmod +x ./umap.sh
        ./umap.sh
      shell: bash
    - name: Commit and push if it changed
      run: |-
        git config user.name "Automated"
        git config user.email "actions@users.noreply.github.com"
        git add -A
        timestamp=$(date -u)
        git commit -m "Latest data: ${timestamp}" || exit 0
        git push
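The last step relies on git commit failing when nothing is staged, which || exit 0 turns into a successful no-op. A slightly more explicit variant, sketched here as a hypothetical helper commit_if_changed (my name, not part of the workflow above), checks the staging area first, so that a genuine commit error would still fail the job:

```shell
#!/bin/bash
set -euo pipefail

# commit_if_changed: stage everything and commit only when something changed.
# Prints "Committed" or "No changes" and returns 0 in both cases.
commit_if_changed() {
  git add -A
  # `git diff --cached --quiet` exits 0 when the index matches the last commit.
  if git diff --cached --quiet; then
    echo "No changes"
    return 0
  fi
  git commit --quiet -m "Latest data: $(date -u)"
  echo "Committed"
}
```

In a workflow step this would be followed by git push, which does nothing harmful when no new commit was created.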
Here are a couple of examples. The first one is the status of recycling in the Czech Republic, focusing on recycling items with incomplete data:
repo - https://github.com/mahdi1234/OSM_CZ_recycling
umap result - https://umap.openstreetmap.fr/en/map/odpad_bez_urceni_cr_553696
The second is a project by a friend of mine, focusing on vegan, vegetarian and bulk-purchase options in the South Moravian Region (Jihomoravský kraj):
repo - https://github.com/befeleme/vegan_JMK
umap - https://umap.openstreetmap.fr/en/map/vege-jmk_557579
For my current local project I wanted GitHub to generate GPX files, but I had a tough time until I realized that Overpass doesn't produce GeoJSON, but "just JSON".
I decided to switch to XML instead for this project, but according to the comment section it should also be possible to transform the result to GeoJSON; see https://github.com/ThomasG77/demo-parks-metropole-nantes/blob/main/umap.sh in particular osmtogeojson result.json >| result.geojson
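For the GeoJSON route, a minimal sketch of the conversion step might look like this. It assumes the osmtogeojson command-line tool is already installed on the runner (for instance via npm install -g osmtogeojson); the helper name to_geojson is mine and does not come from the linked script.

```shell
#!/bin/bash
set -euo pipefail

# to_geojson INPUT OUTPUT (hypothetical helper)
# Converts an Overpass result file into GeoJSON using osmtogeojson.
to_geojson() {
  local input="$1" output="$2" tmp
  if ! command -v osmtogeojson >/dev/null 2>&1; then
    echo "osmtogeojson not found" >&2
    return 1
  fi
  # Write to a temporary file first, so a failed conversion
  # does not leave a truncated GeoJSON file behind.
  tmp="$(mktemp)"
  if osmtogeojson "${input}" > "${tmp}"; then
    mv "${tmp}" "${output}"
  else
    rm -f "${tmp}"
    return 1
  fi
}
```

It would be called as to_geojson result.json result.geojson right after the download.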
Once we have the XML from Overpass, a conversion tool is needed; I chose gpsbabel - https://www.gpsbabel.org/
First install it as part of the workflow - https://github.com/mahdi1234/OSM_CZ_phonebooths/blob/main/.github/workflows/scrape.yml
    - name: Install gpsbabel
      run: sudo apt-get install gpsbabel
and then convert to GPX:
    - name: Convert to gpx
      run: |
        chmod +x ./gpx_convert.sh
        ./gpx_convert.sh
      shell: bash
where https://github.com/mahdi1234/OSM_CZ_phonebooths/blob/main/gpx_convert.sh is a simple gpsbabel call:
#!/bin/bash
gpsbabel -i osm -f active_phone_booths.xml -o gpx -F active_phone_booths.gpx
gpsbabel -i osm -f disused_phone_booths.xml -o gpx -F disused_phone_booths.gpx
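If more layers are added later, the same conversion could be written as a loop instead of one line per file. This is only a sketch: convert_all is a made-up helper name, and it assumes that every .xml file in the given directory is an OSM XML export from Overpass.

```shell
#!/bin/bash
set -euo pipefail

# convert_all DIR: convert every DIR/*.xml into a .gpx file next to it.
convert_all() {
  local dir="${1:-.}" xml
  for xml in "${dir}"/*.xml; do
    # Skip the literal pattern when the directory has no .xml files.
    [ -e "${xml}" ] || continue
    # Same gpsbabel options as above: read OSM XML, write GPX.
    gpsbabel -i osm -f "${xml}" -o gpx -F "${xml%.xml}.gpx"
  done
}
```

Calling convert_all . at the end of gpx_convert.sh would then cover both phone booth files.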
GPX files can be linked directly from uMap for download, as in https://umap.openstreetmap.fr/en/map/telefonni-budky_621957
This is done via the layer properties.