Some description summarising aim and the general idea
Aim: to protect web systems from spam while using the labour of captcha solvers to improve OpenStreetMap.
General idea: We have AI systems like fAIr that recognise buildings. They are not good enough to directly integrate into the map, or even give directly to mappers. On the other side, we have spammers creating bot accounts.
Some buildings are missing on OpenStreetMap completely, others were mapped wrong, and a few have been changed or demolished since they were mapped.
We would ask accounts who are suspected of being spammers to solve a challenge, asking them to click on the buildings in a selection of images. A small share of the images would be known positives and known negatives - used to validate whether the user is a bot; the rest of the images would be unknown - this is where we get free work from the user. The user would not know which are which. Ideally, the known positives/negatives would be examples that are known to be tricky for existing computer vision software.
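A minimal sketch of how such a challenge could be assembled and graded, assuming images are referred to by string ids. The function names (`build_challenge`, `grade`) and the default counts are made up for illustration and are not part of any existing MapTCHA code.

```python
import random


def build_challenge(known, unknown, n_known=3, n_unknown=6, rng=None):
    """Mix known-answer images with unknown ones and shuffle them.

    known:   dict mapping image id -> expected answer (True = building present)
    unknown: list of image ids with no trusted answer yet
    """
    rng = rng or random.Random()
    picked_known = rng.sample(sorted(known), n_known)
    picked_unknown = rng.sample(sorted(unknown), n_unknown)
    images = picked_known + picked_unknown
    rng.shuffle(images)  # the user must not be able to tell which are which
    return images


def grade(answers, known, max_errors=0):
    """Return (passed, votes).

    Only images with a known answer decide pass/fail. Answers on unknown
    images are returned as votes, and only if the user passed.
    """
    errors = sum(
        1 for img, ans in answers.items() if img in known and ans != known[img]
    )
    passed = errors <= max_errors
    votes = {img: ans for img, ans in answers.items() if img not in known}
    return passed, (votes if passed else {})
```

Dropping the votes of users who fail the known images is one possible design choice: it keeps likely bot answers out of the validation data.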
This isn't only for buildings: we could validate the existence of other objects that are visible from aerial imagery like zebra crossings, roads, etc. or objects that are visible from street-side imagery like Panoramax or Mapillary, such as benches, access restrictions, speed restrictions, road signs…
We could also validate objects, for example the shapes of buildings if the AI prediction doesn't match what's mapped in OSM (has the building changed since it's been mapped? Was the mapping not done right?).
We would gather multiple votes on a single validation, requiring a minimum number of votes and a minimum percentage of positive consensus before sending the validated points to mappers. For example, if 12 users have seen an image and 80% agree that there's a building there, which OSM doesn't have, we could send it to MapRoulette to be mapped.
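The voting rule could look roughly like the sketch below. The thresholds (12 votes, 80%) come from the example above; the status names and the symmetric "rejected" case are assumptions added for illustration.

```python
def consensus(votes, min_votes=12, min_agreement=0.8):
    """Decide the status of one image from its votes.

    votes: list of booleans, True = "there is a building here".
    """
    if len(votes) < min_votes:
        return "pending"  # not enough votes yet, keep showing the image
    share_yes = sum(votes) / len(votes)
    if share_yes >= min_agreement:
        return "confirmed"  # candidate to send on to mappers
    if 1 - share_yes >= min_agreement:
        return "rejected"  # users agree there is no building
    return "disputed"  # enough votes, but no clear majority


# 10 of 12 users say "yes" (83%), so the image is confirmed.
status = consensus([True] * 10 + [False] * 2)
```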
After using it on OSM we could expand… general use… sell protection on one side, data validation on the other… use the validated data to improve the training of computer vision systems.
We can use sample data from a MapSwipe challenge, e.g. https://web.mapswipe.org/#/en/projects/-O7hFcC2pKTnTh01SGds
The mockup would have a dozen images. We would have labeled known positives and negatives, and a few 'fake' unknowns that we would collect votes on.
We could reuse code from open source captchas. Altcha seems popular but doesn't implement puzzle solving…
Image category | Sample image | Expected response | Meaning |
---|---|---|---|
TP | (image not available) | Yes / Agrees | Expected building (labels = prediction) |
FP | (image not available) | No / Doesn't agree | No building in the labels, there shouldn't be any where outlined |
FN | (image not available) | Yes / Agrees | Expected building (it's in the labels, but not predicted) |
TN | (image not available) | No / Doesn't agree | It's an image with no outlines (not in the labels, not in the prediction) |
… shall we create fake TN with random outlines in the middle of nowhere? Or are we happy with having empty figures for this category?
Actual TN found by the algorithm (there're not many more)
We want to send the link to the working prototype (on the GH-pages site) to a "good" number (30-40? hundreds!!) of volunteers.
Are we running an A/B test?
Number of images per "session" (9? same for the swipe type?) and among these what is the proportion of:
SURVEY
We are setting up a questionnaire to get feedback from the users.
Type of questions (easier if it is not qualitative input, for inferring some data out of it).
Questions below are of the type "strongly agree/agree/neutral/disagree/strongly disagree"
I have identified features (like buildings) from satellite imagery before
I prefer the grid/swipe format
[@stuart make this not an agree/disagree question, only 2 options here, or 3 if you add 'neither']
I could easily identify the detected building
(You can add details in last question box)
I would like more/less zoomed imagery
(You can add details in last question box)
The user would benefit from further instructions
I find this CAPTCHA tool very cool
Is there anything we have missed and you would like to see in MapTCHA?
(For example: option for translation to other language, refresh/instructions/"skip" button, …)
Anything you would like to suggest, feedback or comment…
Version used for the FOSDEM proposal (Dec 2024):
… to present at the next State of The Map?
Stuart, Guillaume, Anna
Feedback from FOSDEM:
Maptcha Version 2
discussing the feasibility
Stuart, Guillaume, Anna
Feedback from the test (Mastodon):
Analysis of the data obtained + survey
Set up slides:
Let's write a paper‽
No of images per category:
We shall need a database to store people's responses, and decide on rules about what to show to people.
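As a starting point for discussion, here is a small sketch using SQLite from the Python standard library. The table and column names, and the "fewest votes first" display rule, are assumptions; the real app may well use a different store and different rules.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS images (
    id TEXT PRIMARY KEY,
    category TEXT CHECK (category IN ('TP', 'FP', 'FN', 'TN', 'unknown'))
);
CREATE TABLE IF NOT EXISTS responses (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    session_id TEXT NOT NULL,
    image_id TEXT NOT NULL REFERENCES images(id),
    answer INTEGER NOT NULL CHECK (answer IN (0, 1)),
    created_at TEXT DEFAULT CURRENT_TIMESTAMP
);
"""


def open_db(path=":memory:"):
    """Open (or create) the database and make sure the tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def record_response(conn, session_id, image_id, answer):
    """Store one yes/no answer given in a session."""
    conn.execute(
        "INSERT INTO responses (session_id, image_id, answer) VALUES (?, ?, ?)",
        (session_id, image_id, int(answer)),
    )
    conn.commit()


def least_voted_unknowns(conn, limit):
    """One possible display rule: unknown images with the fewest votes first."""
    rows = conn.execute(
        """
        SELECT i.id FROM images i
        LEFT JOIN responses r ON r.image_id = i.id
        WHERE i.category = 'unknown'
        GROUP BY i.id
        ORDER BY COUNT(r.id) ASC, i.id ASC
        LIMIT ?
        """,
        (limit,),
    ).fetchall()
    return [row[0] for row in rows]
```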
Timeline: by 13th all ready, send link to potential testers with a week time to try it, get inputs and analyse them before FOSDEM
TODO:
Stuart → finishes off the app, updates to latest images, and addresses the points above
Anna → starts off the slides, and curates the survey questions
Guillaume → to polish off the images
Agenda
Proposal accepted for a talk at FOSDEM 2025 (CfP)
See it here:
MapTCHA, the open source CAPTCHA that improves OpenStreetMap
https://pretalx.fosdem.org/fosdem-2025/talk/review/KMFMJ9NSWFYSW3DAGWBKK9BZFG7RBVRV
Agenda
Update from Stuart on the status of the mockup: he started something, also involving the family :), and the FOSDEM deadline is good to push things ahead*.
Discussion about FOSDEM and starting the shared document for the application.
In a lot of different ways this is a design problem, not a technical problem. Stuart
[*] Call for participation, geospatial room: https://lists.fosdem.org/pipermail/fosdem/2024q4/003597.html
To detect buildings by category… mapping the image into categories, to assign to each building a score in terms of which category it belongs to (90% overlap threshold, or something similar)
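A rough sketch of this categorisation, under some loud assumptions: outlines are simplified to axis-aligned boxes `(xmin, ymin, xmax, ymax)`, "overlap" is measured as intersection over union, and each prediction is greedily matched to its best unmatched label. Real building polygons would need a geometry library, and the exact overlap measure is still open.

```python
def box_iou(a, b):
    """Intersection over union of two boxes given as (xmin, ymin, xmax, ymax)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def categorise(labels, predictions, threshold=0.9):
    """Split outlines into TP / FP / FN using an overlap threshold.

    labels:      boxes of buildings known to exist
    predictions: boxes proposed by the model
    """
    matched = set()
    result = {"TP": [], "FP": [], "FN": []}
    for pred in predictions:
        best_score, best_idx = 0.0, None
        for idx, label in enumerate(labels):
            if idx in matched:
                continue
            score = box_iou(pred, label)
            if score > best_score:
                best_score, best_idx = score, idx
        if best_idx is not None and best_score >= threshold:
            matched.add(best_idx)
            result["TP"].append(pred)  # prediction agrees with a label
        else:
            result["FP"].append(pred)  # prediction with no matching label
    # Labels that no prediction matched are the missed buildings.
    result["FN"] = [l for i, l in enumerate(labels) if i not in matched]
    return result
```

An image for which all three lists come back empty (no labels, no predictions) corresponds to the TN case.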
Why is it so hard to generate (a ~huge number of) images for testing MapTCHA? [See questions 5-6 above]
The cases we usually see online are generally image classification tasks, while building detection is an image segmentation task.
This makes it more complicated to create than, let's say, a typical captcha like this one:
POSSIBLE SOLUTION: we ask the user to draw the polygon. This would be minimal editing, i.e. to click on 4 (maybe 6, not many more) vertices that make up the polygon of the building.
We get the input from more than one user (2? 4?) and set a threshold on which they agree → this becomes our "labels" that we obtain from the user and that we can use as TP to overlay with the prediction (to assess them, as a secondary output)
BUT
this would imply an analysis of the inputs, which doesn't fit with the prompt detection of human/not human that we need for distinguishing from robots.
THEN you WOULD anyways need a (fast) yes/no type of input for the human recognition step [we could eventually use the idea right above for the data user input].
So, where does this take us?
Could we maybe set up 2 steps, one initial with questions like "click on the images that contain a building" like in the second figure above (we would find "easy" cases for this‽), and a second step where people draw the buildings, to get data from the users.
Note:
This MapSwipe Web project seems to do exactly that
I do see how this one can be problematic, too, and wonder how they would treat the input they get from the user.
IDEA !!!
maybe we can take inspiration from the above, and use only one detection (building prediction outline, which can be correct or not) per image! After all, this is what fAIr currently accepts as feedback input (on their predictions).
This should be easy to generate (extract one outline at a time, zoom to it, buffer with background and export as a tile) and also to overlay with the labels mask/vector to filter them by T/F category.
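The "zoom to it, buffer with background" step could be sketched as below. It only computes the square extent to export around one outline; fetching the imagery and writing the tile depend on the imagery source and are left out. The function name and the `buffer_ratio` parameter are assumptions.

```python
def tile_bounds(outline, buffer_ratio=0.5):
    """Square extent around one outline, padded with background.

    outline:      list of (x, y) vertices in a projected coordinate system
    buffer_ratio: padding on each side, as a fraction of the larger box side
    """
    xs = [x for x, _ in outline]
    ys = [y for _, y in outline]
    side = max(max(xs) - min(xs), max(ys) - min(ys))
    half = side * (0.5 + buffer_ratio)  # half the outline plus the padding
    cx = (min(xs) + max(xs)) / 2
    cy = (min(ys) + max(ys)) / 2
    return (cx - half, cy - half, cx + half, cy + half)


# A 4 x 2 building centred on (2, 1), padded by half its longer side.
bounds = tile_bounds([(0, 0), (4, 0), (4, 2), (0, 2)])
```

Using a square extent keeps all tiles the same shape regardless of the building's proportions, which should make them easier to show in a grid.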
How many people would use the captcha every day? (How many new users does OSM have daily? Does the captcha also appear for editing the wiki? Other cases?)
Where is it going to be used, OSM only or more widely?
He read an article recently about bots getting very good at solving the street level view ones
AI bots now beat 100% of those traffic-image CAPTCHAs
… if you make your own captcha, you would almost hope that someone creates a bot that cracks it, then you could actually use their algorithm, as they found the solution for you (!!!)
THERE ARE 2 (almost competing) THINGS TO FIGURE OUT:
1. how secure you can make it - otherwise it's useless, as it can be used by bots
2. how to get good data out of it and gain good info from users
… it has to work as a security measure before it can be used for data collection
And this takes us to two different cases: 1. make something that works just to recognise if someone is human or not, 2. to obtain data on unknown cases (i.e. help in computer vision tasks)
The first case needs labels, otherwise you can't say if people are right or wrong [i.e. you need to have predictions for areas where you already know where buildings are]. In the second case we obtain data from the users.
Acknowledging that from December to March he is on paternity leave, he can help to build a prototype interface in one or two afternoons of work.
It would temporarily sit on Github.
Anna to provide him with imagery for this [~100 images for each of the four TP, FP, TN, FN categories, with a couple of buildings per tile].
… catch-up in person at November team-meeting in Glasgow (Stuart lives in Edi).
Discussion of general idea and mockup concept.
TODO list
Guillaume proposed a name and an idea for the logo.
First draft:
https://en.wikipedia.org/wiki/CAPTCHA
https://github.com/altcha-org/altcha Open source CAPTCHA "GDPR compliant, self-hosted CAPTCHA alternative with PoW mechanism and advanced anti-spam filter." but (proudly!) doesn't implement puzzle solving.
Dazed & Confused: A Large-Scale Real-World User Study of reCAPTCHAv2
Andrew Searles, Renascence Tarafder Prapty, Gene Tsudik
https://arxiv.org/abs/2311.10911
…
(An article covering that paper)
https://boingboing.net/2025/02/07/recaptcha-819-million-hours-of-wasted-human-time-and-billions-of-dollars-google-profit.html
ReMAPTCHA: A Map-based Anti-Spam Method that Helps to Correct OpenStreetMap
Stefan Keller, University of Applied Sciences, Rapperswil / Switzerland · sfkeller@hsr.ch
GI Forum 2014
https://gispoint.de/fileadmin/user_upload/paper_gis_open/GI_Forum_2014/537545020.pdf