g0v.tw campaign finance digitization project

captcha input -

click the green button to refresh

http://campaign-finance.g0v.ctiml.tw/cell

The results -

Visualization

https://fuyei.github.io/cf-viz/viz.htm?debug

How it works

1. Volunteers upload scanned files to Dropbox or Google Drive and record them in a shared Google Sheet: http://bit.ly/ScanPoliticalContribution

2. Build a CSV listing of all files, containing the original file name, page number, and file URL.

e.g. https://github.com/ronnywang/tw-campaign-finance/blob/master/list0610.csv
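A minimal sketch of how such a listing could be generated, assuming one row per scanned page. The column names and the `build_listing` helper are hypothetical; the real list0610.csv may use different headers and ordering.

```python
import csv
import io

# Hypothetical column names; the project's actual CSV headers may differ.
FIELDS = ["original_file_name", "page_number", "file_url"]


def build_listing(pages):
    """Turn (file_name, page_number, url) records into CSV text, one row per page."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for file_name, page_number, url in pages:
        writer.writerow({
            "original_file_name": file_name,
            "page_number": page_number,
            "file_url": url,
        })
    return buffer.getvalue()


if __name__ == "__main__":
    # Example records are made up for illustration.
    print(build_listing([
        ("scan-a.pdf", 1, "http://example.com/scan-a-1.jpg"),
        ("scan-a.pdf", 2, "http://example.com/scan-a-2.jpg"),
    ]))
```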

3. Cutting "Tofu": use this code to find every horizontal and vertical line in the JPG and the coordinates of all their intersections.

The code for the steps above is in the image-processing repository listed under Document & Code.
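To illustrate the idea behind this step, here is a simplified sketch. It is not the project's actual code. It assumes the page has already been binarized into rows of 0/1 values (1 = dark pixel) and is not skewed, and it treats any row or column that is mostly dark as a grid line. The names `find_lines` and `intersections` and the `min_fraction` threshold are assumptions made for this example.

```python
def find_lines(image, min_fraction=0.8):
    """Return (rows, cols): indices of rows and columns that are mostly dark.

    image is a list of equal-length rows of 0/1 values, 1 meaning a dark pixel.
    """
    height, width = len(image), len(image[0])
    rows = [y for y in range(height) if sum(image[y]) >= min_fraction * width]
    cols = [
        x for x in range(width)
        if sum(image[y][x] for y in range(height)) >= min_fraction * height
    ]
    return rows, cols


def intersections(rows, cols):
    """Every (x, y) point where a detected column crosses a detected row."""
    return [(x, y) for y in rows for x in cols]


if __name__ == "__main__":
    # Tiny 5x5 "scan": horizontal lines at y = 0, 2, 4 and vertical lines at x = 0, 4.
    page = [
        [1, 1, 1, 1, 1],
        [1, 0, 0, 0, 1],
        [1, 1, 1, 1, 1],
        [1, 0, 0, 0, 1],
        [1, 1, 1, 1, 1],
    ]
    rows, cols = find_lines(page)
    print(rows, cols)
    print(intersections(rows, cols))
```

On a real scan, lines are several pixels thick and slightly tilted, so adjacent indices would need to be merged and the page deskewed before the intersections are usable as cell corners.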

4. Make an API to get every "Tofu" (cell) image by row and column number.

http://campaign-finance.g0v.ronny.tw/api/gettables
http://campaign-finance.g0v.ronny.tw/api/tables/{id} (get the detailed data for one page)
http://campaign-finance.g0v.ronny.tw/api/getcellimage/{id}/{row}/{column}.png (get the small "tofu" image)

e.g. http://campaign-finance-pic.ronny.tw/1/3-1.png
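As a small sketch, the URL templates listed above can be filled in like this. The function names are hypothetical, the code only builds strings and makes no request, and whether `row` and `column` are 0-based or 1-based is not stated in this document.

```python
# Base URL taken from the endpoint list above.
API_BASE = "http://campaign-finance.g0v.ronny.tw/api"


def table_url(table_id):
    """URL for the detailed data of one page (table)."""
    return f"{API_BASE}/tables/{table_id}"


def cell_image_url(table_id, row, column):
    """URL for the PNG of a single "tofu" cell in the given table."""
    return f"{API_BASE}/getcellimage/{table_id}/{row}/{column}.png"


if __name__ == "__main__":
    print(table_url(1))
    print(cell_image_url(1, 3, 1))
```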

5. Make a captcha input website for crowdsourcing the input.

6. Determine the best answer for each "Tofu", since the same "Tofu" is shown to more than one user.
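One simple way to pick a best answer is a majority vote over the submitted strings. The sketch below is an assumption about how that could work, not a description of the project's actual rule. `best_answer` and the `min_votes` threshold are hypothetical.

```python
from collections import Counter


def best_answer(answers, min_votes=2):
    """Return the most common non-empty answer, or None without enough agreement.

    Answers are stripped of surrounding whitespace before counting.
    """
    cleaned = [a.strip() for a in answers if a and a.strip()]
    if not cleaned:
        return None
    value, votes = Counter(cleaned).most_common(1)[0]
    return value if votes >= min_votes else None


if __name__ == "__main__":
    # Two users agree after whitespace is stripped; the third made a typo.
    print(best_answer(["1,000", "1,000 ", "1.000"]))
```

Cells for which this returns None could be shown to more users until one answer reaches the threshold.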

Live Demo / Ref

captcha input - click the green button to refresh
demo for finding the grid in an image - don't mind the Chinese in the alert; check out the green and red bits on the canvas
grid editing interface - manually adjust how you want to extract cells into individual images
aggregated viewer
aggregated csv
temporal/amount viz

Document & Code (Chinese)

Doc: https://g0v.hackpad.com/ep/pad/static/cnmUlDwCooX
Landing page & API list: http://campaign-finance.g0v.ctiml.tw/
GitHub for image processing: https://github.com/ronnywang/tw-campaign-finance
GitHub for captcha input: https://github.com/ctiml/campaign-finance.g0v.ctiml.tw