
g0v.tw campaign finance digitization project

captcha input -

click the green button to refresh

http://campaign-finance.g0v.ctiml.tw/cell

The results -

* Visualization

https://fuyei.github.io/cf-viz/viz.htm?debug

How does it work?

1. Volunteers upload scanned files to Dropbox or Google Drive and record them in a shared Google Sheet: http://bit.ly/ScanPoliticalContribution

2. Build a CSV listing of all files, containing the original file name, page number, and file URL.

e.g. https://github.com/ronnywang/tw-campaign-finance/blob/master/list0610.csv
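A minimal sketch of how such a listing could be generated. The header names (`file_name`, `page`, `file_url`), the sample record, and the `example.org` URL pattern are assumptions for illustration; the real column layout is whatever `list0610.csv` uses.

```python
import csv
import io

# Hypothetical scan records: (original file name, page count, URL template).
scans = [("report-a.pdf", 2, "https://example.org/scans/report-a/{page}.jpg")]


def build_listing(scans):
    """Return CSV text with one row per scanned page."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    # Assumed header names; the source only says which fields are present.
    writer.writerow(["file_name", "page", "file_url"])
    for name, pages, url_template in scans:
        for page in range(1, pages + 1):
            writer.writerow([name, page, url_template.format(page=page)])
    return buf.getvalue()


listing = build_listing(scans)
print(listing)
```

Keeping one row per page means every later step can refer to a page by a single row of the listing.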

3. Cut out the "Tofu": use this code to find every horizontal and vertical line in the JPG and the coordinates of all their intersections.

All of the code above can be found here.
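The idea in step 3 can be illustrated with a simple projection-profile approach. This is not the project's actual code: it assumes the page has already been binarized into a grid of 0/1 pixels (1 = dark) and that the table lines are perfectly axis-aligned. Real scans are usually skewed and noisy, so they would likely need something more robust. The function names and the `min_fraction` threshold are hypothetical.

```python
def find_lines(image, axis, min_fraction=0.8):
    """Return indices of rows (axis=0) or columns (axis=1) that are mostly dark.

    `image` is a list of rows, each a list of 0/1 pixels (1 = dark).
    """
    height, width = len(image), len(image[0])
    if axis == 0:
        profile = [sum(row) / width for row in image]
    else:
        profile = [sum(image[y][x] for y in range(height)) / height
                   for x in range(width)]
    hits = [i for i, fraction in enumerate(profile) if fraction >= min_fraction]

    # Collapse runs of adjacent indices so a thick line is reported once.
    lines = []
    previous = None
    for i in hits:
        if previous is None or i != previous + 1:
            lines.append(i)
        previous = i
    return lines


def find_intersections(image):
    """Return (x, y) coordinates where a vertical line crosses a horizontal one."""
    ys = find_lines(image, axis=0)
    xs = find_lines(image, axis=1)
    return [(x, y) for y in ys for x in xs]


# A tiny 5x5 "table": a border with one empty cell inside.
page = [
    [1, 1, 1, 1, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 1, 1, 1, 1],
]
corners = find_intersections(page)
print(corners)
```

Each pair of neighbouring intersections along x and y then bounds one cell, which is what gets cropped out as a "Tofu".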

4. Make an API that returns each "Tofu" (cell) image by row/column number.

http://campaign-finance.g0v.ronny.tw/api/gettables
http://campaign-finance.g0v.ronny.tw/api/tables/{id} (get the detailed data for one page)
http://campaign-finance.g0v.ronny.tw/api/getcellimage/{id}/{row}/{column}.png (get the small "tofu" image)

e.g. http://campaign-finance-pic.ronny.tw/1/3-1.png
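As a small sketch, the URL templates listed in step 4 can be filled in like this. The helper names are hypothetical, the example IDs are arbitrary, and no request is actually sent, since whether the service is still online and what it returns are not stated here.

```python
API_BASE = "http://campaign-finance.g0v.ronny.tw/api"


def table_url(table_id):
    """URL for the detailed data of one page (table)."""
    return f"{API_BASE}/tables/{table_id}"


def cell_image_url(table_id, row, column):
    """URL for the small "Tofu" image at the given row and column."""
    return f"{API_BASE}/getcellimage/{table_id}/{row}/{column}.png"


print(table_url(1))
print(cell_image_url(1, 3, 1))
```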

5. Make a captcha input website for crowdsourcing.

6. Determine the best answer for each "Tofu", since the same "Tofu" is shown to more than one user.
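One simple way to pick the best answer in step 6 is a majority vote. This is a sketch under assumptions, not the project's actual rule: the `best_answer` name, the `min_votes` threshold, and the choice to return `None` on ties are all hypothetical.

```python
from collections import Counter


def best_answer(answers, min_votes=2):
    """Return the most common answer, or None if there is no clear winner."""
    # Ignore blank submissions and surrounding whitespace.
    cleaned = [a.strip() for a in answers if a.strip()]
    if not cleaned:
        return None
    ranked = Counter(cleaned).most_common(2)
    value, votes = ranked[0]
    if votes < min_votes:
        return None  # Not enough agreement yet.
    if len(ranked) > 1 and ranked[1][1] == votes:
        return None  # Tie between the top two answers.
    return value


print(best_answer(["1,000", "1,000 ", "1.000"]))
```

A cell that returns `None` could simply be shown to more users until one answer pulls ahead.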

Live Demo / Ref

Document & Code (Chinese)