Summary
Roadmap
Contributing to the Gitcoin ASOP
Descriptions
Sybil Detection Roles
Sybil Detection Interfaces with Other Groups
Anti-Sybil Microservices Description
Data Dictionary
Sybil Survey Overview
:::info
Updated by December 2021
:::
Authors: Danilo Lessa Bernardineli
Results under the original scope
The definitions & methodology of the research plan were successfully implemented. This includes creating rewiring optimizers based on Hill Climbing and Simulated Annealing that were pushed to the NetworkX module.
<sup><sub>Source: https://github.com/gitcoinco/gitcoin_cadcad_model/tree/main/optimality_gap</sub></sup>
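As a rough illustration of what a Hill Climbing rewiring optimizer does, here is a minimal sketch in plain Python. It is not the code from the repository above: the function names, the move (swap one existing edge for one absent edge), and the triangle-count objective are all assumptions chosen for illustration.

```python
import random
from itertools import combinations


def hill_climb_rewire(edges, nodes, objective, steps=200, seed=0):
    """Greedy edge rewiring: keep a swap only if the objective improves."""
    rng = random.Random(seed)
    current = {frozenset(e) for e in edges}
    all_pairs = [frozenset(p) for p in combinations(nodes, 2)]
    best = objective(current)
    for _ in range(steps):
        absent = [p for p in all_pairs if p not in current]
        if not current or not absent:
            break
        # Sort before sampling so the run is reproducible for a given seed.
        removed = rng.choice(sorted(current, key=sorted))
        added = rng.choice(absent)
        candidate = (current - {removed}) | {added}
        score = objective(candidate)
        if score > best:  # hill climbing accepts strict improvements only
            current, best = candidate, score
    return current, best


def triangle_count(edge_set):
    """Hypothetical objective (an assumption): number of triangles."""
    nodes = sorted({n for e in edge_set for n in e})
    return sum(
        1
        for a, b, c in combinations(nodes, 3)
        if {frozenset((a, b)), frozenset((b, c)), frozenset((a, c))} <= edge_set
    )


# Toy 6-cycle; the edge count is preserved by every rewiring move.
start = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)]
rewired, score = hill_climb_rewire(start, range(6), triangle_count)
```

Simulated Annealing differs from this sketch in that it also accepts some worsening moves, with a probability that shrinks as a temperature parameter is lowered.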
:::info
Updated by September 2021
:::
Authors: Danilo Lessa Bernardineli
Summary
The combined FDD process is effective at catching about 83% of the Sybil Users (between 57% and 100% under 95% CI) according to blind human evaluations. The best estimate for Sybil Incidence on Gitcoin is 6.4%, and is contained between 3.6% and 9.3% with 95% CI. IP clusters together with GitHub account creation date are the most relevant features for detecting sybil users programmatically right now. The Fraud Tax is computed as 0.61% of the Funding Amount.
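To make the "95% CI" wording concrete, the sketch below computes a normal-approximation interval for a proportion. The summary does not say how its intervals were obtained, so the method is an assumption, and the counts in the usage line are hypothetical rather than the survey's actual sample.

```python
import math


def proportion_ci(successes, n, z=1.96):
    """Normal-approximation (Wald) interval for a proportion.

    z=1.96 corresponds to a two-sided 95% confidence level.
    Returns (point_estimate, lower, upper), clipped to [0, 1].
    """
    p = successes / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


# Hypothetical counts: 12 users judged sybil out of 200 evaluated.
estimate, lower, upper = proportion_ci(12, 200)
```

With small samples or proportions near 0% or 100%, this approximation is known to behave poorly, which is one reason a reported interval can be asymmetric around its estimate.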
Links
:::info
Updated by December 2021
:::
Authors: Danilo Lessa Bernardineli
Summary
A total of 27.9% of the Gitcoin users, representing 21.7% of the contributions, were flagged during R12. The Sybil Incidence during this round is significantly higher than in R11, estimated at 2.6x higher (lower and upper bounds: 1.6x and 5.1x).
Links
:::info
Updated by March 2022
:::
Authors: Danilo Lessa Bernardineli
Summary
A total of 11.9% of the Gitcoin users, representing TODO% of the contributions, were flagged during R13. The Sybil Incidence during this round is significantly lower than in R12, estimated at 70% of what it was before.
The Flagging Efficiency was 84% (lower boundary: 77% and upper boundary: 93%) which means that the combined process is underflagging sybils compared to what humans would do.
:::info
Updated by December 2021
:::
Authors: Danilo Lessa Bernardineli
This doc reports how the Gitcoin Contributions Graph changes when it is modified so that users are excluded based on a list provided by the Sybil Detection Algorithm. Summary statistics and an estimate of the Fraud Tax are computed (as defined at https://hackmd.io/e2mZ9UT7QRGMh5tg6OCXfw).
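The exclusion step can be sketched as follows. This is a minimal illustration under assumptions: the `Contribution` record, its field names, and the chosen summary statistics are hypothetical, and the Fraud Tax itself is not computed here because its definition lives in the linked document.

```python
from collections import namedtuple

# Hypothetical edge record: one user funding one grant with some amount.
Contribution = namedtuple("Contribution", ["user", "grant", "amount"])


def exclude_flagged(contributions, flagged_users):
    """Drop every contribution made by a user on the flagged list."""
    flagged = set(flagged_users)
    return [c for c in contributions if c.user not in flagged]


def summarize(contributions):
    """Basic before/after statistics for the contributions graph."""
    return {
        "users": len({c.user for c in contributions}),
        "grants": len({c.grant for c in contributions}),
        "contributions": len(contributions),
        "total_amount": sum(c.amount for c in contributions),
    }


# Toy data, not taken from any Gitcoin round.
graph = [
    Contribution("alice", "grant_a", 10.0),
    Contribution("bob", "grant_a", 5.0),
    Contribution("sybil_1", "grant_b", 1.0),
    Contribution("sybil_1", "grant_a", 1.0),
]
before = summarize(graph)
after = summarize(exclude_flagged(graph, ["sybil_1"]))
```

Comparing `before` and `after` shows how much of the graph is removed by a given list of flagged users.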
Parameters
Algorithm Aggressiveness: 20%
:::info
Updated by October 2021
:::
Authors: Charlie Rice, Nick Hirannet
Automate the evaluator notification (future)
Automate distribution of evaluator assignments (future)
Be able to run SQL queries in reasonable time (less than 10 min to pull 500,000 contributions). Create data lake or warehouse that can be used instead of Metabase. If cannot be done, need other credentialing solution (currently Jesse has credential to access) (in R12 scope)
Standardize value of amount_median, amount_mean to one currency (dollars?) (done)
:::info
Updated by June 2021
:::
Goals
To perform a dry-run of the technical anti-sybil workstream during round
To make it fun and educative to manage the anti-sybil work
Roles
Contribution data generator: Danilo
Sub-roles: dishonest contribution generator & honest contribution generator
:::info
Updated by August 2021
:::
Authors: Danilo Lessa Bernardineli
Input / Output
Input
A table where each row represents an independent and valid contribution. Required fields:
contrib_id (PK)
created_on (timestamp)
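One way to picture the two fields listed so far is as a typed record with a primary-key check. This is only a sketch: the class name, the integer type for `contrib_id`, and the helper function are assumptions, and any further required fields are not represented.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ContributionRow:
    contrib_id: int       # primary key (integer type is an assumption)
    created_on: datetime  # timestamp of the contribution


def has_unique_primary_key(rows):
    """Return True when no contrib_id appears more than once."""
    ids = [row.contrib_id for row in rows]
    return len(ids) == len(set(ids))


# Toy rows for illustration only.
rows = [
    ContributionRow(1, datetime(2021, 8, 1, 12, 0)),
    ContributionRow(2, datetime(2021, 8, 1, 12, 5)),
]
```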
:::info
Updated by August 2021
:::
Authors: Charlie Rice
Processed notes from a meeting on 27 August 2021
DELIVERABLES:
Microservices (Emanuel)
First dry-run/walk through during R&D on 1 September
:::info
Updated by March 2021
:::
Authors: Danilo Lessa Bernardineli
WS #8 Gitcoin Under Attack :volcano: + Scientifically approaching a Conjecture
MoH: @danlessa
This working session is going to exceptionally have a 3hr duration instead of 2hr
would be interested in how we decide what communities to plug into this analysis.
i think that collections ( https://gitcoin.co/grants/collections?featured=true&collection_id=14 ) might be an interesting way of grouping grants for this analysis
is it easy to make this self service, ie @frank@gitcoin.co or i want to analyze a group of grants, we can just run the analysis?
Michael Zargham
probably need a few iterations before trying to make it self service -- the first pass we simply picked some algorithms and described what we saw -- to make this reusable we need to evaluate some alternatives and understand what kinds of results are robust to changing the community detection algo, and the params of those algos.
:::info
Updated by December 2020
:::
04dec2020
Notes
Data must be cleaned so we don't dox anyone. Owner: Danilo
We need to pickle and load the results depending on the performance
Have a separate notebook for the sim and another for the visualizations
Andrew to tackle the video first. Danilo to do a first pass on the repo refactor/organization (by monday). Andrew to take it afterwards
:::info
Up to date by June 2021
:::
Author: Danilo Lessa Bernardineli (BlockScience)
Required fields
Contribution details: contributions graph
IP addresses: retrievable from IP address vector thread
:::info
Up to date by February 2021
:::
Contributors (unordered):
:no_bicycles: Danilo Lessa Bernardineli
:herb: Michael Zargham
:seedling: Jeff Emmett
🪐 Jiajia Hu
:::info
Up to date by March 2021
:::
Author: Danilo Lessa Bernardineli (BlockScience)
This is a terse math spec of the funding algorithm used in Gitcoin Grants Round 8, as implemented in https://github.com/gitcoinco/web/blob/stable/app/grants/clr.py
Terminology
Sets