---
tags: Book
---

# Anti-Sybil Data Dictionary

The data dictionary below lists all fields in the pipeline- and human-evaluated datasets. Not every column is available in every version of the data.

| Column Name | Data Type | Description |
| ----------- | --------- | ------------------- |
| amount_median | Float | Median of the user's contribution amounts |
| amount_std | Float | Standard deviation of the user's contribution amounts |
| eth_share | Float | Percent of contributions made in Ethereum |
| update_distance | Int | Number of days since the last update to the user's GitHub profile, relative to the round start date |
| contrib_count | Int | Number of contributions made on Gitcoin |
| ip_cluster_size | Int | Number of IP addresses shared with this user |
| bio_length | Int | Length of the user's GitHub profile bio |
| handle | Str | GitHub/Gitcoin username |
| public_repos | Int | Number of public GitHub repositories |
| human_sybil_score | Int | (1-5) Human-applied Sybil score; 1 is 'definitely not sybil', 5 is 'definitely sybil' |
| is_sybil | Int | (0/1) Human reviewer's assessment of Sybilness |
| reviewer_is_certain | Int | (0/1) Human reviewer's certainty in their Sybil assessment |
| followers | Int | Number of followers of the GitHub profile |
| following | Int | Number of GitHub users the user follows |
| flag_type | Str | Whether the Sybil flag is human-generated, ML-generated, or heuristic |
| ml_score | Float | ML algorithm's confidence score that the user is a Sybil |
| create_distance | Int | Number of days since GitHub account creation, relative to the round start date |
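
Because not every column appears in every version of the data, it can help to check a dataset's header against the dictionary before using it. The following is a minimal sketch in plain Python, not part of the Anti-Sybil pipeline itself. The names `DATA_DICTIONARY`, `check_columns`, and `coerce_row` are hypothetical, the sample values are made up for illustration, and the sketch assumes the column names carry no range suffix (that is, `human_sybil_score` rather than `human_sybil_score (1-5)`).

```python
# Expected Python type for each column, copied from the table above.
# Assumption: column names appear in the data without "(1-5)" / "(0/1)" suffixes.
DATA_DICTIONARY = {
    "amount_median": float,
    "amount_std": float,
    "eth_share": float,
    "update_distance": int,
    "contrib_count": int,
    "ip_cluster_size": int,
    "bio_length": int,
    "handle": str,
    "public_repos": int,
    "human_sybil_score": int,
    "is_sybil": int,
    "reviewer_is_certain": int,
    "followers": int,
    "following": int,
    "flag_type": str,
    "ml_score": float,
    "create_distance": int,
}


def check_columns(columns):
    """Split a dataset header into documented, missing, and undocumented columns."""
    present = [c for c in columns if c in DATA_DICTIONARY]
    missing = [c for c in DATA_DICTIONARY if c not in columns]
    unknown = [c for c in columns if c not in DATA_DICTIONARY]
    return present, missing, unknown


def coerce_row(row):
    """Cast the documented fields of one record (a dict of strings) to their listed types.

    Empty strings become None; undocumented keys are passed through unchanged.
    """
    out = {}
    for key, value in row.items():
        caster = DATA_DICTIONARY.get(key)
        if caster is None:
            out[key] = value      # not in the dictionary: leave as-is
        elif value == "":
            out[key] = None       # treat empty cells as missing values
        else:
            out[key] = caster(value)
    return out


# Hypothetical header and record, for illustration only.
present, missing, unknown = check_columns(["handle", "ml_score", "is_sybil", "extra"])
row = coerce_row({"handle": "alice", "ml_score": "0.87", "is_sybil": "1"})
```

Here `missing` lists the documented columns absent from this version of the data, and `unknown` lists columns the dictionary does not describe. Note that `coerce_row` uses the built-in `int`, which rejects strings such as `"3.0"`; a dataset that stores integers that way would need a different cast.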