---
tags: Book
---
# Anti-Sybil Data Dictionary
The data dictionary below lists every field that appears in the pipeline-evaluated and human-evaluated datasets. Not every column is available in every version of the data.

| Column Name | Data Type | Description |
| ----------- | --------- | ------------------- |
| amount_median | Float | Median contribution of the user |
| amount_std | Float | Standard deviation of the user's contributions |
| eth_share | Float | Percent of contributions made in Ethereum |
| update_distance | Int | Number of days between the last update to the user's GitHub profile and the round start date |
| contrib_count | Int | Number of contributions made on Gitcoin |
| ip_cluster_size | Int | Number of IP addresses shared with this user |
| bio_length | Int | Length of the user's GitHub profile bio |
| handle | Str | GitHub/Gitcoin username |
| public_repos | Int | Number of public GitHub repositories |
| human_sybil_score | Int | Human-applied Sybil score (1-5); 1 is 'definitely not sybil', 5 is 'definitely sybil' |
| is_sybil | Int | Human reviewer's assessment of whether the user is a Sybil (0/1) |
| reviewer_is_certain | Int | Whether the human reviewer is certain of their Sybil assessment (0/1) |
| followers | Int | Number of followers of the GitHub profile |
| following | Int | Number of GitHub users the user follows |
| flag_type | Str | Whether the Sybil flag is human-generated, ML-generated, or heuristic |
| ml_score | Float | ML algorithm's confidence that the user is a Sybil |
| create_distance | Int | Number of days between GitHub account creation and the round start date |
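
Because not every column appears in every version of the data, a loader may need to tolerate absent fields. The following is a minimal sketch, not the project's actual pipeline: it assumes the data is delivered as CSV, assumes the parenthesised ranges such as `(1-5)` are annotations rather than part of the column names, and uses a hypothetical helper `load_rows` with invented sample rows. It casts each known column to the type listed in the table and reports which documented columns are missing.

```python
import csv
import io

# Column names and types copied from the data dictionary table.
SCHEMA = {
    "amount_median": float,
    "amount_std": float,
    "eth_share": float,
    "update_distance": int,
    "contrib_count": int,
    "ip_cluster_size": int,
    "bio_length": int,
    "handle": str,
    "public_repos": int,
    "human_sybil_score": int,
    "is_sybil": int,
    "reviewer_is_certain": int,
    "followers": int,
    "following": int,
    "flag_type": str,
    "ml_score": float,
    "create_distance": int,
}


def load_rows(csv_text):
    """Hypothetical helper: parse CSV text, cast known columns, list absent ones."""
    reader = csv.DictReader(io.StringIO(csv_text))
    present = reader.fieldnames or []
    missing = [name for name in SCHEMA if name not in present]
    rows = []
    for raw in reader:
        row = {}
        for name, value in raw.items():
            cast = SCHEMA.get(name, str)  # undocumented columns stay as strings
            # Empty or absent cells become None instead of failing on int("").
            row[name] = None if value in ("", None) else cast(value)
        rows.append(row)
    return rows, missing


# Invented sample data, for illustration only.
sample = "handle,contrib_count,ml_score,is_sybil\nalice,12,0.08,0\nbob,3,,1\n"
rows, missing = load_rows(sample)
```

Here `rows` holds one typed dictionary per record, and `missing` names the documented columns that this version of the data does not contain, so downstream code can decide whether to proceed without them.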