Using Mathematics to Understand Attacks on GitCoin Grants

# Using Mathematics to Understand Attacks on GitCoin Grants ## Background [Quadratic Funding](https://medium.com/coinmonks/breaking-down-buterin-hitzig-and-weyls-liberal-radicalism-paper-ba5192248b2) is an innovative proposal for providing matching funds to public goods. The main feature of Quadratic Funding (QF) is that a proposal which earns support from several small donors should receive more funds than a single donation by a rich donor, Under QF, a hundred donors, each of whom has just 0.01 ETH to contribute, should wield more influence than a single contribution of 1 ETH. However, this basic feature could be exploited by users who want to strategically manipulate their contribution amounts and targets to maximize the matching funds. These behaviors are different from organic community action, since they are consciously directed with the matching funds as the primary goal. We do not want to reward *manipulation* -- funds should only go to real users making donations to grants they truly value. We consider two basic types of strategic behaviors: * **Sybil Attack:** In this scenario, many fake accounts are created to give the illusion of broad public support for some grant $F$. The funds available to a user are split into small amounts between these fake accounts, who then make many small donations $F$. As a result, $F$ receives a large matching donation amount, even though its actual support is limited. * **Collusion:** This occurs when there are two or more groups of users, with each group supporting a different grant. If the groups recognize this situation beforehand, they might decide to work together, directing part of their donations to each others' grants. This would inrease the overall amount that each grant receives. Again, we view this behavior as undesirable since the donations are not being driven primarily by the donors' authentic feelings about the values of the grants. Both types of attacks are based on the principle alluded to earlier, which we will call the **Subdivision Principle**: two accounts that each donate $\frac{1}{2}$ units to a grant will get more matching funds for the grant than one account that donates 1.0 unit. As part of the eight-week Gitcoin Grants Research Group sponsored by [Token Engineering Academy](https://tokenengineeringcommunity.github.io/website/docs/academy-welcome/), we performed a focused mathematcal analysis of the [modified CLR algorithm proposed by Buterin](https://ethresear.ch/t/pairwise-coordination-subsidies-a-new-quadratic-funding-design/5553). This proposal shares the basic properties of the implementation used by GitCoin, but is more amenable to mathematical analysis. Our goal was to come to an understanding of the strategic attacks described above. We hoped to provide both a formal foundation and and informal insight anyone interested in understanding this aspect of the GitCoin ecosystem. In this article, we will concentrate on the high-level overview of our findings -- formal proofs will be provided elsewhere. --- ## How To Look at The Donations? There are many ways to represent the donations from users to grants for a particular round of GitCoin Grants: Django, Python dictionaries, .csv files, pandas data frames, etc. From a mathematical perspective, there are a few other valuable ways to represent this information. The first is as a [***bipartite graph***](https://en.wikipedia.org/wiki/Bipartite_graph), where each node represents either a *contributor* or a *grant*. There is a *directed edge* from the contributor to a grant, labeled with the donation amount. Every bipartite graph has a closely related ***bipartite adjacency matrix*** $A$, where the entry in row $i$ and column $j$ is determined by the contribution of user $i$ to grant $j$. All of the donations by particular user will form a row vector in this matrix, while all of the donations to a particular grant can be viewed as a column vector. This enables one to think about distance, angles, etc. between grants or users in the sense of vector calculus. Although they have not played much role in our current analysis, graph theory and matrix analysis are powerful mathematical tools that may yield further insight. In particular, it is often possible to achieve [massive speedups of algorithms by reformulating them in terms of matrices](https://medium.com/@mikeliao/numpy-vectorization-d4adea4fc2a). ## Understanding Strategic Attacks as an Optimization Problem The main thrust of our research was to understand how a user or group of users could manipulate the system to achieve maximum matching for a particular grant or group of grants. Imagine that we have a scenario of three users, Users 1,2, and 3. These users have with a total budget to split between them of 1.0 ETH. They are interested in maximizing the raw matching funds allocated to two particular grants, which we call Grant 1 and Grant 2. We let $x_{i\rightarrow j}$ denote the **sqare root** of the amount given by User $i$ to Grant $j$ in this scenario. Under this framework, the raw matching funds that these users can get for these grants by strategically adjusting their donations has a precise formulation -- Maximize the value of the expression: $$ \displaystyle\frac{x_{1 \rightarrow 1}x_{2 \rightarrow 1} + x_{1 \rightarrow 2}x_{2 \rightarrow 2}}{1 +x_{1 \rightarrow 1}x_{2 \rightarrow 1} + x_{1 \rightarrow 2}x_{2 \rightarrow 2} } +\displaystyle\frac{x_{1 \rightarrow 1}x_{3 \rightarrow 1} + x_{1 \rightarrow 2}x_{3 \rightarrow 2}}{1 +x_{1 \rightarrow 1}x_{3 \rightarrow 1} + x_{1 \rightarrow 2}x_{3 \rightarrow 2} } +\displaystyle\frac{x_{2 \rightarrow 1}x_{3 \rightarrow 1} + x_{2 \rightarrow 2}x_{3 \rightarrow 2}}{1 +x_{2\rightarrow 1}x_{3 \rightarrow 1} + x_{2 \rightarrow 2}x_{3 \rightarrow 2} }$$ subject to the constraint $$x_{1\rightarrow 1}^2 + x_{1 \rightarrow 2}^2 + x_{2 \rightarrow 1}^2 + x_{2 \rightarrow 2}^2 + x_{3 \rightarrow 1}^2 + x_{3 \rightarrow 2}^2 = 1.$$ This is a ***constrained optimization*** problem on a ***manifold***. It is similar principle to something that might be encountered in a multivariable Calculus class (though the notation, number of variables, and existence of denominators in the function to be optimized make it very challenging). The good news is that there is both extensive matematical literature and some nice programs focused on such problems. We analyzed many instances of these problems using the computational tool [pymanopt](https://www.pymanopt.org) and theoretical tools from textbooks like [Optimization In Theory and Practice, by Forst and Hoffman](https://www.springer.com/gp/book/9780387789767). In the end, we were able to come up with a solution to the general problem -- "What is the optimal strategy for extracting matching funds, and what is the most matching funds it can extract?" ## What is the "Best" Strategic Attack? How much can it make? Suppose that we have a collection of accounts (either real or fake) have a specified budget (total amount of resources that they could allocate). Let's say that there are $x$ accounts who have a total budget of $B$ to give to a grant $F$. Under the CLR framework, what is the maximum amount of matching funds they could attract using this budget of $B$? In reality, this will depend not only on this grant, but the other grants as well -- each grant will be given a raw matching amount based solely on the CLR formula, but the final amount will be scaled proportionally to the raw amounts of all the grants. However, our strategic accounts have no control over what other grants give, so their best option is simply to maximize the raw amount. :bulb: **Major Idea:** The maximum raw amount of matching funds that a collection of $x$ users with a budget of $B$ can get for a grant is $$ \frac{x(x-1)B}{2(x+B)}. $$ Notice that as $x$ increases, this quantity can increase without bound. There was some fairly heavy mathematical lifting involved in getting to this formula -- we will save the details for another day. :bulb: **Major Idea:** If there is no lower limit placed on the donation amount, and no way to prevent the creation of fake accounts, then there is no limit to how much these users could theoretically earn in raw matching funds. :bulb: **Major Idea:** The creation of new grants will not increase the amount that strategic play can extract from the system. If their total matching amounts are pooled together, multiple grants can earn no more than a single grant. **Strategic users who want to get as much as possible in matching funds will do so by creating many accounts that make small donations in equal amounts to a single grant.** ## How Might Strategic Play Be Recognized? How Might It Be Disguised? :bulb: **Major Insight: Strategic Donations will look like complete bipartite subgraphs -- all of the edges from a set of users will go to a particular subgrant or set of subgrants.** Before funds are allocated from GitCoin Grants, the matching funds are reviewed by a governance committee that [issues reports before the final allocations of funds, disqualifying grants that are clearly engaging in undesirable behavior](https://gitcoin.co/blog/gitcoin-grants-round-9-governance-brief/). A mathematically optimal strategy (lots of accounts with many small donations ta single grant) runs the risk of being recognized and thwarted by the final human decision-makers. Dishonest actors who are intelligent enough to solve the optimization problem described above will probably realize this. Thus, we would expect to see "*meta-strategies*" that make trade-offs, willing to reduce the total amount extracted from the system in exchange for a better chance of eluding detection. One potential tool for filtering unfair strategic actions would be to formulate and analyze subgraphs in the original bipartite graph based on some metric. **We want to point out that any metric based on edges in the original graph is susceptible to being "gamed" -- a strategic player can make small camofluage contributions to create new edges with authentic grants, thus distorting network-based analysis. Any attempt at network-based analysis will need to reckon with how to make this counter-strategy ineffective.** ## Thoughts and Future Directions There are many different mathematical ideas and tools relevant to GitCoin Grants and related CLR funding mechanisms. The answer isn't always deeper mathematics or more intricate algorithms. Although we are able to provide answers to some important questions, there are still many more to answer. It is crucial that a community decision to adjust a grant's matching funds be based on transparent protocols and procedures. In addition to solving the problems, we need to ensure that these solutions are understood by decision-makers and by the community at large.

Read more

Analysis of Impact Hours for Retweeting and Attending Meetings

Formulating the Optimal Subgraph as a Continuous Optimization Problem :octopus: