# Validator Economics: Variable minimum validator deposit size

## :bulb: Project Abstract

:::success
To help facilitate the feasibility of single slot finality (SSF) we need to address the challenge of a potentially unbounded number of validators operating on the Ethereum beacon chain.
:::

Without a bound, a situation could arise where the system cannot cope with the sheer number of validators. The most obvious way to prevent the system getting into such a state is to set an upper limit on the total number of validators in the system, so that the number of new validators joining is limited by this upper bound.

:::success
Therefore, the overarching research question motivating this grant application is: **What is the preferred strategy to cap validator set size?**
:::

## :pushpin: Objectives

Vitalik proposed several strategies for capping the validator set ([see blog post](https://notes.ethereum.org/@vbuterin/single_slot_finality)) and discusses the pros and cons of each. This project will focus on one of the proposed strategies, viz. the **variable minimum validator balance**, to achieve four key objectives:

:::success
#### :small_blue_diamond: In-depth knowledge of the potential impacts on the wider ecosystem
#### :small_blue_diamond: Identification of risks and attack scenarios
#### :small_blue_diamond: Identification of potential mitigations of risks resulting in a slightly altered proposal
#### :small_blue_diamond: Recommendation of the suitability of the proposed strategy
:::

## :dart: Outcomes

:::success
**Research into the validator economics aspect of SSF brings the Ethereum ecosystem a step closer to single-slot finality.**
:::

This project will shed light on one of the current proposals to cap validator set size, providing a detailed analysis of the **potential security implications, risk profile, attack scenarios**, and the **strengths and weaknesses** discovered in the *variable minimum validator deposit* proposal.
## :books: Grant Scope

:::success
The variable minimum validator deposit strategy to cap validator set size will be researched in this project, with the expected output being an **assessment of the feasibility of implementing this strategy**. Accompanying this assessment we expect to have at least one **academic paper** submitted to a journal and/or conference, the **model** that has been built and the **code** used to analyse the data.
:::

Initially, relevant data will be gathered to gain a current view of validators and stakers in the beacon chain. This data gathering step involves API calls to an archive node, a review of existing aggregations provided by others such as Rated Network and beaconcha.in, queries provided by Dune Analytics, and the option of writing bespoke SQL queries to extract the required data. The intention is to validate any intuitions or assumptions that underlie the proposed strategies outlined in Vitalik's blog post regarding the paths to single slot finality.

Once a more accurate view of the current beacon chain is available, with accompanying visualisations and analysis, the processes and variables affected by validators will be investigated with a view to building a model that represents these effects and assists further exploration. Any potential risks and attack scenarios will be discussed and, where possible, options to eliminate or mitigate these adverse consequences will be proposed.

## :busts_in_silhouette: Project team

:::success
The project team of three comprises a researcher and two expert collaborators.
:::

### :small_blue_diamond: Sandra Johnson (researcher):

Sandra is a principal researcher at ConsenSys Software and has a PhD in Environmental Statistics. She is a visiting fellow at the School of Mathematics, Queensland University of Technology (QUT), Brisbane. Prior to joining ConsenSys she worked as a Data Scientist at Flight Centre and as a Research Fellow in Applied Statistics at QUT.
Her research interests include data science technologies, decision making under uncertainty, Bayesian network modelling and statistical modelling applied to the Ethereum ecosystem.
[[Google scholar]](https://scholar.google.com.au/citations?hl=en&user=1gsap5oAAAAJ) [[LinkedIn]](https://www.linkedin.com/in/sandjohnson/)

### :small_blue_diamond: Kerrie Mengersen (expert collaborator):

Kerrie Mengersen is a Distinguished Professor of Statistics at QUT and the Director of the QUT Centre for Data Science (CDS). CDS encompasses around 180 researchers from across the University with expertise in all facets of data collection, curation, privacy, modelling, visualisation and analysis, underpinning applications in twelve key domains including health, business, digital systems and social systems. She is also a co-founder of the Australian Data Science Network (ADSN), which brings together 32 centres in data science across the country. This concentration of expertise will be available to the proposed project.

Dr Mengersen's research sits at the intersection of computational and applied statistics and machine learning, and focuses on developing ways to efficiently collect, analyse, share and trust diverse data sources. Her applied work focuses on health, environment and industry.
[[Google scholar]](https://scholar.google.com.au/citations?hl=en&user=eiD83s4AAAAJ) [[LinkedIn]](https://www.linkedin.com/in/kerrie-mengersen-197347208/)

### :small_blue_diamond: Patrick O'Callaghan (expert collaborator):

Patrick O’Callaghan is a mathematical economist with expertise in finance, machine learning and the mathematics of decisions and games under uncertainty. He has specialised in transforming sentiment data (e.g. binary data or rankings such as SERPs) into numerical representations (e.g. pricing data). He has worked at the University of Queensland in a research and teaching capacity since 2012, and holds a PhD from the University of Warwick, UK.
[[Google scholar]](https://scholar.google.com.au/citations?hl=en&user=BxTFW6oAAAAJ) [[LinkedIn]](https://www.linkedin.com/in/patrick-o-callaghan--path-doc/)

# :memo: Background

### :small_blue_diamond: Publications & Blog Posts

The publications and blog posts listed here do not relate directly to the proposed project. Rather, they demonstrate the research team's expertise in some of the approaches that will be used to conduct the research.

#### Sandra Johnson:

- [1] S. Johnson, D. Hyland-Wood, A. L. Madsen, and K. Mengersen, “Stateful to Stateless: Modelling Stateless Ethereum,” Electron. Proc. Theor. Comput. Sci., vol. 355, pp. 27–39, Mar. 2022. [Link to paper](https://arxiv.org/abs/2203.12435v1)
- [2] ConsenSys R&D blog post on Stateless Ethereum project (2021): [part one](https://consensys.net/blog/research-development/modelling-stateless-ethereum-a-journey-into-the-unknown/), [part two](https://consensys.net/blog/research-development/building-a-stateless-ethereum-model/), [part three](https://consensys.net/blog/research-development/measuring-the-health-of-the-ecosystem-in-a-stateless-ethereum/)
- [3] S. Johnson, B. Cristescu, J. T. Davis, D. W. Johnson, and K. Mengersen, Now You See Them, Soon You Won’t: Statistical and Mathematical Models for Cheetah Conservation Management. 2017.

#### Kerrie Mengersen:

- [1] J. Holloway-Brown, K. J. Helmstedt, and K. L. Mengersen, “Spatial Random Forest (S-RF): A random forest approach for spatially interpolating missing land-cover data with multiple classes,” Int. J. Remote Sens., vol. 42, no. 10, pp. 3756–3776, May 2021. [Link to abstract](https://www.tandfonline.com/doi/abs/10.1080/01431161.2021.1881183?journalCode=tres20)
- [2] F. Jahan, E. W. Duncan, S. M. Cramb, P. D. Baade, and K. L. Mengersen, “Multivariate Bayesian meta-analysis: joint modelling of multiple cancer types using summary statistics,” Int. J. Health Geogr., vol. 19, no. 1, p. 42, 2020.
[Link to paper](https://ij-healthgeographics.biomedcentral.com/articles/10.1186/s12942-020-00234-0)
- [3] J. Davis, K. Good, V. Hunter, S. Johnson, and K. L. Mengersen, “Bayesian Networks for Understanding Human-Wildlife Conflict in Conservation,” in Case Studies in Applied Bayesian Data Science: CIRM Jean-Morlet Chair, Fall 2018, K. L. Mengersen, P. Pudlo, and C. P. Robert, Eds. Cham: Springer International Publishing, 2020, pp. 347–370. [Link to paper](https://link.springer.com/epdf/10.1007/978-3-030-42553-1_14?sharing_token=JLzZMwHIackGAhn5ARin-Pe4RwlQNchNByi7wbcMAY48SE14Zzb555HgLhh2BhFXdmUQIXKvhc7WIDstKnxSCSxuE3FVdb1n19m866v6-LyRzKtEnuezOgPMtmV3zI-7Ark9MzJKFr3SvUgSZKH__kIY4AJlCE7ngHUBCXbUxP8=)

#### Patrick O'Callaghan:

- [1] P. H. O’Callaghan, “Axioms for parametric continuity of utility when the topology is coarse,” J. Math. Econ., vol. 72, pp. 88–94, 2017. [Link to abstract](https://www.sciencedirect.com/science/article/abs/pii/S0304406816300775?via%3Dihub)
- [2] P. H. O’Callaghan, “Second-order Inductive Inference: an axiomatic approach.” arXiv, 2019. [Link to paper](https://arxiv.org/abs/1904.02934)
- [3] S. Grant, J. Kline, P. O’Callaghan, and J. Quiggin, “Sub-models for interactive unawareness,” Theory Decis., vol. 79, no. 4, pp. 601–613, Dec. 2015. [Link to abstract](https://ideas.repec.org/a/kap/theord/v79y2015i4p601-613.html)

# :bar_chart: Methodology

The research objectives will be achieved by following a logical progression from data analysis to statistical modelling, evaluation and conclusions. The steps we plan to take are outlined in more detail in the *Timeline* section below. At each stage, the current outcomes will be assessed in the context of the overall goal, and we intend to stay in touch with the Robust Incentives Group (RIG) at the EF on a regular basis to ensure that our research complements the work that they are undertaking.
Kerrie and Sandra have worked on many and varied research projects over the years and have published several papers together, a small number of which are noted in the previous *Background* section. Patrick has not previously been part of these research projects, but he brings vital expertise to the team through his work in economic mathematical modelling and game theory. Patrick and Sandra have previously connected to discuss a journal special issue that she and David Hyland-Wood co-edited on *Blockchain Consensus Protocols*.

# :timer_clock: Timeline

| **Milestone** | **Expected deliverable** | **Funding Notes** |
|:----------------:|:------------:|:------------:|
| Identify relevant data insights that will provide a deeper understanding of the current situation and any intuitions or assumptions used in formulating proposed solutions. | Document the information that needs to be visualised to provide the required data insights. | Researcher & collaborator time to achieve milestone and deliverable. |
| Data gathering / wrangling | Document describing progress, procedures & visualisations. | Researcher time to 1) extract data from archive node, 2) query other data sources and 3) clean & summarise data for visualisations. |
| Exploratory data analysis and visualisation | Publish the data analysis and visualisation code on GitHub. | Researcher time to code and publish on GitHub. Collaborator time to brainstorm, discuss findings to date and review the exploratory data analysis. |
| Detailed breakdown of the variable minimum validator balance proposal | Identification of variables, impacts and possible knock-on effects resulting from the introduction of this strategy. Potential paper or blog post for community input and comments. | Researcher and collaborator time to brainstorm & examine the strategy in detail. |
| Statistical/mathematical model of the variable minimum balance strategy | Statistical model published. Conference or journal paper to describe progress to date. | Researcher and collaborator time to build model. |
| Identify risks, edge cases & attack scenarios | Use model to explore extreme scenarios and document outcomes. | Researcher & collaborator time as required. |
| Identify any major and minor issues with the proposal, and variations on the initial proposal that could mitigate or eliminate those problems. | Write project up as a journal article and/or conference paper. | Researcher & collaborator time to write paper. Submission costs to journal, and/or conference registration costs if paper accepted. |

## :money_with_wings: Budget

The estimated costs to fund this grant proposal for a 6-month project are:

### :small_blue_diamond: Principal Researcher Costs:

- Primary researcher: (US$22,000.00 * 0.75) * 6 = US$99,000.00
- Collaborators: (US$22,000.00 * 0.05) * 6 * 2 = US$13,200.00

### :small_blue_diamond: Indirect costs:

- Publication fees in journals - US$2,000.00 (to pay for open access)
- Conference registration for accepted paper(s) - US$1,000.00

:::success
**Total funding request: US$115,200.00**
:::
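
The budget arithmetic can be re-checked with a minimal sketch such as the one below. The figures are copied from the budget lines; the reading of each factor (a US$22,000 monthly rate, a percentage of time, a number of months and a headcount) is our assumption, and the helper `line_cost` is a hypothetical name introduced only for this illustration.

```python
# Re-computation of the budget lines using whole-dollar integer arithmetic.
# ASSUMPTION: 22,000 is a monthly rate, 0.75 and 0.05 are fractions of time,
# 6 is the number of months and 2 is the number of collaborators.

MONTHLY_RATE_USD = 22_000
PROJECT_MONTHS = 6


def line_cost(rate: int, percent_time: int, months: int, people: int = 1) -> int:
    """Cost of one staffing line: rate x time fraction x months x headcount."""
    return rate * percent_time * months * people // 100


primary = line_cost(MONTHLY_RATE_USD, 75, PROJECT_MONTHS)
collaborators = line_cost(MONTHLY_RATE_USD, 5, PROJECT_MONTHS, people=2)
publication_fees = 2_000
conference_registration = 1_000

total = primary + collaborators + publication_fees + conference_registration

print(f"Primary researcher: US${primary:,}")
print(f"Collaborators:      US${collaborators:,}")
print(f"Indirect costs:     US${publication_fees + conference_registration:,}")
print(f"Total request:      US${total:,}")
```

Percentages are held as integers so that the sums stay exact and can be compared directly with the stated amounts.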