--- title: "Anoma open research questions for Research Day NYC" author: researcherz --- ## Anoma open research questions for Research Day NYC - P2P networking - On the boundary between structured and unstructured networks, where the preferred topology of the network is learned over time based on local information, preferences, and measurements, can we characterise the multivariable optimisation problem of bandwidth, latency, resiliency, privacy, etc. sufficiently clearly that specific routing algorithms can be automatically found and updated with reinforcement learning or similar techniques instead of programming them in as heuristics or (computationally expensive) search problems? - What routing kudo (reward) systems are both incentive-compatible (encourage routing) and capture resistant (don't reward particular peers for capturing the network)? How do the possible mechanisms involving pre-commitment (commitment to pay for successful routing) vs. only paying post-route (with a secondary reputation metric) compare? Is full anti-Sybil necessary for a safe incentive system, or can we make do with something less? - Distributed systems - Which consensus algorithms are most suitable for the "slow game", where we want to be able to run consensus on long timescales which can incorporate user preferences, doesn't rely on a proposer, and supports large numbers (billions) of voters? - Can we make randomized consensus protocols {LINK} heterogeneous, in the style of Heterogenous Paxos and the rest of the Typhon protocol stack? - If we take distributed storage and compute to be distributed cache locality optimisation problems subject to the additional constraint of trust and an unknown & evolving physical network topology, can we characterise the optimisation space cleanly and craft algorithms for dynamically selecting and reselecting storage and compute locations according to the best guess as to the underlying network topology? Can this be done with only local information in a way which composes well? - Programming language theory - When comparing circuit compilation approaches, there's a trade-off axis between VM-based execution, which adds overhead but allows proofs to be made only for the actual execution path, and direct circuit compilation, which is lower overhead but requires that proofs be made for all possible execution paths (branches). In principle, it is possible to combine these approaches such that parts of a program (sub-programs) are compiled directly to circuits, while parts are interpreted. What does the optimisation space look like for this compilation approach? - With a suitably abstract intermediate representation such as [GEB](https://github.com/anoma/geb), if we compile programs either directly to circuits with known cost models (costs for addition gates, multiplication gates, custom gates, depth, height, etc.) or to VMs with known cost models, can we characterise the optimisation problem cleanly enough that it can be handed off to existing NP optimisation techniques (SAT solvers, simulated annealing, learning algorithms, etc.)? How does this interact with translating between representations with different carrier fields (e.g. curves for a ZK proof system, binary for typical computers)? - Can we adapt and extend current mathematical techniques for reasoning about the information flow properties of programs, such as hyperproperties {LINK}, to a sufficiently powerful setting (including e.g. 
- Cryptography
  - In the Anoma architecture, solvers must know some information about user preferences (the "what") in order to match intents. After intents have been matched, zero-knowledge proofs can be created for full privacy with respect to subsequent verifiers. In general, we are interested in different approaches to solver privacy (which can compose freely with each other as long as they produce Taiga-compliant outputs). A few particular directions of investigation (a toy sketch of the matching step itself follows this section):
    - Can we improve solver privacy by using TEEs? In particular, can we use clever techniques such as re-randomisation to run the solver algorithm in the TEE, then produce output information which can be used to make the requisite zero-knowledge proof, without losing privacy (subject to the typical TEE assumptions)?
    - Can we run some kind of MPC solver algorithm followed by a collaborative ZKP {LINK} to match intents in private, then generate the requisite zero-knowledge proof, without losing privacy (subject to some quorum honesty assumption)?
  - What is the frontier of witness encryption research? Relatively little effort and capital seem to have been devoted to the topic, given the potential applications (e.g. trust-minimized bridging, improved encrypted mempools, intent encryption to a possible counterparty). Is there any low-hanging fruit?
  - Similarly, what is the frontier of fully homomorphic encryption research? Are there known bounds on how (asymptotically) efficient FHE constructions can be, and why or why not? What are the areas theorists should focus on?
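As a point of reference for the solver-privacy questions above, here is a toy sketch of the matching computation itself, i.e. the "what" a solver currently has to see about user preferences. The intent format and the `match` function are invented simplifications (real intents and Taiga-compliant outputs look nothing like this); the open questions concern running something like this inside a TEE or an MPC while still producing data from which the requisite zero-knowledge proofs can be made.

```python
# Toy solver matching step (no privacy): find pairs of swap intents that are
# mutually satisfiable. The intent format here is invented for illustration and
# is much simpler than real Anoma/Taiga intents.

from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class Intent:
    owner: str
    give_asset: str
    give_amount: float
    want_asset: str
    min_want_amount: float  # user's limit: accept no less than this

def compatible(a: Intent, b: Intent) -> bool:
    """Two intents match if each offers the asset the other wants,
    in quantities satisfying both users' limits."""
    return (a.give_asset == b.want_asset
            and b.give_asset == a.want_asset
            and a.give_amount >= b.min_want_amount
            and b.give_amount >= a.min_want_amount)

def match(intents: List[Intent]) -> Optional[Tuple[Intent, Intent]]:
    """Naive O(n^2) search over the intent pool; this is the computation the
    solver performs and therefore the information it currently has to see."""
    for i, a in enumerate(intents):
        for b in intents[i + 1:]:
            if compatible(a, b):
                return a, b
    return None

if __name__ == "__main__":
    pool = [
        Intent("alice", "X", 10.0, "Y", 5.0),
        Intent("bob",   "Y", 6.0,  "X", 9.0),
        Intent("carol", "Z", 3.0,  "X", 1.0),
    ]
    print(match(pool))  # -> (alice's intent, bob's intent)
```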
- Cybernetics
  - The physical world - say, of energy and material flows - is complex and non-linear. If we want to model it accurately with digital abstractions such as tokens, we need to be able to capture the physical relations involved, at least to a certain degree of granularity. Specifically, physical systems are typically _not_ freely exchangeable in the same sense as (abstract) tokens - there is some cost (energy loss) to conversion, many conversions are irreversible, and conversions affect other parameters of the system beyond just the atomic units being converted. How can we model these complex relations (in between "soulbound" tokens and typical transferable tokens), and in particular, how can we craft a system such that it can _learn_ the relations between the physical variables over time (and enforce these constraints on the evolution of the digital representation)? (A toy conversion model is sketched at the end of this section.)
  - Mark-to-market accounting fails to capture the possible evolutions of a system away from the current equilibrium, in part because it commits the fallacy of observer-independence (in order to actually convert, one must buy or sell, which may induce other participants to act as well, etc.) and in part because it neglects non-equilibrium events (such as an external stimulus inducing many coordinated actions). Can we come up with better standards for risk accounting (e.g. as might be used in something like Compound) which don't make such simplifying assumptions, and which better capture the interdependent nature of a complex system?
  - When crafting a model to fit a world, and optimising actions taken in the world based on that model, it is crucial to avoid over-fitting the model to the world, and consequently taking actions in the world on the basis of randomness or anomalies in the model. How can we craft multi-agent systems (such as financial systems) which account for the possibility of over-fitting and limit themselves accordingly? One promising direction is limited-granularity accounting, where the system deliberately limits precision such that discrepancies below some "noise level" don't result in any systemic incentive change (a minimal sketch follows below).
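Returning to the first cybernetics question, the following is a minimal sketch of a ledger whose conversions are lossy, state-coupled, and not automatically reversible. The resources, efficiency numbers, and the `PhysicalLedger`/`ConversionRule` types are all made up for illustration; the hard part the question asks about - learning these relations from observation over time and enforcing them on the digital representation - is deliberately left out.

```python
# Toy model of non-freely-exchangeable "physical" tokens: conversions have an
# efficiency < 1 (energy loss), may be irreversible, and affect other state
# variables. Resources, rates, and side effects are invented for illustration.

from dataclasses import dataclass, field
from typing import Dict, Tuple

@dataclass
class ConversionRule:
    src: str
    dst: str
    efficiency: float  # fraction of the input actually realised as output
    reversible: bool

@dataclass
class PhysicalLedger:
    balances: Dict[str, float]
    cumulative_losses: float = 0.0  # losses don't vanish, they accumulate
    rules: Dict[Tuple[str, str], ConversionRule] = field(default_factory=dict)

    def add_rule(self, rule: ConversionRule) -> None:
        self.rules[(rule.src, rule.dst)] = rule
        if rule.reversible:
            # the reverse direction also loses energy: reversal is not a free undo
            self.rules[(rule.dst, rule.src)] = ConversionRule(
                rule.dst, rule.src, rule.efficiency, True)

    def convert(self, src: str, dst: str, amount: float) -> None:
        rule = self.rules.get((src, dst))
        if rule is None:
            raise ValueError(f"conversion {src} -> {dst} is not permitted")
        if self.balances.get(src, 0.0) < amount:
            raise ValueError("insufficient balance")
        out = amount * rule.efficiency
        self.balances[src] -= amount
        self.balances[dst] = self.balances.get(dst, 0.0) + out
        self.cumulative_losses += amount - out

if __name__ == "__main__":
    ledger = PhysicalLedger(balances={"electricity_kWh": 100.0, "battery_kWh": 0.0})
    # Charging a battery is modelled as a reversible but lossy conversion.
    ledger.add_rule(ConversionRule("electricity_kWh", "battery_kWh",
                                   efficiency=0.9, reversible=True))
    ledger.convert("electricity_kWh", "battery_kWh", 50.0)  # 50 in, 45 stored
    ledger.convert("battery_kWh", "electricity_kWh", 45.0)  # round trip loses 19%
    print(ledger.balances, "losses:", ledger.cumulative_losses)
    # A direction with no registered rule (e.g. recovering electricity from
    # waste heat) is simply refused by the digital representation.
```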
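And for the limited-granularity direction in the last bullet, here is a minimal sketch of what "limiting precision on purpose" could mean: values are quantised to an assumed noise tick before anyone can react to them, so sub-noise discrepancies produce no observable change and hence no incentive. The tick size and the rounding scheme are assumptions, not a worked-out mechanism.

```python
# Minimal sketch of limited-granularity accounting: the system only ever
# exposes values at a coarse "tick" resolution, so fluctuations below an
# assumed noise level never change what agents observe and therefore cannot
# create a systemic incentive to act. Tick size and example numbers are invented.

NOISE_TICK = 0.01  # assumed noise level: differences below 1% are treated as noise

def to_ticks(value: float, tick: float = NOISE_TICK) -> int:
    """Quantise a raw measurement to a whole number of ticks before reporting it."""
    return round(value / tick)

def incentive_changed(old_raw: float, new_raw: float) -> bool:
    """Agents only react if the reported (quantised) value actually moved."""
    return to_ticks(old_raw) != to_ticks(new_raw)

if __name__ == "__main__":
    price = 1.0042
    wiggle = 1.0031  # sub-noise fluctuation: same reported tick, no reaction
    jump = 1.0213    # a real move: the reported tick changes
    print(to_ticks(price), to_ticks(wiggle), to_ticks(jump))  # 100 100 102
    print(incentive_changed(price, wiggle))                   # False
    print(incentive_changed(price, jump))                     # True
```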