### Attendees
-
- Deger
- AI Objectives Institute
- Rob
- Startup founder
- Mirona
- Returning to technical roles and interested to investigate AI
- Mike
- More positive-sum games
- Vincent
- Concerned about centralisation and lack of diversity in AI
- Wants to explore alternative and decentralised pathways for both alignment and capabilities work
- Interested in novel mechanism design
- _Paul? (might be wrong name)_
- Very little exposure to AI, but learning more
- Interested in the long-term perspective on AI as a father
- George
- Twofold interest - equal access to technology between communities, with AI aligned to community values; and how to use crypto coordination technologies to align adversarial behaviour (in a non-foom world at least)
- Afra
- Concerned about emergence of AGI (based on MS paper about emergence of AGI from GPT-4)
- Science fictional ideas are becoming real
- Lots of ideas about MEV and on-chain smart agents could go wrong
- Christian
- Engineer working on AI, first in computer vision and now in biotech
- Also concerned about centralisation, especially with closed-source
- Sees power of blockchain incentives to solve problems
- Dan
- PhD student studying incentive design in crypto
- Interested in exploits and failures in mechanism design
- Feeling out the connection to AI alignment
- Nima
- Investing and incubating DAOs and infrastructure for decentralised collaboration
- Studied cognitive science and AI
- Early investor in Stability, figuring out whether there is a role for DAOs in AI
- Primavera
- Lawyer and researcher into legal challenges and governance opportunities of blockchain technologies
- Wants to understand the interface between AI and copyright
- How we can use IP and patents to enforce openness in the viral/copyleft sense
- More afraid of corporate AI than AGI
- Michael
- Used to run Singularity Institute
- Observes that AIs are in a better position to be explicit about their incentives than humans are
- Observes failure of Singularity Institute to plan correctly/follow plans at the point at which powerful AI became a realistic prospect
- Jessica
- Old-timey AI safety researcher
- Observed similar dynamics to Michael
- Doesn't buy the claim that LLMs will kill everyone - sees neither a path to superintelligence or aligning superintelligence, and is more concerned with how we can use current tech to improve society
- Sxysun
- MEV researcher (MEV is negative externalities from deficiencies in mechanism design in coordination)
- Mechanism design is only for lawfulness (not goodness?)
- _Eren? didn't catch the name properly_
- Building extended version of ERC-4627
- Also working on MEV
- As a smart contract wallet developer, interested in the interfaces for AI and how they will be improved
- Yihan
- Building AI tutor for critical thinking
- AIs and humans working together to be smarter
- Holke
- Political economist and research scientist at Protocol Labs and Hypercerts Foundation
- Impact certificates and how we can fund the things we want to achieve
### Discussion
Deger: What experiments have we done with different takes on value and incentive alignment?
Nima: Defi 3.0 will be mostly about AI agents, so we need legal frameworks for AIs to operate in financial markets. Against giving personhood, but pseudo-personhood might make sense? Something similar to corporate personhood.
Mike: Right to see AI agents as a threat. It's also a big opportunity. Civilisation rises when we have positive-sum games, and we can use crypto to generate positive-sum games involving AI. New games open up when an AI can disclose the constitution and prompt it is using.
Michael: Constitution for AI means a set of attractors and repulsors in high-dimensional space. The concept of boundaries is important in constitutional law, but is very hard to apply in the high-dimensional space of a transformer network that makes up an LLM - there can only be conditioning.
Deger: Can crypto help us to have any useful guarantees about how an LLM has been trained or validated/tested? Can we prove that it has passed an evaluation?
Michael: "Ideological" AI is a better translation for the concept that AI uses "constitution" for.
Deger: Other end of the spectrum is rule-based system for credible commitment. This is very much unlike that.
Sxysun: Crypto acts as a permissionless credible commitment device. MEV bots are algorithmic agents that use the blockchain to satisfy their needs, and their only objective is to optimise their ETH balance. This corresponds with some scenarios that alignment researchers are concerned with: a powerful optimiser that is out there in the world and playing games which involve behaviours that are undesirable from a human perspective. Suggests that we increase the expressiveness of the language that agents can use to coordinate.
Jessica: If AI systems are not reliable, is there any way we can verify a system that includes both deep learning and a state-based system? Can we use techniques from reliability engineering, such as tests designed to create hard or difficult conditions? Can we use red-teaming to test AI?
Deger: Can we create shared evals?
Jessica: Starting with simple cases such as prompt injection attacks might allow us to progress to more complex things later.
Deger: How can we design an incentive structure to evaluate more complex AI? Rewards or bounties? Can Hypercerts help? Has anything happened in crypto which serves as an example, e.g. funding pools for bounties.
Syxsun requests AI example
Deger gives example
*** brief sidetrack ***
Scott: Suggests that we could replicate the Zuzalu funding track in Gitcoin for use in funding AI eval work. People create Hypercerts and using either 1P1V or quadratic models, we identify people who are interested in solving the problem. Effectively we end up with an RFP process where skilled evaluators identify teams who seem likely to useful work. Alternatively a CTF approach can work where there is some target to be attacked.
Holge: Hypercerts can be used for accounting of all contributions, even in cases where there is a complex puzzle to be assembled. We can also use retrospective funding in a way that is credible and transparent, where the accounting system can be known beforehand. S-process or other similar trusted processes can be used to evaluate the work done.
Deger: Is prospective or retrospective better? When?
Vincent: Quadratic funding in Gitcoin creates a real incentive for smaller funders to participate because they have an outsized influence over the total distribution of matching funds.
Scott: Exploration phase where there's lots of diversity is good for QF and retrospective. When you have a clearer idea of what you want, more expert decisions and prospective funding can be better. Notes that AI field is a matter of broad concern, so lots of people might be interested in donating and having influence over AI alignment.
Holke: From perspective of the funders, they might be worried about it but they don't know what to fund. Most funders might prefer to fund things through networks of trust. S-process can help with this due to the role of recommenders creating a layer of trust. Later on, funders learn whether the recommendations were good and can change their preference of recommender for later funding rounds. Argues that the community might not be the best judge of what to do right now.
Jessica: It's much easier to work out if something worked in retrospect. Cites Robin Hanson on the importance of rewarding things that are shown to be good after the fact. Has participated in S-process as both a granter and receiver, and thinks it's a terrible process. The process is opaque and unaccountable and spreads responsibility by design. Argues that it's a bad example of a process to copy, and has seen mistaken uses of funds resulting from it.
Vincent: Argues that EA examples are flawed because of the centralisation in the EA community, but that in a more diverse community the process could work better.
Holke: Agrees that S-process does not mandate transparency, but there are ways to do it with much greater transparency. Hypercerts mean that there is a more direct connection between funders and the things being funded, creating some traceability.
Ehan: Makes the point that retroactive funding sometimes rewards things that were simply lucky.
Jessica/Vincent/Holke discuss tweaks to the S-process approach to make it more transparent.
Deger: Raises the concept of patent pools, and also which technologies we ought to build first
Vincent: Argues that funders might want broad exposure to lots of things, including long-shots that create enormous value.
Deger: Asks for suggestions for what AI technologies ought to be prioritised, e.g. within OAA
Vincent: Suggests short-term risks are of greatest concern rather than theoretical problems
Scott: Interpretability is important, cites Scott Aaronson's talk mentioning watermarks
Michael: Questions connection between interpretability and watermarks
Scott: Explains being able to verify the behaviour of the AI as compared to evals or with respect to their constitution
Christian: Cites Yann Le Cun on interpretability and argues that it is in principle tractable, and that with work it could be done
Michael: Interpretability is probably more important for capabilities than for alignment, though for language models it's actually better to get more capabilities, so this might be good!
Rob: Could we replicate OpenAI's work on interpretability?
Vincent: Doing so might be risky in open source
Jessica: Believes that AI ought to be aligned to truth, rather than to corporate values, and sees un-crippled AI as being good. Happy to take the risk of releasing un-RLHF'ed models openly.
Vincent: Sees most LLM misuse as stemming from open source LLMs, and so making this easier is bad.
Jessica and Vincent back-and-forth
George: Asks Jessica to steelman her opponents
Jessica: Gives the example of how LLMs could give bomb-making instructions, but argues that demand for this is actually low, and maximally-bad intent is rare in humans. Also argues that we need to make other systems more resilient.
Deger: How can LLMs help with collective coordination, individual autonomy and sovereignty? Can we fund both open-source and for-profit using the same funding structure, provided the benefit is being created?
Rob: If we measure impact then it doesn't matter if it's for-profit or non-profit.
Holke: Brings up question of taxes, but broadly agrees.
Vincent: Funders want upside in for-profit entities, especially if they're the first money in.
Deger: Asks for further suggestions
George: Gives example of how NFTs can offer upside when equity is not on the table
Deger: What conditions on funding make sense?
Vincent: Corporate structure of OpenAI and Anthropic are benefit-corps. We might want to be able to restrict funding to public-benefit corporations in order to avoid fiduciary duties constraining behaviour.
Mike: Takes us back to the core AI/crypto mashup question. What affordances, primitives can crypto build for AI alignment? ZK proofs, on-chain commitment devices. AI companies would need to be excited by this. Second thread is AI constitutions: if this is the future then how do we make good ones?
Deger: Is it technically feasible to use ZK-proofs and state-based systems to allow AI agents to prove the constitution they were derived from?
George: Questions whether capabilities are yet sufficient to make this necessary
Deger: Re-asks the question
Sxysun: Only way we could do this is to put a hash of the model on-chain?
Dan: Sounds hard to me
Deger: Thinks it's sufficiently important to be worth pursuing