Creating fog of war for Network States 🏰

Building the first true Fog of War (FoW) for on-chain games. ## What's Network States? <figure style="width:100%;margin:0px"> <img src="https://hackmd.io/_uploads/SJzskzjKh.jpg" style="width:100%"/> <figcaption align = "center">A snapshot from a <a href="https://generals.io/">generals.io</a>, the game that inspired Network States. </figcaption> </figure> ### A brief overview of game mechanics Network States is played on a 2D grid. You're randomly spawned onto the grid with one tile that generates troops. The objective is to move these troops around to conquer as much of the grid area as you can. As you expand your lands, your view of the map widens. Every tile you conquer allows you to see what is happening in the 8 neighboring tiles. In the image above, we see the game as viewed from blue's POV. Notice that blue can see one tile out from all blue lands, while everything else in the map is shrouded in FoW. Whenever we mention "private state" in this doc, it refers to 1) who occupies the tile and 2) how many resources exist are kept at that tile. As you expand your lands, you eventually encounter other players doing the same. Your armies clash as you fight for territory. The game this is based on, [generals.io](https://generals.io/), is ridiculously fun. I had close to a hundred games logged in [my first two weeks of playing](https://generals.io/profiles/lyronctk). ### Today's Network States is played with full information The [Network States team](https://twitter.com/0xsmallbrain/status/1660459523771973632) has most of the game logic / graphics implemented in MUD. They already did a successful playtest that featured an autonomous narrative via GPT-4 integration. What's missing is the FoW. It seemed impossible to construct for the longest time. Regardless, the importance of FoW in core gameplay made it worthwhile to continue devoting engineering resources. We now have a pragmatic solution with acceptable trust assumptions. This is what we'll go over today. ## Why current constructions are insufficient for fog of war <figure style="width:100%;margin:0px"> <img src="https://hackmd.io/_uploads/rJijy4oY3.jpg" style="width:100%"/> <figcaption align = "center">A universe in <a href="https://zkga.me/">Dark Forest</a>, the hit fully on-chain game with temporary FoW.</figcaption> </figure> ### Doesn't Dark Forest have FoW? Why can't we use that? The game features a temporary FoW (*does that make it Mist of War?*). Players are *meant* to dispel the FoW using a brilliant game mechanic- an [Explorer](https://github.com/darkforest-eth/darkforest-v0.6/blob/main/client/embedded_plugins/Remote-Explorer.ts). To put another way, the private state was secured by a function that was deliberately breakable. Case in point: motivated players capitalized on the feature to build [GPU miners](https://twitter.com/darkforest_eth/status/1429120719669981185?s=20) that revealed parts of the map 100x faster than normal. Perfect for Dark Forest, insufficient for us. We do not want players to be able to see the entire grid given enough motivation and compute. ### Pure ZK or FHE solutions are insufficient Persistent FoW cannot be built with a construction based purely on zero knowledge (ZK). Fully homomorphic encryption (FHE) is in a similar boat. The root cause is the same for both classes of solutions: you can't modify what you don't know. Suppose you hold some set of tiles as your private state. The commitment to this private state lives on-chain. For someone to encroach on your territory, you must transition this commitment. To do so using ZK, they must prove that they have the resources to conquer the tile. How do they construct a satisfying witness for the ZKP? It requires part of your private state- who owns the tile and how many resources are devoted to it. They don't have this, so they can't. Dark Forest got around this with the Explorer. Other players can eventually uncover your private state using a mechanism that's built into the game. We, along most other games, do not have an Explorer-esque mechanic as part of the core game loop. To do so using FHE, they must generate ciphertext to add to yours. How do they create compatible ciphertext? Again, they need the tile's owner and the number of resources devoted to it. They don't have this, so they can't. If someone could transition your private state commitment, then they somehow used parts of your private state. If they used parts of your private state in those protocols, then they can recover parts of your private state. If they have your private state, then it isn't really private state. Rough and informal, but we see why we need more than just ZK or FHE. ### MPC or ZK with economic incentives cannot save us today We need to compute a function jointly across multiple parties, each with their own private state. The natural solution is Multi-Pary Computation (MPC). This indeed works in theory. In fact, there's a fast [malicious MPC protocol based on oblivious PRFs ](https://eprint.iacr.org/2021/883.pdf) that would work well for a Network States esque function. We would then have some economic incentive- likely in the form of staked assets- to encourage continuous participation. Economic incentives can even be used with ZK to have the same effect. This is demonstrated in [ZKHunt](https://www.youtube.com/watch?v=JKyUV1s2O08&t=443s), an excellent first exploration into collaborative private state for fully on-chain games. When applied to AWs, however, solutions using MPC or ZK with economic incentives do not meet the following requirements: 1. Malicious actors can't have the ability to break the world's persistence. A player who eats the slashing penalty by quitting the game without cleaning up private state would render their tiles unplayable forever. 2. Computation and communication must scale sub-linearly with the number of players. This is especially important for MMO games that feature concurrent play on the same instance. Many AW designs fall under this category, with Network States being one of them. Some MPC protocols scale linearly, though most are quadratic. ZK + economic incentives are also quadratic. None of the options are sub-linear. 3. The game must accommodate asynchronous play. Restricting to synchronous play significantly limits the design space for games. Players should be able to log off and log back on again later. This isn't possible if they need to stay online to participate in the protocol. ## ABE + ZK = a pragmatic solution ### How we can satisfy all the requirements [Attribute-based encryption (ABE)](https://ntt-research.com/ntt-research-cis-cryptography-attribute-based-encryption/#:~:text=Attribute%2Dbased%20encryption%20provides%20the,and%20multi%2Dauthority%20access%20policies.) splits data into two parts, the index and the payload. You can encrypt your data under ABE and broadcast your key with a predicate. If the predicate applied to the index is ever true, anyone can decrypt the payload. Why would ABE be useful? We can use tile number as the index and private state as the payload. The key will then be matched with the predicate "decrypt the payload if you prove you are close to this cell." Given this primitive, the idea is simple. You encrypt your private state with ABE whenever you move. You then make the cipher text and key available to everyone. Those who prove they have the right to view your private state by satisfying the predicate of "being close by" can decrypt. Once those close by can view your private state, they can create a ZKP to transition the commitment. The protocol satisfies all of our requirements. Number 1, it doesn't matter if a player abruptly quits since anyone else can decrypt their tiles later. Number 2, a player now only interacts with cipher text, so there's no longer a need to interact with every other player before moving. Number 3, players are free to come and go as they please and the game will still continue to run. ### Using trusted hardware for ABE We see that ABE is the most promising way to create true FoW. Our ABE implementation will be done via Nitro Enclaves. This comes with significant trust assumptions on the hardware. For this reason, we've taken the following steps in protocol design to mitigate risks. 1. The enclave will only be used for ABE. All game logic will still be conducted on-chain or in a ZKP. The enclave will have no priveleged access to functions that manipulate game state. 2. Using Nitro Enclaves instead of SGX. These enclaves run on their own card. There are no untrusted cores running on the same processor that can mount side-channel attacks or inspect shared / stale caches. 3. We can boost security by maintaining a threshold decryption network between multiple enclaves. Any attackers would need to breach at least a threshold number of systems to access the private data. 4. Restricting enclave functions to ABE means we do not need to upgrade enclave code once the primitive is done. Upgrades on TEEs, and in general programs, are notoriously dangerous. These go a long way for protecting game integrity, but a few risks are still outstanding: 1. Nitro Enclaves have not been broken since their release in 2020, but that is likely due to AWS being the only party with access to the chips. The story may have been different if anyone was granted physical access. 2. A data breach isn't clearly falsifiable. Though manipulating game state wouldn't be possible, one party with asymmetric information could play according to the rules and win without drumming up suspicion. In this case, FoW is not a strict benefit to the game. A breach wouldn't revert the game back to its original state of full information. Instead, one party would be left with an advantage while the others still operate on the assumption that everyone has limited vision. We'd love to discuss strategies to further de-risk this system until we're able to transition to one that's secured end-to-end with cryptography. ### A cryptographic solution for ABE There are a number of [lattice-based constructions](https://www.sciencedirect.com/science/article/abs/pii/S1574013721000770) for ABE today. They're promising, but too slow for practical builds. A few efforts are currently underway to make ABE much faster. Once these are released, we can swap out enclave ABE with the crytographic equivalent. We're optimistic that we can have a protocol in the short-term that's only dependent on battle tested cryptographic assumptions. One caveat here is that we are technically having the enclave conduct private index ABE. There are ways to modify the protocol to remove the index hiding dependency when we do the swap. ## Describing the system in its entirety ![]() <figure style="width:100%;margin:0px"> <img src="https://hackmd.io/_uploads/BJDr87152.png" style="width:100%"/> <figcaption align = "center">Flow for inducing state transitions. Notice this is a closed loop which sets up subsequent moves without depending on continuous participation from the originating client.</figcaption> </figure> ### The chain, the client, and the enclave There are three relevant parties: the chain, the client, and the enclave. The chain holds the commitment to global private state in a Merkle root. The client interacts with 1) the enclave via a secure channel to see neighboring tiles and 2) with the chain via ZKPs to transition global private state. The enclave is only for ABE. With this system, decrypting parts of global private state is quick. Step 1, send the satisfied predicate (ie "I conquered a neighboring tile", a predicate that involves running a light client) to the enclave. Step 2, receive the private state. The knowledge from decryption is used to move. This begins with private information for the move being communicated to the enclave. The enclave signs the blinded private state to acknowledge reception. Next, the client produces a ZKP that attests to the validity of the move and a corresponding nullifier to invalidate old state. The ZKP, signature, blinded move, and nullifier is then published on-chain, where the contract updates the Merkle root. Decrypt and move are powerful together. They create the FoW we've been working toward. As with most solutions, the gist is simple but the details are many. See the section *Here's the exact protocol* for more. ### No locks or ordering implications Our construction is designed to avoid caveats for transaction flow. For one, there are no locks. Multiple players can move in the same block. One player can move multiple times in the same block. Different actions can attempt to operate on the same tile within the same block. All of these cases proceed as before FoW implementation due to our `MerkleTreeWithHistory` + nullifier mechanism. Another notable feature is that transaction ordering remains the sole domain of the block builder. The enclave has no power over ordering. There's no additional surface area for MEV to detract from the player experience. ### Still very much a fully on-chain world All properties of autonomous worlds that we can think of are preserved with this construction. Permissionless enclaves that can sync with the main enclave ensures persistence. The source of truth for game-state still lives in the contract, facilitating composability. ZKPs for state transitions enforce fidelity, allowing players to put real value into the game. ### The bigger picture: collaborative private state Let's look at what we are building more abstractly for a moment. Conditionally revealing others' secrets, then using them to roll forward global private state via ZKPs and nullifiers, is essentially *collaborative private state*. Every on-chain game, every blockchain application that has been deployed to date has either featured public state or isolated private state. In the world of isolated private state, users have been able to edit their own chunk, but never reach into someone else's. Network States is acting as an experiment for a fundamentally new type of interaction. It unlocks an entire class of game mechanics that stretch much further than positional FoW. That's something cool to explore in the future though. For now, let's spin up this world. ## Here's the exact protocol ### Representing the game board A tile $t$ is represented as the object $\{l, pk_{owner}, r, \alpha\}$, where $l$ is the location, $pk_{owner}$ is the public key of the owner, $r$ is the resoure count, $\alpha$ is the key. The game board is represented as a vector of tiles $B = \{t_0, t_1, ..., t_n\}$. Its Merkle committment $\bar B$ is stored on-chain. ### Client functions for interacting with the game A move induces state changes in two tiles- the one resources are coming $from$ and the one resources are moving $to$. Submitting a move involves nullifying the old states $(t_{from}, t_{to})$. These old states are replaced with blinded new states $(u_{from}, u_{to})$. $move(t_{from}, t_{to}, u_{from}, u_{to})$ 1. Sample $\alpha\leftarrow$$ $F$, where $F$ is the scalar field of BN128. This will be the key for $u_{from}$, an attribute assumed to be $nil$ in the above parameter. Do the same for $u_{to}$. 2. Compute a nullifier using the $from$ tile's key, $\rho_{from} = H(t_{from}.\alpha)$. Do the same with $t_{to}$ to obtain $\rho_{to}$. 3. Let $P$ be the predicate "decrypt tile state iff its hash is finalized on-chain AND the decrypting party proves ownership of at least one neighbor". Send $encrypt_P(u_{from})$ to the enclave. The enclave acknowledges the message and sends back a signature $\sigma_{from} = Sign(sk_{enclave}, H(u_{from}))$. The same is done with $u_{to}$ to obtain $\sigma_{to}$. 4. Generate ZKP $\pi_{move} = P_{NIZK}(pp_{move}.pkey, \bar B, H(u_{from}), H(u_{to}), \rho_{from}, \rho_{to})$, where $P_{NIZK}(⋅)$ denotes the prover for our zk-SNARK protocol and $pp_{move}.pkey$ is the proving key for our $move$ circuit. 5. Publish $move(\pi_{move}, \bar B, H(u_{from}), H(u_{to}), \rho_{from}, \rho_{to}, \sigma_{from}, \sigma_{to})$ on-chain. A player who owns tile $t$ can decrypt neighboring tile $q$. $decrypt(sk_{owner}, t, q.l) \rightarrow q$ 1. Compute proof $\pi_{schnorr} = P_{schnorr}(sk_{owner}, t.pk_{owner})$, where $P_{schnorr}(⋅)$ is the prover for Schnorr's protocol. 2. Send $decrypt_P(\pi_{schnorr}, t, q.l)$ to the enclave. 3. Receive $q$ from the enclave and return it. Notice that, for any party who knows $\alpha$, the nullifier leaks information about when a tile's state is overriden. This isn't a problem in most constructions since $\alpha$ is typically a secret value known only to the owner. In the collaborative private state paradigm, however, $\alpha$ is meant to be revealed to other parties. The consequence is that any party who is given $\alpha$ can monitor the chain for the corresponding nullifier in perpetuity, even when they no longer have viewing rights to the tile. We solve this issue using the following refresh mechanism to revoke access to all parties. Those who retain the right to view the tile can then re-decrypt. $refresh(t)$ 1. Sample $\alpha \leftarrow$$ $F$. 2. Construct $u = \{t.l, t.pk_{owner}, t.r, \alpha\}$. 3. Compute nullifier $\rho = H(t.\alpha)$. 4. Send $encrypt_P(u)$ to the enclave. The enclave acknowledges the message and sends back a signature $\sigma = Sign(sk_{enclave}, H(u))$. 6. Generate ZKP $\pi_{refresh} = P_{NIZK}(pp_{refresh}.pkey, \bar B, H(u), \rho)$. 7. Publish $refresh(\pi_{refresh}, \bar B, H(u), \rho, \sigma)$ on-chain. ### Proving valid state transitions with ZKPs The $move$ circuit proves that the current state is included in the Merkle tree, the nullifiers are computed faithfully, the player owns the tile, and the game transition is valid. Merkle proofs $\psi$ and indices $i$ are obtained from indexers. Players must be careful to use indexers they trust to avoid leaking movement patterns. $move(\bar B, H(u_{from}), H(u_{to}), \rho_{from}, \rho_{to})$ 1. Load private signals: $\{sk_{owner}, \psi_{from}, \psi_{to}, i_{from}, i_{to}, t_{from}, t_{to}, u_{from}, u_{to}\}$. 2. Assert $Verify_{merkle}(i_{from}, H(t_{from}), \bar B, \psi_{from})$ AND $Verify_{merkle}(i_{to}, H(t_{to}), \bar B, \psi_{to})$, where $Verify_{merkle}(⋅)$ is the algorithm from the Merkle commitment scheme. 3. Assert $H(t_{from}.\alpha) = \rho_{from}$ AND $H(t_{to}.\alpha) = \rho_{to}$. 4. Assert $H(u_{from})$ and $H(u_{to})$ are computed correctly. 5. Assert $sk_{owner}$ is the secret key for $t_{from}.pk_{owner}$. 6. Assert $G_{move}(t_{from}, t_{to}, u_{from}, u_{to})$, where $G_{move}(⋅)$ accepts iff the move abides by the rules of the game. For Network States, this entails a few `if` statements that specify how resources from one tile are moved to another. The $refresh$ circuit is similar to $move$, except it only transitions a single tile and instead of game logic we check for consistency with the original. $refresh(\bar B, H(u), \rho)$ 1. Load private signals: $\{sk_{owner}, \psi, i, t, u\}$ 2. Assert $Verify_{merkle}(i, H(t), \bar B, \psi)$. 3. Assert $H(t.\alpha) = \rho$. 4. Assert $H(u)$ is computed correctly. 5. Assert $sk_{owner}$ is the secret key for $t.pk_{owner}$. 6. Assert $t.location == u.location$ AND $t.owner == u.owner$ AND $t.resource == u.resource$. ### The chain is the source of truth for game state Transitioning chain state concretely means adding leaves to the Merkle Tree maintained in the contract. Asserting a valid update involves verifying the ZKP, anchoring it to on-chain roots, checking nullifiers, and ensuring the enclave logged private state. As with the client functions, logic is similar for both $move(⋅)$ and $refresh(⋅)$. $move(\pi_{move}, \bar B, H(u_{from}), H(u_{to}), \rho_{from}, \rho_{to}, \sigma_{from}, \sigma_{to})$ 1. Assert $V(pp_{move}.vkey, \bar B, H(u_{from}), H(u_{to}), \rho_{from}, \rho_{to}, \pi_{move})$, where $V(⋅)$ denotes the verifier for our zk-SNARK protocol and $pp_{move}.vkey$ is the corresponding verifying key for our $move$ circuit. 2. Assert $\bar B$ is a historic or current Merkle root. 3. Assert $\rho_{from}, \rho_{to} \notin N$, where $N$ is the nullifier set. 4. Assert $Verify_{sig}(pk_{enclave}, H(u_{from}), \sigma_{from})$ AND $Verify_{sig}(pk_{enclave}, H(u_{to}), \sigma_{to})$, where $Verify_{sig}(⋅)$ is the algorithm from our chosen signature scheme. 5. Add $H(u_{from})$ and $H(u_{to})$ to `MerkleTreeWithHistory`. $refresh(\pi_{refresh}, \bar B, H(u), \rho, \sigma)$ 1. Assert $V(pp_{refresh}.vkey, \bar B, H(u), \rho, \pi_{refresh})$. 2. Assert $\bar B$ is a historic or current Merkle root. 3. Assert $\rho \notin N$. 4. Assert $Verify_{sig}(pk_{enclave}, H(u), \sigma)$ 5. Add $H(u)$ to `MerkleTreeWithHistory`. ### Attribute based encryption with the enclave We initialize tiles at all locations $L$. The board can be expanded witoverh multiple rounds post-initialization if we wish. Initialization can also be done using MPC if we want to reduce enclave-related trust assumptions. $init(L)$ 1. $\forall l \in L$: sample $\alpha \leftarrow$$ $F$ and store $t = \{l, nil, 0, \alpha\}$. Players can then begin storing private state with the enclave. The enclave runs a light client to monitor the chain for $H(u)$ being finalized in the Merkle tree. When this occurs, it updates its map for $u.l$. $encrypt_P(u) \rightarrow \sigma$ 1. Store u. 2. Return $\sigma = Sign(sk_{enclave}, H(u))$ As per the predicate $P$, anyone who can prove ownership of a neighboring current tile $t$ can the view private state at $q.l$. $decrypt_P(q.l, t, \pi_{schnorr}) \rightarrow q$ 1. Assert $t$ is the latest state stored at $t.l$. 2. Assert $V(t.sk_{owner}, \pi_{schnorr})$, where $V(⋅)$ is the verifier for Schnorr's protocol. 3. Assert $G_{decrypt}(t, q)$, where $G_{decrypt}(⋅)$ accepts iff a player who owns $t$ can view $q$ by the rules of the game. For Network States, this entails checking $t$ is a neighboring tile to $q$. 4. Return the latest state at $q.l$. ## Thinking through questions 1. What if two parties attempt to move into the same tile within one block? One is going to get reverted. Determined by block builder. 2. Will a player who previously sees another player's tile be privy to any additional information? No, due to refresh(). 3. What if one person attempts to move twice within the same block? That's fine. Everything works as expected. 4. What if someone gets a move signed by the enclave and does not publish it to the chain? Won't get leaked in the future since predicate includes "move must be finalized". Need to make sure we don't get hit with a DDoS though. 5. What if someone tries to bypass the enclave and directly submit the move on-chain? Tx will fail due to the missing signature. 6. How would using this design affect composability? Nada 7. How would using this design affect putting money into the game? Barely 8. How would using this design affect persistence? What if the enclave goes down? Permissionless syncing. 9. Why not use a server? We, the game designers, would be able to see all private state.