hoverboard 🛹

(E-IPFS mk2). Run miniswap in a Cloudflare Worker, handling bitswap requests over websockets. Like E-IPFS, without having to manage EKS infra.

Motivation

  1. Lower bandwidth egress costs from CF.
  2. bitswap-peer is crash looping and it's getting worse. The elastic in E-IPFS has lost its snap.
  3. There is too much infra in E-IPFS for the team to support.

Hosting it on Cloudflare instead of AWS + EKS would mean:

  • 🎉 Memory management is easier with a worker per peer/connection
  • 😊 No long-lived, multi-tenant processes to babysit
  • 💰 Cheaper egress from Cloudflare

@alanshaw notes that the majority of the code is already written.

The flow

📡 A libp2p client (kubo, helia, iroh) connects to wss://hoverboard.w3s.link and sends a Want-Have message over bitswap v1.2.0 with a root CID.

🛹 Hoverboard looks up a CAR CID for the root CID in DUDEWHERE, then locates the CAR index in satnav, and stores them in KV using the client's peer ID as the key.

🛹 Hoverboard could send a Have if the block is large, but in this case it's just a small directory node, so it sends the block.

📡 The client sends a Want with the CIDs for all the links in the Directory.

🛹 Hoverboard looks up the satnav index from KV for this peer ID, fetches the blocks from R2, and sends Block messages with the batch.

repeat to fade
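Roughly, the worker end of that dance could look like the sketch below. It skips the libp2p connection upgrade and muxing entirely, and decodeMessage, encodeMessage and handleWantlist are stand-ins for whatever miniswap actually exposes, not real APIs:

export default {
  async fetch (request, env) {
    if (request.headers.get('Upgrade') !== 'websocket') {
      return new Response('expected websocket', { status: 426 })
    }
    const { 0: client, 1: server } = new WebSocketPair()
    server.accept()
    server.addEventListener('message', async event => {
      // decode the bitswap 1.2.0 protobuf from the libp2p client (helper assumed)
      const msg = decodeMessage(event.data)
      // DUDEWHERE + satnav + R2 lookups as described above (helper assumed)
      const reply = await handleWantlist(msg.wantlist, env)
      // reply carries Block / Have / DontHave entries back to the peer
      server.send(encodeMessage(reply))
    })
    return new Response(null, { status: 101, webSocket: client })
  }
}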

Interesting parts

Sub-request limit

We get to make 1000 subrequests per request to a worker. We need to check whether each inbound message on the websocket from the libp2p client counts as a reset of that counter… it does for Durable Objects, so if we need to we could move the work there.

If not, we can just stop at 1000, drop the connection, and let the client contact us again if they still want the blocks. Worse may be better here, as we are just another peer in a decentralised network; clients must try reconnecting if they think we are a useful provider. This is already happening every few seconds in E-IPFS, but in an uncontrolled way, and it gives us some throttling for "free".
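A sketch of the "just stop before 1000" option: wrap each KV/R2 call in a budget so we bail out with headroom to spare. The limit constant and the close code choice here are ours, not Cloudflare's:

// leave headroom under the 1000 subrequest cap (the margin is a guess)
const SUBREQUEST_LIMIT = 990

function makeBudget (server) {
  let used = 0
  return function spend (promise) {
    if (++used > SUBREQUEST_LIMIT) {
      // 1013 = "try again later"; the client redials if it still wants the blocks
      server.close(1013, 'subrequest budget spent, please reconnect')
      throw new Error('subrequest limit reached')
    }
    return promise
  }
}

// usage: const spend = makeBudget(server); const idx = await spend(env.SATNAV.get(carCid))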

Finding non-root blocks

If a worker has to drop a connection mid-bitswap dance, and a new connection comes in from an existing peer asking for a non-root block (because we already sent them some of the DAG), then we need a mechanism to find the CAR and index for any block we store.

Longer term, this could be done reliably by exposing satnav-as-a-service, but in the short term we can get this working by just caching the CAR CIDs and indexes we find from the first request on a new connection. Assuming that most sessions start with a request for a root CID, we already have enough info in R2 to find all the blocks for a given DAG (…that was uploaded to us! We're not offering "find random blocks from IPFS" here).
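A sketch of that short-term caching, with made-up binding names (DUDEWHERE and SATNAV as R2 buckets, INDEX_CACHE as a KV namespace) and a guessed DUDEWHERE key layout of rootCID/carCID:

// On the first Want for a root CID, remember which CARs hold the DAG, keyed by peer ID,
// so a later connection from the same peer can serve non-root Wants without a root lookup.
async function cacheCarsForPeer (env, peerId, rootCid) {
  // DUDEWHERE maps root CID -> CAR CID(s); key layout assumed to be `${rootCid}/${carCid}`
  const listing = await env.DUDEWHERE.list({ prefix: `${rootCid}/` })
  const carCids = listing.objects.map(o => o.key.split('/')[1])
  // 1 hour TTL is an arbitrary choice for the sketch
  await env.INDEX_CACHE.put(peerId, JSON.stringify(carCids), { expirationTtl: 3600 })
  return carCids
}

async function carsForPeer (env, peerId) {
  return (await env.INDEX_CACHE.get(peerId, 'json')) ?? []
}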

Ideally we'd have both. Most requests could be satisfied by caching the indexes, but for true random access we'd have to either fall back to satnav-as-a-service or consider migrating (or replicating) the per-block indexes to R2 instead of DynamoDB.

It is worth exploring a world where we map every block to the list of CARs it is stored in, as an extension of DUDEWHERE… keys in R2 pointing to empty files (or directly to the relevant indexes, if some duplication of storage is tolerable). See upload-api in CF.
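For illustration, the "keys in R2 pointing to empty files" version could be as simple as the following (the BLOCKWHERE binding and key layout are invented here):

// Zero-byte R2 objects whose keys encode block CID -> CAR CID,
// so a prefix list answers "which CARs hold this block?".
async function recordBlockLocation (env, blockCid, carCid) {
  await env.BLOCKWHERE.put(`${blockCid}/${carCid}`, new Uint8Array()) // empty file; the key is the data
}

async function carsForBlock (env, blockCid) {
  const listing = await env.BLOCKWHERE.list({ prefix: `${blockCid}/` })
  return listing.objects.map(o => o.key.split('/')[1])
}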

There is also the potential to unify w3link here, so that our gateway can find non-root blocks directly from R2 instead of the current triple spend of E-IPFS -> ipfs.io -> w3s.link.

Maximum websocket message size

The CF Workers websocket implementation limits inbound message size to 1 MiB, but this is OK, as we'd expect inbound bitswap messages to be lists of CIDs only and unlikely to get near that limit.

Worker duration

As far as I can tell, a worker on the unbound plan gets 30s of CPU time per request. The Durable Object docs are clear that this limit is per inbound websocket message, but it's not clear if that also applies to workers without a Durable Object.

Actual "duration" is uncapped, though workers are "more likely to be evicted after 30s".. which we could use some clarification on.

There is no hard limit for duration. However, after 30 seconds, there is a higher chance of eviction.
– https://developers.cloudflare.com/workers/platform/limits#bundled-usage-model

Capping infra costs

We'd be charged for each worker that is running. 128 MB of memory is allocated per worker, and we'd have to multiply that by the session duration needed to transfer the complete DAG, plus bandwidth egress from CF.

By keeping track of work done per peer ID, we can choose not to respond to Want requests beyond a certain cap per hour/day etc.

Not responding, or deliberately closing the connection and letting them reconnect as needed, would let us cap costs. Actually slowing down our response rate would likely cost us more over time, as it would drive up session duration.

We can send DontHave if we want to be explicit, but that might send a confusing signal if we really do have the block. However, the spec is clear that it's appropriate to send DontHave for a thing you have but don't want to send.

if C sends S a Have request for data S does not have (or has but is not willing to give to C) and C has requested for DontHave responses then S should respond with DontHave
– https://github.com/ipfs/specs/blob/main/BITSWAP.md#bitswap-120-interaction-pattern
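Sketching how the cap and the DontHave response could fit together, building on the peerMap idea in the notes below. The cap value, the in-memory map, and the sendDontHave helper are all assumptions:

// Arbitrary cap for the sketch: 5 GiB per peer per hour.
const HOURLY_BYTE_CAP = 5 * 1024 * 1024 * 1024

// In-memory is fine for a single connection, but a cap that spans reconnects would
// need to live in KV (or a Durable Object), since each connection gets its own worker.
const peerStats = new Map() // peerId -> { bytesSent, since }

function overCap (peerId) {
  const now = Date.now()
  let stats = peerStats.get(peerId)
  if (!stats || now - stats.since > 60 * 60 * 1000) {
    stats = { bytesSent: 0, since: now }
    peerStats.set(peerId, stats)
  }
  return stats.bytesSent > HOURLY_BYTE_CAP
}

function recordSent (peerId, bytes) {
  peerStats.get(peerId).bytesSent += bytes
}

// in the want handler, per wantlist entry:
// if (overCap(peerId)) {
//   if (entry.sendDontHave) sendDontHave(server, entry.cid) // explicit, per the spec quote above
//   else server.close(1013, 'over cap, try again later')    // or just let them redial
// }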

How long will it take to build

We are motivated to build it! E-IPFS is creaking ominously, and none of us wants to manage (or has experience with) the Terraform and EKS stack that is there.

We can prototype it in 1 week, demo it, and have a better idea of the feasibility and of how long it would take to make it production ready.

Publishing CID provider records

We currently hardcode "E-IPFS" via a DNS multiaddr as the provider of every block uploaded to us. We need to switch that to a dnsaddr address, so that we can easily change or add multiple providers for each block from a DNS TXT record we control.
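The dnsaddr convention resolves a /dnsaddr/<domain> multiaddr via TXT records at _dnsaddr.<domain>, so adding or swapping providers becomes a DNS edit. A hypothetical zone snippet (the domain name, hosts and peer IDs are placeholders):

_dnsaddr.peers.w3s.link. 300 IN TXT "dnsaddr=/dns4/elastic-ipfs.example.net/tcp/443/wss/p2p/<e-ipfs-peer-id>"
_dnsaddr.peers.w3s.link. 300 IN TXT "dnsaddr=/dns4/hoverboard.w3s.link/tcp/443/wss/p2p/<hoverboard-peer-id>"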

This would allow us to experiment with adding hoverboard as an additional source without having to change every block index that gets published (…going forwards, that is. Existing ones will need updating, as alas we didn't land this recommendation first time round).

This is currently handled by the E-IPFS indexer in AWS. There is an opportunity to rewrite this to happen in Cloudflare and to store the complete index of blocks in the same place as the w3s.link gateway, saving money on finding non-root blocks as discussed above.


Notes

WANT cid, cid
- increment asks from peerId
  - if 1000 drop conn ? or let CF do it.
- check denylist KV
- do we have a DUDEWHERE map CID to CARCID
  - fetch index (100Kb)
      - stream block to client (1Mb)
  - ask satnav for index info
      - have block in r2?
          - stream block from r2 to client (1Mb)
          - or stream block from s3 to client
      - no: increment nopes
          - nopes > threshold?
              - drop connection.
const peerMap = new Map([["peerIdString", {ask: 0, hit: 0, miss: 0}]]) // drop if n misses in a row.

satnav-as-a-service

how does this play with content claims?

what's in there?