Let's redirect to where the data is, instead of proxying.
TL;DR
Redirect requests for stuff we don't have to dweb.link. Stop proxying.
Set up a Freeway-like gateway on AWS and redirect requests to it for stuff we have but only hold the index for in AWS…
Or a satnav-api that lets Freeway ask the E-IPFS db where a non-root block is.
…and this doc does not attempt to solve future issues like retrieving things from Filecoin SPs; it only covers moving around our current features.
There are 3 scenarios we have to handle when processing a request for GET /ipfs/:cid.
For the simple case, where we have the CARs and the mapping from root CID to CAR CIDs is in R2, keep serving the data from our w3s.link worker.
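To make the simple case concrete, here is a minimal sketch of the index lookup, with the R2 binding replaced by an in-memory Map so the logic stands alone. The key layout and names are illustrative, not the real schema.

```typescript
// Root CID -> CAR CIDs mapping, as it might be stored in R2.
type CarIndex = Map<string, string[]>

// Look up which CARs hold the DAG for a root CID. `undefined` means we don't
// have it and must fall through to another strategy (Autobahn, dweb.link).
function carsForRoot (index: CarIndex, rootCid: string): string[] | undefined {
  return index.get(rootCid)
}
```

If the lookup succeeds, the worker reads blocks out of those CARs and serves the response itself; everything stays inside Cloudflare.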
Reader's note: the Cloudflare cache layer is assumed in all these diagrams, but is left off for brevity.
As an optimisation we could fold the Freeway code into w3s here, to save the hop. It's unlikely to be a significant source of costs as it's all inside CF, but we may marginally reduce response time for content we have.
The E-IPFS db knows where all the blocks are, so let's use it.
w3s would make a HEAD /ipfs/:cid request to Autobahn ("gateway on AWS"). Autobahn is a version of Freeway that runs in AWS and asks the E-IPFS db which CARs hold a given root CID.
If Autobahn finds the CID, it responds with a 200 and no body, to indicate that it has that CID.
w3s returns a 303 See Other response, redirecting the user to Autobahn on AWS. This way we retain the ability to serve non-root CID paths without incurring the egress cost twice.
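The probe-and-redirect step could look something like the following sketch, assuming a Worker-style handler. The Autobahn host is a placeholder and the `probe` function is injected so the decision logic is testable without a network; in production it would be a real HEAD fetch to Autobahn.

```typescript
const AUTOBAHN_URL = 'https://autobahn.example.org' // hypothetical host

type Routed = { status: number, location?: string }

// Ask Autobahn (via HEAD) whether it can serve the CID; if so, 303-redirect
// the client there so Autobahn's egress goes directly to the user.
async function routeViaAutobahn (
  cid: string,
  probe: (url: string) => Promise<number> // returns the HEAD status code
): Promise<Routed> {
  const target = `${AUTOBAHN_URL}/ipfs/${cid}`
  if (await probe(target) === 200) {
    return { status: 303, location: target }
  }
  // Not found there; fall through to the next strategy (e.g. dweb.link).
  return { status: 404 }
}
```

Because the client follows the 303 itself, the response bytes never transit the Cloudflare worker, which is the point of redirecting rather than proxying.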
We could make a service in AWS that returns the set of CARs and CAR indexes that contain a given CID. Freeway just needs to know which CARs those non-root CIDs are in to be able to serve them.
Pros:
Cons:
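For the satnav-api option, a hypothetical response shape might look like the sketch below. The field names are assumptions, not the real E-IPFS schema, and a Map stands in for the database.

```typescript
// Where a single block lives: which CAR, and (when indexed) where in it.
interface BlockLocation {
  carCid: string    // CAR that contains the block
  offset?: number   // byte offset within the CAR, when indexed
  length?: number   // block length in bytes, when indexed
}

// In-memory stand-in for the E-IPFS db lookup Freeway would call.
const db = new Map<string, BlockLocation[]>([
  ['bafyblock', [{ carCid: 'bagcar1', offset: 96, length: 262144 }]]
])

// The question Freeway asks: which CARs do I read to serve this non-root CID?
function locate (cid: string): BlockLocation[] {
  return db.get(cid) ?? []
}
```

With offsets and lengths available, Freeway can range-read just the needed bytes from a CAR instead of fetching the whole file.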
Let's redirect requests for CIDs we don't have to dweb.link
We currently proxy multiple gateways and can incur bandwidth fees in both directions.
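The redirect-instead-of-proxy change for content we don't have is small; a minimal sketch, with path handling kept illustrative:

```typescript
// Answer with a redirect to dweb.link so the public gateway's egress goes
// straight to the client, instead of flowing through us in both directions.
function redirectToDweb (cid: string, path = ''): { status: number, location: string } {
  return { status: 302, location: `https://dweb.link/ipfs/${cid}${path}` }
}
```

A 302 keeps the original request method; a 303 would also work here since these are GETs.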