# api.nft.storage onboarding to w3filecoin
## Background
`api.nft.storage` relies on dagcargo to track insertions in its Postgres DB, in order to get uploaded CAR files to land with Filecoin SPs. `api.web3.storage` was sunset on the 9th of January in favour of the new w3up, which relies on a brand new Filecoin pipeline called w3filecoin.
Currently, w3filecoin is triggered by an S3 object-write bucket event in `carpark-prod-0`. However, we are working on decoupling the pipeline from this event ([web3-storage/w3up#1299](https://github.com/web3-storage/w3up/issues/1299)) so that clients trigger `filecoin/offer` instead.
## nft.storage as a user of w3filecoin
The direction for nft.storage is not to be migrated as a "tenant" of web3.storage. Therefore, the easiest path to move nft.storage from dagcargo to w3filecoin is to have nft.storage be an independent user of the w3filecoin aggregation pipeline. This way, nft.storage acts as a different Storefront without actually implementing the whole Storefront protocol: it simply invokes `piece/offer` on the Aggregator.
This leaves web3.storage without any responsibility for dealing with renewals of nft.storage data in the future. Having nft.storage perform `filecoin/submit` would put the Storefront (w3up) in the middle, which is likely not desirable.
## nft.storage triggers w3filecoin options
There are a few alternatives for this integration:
- trigger w3filecoin from S3 bucket `dotstorage-prod-1` write event
- run a CRON job to query new uploads
- trigger w3filecoin from `api.nft.storage` worker
### trigger w3filecoin from S3 bucket event
This essentially mimics what we have in place today with w3up + w3filecoin. A bucket event is triggered on write, running a lambda function that reads the bytes from the S3 bucket, computes a PieceCID for them, writes an equals claim and triggers the Filecoin pipeline via `piece/offer`.
The main differences from the current system would be:
- hook up a lambda with `dotstorage-prod-1` instead of `carpark-prod-0`
- use an nft.storage issuer instead of the web3.storage issuer to differentiate offered pieces
- trigger Filecoin pipeline via `piece/offer` instead of `filecoin/submit`
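Below is a minimal sketch of such a lambda handler, assuming the AWS SDK v3 S3 client; `offerPiece` is a hypothetical helper (a sketch appears under the `api.nft.storage` worker option below) that computes the PieceCID, writes the equals claim and invokes `piece/offer`.

```ts
// Minimal sketch of a lambda subscribed to `dotstorage-prod-1` object-write
// events. `offerPiece` is a hypothetical helper (sketched further below) that
// computes the PieceCID, writes the equals claim and invokes `piece/offer`.
import { S3Client, GetObjectCommand } from '@aws-sdk/client-s3'
import type { S3Event } from 'aws-lambda'

const s3 = new S3Client({})

// placeholder for the hypothetical downstream helper
declare function offerPiece (bytes: Uint8Array): Promise<void>

export async function handler (event: S3Event) {
  for (const record of event.Records) {
    const { bucket, object } = record.s3
    // read the CAR shard that triggered the event
    const res = await s3.send(
      new GetObjectCommand({ Bucket: bucket.name, Key: decodeURIComponent(object.key) })
    )
    const bytes = await res.Body!.transformToByteArray()
    // compute PieceCID, write the equals claim and trigger `piece/offer`
    await offerPiece(bytes)
  }
}
```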
#### Advantages
- easy to implement
- handles uniqueness out of the box
#### Disadvantages
- the main potential issue with this solution is the case where `api.nft.storage` failed the upload because it could not write to R2 (e.g. due to rate limiting), while the S3 bucket event would still be triggered. This could leave SPs unable to read the data.
- the workaround would be to do a HEAD request to the CF `carpark-prod-0` bucket before moving forward with the rest (see the sketch after this list), but that would flood the R2 bucket with HEAD requests and cause a big spike in class A operations
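For reference, the HEAD workaround could look like the sketch below, using the S3-compatible API exposed by R2; the endpoint and credentials are illustrative only, and every such call is a class A operation on the R2 bucket.

```ts
// Sketch of the workaround: confirm the CAR also landed in the R2
// `carpark-prod-0` bucket before offering the piece. Endpoint/credentials are
// illustrative only.
import { S3Client, HeadObjectCommand } from '@aws-sdk/client-s3'

const r2 = new S3Client({
  region: 'auto',
  endpoint: 'https://<account-id>.r2.cloudflarestorage.com', // hypothetical endpoint
  credentials: {
    accessKeyId: process.env.R2_KEY_ID!,
    secretAccessKey: process.env.R2_SECRET!
  }
})

export async function existsInCarpark (key: string): Promise<boolean> {
  try {
    await r2.send(new HeadObjectCommand({ Bucket: 'carpark-prod-0', Key: key }))
    return true
  } catch (err: any) {
    // a missing object surfaces as a 404 from the HEAD request
    if (err?.$metadata?.httpStatusCode === 404) return false
    throw err
  }
}
```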
### run a CRON job to query new uploads
This solution would be closer to what dagcargo partially does today: run queries over the DB to discover new uploads, and send them over for aggregation.
Newly discovered uploads should be put into a queue that runs a lambda function similar to the one for the S3 bucket event, but without the need to perform a HEAD request to R2, as the write is guaranteed to have succeeded.
A store MUST exist to track a pointer to where the previous CRON run stopped.
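A sketch of one CRON pass, assuming a `pg` client, a cursor store holding the last processed insertion timestamp, and a `queuePieceOffer` helper that enqueues the upload for the lambda described above; the `content` table column names should be verified against the actual schema.

```ts
// Sketch of one CRON pass: read the last processed insertion timestamp, page
// through newer `content` rows and queue them for aggregation. Column names,
// the cursor store and `queuePieceOffer` are assumptions for illustration.
import { Client } from 'pg'

export async function cronPass (
  db: Client,
  cursor: { get(): Promise<string>, set(value: string): Promise<void> },
  queuePieceOffer: (cid: string) => Promise<void>
) {
  const since = await cursor.get()
  const { rows } = await db.query(
    'SELECT cid, inserted_at FROM content WHERE inserted_at > $1 ORDER BY inserted_at ASC LIMIT 1000',
    [since]
  )
  for (const row of rows) {
    await queuePieceOffer(row.cid)
  }
  if (rows.length > 0) {
    // advance the pointer only after everything from this page was queued
    await cursor.set(rows[rows.length - 1].inserted_at)
  }
}
```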
#### Advantages
- handles uniqueness out of the box if we just track `content` table insertions (the Aggregator will also dedupe by `piece/group`)
#### Disadvantages
- more complex and expensive to run
- may need more monitoring
### trigger w3filecoin from `api.nft.storage` worker
We can compute the PieceCID of the data in the worker while handling the user request.
Note that running piece computation within the worker's 30s runtime limit may be problematic for arbitrary piece sizes, but with a maximum chunk size of 100MB it is fine. My tests showed around 10s to compute the piece for a 100MB chunk (while also needing to read from R2).
Once we compute the PieceCID, an equals claim can be issued and the piece can be offered to w3filecoin right away using `piece/offer`.
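A sketch of what this could look like in the worker, assuming `Piece.fromPayload` from `@web3-storage/data-segment` for PieceCID computation and the Aggregator `piece/offer` capability from `@web3-storage/capabilities`; the aggregator principal, connection wiring and `group` value are assumptions.

```ts
// Sketch of the write path inside the api.nft.storage worker. Package entry
// points, the aggregator principal and the `group` value are assumptions and
// would need to be confirmed against the w3filecoin deployment.
import { Piece } from '@web3-storage/data-segment'
import * as Aggregator from '@web3-storage/capabilities/filecoin/aggregator'
import * as DID from '@ipld/dag-ucan/did'
import type { Signer, ConnectionView } from '@ucanto/interface'

export async function offerPiece (
  carBytes: Uint8Array,
  issuer: Signer,                      // nft.storage service signer
  aggregator: ConnectionView<any>      // connection to the w3filecoin Aggregator
) {
  // 1. compute the PieceCID (v2) for the uploaded CAR shard
  const piece = Piece.fromPayload(carBytes)

  // 2. an equals claim linking the CAR CID to piece.link would be issued here
  //    (omitted - see the claims related tasks below)

  // 3. offer the piece for aggregation, grouped under the nft.storage DID so
  //    its pieces can land in their own aggregates
  const receipt = await Aggregator.pieceOffer
    .invoke({
      issuer,
      audience: DID.parse('did:web:web3.storage'), // assumed aggregator principal
      with: issuer.did(),
      nb: { piece: piece.link, group: 'did:web:nft.storage' }
    })
    .execute(aggregator)

  if (receipt.out.error) {
    throw new Error('piece/offer failed', { cause: receipt.out.error })
  }
  return piece.link
}
```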
#### Advantages
- easy to implement
- lower costs to run
#### Disadvantages
- no uniqueness on requests to the Aggregator (though the Aggregator will also dedupe by `piece/group`)
## Design proposal
For the nft.storage write side of things, based on the `nft.storage triggers w3filecoin options` section, the best solution looks to be triggering the w3filecoin pipeline directly from the `api.nft.storage` worker while the user is uploading the data.
On the read side, the status/check API will need to be updated so that it is no longer fully backed by the dagcargo DB for deal information. We currently perform two queries to get the information for an upload: first get the upload, then get the "deals" for it. Therefore, we can check the insertion date and use it to decide whether to look in the dagcargo DB or to get that information from w3filecoin, relying on the deal tracker.
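A sketch of that routing, with hypothetical helpers for the two backends and the cutover date passed in as configuration:

```ts
// Sketch of the read-side routing for the status/check API. All helper names
// and the cutover configuration are hypothetical.
interface Deps {
  cutover: Date
  getUpload: (cid: string) => Promise<{ insertedAt: string }>
  getDagcargoDeals: (cid: string) => Promise<unknown>        // legacy dagcargo DB
  getW3filecoinDealInfo: (cid: string) => Promise<unknown>   // deal tracker backed
}

export async function getDeals (cid: string, deps: Deps) {
  const upload = await deps.getUpload(cid)
  // uploads inserted before the cutover still have their deals tracked by dagcargo
  if (new Date(upload.insertedAt) < deps.cutover) {
    return deps.getDagcargoDeals(cid)
  }
  // newer uploads resolve deal information from w3filecoin via the deal tracker,
  // relying on equals/inclusion claims to map content CID -> piece CID -> aggregate
  return deps.getW3filecoinDealInfo(cid)
}
```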
Regarding the w3filecoin pipeline, there are two main issues that need to be solved to accomplish this:
- w3up providers support
- https://github.com/web3-storage/w3filecoin-infra/issues/49 - aggregate grouping by Storefront requester: create separate aggregate offers for web3.storage and nft.storage, so that each product can have its own wallet, and test that it goes all the way to the Dealer
Finally, within the Spade integration context we will need to flesh out how to distinguish aggregates offered by each product:
- the w3filecoin Dealer currently writes offers to a bucket with a `"collection":"did:web:web3.storage"` property in the JSON. If we can make use of collections to set an nft.storage collection, that would be great (illustrated after this list).
- otherwise, the Dealer could also have a second bucket with offers from nft.storage. This solution is less optimal, as it would make w3filecoin less generic about who issued the aggregation: instead of just setting the collection, a bucket would need to be provisioned for each new client.
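For illustration, if collections are reused, the only per-product difference in the offer document would be the `collection` value; the remaining fields of the real offer document are omitted here and the nft.storage DID is an assumption.

```ts
// Hypothetical per-product collection values in the Dealer offer document;
// the other offer fields (aggregate piece CID, piece list, ...) are omitted.
const web3StorageOffer = { collection: 'did:web:web3.storage' }
const nftStorageOffer = { collection: 'did:web:nft.storage' } // assumed DID
```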
## Assumptions
- api.nft.storage continues to write to `carpark-prod-0` in Cloudflare, and roundabout reads from there by default, or at least falls back to it
- otherwise, SPs won't be able to read pieces to build the aggregate for commitment
- api.nft.storage still has access to dagcargo DB
## Tasks
P0
- [x] Make PieceCid computation possible in CF workers [fr32-sha2-256-trunc254-padded-binary-tree-multihash#33](https://github.com/web3-storage/fr32-sha2-256-trunc254-padded-binary-tree-multihash/pull/33)
- [ ] Write inclusion claims - https://github.com/web3-storage/w3up/issues/1275
- [ ] Run PieceCid computation within api.nft.storage Cloudflare Worker
- [ ] Offer the piece for aggregation using `piece/offer` and submit an equals claim
- [ ] Change the status API so that uploads inserted after the change date are checked via `deal/info` / `filecoin/info` instead
  - [ ] needs to rely on equals claims and inclusion claims
P1
- [ ] Make the Aggregator distinguish offered pieces by issuer https://github.com/web3-storage/w3filecoin-infra/issues/49
- [ ] Connect with the Spade team on how to achieve the multi-wallet design - either a distinct bucket, or the DID in the offer?
## Open Questions
- @dchoi What will we do with renewals for nft.storage?
- @dchoi Must MVP include separation of aggregates built for each product?
- @riba would you keep the dagcargo DB as read-only for now so that we can read the state, or will we need to get a dump of some sort?