# IPFS / Filecoin Interop Plan
## Introduction
### Scope
This document focuses on how we can build an IPFS client that fetches data transparently from both IPFS and Filecoin.
The following assumptions are made in the scope of this document:
- retrievals from Filecoin are free
- a centralized indexing service or other discovery method for Filecoin data exists
Out of scope for the time being are:
- paid retrievals
- adding data to Filecoin from IPFS
That said, we know we want to do paid retrievals, and possibly add data to Filecoin, from a unified IPFS client in the future, so we should take care not to create roadblocks.
### IPFS Request Lifecycle
Let's say we are using go-ipfs and we call:
```shell
ipfs get <some cid representing the root of a unixfs file>
```
At the command line, we ultimately have the contents of that file printed to standard out. But what happens under the hood?
The UnixFS file is a Merkledag. Some or all of that DAG may be in the user's local blockstore while some or all of it may need to be fetched from the IPFS network.
When we fetch this file and print it out, we do all of the following:
- Determine what parts are in our local blockstore and which are remote
- For the parts that are not in our local blockstore:
- Find peers who have all or part of the content
- We can ask peers in our local swarm if they have some part of the content
- We can query the IPFS DHT as a fallback
- As a baseline assumption, we want to keep track of who's responding affirmatively and sending us parts of the content and keep asking them if they have other parts of the content
- Ask the peers to send us content
  - For large DAGs, we should generally prioritize those who send us content fastest
- We need to store verified content in our local blockstore
- Possibly: Announce to the IPFS Network DHT that we have received each block in the content (if we are providing)
- Possibly: Start serving newly fetched content to other peers requesting this content
- Possibly later: We need to continue reproviding pinned content to the DHT
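The steps above can be sketched as a simple fetch loop. This is a minimal illustration only: all types and function names here are hypothetical stand-ins, not the real go-ipfs APIs, and the real flow is asynchronous and spread across several libraries.

```go
package main

import "fmt"

// Cid and Block are stand-ins for the real go-cid / block types.
type Cid string
type Block struct{ Cid Cid }

// fetchDAG sketches the high-level lifecycle: check the local store, fall
// back to the network for misses, store what arrives, and (optionally)
// announce it. All names are hypothetical.
func fetchDAG(root Cid, local map[Cid]Block, remote func(Cid) Block, provide func(Cid)) []Block {
	var out []Block
	queue := []Cid{root}
	for len(queue) > 0 {
		c := queue[0]
		queue = queue[1:]
		blk, ok := local[c]
		if !ok {
			// Not local: find peers (swarm first, DHT fallback) and ask them.
			blk = remote(c)
			local[c] = blk // store verified content in the local blockstore
			provide(c)     // possibly announce to the DHT
		}
		out = append(out, blk)
		// A real implementation would decode blk and enqueue child links here.
	}
	return out
}

func main() {
	local := map[Cid]Block{"bafyroot": {Cid: "bafyroot"}}
	remote := func(c Cid) Block { return Block{Cid: c} }
	got := fetchDAG("bafyroot", local, remote, func(Cid) {})
	fmt.Println(len(got)) // 1
}
```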
That is a LOT!
If we add Filecoin retrieval, at a high level this process stays the same, except we add a Filecoin indexer or other discovery method as an additional fallback to find content when local peers don't have it, and we use Filecoin protocols (go-data-transfer etc) to actually fetch content from other peers.
When we get down to the brass tacks of implementation, it gets a lot more complicated.
### IPFS Request Lifecycle In Go-IPFS
Let's look at how this lifecycle is ACTUALLY implemented in go-IPFS:
*Note*: For simplicity, I've represented all steps here as largely sequential though many are actually asynchronous / parallel
1. When go-ipfs fetches a DAG, it initializes a new **Session**. *A Session specifies a set of network fetching operations that are related and should be tracked together for the purposes of peer optimization.*
1. Starting from the root CID, go-IPFS walks the DAG, requesting each block from the **BlockService** (it passes the CID and the BlockService also has the session). BlockService is implemented in the [go-blockservice](https://github.com/ipfs/go-blockservice) repo
1. The BlockService checks the local blockstore to see if it has the block.
- If it has the block locally, the block is returned.
   - If it does not have the block locally, it requests the block from **Bitswap** (it passes the CID, and Bitswap also has the session). Bitswap is implemented in the [go-bitswap](https://github.com/ipfs/go-bitswap) repo
1. Bitswap's block requesting protocol is as follows:
1. For the first block in a session, Bitswap is in "discovery mode". It sends a 'WANT-HAVE' message on the Bitswap protocol to every connected peer to see if they have the requested block. Any peers that have the block respond with a `HAVE` message. They are added to the Session.
   1. If no peers have the block, Bitswap queries the IPFS DHT to find peers who have it using the **ContentRouting** system implemented in [go-libp2p-kad-dht](https://github.com/libp2p/go-libp2p-kad-dht)
   1. Once the session has identified peers that have responded affirmatively to previous `WANT-HAVE` requests, Bitswap switches to block fetching mode. For each block, Bitswap selects one peer based on previously tracked session data and sends it a `WANT-BLOCK` message in order to receive the block. It also sends a `WANT-HAVE` message for the new block to the other peers in case the selected peer does not send the block.
   1. If none of its peers have a block (they all respond with `DONT-HAVE` messages), Bitswap reverts to discovery mode and broadcasts `WANT-HAVE` messages to all peers while also querying the DHT to find more peers.
   1. Bitswap ALSO periodically queries the DHT for a random WANT in order to expand the session peer list.
1. When Bitswap receives a block, and confirms its validity (data matches hash and CID was previously requested):
1. It stores the block in the local blockstore.
1. It updates its peer tracking information
1. It returns the block via channel to any callers that requested it in any active sessions.
1. It notifies its own block sending engine (responding to other peer's wants) that it has the block
   1. It queues the block to be provided and uses [go-libp2p-kad-dht](https://github.com/libp2p/go-libp2p-kad-dht) to provide the block to the IPFS DHT (assuming providing is enabled)
1. On periodic intervals, go-ipfs will reprovide fetched CIDs to the IPFS DHT. It uses the **Provider** system implemented in [go-ipfs-provider](https://github.com/ipfs/go-ipfs-provider/) to do so
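The local-vs-remote fallback at the heart of steps 2-3 can be sketched as follows. The types and names below are illustrative, not go-blockservice's actual API; the point is only the shape of the fallback.

```go
package main

import "fmt"

// Cid and Block stand in for the real go-cid / block types.
type Cid string
type Block struct{ Cid Cid }

// Blockstore and Exchange are simplified stand-ins for the interfaces the
// real BlockService wires together.
type Blockstore interface{ Get(Cid) (Block, bool) }
type Exchange interface{ GetBlock(Cid) (Block, error) }

type mapStore map[Cid]Block

func (m mapStore) Get(c Cid) (Block, bool) { b, ok := m[c]; return b, ok }

// fakeBitswap stands in for the Bitswap exchange.
type fakeBitswap struct{}

func (fakeBitswap) GetBlock(c Cid) (Block, error) { return Block{Cid: c}, nil }

// GetBlock tries the local blockstore first and only hands the CID to the
// exchange (Bitswap) on a miss, where the WANT-HAVE / WANT-BLOCK protocol
// described above takes over.
func GetBlock(bs Blockstore, ex Exchange, c Cid) (Block, error) {
	if b, ok := bs.Get(c); ok {
		return b, nil // local hit: no network traffic at all
	}
	return ex.GetBlock(c) // miss: the exchange fetches from the network
}

func main() {
	store := mapStore{"bafylocal": {Cid: "bafylocal"}}
	b, _ := GetBlock(store, fakeBitswap{}, "bafyremote")
	fmt.Println(b.Cid) // bafyremote
}
```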
[Sequence Diagram of Fetching Process](http://www.plantuml.com/plantuml/svg/XLJBSjiw3DthAp3PtDk5ym5UTF98f-cqJJD9scv3IP3DCKsm9AHC_hwWIJDcHhMzC1_du10F21U1bUTnQm2WKHsJXvy1FQW0ewoD4rHtsYdBSBQYXNP3E8jet5Je8uQWoKOiOAYnAqFTvX7zf3kCX3R8VeR1v3CUTpL1StXHBg6rMtle3lc5_EOiSUmywHxHmvDNtKRRLQQwMMhoIUAoVQ7-w-bxJvD-ezM2rsGPBJ_GoCA8X7xD5IEPfxGxsJTgP58Twqan0jacrW8M7tTPNy8A6Nh5ged7U1qmo7NdCT93AaHZY1mCv05LjuPs5uosy1tznk2DMZKGgT4-rMC93_Xdn92BqK2w5ZpNOmcVaQ7De4SUlTth3k0y1hGi2BdC0akylmUvKCQy2WwLye4g39wsO5CP4z4f3pwv9RbBj21G_Yd3qElGqOJ-5RRK8onWq3w8OeutfmF82zkKeqFq0Ngo_p6irOGbhhTVctDZbz1wKdsdHFRvz-NTq-BsyjUdPs2AaacgiaovcH6JmY_5Fzk3ooKFqSURsoVuFwqVZWgX2LY7KF8-vxbBlT9Yu1rxHhi2PsA79OXyd-E--lRZ-ciC_AZcl58HgX98l4R_BYIovlMehEWq9H-2TgYdkVQ8_kScqxQZRU4SKlQCIHdTAyP2dM_AUPUIJlbGTakoFsqYLHhTxkKOFd0UJg3jG7whM9CzAI817BGnY264DBXEcihpydsD1wQfEIjZePN-JIZ2zPURo3uQviovnpBPsl2UbcTFgNBwBoBnaBUPFsY3G2ExaPlpytCZibxMCXgZit6oPQURk9P8yjnD4rB2gRMR2tbkt9g_)
[Diagram Source code](http://www.plantuml.com/plantuml/uml/XLJBSjiw3DthAp3PtDk5ym5UTF98f-cqJJD9scv3IP3DCKsm9AHC_hwWIJDcHhMzC1_du10F21U1bUTnQm2WKHsJXvy1FQW0ewoD4rHtsYdBSBQYXNP3E8jet5Je8uQWoKOiOAYnAqFTvX7zf3kCX3R8VeR1v3CUTpL1StXHBg6rMtle3lc5_EOiSUmywHxHmvDNtKRRLQQwMMhoIUAoVQ7-w-bxJvD-ezM2rsGPBJ_GoCA8X7xD5IEPfxGxsJTgP58Twqan0jacrW8M7tTPNy8A6Nh5ged7U1qmo7NdCT93AaHZY1mCv05LjuPs5uosy1tznk2DMZKGgT4-rMC93_Xdn92BqK2w5ZpNOmcVaQ7De4SUlTth3k0y1hGi2BdC0akylmUvKCQy2WwLye4g39wsO5CP4z4f3pwv9RbBj21G_Yd3qElGqOJ-5RRK8onWq3w8OeutfmF82zkKeqFq0Ngo_p6irOGbhhTVctDZbz1wKdsdHFRvz-NTq-BsyjUdPs2AaacgiaovcH6JmY_5Fzk3ooKFqSURsoVuFwqVZWgX2LY7KF8-vxbBlT9Yu1rxHhi2PsA79OXyd-E--lRZ-ciC_AZcl58HgX98l4R_BYIovlMehEWq9H-2TgYdkVQ8_kScqxQZRU4SKlQCIHdTAyP2dM_AUPUIJlbGTakoFsqYLHhTxkKOFd0UJg3jG7whM9CzAI817BGnY264DBXEcihpydsD1wQfEIjZePN-JIZ2zPURo3uQviovnpBPsl2UbcTFgNBwBoBnaBUPFsY3G2ExaPlpytCZibxMCXgZit6oPQURk9P8yjnD4rB2gRMR2tbkt9g_)
#### Other Considerations
- The phrase "go-IPFS walks the DAG" elides a lot of complexity. There are several functions used in go-IPFS for walking DAGs and no single inflection point where it happens. The IPLD-in-IPFS work introduced the Fetcher, which in my mind should be that inflection point, but it's not widely integrated at the moment.
- For more in-depth information on Bitswap's block request and response process, see [How Bitswap Works](https://github.com/ipfs/go-bitswap/blob/master/docs/how-bitswap-works.md)
- It's been a long-term IPFS design goal to move providing out of Bitswap and have the Provider system do all the providing. This is partially implemented: Bitswap has a constructor option to disable its internal providing. However, the default mode in go-ipfs is to have Bitswap do the initial provide and have the Provider system do reproviding.
- I've left out the **Pinner** (implemented in [go-ipfs-pinner](https://github.com/ipfs/go-ipfs-pinner)) and pinning in general, even though it's tightly integrated into `go-ipfs` data fetching and is also coupled with the Provider system.
#### Conclusion
As we can see, the current `go-IPFS` request lifecycle does a whole lot, and a whole lot of it happens inside of Bitswap (or the go-bitswap implementation), even if the concerns are not actually related to the Bitswap network protocol. To enable the process to work across both protocols, we're going to need to pull apart and refactor many portions of code, and likely split apart go-bitswap itself.
## Inserting Filecoin Fetching Into IPFS
Recall that at a high level what we need is to:
- Add a Filecoin indexer or other discovery method as an additional fallback to find content when local peers don't have it
- Use Filecoin protocols (go-data-transfer etc) to actually fetch content from other peers.
I think these are fairly separate problems and should be tackled separately. Content routing is mostly not a part of Bitswap, and those parts that are are easy to extract. As such, adding a Filecoin indexer to the IPFS routing process is a relatively straightforward undertaking. Adding Filecoin protocols to fetch content, on the other hand, requires a very significant refactor of Bitswap to allow negotiation within the fetching process.
### Adding Filecoin to Content Routing
#### State of content routing
Currently, content routing has the following interface:
```golang
// ContentRouting is a value provider layer of indirection. It is used to find
// information about who has what content.
//
// Content is identified by CID (content identifier), which encodes a hash
// of the identified content in a future-proof manner.
type ContentRouting interface {
// Provide adds the given cid to the content routing system. If 'true' is
// passed, it also announces it, otherwise it is just kept in the local
// accounting of which objects are being provided.
Provide(context.Context, cid.Cid, bool) error
// Search for peers who are able to provide a given key
//
// When count is 0, this method will return an unbounded number of
// results.
FindProvidersAsync(context.Context, cid.Cid, int) <-chan peer.AddrInfo
}
```
The only concrete implementation of this interface is in go-libp2p-kad-dht (i.e. routing through the DHT)
There is also an implementation that composes other routers (though they have to satisfy libp2p's full `Routing` interface, which implements more than the above `ContentRouting` interface) in [go-libp2p-routing-helpers](https://github.com/libp2p/go-libp2p-routing-helpers)
#### ProviderQueryManager
Bitswap also contains one important and relatively complex content routing component of its own -- the **ProviderQueryManager**.
The ProviderQueryManager provides only a `FindProvidersAsync` method, and it has the same signature as the one in ContentRouting. However, it makes some important additions:
- it buffers provider queries so only a set maximum are in progress at the same time
- it dedupes query requests -- if two separate sessions make a query at or around the same time, it makes only one query and serves the result to both requestors
- it attempts to connect with each peer via libp2p, and removes it from the list if it can't (there is plenty of bad data in the DHT, so this is an important optimization)
#### Future state of routing
ResNetLab is working on a composable routing framework that mirrors the goals of what we want to do and much more.
Both the DHT as implemented and a future Filecoin index fit in quite well to the definition of a `CIRCUIT` outlined in the design doc here:
https://docs.google.com/document/d/18Ryov6vxZOwhG5xjt7EcKpEwAtsrGJZ3jiyj4Bhlag0/edit
However, part of the reason they slot in easily is that the composable routing framework is only a WIP design that is mostly high-level for now.
#### Filecoin Index Routing
Since the Filecoin Index does not yet exist, we can't know the interface it will satisfy. However, we can make some assumptions:
It will probably be able to satisfy a `FindProvidersAsync` method.
Depending on how we implement data transfer, the `FindProvidersAsync` method may need to be modified to return not only the peer but also some additional data to facilitate data transfer. This idea of returning more than just the peer is conceptually similar to the Smart Record concept in the composable routing design doc.
We can also say that the Filecoin Index will likely not support a `Provide` operation in the context of Filecoin routing.
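One hypothetical shape for "return more than just the peer" follows. None of these field names come from a real interface -- the Filecoin index does not yet exist -- they simply illustrate the kind of transfer hints a richer record could carry.

```go
package main

import "fmt"

type Cid string
type Peer string

// ProviderRecord is a hypothetical richer result: instead of a bare
// peer.AddrInfo, the index could return a record carrying transfer hints.
type ProviderRecord struct {
	Peer     Peer
	Protocol string // e.g. "bitswap" or "graphsync/data-transfer"
	PieceCid Cid    // Filecoin-specific deal context, if any
	Free     bool   // whether retrieval is expected to be unpaid
}

// ProviderFinder mirrors the shape of the libp2p FindProvidersAsync method,
// but streams richer records instead of peer.AddrInfo.
type ProviderFinder interface {
	FindProvidersAsync(c Cid, count int) <-chan ProviderRecord
}

func main() {
	rec := ProviderRecord{Peer: "12D3KooExample", Protocol: "graphsync/data-transfer", Free: true}
	fmt.Println(rec.Peer, rec.Protocol, rec.Free)
}
```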
#### Attack plan
How we go about implementing the routing interface to support a Filecoin index in a reasonable amount of time is tricky, but we can say some things for certain:
1. We want it to be a step toward the composable routing designs, but it's surely not going to implement all of the composable routing design, especially since the design is still in progress.
2. For now, we can assume we are building multiple-router composability ONLY for finding providers. We can safely assume a single-method interface that either matches or closely resembles the current `FindProvidersAsync` method.
3. We'll be developing with two moving targets: a Filecoin index that doesn't yet exist and a composable routing design that is still evolving.
We can imagine, as the most naive possible implementation:
- assume the Filecoin index will match FindProvidersAsync with no changes
- extract the FindProvidersAsync method from the libp2p Parallel router implemented in [go-libp2p-routing-helpers](https://github.com/libp2p/go-libp2p-routing-helpers)
We can implement this and put it into Bitswap (which expects a ContentRouting interface, but which we can easily break in two at the injection point in Bitswap) in just a couple of weeks.
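The naive composition can be sketched with stdlib types only: fan out `FindProvidersAsync` to each child router in parallel and merge the results onto one channel. The real Parallel router in go-libp2p-routing-helpers does essentially this over the full Routing interface; the names below are simplified stand-ins, not its API.

```go
package main

import (
	"fmt"
	"sync"
)

type Cid string
type Peer string

// Router is a simplified finding-providers-only interface.
type Router interface {
	FindProvidersAsync(c Cid, count int) <-chan Peer
}

// parallel queries every child router concurrently and merges results.
type parallel []Router

func (p parallel) FindProvidersAsync(c Cid, count int) <-chan Peer {
	out := make(chan Peer)
	var wg sync.WaitGroup
	for _, r := range p {
		wg.Add(1)
		go func(r Router) {
			defer wg.Done()
			for peer := range r.FindProvidersAsync(c, count) {
				out <- peer
			}
		}(r)
	}
	go func() { wg.Wait(); close(out) }()
	return out
}

// staticRouter stands in for the DHT or a Filecoin index.
type staticRouter []Peer

func (s staticRouter) FindProvidersAsync(Cid, int) <-chan Peer {
	ch := make(chan Peer, len(s))
	for _, p := range s {
		ch <- p
	}
	close(ch)
	return ch
}

func main() {
	dht := staticRouter{"dht-peer"}
	index := staticRouter{"filecoin-index-peer"}
	both := parallel{dht, index}
	n := 0
	for range both.FindProvidersAsync("bafyfoo", 0) {
		n++
	}
	fmt.Println(n) // 2
}
```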
From there we can see various directions for development:
- The ProviderQueryManager is useful functionality that has little to do with Bitswap and should probably be moved out of it. It is itself an implementation of a finding-providers-only interface. While we could implement the multiple-router negotiation inside the ProviderQueryManager, the code for this class is already quite complex, and I would instead push for multiple simpler implementations that can layer on top of each other. I'd advocate breaking up the ProviderQueryManager into three separate layers to handle each of the things it does. We can do a 1-2 week "extract the ProviderQueryManager" sprint.
- As the nature of the additional data served by the index becomes clearer, we can adjust our interface to serve this data. Hopefully the evolving SmartRecord design will also become clearer, and we can move in both directions. We might do a 1-2 week sprint around "improving record return values".
All in all, it seems likely we can build a router that satisfies the needs of our use case in a 1 to 1.5 month project (possibly broken up over separate non-contiguous sprints to sync with moving targets)
#### Where does it live?
Routing currently lives in the libp2p organization, spread across multiple repos, except for the ProviderQueryManager, which lives in go-bitswap. We know we want this implementation to form a base for implementing more parts of the composable routing design. I think it should live wherever we decide composable routing's implementation should live, understanding that the eventual implementation will look quite different than the ContentRouting implemented in libp2p. I recommend we make a best effort to decide this now.
### Using Filecoin Transfer Protocols In IPFS
#### Filecoin Data Transfer Today
Currently, Filecoin retrievals are done through a combination of:
- The go-fil-markets retrieval client (though other clients exist -- see Estuary)
*which calls*
- The go-data-transfer library to initiate and negotiate a transfer
*which calls*
- The go-graphsync library to actually transfer the data
Almost all logic in the go-fil-markets retrieval client has to do with negotiating paid transfers, so we can most likely assume we'll only need go-data-transfer and go-graphsync in IPFS for the purpose of free retrievals.
Filecoin retrievals are currently point-to-point -- we ask a single peer for all the content we want, and it sends it to us. We can think of the transfer as being more analogous to HTTP than to IPFS. An important caveat here is that we do support selectors on retrievals, which means that if I want the whole DAG for a given CID, we could break the fetch up into several deals for parts of the DAG. However, we currently lack selectors for some of the most common ways we might want to break up a DAG, such as a byte range inside a UnixFS file.
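The point-to-point shape can be illustrated with a hypothetical interface: one peer, one request, with a selector scoping which part of the DAG we want. Real selectors come from go-ipld-prime and real transfers from go-data-transfer; every name below is an illustrative stand-in, not either library's API.

```go
package main

import "fmt"

type Cid string

// Selector stands in for an IPLD selector; "all" means the whole DAG under
// the root, while a path-scoped selector would name a subtree.
type Selector string

// RetrievalClient is a hypothetical single-peer retrieval interface --
// closer in shape to an HTTP GET than to Bitswap's per-block negotiation.
type RetrievalClient interface {
	Retrieve(peer string, root Cid, sel Selector) ([]Cid, error)
}

// fakeClient returns just the root for any request, standing in for a real
// go-data-transfer-backed client.
type fakeClient struct{}

func (fakeClient) Retrieve(peer string, root Cid, sel Selector) ([]Cid, error) {
	return []Cid{root}, nil
}

func main() {
	var c RetrievalClient = fakeClient{}
	// A whole-DAG fetch could also be split into several selector-scoped
	// requests ("deals") for parts of the DAG, as noted above.
	blocks, _ := c.Retrieve("12D3KooExamplePeer", "bafyroot", "all")
	fmt.Println(len(blocks)) // 1
}
```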
#### DAG vs Blocks
Filecoin
#### An MVP to Filecoin Data Transfer In IPFS
```plantuml
@startuml
[IPFS Core API / Command Line] as cl
[Fetcher] as f
[Graphsync] as gs
[Data Transfer] as dt
[Block Service] as bserv
[Bitswap] as bs
[Find Providers] as fp
[Providing] as prov
[DHT] as dht
[Gateway] as gw
[Track Peers] as tp
cl --> f
f --> bserv
bserv --> bs
f --> fp
f --> tp
f --> dt
dt --> gs
cl --> prov
fp --> dht
fp --> gw
prov --> dht
@enduml
```
Currently, the `go-bitswap` library is much more than a simple implementation of the Bitswap protocol. Several functions related to content routing, content providing, and managing peers are tightly integrated into the library. The overall process of getting data from the IPFS network ultimately involves several steps other than simply data transfer. Currently all of these steps are managed by `go-bitswap` in `go-ipfs`.
As such, any effort to introduce other data transfer protocols into IPFS invariably faces obstacles. `go-bitswap` offers little ability to swap in another protocol for the transfer step alone, and without the other functions of go-bitswap, these protocols are an incomplete solution.
This project aims to refactor/decompose the `go-bitswap` library so that its components can be used independently of one another, in order for IPFS to fetch data through other transport protocols besides Bitswap. It is a non-trivial prerequisite to any efforts to introduce free Filecoin retrieval via `go-data-transfer` into IPFS.
This effort is predicated on the assumption that we want to introduce `go-data-transfer` to IPFS in order to support transfer from Filecoin and other networks (see https://github.com/protocol/web3-dev-team/pull/57)