
EPF C4 - Tomás Arjovsky - Week 8

These are the updates for the week of 2023/09/04.

First update

Gossipsub design

Had a design session with Tomás Grüner on the design of the gossipsub implementation in Elixir. We concluded the following:

  • From the Elixir side, we’ll build one handler per topic, with a behaviour that defines a “handle” function, implemented directly in the Broadway consumer. These handlers will implement the spec callbacks (see the sketch after this list).
  • We’ll delete the network agent, which was a singleton, in favour of getting its values once and passing them as params. If we decide we do need the singleton, we’ll build a GenServer + ETS pair for concurrent access to the data.
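As a sketch of the first point, the per-topic handler could look roughly like this (module and function names are assumptions, not the final API):

defmodule Gossip.TopicHandler do
  # Every gossipsub topic gets one module implementing this behaviour.
  # These callbacks are where the spec handlers (e.g. on_block) get invoked.
  @callback handle_message(message :: binary()) :: :ok | {:error, term()}
end

defmodule Gossip.Handlers.BeaconBlock do
  @behaviour Gossip.TopicHandler

  @impl true
  def handle_message(_compressed_block) do
    # Placeholder: decompress + deserialize the message, then run the
    # spec's on_block callback from inside the Broadway consumer.
    :ok
  end
end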

Project presentation

I studied a bit and reviewed the slides for the project presentation on Tuesday the 5th.

Second update

Architecture validation

I had a sync with other seniors at Lambda to validate the Elixir architecture. I wrote the following sequence diagram:

sequenceDiagram
    participant prod as Topic Producer (GenStage)
    participant proc as Topic Processor (Broadway)
    participant block as Block DB
    participant state as Beacon States DB
    participant FC as Fork-choice store DB

    prod ->> proc: Produce demand
    proc ->> proc: Decompress and deserialize message
    proc ->>+ proc: on_block(block)
    proc ->> FC: request validation metadata
    FC -->> proc: return
    proc ->> proc: Validate block
    proc ->> block: Save new block
    proc ->> proc: Calculate state transition
    proc ->> state: Save new beacon state metadata
    proc ->>- FC: Save fork-choice store (proposer boost, checkpoints)
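
A rough sketch of the “Topic Processor” stage from that diagram, assuming a Broadway pipeline (module names and concurrency settings are illustrative only):

defmodule Gossip.TopicProcessor do
  use Broadway

  alias Broadway.Message

  def start_link(opts) do
    Broadway.start_link(__MODULE__,
      name: __MODULE__,
      # Gossip.TopicProducer stands in for the GenStage producer in the diagram.
      producer: [module: {Gossip.TopicProducer, opts}],
      processors: [default: [concurrency: 4]]
    )
  end

  @impl true
  def handle_message(_processor, %Message{data: compressed} = message, _context) do
    # Decompress + deserialize, then run on_block/1, which reads fork-choice
    # metadata, validates the block, stores it, computes the state transition,
    # and updates the fork-choice store.
    compressed
    |> decompress_and_deserialize()
    |> on_block()

    message
  end

  # Placeholders standing in for the real SSZ/snappy and consensus code.
  defp decompress_and_deserialize(data), do: data
  defp on_block(block), do: block
end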

NIF discussion

I also had a great series of discussions and iterations on the usage of NIFs, especially for the case of libp2p.

Communicating with other languages is typically done with NIFs (Native Implemented Functions). The main concern with NIFs is that they run on the same threads as the schedulers, so:

  • If it crashes, it crashes the whole BEAM.
  • If it blocks, one of the schedulers is blocked.

With a blocking NIF call, reading from a subscription looks like this:

sequenceDiagram
  participant E as Elixir Process
  participant N as NIF
  participant L as libp2p Subscription

  E->>N: Invoke NIF Next()
  N->>L: Invoke Next() on Subscription. THIS BLOCKS
  L-->>N: Return message
  N-->>E: Return result to Elixir Process

Ideally, we would have the subscription run in a separate goroutine, so that we get callbacks asynchronously instead:

sequenceDiagram
  participant G as Message Processor (Genserver)
  participant E as Elixir Subscriber
  participant N as NIF
  participant L as libp2p Subscription

  E->>N: Invoke NIF subscribe()
  N->>L: Launch new goroutine
  L-->>N: ok
  N-->>E: ok
  L->>L: Next(). THIS BLOCKS THE GOROUTINE
  L-->>N: Call a NIF callback
  N-->>G: Send a message to a GenServer
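
On the Elixir side of that second diagram, the “Message Processor” is just a GenServer that receives the messages sent from the NIF callback. A minimal sketch, with an assumed message shape:

defmodule Gossip.MessageProcessor do
  use GenServer

  def start_link(opts), do: GenServer.start_link(__MODULE__, opts, name: __MODULE__)

  @impl true
  def init(state), do: {:ok, state}

  # The NIF callback delivers gossip messages here asynchronously, so no
  # Elixir process ever blocks on Next().
  @impl true
  def handle_info({:gossipsub_message, topic, payload}, state) do
    IO.inspect({topic, byte_size(payload)}, label: "gossip message")
    {:noreply, state}
  end
end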

The problem with this idea is that running goroutines inside of a NIF might cause unexpected behavior, so we’re making sure that this approach will work without impacting the BEAM schedulers.

Dirty scheduler NIFs

Alternatively, we can use dirty NIFs, which run on dirty schedulers, in separate threads from the normal schedulers. The problem is that there’s a limited number of dirty schedulers, tied to the number of processors the machine has. If all dirty schedulers are waiting on blocking input, the whole BEAM is blocked as well.

There are Dirty I/O schedulers, but they seem to have the same issue.

Ports

We decided to move the libp2p implementation to a port in order to communicate with Go.

Ports run in a separate OS process from the BEAM, which means that if they crash, the Erlang VM won’t, and that blocking is not a problem.
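
A minimal sketch of opening such a port from Elixir (the binary path and framing options are assumptions; in practice the port would be owned and supervised by a GenServer):

defmodule Libp2pPort do
  # Spawns the hypothetical Go binary and exchanges length-prefixed binary
  # messages with it over stdin/stdout.
  def open do
    Port.open({:spawn_executable, "./libp2p_port"}, [
      :binary,
      :exit_status,
      # 4-byte length prefix, so the VM handles message framing for us.
      {:packet, 4}
    ])
  end

  def send_command(port, payload) when is_binary(payload), do: Port.command(port, payload)

  def await_message(port) do
    receive do
      {^port, {:data, data}} -> {:ok, data}
      {^port, {:exit_status, status}} -> {:error, {:port_exited, status}}
    end
  end
end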

Advantages:

  • They run in a separate OS process, but linked to the port owner, so they can be supervised like a regular Elixir process.
  • They can run blocking and long-running operations, as they are fully separate from the Elixir schedulers (in a separate OS process). Their runtime is managed by the other process (e.g. goroutines managed by the Go runtime).
  • Using libp2p with the “intended” Go syntax also causes fewer translation issues from one language to another.
  • Ports send data through standard input/output, which means that we don’t need to implement any special networking primitives.

Disadvantages are:

  • Information needs to be serialized and deserialized, and we’re fully responsible for implementing that. We can use something like protobuf if we need to.

A Go main program could be used to:

  • Bootstrap all libp2p variables and goroutines with all typical subscriptions.
  • Fully manage discovery, and just send the serialized discovered peers.
  • Handle a few requests for specific on-demand subscriptions, like committees.
  • Multiplex updates from every subscription over the port’s output (see the sketch below).
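
A sketch of how the port owner could demultiplex those updates on the Elixir side, reusing the Libp2pPort module sketched above (the topic/payload framing is purely an assumption for illustration):

defmodule Libp2pPort.Owner do
  use GenServer

  def start_link(opts), do: GenServer.start_link(__MODULE__, opts, name: __MODULE__)

  @impl true
  def init(_opts), do: {:ok, %{port: Libp2pPort.open()}}

  # Every framed message from the Go process arrives here; we decode the
  # topic and route the payload to the matching topic handler.
  @impl true
  def handle_info({port, {:data, data}}, %{port: port} = state) do
    {topic, payload} = decode(data)
    # Dispatch to the handler for this topic (e.g. Gossip.Handlers.BeaconBlock).
    IO.inspect({topic, byte_size(payload)}, label: "subscription update")
    {:noreply, state}
  end

  @impl true
  def handle_info({port, {:exit_status, status}}, %{port: port} = state) do
    # Stopping here lets the supervisor restart both the owner and the Go process.
    {:stop, {:port_exited, status}, state}
  end

  # Assumed framing: topic length, topic, then the raw payload.
  defp decode(<<topic_len::32, topic::binary-size(topic_len), payload::binary>>),
    do: {topic, payload}
end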