Update

This last week I worked on:

  • a refactor of the SSZ library, to reduce code duplication #170
  • incoming request handling for Status, Goodbye, GetMetadata, and Ping requests #162
  • making some libp2p calls non-blocking #180 #181
  • updating the README #177

Req/Resp

We now handle incoming Req/Resp requests. This should change how other nodes see us: from an opaque, unresponsive peer to a seemingly functional one (even though some responses are still hardcoded). All requests from peers are logged, which lets us see - among other things - the reason peers give when disconnecting. 80% of these disconnect messages carry reason 129 (TooManyPeers in Lighthouse), while some carry 3 (fault/error in the spec) or 0 (unknown in Lighthouse).

Blocking libp2p

The Elixir runtime - Erlang's ERTS - contains schedulers that run Elixir/Erlang bytecode. They can also run native code via NIFs (the FFI equivalent), but doing so yields execution to the underlying binary for an indeterminate amount of time. This can make the system unresponsive, or even bring it to a halt (known as scheduler collapse), if that code takes too much CPU time. This is why there are also dirty schedulers, which can be used for blocking IO or CPU-intensive operations. The problem is that the number of schedulers (both dirty and regular) is limited, and exhausting them can also halt the system. To avoid this, we are making these functions asynchronous: we run the NIF-side operation in a new thread (or, in this case, goroutine), and send a message to the Elixir-side process when it finishes. That way, both sides wait independently (and efficiently) in their own runtimes.

Next steps

The next step is to run some integration tests on the networking component while working on the storage part.