
Event Driven Concurrency using the Ruby Fiber Scheduler - Samuel Williams

tags: COSCUP2021 en COSCUP2021 RubyConf Taiwan 2021 TR214 - Ruby Conf

Welcome to the https://hackmd.io/@coscup/2021 collaborative notes


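Click the button at the top of this page to start writing notes together in Markdown!
On mobile, tap the button at the top to expand the agenda list.

Please start here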

Please feel free to help translate this document.

Background

  • RubyDNS is a DNS client and server library for Ruby.
  • EventMachine is a library for implementing scalable non-blocking programs for Ruby.
  • Celluloid is a framework for Ruby actor-based concurrency which includes support for non-blocking I/O.
    • Celluloid::DNS was a fork of RubyDNS implemented on top of Celluloid.

The Ruby fiber scheduler interface is specified in this document.

The CRuby implementation of the interface is in scheduler.c. The public interface (for use in C extensions, etc) is given in scheduler.h.
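To make the interface a little more concrete, here is a minimal sketch of a scheduler that only handles `sleep`. It is not taken from Async or from CRuby: the class name `SleepOnlyScheduler` and the helper `sleep_sort_with_toy_scheduler` are made up for illustration, and the sketch assumes Ruby 3.0 or later. The hook names (`fiber`, `kernel_sleep`, `io_wait`, `block`, `unblock`, `close`) are the ones Ruby calls on a scheduler; everything else is an assumption.

```ruby
require 'fiber' # needed for Fiber.current on Ruby 3.0; harmless on later versions

# Toy scheduler (hypothetical name) that only knows how to handle sleep.
class SleepOnlyScheduler
  def initialize
    @sleeping = {} # fiber => monotonic time at which it should wake up
  end

  # Hook used by Fiber.schedule: create a non-blocking fiber and start it.
  def fiber(&block)
    fiber = Fiber.new(blocking: false, &block)
    fiber.resume
    fiber
  end

  # Hook called when a non-blocking fiber calls sleep: record the wake-up
  # time and hand control back to whoever resumed this fiber.
  def kernel_sleep(duration = nil)
    @sleeping[Fiber.current] = now + duration.to_f
    Fiber.yield
  end

  # The "event loop": wake sleeping fibers in order of their deadline.
  def run
    until @sleeping.empty?
      fiber, wake_at = @sleeping.min_by { |_fiber, time| time }
      delay = wake_at - now
      sleep(delay) if delay > 0 # a real sleep: run is called from a blocking fiber
      @sleeping.delete(fiber)
      fiber.resume
    end
  end

  # Hook called when the thread exits or the scheduler is replaced.
  def close
    run
  end

  # Hooks that Ruby expects to exist but that this sketch never exercises.
  def io_wait(io, events, timeout = nil)
    raise NotImplementedError, "io_wait is not part of this sketch"
  end

  def block(blocker, timeout = nil)
    raise NotImplementedError, "block is not part of this sketch"
  end

  def unblock(blocker, fiber)
    raise NotImplementedError, "unblock is not part of this sketch"
  end

  private

  def now
    Process.clock_gettime(Process::CLOCK_MONOTONIC)
  end
end

# Sleep sort on top of the toy scheduler (hypothetical helper).
def sleep_sort_with_toy_scheduler(numbers, unit: 0.01)
  sorted = []
  Thread.new do
    scheduler = SleepOnlyScheduler.new
    Fiber.set_scheduler(scheduler)
    numbers.each do |number|
      Fiber.schedule do
        sleep(number * unit) # routed to SleepOnlyScheduler#kernel_sleep
        sorted << number
      end
    end
    scheduler.run # close would also run the loop when the thread exits
  end.join
  sorted
end

p sleep_sort_with_toy_scheduler([3, 1, 2])
```

A real scheduler also has to multiplex I/O, which is the job of a readiness selector; this sketch leaves that out on purpose.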

Alternative Implementations

These are some implementations of the fiber scheduler interface other than Async:

Socketry

Socketry is an organisation for Async and related event-driven concurrency libraries for Ruby.

Async

Async is a composable asynchronous I/O framework for Ruby. There are currently two branches: stable-v1 (for Ruby 2.x+) and main (for Ruby 3.x+). Async 2.0 (main) has not been released yet due to CRuby bugs. It will be released by the end of the year at the latest, alongside Ruby 3.1.0, and hopefully sooner, with the release of Ruby 3.0.3.

Sleep Sort Example

#!/usr/bin/ruby

# gem install async

require 'async'

# Generate 10 random numbers between 0...10:
numbers = 10.times.map{rand(10)}

Async do
    # For each number...
    numbers.each do |number|
        Async do |task|
            # Sleep for that number:
            task.sleep(number)
            # Then print it out:
            puts number
        end
    end
end

Event

Event is a low level event readiness selector.

As discussed, there are 4 implementations:

If you would like to read more about io_uring, Linux Weekly News has a good overview.

IO Read/Write Example

#!/usr/bin/env ruby

require 'event'
require 'fiber'

# A pipe for communication:
input, output = IO.pipe

# An event readiness selector:
selector = Event::Selector.new(Fiber.current)
puts selector.class # implementation

# A fiber which will wait for the pipe to have data and print it out:
reader = Fiber.new do
    selector.io_wait(Fiber.current, input, Event::READABLE)
    puts input.read
    input.close
end

# A fiber which will wait for the pipe to be writable and send a message:
writer = Fiber.new do
    selector.io_wait(Fiber.current, output, Event::WRITABLE)
    output.write "Hello World!"
    output.close
end

# It will try to read but it's not ready:
reader.transfer

# It's ready to execute on the next event loop iteration:
selector.push(writer)

# Loop until the input is closed - i.e. we are finished:
until input.closed?
    selector.select(1)
end

Falcon

Falcon is an event-driven web server built on top of Async.

99 Bottles of Beer on the Wall

This example shows live streaming through a Rack application running on Falcon. It uses a special Async::HTTP::Body::Writable which can be used to send chunks of data from one task to the HTTP response.

Source code: https://github.com/socketry/utopia-falcon-heroku/blob/master/pages/beer/controller.rb
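The idea of a writable body, where one task writes chunks and the response consumes them as they arrive, can be sketched with the standard library alone. The sketch below is not the Async::HTTP::Body::Writable API and is not taken from the linked controller: `ChunkBody` and `stream_bottles` are hypothetical names, and a thread plus `Thread::Queue` stand in for an Async task and the real body class.

```ruby
# Stand-in for a writable streaming body: a closable queue of chunks.
# Hypothetical class, NOT the Async::HTTP::Body::Writable API.
class ChunkBody
  def initialize
    @queue = Thread::Queue.new
  end

  # Called by the producer for every chunk it wants to send.
  def write(chunk)
    @queue.push(chunk)
  end

  # Called by the producer when there is nothing more to send.
  def close
    @queue.close
  end

  # Called by the consumer (the "server"): yields chunks as they arrive.
  # Queue#pop returns nil once the queue is closed and drained.
  def each
    while (chunk = @queue.pop)
      yield chunk
    end
  end
end

# Start a producer that writes one verse per chunk (hypothetical helper).
def stream_bottles(count)
  body = ChunkBody.new
  producer = Thread.new do
    count.downto(1) do |n|
      noun = n == 1 ? "bottle" : "bottles"
      body.write("#{n} #{noun} of beer on the wall\n")
    end
  ensure
    body.close
  end
  [body, producer]
end

body, producer = stream_bottles(3)
body.each { |chunk| print chunk }
producer.join
```

The consumer never needs the whole response in memory: it only sees one chunk at a time, which is what makes live streaming possible.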

Conway's Game of Life

This example shows live client-server interaction with an application running in real time. The server state is synchronised to the client using a WebSocket.

This is the first time I tried to deploy such an application to Heroku, and I found some problems:

  • WebSocket latency can be a problem for real-time applications.
  • The actual data rate is quite high, so we need to implement WebSocket compression. I lowered the update rate to 5 times a second (instead of 30 or 60).
  • The application is not stateful. Reconnecting the WebSocket will lose your application state. This can be fixed with persistent state (either on the client or on the server).

In any case, it's an interesting demo, showing the possibilities of event-driven concurrency in a web application.

You can make a glider like this:

Glider

Then press the "Start" button.

Feel free to add other fun shapes you find.
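For reference, the glider is the five-cell pattern drawn in the comment below. The following sketch is plain Ruby and is independent of the demo's source code: `GLIDER`, `neighbours` and `step` are hypothetical names, and the grid is assumed to be unbounded, with y growing downwards. It applies the standard Game of Life rules and checks that the glider reappears one cell down and to the right after four generations.

```ruby
require 'set'

# The glider, as a set of [x, y] coordinates of live cells:
#
#   . # .
#   . . #
#   # # #
GLIDER = Set[[1, 0], [2, 1], [0, 2], [1, 2], [2, 2]].freeze

# The eight cells surrounding (x, y).
def neighbours(x, y)
  [-1, 0, 1].product([-1, 0, 1])
    .reject { |dx, dy| dx == 0 && dy == 0 }
    .map { |dx, dy| [x + dx, y + dy] }
end

# Compute the next generation from a set of live cells.
def step(cells)
  # Count how many live neighbours every candidate cell has.
  counts = Hash.new(0)
  cells.each do |x, y|
    neighbours(x, y).each { |cell| counts[cell] += 1 }
  end

  # A cell is alive next if it has 3 live neighbours,
  # or if it has 2 and is alive now.
  counts.each_with_object(Set.new) do |(cell, count), next_cells|
    next_cells << cell if count == 3 || (count == 2 && cells.include?(cell))
  end
end

state = GLIDER
4.times { state = step(state) }

shifted = GLIDER.map { |x, y| [x + 1, y + 1] }.to_set
puts state == shifted
```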

Source code: https://github.com/socketry/lively-falcon - you can run it locally too.

Q & A session

  1. What is the cool tool you showed in your presentation? The tool is called Async::Debug. It's a live debugger and can be used in any async program. It is currently quite simple, but I have some ideas to make it more useful for debugging live systems. Concurrency can be quite tricky, so we need better debugging tools such as this. The idea originally came from a build tool I created called teapot. I wanted to visualise the build graph, all the build logs, the dependencies, etc. I think the same approach makes sense for concurrent systems in general.
  2. Isn't Process.detach(pid) already non-blocking? In my answer I got a bit confused between detach and daemonize: I thought they were the same, but they are not. Process.detach creates a thread which waits on the child process. In a way, every child process is non-blocking. But even this thread will block if you call join on it (although we support non-blocking Thread#join in the fiber scheduler). In the fiber scheduler interface, we introduced a new hook called process_wait which makes Process.wait yield to the event loop. In the io_uring implementation of the event gem, we use pidfd to wait on the process using the event loop, so it's pretty efficient and non-blocking, and more efficient than creating a thread per child process. Regarding Process.daemonize, you should avoid this kind of model (including, more generally, fork) and use Async::Container instead.
  3. Would you mind explaining how to use Falcon on a production site? Falcon is designed to be a drop-in replacement for Puma, except that it uses HTTPS by default. You can see the full documentation here, which explains how to use it, and an example site here. If you are using ActiveRecord, you might have issues, as ActiveRecord only allows one connection per thread (which is problematic because the entire event loop runs on one thread). If you want to try a completely event-driven database driver, you could consider db. Sequel also has some partial support for event-driven concurrency.
  4. Is it possible to use Async::WebSocket as an ActionCable adapter? Yes, this is something which people are exploring. Falcon is compatible with Rack hijack so it works with existing ActionCable implementations, but there are alternatives including async_cable.
  5. Can you give hints on how to avoid potential race conditions (or scoped fiber-based naming if they matter)? Race conditions can be tricky to avoid, and they get more problematic as the complexity of your program increases. I talked about this several times in the past, at RubyWorld Conference 2019 and RubyConf TW 2019. In addition, regarding your point about fiber-scoped variables, I have been discussing this. I believe task-scoped variables (that is, variables inherited by child tasks) are going to be important for isolation, so I'm working on a proposal to introduce inheritable fiber-local variables into Ruby itself.
  6. If something goes wrong, will the entire process stop? Well, this depends on your definition of "goes wrong", but if you are talking about normal exceptions, this is clearly defined within Async.
    Async do
        endpoint.accept do |peer|
            Async do
                raise "Fail"
            end
        end
    end
    
    In this program, the exception will cause the innermost task to fail, but it won't propagate out, since the outermost task does not wait on it. Async::Task#wait propagates the exception. For network servers, the above design isolates failures to the innermost "task per request". The server itself is generally robust.
  7. Could you provide a simple example of zero-copy and how it affects performance? I don't have many performance measurements yet. There is a working proposal here. I think the advantages of this proposal are: (1) simplicity, (2) memory-mapped (mmap) buffers, (3) no encoding and (4) page alignment (without the trailing \0 character forced by Ruby). I did introduce some basic pack/unpack for binary data and it was 4x faster than String#unpack, so it would be exciting for binary formats, as performance could be a lot better.
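Going back to question 2, the behaviour of Process.detach can be checked with the standard library alone. This is a small sketch, not something shown in the talk: `detach_child` is a hypothetical helper, and the child is simply another Ruby interpreter that exits with a chosen status. It shows that Process.detach returns straight away with a thread, and that the blocking only happens when you ask that thread for its result.

```ruby
require 'rbconfig'

# Spawn a child Ruby process that exits with the given status, then
# detach it. Hypothetical helper, for illustration only.
def detach_child(exit_code)
  pid = Process.spawn(RbConfig.ruby, "-e", "exit #{Integer(exit_code)}")

  # Returns immediately: Ruby starts a thread that waits on the child.
  waiter = Process.detach(pid)

  [pid, waiter]
end

pid, waiter = detach_child(7)

puts waiter.is_a?(Thread) # the waiter is a Thread (or a subclass of it)

# Thread#value joins the waiter thread, so this is the call that blocks
# until the child has exited. It returns a Process::Status.
status = waiter.value
puts "child #{pid} exited with status #{status.exitstatus}"
```

Inside a non-blocking fiber, the fiber scheduler can turn such waits into event loop operations, which is what the process_wait hook mentioned above is for.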

If you have any other questions feel free to add them below.

n. Your Question
