# Use Ocluster to schedule CB pipelines on multiple machines
For Current-Bench, we can currently run benchmarks on only one machine (autumn). To scale up, and to allow Sandmark benchmarks to run on customized multicore machines, we want to use OCluster to dispatch jobs to different workers. As a future goal, ocaml-ci could perhaps use these machines to run CI jobs when they are idle.
OCluster workers are currently restricted to building an image from a Dockerfile or OBuilder spec. We want to extend their capability so that workers can run an OCurrent pipeline. This way, we will be able to cut out the _"build and run"_ sub-pipeline of the CB pipeline and create workers specialized for that task.
Ideally, the design is fairly independent of Current-Bench's needs: it allows any subgraph of a pipeline to be extracted and run on specialized workers. Clients and workers are expected to treat the OCluster pool name as a contract that the workers in that pool will evaluate their jobs with the right pipeline.
When a client submits a new job to the cluster, the assigned worker interprets it as an input node for its pipeline. The worker can then stream back to the client the state of the output nodes and the logs of its internal jobs. These output nodes appear as standard OCurrent values on the client.
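To make this more concrete, here is a rough sketch of what the client side might look like. Everything in it is hypothetical: the `Remote` module, its `submit`/`output` functions, and the pool name are placeholders for an API that does not exist yet, not the actual OCluster interface.

```ocaml
(* Hypothetical sketch only: [Remote], [submit], [output] and the
   pool name are placeholders, not an existing API. *)
let pipeline ~cluster (commit : Commit.t Current.t) =
  (* The submitted job becomes an input node of the worker's pipeline. *)
  let remote = Remote.submit cluster ~pool:"cb-build-and-run" commit in
  (* The worker streams back the state of its output nodes; on the
     client they appear as standard OCurrent values. *)
  let metrics = Remote.output remote "output" in
  Current.map record_in_database metrics
```

The point of the sketch is that, from the client's perspective, nothing special happens: `metrics` is an ordinary `Current.t` value that the rest of the pipeline can consume.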
As an example, this is what the pipeline graph currently looks like on Current-Bench, with only one repository:
![](https://i.imgur.com/Amx1dlj.png)
With the OCluster integration, the same graph will look like this:
![](https://i.imgur.com/qmTYR6R.png)
The orange `ocluster-pipeline` node symbolises the connection to an active worker executing the sub-pipeline. The original graph had some implicit nodes that are now explicit: our CB database is filled progressively with the `build_job_id`, then the `run_job_id`, and finally the `output`, which is the JSON metrics resulting from the benchmark run. Here the worker has not yet completed the run, so the `output` is still pending (orange).
The worker runs the extracted sub-pipeline, with the same three output nodes:
![](https://i.imgur.com/Sh38rcU.png)
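On the worker side, the extracted sub-pipeline might be declared roughly as follows. This is only a sketch under the assumption that `Current_worker` exposes a way to name output nodes; `Current_worker.outputs`, `build`, `run_benchmark`, `parse_metrics`, and `job_id` are all illustrative placeholders rather than real functions.

```ocaml
(* Hypothetical sketch: [Current_worker.outputs], [build],
   [run_benchmark], [parse_metrics] and [job_id] are placeholders. *)
let worker_pipeline (commit : Commit.t Current.t) =
  let image = build commit in              (* yields build_job_id *)
  let results = run_benchmark image in     (* yields run_job_id   *)
  let metrics = parse_metrics results in   (* JSON benchmark data *)
  (* Declare the three nodes whose states are streamed back to the
     client as they change. *)
  Current_worker.outputs
    [ "build_job_id", job_id image;
      "run_job_id", job_id results;
      "output", metrics ]
```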
A known limitation is that the worker pipeline can only receive one input from the client, as it's not currently possible for the client to stream the status of multiple nodes to the workers. As you would expect, the client will cancel the job if its pipeline is no longer required (freeing the worker to do something else), and the worker will inform the client of any failure. Finally, although not shown in the diagram, a worker can have multiple jobs in progress.
----------------------
We would like to break the work into a few PRs to ease the review of the proposed changes:
1. Extend the OCluster protocol so that job descriptions can carry a different set of information than "how to build this repository". We also need to extend the OCluster logs so that workers can stream structured information to the client about the `Current.state` of their output nodes and about their jobs' logs / artifacts.
Nearly done; we just need to check that it works correctly with step 3 before submitting: https://github.com/art-w/ocluster/tree/capnp-anypointer The changes are in principle backward compatible: existing schedulers, workers, and clients don't need to be upgraded to interact with schedulers/workers/clients using the new protocol (as long as they don't use the extension).
2. Simplify the definition of new workers.
This will be a small refactoring to introduce a functor that splits the OCluster protocol logic from the worker-specific implementation of `build`, `purge`, `update`, etc.
Nearly done; it needs some tiny adjustments to ease the completion of step 3.
3. Add a `Current_worker` public library that can be used to create a worker that executes a pipeline.
Remaining TODOs:
* Use the protocol modifications (of step 1) rather than marshalling inputs/outputs to strings.
* Stream jobs' logs and file artifacts from the worker to the client.
* It's not clear what it means for a worker to be done with a job. If the worker's pipeline has external sources, then its pipeline can re-trigger and produce new outcomes. So by default, the worker stays connected to the client. We plan to introduce some helper functions on the client and worker to have more control on job completion.
* Related: it's not yet clear how many jobs are really active on a worker (and whether it should request more jobs from the cluster).
4. Extend the `Current_ocluster` library so that clients can submit custom jobs to their `Current_worker`, and receive the outputs/logs/etc. The remaining TODOs are the dual of step 3.
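As a rough illustration of the refactoring in step 2, the functor could take a module describing only the worker-specific operations, while the protocol logic (registration with the scheduler, job requests, cancellation, log streaming) lives in the functor body. The module type below is a sketch; the actual names and signatures in OCluster will differ.

```ocaml
(* Hypothetical sketch of the step-2 functor; names and signatures
   are illustrative, not the actual OCluster API. *)
module type WORKER_IMPL = sig
  type job
  (* Decode a job description received from the scheduler. *)
  val parse : string -> job
  (* Execute a job, streaming its logs through [log]. *)
  val build : log:(string -> unit) -> job -> (unit, string) result Lwt.t
  (* Housekeeping operations required by the cluster protocol. *)
  val purge : unit -> unit Lwt.t
  val update : unit -> unit Lwt.t
end

(* Everything protocol-related is shared; only the operations above
   are worker specific. *)
module Make (W : WORKER_IMPL) : sig
  val run : pool:string -> capacity:int -> unit Lwt.t
end
```

With such a split, the `Current_worker` library of step 3 would just be one more instance of the functor, next to the existing Docker/OBuilder worker.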
----------------------
Some ideas we could explore later:
- The client could send endpoints as part of the job description, so that the worker can communicate with the client directly, without passing its results through the cluster.
- It should be made a bit easier to create workers that listen for jobs from multiple clusters.