# Summary
This paper introduces a potentially very influential and insightful new definition of "adversarial runtime" in computation by chemical reaction networks (CRNs) consisting of reactions among abstract species such as $A+B \to X+Y$. Existing studies of CRN computation (many of which are done under the guise of the distributed computing model known as population protocols, which captures a very expressive and powerful subclass of CRNs) typically take the following approach. They require that the protocol is "guaranteed correct" in the sense that any "fair" yet otherwise adversarial schedule of reactions will eventually produce the correct output. This is known as *stable computation*, a.k.a. "rate-independent" computation, since reaction rates can vary (almost) arbitrarily over time, yet the protocol is still correct.
But of course we are also interested in the time complexity of such protocols. The approach taken in other papers has been to use the standard stochastic scheduler (one that picks a pair of molecules uniformly at random to be the next to react) to analyze the runtime. So there's a bit of a disconnect between the definitions of correctness and speed: analysis of correctness allows a powerful adversary to schedule reactions in improbable ways, but analysis of speed gets to assume the well-behaved stochastic scheduler. The purpose of requiring stable computation/rate-independence is to ensure that the system is robust to violations of the assumptions inherent in the stochastic scheduler. It assumes, for example, that the solution is well-mixed, but it may not be initially well-mixed, e.g., if inputs are added by pipette and then (several seconds or minutes later) vortexed. Or the CRN may be operating inside of a cell, which uses many spatial structures such as membranes that keep its contents from being well-mixed. This could in turn mean that the protocol will not in fact produce an output as quickly as under the standard stochastic scheduler.
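For concreteness, here is a minimal sketch of that standard stochastic scheduler (my own toy code on a hypothetical epidemic-style protocol, not anything from the submission):

```python
import random
from collections import Counter

def uniform_pair_step(config, delta, rng=random):
    """One step of the standard stochastic scheduler: pick an unordered pair
    of molecules uniformly at random; if that pair of species has a
    (non-null) reaction, apply it, otherwise nothing changes. For simplicity
    this handles only bimolecular reactions with two products."""
    a, b = rng.sample(list(config.elements()), 2)
    products = delta.get(tuple(sorted((a, b))))
    if products is None:          # null interaction: the pair bounces off
        return config
    new = config.copy()
    new.subtract((a, b))          # remove the two reactants
    new.update(products)          # add the two products
    return +new                   # drop zeroed-out species

# Hypothetical toy protocol (mine, not the submission's): B "infects" A,
# so eventually all molecules record whether any B was present initially.
delta = {("A", "B"): ("B", "B")}
config, steps = Counter({"A": 19, "B": 1}), 0
while config["A"] > 0:
    config = uniform_pair_step(config, delta)
    steps += 1
print(config, steps)  # Counter({'B': 20}); roughly Theta(n log n) interactions
```

The adversarial, stable-computation view instead lets a scheduler choose the pair at each step, subject only to fairness.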
One prior work (ref [CCDS17]) addressed this disconnect.
# Strengths
# Weaknesses
My main issue with the paper is that it is heavy on formal notation and light on intuition. I was okay reading some of it because I already know much of the material, but the material that is new to this paper, such as the definitions of adversarial time complexity, was very difficult to parse. I think that this paper could be very influential, but only if people can read it, so I urge the authors to take seriously my suggestions below.
The examples helped somewhat, but even then the examples are heavy on formal notation where intuition would suffice. One must also jump around the pages a lot to understand some examples. Example 5 on page 9, for instance, references Figure 1 on page 37, which itself references reactions defined in Example 1 on page 4. Furthermore, no reference to Example 1 is given in Figure 1. Looking at that figure caption, the only way to know which CRN it is referring to is the variable $\Pi_e$ appearing in the superscript of a graph $D^{\Pi_e}_{\vec{c}^0}$ named in the caption. The reader should not have to memorize every single one of the hundreds of variables defined in the paper in order to read it.
Let me give an example of how the examples could be improved. Suppose I want to explain why some configuration is always reachable. Here are two ways to explain an example of this:
1. Consider the CRN with reactions $A \to B$, $B \to A$, and $A + X \to A + Y$, starting with $1 A$ and $20 X$. Then the configuration with $1B$ and $20Y$ is always reachable. The reason it is always reachable is that, if $A$ is absent, then there is one $B$, so the reaction $B \to A$ can produce an $A$, after which the reaction $A+X \to A+Y$ can convert all remaining $X$ to $Y$, and finally the reaction $A \to B$ converts the $A$ back to $B$. (This argument is sanity-checked in the sketch after this list.)
2. Let $\Pi=(\Lambda,\mathcal{R})$ where $\Lambda=\{A,B,X,Y\}$ and $\mathrm{NV}(\mathcal{R}) = \{(\vec{r_1}, \vec{p_1}), (\vec{r_2}, \vec{p_2}), (\vec{r_3}, \vec{p_3}) \}$, where $\vec{r_1} = (1,0,0,0)$, $\vec{p_1} = (0,1,0,0)$, $\vec{r_2} = (0,1,0,0)$, $\vec{p_2} = (1,0,0,0)$, $\vec{r_3} = (1,0,1,0)$, $\vec{p_3} = (1,0,0,1)$. Let $\vec{x} = (1,0,20,0)$ be the initial configuration. Then from any configuration $\vec{c}$ reachable from $\vec{x}$, we can reach configuration $\vec{y} = (0,1,0,20)$ from $\vec{c}$ by applying execution $\eta = \langle \vec{c}^t, \alpha^t \rangle_{t \geq 0}$, where for all $t \geq 0$, define $\vec{c}^t =$...
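For what it's worth, the argument in item 1 is easy to sanity-check mechanically. Here is a minimal sketch (my own code) that executes the escape route described above, starting from the worst case where $A$ is absent:

```python
from collections import Counter

# Reactions written as (reactants, products) multisets.
RXNS = {
    "A->B":     (Counter({"A": 1}), Counter({"B": 1})),
    "B->A":     (Counter({"B": 1}), Counter({"A": 1})),
    "A+X->A+Y": (Counter({"A": 1, "X": 1}), Counter({"A": 1, "Y": 1})),
}

def apply(config, name):
    r, p = RXNS[name]
    assert all(config[s] >= c for s, c in r.items()), f"{name} not applicable"
    out = config.copy()
    out.subtract(r)
    out.update(p)
    return +out  # drop zero entries

# The route from item 1: from {1 B, 20 X} (A absent), produce an A,
# convert every remaining X to Y, then turn the A back into B.
c = Counter({"B": 1, "X": 20})
c = apply(c, "B->A")
for _ in range(20):
    c = apply(c, "A+X->A+Y")
c = apply(c, "A->B")
print(c)  # Counter({'Y': 20, 'B': 1})
```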
You get the idea. I realize I'm being a bit unfair with this example (the submission's examples actually use reaction notation $A \to B$ and not the vectors $\vec{r},\vec{p}$). Nevertheless, this is honestly how it feels to read many of the examples in this paper.
These both convey the same information. But one of them requires intense mental processing to unpack the formal notation, and the other just explains plainly the basic ideas, bringing in formal notation only when necessary. Now, one could certainly *augment* the first explanation with more formal notation, in *addition* to the intuitive explanation. This can be useful for the goal of helping a reader interested in seeing an example of how each object in a formal definition gets used "in practice". But if the formal notation *replaces* the intuitive explanation, then the goal of the example, to help build intuition for the concepts, is destroyed.
And of course in proofs and definitions, formal notation is useful when there's a risk that simple English will be ambiguous, in order to make crystal-clear what is meant. But even in definitions and proofs, intuition just before or after formal notation is useful as well.
I think the introduction would be improved by adding more discussion and motivation of weak fairness. Presumably an attempt to define a notion of time based on strong fairness would fail in some way, but it's not clear how. I recommend walking the reader through this issue in order to justify why a new definition of fairness is needed.
# Other comments
The definition of stochastic time used in this paper is quite non-standard. In CRNs that do computation, one typically assumes that the volume is not constant, but scales with the total molecular count. This was not the case in ref [CCDS17], but nevertheless, if you use this non-standard definition of time, I would point it out explicitly as soon as you begin discussing time in the introduction, or else it will confuse people who are used to the other definition. There is a discussion of this in Section 10, and a footnote on page 6, but it should be mentioned as soon as you start discussing time and talking about existing results on time complexity.
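For reference, the convention I have in mind (standard Gillespie kinetics with volume scaling; my notation, not the submission's): the volume is taken to be $v = \Theta(n)$ where $n$ is the total molecular count, so a bimolecular reaction between distinct species $A$ and $B$ with rate constant $k$ has propensity

$$\pi_{\vec{c}}\big(A + B \to \ldots\big) \;=\; \frac{k \cdot \vec{c}(A) \cdot \vec{c}(B)}{v}$$

(with the usual $\binom{\vec{c}(A)}{2}$ count for $A+A$ reactions), and the expected time until the next reaction from $\vec{c}$ is $1/\pi_{\vec{c}}(R)$, where $\pi_{\vec{c}}(R)$ sums the propensities of the applicable (non-null) reactions.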
The definition of "mass-preserving" given in the submission is non-standard, though unfortunately I know of know alternative term for this definition. Mass-preserving (or more commonly mass-conserving, or sometimes just "conservative") means that each species can be assigned a positive mass such that all reactions preserve total mass. For instance, a dimerization reaction $A+B \to C$ is mass-conserving so long as mass($A$)+mass($B$) = mass($C$).
I recommend using the term "weakly fair" throughout the paper instead of just "fair", since "fair" (what's called strongly fair in the submission) has another meaning in other population protocol and CRN papers. Perhaps do the same with terms like "stably decide" that here use weak fairness where other papers using those same terms use strong fairness.
page 3: "It is straightforward to observe that in our setting, a speed fault directly translates to an $\Omega(1)$ runtime lower bound, leading to an $\Omega(1)$ runtime lower bound for the task of deciding any non-detection semilinear predicate."
It would be worth stating this as a formal lemma and proving it.
page 4, Example 1: I would state what problem this CRN solves: it detects whether any $B$ is present among the $n-1$ total $A$ and $B$ molecules.
page 5, example 2: State that $\Pi_e$ can be found in Example 1. In general, tell the reader where to find things if they are not defined immediately before being referenced.
page 5: "Although a strongly fair execution is not necessarily fair (according to the current paper’s notion of fairness)"
Explain why with a short example.
last sentence of page 6: The claim is $\pi_{\vec{c}}(R) = \binom{\|\vec{c}\|}{2}$, but wouldn't this only be true if all pairs of species could have a reaction? You could talk about implicit null reactions between all other pairs of species, but I would think you would want $\pi_{\vec{c}}(R)$ to be the actual propensity of something happening.
In fact, I'm very confused by the definition of time given on the top of page 7. It sounds as though this is defining the standard notion of time under Gillespie kinetics, but it looks much different than I'm used to seeing it presented, and not just because the volume is assumed to be 1. I don't understand how the time span of a step depends only on the number of molecules in the configuration $\vec{c}$ and not on the individual reaction propensities. If there are $n$ molecules and every pair of species present can have a non-null reaction, then a reaction will happen faster than if some pairs do not have a non-null reaction.
Perhaps the explanation is that you are assuming something more like the population protocol definition of time that counts every interaction, whether null or not? [**Update**] Ah, I see now, there's this sentence on page 4 "We further assume that for each two (not necessarily distinct) species $A, B \in S$, there exists exactly one reaction $(\vec{r}, \vec{p}) \in R$".
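Given that assumption, the arithmetic does work out: with unit rate constants and volume 1, every unordered pair of molecules contributes propensity exactly 1, so

$$\pi_{\vec{c}}(R) \;=\; \sum_{(\vec{r}, \vec{p}) \in R} \pi_{\vec{c}}\big((\vec{r}, \vec{p})\big) \;=\; \binom{\|\vec{c}\|}{2},$$

and each step lasts an expected $1/\binom{\|\vec{c}\|}{2}$ regardless of which of those reactions are null.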
This is a **very** nonstandard assumption in CRNs. It is worth calling it out, making it more obvious, and discussing it elsewhere in the paper as well. In general, any time this submission breaks from tradition in CRNs, it is worth making a big deal out of it to avoid confusion for the reader.
In fact, the assumptions put on the model (that all reactions are bimolecular, that every interaction counts against the "time", that molecule counts cannot go down and can go up only by a limited amount, and that all rate constants are 1) mean that this model is somehow closer to the population protocol model than the full CRN model. I wonder if the entire paper would be easier to understand if phrased in the language of population protocols instead of CRNs. Unless there is some CRN that increases molecular count in an interesting way, I think using population protocol terminology and definitions would make this a much more digestible paper for most CS readers.
page 7: The definition of *escapes* seems a bit restrictive in requiring that *every* configuration in $S$ can reach a configuration not in $S$ by reaction $\alpha$. I would have thought something looser would be useful, such as saying there exists a configuration in $S$ with an escaping edge. Since the component is strongly connected, every other configuration can reach this one, and so every configuration in $S$ has a path to escape, just not necessarily a length-1 path. [**Update**]: okay, I see how it is used in the proof of Lemma 3.1: since your definition of fair allows the scheduler to starve a reaction until it becomes inapplicable, this seems like the right definition of escapes to match with your definition of fair. But I would discuss that issue a bit.
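For other readers, a toy illustration of the distinction (my own construction, abstracting the reachable-configuration digraph as a labeled graph):

```python
# Nodes are configurations; S = {"s0", "s1"} is a strongly connected
# component. Edges are labeled by the reaction inducing them; a missing
# entry means the reaction is inapplicable at that configuration.
edges = {
    ("s0", "alpha"): "s1",   # alpha applicable at s0, but stays inside S
    ("s1", "alpha"): "out",  # alpha applicable at s1 and exits S
    ("s1", "beta"):  "s0",   # beta closes the cycle, making S an SCC
}
S = {"s0", "s1"}

def escapes_strict(S, reaction):
    """The submission's notion (as I read it): every configuration in S
    exits S in a single application of `reaction`."""
    targets = [edges.get((c, reaction)) for c in S]
    return all(t is not None and t not in S for t in targets)

def escapes_loose(S, reaction):
    """The looser notion suggested above: some configuration in S exits S
    via `reaction`; strong connectivity supplies the rest of the path."""
    targets = [edges.get((c, reaction)) for c in S]
    return any(t is not None and t not in S for t in targets)

print(escapes_strict(S, "alpha"))  # False: from s0, alpha stays inside S
print(escapes_loose(S, "alpha"))   # True: from s1, alpha exits S
```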
It's worth pointing out that Lemma 3.2 requires the assumption of a finite reachable configuration graph. Otherwise you could keep escaping SCCs forever (e.g., with the single reaction $A \to 2A$, every configuration is its own SCC and escapes to the next one).
I believe Lemmas 3.2 and 3.3, but I would recommend adding explicit proofs (maybe in the appendix).
page 9: It's a bit confusing that the steps $\sigma(0), \sigma(1), \dots$ are one more than I'd expect. It seems like you want them intuitively to identify the step at which a target reaction is executed or becomes inapplicable, i.e., to be the step that takes you from one epoch to the next epoch, but it's actually one after that.
I'm having a lot of trouble understanding the notion of temporal cost. The introduction made it sound like the main contribution of the paper is a definition of "adversarial time", i.e., a way to define the time taken by a CRN when an adversary, rather than a stochastic scheduler running Gillespie's algorithm, is determining the execution. So I expected probabilities and expected values to disappear from the definition, the same way that adversarial correctness (a.k.a. stable computation) can be defined purely with reachability without referencing probabilities. But then here is the key concept of the whole contribution, the way that we connect the adversary's choices about an execution to a definition of time, and now there's an expected value taken over some probability distribution on executions.
Now, clearly this is a different definition, and maybe a simpler one without probabilities doesn't work. But somehow I feel some intuitive explanation and justification of this temporal cost definition is lacking here.
page 15: "Unfortunately, the technique of [AAE08a] fails once we switch to the adversarial scheduler assumed in the current paper."
It would be worth explaining their trick (which is fairly simple), and then walking through an intuitive explanation of why it doesn't work, i.e., how the adversary defined in this paper could force a runtime larger than constant.
page 18: "Obtaining Vote Amplified CRDs"
Somewhere earlier the paper mentions that, unlike with strong fairness, with weak fairness you can't simply take the cross product of two protocols to run them in parallel and expect them both to keep working. The standard way of vote amplifying (e.g., in [AAE08a]) involves letting the original voters be permanent voters and the other species be fluid voters, i.e., creating two functionally identical versions of each non-permanent-voter species and having the vote amplification run in parallel with the original protocol. It seems this is not how this construction works, presumably because of the possibility of the parallel protocol failing under weak fairness.
I think what's happening is that the "startup gadget" is making a large number of copies of fluid voters whose *only* role is to vote, and not otherwise participate in the original protocol, so that the amplification really is happening in "parallel" in the test tube, but not in parallel within any one species, each of which is either a fluid voter or a participant in the original protocol. It's worth pointing this out explicitly, with some intuitive explanation similar to what I just said, if that's indeed accurate.
page 19: Please give an intuitive explanation of Proposition 8.5, e.g., "predicates stably decidable under weak fairness are closed under Boolean operations."