The original Reactor document is here. In this document, I'll present a full system: if you're new to Reactor, you shouldn't need to reference the original document. With that said, what's new?
Safety comes at the cost of efficiency in situations where that safety is unnecessary. By optimizing for safety to this extent, I am claiming that this convention is an improvement, which expresses a belief that with solid play the team will be totally fine in (almost) all situations where that safety is unnecessary.
I expect this version to perform significantly worse in null, because both the stable and reactive clues suffer more in that variant than they do under the original Reactor conventions.
The basic premise of Reactor holds: this is a system designed with 3p in mind. The next player without a safe play is the reacter, and the other player is the receiver. When a clue is given, the reacter expects that it is providing them with a safe action. Clues given to the reacter's hand are called stable; the reacter takes their information, and nothing additional happens. Clues given to the receiver are reactive: they promise the receiver a safe action, but the reacter gets one too: they react by playing or discarding a card.
This is just good. Most cards in the deck are good. Most systems should play with a weak good touch principle: if a card in Alice's hand is known to be playable or trash, but it is not known which, playing that card is the default action should Alice choose to not give a clue.
When chop exists, it is the leftmost untouched card in hand.
Stable clues are clues given on the reacter's hand. They can occasionally be given to fix otherwise good-touch playables.
If a clue fills in a previously-touched card, revealing it playable, trash, or a duplicate of a card in the receiver's hand, it is a signal to act on that card, with no further meaning.
Color clues say to play the leftmost newly-touched card. If the receiver holds a playable card of that color, it is assumed to be a help-yourself delayed play clue; the reacter should help get that card (or cards) played in order to become unlocked. Shoutouts to timo for this excellent idea: it is a main reason this version returns to direct color play clues.
Rank clues say to discard the untouched card to the left. When multiple cards are referred to, the signal is on the leftmost referred-to card, except chop (if it exists) has lowest priority. If chop is the only untouched referred-to card, the clue is a lock signal.
If there are two non-chop cards to the left, it is simply a direct discard clue.
As of now, this is undefined.
Reactive clues get an action from each of the other players.
If you want to give a clue getting two untouched cards to play, count the number of untouched cards to the left of each. If one or both cards are touched, the count for that hand should instead include all untouched cards in the hand plus the touched cards to the left of the target. You should get a number between 0 and 4 for each hand. Add them, and take the sum mod 5.
Count that many cards in from the right of the receiver's hand, counting untouched cards first and then touched cards if you run out. The card after that is the one you want to focus. A color clue touching that card and none of the ones you counted over will work, at least as far as the slot math is concerned.
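The slot math above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: hands are lists of booleans (`True` = already touched) with index 0 as the leftmost slot, "all untouched cards" is read as every untouched card in that hand, and the names `target_value`, `focus_slot`, and `required_focus` are made up for this example. It does not check whether a legal color clue actually exists for the resulting focus.

```python
HAND_SIZE = 5  # 3-player hand size, matching the "mod 5" in the text


def target_value(touched: list[bool], idx: int) -> int:
    """Count for one target card, measured from the left of its hand."""
    if not touched[idx]:
        # Untouched target: untouched cards to its left.
        return sum(1 for t in touched[:idx] if not t)
    # Touched target: every untouched card, plus touched cards to its left.
    return touched.count(False) + sum(1 for t in touched[:idx] if t)


def focus_slot(receiver_touched: list[bool], value: int) -> int:
    """Count `value` cards in from the right (untouched first, then touched)
    and return the index of the card after them."""
    right_to_left = range(len(receiver_touched) - 1, -1, -1)
    untouched = [i for i in right_to_left if not receiver_touched[i]]
    touched = [i for i in right_to_left if receiver_touched[i]]
    return (untouched + touched)[value]


def required_focus(reacter_touched, reacter_target,
                   receiver_touched, receiver_target) -> int:
    """Index in the receiver's hand that the clue needs to focus."""
    total = (target_value(reacter_touched, reacter_target)
             + target_value(receiver_touched, receiver_target)) % HAND_SIZE
    return focus_slot(receiver_touched, total)
```

For example, with an entirely untouched reacter hand whose target is slot 3 (index 2), and a receiver whose slot 4 is touched and whose target is slot 2 (index 1), the counts are 2 and 1, so the clue needs to focus the card three in from the right among untouched cards, which is the receiver's slot 1.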
The focus is the rightmost newly-touched card, if one exists. If none exists, it is the rightmost re-touched card.
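As a small sketch of this focus rule, assuming index 0 is the leftmost slot, `touched_before` records which cards were touched before the clue, and `clued` is the set of indices the clue touches (all hypothetical names for illustration):

```python
def find_focus(touched_before: list[bool], clued: set[int]) -> int:
    """Rightmost newly-touched card if any, else rightmost re-touched card."""
    newly_touched = [i for i in clued if not touched_before[i]]
    if newly_touched:
        return max(newly_touched)  # larger index = further right
    return max(clued)
```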
The target can be a play or a discard. The priority:
The target is almost always a play. The priority:
If the clue is a color clue, the actions will be two plays or (very occasionally) two discards. If the clue is a rank clue, the actions will be one play and one discard.
The focused card's priority value determines the priority values of the two targets. If the focus is previously-untouched, count untouched cards to its right. If it's previously-touched, count all untouched cards in the hand plus previously-touched cards to its right. This is the focus priority value.
If the target is previously-untouched, count previously-untouched cards to its left. If it is previously-touched, count all untouched cards in the hand plus previously-touched cards to its left. This determines the target priority value in both the reacter's and the receiver's hands.
Finally, the equation all players must make true is simple:
focus priority value = reacter's target priority value + receiver's target priority value (mod 5)
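One way to read this equation from the reacter's seat is sketched below. It is an interpretation, not a definitive procedure: it assumes the reacter can see the focus and the receiver's target, that the sum wraps mod 5 as in the slot math, that "all untouched cards" means every untouched card in that hand, and that hands are lists of pre-clue touched flags with index 0 as the leftmost slot. The function names are invented for this example.

```python
HAND_SIZE = 5  # assumption: 3-player hands, so values wrap mod 5


def focus_value(touched: list[bool], idx: int) -> int:
    """Focus priority value, counted from the right of the receiver's hand."""
    if not touched[idx]:
        return sum(1 for t in touched[idx + 1:] if not t)
    return touched.count(False) + sum(1 for t in touched[idx + 1:] if t)


def target_value(touched: list[bool], idx: int) -> int:
    """Target priority value, counted from the left of a hand."""
    if not touched[idx]:
        return sum(1 for t in touched[:idx] if not t)
    return touched.count(False) + sum(1 for t in touched[:idx] if t)


def equation_holds(focus: int, reacter_target: int, receiver_target: int) -> bool:
    """Check focus = target + target, taken mod HAND_SIZE."""
    return (reacter_target + receiver_target) % HAND_SIZE == focus


def reacter_target_slot(reacter_touched, receiver_touched,
                        focus_idx, receiver_target_idx) -> int:
    """Index in the reacter's own hand that makes the equation true."""
    own = (focus_value(receiver_touched, focus_idx)
           - target_value(receiver_touched, receiver_target_idx)) % HAND_SIZE
    # Invert target_value: untouched cards left to right, then touched ones.
    untouched = [i for i, t in enumerate(reacter_touched) if not t]
    touched = [i for i, t in enumerate(reacter_touched) if t]
    return (untouched + touched)[own]
```

If the focus is the receiver's slot 1 with three untouched cards to its right, and the receiver's target has value 1, the reacter needs a target of value 2, which in a fully untouched hand is slot 3.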
Until the team has 10/25 points, chops do not exist; all discards must be instructed. Get good. The turn after the team reaches 10/25 points, all players without known actions have chops set to their leftmost untouched card, and players play with sieve-style chops from there. Maybe this threshold should change a bit, but a value around there has heuristic justification and did pretty well in hypos.
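A rough sketch of the chop rule, assuming a hand is a list of touched flags with index 0 as the leftmost slot. It simplifies in two labeled ways: it ignores the one-turn delay after reaching the threshold, and it does not model how sieve-style chops move afterwards. `CHOP_THRESHOLD` and `chop_slot` are hypothetical names.

```python
from typing import Optional

CHOP_THRESHOLD = 10  # points out of 25, per the text; may be tuned


def chop_slot(touched: list[bool], score: int,
              has_known_action: bool = False) -> Optional[int]:
    """Index of the chop card, or None if this player has no chop."""
    if score < CHOP_THRESHOLD or has_known_action:
        return None  # chops do not exist yet, or none was set for this player
    for i, is_touched in enumerate(touched):
        if not is_touched:
            return i  # leftmost untouched card
    return None  # every card is touched
```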
Finesses between Bob and Cathy become available for one of three reasons:
When Bob is asked to play into a finesse, he should ask
If the answer to any of these questions is yes, that is the priority finesse target. If more than one answer is yes, the priority is: Bob's slot 1, Cathy's slot 1, then matching color.
If all are no, Bob should then assume prompt (play matching touched card) over finesse, then choose the leftmost card in Cathy's hand.
Overall, here is Bob's priority order, which should be not that bad:
Before chops exist, a free choice convention would be a good way to indicate cards to discard from the left.
The best chop systems involve chop-moving cards.
I propose something further. I claim that in easy variants, we can afford to spend clues on all our early discards as well as our plays. Let's do the following.
One perspective on this is that we assume by default players have no safe discards until we know for sure someone has one. Both of these formulations are approximately equivalent to what is in the doc: no chops until ten points.
Everything here about safety makes a big assumption: that we will not run out of clues. In general, this is the fundamental downside of being careful with clue meanings. H-live discusses a version of this concept as care with clue efficiency: the ratio of cards left to play to clues left available. I prefer to think of a game of Hanabi as optimizing both our plays and our discards, which makes this framing of clue efficiency less useful.
One thing we've discussed for stable clues is using them to provide information to the receiver as well as an action to the reacter. I think this is an extremely promising idea but I don't have a good idea of how to implement it. As such, it is left out of these conventions.
I think 1, 2, and 3 could reasonably be permuted in any way. 5 could be simply something else. 6 could come before 5 or even 4. Yet so it is written and so it shall be.
It would be nice if Alice could give 1-for-1s to Cathy even if Bob has no safe action when Cathy's hand is awkward: that is, when Bob has no way to get a safe action from her with a single stable clue.
Suppose Alice has no safe action, and so she gives a clue. Before her next turn, she receives a 1-for-1 she apparently could have gotten last turn (it says something other than telling a newly-playable card to play). She could assume it is trying to get a different action in some way.
One Option:
Color Clues
When a rank clue touches multiple cards, focus invert with left-bias. Become willing to discard a newly-touched card.
When a rank clue touches a single card, discard ??
A stable play clue is unnecessary if it could have been gotten reactively. This being defined is a benefit to stable play clues being direct and with color: knowing what card is playing lets the reacter decide far more often whether the clue could have been gotten reactively. That said, a toxicity assumption isn't currently in the doc, but maybe it should be.
Endgames are sometimes awkward with Reactor. I think we could define the endgame as ~15+ points or 2- pace, then use that to adjust conventions accordingly. Like stable clue information, I don't have strong beliefs for how to implement this.
See here for an alternative solution to the chop problem: https://hackmd.io/G48yGOEYSHGM5hiIux4MbA
Both of these conventions are trying to get at a similar problem: when players are locked, it is important to be cycling the deck in order to draw the cards needed to unlock them. The difference is largely in how situations when two players have no trash are handled.
To compare scenarios, call a player hardlocked if they hold no playables or trash and have no OK discards. Call them softlocked if they hold no playables or trash but have at least one OK (inter-hand duplicate or 3/4 BDR) discard. To further analyze, we could break down softlocked states into even smaller cases depending.
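These definitions can be written down as a small classifier. This is a sketch only: `CardStatus`, `LockState`, and `lock_state` are names invented here, and deciding whether a given card counts as playable, trash, or an OK discard (inter-hand duplicate or 3/4 BDR) is left to the caller.

```python
from dataclasses import dataclass
from enum import Enum


class LockState(Enum):
    UNLOCKED = "unlocked"
    SOFTLOCKED = "softlocked"
    HARDLOCKED = "hardlocked"


@dataclass(frozen=True)
class CardStatus:
    playable: bool = False
    trash: bool = False
    ok_discard: bool = False  # e.g. inter-hand duplicate or 3/4 BDR


def lock_state(hand: list[CardStatus]) -> LockState:
    """Classify a player by the definitions in the text."""
    if any(card.playable or card.trash for card in hand):
        return LockState.UNLOCKED
    if any(card.ok_discard for card in hand):
        return LockState.SOFTLOCKED
    return LockState.HARDLOCKED
```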
Scenarios where Chopless performs better
Scenarios where Semi-spookiness performs better
Upon review, these have OK but imperfect decision-making between conventional actions. I think they demonstrate the system's strength well.