
# Safety-First Reactions

The biggest difference between designing conventions for 2-player and 3-player is the possibility of *reactions*. A clue to Cathy can have a meaning for Bob which is based on the contents of Cathy's hand and a meaning for Cathy which is based on the way Bob reacted.

When designing a convention system with reactions, the key questions are:

- What clues from Alice get reactions?
- What actions from Bob are considered reactions?
- What does each possible reaction from Bob tell Cathy?

Viewed in this frame, Hat Guessing takes the extreme position of saying that every clue gets a reaction and that almost all actions from Bob are considered reactions, with the only exception being playing already known playable cards. In terms of what the reactions tell Cathy, it's usually that a particular slot is playable or trash.

H-Group has a few categories of clues which get reactions:

- Play clues on unplayable cards ("bluff", "5 color ejection", "unknown trash discharge")
- Save clues on non-critical cards
- Save clues on trash cards ("bad chop move ejection")
- Save clues on playable cards ("chop move ignition", "rank choice ejection")
- Clues where there was a preferred alternative way of telling the clue receiver the same thing ("suboptimal save", "unnecessary trash push")

In terms of what it considers to be a reaction, H-Group takes nearly the opposite extreme: blind-plays are reactions, nothing else. Already clued cards are allowed to play unexpectedly from private knowledge, and discards from unexpected slots are just interpreted as being due to confusion. Bluffs and 5 Color Ejections tell Cathy information about an unplayable card, but some moves, such as Unknown Trash Discharges, do give her a safe action.
Reactor takes a slightly less extreme position, saying that clues that skip over "the reacter", the next player without a play, always get a reaction, but clues to them only get a reaction if a player who acts before them can see that the clue wouldn't work otherwise.

A key concept in Hanabi is "reacting". What do you do when a clue is given to another player that you can see will give them the wrong signal? You can react by playing or discarding unexpectedly, alerting the team to the correct signal. A ubiquitous example is a bluff, where you react by playing your newest card, and it alerts the team that a card that was clued to play was actually one-away from playable.

Bluffs are great for giving information for *later*, but if you want to avoid discarding any useful cards, your priority is spending as few clues as possible to give *safe actions now*.

In H-Group, a Bluff, 5 Color Ejection, Unknown Trash Discharge, or 4 Charm tells the clue receiver that the targeted card is not playable and tells them what kind of card it is instead (one-away, 5, trash, or 4). They call this a *Signal Shift*. What if our reactions told them that the targeted card was not playable and told them *which card was playable instead*? We might call this a *Target Shift*.

This document presents a recipe for applying this logic to your system and some case studies of applying it to various existing ones.

## Recipe

- The slot that a reaction comes from always determines the target of the resulting signal.
- A play reaction to a play signal simply shifts the target.
- A discard reaction to a save signal shifts it to a play signal (and shifts the target).
- A play reaction to a save signal shifts it to a trash signal (and shifts the target).
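As a rough illustration, the recipe bullets above can be written as a small lookup table. This is only a sketch under assumptions: the names `Signal`, `Reaction`, `SHIFTED_SIGNAL`, `resolve_reaction`, and the `target_for_slot` callback are all hypothetical, and the mapping from reaction slot to target slot is deliberately left as a parameter.

```python
from enum import Enum


class Signal(Enum):
    PLAY = "play"
    SAVE = "save"
    TRASH = "trash"


class Reaction(Enum):
    PLAY = "play"
    DISCARD = "discard"


# (original signal, reaction type) -> resulting signal, one entry per recipe bullet.
SHIFTED_SIGNAL = {
    (Signal.PLAY, Reaction.PLAY): Signal.PLAY,      # target shifts, signal stays
    (Signal.SAVE, Reaction.DISCARD): Signal.PLAY,   # save becomes play
    (Signal.SAVE, Reaction.PLAY): Signal.TRASH,     # save becomes trash
}


def resolve_reaction(signal, reaction, reaction_slot, target_for_slot):
    """Return (new_signal, new_target_slot) for the clue receiver.

    `target_for_slot` is a hypothetical callback supplied by the convention
    system; it turns the reacting slot into the new target slot.
    Returns None for combinations the recipe does not define.
    """
    new_signal = SHIFTED_SIGNAL.get((signal, reaction))
    if new_signal is None:
        return None
    return new_signal, target_for_slot(reaction_slot)


# Hypothetical system choice: the target moves right from slot 2
# by one space per reaction slot.
shift_right_of_2 = lambda slot: 2 + slot
print(resolve_reaction(Signal.SAVE, Reaction.DISCARD, 1, shift_right_of_2))
```

Keeping the slot mapping as a callback mirrors the first bullet: the reaction slot always determines the target, but each system picks the exact correspondence.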
It is up to the system to decide:

- The exact way that different reactions result in different targets
- The way that the reacting player decides which card in the receiver's hand they should target
- What constitutes a play signal/save signal in the first place
- What kinds of clues warrant reactions in the first place

## Example:

- When Alice gives a play clue on an unplayable card, Bob's job is to react with a play that allows the clue receiver to play.
- Each position Bob could play would tell the clue receiver to play a different card.
- A simple way of doing this is to say that the target shifts to the right one space for each finesse position.
- i.e. Bob plays finesse => clue receiver plays the card to the right of the original target, second finesse => card two spaces to the right, and so on.
- If you play your finesse position, then the clue receiver will play a card that is shifted over one space from the original target of the play clue.
- If you play your second finesse position, then the target will shift two spaces, and so on.
- (As a system designer, choose whether you want the target to shift left or right)
- In order to know what position to react with, you need to know what card the clue giver wants the clue receiver to play.
- If the clue receiver has any playable cards, then assume that you're trying to tell them to play one of them. Don't assume that you're trying to make an unplayable card playable, because that requires more of your hand.
- You want the clue giver to be able to get two plays no matter where the playable card in your hand is, so make sure that each play clue to the clue receiver would get a different reaction from you.

## Recipe (Bluffs)

## Reactor

With the [latest rework to Reactor](https://hackmd.io/@hanab/reactor#Remaking-Reactor), one way of analyzing it is as follows:

- The slot that a reaction comes from always determines the target of the resulting signal.
- A discard reaction to a save signal shifts it to a play signal.
- A play reaction to a save signal shifts it to a discard signal.
- A play reaction to a play signal keeps it as a play signal (potentially targeting a one-away card in Cathy's hand)

This framework for clue reactions might actually be applicable to any convention system that has arbitrary off-chop save clues. When you apply it, you have to decide what constitutes a reaction, what clues are bad enough to react to, how the reaction slot corresponds to the target slot, etc. Reactor is basically the result of applying this to a Referential Sieve and making specific choices for each of those questions.

## Direct Reactor :exploding_head:

Case-study with a direct color play clue system

- Baseline: Direct sieve
- Discard newest (for now)
- Color clues mean play leftmost
- Number clues mean sieve-like discard to the right (do we want lock to be the highest precedence?)
- We don't react to good clues to Cathy
- Discarding chop is not a reaction
- Bob playing in reaction to a color clue means that Cathy can play something
- Bob discarding (off-chop) in reaction to a number clue means Cathy can play something
- Bob playing slot 1 in reaction to a number clue means that the focus is playable
- Bob playing anything else in reaction to a number clue tells Cathy to discard something
- Bob's reaction slot + Cathy's target slot = focused slot + 1
- We target the leftmost playable in Cathy's hand if she has one.
- With number clues, we target the leftmost trash in Cathy's hand if she doesn't have any playables.
- With finesses, we ask for the leftmost play from Bob
- When a bluff occurs, it means that Cathy did not have any playables. Cathy might have trash, but by convention, we agree Cathy is locked

### Replays

First game is in! https://hanab.live/replay/943129

Second game (messier) is in!
https://hanab.live/replay/943133

[Hypo in 5p](https://hanab.live/shared-replay-json/515agaquyqwfopfmbkpd-cspduaivxrlrhwivctlk-kubfhsgnmjnex,03sbld-alapwbacaeseanar1bag-saayaqabahpdaoaza3af-sba1a2bdaxseamasbBa8-bubDbEbFbG1aaI1caK1d-ajaM,0) ([Ref Sieve hypo](https://hanab.live/shared-replay-json/515agaquyqwfopfmbkpd-cspduaivxrlrhwivctlk-kubfhsgnmjnex,03lbae-gakaatadgdalaotbabxc-gbaypdaaahbkan1caza2-a3b4gca1afbigeassdbA-a7a8bCbcbupaam1ca5aI-aj1aaqaK1caM,0) is also a win, but [real Ref Sieve game](https://hanab.live/replay/941852) was a loss that felt like a typical ref sieve 5p loss due to lack of tempo in the early game -> tight endgame -> mistake)

### Ideas

- When Cathy has two good 1s, we want to have 1s be expected and colors to the y1s get reactions. The problem is that there's no way to react to a color clue and say to play the clued card, so color on the leftmost 1 will just be undefined.
- Possible solution: make an exception for color clues where the 1s of that color haven't been played yet. Playing slot 1 in response to such a color clue would mean the clued card is still playable (not a bluff). Then 1s is always expected, and playing slot 1 in response to a 1s clue would mean the focus is trash.
- When Alice gives a color clue to Bob, we currently allow reverse finesses. This allows us to clue a delayed playable card in Bob's hand whenever the connector is on finesse. What if we allowed cluing the delayed playable no matter where the connector was in Cathy's hand? Then Bob would be locked until he got Cathy to play all the connecting cards.
- It makes some sense for Bob not to assume he has a duplicate of Cathy's playable, because Alice probably could've told him to discard it instead.
- What if Cathy's playable was actually on finesse? Then Bob was going to give a clue anyway, so Alice didn't need to lock him yet. In this case, Bob can still discard chop, and after Bob discards, Cathy will know that her playable is on finesse.
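Returning to the slot rule from the Direct Reactor list ("Bob's reaction slot + Cathy's target slot = focused slot + 1"), here is a minimal sketch of the arithmetic from both players' points of view. Slots are 1-indexed with slot 1 leftmost. The wraparound for sums that fall outside the hand is an assumption of this sketch, not something the notes specify, and the function names are hypothetical.

```python
def wrap_slot(raw_slot: int, hand_size: int) -> int:
    """Wrap a 1-indexed slot into 1..hand_size (assumed wraparound behavior)."""
    return (raw_slot - 1) % hand_size + 1


def target_slot(focus_slot: int, reaction_slot: int, hand_size: int) -> int:
    """Cathy's side: the slot that Bob's reaction points to.

    Solves reaction_slot + target_slot = focus_slot + 1 for target_slot.
    """
    return wrap_slot(focus_slot + 1 - reaction_slot, hand_size)


def reaction_slot_for(focus_slot: int, desired_target: int, hand_size: int) -> int:
    """Bob's side: the slot to react with so that Cathy lands on desired_target."""
    return wrap_slot(focus_slot + 1 - desired_target, hand_size)


# Bob reacting from slot 1 leaves the target on the focused card itself,
# matching "Bob playing slot 1 ... means that the focus is playable".
print(target_slot(focus_slot=3, reaction_slot=1, hand_size=4))
```

Because the rule is symmetric in the two slots, Bob's and Cathy's calculations are the same formula with the roles swapped, so the two functions round-trip.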
## EDCB-Based Reactor

- Baseline: GTKPP
- Discard oldest
- Color clues are chop-focused
- Number clues mean to save all the cards to the right
- Connecting finesses are *only* allowed to players with no immediate playables: in such a situation, players are expected to get the immediate playable.
- Prompts/Ignitions allow for direct Plays/Bluffs on 1-aways
- Ejections, Discharges, Charms, Blasts shift play signals to the right by 1, 2, 3, or 4 previously-untouched cards.
- Discarding rightmost kt/chop is not a reaction

Likely:

- Playing in response to a rank clue indicates a bad chop move: one of the saved cards should be discarded.

Possible, but significantly increases the desync from asymmetric chop-moves:

- Discarding inward from chop in response to a rank clue indicates a chop-move of a playable, indicated by how many cards inward from chop.

### Option A

| Reaction  | Initial Clue Meaning |            Adjusted Clue Meaning            |
| --------- |:--------------------:|:-------------------------------------------:|
| Ignite    |      Play card       |  Card is 1-away, no playables to indicate   |
| Eject     |      Play card       |   Leftmost playable one card to the right   |
| Discharge |      Play card       |  Leftmost playable two cards to the right   |
| Charm     |      Play card       | Leftmost playable three cards to the right  |
|           |                      |                                             |
| Ignite    |      Save card       |              Card is playable               |
| Eject     |      Save card       |  Rightmost trash is one card to the right   |
| Discharge |      Save card       |  Rightmost trash is two cards to the right  |
| Charm     |      Save card       | Rightmost trash is three cards to the right |

### Option B

| Reaction  | Initial Clue Meaning |            Adjusted Clue Meaning            |
| --------- |:--------------------:|:-------------------------------------------:|
| Ignite    |      Play card       |          Card is leftmost playable          |
| Eject     |      Play card       |   Leftmost playable one card to the right   |
| Discharge |      Play card       |  Leftmost playable two cards to the right   |
| Charm     |      Play card       | Leftmost playable three cards to the right  |
|           |                      |                                             |
| Ignite    |      Save card       |           Card is rightmost trash           |
| Eject     |      Save card       |  Rightmost trash is one card to the right   |
| Discharge |      Save card       |  Rightmost trash is two cards to the right  |
| Charm     |      Save card       | Rightmost trash is three cards to the right |

TODO: Case-study with ref sieve and being less strict about all clues being reactive