# Injector error handling

## General remarks

### Mempool replacements

Operations injected by the injector can disappear from the mempool at any moment if an operator injects an operation with the same counter and a higher fee ratio (replacement). In this case the operation will remain "injected" (but not included) for the operator forever. This should not happen in practice because only the rollup node injects operations with the operator keys.

### Errors logging

The injector logs the errors it encounters by default. Some of these errors are expected, especially if we rely on the injector's behavior of ignoring or retrying. However, this can be intimidating (and misleading) to users, so we should log only the errors in the **Abort** category below.

### TTL in injector

There is currently no TTL for operations in the injector queue. We should decide on one and handle the case where operations exceed their TTL.

### Never included operations

The injector does nothing if an operation was injected but never included. After the TTL on L1, this case should be considered an error and handled.

## `Sc_rollup_add_messages`

> NOTE: `add_message` does not need to be in the error monad.

### Reorg

Action: **Retry**

The messages that the batcher sent are in the wrong branch, and there is no reason for anyone to resend them to the batcher, so the injector has to retry.

### Simulation fail

- `Error_encode_inbox_message`: **Forget**
  This can only happen if a message is above `Constants_repr.sc_rollup_message_size_limit` = 4 kB. The batcher only accepts messages below this limit, so this should not happen.
- `Sc_rollup_max_number_of_messages_reached_for_commitment_period`: **Retry**
  This can happen if too many messages have been published. We should keep the operation in the batcher queue because its messages will be publishable once the commitment period has ended.
- `Tried_to_add_zero_messages`: **Forget**
  This should not happen because the batcher never produces operations with zero messages.
- `Invalid_level_add_messages`: ??
  This can only happen if the level in the simulation context is different from the inbox level in the context. How can this happen in practice?
- `Key_bound_to_different_value`: ??
  This seems to be an internal error in `remember` in the inbox history.
- `Gas.Operation_quota_exceeded`: **Retry**
  This should likely not happen in practice as the operation is simulated with the maximum gas and storage. Maybe cache issues?
- Other errors: **Abort**
  The error will be reported and the injection request will fail.

### Included as failed

- `Error_encode_inbox_message`: **Forget**
  This should not happen because of simulation.
- `Sc_rollup_max_number_of_messages_reached_for_commitment_period`: **Retry**
  This can happen if additional messages were published between simulation and inclusion.
- `Tried_to_add_zero_messages`: **Forget**
  This should not happen.
- `Invalid_level_add_messages`: ??
  This can only happen if the level in the simulation context is different from the inbox level in the context. How can this happen in practice?
- `Key_bound_to_different_value`: ??
  This seems to be an internal error in `remember` in the inbox history.
- `Gas.Operation_quota_exceeded`: **Retry**
  This may happen in practice if the gas accounting depends on the inbox messages tree. There can be discrepancies if other messages were published between simulation and inclusion.
- Other errors: **Abort**
  The error will be reported and the new head request will fail.

## `Sc_rollup_publish`

### Reorg

Action: **Retry**

Commitments are always produced on finalized blocks. They don't need to be recomputed, and as such are valid in another branch.

### Simulation fail

- `Gas.Operation_quota_exceeded`: **Retry**
  This should likely not happen in practice as the operation is simulated with the maximum gas and storage.
  Maybe cache issues?
- `Sc_rollup_does_not_exist`: **Abort**
  If the rollup has not been originated yet, for instance if the origination operation disappears because of a reorg. This should not happen in practice if we start the rollup node after the origination is confirmed.
- `Sc_rollup_staker_funds_too_low`: **Retry**
  The operator does not have enough funds to deposit the stake. The injector should wait for the operator's balance to be funded.
- `Sc_rollup_too_far_ahead`: **Retry**
  This can happen if some cementation has not happened yet (e.g. we inject in the wrong order). We should keep the operation because the cementation operation may come.
- `Sc_rollup_unknown_commitment`: **Retry/Forget**?
  This can happen if the predecessor commitment is not included yet (or is not in the same L1 batch). Maybe we are injecting in the wrong order, so we can retry. Maybe we should ask the rollup node whether the missing commitment is known and, if it isn't, forget the operation.
- `Sc_rollup_bad_inbox_level`: **Abort**
  We are trying to inject a commitment for the wrong level with respect to the previous one. This should not happen.

  > NOTE: shouldn't this error be _Permanent_ for the Octez mempool?
- `Sc_rollup_state_change_on_zero_tick_commitment`: **Abort**
  This can happen if there is a bug in the rollup node with respect to L1, so in this case we don't want to inject the commitment!
- `Sc_rollup_staker_backtracked`: ??
  Refined stake up to LCC but not staked on it?
- Other errors: **Abort**
  The error will be reported and the injection request will fail.

### Included as failed

- `Gas.Operation_quota_exceeded`: **Retry**
  This may happen in practice if another similar commitment was published between simulation and inclusion. In this case the second commitment will consume more gas, which the simulation couldn't have predicted.
- `Sc_rollup_does_not_exist`: **Abort**
  If the rollup has not been originated yet, for instance if the origination operation disappears because of a reorg between the simulation and the injection. This has almost no chance of happening in practice.
- `Sc_rollup_staker_funds_too_low`: **Retry**
  The operator does not have enough funds to deposit the stake. This can happen in theory if the operator transferred some tz between the simulation and the injection. However, this should not happen in practice because only the rollup node should inject operations with the operator keys. The injector should wait for the operator's balance to be funded.
- `Sc_rollup_too_far_ahead`: **Retry**
  This can happen if a cementation has not happened yet but the simulation said it was ok, for example when the block with the cementation ends up in another branch due to a reorg between simulation and inclusion.
- `Sc_rollup_unknown_commitment`: **Retry/Forget**?
  This can happen if the predecessor commitment is not included yet or is in the wrong order in the block (unlikely in practice as we are using only one key for injection right now). We can ask the rollup node if the commitment is known.
- `Sc_rollup_bad_inbox_level`: **Forget**
  This should not happen in practice, because of simulation among other things.
- `Sc_rollup_state_change_on_zero_tick_commitment`: **Abort**
  This should not happen in practice.
- `Sc_rollup_staker_backtracked`: ??
  Refined stake up to LCC but not staked on it?
- Other errors: **Abort**
  The error will be reported and the new head request will fail.

## `Sc_rollup_cement`

### Reorg

Action: **Retry**

The cementation operations should be re-injected because the node only keeps track of the last cemented level and the last published commitment, without rollbacks.

### Simulation fail

- `Gas.Operation_quota_exceeded`: **Retry**
  This should likely not happen in practice as the operation is simulated with the maximum gas and storage.
  Maybe cache issues?
- `Sc_rollup_does_not_exist`: **Abort**
  If the rollup has not been originated yet, for instance if the origination operation disappears because of a reorg. This should not happen in practice if we start the rollup node after the origination is confirmed.
- `Sc_rollup_no_stakers`: **Forget**
  This could happen if the operator has been slashed (is it possible if we are past the challenge window?), but this should have stopped the node when detected.
- `Sc_rollup_unknown_commitment`: **Abort**?
  This should never happen in practice because commitments are cemented well after the commitment operation is finalized. This can only happen if there is a bug in the rollup node.
- `Sc_rollup_parent_not_lcc`: **Retry**
  This can happen if the cementation operation for the previous commitment was not included yet. For instance, if we try to inject two cementation operations at the same time (_e.g._, during catch-up) or if the previous cementation operation has been reverted due to a reorg. Should we check that we also have a cementation operation for the predecessor in the queue?
- `Sc_rollup_disputed`: ??
  The stake on the new LCC is different from the total number of stakers on the rollup. Not sure what this means.

  > NOTE: What if someone stakes on a later commitment? The total will be
  > increased without any impact on the LCC.
- `Sc_rollup_commitment_too_recent`: **Retry**
  This can happen if a block has been reverted during a reorg.

  > NOTE: It seems this check should be made before other checks.
- Other errors: **Abort**
  The error will be reported and the injection request will fail.

### Included as failed

- `Gas.Operation_quota_exceeded`: **Retry**
  This should not happen in practice as gas consumption should not change after simulation. _Is this correct if there is a new commitment?_
- `Sc_rollup_does_not_exist`: **Abort**
  If the rollup has not been originated yet, for instance if the origination operation disappears because of a reorg between the simulation and the injection.
  This has almost no chance of happening in practice.
- `Sc_rollup_no_stakers`: **Forget**
  This could happen if the operator has been slashed between simulation and injection (is it possible if we are past the challenge window?).
- `Sc_rollup_parent_not_lcc`: **Retry**
  This can happen if the previous cementation operation has been reverted due to a reorg between simulation and inclusion. Should we check that we also have a cementation operation for the predecessor in the queue?
- `Sc_rollup_unknown_commitment`: **Abort**
  This should never happen in practice if the simulation succeeded.
- `Sc_rollup_disputed`: ??
  See above.
- `Sc_rollup_commitment_too_recent`: **Retry**
  This can happen if a block has been reverted during a reorg.
- Other errors: **Abort**
  The error will be reported and the new head request will fail.

## `Sc_rollup_refute`

### Reorg

Action: **Retry**

Refutations should be re-submitted in case of a fork. Maybe check whether the game exists on the other branch (https://gitlab.com/tezos/tezos/-/issues/3459).

### Simulation fail

- `Gas.Operation_quota_exceeded`: **Retry**
  This should likely not happen in practice as the operation is simulated with the maximum gas and storage. Maybe cache issues?
- `Sc_rollup_does_not_exist`: **Abort**
  If the rollup has not been originated yet, for instance if the origination operation disappears because of a reorg. This should not happen in practice if we start the rollup node after the origination is confirmed.
- `Sc_rollup_game_already_started`: **Forget**
  Another player has already started a refutation game. This error is expected to happen in practice as the two conflicting players will likely start a game at the same time.
- `Sc_rollup_staker_in_game`: **Retry**
  This can happen if one of the opponents is already in a refutation game. We should wait for the other game to end before starting a new one. (Do we need a special TTL for this operation as games can be long?)
- `Sc_rollup_no_conflict`: **Forget**
  The rollup node checks for the existence of a conflict before starting a new game, so this should not happen in practice.
- `Sc_rollup_not_staked`: **Forget**??
  One of the opponents has no stake in the rollup. It seems that this cannot happen if there is a conflict.
- `Sc_rollup_no_game`: **Retry**
  The refutation move happened before the game started.
- `Sc_rollup_wrong_turn`: **Retry**
  It was the other opponent's turn to play. This could happen if the block where the other opponent played is in a reorg.
- `Dissection_choice_not_found`, `Dissection_number_of_sections_mismatch`, `Dissection_invalid_number_of_sections`, `Dissection_start_hash_mismatch`, `Dissection_stop_hash_mismatch`, `Dissection_edge_ticks_mismatch`, `Dissection_invalid_successive_states_shape`, `Dissection_invalid_distribution`, `Dissection_ticks_not_increasing`, `Dissecting_during_final_move`: **Abort**
  Invalid dissection move; this can happen if there is a bug in the rollup node.
- `Proof_unexpected_section_size`, `Proof_start_state_hash_mismatch`, `Sc_rollup_proof_check`: **Abort**
  Invalid proof; this can happen if there is a bug in the rollup node.
- Other errors: **Abort**
  The error will be reported and the injection request will fail.

### Included as failed

- `Gas.Operation_quota_exceeded`: **Retry**
  This should not happen in practice as only one player can play at a time.
- `Sc_rollup_does_not_exist`: **Abort**
  If the rollup has not been originated yet, for instance if the origination operation disappears because of a reorg between the simulation and the injection. This has almost no chance of happening in practice.
- `Sc_rollup_game_already_started`: **Forget**
  Another player has already started a refutation game. This error is expected to happen in practice as the two conflicting players will likely start a game at the same time.
- `Sc_rollup_staker_in_game`: **Retry**
  This can happen if one of the opponents started another game between simulation and inclusion.
  We should wait for the other game to end before starting a new one. (Do we need a special TTL for this operation as games can be long?)
- `Sc_rollup_no_conflict`: **Forget**
  This should not happen in practice if the simulation succeeded.
- `Sc_rollup_not_staked`: **Forget**
  One of the opponents has no stake in the rollup. It seems that this cannot happen if there is a conflict.
- `Sc_rollup_no_game`: **Retry**
  The refutation move was included before the game opening move.
- `Sc_rollup_wrong_turn`: **Retry**
  It was the other opponent's turn to play. This could happen if the block where the other opponent played is in a reorg between simulation and inclusion.
- `Dissection_choice_not_found`, `Dissection_number_of_sections_mismatch`, `Dissection_invalid_number_of_sections`, `Dissection_start_hash_mismatch`, `Dissection_stop_hash_mismatch`, `Dissection_edge_ticks_mismatch`, `Dissection_invalid_successive_states_shape`, `Dissection_invalid_distribution`, `Dissection_ticks_not_increasing`, `Dissecting_during_final_move`: **Abort**
  Invalid dissection move; this can happen if there is a bug in the rollup node.
- `Proof_unexpected_section_size`, `Proof_start_state_hash_mismatch`, `Sc_rollup_proof_check`: **Abort**
  Invalid proof; this can happen if there is a bug in the rollup node.
- Other errors: **Abort**
  The error will be reported and the new head request will fail.

## `Sc_rollup_timeout`

### Reorg

Action: **Retry**

The timeout should be re-submitted as it may be reached on the other branch as well.

### Simulation fail

- `Gas.Operation_quota_exceeded`: **Retry**
  This should likely not happen in practice as the operation is simulated with the maximum gas and storage. Maybe cache issues?
- `Sc_rollup_does_not_exist`: **Abort**
  If the rollup has not been originated yet, for instance if the origination operation disappears because of a reorg. This should not happen in practice if we start the rollup node after the origination is confirmed.
- `Sc_rollup_no_game`: **Forget**
  The timeout move happened before the game started (unlikely) or after the game finished (more likely, for instance if there are two timeout operations in the queue and the first one succeeds).
- `Sc_rollup_timeout_level_not_reached`: **Forget**
  The operation is sent to the injector when the rollup node thinks the timeout has been reached. However, the opponent may make a move between the queuing and the simulation. This error is expected to arise in the normal operation of the rollup node.
- Other errors: **Abort**
  The error will be reported and the injection request will fail.

### Included as failed

- `Gas.Operation_quota_exceeded`: **Retry**
  This should not happen in practice as only one player can play at a time.
- `Sc_rollup_does_not_exist`: **Abort**
  If the rollup has not been originated yet, for instance if the origination operation disappears because of a reorg between the simulation and the injection. This has almost no chance of happening in practice.
- `Sc_rollup_no_game`: **Forget**
  The timeout move was included before the game started (unlikely) or after the game finished (more likely, for instance if another timeout operation is included in the block or if a refutation proof is produced by the opponent).
- `Sc_rollup_timeout_level_not_reached`: **Forget**
  The operation is sent to the injector when the rollup node thinks the timeout has been reached. However, the opponent may make a move between the simulation and the inclusion. It is normal for operations which fail with this error to be included in blocks.
- Other errors: **Abort**
  The error will be reported and the new head request will fail.
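
The `Sc_rollup_timeout` policy is small enough to write down as a total function from error to action, which makes the "other errors" fallback explicit. The OCaml below is only a sketch under assumptions: the `action` and `timeout_error` types and the `action_for_timeout_error` function are hypothetical names introduced for illustration, and the way the real injector represents protocol errors is not described in this document.

```ocaml
(* Hypothetical types, for illustration only: the three actions used
   throughout this document. *)
type action = Retry | Forget | Abort

(* Hypothetical, simplified view of the errors listed for
   [Sc_rollup_timeout]. [Other] stands for any error not listed. *)
type timeout_error =
  | Operation_quota_exceeded (* Gas.Operation_quota_exceeded *)
  | Sc_rollup_does_not_exist
  | Sc_rollup_no_game
  | Sc_rollup_timeout_level_not_reached
  | Other of string

(* The listed actions are the same for "Simulation fail" and
   "Included as failed", so a single function covers both cases. *)
let action_for_timeout_error = function
  | Operation_quota_exceeded -> Retry
  | Sc_rollup_does_not_exist -> Abort
  | Sc_rollup_no_game | Sc_rollup_timeout_level_not_reached -> Forget
  | Other _ -> Abort

let string_of_action = function
  | Retry -> "Retry"
  | Forget -> "Forget"
  | Abort -> "Abort"

(* Print the action chosen for a few sample errors. *)
let () =
  List.iter
    (fun (name, err) ->
      Printf.printf "%s: %s\n" name
        (string_of_action (action_for_timeout_error err)))
    [
      ("Gas.Operation_quota_exceeded", Operation_quota_exceeded);
      ("Sc_rollup_no_game", Sc_rollup_no_game);
      ("unlisted error", Other "unexpected");
    ]
```

Matching on a closed variant lets the compiler check that every listed error has an action, while the `Other` case keeps **Abort** as the default for anything unlisted.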