--- tags: game-of-chains --- # Consumer Chains Validator Set Updates Status: - `2022-11-14 2:40 UTC` - The `apollo` consumer chain is not receiving validator set updates from the provider chain due to a VSC packet not being handled properly. - The `sputnik` consumer chain seems to be receiving validator set updates from the provider chain. Provider consensus monitor: https://provider-consensus.goc.earthball.xyz/ Sputnik consensus monitor: http://sputnik-consensus.goc.earthball.xyz/ Apollo consensus monitor: http://apollo-consensus.goc.earthball.xyz/ **Latest update at the top:** ### 2022-Nov-14 --- We try clearing the IBC packets in the apollo relayer: ``` hermes clear packets --chain provider --port provider --channel channel-2 2022-11-14T13:42:55.648345Z INFO ThreadId(01) using default configuration from '/home/hermes/.hermes/config.toml' 2022-11-14T13:42:56.845441Z INFO ThreadId(01) PacketRecvCmd{src_chain=provider src_port=provider src_channel=channel-2 dst_chain=apollo}: unreceived packets found: 260 2022-11-14T13:43:11.692351Z INFO ThreadId(01) PacketRecvCmd{src_chain=provider src_port=provider src_channel=channel-2 dst_chain=apollo}: pulled packet data for 0 events; events_total=260 events_left=210 2022-11-14T13:43:26.141233Z INFO ThreadId(01) PacketRecvCmd{src_chain=provider src_port=provider src_channel=channel-2 dst_chain=apollo}: pulled packet data for 0 events; events_total=260 events_left=160 2022-11-14T13:43:40.682975Z INFO ThreadId(01) PacketRecvCmd{src_chain=provider src_port=provider src_channel=channel-2 dst_chain=apollo}: pulled packet data for 0 events; events_total=260 events_left=110 2022-11-14T13:43:55.130085Z INFO ThreadId(01) PacketRecvCmd{src_chain=provider src_port=provider src_channel=channel-2 dst_chain=apollo}: pulled packet data for 0 events; events_total=260 events_left=60 2022-11-14T13:44:09.622661Z INFO ThreadId(01) PacketRecvCmd{src_chain=provider src_port=provider src_channel=channel-2 dst_chain=apollo}: pulled packet data for 0 events; events_total=260 events_left=10 2022-11-14T13:44:12.522770Z INFO ThreadId(01) PacketRecvCmd{src_chain=provider src_port=provider src_channel=channel-2 dst_chain=apollo}: pulled packet data for 0 events; events_total=260 events_left=0 SUCCESS [] ``` That did not fix the out-of-sequence errors: ``` Nov 14 08:48:08 apollo-relayer hermes[79283]: 2022-11-14T13:48:08.976018Z ERROR ThreadId(84) packet_cmd{src_chain=provider src_port=provider src_channel=channel-2 dst_chain=apollo}: will retry: schedule execution encountered error: link failed with underlying error: gRPC call failed with status: status: Unknown, message: "failed to execute message; message index: 1: receive packet verification failed: packet sequence ≠ next receive sequence (654 ≠ 393): packet sequence is out of order", details: [], metadata: MetadataMap { headers: {"content-type": "application/grpc", "x-cosmos-block-height": "50842"} } Nov 14 08:48:08 apollo-relayer hermes[79283]: 2022-11-14T13:48:08.976060Z WARN ThreadId(84) packet_cmd{src_chain=provider src_port=provider src_channel=channel-2 dst_chain=apollo}: task encountered ignorable error: link error: link failed with underlying error: gRPC call failed with status: status: Unknown, message: "failed to execute message; message index: 1: receive packet verification failed: packet sequence ≠ next receive sequence (654 ≠ 393): packet sequence is out of order", details: [], metadata: MetadataMap { headers: {"content-type": "application/grpc", "x-cosmos-block-height": "50842"} } ``` --- ### 2022-Nov-13 --- We restarted the following services without success: - `provider` chain endpoint for the relayer - `apollo` chain endpoint for the relayer - Hermes service in `provider`<->`apollo` relayer machine It looks like this is where the packets stopped being relayed: ``` Nov 12 12:01:15 apollo-relayer hermes[3309]: 2022-11-12T17:01:15.473077Z ERROR ThreadId(23) send_tx_with_account_sequence_retry{id=apollo}:estimate_gas: failed to simulate tx. propagating error to caller: gRPC call failed with status: status: Unknown, message: "failed to execute message; message index: 1: receive packet verification failed: packet sequence ≠ next receive sequence (394 ≠ 393): packet sequence is out of order", details: [], metadata: MetadataMap { headers: {"content-type": "application/grpc", "x-cosmos-block-height": "19235"} } Nov 12 12:01:15 apollo-relayer hermes[3309]: 2022-11-12T17:01:15.473204Z ERROR ThreadId(67) packet_cmd{src_chain=provider src_port=provider src_channel=channel-2 dst_chain=apollo}: will retry: schedule execution encountered error: link failed with underlying error: gRPC call failed with status: status: Unknown, message: "failed to execute message; message index: 1: receive packet verification failed: packet sequence ≠ next receive sequence (394 ≠ 393): packet sequence is out of order", details: [], metadata: MetadataMap { headers: {"content-type": "application/grpc", "x-cosmos-block-height": "19235"} } Nov 12 12:01:15 apollo-relayer hermes[3309]: 2022-11-12T17:01:15.473229Z WARN ThreadId(67) packet_cmd{src_chain=provider src_port=provider src_channel=channel-2 dst_chain=apollo}: task encountered ignorable error: link error: link failed with underlying error: gRPC call failed with status: status: Unknown, message: "failed to execute message; message index: 1: receive packet verification failed: packet sequence ≠ next receive sequence (394 ≠ 393): packet sequence is out of order", details: [], metadata: MetadataMap { headers: {"content-type": "application/grpc", "x-cosmos-block-height": "19235"} } ``` We are seeing the following errors in the hermes log shortly after starting the service: ```log= Nov 13 21:25:39 apollo-relayer hermes[79283]: 2022-11-14T02:25:39.928279Z INFO ThreadId(01) Hermes has started Nov 13 21:25:43 apollo-relayer hermes[79283]: 2022-11-14T02:25:43.796261Z WARN ThreadId(83) DetectMisbehaviorWorker{client=07-tendermint-1 src_chain=apollo dst_chain=provider}: [apollo -> provider:07-tendermint-1] misbehaviour checking is being disabled: MisbehaviourExitSubdetail { reason: "could not extract header from update client event cs_h: 07-tendermint-1(0-42650) emitted by chain ChainId { id: \"provider\", version: 0 }" } Nov 13 21:25:47 apollo-relayer hermes[79283]: 2022-11-14T02:25:47.497160Z WARN ThreadId(83) DetectMisbehaviorWorker{client=07-tendermint-1 src_chain=apollo dst_chain=provider}: [apollo -> provider:07-tendermint-1] misbehaviour checking is being disabled: MisbehaviourExitSubdetail { reason: "could not extract header from update client event cs_h: 07-tendermint-1(0-42769) emitted by chain ChainId { id: \"provider\", version: 0 }" } Nov 13 21:26:12 apollo-relayer hermes[79283]: 2022-11-14T02:26:12.395398Z INFO ThreadId(84) packet_cmd{src_chain=provider src_port=provider src_channel=channel-2 dst_chain=apollo}: pulled packet data for 0 events; events_total=248 events_left=198 Nov 13 21:26:33 apollo-relayer hermes[79283]: 2022-11-14T02:26:33.733641Z INFO ThreadId(84) packet_cmd{src_chain=provider src_port=provider src_channel=channel-2 dst_chain=apollo}: pulled packet data for 0 events; events_total=248 events_left=148 Nov 13 21:26:37 apollo-relayer hermes[79283]: 2022-11-14T02:26:37.191107Z WARN ThreadId(26) DetectMisbehaviorWorker{client=07-tendermint-0 src_chain=provider dst_chain=apollo}: [provider -> apollo:07-tendermint-0] misbehaviour checking is being disabled: MisbehaviourExitSubdetail { reason: "could not extract header from update client event cs_h: 07-tendermint-0(0-96029) emitted by chain ChainId { id: \"apollo\", version: 0 }" } ``` And this one shows up regularly: ```log= Nov 13 21:20:31 apollo-relayer hermes[72395]: 2022-11-14T02:20:31.409503Z ERROR ThreadId(23) send_tx_with_account_sequence_retry{id=apollo}: broadcast_tx_sync: Response { code: Err(22), data: Data([]), log: Log("packet messages are redundant"), hash: transaction::Hash(ECF974DB85D2051CC39675643C28CBB36FC36E90DD34D040D761EB85909BEA82) }: diagnostic: unknown TX sync response error: 22 ``` Our Hermes config: ```toml= [global] log_level = 'info' [mode] [mode.clients] enabled = true refresh = true misbehaviour = true [mode.connections] enabled = false [mode.channels] enabled = true [mode.packets] enabled = true clear_interval = 100 clear_on_start = true tx_confirmation = true [rest] enabled = true host = '127.0.0.1' port = 3000 [telemetry] enabled = true host = '127.0.0.1' port = 3001 [[chains]] id = 'apollo' rpc_addr = 'http://apollo-backup.goc.earthball.xyz:26657' grpc_addr = 'http://apollo-backup.goc.earthball.xyz:9090' websocket_addr = 'ws://apollo-backup.goc.earthball.xyz:26657/websocket' rpc_timeout = '10s' account_prefix = 'cosmos' key_name = 'testkey' store_prefix = 'ibc' max_gas = 2000000 fee_granter = '' gas_price = { price = 0.001, denom = 'upol' } gas_multiplier = 1.1 clock_drift = '30s' trusting_period = '2days' trust_threshold = { numerator = '1', denominator = '3' } [[chains]] id = 'provider' rpc_addr = 'http://provider-sentry-01.goc.earthball.xyz:27001' grpc_addr = 'http://provider-sentry-01.goc.earthball.xyz:27003' websocket_addr = 'ws://provider-sentry-01.goc.earthball.xyz:27001/websocket' rpc_timeout = '10s' account_prefix = 'cosmos' key_name = 'testkey' store_prefix = 'ibc' max_gas = 2000000 fee_granter = '' gas_price = { price = 0.001, denom = 'uprov' } gas_multiplier = 1.1 clock_drift = '30s' trusting_period = '2days' trust_threshold = { numerator = '1', denominator = '3' } ``` --- Claimens from Crypto Crew expects the `07-tendermint-1` client in the provider chain will expire around Nov 14 at 16:00 UTC if relayers aren't able to relay packet 399. https://discord.com/channels/669268347736686612/1034957556038254612/1041303299246264320 They want to know if the coordinator relayer is in some weird state, but they think it could be any other relayer out there too. --- ### 2022-Nov-12 --- Claimens from Crypto Crew Validators reported VSC packets stuck between `provider` and `apollo`: https://discord.com/channels/669268347736686612/1034957556038254612/1041039228810829854) ```log= ThreadId(01) fetched channel from source chain: IdentifiedChannelEnd { port_id: PortId("consumer"), channel_id: ChannelId("channel-0"), channel_end: ChannelEnd { state: Open, ordering: Ordered, remote: Counterparty { port_id: PortId("provider"), channel_id: Some(ChannelId("channel-2")) }, connection_hops: [ConnectionId("connection-0")], version: Version("\n-cosmos17xpfvakm2amg962yls6f84z3kell8c5lserqta\u{12}\u{1}1") } } chain=apollo SUCCESS Summary { src: PendingPackets { unreceived_packets: [], unreceived_acks: [], }, dst: PendingPackets { unreceived_packets: [ Sequence( 393, ), Sequence( 394, ), Sequence( 395, ), Sequence( 396, ), Sequence( 397, ), Sequence( 398, ), Sequence( 399, ), ], unreceived_acks: [], }, } ``` --- ### 2022-Nov-11 --- People started reporting validators not being synced properly in the Sputnik chain. https://discord.com/channels/669268347736686612/1040649755912589343/1040653174408753282 ---