Goerli TX DOS post-mortem

First reported at 8:40 CET by @jcksie & Jakub from Status on the ETH R&D Discord, and subsequently confirmed by Pari from EF devops.

CPU usage suddenly spiked at 1am CEST on all geth nodes on Goerli. Disk reads and writes increased significantly on those nodes, as did network ingress and egress.

Unfortunately, we on the geth team do not have Goerli nodes in our own Grafana instance, so we had to rely on EF devops to look into the issue.

EF devops does not support all of the very geth-specific metrics, since those are based on InfluxDB, not Prometheus.

I had a hunch that txpool churn was the cause. @ding_wanning and I had worked on a PR to mitigate some DoS issues during the Protocol Fellowship: https://github.com/ethereum/go-ethereum/pull/26648

This PR adds two rules to the transaction pool:

  • Transactions that would overspend a sender's balance are rejected
  • Future transactions that churn pending transactions are rejected
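
To make the two rules concrete, here is a minimal sketch of how such checks could look in a toy pool. It is not the code from the PR: the Pool and Tx types, the error names and the single Cost field are all hypothetical simplifications. Evicting cheaper transactions and promoting queued ones are deliberately left out.

```go
package main

import (
	"errors"
	"fmt"
)

// Tx is a toy transaction. Cost stands in for value plus maximum gas fees.
// (Hypothetical type; the real go-ethereum transaction is far richer.)
type Tx struct {
	Sender string
	Nonce  uint64
	Cost   uint64
}

// Hypothetical error values for this sketch only.
var (
	ErrOverdraft   = errors.New("transaction would overspend sender balance")
	ErrFutureChurn = errors.New("future transaction would churn pending transactions")
	ErrNonceTooLow = errors.New("nonce too low")
	ErrPoolFull    = errors.New("pool full (eviction not modeled in this sketch)")
)

// Pool is a heavily simplified transaction pool that only tracks counters.
type Pool struct {
	Capacity  int               // maximum number of transactions held
	Size      int               // number of transactions currently held
	Balances  map[string]uint64 // account balances
	Spent     map[string]uint64 // total cost of pooled transactions per sender
	NextNonce map[string]uint64 // next executable nonce per sender
}

func NewPool(capacity int) *Pool {
	return &Pool{
		Capacity:  capacity,
		Balances:  map[string]uint64{},
		Spent:     map[string]uint64{},
		NextNonce: map[string]uint64{},
	}
}

// Add applies the two rules and, if the transaction passes, updates the counters.
func (p *Pool) Add(tx Tx) error {
	balance, spent := p.Balances[tx.Sender], p.Spent[tx.Sender]
	// Rule 1: reject if already-pooled costs plus this one exceed the balance.
	// Written as a subtraction to avoid uint64 overflow.
	if spent > balance || tx.Cost > balance-spent {
		return ErrOverdraft
	}
	next := p.NextNonce[tx.Sender]
	if tx.Nonce < next {
		return ErrNonceTooLow
	}
	full := p.Size >= p.Capacity
	// Rule 2: a nonce-gapped ("future") transaction may not push out
	// executable ones, so it is rejected outright when the pool is full.
	if tx.Nonce > next && full {
		return ErrFutureChurn
	}
	if full {
		return ErrPoolFull
	}
	p.Spent[tx.Sender] = spent + tx.Cost
	p.Size++
	if tx.Nonce == next {
		p.NextNonce[tx.Sender] = next + 1
	}
	return nil
}

func main() {
	p := NewPool(2)
	p.Balances["alice"] = 100
	p.Balances["bob"] = 100
	fmt.Println(p.Add(Tx{Sender: "alice", Nonce: 0, Cost: 60})) // accepted
	fmt.Println(p.Add(Tx{Sender: "alice", Nonce: 1, Cost: 60})) // overspends
	fmt.Println(p.Add(Tx{Sender: "alice", Nonce: 1, Cost: 40})) // accepted, pool now full
	fmt.Println(p.Add(Tx{Sender: "bob", Nonce: 5, Cost: 10}))   // future tx, would churn
}
```

In this sketch both checks run before any state is touched, so a rejected transaction costs the node only a few map lookups.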

We set up a node with this PR at 9:56 and noticed a huge influx of transactions being rejected, along with warnings like this in the log:

WARN [03-10|10:04:04.643] Peer delivering stale transactions peer=7490f10d6d03c09dacbcfb43295163053ca0376d9171451fdac21f11f5a8e390 rejected=43
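
For triage, warnings like the one above can be tallied per peer. The following is a rough sketch that assumes log lines have exactly the shape of the quoted warning; ParseStale and the regular expression are illustrative helpers, not part of geth.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// staleRe matches the peer and rejected fields of the warning quoted above.
// It assumes plain-text output without color codes.
var staleRe = regexp.MustCompile(`Peer delivering stale transactions\s+peer=([0-9a-f]+)\s+rejected=(\d+)`)

// ParseStale extracts the peer ID and the rejected count from one log line.
// ok is false when the line is not a stale-transaction warning.
func ParseStale(line string) (peer string, rejected int, ok bool) {
	m := staleRe.FindStringSubmatch(line)
	if m == nil {
		return "", 0, false
	}
	n, err := strconv.Atoi(m[2])
	if err != nil {
		return "", 0, false
	}
	return m[1], n, true
}

func main() {
	line := "WARN [03-10|10:04:04.643] Peer delivering stale transactions " +
		"peer=7490f10d6d03c09dacbcfb43295163053ca0376d9171451fdac21f11f5a8e390 rejected=43"
	totals := map[string]int{} // rejected transactions per peer
	if peer, n, ok := ParseStale(line); ok {
		totals[peer] += n
	}
	for peer, n := range totals {
		fmt.Printf("%s... rejected=%d\n", peer[:8], n)
	}
}
```

Fed a whole log file line by line, the totals map would show which peers relay the most rejected transactions.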

All metrics (memory, CPU, ingress, egress and disk writes) went down significantly afterwards.

After some more fixes to the PR, we decided to update all EF Goerli geth nodes to run it.
