---
tags: lighthouse
---

# Monitoring Lighthouse Backfill Processing CPU Usage

**Issue**: https://github.com/sigp/lighthouse/issues/3212
**Pull Request**: https://github.com/sigp/lighthouse/pull/3936

Rate-limiting on backfill sync processing has been introduced to solve the CPU consumption issue on lower-spec machines. To determine a good default processing rate, it is helpful to understand:

- **the processing time** for each backfill batch, so we don't schedule processing in a critical time window (e.g. at slot start) and impact validator performance
- **the CPU usage** improvement when rate-limiting is applied to backfill processing
- the impact on the **overall time** to complete backfill sync

Things to consider include:

- whether to use a fixed default rate (e.g. 3 batches per slot) or a rate based on the number of worker threads / CPU cores available
- the timing of when to schedule processing

## Batch processing times

<!--
### Goerli

**NOTE:** The metrics below were captured from running a Lighthouse node on Goerli, using Docker. Mainnet processing times will likely be longer due to bigger blocks.

- **Sample #1**: macOS, M1 Pro, no CPU limit
- **Sample #2**: Linux, Docker, container CPU limit set to 2
- **Sample #3**: Linux, Docker, container CPU limit set to 2

| Description                   | Sample #1 | Sample #2 | Sample #3 |
|:----------------------------- | ---------:| ---------:| ---------:|
| Number of batches processed   |        10 |       124 |        44 |
| Processing time (ms) - Min    |        95 |       121 |       107 |
| Processing time (ms) - Max    |       187 |       600 |       174 |
| Processing time (ms) - Median |     123.5 |       171 |       140 |
| Processing time (ms) - Mean   |     130.9 |    207.06 |    138.98 |
| Processing time (ms) - STD    |     27.49 |    95.157 |     17.15 |
-->

### Mainnet

**Setup**: Docker, CPU limit = 2

**Summary:**

- The batch processing time generally fluctuates between 180 ms and 250 ms
- When NOT rate-limited, the `BeaconProcessor` can process
up to 8 batches per slot

#### Rate limit disabled

![](https://i.imgur.com/D92ad64.png)

#### Rate limit enabled (6, 7, 9 secs after slot start)

![](https://i.imgur.com/9399zLz.png)

<!--
**UPDATE (2023/02/23)** Added some Mainnet metrics below

I had some issues connecting to a remote EE with Docker (I wanted to use Docker as it's easier to limit CPU resources with it), so I've applied a workaround to limit CPU by setting a worker thread count in code (thanks to @AgeManning).

The configuration:

- tokio worker threads set to 3 (to simulate 2 CPU cores: 3 worker threads + the main thread)
- running on macOS
- connecting to a remote execution engine

**Result**: processing times mostly fall between **140 ms and 180 ms per batch**, and it processes 10-15 batches per minute when rate-limited to 3 batches per slot.

![](https://i.imgur.com/k617Gci.png)
-->

<!--
## CPU metrics and backfill rate comparison

Below are some metrics comparing the current stable version against [this branch](https://github.com/sigp/lighthouse/pull/3936) with rate-limiting enabled:

- **1 batch / slot**: batch processing at 6s after slot start
- **3 batches / slot**: batch processing at 6s, 7s, and 9s after slot start

#### Setup

- **Network**: Goerli
- **Operating System**: Ubuntu 22.04 LTS
- Tested in Docker with **CPU limit set to 2**

#### Comparison of stats over a 10-minute sample:

| Description                        | No rate limit (stable) | 1 batch / slot | 3 batches / slot |
|:---------------------------------- | ----------------------:| --------------:| ----------------:|
| Number of batches processed        |                    203 |             44 |              124 |
| Average CPU % during backfill sync |                  47.7% |          31.2% |            42.7% |
-->
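The rate-limit schedule tested above (batches processed at 6, 7 and 9 seconds after slot start, i.e. 3 batches per 12-second slot) can be sketched as a small scheduling helper. This is a hypothetical illustration, not Lighthouse's actual implementation: `delay_until_next_batch`, the `SLOT_DURATION` constant and the hard-coded offsets are assumptions for this sketch.

```rust
use std::time::Duration;

/// Mainnet slot duration (12 seconds).
const SLOT_DURATION: Duration = Duration::from_secs(12);

/// Offsets within a slot at which a backfill batch may be processed,
/// matching the schedule tested above: 6s, 7s and 9s after slot start.
const SCHEDULE: [Duration; 3] = [
    Duration::from_secs(6),
    Duration::from_secs(7),
    Duration::from_secs(9),
];

/// Hypothetical helper: given the time elapsed since the start of the
/// current slot, return how long to wait before processing the next batch.
fn delay_until_next_batch(since_slot_start: Duration) -> Duration {
    for offset in SCHEDULE {
        if since_slot_start < offset {
            return offset - since_slot_start;
        }
    }
    // All events for this slot have passed: wait until the first
    // scheduled offset of the next slot.
    SLOT_DURATION - since_slot_start + SCHEDULE[0]
}

fn main() {
    // 2s into a slot -> next event is the 6s mark, 4s away.
    println!("{:?}", delay_until_next_batch(Duration::from_secs(2)));
    // 8s into a slot -> next event is the 9s mark, 1s away.
    println!("{:?}", delay_until_next_batch(Duration::from_secs(8)));
    // 10s into a slot -> wrap to the 6s mark of the next slot, 8s away.
    println!("{:?}", delay_until_next_batch(Duration::from_secs(10)));
}
```

Keeping all offsets in the second half of the slot avoids competing with attestation duties at slot start, which is the critical window the prose above flags.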