Lighthouse Tracing

## Lighthouse Tracing ### Experiment - Tracing span visualisation with Tempo - Logs + Metrics Correlation (`trace_id`) - given a log with `trace_id`, its possible to view its trace - given a trace, it's possible to see all logs within the trace / span ### Interesting Traces #### `pending_components` Shows all events for a pending component, and the timings. ![image](https://hackmd.io/_uploads/SyN4zP9vgg.png) #### `proces_gossip_blob` (`2.45s`) `recompute_head` taking 1+ seconds. Multiple occurences. Doesn't seem to happen on `process_gossip_block` path. ![image](https://hackmd.io/_uploads/r1bU4TYPeg.png) #### `process_gossip_block` ![image](https://hackmd.io/_uploads/HJ1AEx5Pex.png) #### `process_rpc_blob` `import_block_update_validator_monitor` took unnecessary long ![image](https://hackmd.io/_uploads/BynZWeqvee.png) #### `process_rpc_block` ![image](https://hackmd.io/_uploads/SJuZBlqwlx.png) #### `process_chain_segmenet` (`15.34s`) - `3.84s` before processing the first block - first block took `3.16s` ![image](https://hackmd.io/_uploads/Hyaus1qDge.png) First block: ![image](https://hackmd.io/_uploads/HJWHRy9vge.png) I instrumented the gap in the above trace and looks like its signature verification taking some time: ![image](https://hackmd.io/_uploads/S1Vwdl9vee.png) #### `get_advanced_hot_state` (`3.6s`) 2s building caches seems a bit excessive ![image](https://hackmd.io/_uploads/H1XPnyYPxe.png) #### `attester_cache_update` (`1.33s`) ![image](https://hackmd.io/_uploads/SJVCBrPPgx.png) look into `overflow_lru_cache`: - state might be removed too early - then we rebuild the cachces when import the block (the attester cache write should be later anyway) ### Learnings - use `#[instrument]` if creating a span for the entire function - use `.instrument` for async closures that are passed to `tokio::spawn_task` - use `span!(...).in_scope(|| {})` for creating a scoped span #### Using Spans in async code https://docs.rs/tracing/latest/tracing/span/struct.Span.html > Warning: in asynchronous code that uses async/await syntax, Span::enter should be used very carefully or avoided entirely. Holding the drop guard returned by Span::enter across .await points will result in incorrect traces. > To enter a span for a synchronous section of code within an async block or function, prefer `Span::in_scope`. Since in_scope takes a synchronous closure and exits the span when the closure returns, the span will always be exited before the next await point.