Injective stress testing plan

Question: gather info on how to increase TPS without increasing block times:
- Does increasing the mempool size from x to 200 lead to increased block times? What single incident did you observe?
- Which other factors affect block times, such as block size and p2p?
- A handful of txs causes the same block times as 200 txs -> what is the relation between txs submitted and block sizes? It might be that in the handful-of-txs case the network has the same mempool size as in the 200-txs case, and the only difference is which txs are finalized into the block.
- Debug extended blocks: we should investigate mempool and block size.
- Why is the network slow (for all nodes) at certain times, and which factors cause it? We should investigate mempool and block size => is this the same case as extended blocks?
- Attack vector: long memo -> this goes back to the problem of high load/mempool.

Research

Factors that affect block time:

Tendermint params (consensus params that affect p2p):
- mempool size
- `block.max_bytes`, `block.max_gas`

Other factors:
- geolocation: set up nodes in different regions to compare

Node performance:
- hardware spec, single-core vs multi-core CPU
- db
- pruned vs unpruned

We only need to benchmark commit time and compare commit time to block time.

General code performance: add more profiling. This belongs to node performance -> low priority.

Solution:

p2p investigation:
- Write a load test script to spam exchange transactions
- Set up a testnet or use the current testnet (how many nodes do we have / how decentralized is it?). If using your testnet, we will also add some nodes of ours
- Build config module
- Try out different mempool/broadcast methods such as cat mempool or libp2p
- Try out different configs

Node performance investigation:
- Benchmark commit time with different prune modes/db/hardware

* Specific for each test:
  * For determining `block.max_gas` and `block.max_bytes`, we will set up 3 testnets: one with the current `block.max_gas`, one with max_gas decreased, and one with max_gas increased.
    See how this affects block times: under traffic, do blocks reach the block size limit, and which block size fits the gas limit?
  * For determining the IAVL impact, we will set up a testnet with a full node and a pruning node and compare the signing rate of the nodes.
* Set up traffic environment:
  * We need a script that sends 200+, 500+, and 1000+ txs, corresponding to `mempool.max_size`; it should primarily include exchange-module messages.
  * The txs need to be sent in parallel within a fixed time window (our block time is currently around 0.8s, so it is hard to include 200 txs in a block when sending sequentially).
* Results:
  * Check signing rate in real time. Use the scripts shared by Achilleas.
  * Query block info during the traffic window. The result will be a correlation chart between the params being tested and the block size and block time.
  * Example: ![265780912-23233227-1a4c-4176-8895-846a7b51fd74](https://hackmd.io/_uploads/HkezPUE6T.png)
* Determine bottlenecks on nodes.
  * How IAVL state affects node performance
    * In theory, growing the IAVL tree increases processing cost by log(n), but we still need a test. We need to set up 2 testnets with different IAVL sizes (one 2x the other).
    * Once the testnets are up, query block time & commit time.
    * iavl v2 will remove fastnode
* Determine the root cause of extended blocks: why uptime/signing rate decreases network-wide for all validators at certain times.
  * TBD
* Evaluate different DBs & CPU & P2P
  * TBD
* IBC DDoS
  * We can set up a list of IBC txs for spamming. Run in parallel, same as the traffic scenario.
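
The parallel traffic scenario described under "Set up traffic environment" could be sketched roughly as below. This is only an outline under stated assumptions: `fake_broadcast` is a hypothetical stand-in for whatever client call actually signs and broadcasts an exchange-module tx, and the names `run_burst` and `run_scenarios` are illustrative, not part of any Injective tooling.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_burst(broadcast, txs, workers=32):
    """Submit all txs concurrently; return (results, elapsed_seconds)."""
    start = time.monotonic()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map keeps results in the same order as the submitted txs.
        results = list(pool.map(broadcast, txs))
    return results, time.monotonic() - start


def fake_broadcast(tx):
    """Hypothetical placeholder for a real sign-and-broadcast call."""
    time.sleep(0.01)  # simulate network latency
    return {"tx": tx, "ok": True}


def run_scenarios(broadcast, sizes=(200, 500, 1000)):
    """Run one burst per batch size and summarize accepted txs and duration."""
    report = {}
    for size in sizes:
        # Placeholder payloads; a real run would build exchange-module messages.
        txs = [f"tx-{size}-{i}" for i in range(size)]
        results, elapsed = run_burst(broadcast, txs)
        accepted = sum(1 for r in results if r["ok"])
        report[size] = {"accepted": accepted, "elapsed_s": elapsed}
    return report


if __name__ == "__main__":
    for size, stats in run_scenarios(fake_broadcast).items():
        print(size, stats)
```

On a Cosmos SDK chain, txs from one account are ordered by that account's sequence number, so a real script would likely need many funded sender accounts (or explicit sequence management) to get true parallelism.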
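
For the "Results" step, the correlation between block contents and block time could be computed along these lines. This is a minimal sketch: it assumes block height, header time, tx count, and block size have already been extracted from block queries into the hypothetical `BlockSample` tuples, and the sample values in the demo are made up for illustration, not measurements.

```python
from math import sqrt
from typing import NamedTuple


class BlockSample(NamedTuple):
    height: int
    time_s: float    # block header time as Unix seconds
    num_txs: int     # transactions finalized in the block
    size_bytes: int  # serialized block size


def block_intervals(samples):
    """Return (block, seconds since previous block) for consecutive heights."""
    ordered = sorted(samples, key=lambda b: b.height)
    return [
        (cur, cur.time_s - prev.time_s)
        for prev, cur in zip(ordered, ordered[1:])
        if cur.height == prev.height + 1  # skip gaps in the sampled range
    ]


def pearson(xs, ys):
    """Pearson correlation coefficient; 0.0 if either series is constant."""
    n = len(xs)
    if n < 2:
        return 0.0
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def correlate(samples):
    """Correlate tx count and block size with the time taken per block."""
    pairs = block_intervals(samples)
    times = [dt for _, dt in pairs]
    return {
        "txs_vs_block_time": pearson([b.num_txs for b, _ in pairs], times),
        "size_vs_block_time": pearson([b.size_bytes for b, _ in pairs], times),
    }


if __name__ == "__main__":
    demo = [  # made-up values, for illustration only
        BlockSample(1, 0.0, 5, 1000),
        BlockSample(2, 0.8, 10, 2000),
        BlockSample(3, 1.9, 200, 40000),
        BlockSample(4, 2.7, 12, 2400),
    ]
    print(correlate(demo))
```

Running the same computation per testnet (current, decreased, and increased `block.max_gas`) would give comparable numbers for the chart.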
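
As a rough expectation for the 2x IAVL size test, the log(n) argument can be written out. This assumes a balanced tree with $n$ leaves whose per-key cost is proportional to the tree depth $d(n)$; it ignores caching, disk I/O, and pruning, which is why the test is still needed.

```latex
\begin{aligned}
d(n)  &\approx \log_2 n \\
d(2n) &\approx \log_2 (2n) = \log_2 n + 1 \\
\frac{d(2n) - d(n)}{d(n)} &\approx \frac{1}{\log_2 n}
\end{aligned}
```

For example, with $n = 2^{20}$ (about one million leaves), doubling the tree takes the depth from 20 to 21, roughly a 5% increase. If the measured difference in commit time between the two testnets is much larger than this, the bottleneck is probably something other than tree depth.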