###### tags: `LéCøre`

# 008 Migration benchmarking

[TOC]

We test the two large migrations: flatten storage (https://gitlab.com/metastatedev/tezos/-/merge_requests/231) and baking accounts (https://gitlab.com/metastatedev/tezos/-/merge_requests/255).

## Storage backend (Irmin)

During the migration, Irmin reads data from disk, but nothing is written back to disk until the whole migration has finished. All the migrated data therefore live in the in-memory context, which is not freed until the migration block is committed and the data is written to disk.

## Methodology

The main concerns about these migrations are how long they take to run and how much memory they use. We profile memory usage using the OCaml `4.09.1+spacetime` compiler variant with the automated Tezt migration test.

- manual: https://caml.inria.fr/pub/docs/manual-ocaml/spacetime.html
- result viewer: https://github.com/lpw25/prof_spacetime
- blog post: https://blog.janestreet.com/a-brief-trip-through-spacetime/

```shell
# run the test, writes profiling data into spacetime-xxx files
export OCAML_SPACETIME_INTERVAL=60000 && dune exec ./tezt/manual_tests/main.exe -- migration --log-file spacetime-ba.log

# view a spacetime profile file
prof_spacetime serve spacetime-xxx
```

## Results

### On a Mac laptop

- CPU: 3.1 GHz Quad-Core Intel Core i7
- memory: 16 GB 2133 MHz LPDDR3

The baking-accounts migration took about 12 minutes and storage flattening about 20 minutes; both finished successfully without running out of memory. They were also run with the spacetime variant.

### On cloud

- CPU: Intel(R) Xeon(R) Platinum 8175M CPU @ 2.50GHz
- memory: 8 GB

Both migrations run out of memory at around the `Roll.Owner.Snapshot` migration, which is the largest part. Unfortunately, we did not succeed in installing spacetime on this machine (after a long opam solver run, it fails on installing `dune`).

As we can see from the spacetime profiling data from the laptop, both migrations currently require a little over 8 GB of memory.
### Spacetime Graphs

- baking accounts, run 1: ![baking accounts, run 1](https://hackmd.io/_uploads/H1cG6yvFD.png)
- baking accounts, run 2: ![baking accounts, run 2](https://hackmd.io/_uploads/H1GIpyvKD.png)

For baking-accounts, the time 1200 s roughly corresponds to when the snapshots migration started, and we can see that it takes the largest chunk of the total consumed memory.

- storage flattening: ![storage flattening](https://hackmd.io/_uploads/BJdK61PtP.png)

## What's next

A new storage backend is about to be merged into master within the next few weeks, which is going to affect our results. Unfortunately for us, it has been noted that the new storage takes more RAM than the previous store (but is faster).

Some small gains might be possible by using lower-level primitives in the migration that avoid extra allocations. However, these are probably not going to be significant enough.

Another change that may provide some gain for baking accounts is to keep a map of the bakers' consensus keys for shared access (currently, they are read from the `Baker.Consensus_key_rev` storage in 15 places). Whether this will be beneficial depends on how efficient Irmin is at sharing this data when it is read from storage multiple times.

### Gradual rolls migration

As the rolls snapshots migration is the most intensive, we can explore the possibility of not migrating the *existing* roll snapshots at the transition block, but only writing *new* roll snapshots using the new storage type.
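The consensus-key map idea can be sketched in plain OCaml. The `read_from_storage` function and the string-typed keys below are hypothetical stand-ins for the real `Baker.Consensus_key_rev` accessor; the point is only that a shared `Hashtbl` cache collapses repeated reads of the same key into a single storage access:

```ocaml
(* Sketch: memoize consensus-key lookups in an in-memory map so that the
   underlying storage is read at most once per key. The accessor below is
   a hypothetical stand-in for Baker.Consensus_key_rev, not the real API. *)

let storage_reads = ref 0

(* hypothetical storage read: consensus key -> baker *)
let read_from_storage key =
  incr storage_reads;
  "baker_of_" ^ key

(* wrap the accessor with a shared cache *)
let cached_read =
  let cache : (string, string) Hashtbl.t = Hashtbl.create 16 in
  fun key ->
    match Hashtbl.find_opt cache key with
    | Some baker -> baker
    | None ->
        let baker = read_from_storage key in
        Hashtbl.add cache key baker;
        baker

let () =
  (* 15 reads of the same key hit the storage only once *)
  for _ = 1 to 15 do
    ignore (cached_read "edpk_example")
  done;
  assert (!storage_reads = 1);
  print_endline "cache ok"
```

Whether this beats Irmin's own sharing of repeatedly-read values is exactly the open question above; the benchmark in the TODOs would settle it.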
Taking advantage of the facts that we know:

- roll snapshots are written 6 cycles into the future (`preserved_cycles + 1`)
- old rolls are cleared from the storage (after `preserved_cycles`)
- bakers' consensus keys take at least 6 cycles to update (`preserved_cycles + 1`)

It should be possible to:

- for 6 cycles from the protocol activation, read the data from the old storage type and reverse look-up bakers by their consensus keys (bakers' consensus keys cannot change during these first 6 cycles; if changed, they would only be pending until the 7th cycle)
- write new roll snapshots using the new storage type

This would combine both the storage flattening and the baking-accounts change.

### TODOs

- [x] benchmark without roll snapshots migration
  - for baking-accounts, the memory usage roughly halved (a little over 4 GB at most)
- [ ] benchmark baking-accounts with an in-memory baker consensus key map (skips 15 occurrences of reads from `Baker.Consensus_key_rev`)
- [ ] try to rewrite the baking-accounts roll snapshots migration using lower-level primitives and compare
- [ ] benchmark with the new storage backend, once merged
- [ ] benchmark migration *without* the roll owner snapshots migration
- [ ] prototype gradual migration
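The cycle arithmetic behind the gradual scheme can be checked with a toy OCaml model. All names here are illustrative, not protocol code; the model only encodes the two facts above: snapshots written before the transition keep the old format, and old entries age out after `preserved_cycles`, so no bulk migration of existing snapshots is needed:

```ocaml
(* Toy model of the gradual rolls migration. Snapshots written before the
   transition cycle stay in the old format; snapshots written at or after
   it use the new format. Old-format entries only survive for
   preserved_cycles past the transition, so the dual-read window is
   bounded. Illustrative sketch, not the protocol's actual types. *)

let preserved_cycles = 5

type format = Old | New

(* format of a snapshot, given the cycle it was written at *)
let snapshot_format ~transition_cycle ~written_at =
  if written_at < transition_cycle then Old else New

(* whether a reader at [current_cycle] must still understand the old
   format: old snapshots are cleared preserved_cycles + 1 cycles in *)
let must_read_old ~transition_cycle ~current_cycle =
  current_cycle < transition_cycle + preserved_cycles + 1

let () =
  let t = 100 in
  (* snapshots written before activation keep the old format *)
  assert (snapshot_format ~transition_cycle:t ~written_at:99 = Old);
  (* snapshots written from activation on use the new format *)
  assert (snapshot_format ~transition_cycle:t ~written_at:100 = New);
  (* during the first 6 cycles both formats coexist... *)
  assert (must_read_old ~transition_cycle:t ~current_cycle:105);
  (* ...after which the old snapshots have been cleared *)
  assert (not (must_read_old ~transition_cycle:t ~current_cycle:106));
  print_endline "gradual migration model ok"
```

The reverse consensus-key lookup is safe inside that same window because, per the facts above, a pending key change cannot take effect before the 7th cycle.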