Tuyen Nguyen
    • Create new note
    • Create a note from template
      • Sharing URL Link copied
      • /edit
      • View mode
        • Edit mode
        • View mode
        • Book mode
        • Slide mode
        Edit mode View mode Book mode Slide mode
      • Customize slides
      • Note Permission
      • Read
        • Only me
        • Signed-in users
        • Everyone
        Only me Signed-in users Everyone
      • Write
        • Only me
        • Signed-in users
        • Everyone
        Only me Signed-in users Everyone
      • Engagement control Commenting, Suggest edit, Emoji Reply
      • Invitee
    • Publish Note

      Share your work with the world Congratulations! 🎉 Your note is out in the world Publish Note

      Your note will be visible on your profile and discoverable by anyone.
      Your note is now live.
      This note is visible on your profile and discoverable online.
      Everyone on the web can find and read all notes of this public team.
      See published notes
      Unpublish note
      Please check the box to agree to the Community Guidelines.
      View profile
    • Commenting
      Permission
      Disabled Forbidden Owners Signed-in users Everyone
    • Enable
    • Permission
      • Forbidden
      • Owners
      • Signed-in users
      • Everyone
    • Suggest edit
      Permission
      Disabled Forbidden Owners Signed-in users Everyone
    • Enable
    • Permission
      • Forbidden
      • Owners
      • Signed-in users
    • Emoji Reply
    • Enable
    • Versions and GitHub Sync
    • Note settings
    • Engagement control
    • Transfer ownership
    • Delete this note
    • Save as template
    • Insert from template
    • Import from
      • Dropbox
      • Google Drive
      • Gist
      • Clipboard
    • Export to
      • Dropbox
      • Google Drive
      • Gist
    • Download
      • Markdown
      • HTML
      • Raw HTML
Menu Note settings Sharing URL Create Help
Create Create new note Create a note from template
Menu
Options
Versions and GitHub Sync Engagement control Transfer ownership Delete this note
Import from
Dropbox Google Drive Gist Clipboard
Export to
Dropbox Google Drive Gist
Download
Markdown HTML Raw HTML
Back
Sharing URL Link copied
/edit
View mode
  • Edit mode
  • View mode
  • Book mode
  • Slide mode
Edit mode View mode Book mode Slide mode
Customize slides
Note Permission
Read
Only me
  • Only me
  • Signed-in users
  • Everyone
Only me Signed-in users Everyone
Write
Only me
  • Only me
  • Signed-in users
  • Everyone
Only me Signed-in users Everyone
Engagement control Commenting, Suggest edit, Emoji Reply
Invitee
Publish Note

Share your work with the world Congratulations! 🎉 Your note is out in the world Publish Note

Your note will be visible on your profile and discoverable by anyone.
Your note is now live.
This note is visible on your profile and discoverable online.
Everyone on the web can find and read all notes of this public team.
See published notes
Unpublish note
Please check the box to agree to the Community Guidelines.
View profile
Engagement control
Commenting
Permission
Disabled Forbidden Owners Signed-in users Everyone
Enable
Permission
  • Forbidden
  • Owners
  • Signed-in users
  • Everyone
Suggest edit
Permission
Disabled Forbidden Owners Signed-in users Everyone
Enable
Permission
  • Forbidden
  • Owners
  • Signed-in users
Emoji Reply
Enable
Import from Dropbox Google Drive Gist Clipboard
   owned this note    owned this note      
Published Linked with GitHub
Subscribed
  • Any changes
    Be notified of any changes
  • Mention me
    Be notified of mention me
  • Unsubscribe
Subscribe
# Improve BeaconState hashTreeRoot() ## 1. Description As of Jun 2024, holesky validator size is 1.73M and mainnet validator size is 1.45M. We see the BeaconState's hashTreeRoot() was up to 1.5s to prepare for next epoch and it will increase in the upcoming months. Improving this function is critical for lodestar, especially after Electra where we'll likely see longer epoch transition there With the new SIMD implementation of sha256, we hope that we can improve this time more, see below for more details * ### 1.1 Current implementation Majority of the hashTreeRoot() time is for validators so we need to care about it the most. Right now each validator is typed as a ContainerNodeStruct which cache validator value without a tree. When a validator is modified, we need to recreate th wholee tree and compute root for it. Note that the tree is not stored anywhere, we just create when we need it and evicted by gc after all - Pros: - It's super fast to get a value of validator - It's much lighter to store validator as value compared to nodes because an epoch field is tracked as 1 number in value instead of 8 as in a Node - Cons: - Tree creation time is a lot - Cannot cache hashed nodes - Cannot batch hash * ### 1.2 Benchmark This is the benchmark result with a 200k validator beacon state and half of that is modified ``` ✓ BeaconState ViewDU recursive hash vc=200000 1.045194 ops/s 956.7598 ms/op - 13 runs 16.5 s ✓ BeaconState ViewDU recursive hash - commit step vc=200000 30.49162 ops/s 32.79589 ms/op - 12 runs 2.96 s ✓ BeaconState ViewDU validator tree creation vc=100000 4.024561 ops/s 248.4743 ms/op - 15 runs 6.76 s ``` This shows something we need to improve: - Find a way for batch hash - validator creation tree * ### 1.3 SIMD sha256 There's hashtree-js binding work [here](https://github.com/ChainSafe/hashtree-js) which call the great native hashtree implementation of Prysm which promise to be 20x faster than the current version of Lodestar. I'm waiting for Mathew to try its performance on different cpus but if it's 5x faster than the current version, it's already amazing to me Besides that, we also have wasm implementation [here](https://github.com/ChainSafe/ssz/pull/367), the best it can do is 2x faster on Macbook 2017 Intel cpu It shows that hashtree is almost 5x faster than as-sha256 on Mac M1 cpu ``` hasher as-sha256 ✓ hash 2 Uint8Array 500000 times - as-sha256 2.717292 ops/s 368.0135 ms/op - 12 runs 47.8 s ✓ hashTwoObjects 500000 times - as-sha256 2.864475 ops/s 349.1041 ms/op - 12 runs 45.4 s - batchHashObjects - as-sha256 ✓ executeHashComputations - as-sha256 31.25938 ops/s 31.99039 ms/op - 37 runs 3.19 s hashtree ✓ hash 2 Uint8Array 500000 times - hashtree 5.551561 ops/s 180.1295 ms/op - 12 runs 23.4 s ✓ hashTwoObjects 500000 times - hashtree 6.362881 ops/s 157.1615 ms/op - 12 runs 20.4 s - batchHashObjects - hashtree ✓ executeHashComputations - hashtree 149.3710 ops/s 6.694742 ms/op - 136 runs 8.26 s ``` ## 2. Get HashComputations and execute it ### 2.1 persistent-merkle-tree I created this data structure for batch hash work, the field name is not really nice but good enough for a POC ```typescript= export type HashComputation = { src0: Node; src1: Node; dest: Node; }; ``` it shows batch hash is less than 2x faster than regular hash although `hashtree.executeHashComputations()` is 5x faster than as-sha256, I think HashComputation above caused gc to run more? ``` Node batchHash ✓ getHashComputations 250000 nodes 86.79037 ops/s 11.52202 ms/op - 32 runs 28.0 s ✓ batchHash 250000 nodes 19.13791 ops/s 52.25231 ms/op - 29 runs 22.2 s ✓ get root 250000 nodes 10.25390 ops/s 97.52391 ms/op - 18 runs 14.1 s ✓ getHashComputations 500000 nodes 38.27039 ops/s 26.12986 ms/op - 34 runs 57.2 s ✓ batchHash 500000 nodes 8.678133 ops/s 115.2322 ms/op - 27 runs 38.3 s ✓ get root 500000 nodes 5.180939 ops/s 193.0152 ms/op - 12 runs 18.8 s ``` `getHashComputation()` takes 11% - 13% time, it basically do one traversal from root to group HashComputations by level and execute from the bottom. There's an idea to improve `getHashComputation()` as below ### 2.2 Get HashComputations during ViewDU.commit() in ssz ViewDU in ssz() has a great design to track changed nodes and views. Based on this we can compute HashComputations to save one tree traversal (see above). The saved time is very small as I benchmarked, however it's reasonable to do so. It's 10ms saved as in my benchmark result ``` BeaconState ViewDU partially modified tree vc=200000 numModified=100000 ✓ BeaconState ViewDU batchHash - commit & getHashComputation vc=20 1.802902 ops/s 554.6614 ms/op - 79 runs 52.9 s ✓ BeaconState ViewDU hashTreeRoot - commit step vc=200000 1.835587 ops/s 544.7849 ms/op - 59 runs 39.5 s ``` ## 3. Lazily create and persist validator trees TLDR: This is only good enough to start with, persisting the tree in order to batch hash is not a great idea after I run a node on holesky (see below) The current implementation only create validator tree on the fly without tracking it so we cannot do batch hash. First approach is to lazily compute it when needed and persist the tree in BranchNodeStruct ``` BeaconState ViewDU partially modified tree vc=200000 numModified=100000 ✓ BeaconState ViewDU recursive hash vc=200000 1.128123 ops/s 886.4279 ms/op - 27 runs 29.1 s ✓ BeaconState ViewDU recursive hash - commit step vc=200000 13.13220 ops/s 76.14873 ms/op - 45 runs 8.58 s ✓ BeaconState ViewDU validator tree creation vc=100000 2.462462 ops/s 406.0976 ms/op - 67 runs 39.7 s ✓ BeaconState ViewDU batchHash vc=200000 1.292733 ops/s 773.5550 ms/op - 43 runs 37.7 s ✓ BeaconState ViewDU batchHash - commit & getHashComputation vc=20 1.802902 ops/s 554.6614 ms/op - 79 runs 52.9 s ✓ BeaconState ViewDU batchHash - hash step vc=200000 3.886184 ops/s 257.3218 ms/op - 57 runs 57.0 s ✓ BeaconState ViewDU hashTreeRoot vc=200000 1.322128 ops/s 756.3564 ms/op - 63 runs 53.6 s ✓ BeaconState ViewDU hashTreeRoot - commit step vc=200000 1.835587 ops/s 544.7849 ms/op - 59 runs 39.5 s ✓ BeaconState ViewDU hashTreeRoot - hash step vc=200000 3.717275 ops/s 269.0142 ms/op - 57 runs 60.2 s ``` Above benchmark shows that tree creation takes even 1.5x hash time while we cannot do much to improve hash time, I decided to improve this. Note that tree creation time increased 1.6x (406ms vs 248ms) because of the persistence of it, which shows it's not a great approach But batch hash helped a bit (total time is 756ms vs 886ms as the old way) ## 4. Avoid validator tree creation time TLDR: great for the benchmark, paid 9GB more RAM to save 1s epoch transition, not sustainable in the long run ## 4.1. Maintain both validator tree and value I enhanced ContainerNodeStruct ViewDU to track both the value, nodes and views changed. - The read operations are as great as earlier - The write (commit/hashTreeRoot) operations are as great as Container, no tree creation involved - Benchmark results are great: - no tree creation time, default recursive hash is 2x faster - number of HashComputations are reduced, hash time is 2x faster - in total, it shows 3x faster than the original result (328ms vs 956ms) ``` BeaconState ViewDU partially modified tree vc=200000 numModified=100000 ✓ BeaconState ViewDU recursive hash vc=200000 2.383132 ops/s 419.6159 ms/op - 18 runs 12.6 s ✓ BeaconState ViewDU recursive hash - commit step vc=200000 5.037264 ops/s 198.5205 ms/op - 16 runs 6.25 s ✓ BeaconState ViewDU validator tree creation vc=100000 35.76944 ops/s 27.95682 ms/op - 38 runs 20.7 s ✓ BeaconState ViewDU batchHash vc=200000 2.855090 ops/s 350.2516 ms/op - 13 runs 7.67 s ✓ BeaconState ViewDU batchHash - commit & getHashComputation vc=20 4.192557 ops/s 238.5179 ms/op - 14 runs 6.50 s ✓ BeaconState ViewDU batchHash - hash step vc=200000 7.569542 ops/s 132.1084 ms/op - 51 runs 29.1 s ✓ BeaconState ViewDU hashTreeRoot vc=200000 3.039540 ops/s 328.9972 ms/op - 12 runs 6.44 s ✓ BeaconState ViewDU hashTreeRoot - commit step vc=200000 4.499143 ops/s 222.2646 ms/op - 21 runs 8.92 s ✓ BeaconState ViewDU hashTreeRoot - hash step vc=200000 7.361959 ops/s 135.8334 ms/op - 36 runs 21.0 s ``` ### 4.2 Testing I continued with spec tests, they all passed Then I integrated to lodestar and got this surprise: OOM My investigation showed that I can only load validators until 1.2M, both mainnet and holesky surpassed that Its not possible to store a full BeaconState tree with the current implementation ### 4.3 Troubleshooting Each node object could takes up to 2 bytes x8 for property names and 8 bytes x8 for property values, plus the overhead for Object itself, so it takes at least (2x8 + 8x8) = 80 bytes A validator value is represented by: ``` - pubkey: 96 bytes = 3 nodes - withdrawalCredentials: 32 bytes = 1 node - effectiveBalance: 8 bytes = 1 node - slashed: boolean = 1 node - activationEligibilityEpoch: 8 bytes = 1 node - activationEpoch: 8 bytes = 1 node - exitEpoch: 8 bytes = 1 node - withdrawableEpoch: 8 bytes = 1 node ``` validator tree: ``` level 0 validator root / \ 1 10 11 / \ / \ 2 20 21 22 23 / \ / \ / \ / \ 3 pub with eff sla act act exit with / \ 4 pub0 pub1 ``` ``` level 0: 1 node level 1: 2 nodes level 2: 4 nodes level 3: 8 nodes (8 properties in total) level 4: 2 nodes (only for pubkey) total: 17 nodes ``` so it has 16 nodes in addition to the current implementation (which is 1 node). Consider each node is 80 bytes, a validator tree could take at least 16 x 80 = 1280 bytes, so with more than 1.7M of holesky validators it takes at least 1.28x1.7 = 2.178 GB, this explains the OOM ### 4.4 More testing I was only able to run a node with this branch with max-old-heap-size being set to 16GB PrepareNextEpoch duration is 1s less than stable ![Screenshot 2024-06-18 at 16.32.19](https://hackmd.io/_uploads/BkeE4CABA.png) So we pay around 9GB to save 1s epoch transition which may not be worth it, `gc` increased through and it does not seem sustainable in the long term ## 5. Batch commit Right now we commit validators individually so we have to put everything inside HashComputations in order to batch, but that requries persisting trees as we see. We need to find a way to commit validators and compute its root in batch ### 5.1 Prepopulate validator trees - Extend state.validators type to specify its own ViewDU called ValidatorsViewDU - ValidatorsViewDU extends ListCompositeViewDU - Create 16 validator tree once - Each item of it is ValidatorViewDU - For every 16 validators, populate trees from ValidatorViewDU value, compute root in batch - ValidatorViewDU create new node =root + value from there - ValidatorViewDU extends ContainerStructViewDU - new method to populate tree from value: valueToTree(node: Node) void - new method commitHashObject(ho: HashObject): this create new root from there + value, apply ho (HahsObject) to root - BranchNodeStruct - no change - In ValidatorViewDU.commitHashObject(), new an instance and applyHash() directly - Pros: - able to batch hash but no need to create tree every time - trees are created once and reuse, should be less gc - Cons: - no generic, has to create specific type for state.validators - 2-step commits, one for validators and one for remaining. state.validators.commit() actually do the hash as well Result (with hashtree) is great: - recursive hash is 3.24x faster - batchhash is 2.89x faster - hashTreeRoot is 3x faster ``` BeaconState ViewDU partially modified tree vc=200000 numModified=100000 ✓ BeaconState ViewDU recursive hash vc=200000 3.653621 ops/s 273.7010 ms/op - 28 runs 12.8 s ✓ BeaconState ViewDU recursive hash - validators commit step vc=20 4.545098 ops/s 220.0172 ms/op - 18 runs 6.41 s ✓ BeaconState ViewDU validator tree creation vc=100000 128.2135 ops/s 7.799492 ms/op - 28 runs 22.1 s ✓ BeaconState ViewDU batchHash vc=200000 3.733401 ops/s 267.8523 ms/op - 18 runs 6.93 s ✓ BeaconState ViewDU batchHash - commit & getHashComputation vc=20 4.310165 ops/s 232.0097 ms/op - 15 runs 5.80 s ✓ BeaconState ViewDU batchHash - hash step vc=200000 31.72988 ops/s 31.51603 ms/op - 14 runs 10.9 s ✓ BeaconState ViewDU hashTreeRoot vc=200000 3.954695 ops/s 252.8640 ms/op - 16 runs 6.07 s ✓ BeaconState ViewDU hashTreeRoot - commit step vc=200000 4.277347 ops/s 233.7898 ms/op - 18 runs 6.67 s ✓ BeaconState ViewDU hashTreeRoot - hash step vc=200000 30.73572 ops/s 32.53543 ms/op - 26 runs 15.0 s ``` ### 5.2 Prepopulate Uint8Array Start from same idea to 5.1, the idea is that we don't need to populate middle nodes in order to have validator root. We only need to have Uint8Array at different levels of validator,and we let the hasher to compute root Result is even 30% faster than 5.1 ``` BeaconState ViewDU partially modified tree vc=200000 numModified=100000 ✓ BeaconState ViewDU recursive hash vc=200000 4.706004 ops/s 212.4945 ms/op - 23 runs 9.20 s ✓ BeaconState ViewDU recursive hash - validators commit step vc=20 6.520329 ops/s 153.3665 ms/op - 12 runs 3.77 s ✓ BeaconState ViewDU validator tree creation vc=100000 7.992198 ops/s 125.1220 ms/op - 14 runs 6.58 s ✓ BeaconState ViewDU batchHash vc=200000 4.850986 ops/s 206.1437 ms/op - 14 runs 4.79 s ✓ BeaconState ViewDU batchHash - commit & getHashComputation vc=20 5.573049 ops/s 179.4350 ms/op - 14 runs 4.28 s ✓ BeaconState ViewDU batchHash - hash step vc=200000 40.95685 ops/s 24.41594 ms/op - 16 runs 10.0 s ✓ BeaconState ViewDU hashTreeRoot vc=200000 5.156880 ops/s 193.9157 ms/op - 23 runs 6.88 s ✓ BeaconState ViewDU hashTreeRoot - commit step vc=200000 5.963063 ops/s 167.6991 ms/op - 14 runs 4.41 s ✓ BeaconState ViewDU hashTreeRoot - hash step vc=200000 35.06000 ops/s 28.52253 ms/op - 30 runs 13.2 s - BeaconState ViewDU hashTreeRoot - commit step each validator vc=200000 ``` #### 5.2.1 Improve hashtree digestNLevelUnsafe The key function to compute validator root from a single Uint8Array is digestNLevelUnsafe(). We used 2 Uint8Arrays to hash at each level, turned out we can use one only, it avoid a `Uint8Array.set()`. Also, we should not allocate temporary HashObject and call "applyHash()", each hash method should support result as a output parameter. This gives 30% faster than the above ``` BeaconState ViewDU partially modified tree vc=200000 numModified=100000 ✓ BeaconState ViewDU recursive hash vc=200000 6.950837 ops/s 143.8676 ms/op - 14 runs 5.26 s ✓ BeaconState ViewDU recursive hash - commit step vc=200000 9.200534 ops/s 108.6893 ms/op - 13 runs 3.20 s ✓ BeaconState ViewDU validator tree creation vc=100000 8.308017 ops/s 120.3657 ms/op - 12 runs 4.73 s ✓ BeaconState ViewDU batchHash vc=200000 7.139702 ops/s 140.0619 ms/op - 15 runs 3.88 s ✓ BeaconState ViewDU batchHash - commit & getHashComputation vc=20 8.223193 ops/s 121.6073 ms/op - 18 runs 4.35 s ✓ BeaconState ViewDU batchHash - hash step vc=200000 35.87954 ops/s 27.87103 ms/op - 20 runs 8.51 s ✓ BeaconState ViewDU hashTreeRoot vc=200000 7.531041 ops/s 132.7838 ms/op - 19 runs 4.53 s ✓ BeaconState ViewDU hashTreeRoot - commit step vc=200000 9.199858 ops/s 108.6973 ms/op - 13 runs 3.24 s ✓ BeaconState ViewDU hashTreeRoot - hash step vc=200000 38.55383 ops/s 25.93776 ms/op - 22 runs 8.32 s ``` Compare to 1.2, this is 956/132 ~= 7.24x faster than unstable in my environment. ## Lesson learn from this work hashtree performance is great, but an important improvement from this work is to use reusable memory to hash validators in batch. We may not use batch technique in other places, but having reusable memory to hash is a great technique, we can use it to compute root for: - Attestation - Signing Root hashtree also has a nice "hashInto" api which saved a lot of memory for consumer, should apply the samething to as-sha256

Import from clipboard

Paste your markdown or webpage here...

Advanced permission required

Your current role can only read. Ask the system administrator to acquire write and comment permission.

This team is disabled

Sorry, this team is disabled. You can't edit this note.

This note is locked

Sorry, only owner can edit this note.

Reach the limit

Sorry, you've reached the max length this note can be.
Please reduce the content or divide it to more notes, thank you!

Import from Gist

Import from Snippet

or

Export to Snippet

Are you sure?

Do you really want to delete this note?
All users will lose their connection.

Create a note from template

Create a note from template

Oops...
This template has been removed or transferred.
Upgrade
All
  • All
  • Team
No template.

Create a template

Upgrade

Delete template

Do you really want to delete this template?
Turn this template into a regular note and keep its content, versions, and comments.

This page need refresh

You have an incompatible client version.
Refresh to update.
New version available!
See releases notes here
Refresh to enjoy new features.
Your user state has changed.
Refresh to load new user state.

Sign in

Forgot password

or

By clicking below, you agree to our terms of service.

Sign in via Facebook Sign in via Twitter Sign in via GitHub Sign in via Dropbox Sign in with Wallet
Wallet ( )
Connect another wallet

New to HackMD? Sign up

Help

  • English
  • 中文
  • Français
  • Deutsch
  • 日本語
  • Español
  • Català
  • Ελληνικά
  • Português
  • italiano
  • Türkçe
  • Русский
  • Nederlands
  • hrvatski jezik
  • język polski
  • Українська
  • हिन्दी
  • svenska
  • Esperanto
  • dansk

Documents

Help & Tutorial

How to use Book mode

Slide Example

API Docs

Edit in VSCode

Install browser extension

Contacts

Feedback

Discord

Send us email

Resources

Releases

Pricing

Blog

Policy

Terms

Privacy

Cheatsheet

Syntax Example Reference
# Header Header 基本排版
- Unordered List
  • Unordered List
1. Ordered List
  1. Ordered List
- [ ] Todo List
  • Todo List
> Blockquote
Blockquote
**Bold font** Bold font
*Italics font* Italics font
~~Strikethrough~~ Strikethrough
19^th^ 19th
H~2~O H2O
++Inserted text++ Inserted text
==Marked text== Marked text
[link text](https:// "title") Link
![image alt](https:// "title") Image
`Code` Code 在筆記中貼入程式碼
```javascript
var i = 0;
```
var i = 0;
:smile: :smile: Emoji list
{%youtube youtube_id %} Externals
$L^aT_eX$ LaTeX
:::info
This is a alert area.
:::

This is a alert area.

Versions and GitHub Sync
Get Full History Access

  • Edit version name
  • Delete

revision author avatar     named on  

More Less

Note content is identical to the latest version.
Compare
    Choose a version
    No search result
    Version not found
Sign in to link this note to GitHub
Learn more
This note is not linked with GitHub
 

Feedback

Submission failed, please try again

Thanks for your support.

On a scale of 0-10, how likely is it that you would recommend HackMD to your friends, family or business associates?

Please give us some advice and help us improve HackMD.

 

Thanks for your feedback

Remove version name

Do you want to remove this version name and description?

Transfer ownership

Transfer to
    Warning: is a public team. If you transfer note to this team, everyone on the web can find and read this note.

      Link with GitHub

      Please authorize HackMD on GitHub
      • Please sign in to GitHub and install the HackMD app on your GitHub repo.
      • HackMD links with GitHub through a GitHub App. You can choose which repo to install our App.
      Learn more  Sign in to GitHub

      Push the note to GitHub Push to GitHub Pull a file from GitHub

        Authorize again
       

      Choose which file to push to

      Select repo
      Refresh Authorize more repos
      Select branch
      Select file
      Select branch
      Choose version(s) to push
      • Save a new version and push
      • Choose from existing versions
      Include title and tags
      Available push count

      Pull from GitHub

       
      File from GitHub
      File from HackMD

      GitHub Link Settings

      File linked

      Linked by
      File path
      Last synced branch
      Available push count

      Danger Zone

      Unlink
      You will no longer receive notification when GitHub file changes after unlink.

      Syncing

      Push failed

      Push successfully