
Storagenode write throughput vs. hash/disk sync

This test creates a new piece store (pieces.NewStore) and tries to write as many 2.3 MB files as possible.

Variables:

  • the test is run with both SHA-256 hashing (current) and BLAKE3 hashing (planned)
  • the test is run both with and without a disk sync before the rename that commits the piece

2.3 MB is chosen because a 64 MB block size with an EC parameter of 29 results in a piece size of about 2.3 MB.
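The arithmetic behind that figure can be checked with a short sketch. It assumes that "64 MB" means 64 MiB and that the block is split evenly across 29 shares with rounding up. pieceSize is a hypothetical helper, and the real piece size calculation may add padding that is ignored here.

```go
package main

import "fmt"

// pieceSize returns the size in bytes of one piece when a block of
// blockSize bytes is split evenly across k shares, rounded up.
// This is a simplification for illustration only.
func pieceSize(blockSize, k int64) int64 {
	return (blockSize + k - 1) / k
}

func main() {
	const blockSize = 64 << 20 // assumption: "64 MB" means 64 MiB
	size := pieceSize(blockSize, 29)
	fmt.Printf("%d bytes = %.2f MB\n", size, float64(size)/1e6)
}
```

Under these assumptions the result is 2314099 bytes, or roughly 2.3 MB.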

Results

TL;DR

  • Using BLAKE3 we can achieve significantly better write throughput (up to about 4x)
  • but if the disk is slow (good old HDD), we won't see the benefit: the time is spent on sync, and the CPU is not the bottleneck

Physical server machine (spinning disk!!!):

Note: this machine was also running other production workloads (such as a storagenode) at the same time.

name                       time/op
ReadWrite/sync-SHA256-8      80.0ms ± 0%
ReadWrite/nosync-SHA256-8    6.61ms ±15%
ReadWrite/sync-BLAKE3-8      79.1ms ± 4%
ReadWrite/nosync-BLAKE3-8    2.18ms ± 8%

name                       speed
ReadWrite/sync-SHA256-8    29.0MB/s ± 0%
ReadWrite/nosync-SHA256-8   355MB/s ±14%
ReadWrite/sync-BLAKE3-8    29.3MB/s ± 4%
ReadWrite/nosync-BLAKE3-8  1.07GB/s ± 9%

Local machine (tmpfs!!! no sync cost):

This shows that BLAKE3 has more than 4x the throughput potential when there is no sync cost.

name                        time/op
ReadWrite/sync-SHA256-12      4.64ms ± 1%
ReadWrite/nosync-SHA256-12    4.62ms ± 0%
ReadWrite/sync-BLAKE3-12      1.01ms ± 1%
ReadWrite/nosync-BLAKE3-12    1.00ms ± 2%

name                        speed
ReadWrite/sync-SHA256-12     500MB/s ± 1%
ReadWrite/nosync-SHA256-12   502MB/s ± 0%
ReadWrite/sync-BLAKE3-12    2.30GB/s ± 1%
ReadWrite/nosync-BLAKE3-12  2.32GB/s ± 2%

Cloud machine (GCE, root disk is new balanced persistent disk):

name                       time/op
ReadWrite/sync-SHA256-8     19.0ms ±40%
ReadWrite/nosync-SHA256-8   9.53ms ±27%
ReadWrite/sync-BLAKE3-8     16.1ms ± 0%
ReadWrite/nosync-BLAKE3-8   4.94ms ±86%

name                       speed
ReadWrite/sync-SHA256-8    127MB/s ±31%
ReadWrite/nosync-SHA256-8  250MB/s ±24%
ReadWrite/sync-BLAKE3-8    144MB/s ± 0%
ReadWrite/nosync-BLAKE3-8  560MB/s ±67%

Test method

cd storagenode/pieces/b
CGO_ENABLED=0 go test -c

The resulting binary is then used to run the benchmark:

./b.test -test.v -test.benchtime=1000x -test.bench 'BenchmarkReadWrite' | tee -a server.txt

go install golang.org/x/perf/cmd/benchstat@latest
benchstat server.txt

Note: we need to use a count-based limit (1000x) rather than a time-based limit so that every run writes the same number of bytes. Results of runs with a time-based limit can't be compared, because one run may write significantly more bytes than another.

Base commit on main:

commit f35b4163f98de6c8a9032a738c12ce3766a2314d (origin/main, origin/HEAD)
Author: NickolaiYurchenko
Date:   Mon Aug 22 17:47:43 2022 +0300

    web/satellite: multiple passphrase notification added

But it also includes the piece hash patches:

https://github.com/elek/storj/commit/9bd26c777c272e152d7b90ec6703d93d484219c4

Reference

curl https://gist.githubusercontent.com/elek/7b1eb55b37d2988534bb05ce98f7f6e0/raw/hw.sh | tee | sudo bash

Local machine

VENDOR
	Manufacturer: ASUS
	Product Name: System Product Name
MEMORY
	Size: 32 GB
	Non-Volatile Size: None
	Volatile Size: 32 GB
	Cache Size: None
	Logical Size: None
	Size: 32 GB
	Non-Volatile Size: None
	Volatile Size: 32 GB
	Cache Size: None
	Logical Size: None
CPUs
	Version: 12th Gen Intel(R) Core(TM) i5-12400
	Max Speed: 4400 MHz
	Core Enabled: 6
	Thread Count: 12
DISKS
     NAME          SIZE TYPE  FSTYPE            MOUNTPOINT VENDOR   MODEL
     sda           3.6T disk                               ATA      WDC WD40EFRX-68N32N0
     └─sda1        3.6T part  linux_raid_member                     
       └─md127     3.6T raid1 ext4                                  
     nvme0n1     931.5G disk                                        Samsung SSD 980 PRO 1TB
     ├─nvme0n1p1   100M part  vfat              /boot               
     ├─nvme0n1p2    16M part                                        
     ├─nvme0n1p3 243.4G part  ntfs                                  
     ├─nvme0n1p4   611M part  ntfs                                  
     └─nvme0n1p5 687.4G part  crypto_LUKS                           
       └─root    687.4G crypt ext4              /

Server machine

VENDOR
	Manufacturer: FUJITSU
	Product Name:  
MEMORY
	Size: 16384 MB
	Size: 16384 MB
CPUs
	Version: Intel(R) Core(TM) i7-7700 CPU @ 3.60GHz
	Max Speed: 8300 MHz
	Core Enabled: 4
	Thread Count: 8
DISKS
     NAME     SIZE TYPE  FSTYPE            MOUNTPOINT VENDOR   MODEL
     sda      3.7T disk                               ATA      TOSHIBA_MG04ACA400EY
     ├─sda1   3.7T part  linux_raid_member                     
     │ └─md0  3.7T raid1 ext4              /                   
     └─sda2     1M part                                        
     sdb      3.7T disk                               ATA      TOSHIBA_MG04ACA400EY
     ├─sdb1   3.7T part  linux_raid_member                     
     │ └─md0  3.7T raid1 ext4              /                   
     └─sdb2     1M part

Cloud machine

  • GCE e2-standard-8
  • Balanced persistent disk

VENDOR
	Manufacturer: Google
	Product Name: Google Compute Engine
MEMORY
	Size: 16 GB
	Size: 16 GB
CPUs
	Version: Not Specified
	Max Speed: 2000 MHz
	Version: Not Specified
	Max Speed: 2000 MHz
	Version: Not Specified
	Max Speed: 2000 MHz
	Version: Not Specified
	Max Speed: 2000 MHz
	Version: Not Specified
	Max Speed: 2000 MHz
	Version: Not Specified
	Max Speed: 2000 MHz
	Version: Not Specified
	Max Speed: 2000 MHz
	Version: Not Specified
	Max Speed: 2000 MHz
DISKS
     NAME      SIZE TYPE FSTYPE MOUNTPOINT VENDOR   MODEL
     sda       200G disk                   Google   PersistentDisk
     ├─sda1  199.9G part ext4   /                   
     ├─sda14     3M part                            
     └─sda15   124M part vfat   /boot/efi