owned this note
owned this note
Published
Linked with GitHub
---
tags: ozone, perftest
---
# Ozone Performance #8: Number of required chunk write threads
Date: 2020-02-20
Ozone master: 4ba1932da
Ratis master: 0.5.0-90cd474-SNAPSHOT
Docker image: elek/ozonedev:20200220-1
## How to test?
* Freon test executed Chunk Writes calling the `ChunkManagerImpl.writeChunk` method directly.
* Number of threads were changed (1,5,10,60)
* All the tests are executed parallel on 3 machines.
* After the freon test a `sync` call has been executed. If it took >0.1s, the sync time has been added to the measured write time.
## Environment
Tests are run on a 10 node physical cluster (only 3 nodes were used in a given time)
* Cisco UCSC-C220-M4L racks
* 2 x Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz (2 x 8 core --> 2 x 16 threads)
* 256GB mem
* HGST HUS726060AL4210 SCSi disks
**All the tests are executed three times, on 3 different machines parallel**
## Summary
* In case of spinning disk (limited IO) there is no significant benefit to write chunks from high number of threads
* When the number of threads was higher than the number of cores, context switches can make it slightly slower
* These tests are based on one disk, for multiple disks it can be better to use multiple disks
Memdisk:
Spinning disk:
| Thread # | disk type | mean_rate ( write / sec) |
| -------- | ------------ | ------------------------ |
| 1 | memdisk | 9600
| 5 | memdisk | 10000
| 10 | memdisk | 11000
| 60 | memdisk | 10000
Memdisk:
| Thread # | disk type | mean_rate ( write / sec) |
| -------- | ------------ | ------------------------ |
| 1 | memdisk | 55 000
| 5 | memdisk | 155 000
| 10 | memdisk | 250 000
| 60 | memdisk | 325 000
## Test runs (measurement details)
### Disk speed estimate / spinning disk
Command:
Result:
```
0485760000 bytes (10 GB, 9.8 GiB) copied, 6.82971 s, 1.5 GB/s
Thu Feb 20 14:09:39 UTC 2020
real 0m59.187s
user 0m0.000s
sys 0m0.045s
Thu Feb 20 14:10:39 UTC 2020
```
Adjusted speed with 6.8 s + 59.1s sync time: ~150 MB / sec real write speed
### 1 thread / spinning disk
Command:
```
ozone freon --verbose cmdw -n1000000 -s 1024 -t1
```
```
for i in `seq 0 5`; do kc logs --tail=12 ozone-datanode-$i -c freon; done
2020-02-20 14:17:57 INFO ProgressBar:163 - Progress: 99.51 % (995099 out of 1000000)
2020-02-20 14:17:58 INFO ProgressBar:163 - Progress: 100.00 % (1000000 out of 1000000)
2020-02-20 14:17:58 INFO metrics:107 - type=TIMER, name=chunk-write, count=1000000, min=0.035166, max=0.263912, mean=0.07271159551709777, stddev=0.025927220819722697, median=0.072524, p75=0.085028, p95=0.107678, p98=0.136631, p99=0.186516, p999=0.263912, mean_rate=11550.5015735705, m1=12272.868416154946, m5=17841.47097015596, m15=19235.55162975163, rate_unit=events/second, duration_unit=milliseconds
2020-02-20 14:17:59 INFO BaseFreonGenerator:75 - Total execution time (sec): 87
2020-02-20 14:17:59 INFO BaseFreonGenerator:75 - Failures: 0
2020-02-20 14:17:59 INFO BaseFreonGenerator:75 - Successful executions: 1000000
Thu Feb 20 14:17:59 UTC 2020
real 0m17.775s
user 0m0.001s
sys 0m0.174s
Thu Feb 20 14:18:16 UTC 2020
2020-02-20 14:17:56 INFO ProgressBar:163 - Progress: 99.60 % (996000 out of 1000000)
2020-02-20 14:17:57 INFO ProgressBar:163 - Progress: 100.00 % (1000000 out of 1000000)
2020-02-20 14:17:57 INFO metrics:107 - type=TIMER, name=chunk-write, count=1000000, min=0.036517, max=0.166045, mean=0.07333402606209502, stddev=0.020694921437676126, median=0.075362, p75=0.084973, p95=0.10895, p98=0.123663, p99=0.142989, p999=0.161965, mean_rate=12044.14538957183, m1=13446.998088673514, m5=18171.828094791254, m15=19368.936387861406, rate_unit=events/second, duration_unit=milliseconds
2020-02-20 14:17:57 INFO BaseFreonGenerator:75 - Total execution time (sec): 84
2020-02-20 14:17:57 INFO BaseFreonGenerator:75 - Failures: 0
2020-02-20 14:17:57 INFO BaseFreonGenerator:75 - Successful executions: 1000000
Thu Feb 20 14:17:57 UTC 2020
real 0m24.898s
user 0m0.001s
sys 0m0.221s
Thu Feb 20 14:18:22 UTC 2020
2020-02-20 14:18:12 INFO ProgressBar:163 - Progress: 98.99 % (989899 out of 1000000)
2020-02-20 14:18:13 INFO ProgressBar:163 - Progress: 100.00 % (1000000 out of 1000000)
2020-02-20 14:18:13 INFO metrics:107 - type=TIMER, name=chunk-write, count=1000000, min=0.03797, max=0.467031, mean=0.07932518320829089, stddev=0.031048159312935704, median=0.078613, p75=0.089541, p95=0.122467, p98=0.153085, p99=0.227063, p999=0.337011, mean_rate=10305.93062226682, m1=10501.27018421953, m5=16500.493636633943, m15=18067.24691054141, rate_unit=events/second, duration_unit=milliseconds
2020-02-20 14:18:13 INFO BaseFreonGenerator:75 - Total execution time (sec): 98
2020-02-20 14:18:13 INFO BaseFreonGenerator:75 - Failures: 0
2020-02-20 14:18:13 INFO BaseFreonGenerator:75 - Successful executions: 1000000
Thu Feb 20 14:18:13 UTC 2020
real 0m6.128s
user 0m0.002s
sys 0m0.178s
Thu Feb 20 14:18:19 UTC 2020
```
Adjusted speed (number of 1k messages / full disk sync + test duration):
| test + sync time | Adjusted rate (1/sec ~ kB/sec) |
| ---------------- | ----------- |
| 87 + 17.7 | 9615 / sec |
| 84 + 24 | 9259 / sec |
| 98 + 6.1 | 9606 / sec |
### 5 thread / spinning disk
```
2020-02-21 08:44:21 INFO metrics:107 - type=TIMER, name=chunk-write, count=1000000, min=0.042854, max=0.588063, mean=0.06643296726931376, stddev=0.025372750207290875, median=0.059015, p75=0.072859, p95=0.106764, p98=0.12417, p99=0.140419, p999=0.200175, mean_rate=66429.2493620267, m1=71435.45696979325, m5=72623.43467763602, m15=72829.85702690326, rate_unit=events/second, duration_unit=milliseconds
2020-02-21 08:44:21 INFO BaseFreonGenerator:75 - Total execution time (sec): 16
2020-02-21 08:44:21 INFO BaseFreonGenerator:75 - Failures: 0
2020-02-21 08:44:21 INFO BaseFreonGenerator:75 - Successful executions: 1000000
Fri Feb 21 08:44:21 UTC 2020
real 1m23.270s
user 0m0.001s
sys 0m0.309s
Fri Feb 21 08:45:44 UTC 2020
2020-02-21 08:44:22 INFO metrics:107 - type=TIMER, name=chunk-write, count=1000000, min=0.04363, max=0.593925, mean=0.06354858145987276, stddev=0.022740961206400442, median=0.058424, p75=0.067515, p95=0.096417, p98=0.109459, p99=0.125007, p999=0.156311, mean_rate=71285.49326104026, m1=73802.38389322006, m5=73859.35736553182, m15=73869.22494350841, rate_unit=events/second, duration_unit=milliseconds
2020-02-21 08:44:22 INFO BaseFreonGenerator:75 - Total execution time (sec): 15
2020-02-21 08:44:22 INFO BaseFreonGenerator:75 - Failures: 0
2020-02-21 08:44:22 INFO BaseFreonGenerator:75 - Successful executions: 1000000
Fri Feb 21 08:44:22 UTC 2020
real 1m26.965s
user 0m0.001s
sys 0m0.305s
Fri Feb 21 08:45:49 UTC 2020
2020-02-21 08:44:25 INFO metrics:107 - type=TIMER, name=chunk-write, count=1000000, min=0.038833, max=0.170998, mean=0.06362126775485034, stddev=0.01892859201430118, median=0.057115, p75=0.070474, p95=0.10135, p98=0.117313, p99=0.13332, p999=0.170998, mean_rate=67049.08632908376, m1=69607.07032408653, m5=69239.0920135085, m15=69175.7373136932, rate_unit=events/second, duration_unit=milliseconds
2020-02-21 08:44:25 INFO BaseFreonGenerator:75 - Total execution time (sec): 16
2020-02-21 08:44:25 INFO BaseFreonGenerator:75 - Failures: 0
2020-02-21 08:44:25 INFO BaseFreonGenerator:75 - Successful executions: 1000000
Fri Feb 21 08:44:25 UTC 2020
real 1m20.805s
user 0m0.000s
sys 0m0.311s
Fri Feb 21 08:45:46 UTC 2020
```
Adjusted speed (number of 1k messages / full disk sync + test duration):
| test + sync time | Adjusted rate (1/sec ~ kB/sec) |
| ---------------- | ----------- |
| 83 + 16 | 10000 / sec |
| 86 + 15 | 9990 / sec |
| 81 + 16 | 10309 / sec |
### 10 thread / spinning disk
```
2020-02-21 08:52:46 INFO metrics:107 - type=TIMER, name=chunk-write, count=1000000, min=0.050537, max=0.244627, mean=0.07848132802238444, stddev=0.02284865208612827, median=0.072666, p75=0.088921, p95=0.120331, p98=0.141813, p99=0.15519, p999=0.209323, mean_rate=108813.39202524291, m1=118824.4, m5=118824.6, m15=118824.6, rate_unit=events/second, duration_unit=milliseconds
2020-02-21 08:52:46 INFO BaseFreonGenerator:75 - Total execution time (sec): 11
2020-02-21 08:52:46 INFO BaseFreonGenerator:75 - Failures: 0
2020-02-21 08:52:46 INFO BaseFreonGenerator:75 - Successful executions: 1000000
Fri Feb 21 08:52:46 UTC 2020
real 1m23.532s
user 0m0.001s
sys 0m0.309s
Fri Feb 21 08:54:09 UTC 2020
2020-02-21 08:52:59 INFO metrics:107 - type=TIMER, name=chunk-write, count=1000000, min=0.049747, max=0.241438, mean=0.0761444104790861, stddev=0.021478207738376196, median=0.070935, p75=0.086, p95=0.118981, p98=0.13232, p99=0.143328, p999=0.169497, mean_rate=110814.24406071528, m1=116996.40000000001, m5=116996.40000000001, m15=116996.40000000001, rate_unit=events/second, duration_unit=milliseconds
2020-02-21 08:52:59 INFO BaseFreonGenerator:75 - Total execution time (sec): 11
2020-02-21 08:52:59 INFO BaseFreonGenerator:75 - Failures: 0
2020-02-21 08:52:59 INFO BaseFreonGenerator:75 - Successful executions: 1000000
Fri Feb 21 08:52:59 UTC 2020
real 1m14.588s
user 0m0.000s
sys 0m0.261s
Fri Feb 21 08:54:13 UTC 2020
2020-02-21 08:53:01 INFO metrics:107 - type=TIMER, name=chunk-write, count=1000000, min=0.050611, max=0.418828, mean=0.07886651419937879, stddev=0.025204953338253087, median=0.071876, p75=0.090546, p95=0.122268, p98=0.143135, p99=0.152643, p999=0.253836, mean_rate=110828.70420455093, m1=116664.2, m5=116664.2, m15=116664.2, rate_unit=events/second, duration_unit=milliseconds
2020-02-21 08:53:01 INFO BaseFreonGenerator:75 - Total execution time (sec): 11
2020-02-21 08:53:01 INFO BaseFreonGenerator:75 - Failures: 0
2020-02-21 08:53:01 INFO BaseFreonGenerator:75 - Successful executions: 1000000
Fri Feb 21 08:53:01 UTC 2020
real 1m15.663s
user 0m0.000s
sys 0m0.289s
Fri Feb 21 08:54:16 UTC 2020
```
Adjusted speed (number of 1k messages / full disk sync + test duration):
| test + sync time | Adjusted rate (1/sec ~ kB/sec) |
| ---------------- | ----------- |
| 83 + 11 | 10638 / sec |
| 74 + 11 | 11764 / sec |
| 75 + 11 | 11627 / sec |
### 60 thread / spinning disk
Command:
```
ozone freon --verbose cmdw -n1000000 -s 1024 -t60
```
Adjusted speed (number of 1k messages / full disk sync + test duration):
| test + sync time | Adjusted rate (1/sec ~ kB/sec) |
| ---------------- | ----------- |
| 84 + 18 | 9803 / sec |
| 80 + 17 | 30309 / sec |
| 83 + 18 | 9900 / sec |
### Disk speed estimate / memdisk
Command:
```
dd if=/dev/zero of=/data/test bs=10240 count=1024000
```
```
10485760000 bytes (10 GB, 9.8 GiB) copied, 4.82052 s, 2.2 GB/s
10485760000 bytes (10 GB, 9.8 GiB) copied, 4.81242 s, 2.2 GB/s
10485760000 bytes (10 GB, 9.8 GiB) copied, 4.78061 s, 2.2 GB/s
```
### 1 thread / memdisk
__Note: in the following tests the sync time was negligable__
Command:
```
ozone freon --verbose cmdw -n1000000 -s 1024 -t1
```
```
2020-02-20 13:41:14 INFO metrics:107 - type=TIMER, name=chunk-write, count=1000000, min=0.012958, max=0.08102, mean=0.015753795714047336, stddev=0.005191259913082809, median=0.013991, p75=0.015209, p95=0.024343, p98=0.031314, p99=0.036779, p999=0.066172, mean_rate=53476.078401249455, m1=53499.98244308232, m5=52752.31922855491, m15=52618.40702974219, rate_unit=events/second, duration_unit=milliseconds
2020-02-20 13:41:16 INFO metrics:107 - type=TIMER, name=chunk-write, count=1000000, min=0.012531, max=0.074975, mean=0.01586054775092252, stddev=0.005614021366827497, median=0.013925, p75=0.015256, p95=0.027609, p98=0.031872, p99=0.037771, p999=0.074975, mean_rate=55490.48607056913, m1=52217.93513722455, m5=51077.48224260141, m15=50873.41427434891, rate_unit=events/second, duration_unit=milliseconds
2020-02-20 13:41:18 INFO metrics:107 - type=TIMER, name=chunk-write, count=1000000, min=0.012095, max=0.432351, mean=0.015923984900212688, stddev=0.013816765419942192, median=0.013276, p75=0.015097, p95=0.027072, p98=0.031729, p99=0.046047, p999=0.119139, mean_rate=55489.79700784496, m1=55152.33267127673, m5=54279.27519759456, m15=54121.6100603089, rate_unit=events/second, duration_unit=milliseconds
```
### 5 threads / memdisk
Command:
```
ozone freon --verbose cmdw -n1000000 -s 1024 -t5
```
Results:
```
2020-02-20 13:45:56 INFO metrics:107 - type=TIMER, name=chunk-write, count=1000000, min=0.015595, max=0.186502, mean=0.024765946083317496, stddev=0.013115361804753915, median=0.020078, p75=0.027391, p95=0.043922, p98=0.057626, p99=0.068982, p999=0.149605, mean_rate=149845.9603532005, m1=165011.40000000002, m5=165011.40000000002, m15=165011.40000000002, rate_unit=events/second, duration_unit=milliseconds
2020-02-20 13:45:58 INFO metrics:107 - type=TIMER, name=chunk-write, count=1000000, min=0.015955, max=0.53282, mean=0.024875032559059648, stddev=0.021179042964426974, median=0.019787, p75=0.025332, p95=0.043992, p98=0.055852, p99=0.09208, p999=0.215252, mean_rate=151616.12806319087, m1=167905.0, m5=167905.4, m15=167905.4, rate_unit=events/second, duration_unit=milliseconds
2020-02-20 13:45:59 INFO metrics:107 - type=TIMER, name=chunk-write, count=1000000, min=0.015291, max=0.276295, mean=0.022783249906325898, stddev=0.0141649808883555, median=0.019222, p75=0.023684, p95=0.03498, p98=0.048569, p99=0.082985, p999=0.194746, mean_rate=166108.97596211443, m1=187128.59999999998, m5=187129.0, m15=187129.2, rate_unit=events/second, duration_unit=milliseconds
```
After event **FULL DISK** sync time:
### 10 threads / memdisk
Command:
```
ozone freon --verbose cmdw -n1000000 -s 1024 -t10
```
Results:
```
2020-02-20 13:56:24 INFO metrics:107 - type=TIMER, name=chunk-write, count=1000000, min=0.018876, max=0.23164, mean=0.031360605112133816, stddev=0.01528473181467679, median=0.026259, p75=0.035471, p95=0.053266, p98=0.065778, p99=0.089684, p999=0.194772, mean_rate=221185.89909360776, m1=0.0, m5=0.0, m15=0.0, rate_unit=events/second, duration_unit=milliseconds
2020-02-20 13:56:26 INFO metrics:107 - type=TIMER, name=chunk-write, count=1000000, min=0.01813, max=1.024403, mean=0.031063502573116783, stddev=0.03403288266534335, median=0.025219, p75=0.034649, p95=0.052022, p98=0.066793, p99=0.094489, p999=0.170708, mean_rate=223381.98205856053, m1=0.0, m5=0.0, m15=0.0, rate_unit=events/second, duration_unit=milliseconds
2020-02-20 13:56:28 INFO metrics:107 - type=TIMER, name=chunk-write, count=1000000, min=0.01822, max=0.553308, mean=0.027611349896098043, stddev=0.02053910758384099, median=0.023285, p75=0.028778, p95=0.044363, p98=0.05864, p99=0.074687, p999=0.212676, mean_rate=246700.4065474433, m1=0.0, m5=0.0, m15=0.0, rate_unit=events/second, duration_unit=milliseconds
```
### 60 threads / memdisk
```
2020-02-20 14:01:16 INFO metrics:107 - type=TIMER, name=chunk-write, count=1000000, min=0.02071, max=1.332123, mean=0.05963921750577804, stddev=0.06254475489384029, median=0.055516, p75=0.066296, p95=0.080871, p98=0.122221, p99=0.168383, p999=1.063085, mean_rate=326027.7493863214, m1=0.0, m5=0.0, m15=0.0, rate_unit=events/second, duration_unit=milliseconds
2020-02-20 14:01:18 INFO metrics:107 - type=TIMER, name=chunk-write, count=1000000, min=0.018394, max=50.864288, mean=0.12001282867083439, stddev=1.6000847986490176, median=0.060316, p75=0.067335, p95=0.082039, p98=0.121812, p99=0.152737, p999=8.08431, mean_rate=326048.1461603413, m1=0.0, m5=0.0, m15=0.0, rate_unit=events/second, duration_unit=milliseconds
2020-02-20 14:01:21 INFO metrics:107 - type=TIMER, name=chunk-write, count=1000000, min=0.021695, max=33.169076, mean=0.1455207154023543, stddev=1.5108162531712879, median=0.06086, p75=0.071208, p95=0.112895, p98=0.137324, p99=0.209767, p999=31.589199, mean_rate=264912.7058499313, m1=0.0, m5=0.0, m15=0.0, rate_unit=events/second, duration_unit=milliseconds
```