# `wgpu-sig-ops` Benchmarks

[TOC]

## Multiple-shader benchmarks

### Linux + Nvidia A1000

secp256k1 signature recovery benchmarks (multiple shaders):

| Num. signatures    | CPU, serial (ms)   | GPU, parallel (ms) |
| ------------------ | ------------------ | ------------------ |
| 1024               | 32                 | 297                |
| 2048               | 70                 | 165                |
| 4096               | 128                | 162                |
| 8192               | 256                | 198                |
| 16384              | 513                | 288                |

GPU timings include data transfer.

secp256r1 signature recovery benchmarks (multiple shaders):

| Num. signatures    | CPU, serial (ms)   | GPU, parallel (ms) |
| ------------------ | ------------------ | ------------------ |
| 256                | 125                | 258                |
| 512                | 252                | 164                |
| 1024               | 503                | 158                |
| 2048               | 1006               | 162                |
| 4096               | 2010               | 181                |
| 8192               | 4027               | 255                |

GPU timings include data transfer.

ed25519 signature verification benchmarks (multiple shaders):

| Num. signatures    | CPU, serial (ms)   | GPU, parallel (ms) |
| ------------------ | ------------------ | ------------------ |
| 1024               | 89                 | 270                |
| 2048               | 219                | 149                |
| 4096               | 433                | 149                |
| 8192               | 866                | 195                |
| 16384              | 1424               | 270                |

### Apple M1 Mini

```
system_profiler SPHardwareDataType
Hardware:

    Hardware Overview:

      Model Name: Mac mini
      Model Identifier: Macmini9,1
      Model Number: MGNR3FN/A
      Chip: Apple M1
      Total Number of Cores: 8 (4 performance and 4 efficiency)
      Memory: 8 GB
      System Firmware Version: 10151.101.3
      OS Loader Version: 10151.101.3
      Serial Number (system): C07F9BT8Q6NV
      Hardware UUID: CA426CDA-9D8C-581C-8DDC-EFAF0B849996
      Provisioning UDID: 00008103-000C58C13AF2001E
      Activation Lock Status: Disabled
```

secp256k1 signature recovery benchmarks (multiple shaders):

| Num. signatures    | CPU, serial (ms)   | GPU, parallel (ms) |
| ------------------ | ------------------ | ------------------ |
| 1024               | 33                 | 734                |
| 2048               | 103                | 934                |
| 4096               | 167                | 2164               |
| 8192               | 297                | 3923               |
| 16384              | 555                | 8107               |

GPU timings include data transfer.

secp256r1 signature recovery benchmarks (multiple shaders):

| Num. signatures    | CPU, serial (ms)   | GPU, parallel (ms) |
| ------------------ | ------------------ | ------------------ |
| 256                | 127                | 851                |
| 512                | 293                | 893                |
| 1024               | 546                | 986                |
| 2048               | 1059               | 1321               |
| 4096               | 2079               | 2616               |
| 8192               | 4122               | 5044               |

GPU timings include data transfer.

ed25519 signature verification benchmarks (multiple shaders):

| Num. signatures    | CPU, serial (ms)   | GPU, parallel (ms) |
| ------------------ | ------------------ | ------------------ |
| 1024               | 103                | 489                |
| 2048               | 241                | 598                |
| 4096               | 449                | 1074               |
| 8192               | 864                | 2328               |
| 16384              | 1687               | 4228               |

GPU timings include data transfer.

### Apple M3 Pro

```
system_profiler SPHardwareDataType
Hardware:

    Hardware Overview:

      Model Name: MacBook Pro
      Model Identifier: Mac15,7
      Model Number: MRW23LL/A
      Chip: Apple M3 Pro
      Total Number of Cores: 12 (6 performance and 6 efficiency)
      Memory: 36 GB
      System Firmware Version: 10151.101.3
      OS Loader Version: 10151.101.3
      Serial Number (system): H6XY12XXV4
      Hardware UUID: 61A10CB7-1CA9-52DA-BEE0-F40D25126908
      Provisioning UDID: 00006030-000624583628001C
      Activation Lock Status: Disabled
```

secp256k1 signature recovery benchmarks (multiple shaders):

| Num. signatures    | CPU, serial (ms)   | GPU, parallel (ms) |
| ------------------ | ------------------ | ------------------ |
| 1024               | 25                 | 359                |
| 2048               | 46                 | 378                |
| 4096               | 98                 | 456                |
| 8192               | 185                | 920                |
| 16384              | 393                | 1814               |
| 32768              | 759                | 3658               |
| 65536              | 1573               | 7421               |

GPU timings include data transfer.

secp256r1 signature recovery benchmarks (multiple shaders):

| Num. signatures    | CPU, serial (ms)   | GPU, parallel (ms) |
| ------------------ | ------------------ | ------------------ |
| 256                | 97                 | 435                |
| 512                | 193                | 445                |
| 1024               | 389                | 483                |
| 2048               | 781                | 531                |
| 4096               | 1567               | 627                |
| 8192               | 3139               | 1270               |
| 16384              | 6326               | 2526               |

GPU timings include data transfer.

ed25519 signature verification benchmarks (multiple shaders):

| Num. signatures    | CPU, serial (ms)   | GPU, parallel (ms) |
| ------------------ | ------------------ | ------------------ |
| 1024               | 81                 | 177                |
| 2048               | 162                | 204                |
| 4096               | 324                | 309                |
| 8192               | 649                | 586                |
| 16384              | 1298               | 1133               |
| 32768              | 2597               | 2288               |
| 65536              | 5194               | 4547               |

GPU timings include data transfer.

## Single-shader benchmarks

### Linux + Nvidia A1000

On the Linux machine with an Nvidia A1000 GPU, we found that performing the whole computation using a single shader had a very slight performance advantage over splitting up the computation into multiple shaders.

secp256k1 signature recovery benchmarks (single shader):

| Num. signatures    | CPU, serial (ms)   | GPU, parallel (ms) |
| ------------------ | ------------------ | ------------------ |
| 1024               | 32                 | 692                |
| 2048               | 65                 | 657                |
| 4096               | 129                | 919                |
| 8192               | 257                | 3997               |

GPU timings include data transfer.

secp256r1 signature recovery benchmarks (single shader):

| Num. signatures    | CPU, serial (ms)   | GPU, parallel (ms) |
| ------------------ | ------------------ | ------------------ |
| 256                | 125                | 253                |
| 512                | 251                | 171                |
| 1024               | 516                | 151                |
| 2048               | 1007               | 140                |
| 4096               | 2014               | 193                |
| 8192               | 4026               | 288                |

GPU timings include data transfer.

ed25519 signature verification benchmarks (single shader):

| Num. signatures    | CPU, serial (ms)   | GPU, parallel (ms) |
| ------------------ | ------------------ | ------------------ |
| 1024               | 89                 | 530                |
| 2048               | 178                | 446                |
| 4096               | 429                | 592                |
| 8192               | 712                | 1115               |
| 16384              | 1424               | 2246               |

GPU timings include data transfer.

### Apple M1 Mini

Internal error: new_compute_pipeline_state: "Compute function exceeds available stack space" for all 3 benchmarks.

### Apple M3 Pro

secp256k1 signature recovery benchmarks (single shader):

| Num. signatures    | CPU, serial (ms)   | GPU, parallel (ms) |
| ------------------ | ------------------ | ------------------ |
| 1024               | 25                 | 498                |
| 2048               | 49                 | 550                |
| 4096               | 98                 | 640                |
| 8192               | 197                | 1243               |
| 16384              | 394                | 2407               |
| 32768              | 788                | 4825               |
| 65536              | 1575               | 9773               |

GPU timings include data transfer.

secp256r1 signature recovery benchmarks (single shader):

| Num. signatures    | CPU, serial (ms)   | GPU, parallel (ms) |
| ------------------ | ------------------ | ------------------ |
| 256                | 95                 | 979                |
| 512                | 198                | 658                |
| 1024               | 396                | 719                |
| 2048               | 791                | 780                |
| 4096               | 1581               | 896                |
| 8192               | 3163               | 1677               |
| 16384              | 6328               | 3286               |

GPU timings include data transfer.

ed25519 signature verification benchmarks (single shader):

| Num. signatures    | CPU, serial (ms)   | GPU, parallel (ms) |
| ------------------ | ------------------ | ------------------ |
| 1024               | 81                 | 300                |
| 2048               | 162                | 333                |
| 4096               | 324                | 380                |
| 8192               | 651                | 741                |
| 16384              | 1297               | 1418               |
| 32768              | 2597               | 2832               |
| 65536              | 5195               | 5818               |

GPU timings include data transfer.
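
The tables above are easier to compare across machines when reduced to a speedup ratio (CPU time divided by GPU time) and a crossover batch size (the smallest benchmarked batch at which the GPU was faster). The following is a minimal Python sketch of that reduction, not part of `wgpu-sig-ops`. The function and constant names are hypothetical, the rows are copied from two of the multiple-shader secp256k1 tables, and the rows are assumed to be sorted by batch size.

```python
# Rows are (number of signatures, CPU serial ms, GPU parallel ms),
# copied from the multiple-shader secp256k1 tables in this document.
A1000_SECP256K1_MULTI = [
    (1024, 32, 297),
    (2048, 70, 165),
    (4096, 128, 162),
    (8192, 256, 198),
    (16384, 513, 288),
]

M1_MINI_SECP256K1_MULTI = [
    (1024, 33, 734),
    (2048, 103, 934),
    (4096, 167, 2164),
    (8192, 297, 3923),
    (16384, 555, 8107),
]


def speedups(rows):
    """Return (num_signatures, cpu_ms / gpu_ms) pairs.

    A ratio above 1.0 means the GPU run was faster than the serial CPU run.
    """
    return [(n, cpu_ms / gpu_ms) for n, cpu_ms, gpu_ms in rows]


def crossover(rows):
    """Return the smallest benchmarked batch size where the GPU beat the CPU.

    Returns None if the GPU never won at any benchmarked size.
    Assumes rows are sorted by ascending batch size.
    """
    for n, cpu_ms, gpu_ms in rows:
        if gpu_ms < cpu_ms:
            return n
    return None


if __name__ == "__main__":
    for name, rows in [
        ("A1000 secp256k1", A1000_SECP256K1_MULTI),
        ("M1 Mini secp256k1", M1_MINI_SECP256K1_MULTI),
    ]:
        print(name)
        for n, ratio in speedups(rows):
            print(f"  {n:>6} signatures: {ratio:.2f}x")
        print(f"  crossover: {crossover(rows)}")
```

Because only a handful of batch sizes were benchmarked, the crossover value is a coarse upper bound on the true break-even point rather than an exact figure.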