Model Performance Comparison with P-SIMD
===
## RNN
```
fuse_flatten_compute_, 10c08, 10c10
fuse_flatten_compute_, 10c1c, 10c24
fuse_dense_compute_, 10c40, 10c48
fuse_dense_compute_, 10c64, 10c6c
fuse_elemwise_add_compute_, 10c80, 10c88
fuse_split_compute_, 10cac, 10cb4
fuse_sigmoid_compute_, 10cc0, 10cc8
fuse_sigmoid_compute_, 10cd4, 10cdc
fuse_sigmoid_compute_, 10ce8, 10cf0
fuse_tanh_compute_, 10cfc, 10d04
fuse_elemwise_mul_compute_, 10d18, 10d20
fuse_elemwise_mul_compute_, 10d34, 10d3c
fuse_elemwise_add_1_compute_, 10d50, 10d58
fuse_tanh_compute_, 10d64, 10d6c
fuse_elemwise_mul_compute_, 10d80, 10d88
```
### nosimd
fuse_flatten_compute_
: count P/V/ALL: 0/0/2453735
cycle P/V/ALL: 0/0/4420235
: : count P/V/ALL: 0/0/2453994
cycle P/V/ALL: 0/0/4420494
: Unknown command fuse_flatten_compute_
: : count P/V/ALL: 0/0/2453997
cycle P/V/ALL: 0/0/4420497
: : count P/V/ALL: 0/0/2454256
cycle P/V/ALL: 0/0/4420756
: Unknown command fuse_dense_compute_
: : count P/V/ALL: 0/0/2454263
cycle P/V/ALL: 0/0/4420763
: : count P/V/ALL: 0/0/2957624
cycle P/V/ALL: 0/0/5074554
: Unknown command fuse_dense_compute_
: : count P/V/ALL: 0/0/2957631
cycle P/V/ALL: 0/0/5074561
: : count P/V/ALL: 0/0/3460992
cycle P/V/ALL: 0/0/5728352
: Unknown command fuse_elemwise_add_compute_
: : count P/V/ALL: 0/0/3460997
cycle P/V/ALL: 0/0/5728357
: : count P/V/ALL: 0/0/3466383
cycle P/V/ALL: 0/0/5735031
: Unknown command fuse_split_compute_
: : count P/V/ALL: 0/0/3466392
cycle P/V/ALL: 0/0/5735040
: : count P/V/ALL: 0/0/3470573
cycle P/V/ALL: 0/0/5742283
: Unknown command fuse_sigmoid_compute_
: : count P/V/ALL: 0/0/3470576
cycle P/V/ALL: 0/0/5742286
: : count P/V/ALL: 0/0/3476987
cycle P/V/ALL: 0/0/5752547
: Unknown command fuse_sigmoid_compute_
: : count P/V/ALL: 0/0/3476990
cycle P/V/ALL: 0/0/5752550
: : count P/V/ALL: 0/0/4298239
cycle P/V/ALL: 0/0/6860967
: Unknown command fuse_sigmoid_compute_
: : count P/V/ALL: 0/0/4298242
cycle P/V/ALL: 0/0/6860970
: : count P/V/ALL: 0/0/4893495
cycle P/V/ALL: 0/0/7663193
: Unknown command fuse_tanh_compute_
: : count P/V/ALL: 0/0/4893498
cycle P/V/ALL: 0/0/7663196
: : count P/V/ALL: 0/0/10094063
cycle P/V/ALL: 0/0/14666983
: Unknown command fuse_elemwise_mul_compute_
: : count P/V/ALL: 0/0/10094068
cycle P/V/ALL: 0/0/14666988
: : count P/V/ALL: 0/0/10095422
cycle P/V/ALL: 0/0/14668670
: Unknown command fuse_elemwise_mul_compute_
: : count P/V/ALL: 0/0/10095427
cycle P/V/ALL: 0/0/14668675
: : count P/V/ALL: 0/0/10096781
cycle P/V/ALL: 0/0/14670357
: Unknown command fuse_elemwise_add_1_compute_
: : count P/V/ALL: 0/0/10096786
cycle P/V/ALL: 0/0/14670362
: : count P/V/ALL: 0/0/10098140
cycle P/V/ALL: 0/0/14672044
: Unknown command fuse_tanh_compute_
: : count P/V/ALL: 0/0/10098143
cycle P/V/ALL: 0/0/14672047
: : count P/V/ALL: 0/0/30109948
cycle P/V/ALL: 0/0/41628430
: Unknown command fuse_elemwise_mul_compute_
: : count P/V/ALL: 0/0/30109953
cycle P/V/ALL: 0/0/41628435
: : count P/V/ALL: 0/0/30111307
cycle P/V/ALL: 0/0/41630117
### simd
fuse_flatten_compute_
: count P/V/ALL: 0/0/2468732
cycle P/V/ALL: 0/0/4440016
: : count P/V/ALL: 0/0/2468991
cycle P/V/ALL: 0/0/4440275
: Unknown command fuse_flatten_compute_
: : count P/V/ALL: 0/0/2468994
cycle P/V/ALL: 0/0/4440278
: : count P/V/ALL: 0/0/2469253
cycle P/V/ALL: 0/0/4440537
: Unknown command fuse_dense_compute_
: : count P/V/ALL: 0/0/2469260
cycle P/V/ALL: 0/0/4440544
: : count P/V/ALL: 114816/0/3018960
cycle P/V/ALL: 114816/0/5320916
: Unknown command fuse_dense_compute_
: : count P/V/ALL: 114816/0/3018967
cycle P/V/ALL: 114816/0/5320923
: : count P/V/ALL: 229632/0/3566470
cycle P/V/ALL: 229632/0/6197508
: Unknown command fuse_elemwise_add_compute_
: : count P/V/ALL: 229632/0/3566475
cycle P/V/ALL: 229632/0/6197513
: : count P/V/ALL: 229760/0/3567632
cycle P/V/ALL: 229760/0/6199438
: Unknown command fuse_split_compute_
: : count P/V/ALL: 229760/0/3567641
cycle P/V/ALL: 229760/0/6199447
: : count P/V/ALL: 229760/0/3570474
cycle P/V/ALL: 229760/0/6204000
: Unknown command fuse_sigmoid_compute_
: : count P/V/ALL: 229760/0/3570477
cycle P/V/ALL: 229760/0/6204003
: : count P/V/ALL: 229760/0/3599427
cycle P/V/ALL: 229760/0/6234257
: Unknown command fuse_sigmoid_compute_
: : count P/V/ALL: 229760/0/3599430
cycle P/V/ALL: 229760/0/6234260
: : count P/V/ALL: 229760/0/4442716
cycle P/V/ALL: 229760/0/7361702
: Unknown command fuse_sigmoid_compute_
: : count P/V/ALL: 229760/0/4442719
cycle P/V/ALL: 229760/0/7361705
: : count P/V/ALL: 229760/0/5060473
cycle P/V/ALL: 229760/0/8183883
: Unknown command fuse_tanh_compute_
: : count P/V/ALL: 229760/0/5060476
cycle P/V/ALL: 229760/0/8183886
: : count P/V/ALL: 229760/0/10270338
cycle P/V/ALL: 229760/0/15189832
: Unknown command fuse_elemwise_mul_compute_
: : count P/V/ALL: 229760/0/10270343
cycle P/V/ALL: 229760/0/15189837
: : count P/V/ALL: 229856/0/10270539
cycle P/V/ALL: 229856/0/15190225
: Unknown command fuse_elemwise_mul_compute_
: : count P/V/ALL: 229856/0/10270544
cycle P/V/ALL: 229856/0/15190230
: : count P/V/ALL: 229952/0/10270740
cycle P/V/ALL: 229952/0/15190618
: Unknown command fuse_elemwise_add_1_compute_
: : count P/V/ALL: 229952/0/10270745
cycle P/V/ALL: 229952/0/15190623
: : count P/V/ALL: 229984/0/10270876
cycle P/V/ALL: 229984/0/15190946
: Unknown command fuse_tanh_compute_
: : count P/V/ALL: 229984/0/10270879
cycle P/V/ALL: 229984/0/15190949
: : count P/V/ALL: 229984/0/16733193
cycle P/V/ALL: 229984/0/23929791
: Unknown command fuse_elemwise_mul_compute_
: : count P/V/ALL: 229984/0/16733198
cycle P/V/ALL: 229984/0/23929796
: : count P/V/ALL: 230080/0/16733394
cycle P/V/ALL: 230080/0/23930184
## fxp_mnist
### nosimd
- Total
- 10d68 -> 10dfc
- nosimd
- start
count P/V/ALL: 0/0/764436714
cycle P/V/ALL: 0/0/1089034284
- end
count P/V/ALL: 0/0/771139936
cycle P/V/ALL: 0/0/1098644450
- fuse_matmul
- 10d74 -> 10d7c
- nosimd
- start
count P/V/ALL: 0/0/764436717
cycle P/V/ALL: 0/0/1089034287
- end
count P/V/ALL: 0/0/766398576
cycle P/V/ALL: 0/0/1092252304
- fuse_elemwise_add
- 10d90 -> 10d98
- nosimd
- start
count P/V/ALL: 0/0/766398581
cycle P/V/ALL: 0/0/1092252309
- end
count P/V/ALL: 0/0/766399641
cycle P/V/ALL: 0/0/1092253627
- fuse_sigmoid
- 10da4 -> 10dac
- nosimd
- start
count P/V/ALL: 0/0/766399644
cycle P/V/ALL: 0/0/1092253630
- end
count P/V/ALL: 0/0/770700651
cycle P/V/ALL: 0/0/1098046811
- fuse_matmul_1
- 10dc0 -> 10dc8
- start
count P/V/ALL: 0/0/770700656
cycle P/V/ALL: 0/0/1098046816
- end
count P/V/ALL: 0/0/770722883
cycle P/V/ALL: 0/0/1098083167
- fuse_elemwise_add_1
- 10ddc -> 10de4
- start
count P/V/ALL: 0/0/770722888
cycle P/V/ALL: 0/0/1098083172
- end
count P/V/ALL: 0/0/770722931
cycle P/V/ALL: 0/0/1098083215
- fuse_sigmoid_1
- 10df0 -> 10df8
- start
count P/V/ALL: 0/0/770722934
cycle P/V/ALL: 0/0/1098083218
- end
count P/V/ALL: 0/0/771139935
cycle P/V/ALL: 0/0/1098644449
### simd
- Total
- 10d68 -> 10dfc
- start
count P/V/ALL: 0/0/764421719
cycle P/V/ALL: 0/0/1089002555
- end
count P/V/ALL: 100560/0/769521425
cycle P/V/ALL: 100560/0/1096129831
- fuse_matmul
- 10d74 -> 10d7c
- start
count P/V/ALL: 0/0/764421722
cycle P/V/ALL: 0/0/1089002558
- end
count P/V/ALL: 98025/0/765147731
cycle P/V/ALL: 98025/0/1090238975
- fuse_elemwise_add
- 10d90 -> 10d98
- start
count P/V/ALL: 98025/0/765147736
cycle P/V/ALL: 98025/0/1090238980
- end
count P/V/ALL: 98050/0/765147839
cycle P/V/ALL: 98050/0/1090239233
- fuse_sigmoid
- 10da4 -> 10dac
- start
count P/V/ALL: 98050/0/765147842
cycle P/V/ALL: 98050/0/1090239236
- end
count P/V/ALL: 98050/0/769093516
cycle P/V/ALL: 98050/0/1095548928
- fuse_matmul_1
- 10dc0 -> 10dc8
- start
count P/V/ALL: 98050/0/769093521
cycle P/V/ALL: 98050/0/1095548933
- end
count P/V/ALL: 100555/0/769109903
cycle P/V/ALL: 100555/0/1095576219
- fuse_elemwise_add_1
- 10ddc -> 10de4
- start
count P/V/ALL: 100555/0/769109908
cycle P/V/ALL: 100555/0/1095576224
- end
count P/V/ALL: 100560/0/769109931
cycle P/V/ALL: 100560/0/1095576277
- fuse_sigmoid_1
- 10df0 -> 10df8
- start
count P/V/ALL: 100560/0/769109934
cycle P/V/ALL: 100560/0/1095576280
- end
count P/V/ALL: 100560/0/769521424
cycle P/V/ALL: 100560/0/1096129830
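
Each start/end pair above is a snapshot of running counters, so the cost of one kernel is the difference between the two snapshots. The sketch below shows that subtraction for `fuse_matmul` in the simd run. It assumes `count` is an instruction counter and `cycle` a cycle counter, each split into P, V and ALL fields; the meaning of P and V is not stated in these notes, and `Counters`, `parse_counters`, `delta` and the variable names are hypothetical.

```python
from typing import NamedTuple

class Counters(NamedTuple):
    p: int      # first field of "P/V/ALL"
    v: int      # second field
    total: int  # third field (ALL)

def parse_counters(line):
    """Parse a line such as 'count P/V/ALL: 98025/0/765147731'."""
    values = line.rpartition(":")[2]
    p, v, total = (int(x) for x in values.strip().split("/"))
    return Counters(p, v, total)

def delta(start, end):
    """Field-wise difference between two snapshots."""
    return Counters(*(e - s for s, e in zip(start, end)))

# fuse_matmul, simd run (values copied from the list above).
count_delta = delta(
    parse_counters("count P/V/ALL: 0/0/764421722"),
    parse_counters("count P/V/ALL: 98025/0/765147731"),
)
cycle_delta = delta(
    parse_counters("cycle P/V/ALL: 0/0/1089002558"),
    parse_counters("cycle P/V/ALL: 98025/0/1090238975"),
)

print("instructions:", count_delta.total)
print("P instructions:", count_delta.p)
print("cycles:", cycle_delta.total)
print("cycles per instruction:", round(cycle_delta.total / count_delta.total, 3))
```

The cycle delta computed this way (1236417) is the value listed for `fuse_matmul` under `fxp simd` in the preprocessed data.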
## CNN
### nosimd
- Total
- 10d48 -> 10f20
- start
count P/V/ALL: 0/0/49569310
cycle P/V/ALL: 0/0/68689446
- end
count P/V/ALL: 0/0/256071796
cycle P/V/ALL: 0/0/383320478
- fuse_reshape_compute_
- 10d48 -> 10d50
- start
count P/V/ALL: 0/0/49569310
cycle P/V/ALL: 0/0/68689446
- end
count P/V/ALL: 0/0/49575367
cycle P/V/ALL: 0/0/68699795
- fuse_conv2d_compute_
- 10d68 -> 10d70
- start
count P/V/ALL: 0/0/49575373
cycle P/V/ALL: 0/0/68699801
- end
count P/V/ALL: 0/0/52567754
cycle P/V/ALL: 0/0/72335176
- fuse_relu_compute_
- 10d7c -> 10d84
- start
count P/V/ALL: 0/0/52567757
cycle P/V/ALL: 0/0/72335179
- end
count P/V/ALL: 0/0/53267094
cycle P/V/ALL: 0/0/73730714
- fuse_max_pool2d_compute_
- 10d90 -> 10d98
- start
count P/V/ALL: 0/0/53267097
cycle P/V/ALL: 0/0/73730717
- end
count P/V/ALL: 0/0/53605365
cycle P/V/ALL: 0/0/74429917
- fuse_conv2d_1_compute_
- 10db4 -> 10dbc
- start
count P/V/ALL: 0/0/53605372
cycle P/V/ALL: 0/0/74429924
- end
count P/V/ALL: 0/0/242114005
cycle P/V/ALL: 0/0/361974317
- fuse_relu_1_compute_
- 10dc8 -> 10dd0
- start
count P/V/ALL: 0/0/242114008
cycle P/V/ALL: 0/0/361974320
- end
count P/V/ALL: 0/0/242465481
cycle P/V/ALL: 0/0/362677495
- fuse_max_pool2d_1_compute_
- 10ddc -> 10de4
- start
count P/V/ALL: 0/0/242465484
cycle P/V/ALL: 0/0/362677498
- end
count P/V/ALL: 0/0/242582986
cycle P/V/ALL: 0/0/362925674
- fuse_transpose_compute_
- 10df0 -> 10df8
- start
count P/V/ALL: 0/0/242582989
cycle P/V/ALL: 0/0/362925677
- end
count P/V/ALL: 0/0/242649364
cycle P/V/ALL: 0/0/363030124
- fuse_reshape_1_compute_
- 10e04 -> 10e0c
- start
count P/V/ALL: 0/0/242649367
cycle P/V/ALL: 0/0/363030127
- end
count P/V/ALL: 0/0/242740320
cycle P/V/ALL: 0/0/363146174
- fuse_matmul_compute_
- 10e20 -> 10e28
- start
count P/V/ALL: 0/0/242740325
cycle P/V/ALL: 0/0/363146179
- end
count P/V/ALL: 0/0/250900190
cycle P/V/ALL: 0/0/376329722
- fuse_elemwise_add_compute_
- 10e3c -> 10e44
- start
count P/V/ALL: 0/0/250900195
cycle P/V/ALL: 0/0/376329727
- end
count P/V/ALL: 0/0/250901255
cycle P/V/ALL: 0/0/376331045
- fuse_sigmoid_compute_
- 10e50 -> 10e58
- start
count P/V/ALL: 0/0/250901258
cycle P/V/ALL: 0/0/376331048
- end
count P/V/ALL: 0/0/255217522
cycle P/V/ALL: 0/0/382151908
- fuse_matmul_1_compute_
- 10e6c -> 10e74
- start
count P/V/ALL: 0/0/255217527
cycle P/V/ALL: 0/0/382151913
- end
count P/V/ALL: 0/0/255239754
cycle P/V/ALL: 0/0/382188264
- fuse_elemwise_add_1_compute_
- 10e88 -> 10e90
- start
count P/V/ALL: 0/0/255239759
cycle P/V/ALL: 0/0/382188269
- end
count P/V/ALL: 0/0/255239802
cycle P/V/ALL: 0/0/382188312
- fuse_softmax_compute_
- 10f18 -> 10f20
- start
count P/V/ALL: 0/0/255267037
cycle P/V/ALL: 0/0/382236573
- end
count P/V/ALL: 0/0/256071796
cycle P/V/ALL: 0/0/383320478
### simd
- Total
- 10ddc -> 10f38
- start
count P/V/ALL: 0/0/49747138
cycle P/V/ALL: 0/0/68940178
- end
count P/V/ALL: 21154176/0/122267132
cycle P/V/ALL: 21154176/0/176188306
- fuse_reshape_compute_
- 10ddc -> 10de4
- start
count P/V/ALL: 0/0/49747138
cycle P/V/ALL: 0/0/68940178
- end
count P/V/ALL: 0/0/49750379
cycle P/V/ALL: 0/0/68944895
- fuse_conv2d_compute_
- 10dfc -> 10e04
- start
count P/V/ALL: 0/0/49750385
cycle P/V/ALL: 0/0/68944901
- end
count P/V/ALL: 634304/0/51721619
cycle P/V/ALL: 634304/0/71936245
- fuse_relu_compute_
- 10e10 -> 10e18
- start
count P/V/ALL: 634304/0/51721622
cycle P/V/ALL: 634304/0/71936248
- end
count P/V/ALL: 634304/0/52421053
cycle P/V/ALL: 634304/0/73332065
- fuse_max_pool2d_compute_
- 10e24 -> 10e2c
- start
count P/V/ALL: 634304/0/52421056
cycle P/V/ALL: 634304/0/73332068
- end
count P/V/ALL: 634304/0/52750676
cycle P/V/ALL: 634304/0/74007638
- fuse_conv2d_1_compute_
- 10e48 -> 10e50
- start
count P/V/ALL: 634304/0/52750683
cycle P/V/ALL: 634304/0/74007645
- end
count P/V/ALL: 20759616/0/113413289
cycle P/V/ALL: 20759616/0/162898939
- fuse_relu_1_compute_
- 10e5c -> 10e64
- start
count P/V/ALL: 20759616/0/113413292
cycle P/V/ALL: 20759616/0/162898942
- end
count P/V/ALL: 20759616/0/113776163
cycle P/V/ALL: 20759616/0/163635149
- fuse_max_pool2d_1_compute_
- 10e70 -> 10e78
- start
count P/V/ALL: 20759616/0/113776166
cycle P/V/ALL: 20759616/0/163635152
- end
count P/V/ALL: 20759616/0/113895274
cycle P/V/ALL: 20759616/0/163888146
- fuse_transpose_compute_
- 10e84 -> 10e8c
- start
count P/V/ALL: 20759616/0/113895277
cycle P/V/ALL: 20759616/0/163888149
- end
count P/V/ALL: 20759616/0/113961652
cycle P/V/ALL: 20759616/0/163992596
- fuse_reshape_1_compute_
- 10e98 -> 10ea0
- start
count P/V/ALL: 20759616/0/113961655
cycle P/V/ALL: 20759616/0/163992599
- end
count P/V/ALL: 20759616/0/114052608
cycle P/V/ALL: 20759616/0/164108646
- fuse_matmul_compute_
- 10eb4 -> 10ebc
- start
count P/V/ALL: 20759616/0/114052613
cycle P/V/ALL: 20759616/0/164108651
- end
count P/V/ALL: 21151641/0/117032622
cycle P/V/ALL: 21151641/0/169127868
- fuse_elemwise_add_compute_
- 10ed0 -> 10ed8
- start
count P/V/ALL: 21151641/0/117032627
cycle P/V/ALL: 21151641/0/169127873
- end
count P/V/ALL: 21151666/0/117032730
cycle P/V/ALL: 21151666/0/169128126
- fuse_sigmoid_compute_
- 10ee4 -> 10eec
- start
count P/V/ALL: 21151666/0/117032733
cycle P/V/ALL: 21151666/0/169128129
- end
count P/V/ALL: 21151666/0/121361423
cycle P/V/ALL: 21151666/0/174961613
- fuse_matmul_1_compute_
- 10f00 -> 10f08
- start
count P/V/ALL: 21151666/0/121361428
cycle P/V/ALL: 21151666/0/174961618
- end
count P/V/ALL: 21154171/0/121377810
cycle P/V/ALL: 21154171/0/174988904
- fuse_elemwise_add_1_compute_
- 10f1c -> 10f24
- start
count P/V/ALL: 21154171/0/121377815
cycle P/V/ALL: 21154171/0/174988909
- end
count P/V/ALL: 21154176/0/121377838
cycle P/V/ALL: 21154176/0/174988962
- fuse_softmax_compute_
- 10f30 -> 10f38
- start
count P/V/ALL: 21154176/0/121377841
cycle P/V/ALL: 21154176/0/174988965
- end
count P/V/ALL: 21154176/0/122267132
cycle P/V/ALL: 21154176/0/176188306
## Alexnet
```
fuse_transpose, 11138, 11140
fuse_conv2d, 1115c, 11164
fuse_relu, 11170, 11178
fuse_max_pool2d, 11184, 1118c
fuse_conv2d_1, 111a8, 111b0
fuse_relu_1, 111bc, 111c4
fuse_max_pool2d_1, 111d0, 111d8
fuse_conv2d_2, 111f4, 111fc
fuse_relu_2, 112c0, 112c8
fuse_conv2d_3, 11444, 1144c
fuse_relu_2, 11458, 11460
fuse_conv2d_4, 1147c, 11484
fuse_relu_3, 11490, 11498
fuse_max_pool2d_2, 114a4, 114ac
fuse_conv2d_5, 114c8, 114d0
fuse_relu_4, 114dc, 114e4
fuse_conv2d_6, 11500, 11508
fuse_relu_4, 11514, 1151c
fuse_conv2d_7, 11538, 11540
```
### nosimd
fuse_reshape
: count P/V/ALL: 0/0/6575731745
cycle P/V/ALL: 0/0/9285807185
: : count P/V/ALL: 0/0/6578757106
cycle P/V/ALL: 0/0/9290649196
: Unknown command fuse_conv2d
: : count P/V/ALL: 0/0/6578757113
cycle P/V/ALL: 0/0/9290649203
: : count P/V/ALL: 0/0/7804138887
cycle P/V/ALL: 0/0/11214092843
: Unknown command fuse_relu
: : count P/V/ALL: 0/0/7804138890
cycle P/V/ALL: 0/0/11214092846
: : count P/V/ALL: 0/0/7808982927
cycle P/V/ALL: 0/0/11223772009
: Unknown command fuse_max_pool2d
: : count P/V/ALL: 0/0/7808982930
cycle P/V/ALL: 0/0/11223772012
: : count P/V/ALL: 0/0/7813159747
cycle P/V/ALL: 0/0/11232793455
: Unknown command fuse_conv2d_1
: : count P/V/ALL: 0/0/7813159754
cycle P/V/ALL: 0/0/11232793462
: : count P/V/ALL: 0/0/10161765206
cycle P/V/ALL: 0/0/13848889784
: Unknown command fuse_relu_1
: : count P/V/ALL: 0/0/10161765209
cycle P/V/ALL: 0/0/13848889787
: : count P/V/ALL: 0/0/10165112030
cycle P/V/ALL: 0/0/13855555638
: Unknown command fuse_max_pool2d_1
: : count P/V/ALL: 0/0/10165112033
cycle P/V/ALL: 0/0/13855555641
: : count P/V/ALL: 0/0/10167751612
cycle P/V/ALL: 0/0/13861194170
: Unknown command fuse_conv2d_2
: : count P/V/ALL: 0/0/10167751619
cycle P/V/ALL: 0/0/13861194177
: : count P/V/ALL: 0/0/11310408001
cycle P/V/ALL: 0/0/15181822247
: Unknown command fuse_relu_2
: : count P/V/ALL: 0/0/11310408004
cycle P/V/ALL: 0/0/15181822250
: : count P/V/ALL: 0/0/11311840959
cycle P/V/ALL: 0/0/15184683399
: Unknown command fuse_conv2d_3
: : count P/V/ALL: 0/0/11311840966
cycle P/V/ALL: 0/0/15184683406
: : count P/V/ALL: 0/0/13596075568
cycle P/V/ALL: 0/0/17824652396
: Unknown command fuse_relu_2
: : count P/V/ALL: 0/0/13596075571
cycle P/V/ALL: 0/0/17824652399
: : count P/V/ALL: 0/0/13597503372
cycle P/V/ALL: 0/0/17827498086
: Unknown command fuse_conv2d_4
: : count P/V/ALL: 0/0/13597503379
cycle P/V/ALL: 0/0/17827498093
: : count P/V/ALL: 0/0/15120554111
cycle P/V/ALL: 0/0/19587784889
: Unknown command fuse_relu_3
: : count P/V/ALL: 0/0/15120554114
cycle P/V/ALL: 0/0/19587784892
: : count P/V/ALL: 0/0/15121477345
cycle P/V/ALL: 0/0/19589667117
: Unknown command fuse_max_pool2d_2
: : count P/V/ALL: 0/0/15121477348
cycle P/V/ALL: 0/0/19589667120
: : count P/V/ALL: 0/0/15122059767
cycle P/V/ALL: 0/0/19590929497
: Unknown command fuse_conv2d_5
: : count P/V/ALL: 0/0/15122059774
cycle P/V/ALL: 0/0/19590929504
: : count P/V/ALL: 0/0/16058122855
cycle P/V/ALL: 0/0/21015570083
: Unknown command fuse_relu_4
: : count P/V/ALL: 0/0/16058122858
cycle P/V/ALL: 0/0/21015570086
: : count P/V/ALL: 0/0/16058225209
cycle P/V/ALL: 0/0/21015778815
: Unknown command fuse_conv2d_6
: : count P/V/ALL: 0/0/16058225216
cycle P/V/ALL: 0/0/21015778822
: : count P/V/ALL: 0/0/16725187677
cycle P/V/ALL: 0/0/22025040191
: Unknown command fuse_relu_4
: : count P/V/ALL: 0/0/16725187680
cycle P/V/ALL: 0/0/22025040194
: : count P/V/ALL: 0/0/16725289127
cycle P/V/ALL: 0/0/22025246211
: Unknown command fuse_conv2d_7
: : count P/V/ALL: 0/0/16725289134
cycle P/V/ALL: 0/0/22025246218
: : count P/V/ALL: 0/0/16726632295
cycle P/V/ALL: 0/0/22027215033
### simd
fuse_reshape
: count P/V/ALL: 0/0/6575726750
cycle P/V/ALL: 0/0/9285797086
: : count P/V/ALL: 0/0/6578901517
cycle P/V/ALL: 0/0/9290787607
: Unknown command fuse_conv2d
: : count P/V/ALL: 0/0/6578901524
cycle P/V/ALL: 0/0/9290787614
: : count P/V/ALL: 170224320/0/7200547643
cycle P/V/ALL: 170224320/0/10259264377
: Unknown command fuse_relu
: : count P/V/ALL: 170224320/0/7200547646
cycle P/V/ALL: 170224320/0/10259264380
: : count P/V/ALL: 170224320/0/7205950697
cycle P/V/ALL: 170224320/0/10270248049
: Unknown command fuse_max_pool2d
: : count P/V/ALL: 170224320/0/7205950700
cycle P/V/ALL: 170224320/0/10270248052
: : count P/V/ALL: 170224320/0/7210256858
cycle P/V/ALL: 170224320/0/10279570868
: Unknown command fuse_conv2d_1
: : count P/V/ALL: 170224320/0/7210256865
cycle P/V/ALL: 170224320/0/10279570875
: : count P/V/ALL: 598325952/0/8954633676
cycle P/V/ALL: 598325952/0/12616380068
: Unknown command fuse_relu_1
: : count P/V/ALL: 598325952/0/8954633679
cycle P/V/ALL: 598325952/0/12616380071
: : count P/V/ALL: 598325952/0/8958376640
cycle P/V/ALL: 598325952/0/12623974766
: Unknown command fuse_max_pool2d_1
: : count P/V/ALL: 598325952/0/8958376643
cycle P/V/ALL: 598325952/0/12623974769
: : count P/V/ALL: 598325952/0/8961112885
cycle P/V/ALL: 598325952/0/12629847613
: Unknown command fuse_conv2d_2
: : count P/V/ALL: 598325952/0/8961112892
cycle P/V/ALL: 598325952/0/12629847620
: : count P/V/ALL: 683924160/0/9406529413
cycle P/V/ALL: 683924160/0/13265614287
: Unknown command fuse_relu_2
: : count P/V/ALL: 683924160/0/9406529416
cycle P/V/ALL: 683924160/0/13265614290
: : count P/V/ALL: 683924160/0/9408128243
cycle P/V/ALL: 683924160/0/13268862471
: Unknown command fuse_conv2d_3
: : count P/V/ALL: 683924160/0/9408128250
cycle P/V/ALL: 683924160/0/13268862478
: : count P/V/ALL: 855120576/0/10297655845
cycle P/V/ALL: 855120576/0/14538889457
: Unknown command fuse_relu_2
: : count P/V/ALL: 855120576/0/10297655848
cycle P/V/ALL: 855120576/0/14538889460
: : count P/V/ALL: 855120576/0/10299249425
cycle P/V/ALL: 855120576/0/14542121891
: Unknown command fuse_conv2d_4
: : count P/V/ALL: 855120576/0/10299249432
cycle P/V/ALL: 855120576/0/14542121898
: : count P/V/ALL: 969251520/0/10892494175
cycle P/V/ALL: 969251520/0/15389114639
: Unknown command fuse_relu_3
: : count P/V/ALL: 969251520/0/10892494178
cycle P/V/ALL: 969251520/0/15389114642
: : count P/V/ALL: 969251520/0/10893525047
cycle P/V/ALL: 969251520/0/15391248773
: Unknown command fuse_max_pool2d_2
: : count P/V/ALL: 969251520/0/10893525050
cycle P/V/ALL: 969251520/0/15391248776
: : count P/V/ALL: 969251520/0/10894111362
cycle P/V/ALL: 969251520/0/15392525398
: Unknown command fuse_conv2d_5
: : count P/V/ALL: 969251520/0/10894111369
cycle P/V/ALL: 969251520/0/15392525405
: : count P/V/ALL: 969251520/0/11686749386
cycle P/V/ALL: 969251520/0/16262702084
: Unknown command fuse_relu_4
: : count P/V/ALL: 969251520/0/11686749389
cycle P/V/ALL: 969251520/0/16262702087
: : count P/V/ALL: 969251520/0/11686864034
cycle P/V/ALL: 969251520/0/16262939514
: Unknown command fuse_conv2d_6
: : count P/V/ALL: 969251520/0/11686864041
cycle P/V/ALL: 969251520/0/16262939521
: : count P/V/ALL: 969251520/0/12457142467
cycle P/V/ALL: 969251520/0/17375516831
: Unknown command fuse_relu_4
: : count P/V/ALL: 969251520/0/12457142470
cycle P/V/ALL: 969251520/0/17375516834
: : count P/V/ALL: 969251520/0/12457254569
cycle P/V/ALL: 969251520/0/17375746623
: Unknown command fuse_conv2d_7
: : count P/V/ALL: 969251520/0/12457254576
cycle P/V/ALL: 969251520/0/17375746630
: : count P/V/ALL: 969251520/0/12458989411
cycle P/V/ALL: 969251520/0/17378188315
## Preprocessed data (cycles per kernel, end minus start)
```
fxp nosimd
fuse_matmul, 3218017
fuse_elemwise_add, 1318
fuse_sigmoid, 5793181
fuse_matmul_1, 36351
fuse_elemwise_add_1, 43
fuse_sigmoid_1, 561231
fxp simd
fuse_matmul, 1236417
fuse_elemwise_add, 253
fuse_sigmoid, 5309692
fuse_matmul_1, 27286
fuse_elemwise_add_1, 53
fuse_sigmoid_1, 553550
cnn nosimd
fuse_reshape_compute_, 10349
fuse_conv2d_compute_, 3635375
fuse_relu_compute_, 1395535
fuse_max_pool2d_compute_, 699200
fuse_conv2d_1_compute_, 287544393
fuse_relu_1_compute_, 703175
fuse_max_pool2d_1_compute_, 248176
fuse_transpose_compute_, 104447
fuse_reshape_1_compute_, 116047
fuse_matmul_compute_, 13183543
fuse_elemwise_add_compute_, 1318
fuse_sigmoid_compute_, 5820860
fuse_matmul_1_compute_, 36351
fuse_elemwise_add_1_compute_, 43
fuse_softmax_compute_, 1083905
cnn simd
fuse_reshape_compute_, 4717
fuse_conv2d_compute_, 2991344
fuse_relu_compute_, 1395817
fuse_max_pool2d_compute_, 675570
fuse_conv2d_1_compute_, 88891294
fuse_relu_1_compute_, 736207
fuse_max_pool2d_1_compute_, 252994
fuse_transpose_compute_, 104447
fuse_reshape_1_compute_, 116047
fuse_matmul_compute_, 5019217
fuse_elemwise_add_compute_, 253
fuse_sigmoid_compute_, 5833484
fuse_matmul_1_compute_, 27286
fuse_elemwise_add_1_compute_, 53
fuse_softmax_compute_, 1199341
alexnet nosimd
fuse_reshape, 4842011
fuse_conv2d, 1923443640
fuse_relu, 9679163
fuse_max_pool2d, 9021443
fuse_conv2d_1, 2616096322
fuse_relu_1, 6665851
fuse_max_pool2d_1, 5638529
fuse_conv2d_2, 1320628070
fuse_relu_2, 2861149
fuse_conv2d_3, 2639968990
fuse_relu_2, 2845687
fuse_conv2d_4, 1760286796
fuse_relu_3, 1882225
fuse_max_pool2d_2, 1262377
fuse_conv2d_5, 1424640579
fuse_relu_4, 208729
fuse_conv2d_6, 1009261369
fuse_relu_4, 206017
fuse_conv2d_7, 1968815
alexnet simd
fuse_reshape, 4990521
fuse_conv2d, 968476763
fuse_relu, 10983669
fuse_max_pool2d, 9322816
fuse_conv2d_1, 2336809193
fuse_relu_1, 7594695
fuse_max_pool2d_1, 5872844
fuse_conv2d_2, 635766667
fuse_relu_2, 3248181
fuse_conv2d_3, 1270026979
fuse_relu_2, 3232431
fuse_conv2d_4, 846992741
fuse_relu_3, 2134131
fuse_max_pool2d_2, 1276622
fuse_conv2d_5, 870176679
fuse_relu_4, 237427
fuse_conv2d_6, 1112577310
fuse_relu_4, 229789
fuse_conv2d_7, 2441685
```
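
To compare the two builds kernel by kernel, the cycle table can be turned into nosimd/simd ratios. The following is a rough sketch, not the script that produced the table: it assumes every line without a comma is a group header (for example `fxp nosimd`) and every other line is `name, cycles`. Rows are paired by position rather than by name, because some kernel names repeat in the Alexnet groups. `parse_table`, `speedups` and `TABLE` are hypothetical names; the sample holds the first three fxp rows of each build.

```python
def parse_table(text):
    """Map each header line to its ordered list of (kernel, cycles) rows."""
    groups = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if "," in line:
            name, cycles = line.rsplit(",", 1)
            # Tolerate rows that still carry the simulator's echo prefix.
            name = name.replace("Unknown command", "", 1).strip()
            groups.setdefault(current, []).append((name, int(cycles)))
        else:
            current = line
            groups[current] = []
    return groups

def speedups(baseline, simd):
    """Return [(kernel, baseline_cycles / simd_cycles), ...], paired by position."""
    out = []
    for (name_b, cyc_b), (name_s, cyc_s) in zip(baseline, simd):
        if name_b != name_s:
            raise ValueError(f"row mismatch: {name_b} vs {name_s}")
        out.append((name_b, cyc_b / cyc_s))
    return out

TABLE = """\
fxp nosimd
fuse_matmul, 3218017
fuse_elemwise_add, 1318
fuse_sigmoid, 5793181
fxp simd
fuse_matmul, 1236417
fuse_elemwise_add, 253
fuse_sigmoid, 5309692
"""

groups = parse_table(TABLE)
for kernel, ratio in speedups(groups["fxp nosimd"], groups["fxp simd"]):
    print(f"{kernel}: {ratio:.2f}x")
```

A ratio above 1 means the simd build needed fewer cycles for that kernel; a ratio below 1 (as some relu and softmax rows in the table suggest) means it needed more.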