# [11-02 to 11-06]

- Single-image throughput was 49 imgs/sec.
- Batch size for optimum throughput:

| Batch size | Time | Throughput (imgs/sec) |
| ---------- | ------ | --------------------- |
| 2 | 20 ms | 100 |
| 4 | 27 ms | 148 |
| 6 | 44 ms | 136 |
| 8 | 58 ms | 138 |
| 10 | 62 ms | 161 |
| 16 | 83 ms | 193 |
| 32* | 100 ms | 320 |
| 64 | 169 ms | 378 |
| 128 | 312 ms | 410 |
| 256 | out of memory | — |

Key GPU performance changes in the code:

- Exported the ONNX model with support for dynamic input dimensions.
- Set the maximum batch size when creating the engine.
- Use `execute_async` instead of `execute` for inference; it improves performance by overlapping work across GPU streams (data transfer with kernel compute).

(Figure: profile of the batch inference code — 384 image samples, batch size 32 — showing GPU streams.)
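The throughput column is consistent with throughput = batch_size / batch_latency. A quick sanity check against a few rows of the table above (latencies converted to seconds):

```python
# Verify that reported throughput matches batch_size / latency
# for a few rows of the benchmark table. Latencies are in seconds.
measurements = {2: 0.020, 4: 0.027, 32: 0.100, 128: 0.312}

for batch_size, latency in measurements.items():
    throughput = batch_size / latency
    print(f"batch {batch_size:>3}: {throughput:.0f} imgs/sec")
# batch   2: 100 imgs/sec
# batch   4: 148 imgs/sec
# batch  32: 320 imgs/sec
# batch 128: 410 imgs/sec
```

Note the diminishing returns: going from batch 32 to 128 quadruples the memory footprint for only ~28% more throughput, and 256 already exhausts GPU memory, which is why 32 (starred above) was picked as the operating point.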