# Performance Tuning: naivepool
* The source code is at [`unknowntpo/naivepool`](https://github.com/unknowntpo/naivepool)
## Performance comparison between different implementations of naivepool
<img src="https://i.imgur.com/obpiC4l.png"></img>
## 1. Only restrict the number of goroutines
The first implementation only limits the number of workers.
It uses a `jobChan` to receive jobs from the user and starts a new goroutine for every job.
* Trace of the program running a simple CPU-bound task:
<img src="https://i.imgur.com/8Q6D3tB.png"></img>
* The program spends more time scheduling tasks than doing real work.
## 2. Reuse goroutines
### 2.1 Pool: For-select loop, workers: for-select loop
This version of naivepool uses a for-select loop to receive jobs from `Pool.jobChan` and sends them to the workers through `workerChan`.
Each worker uses a for-select loop to receive jobs from the pool until the channel is closed.
#### The cost of select statement
A select statement needs to:
* Step 1: Shuffle the order of all the channels it waits on.
* Step 2: Check each channel one by one (this does not require a lock) to see if it is ready for communication. If one is ready, perform the communication and return.
* Step 3: Prepare to block on all channels. For every channel:
    1. Lock the channel and add the goroutine running the `select` statement to that channel's wait queue.
    2. Check whether the channel is ready. If it is, unlock the channel and proceed to send or receive the value.
    3. If it is not ready, unlock the channel and move on to the next channel.
* Step 4: Block.

The pseudocode from the design document:
```c
Scase *select(Select *sel) {
    randomize channel order;
    for(;;) {
        // Phase 1.
        foreach(Scase *cas in sel) {
            if(chansend/recv_nonblock(cas->c, ...))
                return cas;
        }
        // Phase 2.
        selectstate = nil;
        foreach(Scase *cas in sel) {
            lock(cas->c);
            cas->sg->g = g;
            cas->sg->selectstatep = &selectstate;
            addwaiter(&cas->c->sendq/recvq, cas->sg);
            if(isready(cas->c)) {
                unlock(c);
                goto ready;
            }
            unlock(cas->c);
        }
        // Phase 3.
        block();
    ready:
        CAS(&selectstate, nil, 1);
        foreach(Scase *cas in sel) {
            lock(cas->c);
            removewaiter(&cas->c->sendq/recvq, cas->sg);
            unlock(cas->c);
        }
        // If we were unblocked by a sync chan operation,
        // the communication has completed.
        if(selectstate > 1)
            return selectstate; // denotes the completed case
    }
}
```
See [Go channels on steroids](https://docs.google.com/document/d/1yIAYmbvL3JxOKOjuCyon7JhW4cSv1wy5hC0ApeGMV9s/pub) for more information.
### 2.2 Use for-range loop on both pool and workers
* Less lock contention!
    * A worker or dispatcher only needs to lock the channel once to add itself to the channel's `recvq` / `sendq`.
    * See [chanrecv](https://cs.opensource.google/go/go/+/master:src/runtime/chan.go;l=455?q=chanrecv&sq=&ss=go%2Fgo), [chansend](https://cs.opensource.google/go/go/+/master:src/runtime/chan.go;l=159?q=chansend&sq=&ss=go%2Fgo)
### 2.3 Remove jobChan
This version of naivepool uses `pool.Schedule` to send jobs directly to `pool.workerChan`.
Each worker uses a for-range loop to receive jobs from the pool until the channel is closed.
### 2.4 Auto-Scaling
* Growing
    * If `len(workerChan) >= 2/3 * cap(workerChan)` and `p.NumIdleWorkers < some ratio`, dispatch more workers.
* Shrinking
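
The growing condition above could be checked with a small helper like the one below. This is only a sketch: `shouldGrow`, its parameters, and the `0.25` threshold are hypothetical, and reading "some ratio" as the fraction of idle workers is an assumption, since the notes do not pin it down. Integer arithmetic is used for the two-thirds test because `2/3` evaluates to `0` with Go integer division.

```go
package main

import "fmt"

// shouldGrow reports whether the pool should start more workers.
// queued and capacity correspond to len(workerChan) and cap(workerChan).
// maxIdleRatio is a hypothetical threshold on idle/total workers.
func shouldGrow(queued, capacity, idle, total int, maxIdleRatio float64) bool {
	if capacity == 0 || total == 0 {
		return false
	}
	// queued >= 2/3 * capacity, written without integer division.
	queueBusy := 3*queued >= 2*capacity
	fewIdle := float64(idle)/float64(total) < maxIdleRatio
	return queueBusy && fewIdle
}

func main() {
	workerChan := make(chan func(), 9)
	for i := 0; i < 6; i++ {
		workerChan <- func() {}
	}
	// Queue is 6/9 full and 1 of 8 workers is idle.
	fmt.Println(shouldGrow(len(workerChan), cap(workerChan), 1, 8, 0.25)) // true
	// Queue is only 3/9 full.
	fmt.Println(shouldGrow(3, 9, 1, 8, 0.25)) // false
	// Queue is busy, but half of the workers are idle.
	fmt.Println(shouldGrow(6, 9, 4, 8, 0.25)) // false
}
```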