# Dynamic Scaling in Docker
[GitHub](https://github.com/wilson20010327/ScaleableResourceForDocker)
* Every request is sent to the mn1 server. After receiving a request, mn1 forwards it to the mn2 server with 20% probability. Each request creates a CPU-intensive task on the target server (computing the prime numbers from 1 to N).
* Rt in the reward function is measured during the last 5 seconds of each 30-second period. During that window, one probe request per second is sent to each of the mn1 and mn2 servers and its response time is recorded. The probes use different ports from the ones that receive the CPU-intensive tasks. Rt is the average of the five response times, with one exception: if any of the five probes times out, Rt is set to the maximum Rt (50 ms in our setting).
* The CPU utilization used for $c_{utilization}$ in the reward function is also measured during the last 5 seconds of each 30-second period. The container's metrics are read with the `docker stats` command, and the five CPU usage values they contain are averaged.
* Reward function
To calculate the reward, I apply a moving average over the last 4 values to smooth the unstable CPU utilization and Rt.
\begin{align*}
Rt &= \begin{cases}
RtList_{lastfourmean} & \text{if } RtList_{size} \geq 4 \\
Rt & \text{otherwise}
\end{cases}, \\
cpu_{utilization} &= \begin{cases}
cpu_{utilization}List_{lastfourmean} & \text{if } cpu_{utilization}List_{size} \geq 4 \\
cpu_{utilization} & \text{otherwise}
\end{cases}, \\
c_{delay} &= \begin{cases}
1 & \text{if } Rt \geq t_{max} \\
0 & \text{otherwise}
\end{cases}, \\
c_{utilization} &= \begin{cases}
1 & \text{if } cpu_{utilization} > 0.9 \\
0 & \text{otherwise}
\end{cases}, \\
c_{res} &= \begin{cases}
1 & \text{if } cpu_{utilization} < 0.4 \\
0 & \text{otherwise}
\end{cases}, \\
reward_{perf} &= w_{perf} \cdot (c_{delay} + c_{utilization}), \\
reward_{res} &= w_{res} \cdot c_{res}, \\
reward &= -(reward_{perf} + reward_{res}).
\end{align*}
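As a worked example of the formulas above, take the weights used in this project ($w_{perf} = w_{res} = 0.5$) and four illustrative measurements. The utilization values 0.95, 0.30 and 0.60 are made up for illustration and are not measured results.

\begin{align*}
Rt < t_{max},\ cpu_{utilization} = 0.95 &: \quad reward = -(0.5 \cdot (0 + 1) + 0.5 \cdot 0) = -0.5, \\
Rt < t_{max},\ cpu_{utilization} = 0.30 &: \quad reward = -(0.5 \cdot (0 + 0) + 0.5 \cdot 1) = -0.5, \\
Rt < t_{max},\ cpu_{utilization} = 0.60 &: \quad reward = -(0.5 \cdot (0 + 0) + 0.5 \cdot 0) = 0, \\
Rt \geq t_{max},\ cpu_{utilization} = 0.95 &: \quad reward = -(0.5 \cdot (1 + 1) + 0.5 \cdot 0) = -1.
\end{align*}

Since the overload condition (above 0.9) and the under-utilization condition (below 0.4) cannot hold at the same time, the reward with these weights only takes the values 0, -0.5 and -1.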
```python
# delay cost: 1 if the response time reaches the limit
c_delay = 1 if Rt >= self.t_max else 0
# cpu_utilization cost: 1 if the container is overloaded
c_utilization = 1 if relative_cpu_utilization > 0.9 else 0
# performance cost
c_perf = c_delay + c_utilization
# resource penalty: 1 if the measured utilization is below 0.4
c_res = 1 if relative_cpu_utilization < 0.4 else 0
# calculate the reward
reward_perf = self.w_perf * c_perf
reward_res = self.w_res * c_res
reward = -(reward_perf + reward_res)
```
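The measurement and smoothing steps described above can be put together in a small self-contained sketch. This is not the project's code: the names `aggregate_rt`, `Smoother` and `reward` are hypothetical, a timed-out probe is represented by `None`, and the moving average is assumed to include the newest value. The 50 ms cap, the window of 4 and the thresholds come from the text.

```python
from collections import deque
from statistics import mean

MAX_RT_MS = 50.0   # Rt used when any probe times out (from the text)
WINDOW = 4         # moving-average window (from the text)


def aggregate_rt(samples_ms):
    """Average the five probe response times; None marks a timeout."""
    if any(s is None for s in samples_ms):
        return MAX_RT_MS
    return mean(samples_ms)


class Smoother:
    """Mean of the last `window` values once enough exist, else the raw value."""

    def __init__(self, window=WINDOW):
        self.window = window
        self.values = deque(maxlen=window)

    def update(self, value):
        # Assumption: the newest value is part of the averaged window.
        self.values.append(value)
        if len(self.values) >= self.window:
            return mean(self.values)
        return value


def reward(rt_ms, cpu_utilization, t_max_ms, w_perf=0.5, w_res=0.5):
    """Reward from the (already smoothed) response time and CPU utilization."""
    c_delay = 1 if rt_ms >= t_max_ms else 0
    c_utilization = 1 if cpu_utilization > 0.9 else 0
    c_res = 1 if cpu_utilization < 0.4 else 0
    return -(w_perf * (c_delay + c_utilization) + w_res * c_res)
```

A typical period would call `aggregate_rt` on the five probe results, pass the result and the averaged CPU usage through one `Smoother` each, and then call `reward` with the smoothed values and the chosen `t_max_ms`.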
## MPDQN Train
* 8 epochs
* 3600 s per epoch
* N=20000
* Request rate
Following the [paper](https://github.com/effereds/rlad-core-simulator), the request rates come from its list named "request". One value is taken from this list every 30 seconds, e.g. request[0] at time 0 and request[30] at time 30; the values in between are ignored.

* tau_actor: 0.1
* tau_actor_param: 0.01
* learning_rate_actor: 0.01
* learning_rate_actor_param: 0.001
* gamma: 0.9
* epsilon_steps: 840
* epsilon_final: 0.01
* replay_memory_size: 960
* batch_size: 16
* loss_function: MSE loss
* layers: [64]
* w_perf: 0.5
* w_res: 0.5
* action space: Hybrid 3 (discrete for replicas, continuous for cpus)
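The request-rate sampling described in the list above amounts to taking every 30th element of the trace. The sketch below uses a hypothetical stand-in trace (the value at index t is t itself) instead of the real "request" list, and assumes sampling stops before the 3600 s epoch boundary.

```python
PERIOD_S = 30     # one decision period (from the text)
EPOCH_S = 3600    # epoch length (from the list above)


def sample_requests(request, period=PERIOD_S, epoch=EPOCH_S):
    """Return the request rates used at t = 0, period, 2*period, ... < epoch."""
    return request[0:epoch:period]


# Hypothetical trace: the value at index t is t, so the picked values
# show directly which indices were used.
trace = list(range(EPOCH_S))
picked = sample_requests(trace)

steps_per_epoch = len(picked)               # 3600 / 30 = 120
total_training_steps = 8 * steps_per_epoch  # 8 epochs -> 960
```

Under these assumptions one epoch has 120 decision steps, so 8 training epochs give 960 steps, the same number as `replay_memory_size`.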
### Train Plot
```
First_level_MNCSE Avg_Cpus: 0.8453645833333202
First_level_MNCSE Avg_Replicas: 2.1666666666666665
First_level_MNCSE Median: 18.75
First_level_MNCSE Tmax_violation: 0.446875
Second_level_MNCSE Avg_Cpus: 0.8432395833333205
Second_level_MNCSE Avg_Replicas: 2.221875
Second_level_MNCSE Median: 1.5
Second_level_MNCSE Tmax_violation: 0.0
```
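The notes do not state how the four summary values in the log above are computed. A plausible reading, taken here as an assumption, is that `Median` is the median response time in ms and `Tmax_violation` is the fraction of 30-second periods whose Rt reaches `t_max`. The function name `summarize` and the sample numbers are hypothetical.

```python
from statistics import mean, median


def summarize(cpus, replicas, response_times_ms, t_max_ms):
    """Summarize one run; each list holds one value per 30-second period."""
    # Assumption: a period violates the limit when Rt >= t_max,
    # matching the condition used for c_delay.
    violations = sum(1 for rt in response_times_ms if rt >= t_max_ms)
    return {
        "Avg_Cpus": mean(cpus),
        "Avg_Replicas": mean(replicas),
        "Median": median(response_times_ms),
        "Tmax_violation": violations / len(response_times_ms),
    }


# Hypothetical four-period log, not measured data.
stats = summarize(
    cpus=[1.0, 0.8, 0.8, 0.6],
    replicas=[1, 2, 3, 2],
    response_times_ms=[12.0, 50.0, 9.0, 30.0],
    t_max_ms=25.0,
)
```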
MN1
|  |  |
| -------- | -------- |
|  |  |
|  |  |
MN2
|  |  |
| -------- | -------- |
|  |  |
## MPDQN Evaluation
* 1 epoch
* 3600 s per epoch
* Request rate
The same request rate as in training is used.
* Parameters same as training
### Evaluation Plot
```
First_level_MNCSE Avg_Cpus: 0.8016528925619817
First_level_MNCSE Avg_Replicas: 1.9917355371900827
First_level_MNCSE Median: 13.5
First_level_MNCSE Tmax_violation: 0.2892561983471074
Second_level_MNCSE Avg_Cpus: 0.8016528925619817
Second_level_MNCSE Avg_Replicas: 2.834710743801653
Second_level_MNCSE Median: 0.5
Second_level_MNCSE Tmax_violation: 0.0
```
MN1
| |  |
| -------- | -------- |
|  |  |
MN2
|  |  |
| -------- | -------- |
| |  |
## DQN Train
* 8 epochs
* 3600 s per epoch
* N=20000
* Request rate
Following the [paper](https://github.com/effereds/rlad-core-simulator), the request rates come from its list named "request". One value is taken from this list every 30 seconds, e.g. request[0] at time 0 and request[30] at time 30; the values in between are ignored.

* tau: 0.1
* learning_rate: 0.01
* gamma: 0.9
* epsilon_steps: 840
* epsilon_final: 0.01
* replay_memory_size: 960
* batch_size: 16
* loss_function: MSE loss
* layers: [128]
* w_perf: 0.5
* w_res: 0.5
* action space: discrete 5 ([None, +0.1, -0.1, +1, -1])
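One way to read the five discrete actions above is as a lookup from action index to a change in replicas or cpus. The sketch below assumes that the +0.1/-0.1 actions change the cpus limit and the +1/-1 actions change the replica count. The bounds (1 to 3 replicas, 0.1 to 1.0 cpus) and the name `apply_action` are hypothetical and not taken from the project.

```python
# Index -> (replica delta, cpu delta), in the order
# [None, +0.1, -0.1, +1, -1]. The pairing of +-0.1 with cpus and
# +-1 with replicas is an assumption.
ACTIONS = [(0, 0.0), (0, 0.1), (0, -0.1), (1, 0.0), (-1, 0.0)]


def apply_action(replicas, cpus, action_index,
                 replica_bounds=(1, 3), cpu_bounds=(0.1, 1.0)):
    """Apply one discrete action and clip the result to the given bounds."""
    d_rep, d_cpu = ACTIONS[action_index]
    new_replicas = min(max(replicas + d_rep, replica_bounds[0]),
                       replica_bounds[1])
    # Round to one decimal so repeated 0.1 steps do not drift.
    new_cpus = min(max(round(cpus + d_cpu, 1), cpu_bounds[0]),
                   cpu_bounds[1])
    return new_replicas, new_cpus
```

With this layout each step changes at most one of the two resources, which is the main difference from the hybrid action space used by MPDQN.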
### Train Plot
MN1
|  |  |
| -------- | -------- |
|  |  |
MN2
|  |  |
| -------- | -------- |
|  |  |
## DQN Evaluation
* 1 epoch
* 3600 s per epoch
* Request rate
The same request rate as in training is used.
* Parameters same as training
### Evaluation Plot
```
First_level_MNCSE Avg_Cpus: 1.0
First_level_MNCSE Avg_Replicas: 1.0
First_level_MNCSE Median: 34.0
First_level_MNCSE Tmax_violation: 0.7355371900826446
Second_level_MNCSE Avg_Cpus: 0.708264462809919
Second_level_MNCSE Avg_Replicas: 1.0
Second_level_MNCSE Median: 2.0
Second_level_MNCSE Tmax_violation: 0.0
```
MN1
|  |  |
| -------- | -------- |
|  |  |
|  | |
MN2
|  |  |
| -------- | -------- |
|  |  |
| |  |
## DQN-9 Train
* 8 epochs
* 3600 s per epoch
* N=20000
* Request rate
Following the [paper](https://github.com/effereds/rlad-core-simulator), the request rates come from its list named "request". One value is taken from this list every 30 seconds, e.g. request[0] at time 0 and request[30] at time 30; the values in between are ignored.

* tau: 0.1
* learning_rate: 0.01
* gamma: 0.9
* epsilon_steps: 840
* epsilon_final: 0.01
* replay_memory_size: 960
* batch_size: 16
* loss_function: MSE loss
* layers: [128]
* w_perf: 0.5
* w_res: 0.5
* action space: discrete 9
([-1, 0.1], [-1, 0], [-1, -0.1], [0, 0.1], [0, 0], [0, -0.1], [1, 0.1], [1, 0], [1, -0.1])
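The nine actions listed above are the Cartesian product of three values for the first element and three for the second. The sketch below reproduces the list in the same order. Reading the first element as the replica change and the second as the cpus change is an assumption, and the names `REPLICA_DELTAS`, `CPU_DELTAS` and `decode` are hypothetical.

```python
from itertools import product

REPLICA_DELTAS = [-1, 0, 1]     # assumed: first element of each pair
CPU_DELTAS = [0.1, 0, -0.1]     # assumed: second element of each pair

# Same order as the action list above.
ACTIONS_9 = [list(pair) for pair in product(REPLICA_DELTAS, CPU_DELTAS)]


def decode(action_index):
    """Map a discrete action index (0..8) to (replica delta, cpu delta)."""
    d_rep, d_cpu = ACTIONS_9[action_index]
    return d_rep, d_cpu
```

Unlike the 5-action variant, a single action here can change replicas and cpus in the same step.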
### Train Plot
MN1
| | |
| -------- | -------- |
|  |  |
|  |  |
MN2
|  | |
| -------- | -------- |
|  |  |
|  |  |
## DQN-9 Evaluation
* 1 epoch
* 3600 s per epoch
* Request rate
The same request rate as in training is used.
* Parameters same as training
### Evaluation Plot
```
First_level_MNCSE Avg_Cpus: 1.0
First_level_MNCSE Avg_Replicas: 2.9338842975206614
First_level_MNCSE Median: 10.0
First_level_MNCSE Tmax_violation: 0.12396694214876033
Second_level_MNCSE Avg_Cpus: 0.7173553719008279
Second_level_MNCSE Avg_Replicas: 1.28099173553719
Second_level_MNCSE Median: 1.5
Second_level_MNCSE Tmax_violation: 0.0
```
MN1
|  | |
| -------- | -------- |
|  |  |
|  |  |
MN2
|  |  |
| -------- | -------- |
|  |  |
|  |  |
## Graph comparison
### N=20000
#### MN1
| | MPDQN | DQN | DQN-9 |
| ------------ | ----- | ---- | ----- |
| Avg_Cpus | 0.8016528925619817 | 1.0 | 1.0 |
| Avg_Replicas | 1.9917355371900827 | 1.0 | 2.9338842975206614 |
| Median | 13.5 | 34.0 | 10.0 |
| Tmax_violation | 0.2892561983471074 | 0.7355371900826446| 0.12396694214876033 |

| Plot | MPDQN | DQN | DQN-9 |
| --------------- | ------------------------------------------------------------------------------ | --- | ----- |
| reward |  |  |  |
| Cpus |  |  |  |
| CPU utilization |  |  |  |
| Replicas |  |  |  |
| Resource |  |  |  |
| Response time |  |  | |
#### MN2
| | MPDQN | DQN | DQN-9 |
| ------------ | ----- | ---- | ----- |
| Avg_Cpus | 0.8016528925619817 | 0.708264462809919 | 0.7173553719008279 |
| Avg_Replicas | 2.834710743801653 | 1.0 | 1.28099173553719 |
| Median | 0.5 | 2.0 | 1.5 |
| Tmax_violation | 0.0 | 0.0 | 0.0 |

| Plot | MPDQN | DQN | DQN-9 |
| -------- | -------- | -------- | --- |
| reward |  |  |  |
| Cpus |  |  |  |
| CPU utilization |  |  |  |
| Replicas |  |  |  |
| Resource |  |  |  |
| Response time |  |  |  |
### N=4000
#### MN1
| | MPDQN | DQN | DQN-9 |
| ------------ | ----- | ---- | ----- |
| Avg_Cpus | 0.8016528925619817 | 0.9983471074380165 | 1.0 |
| Avg_Replicas | 2.9834710743801653 | 1.0 | 2.975206611570248 |
| Median | 10.0 | 25.0 | 10.0 |
| Tmax_violation | 0.05785123966942149 | 0.628099173553719 | 0.09917355371900827 |

| Plot | MPDQN | DQN | DQN-9 |
| -------- | -------- | -------- | --- |
| reward |  |  |  |
| Cpus |  |  |  |
| CPU utilization |  |  |  |
| Replicas |  |  |  |
| Resource |  |  |  |
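| Response time | ![](https://hackmd.io/_uploads/S1Wwy2f96.png) | ![](https://hackmd.io/_uploads/rJSnJhMcp.png) | ![](https://hackmd.io/_uploads/B1l1e3M56.png) |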
#### MN2
| | MPDQN | DQN | DQN-9 |
| ------------ | ----- | ---- | ----- |
| Avg_Cpus | 0.8016528925619817 | 0.9495867768595043 | 0.7338842975206628 |
| Avg_Replicas | 2.0661157024793386 | 1.4049586776859504 | 1.0 |
| Median | 5.5 | 0.5 | 6.5 |
| Tmax_violation | 0.0 | 0.0 | 0.0 |

| Plot | MPDQN | DQN | DQN-9 |
| -------- | -------- | -------- | --- |
| reward |  |  |  |
| Cpus |  |  |  |
| CPU utilization |  |  |  |
| Replicas |  |  |  |
| Resource |  |  |  |
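| Response time | ![](https://hackmd.io/_uploads/BkAFZnfqp.png) | ![](https://hackmd.io/_uploads/rJ6TZ3zqT.png) | ![](https://hackmd.io/_uploads/S1NmMhz9p.png) |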
### N=2000
#### MN1
| | MPDQN | DQN | DQN-9 |
| ------------ | ----- | ---- | ----- |
| Avg_Cpus | 0.8016528925619817 | 1.0 | 1.0 |
| Avg_Replicas | 1.9256198347107438 | 1.0 | 2.958677685950413 |
| Median | 10.0 | 24.5 | 10.0 |
| Tmax_violation | 0.17355371900826447 | 0.5950413223140496 | 0.10743801652892562 |

| Plot | MPDQN | DQN | DQN-9 |
| -------- | -------- | -------- | --- |
| reward |  |  |  |
| Cpus |  |  |  |
| CPU utilization |  |  |  |
| Replicas |  |  |  |
| Resource |  |  |  |
| Response time |  |  |  |
#### MN2
| | MPDQN | DQN | DQN-9 |
| ------------ | ----- | ---- | ----- |
| Avg_Cpus | 0.8016528925619817 | 0.9049586776859507 | 0.7876033057851248 |
| Avg_Replicas | 2.9834710743801653 | 1.0 | 1.0 |
| Median | 5.5 | 0.5 | 6.0 |
| Tmax_violation | 0.0 | 0.0 | 0.0 |

| Plot | MPDQN | DQN | DQN-9 |
| -------- | -------- | -------- | --- |
| reward |  |  |  |
| Cpus |  |  |  |
| CPU utilization |  |  |  |
| Replicas |  |  |  |
| Resource |  |  |  |
| Response time |  |  |  |