# Queue Overload - Autoscaling
## Navigation
1. [Problem](https://hackmd.io/@jwdunne/HJXyhNY4h)
2. [Observability](https://hackmd.io/@jwdunne/S1pJ1CgHn)
3. [Testing](https://hackmd.io/@jwdunne/H1zKkAeSn)
4. [Throughput optimisation opportunities](https://hackmd.io/@jwdunne/H1h2k0xH3)
5. [Backpressure](https://hackmd.io/@jwdunne/B1WZeCeBh)
6. [Load shedding](https://hackmd.io/@jwdunne/BJB4MReH2)
7. [Autoscaling](https://hackmd.io/@jwdunne/Bkw_zAxHn)
## Solution
> This depends on a working solution for DB connection pooling to be fully effective. We can, however, get it working within the same capacity we have now.
We can configure the ECS service to autoscale based on a custom metric and a target value. AWS allows us to:
1. Set a min and max capacity, to prevent runaway costs
2. Set a scale-out cooldown period, so it doesn't trigger a runaway scale-out
3. Set a scale-in cooldown period, so it doesn't scale in too early and let the problem return
We should be conservative with these values to begin with and iterate.
This is also the general approach to autoscaling any ECS service. If we later need to do this for API workers, this is how we would do it.
#### Variables
To control autoscaling across environments (it'd be undesirable in staging long-term, but useful for testing), we should define a set of variables controlling autoscaling behaviour.
```terraform
variable "autoscaling_worker" {
  description = "Controls autoscaling parameters for workers"

  type = object({
    min_capacity               = number
    max_capacity               = number
    target                     = number
    scale_in_cooldown_seconds  = number
    scale_out_cooldown_seconds = number
  })
}
```
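As a starting point, values like the following would be conservative (the numbers here are illustrative assumptions to iterate on, not recommendations):
```terraform
# Hypothetical production starting values - iterate once we have real data.
autoscaling_worker = {
  min_capacity               = 2   # current baseline capacity
  max_capacity               = 6   # caps runaway costs
  target                     = 70  # queued/completed ratio to track, in percent
  scale_in_cooldown_seconds  = 600 # scale in slowly so the problem doesn't return
  scale_out_cooldown_seconds = 120 # scale out cautiously to avoid runaway scale-out
}
```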
#### Set up a custom CloudWatch metric
We just need to send a metric to CloudWatch from the application:
```php
$cloudWatch->putMetricData([
    'MetricData' => [
        [
            'MetricName' => 'JobQueuedCompletedRatio',
            'Value' => $queueLoad->ratio(),
            // Match the unit declared in the scaling policy below.
            'Unit' => 'Percent',
        ],
    ],
    'Namespace' => 'Leadflo/Queue',
]);
```
We should sample this for every 5% (configurable) of jobs queued, to minimise the cost of sending the data to AWS whilst still giving CloudWatch reasonable data.
We can do this in a listener that listens for both `JobQueued` and `JobCompleted` events. These would be synchronous listeners, since we cannot rely on the queue. That may incur some latency, but under FPM we can dispatch the events after the response has been sent to the browser (e.g. via `fastcgi_finish_request`), which should mitigate this. A sketch follows:
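Here is a minimal sketch of such a listener, assuming a `QueueLoad` service exposing the `ratio()` method used above and a hypothetical `queue.metric_sample_rate` config value:
```php
<?php

use Aws\CloudWatch\CloudWatchClient;

// Sketch only: class name, wiring, and config key are assumptions.
final class ReportQueueLoad
{
    public function __construct(
        private CloudWatchClient $cloudWatch,
        private QueueLoad $queueLoad // assumed service tracking queued/completed counts
    ) {
    }

    // Registered synchronously for both JobQueued and JobCompleted events.
    public function handle(object $event): void
    {
        // Sample ~5% of events (configurable) to keep CloudWatch API costs down.
        if (random_int(1, 100) > config('queue.metric_sample_rate', 5)) {
            return;
        }

        $this->cloudWatch->putMetricData([
            'MetricData' => [
                [
                    'MetricName' => 'JobQueuedCompletedRatio',
                    'Value' => $this->queueLoad->ratio(),
                    'Unit' => 'Percent',
                ],
            ],
            'Namespace' => 'Leadflo/Queue',
        ]);
    }
}
```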
#### Set up a scalable target
This configures a target for application autoscaling.
```terraform
resource "aws_appautoscaling_target" "worker" {
max_capacity = var.autoscaling_worker.min_capacity
min_capacity = var.autoscaling_worker.max_capacity
resource_id = "service/${aws_ecs_cluster.main.name}/${aws_ecs_service.leadflo_worker.name}"
scalable_dimension = "ecs:service:DesiredCount"
service_namespace = "ecs"
}
```
The variables allow us to keep staging pinned at a flat capacity of 1, effectively disabling autoscaling there.
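For example, staging's variable could pin capacity like this (illustrative values):
```terraform
# Staging: min == max pins the worker count, effectively disabling autoscaling.
autoscaling_worker = {
  min_capacity               = 1
  max_capacity               = 1
  target                     = 100 # irrelevant while min == max
  scale_in_cooldown_seconds  = 0
  scale_out_cooldown_seconds = 0
}
```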
#### Set up an auto-scaling policy
```terraform
resource "aws_appautoscaling_policy" "worker" {
name = "worker"
policy_type = "TargetTrackingScaling"
resource_id = aws_appautoscaling_target.worker.resource_id
scalable_dimension = aws_appautoscaling_target.worker.scalable_dimension
service_namespace = aws_appautoscaling_target.worker.service_namespace
target_tracking_scaling_policy_configuration {
target_value = var.autoscaling_worker.target
scale_in_cooldown = var.autoscaling_worker.scale_in_cooldown_seconds
scale_out_cooldown = var.autoscaling_worker.scale_out_cooldown_seconds
customized_metric_specification {
namespace = "Leadflo/Queue"
metric_name = "JobQueuedCompletedRatio"
statistic = "Average"
unit = "Percent"
}
}
}
```
## Implementation
- Implement an IAM role for sending metrics to CloudWatch (see the sketch after this list)
- Implement a Terraform variable for configuring worker autoscaling
- Configure a scalable target and auto-scaling policy
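As a sketch of the IAM piece, the worker task role needs `cloudwatch:PutMetricData`; the role reference and naming below are assumptions:
```terraform
# Hypothetical policy attachment - assumes an existing worker task role.
resource "aws_iam_role_policy" "worker_put_metrics" {
  name = "worker-put-metrics"
  role = aws_iam_role.leadflo_worker_task.id # assumed existing task role

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect = "Allow"
      Action = ["cloudwatch:PutMetricData"]
      # PutMetricData does not support resource-level permissions,
      # but we can restrict it to our namespace with a condition.
      Resource = "*"
      Condition = {
        StringEquals = { "cloudwatch:namespace" = "Leadflo/Queue" }
      }
    }]
  })
}
```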