AWS ECS Autoscaling: Target Tracking, Step Scaling, and Fargate Capacity Providers with Terraform
Configure AWS ECS autoscaling for Fargate workloads. Covers target tracking on CPU and memory, step scaling policies for burst traffic, ECS capacity providers, scale-in protection, cooldown periods, and complete Terraform configuration.
ECS Fargate autoscaling solves a billing problem and a reliability problem simultaneously. Without it, you either overprovision (paying for idle capacity) or underprovision (tasks crash under load). The key is choosing the right scaling policy: target tracking for steady-state workloads, step scaling for predictable burst patterns.
Scaling Policy Comparison
| Policy | Use Case | Behavior | Latency |
|---|---|---|---|
| Target tracking (CPU 50%) | Web APIs, SaaS apps | Continuously adjusts to maintain target | 60–120 seconds |
| Target tracking (request count) | ALB-fronted services | Scales per requests-per-task | 60–120 seconds |
| Step scaling | Known burst patterns (e.g., batch jobs) | Add N tasks when threshold exceeded | 30–60 seconds |
| Scheduled scaling | Predictable traffic (e.g., business hours) | Pre-scale before load arrives | Instant |
Terraform: ECS Service with Autoscaling
# terraform/ecs-service.tf
# ECS Cluster
resource "aws_ecs_cluster" "main" {
name = "${var.app_name}-cluster"
setting {
name = "containerInsights"
value = "enabled" # CloudWatch Container Insights
}
}
# ECS Service (Fargate)
resource "aws_ecs_service" "app" {
name = "${var.app_name}-app"
cluster = aws_ecs_cluster.main.id
task_definition = aws_ecs_task_definition.app.arn
desired_count = 2 # Starting count (autoscaling overrides this)
launch_type = "FARGATE"
network_configuration {
subnets = var.private_subnet_ids
security_groups = [aws_security_group.app.id]
assign_public_ip = false
}
load_balancer {
target_group_arn = aws_lb_target_group.app.arn
container_name = "app"
container_port = 3000
}
# Prevent Terraform from resetting desired_count on every apply
lifecycle {
ignore_changes = [desired_count]
}
depends_on = [aws_lb_listener.https]
}
# ─── Autoscaling Target ───────────────────────────────────────────────────────
resource "aws_appautoscaling_target" "ecs" {
max_capacity = 20 # Hard ceiling on task count
min_capacity = 2 # Always keep at least 2 tasks (HA)
resource_id = "service/${aws_ecs_cluster.main.name}/${aws_ecs_service.app.name}"
scalable_dimension = "ecs:service:DesiredCount"
service_namespace = "ecs"
}
# ─── Policy 1: Target Tracking on CPU (primary) ──────────────────────────────
resource "aws_appautoscaling_policy" "cpu_tracking" {
name = "${var.app_name}-cpu-target-tracking"
policy_type = "TargetTrackingScaling"
resource_id = aws_appautoscaling_target.ecs.resource_id
scalable_dimension = aws_appautoscaling_target.ecs.scalable_dimension
service_namespace = aws_appautoscaling_target.ecs.service_namespace
target_tracking_scaling_policy_configuration {
target_value = 50.0 # Keep average CPU at 50%
scale_in_cooldown = 300 # 5 min before scaling in (avoids flapping)
scale_out_cooldown = 60 # 1 min before adding tasks (fast response)
predefined_metric_specification {
predefined_metric_type = "ECSServiceAverageCPUUtilization"
}
}
}
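Target tracking behaves roughly like proportional control: it scales the current task count by the ratio of the observed metric to the target, clamped to the registered min/max. The sketch below is an approximation for building intuition, not the AWS algorithm itself (the real service adds CloudWatch alarm evaluation and cooldown logic); `approxDesiredTasks` is a hypothetical helper, not part of any AWS SDK.

```typescript
// Approximate target-tracking math: scale current capacity by
// (current metric / target metric), rounded up, clamped to min/max.
export function approxDesiredTasks(
  currentTasks: number,
  currentCpuPct: number,
  targetCpuPct: number,
  min = 2,
  max = 20
): number {
  const raw = Math.ceil(currentTasks * (currentCpuPct / targetCpuPct));
  return Math.min(max, Math.max(min, raw));
}

// Example: 4 tasks at 80% CPU with a 50% target → ceil(4 * 1.6) = 7 tasks
```

This is why a 50% target gives fast, roughly proportional scale-out: doubling load roughly doubles the fleet.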
# ─── Policy 2: Target Tracking on Memory ─────────────────────────────────────
resource "aws_appautoscaling_policy" "memory_tracking" {
name = "${var.app_name}-memory-target-tracking"
policy_type = "TargetTrackingScaling"
resource_id = aws_appautoscaling_target.ecs.resource_id
scalable_dimension = aws_appautoscaling_target.ecs.scalable_dimension
service_namespace = aws_appautoscaling_target.ecs.service_namespace
target_tracking_scaling_policy_configuration {
target_value = 70.0 # Scale out when memory hits 70%
scale_in_cooldown = 300
scale_out_cooldown = 60
predefined_metric_specification {
predefined_metric_type = "ECSServiceAverageMemoryUtilization"
}
}
}
# ─── Policy 3: Step Scaling on CPU (burst protection) ─────────────────────────
resource "aws_appautoscaling_policy" "request_step" {
name = "${var.app_name}-cpu-step-scaling"
policy_type = "StepScaling"
resource_id = aws_appautoscaling_target.ecs.resource_id
scalable_dimension = aws_appautoscaling_target.ecs.scalable_dimension
service_namespace = aws_appautoscaling_target.ecs.service_namespace
step_scaling_policy_configuration {
adjustment_type = "PercentChangeInCapacity" # Scale by % of current
cooldown = 60
metric_aggregation_type = "Average"
step_adjustment {
metric_interval_lower_bound = 0 # Bounds are relative to the 70% alarm threshold
metric_interval_upper_bound = 10 # CPU 70–80%: add 25%
scaling_adjustment = 25
}
step_adjustment {
metric_interval_lower_bound = 10 # CPU 80–90%: add 50%
metric_interval_upper_bound = 20
scaling_adjustment = 50
}
step_adjustment {
metric_interval_lower_bound = 20 # CPU >90%: double capacity
scaling_adjustment = 100
}
}
}
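The subtle part of step scaling is that the interval bounds are offsets from the CloudWatch alarm threshold (70% here), not absolute CPU values. This hypothetical helper (not an AWS API) resolves which step adjustment fires for a given CPU reading, mirroring the policy above:

```typescript
// Steps mirror the policy above: bounds are deltas from the 70% alarm
// threshold, lower bound inclusive, upper bound exclusive.
const ALARM_THRESHOLD = 70;
const steps = [
  { lower: 0, upper: 10, pct: 25 },         // CPU 70–80%: add 25%
  { lower: 10, upper: 20, pct: 50 },        // CPU 80–90%: add 50%
  { lower: 20, upper: Infinity, pct: 100 }, // CPU > 90%: double capacity
];

export function stepAdjustmentPercent(cpuPct: number): number {
  const delta = cpuPct - ALARM_THRESHOLD;
  if (delta < 0) return 0; // Alarm not breached; no scaling action
  const step = steps.find((s) => delta >= s.lower && delta < s.upper);
  return step ? step.pct : 0;
}
```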
# CloudWatch alarm that triggers step scaling
resource "aws_cloudwatch_metric_alarm" "cpu_high" {
alarm_name = "${var.app_name}-cpu-high"
comparison_operator = "GreaterThanThreshold"
evaluation_periods = 2
metric_name = "CPUUtilization"
namespace = "AWS/ECS"
period = 60
statistic = "Average"
threshold = 70.0
dimensions = {
ClusterName = aws_ecs_cluster.main.name
ServiceName = aws_ecs_service.app.name
}
alarm_actions = [aws_appautoscaling_policy.request_step.arn]
}
# ─── Policy 4: Scheduled Scaling (pre-scale before business hours) ────────────
resource "aws_appautoscaling_scheduled_action" "scale_up_morning" {
name = "${var.app_name}-scale-up-morning"
service_namespace = aws_appautoscaling_target.ecs.service_namespace
resource_id = aws_appautoscaling_target.ecs.resource_id
scalable_dimension = aws_appautoscaling_target.ecs.scalable_dimension
schedule = "cron(0 8 ? * MON-FRI *)" # 8 AM UTC Monday–Friday
scalable_target_action {
min_capacity = 4 # Pre-warm to 4 tasks
max_capacity = 20
}
}
resource "aws_appautoscaling_scheduled_action" "scale_down_evening" {
name = "${var.app_name}-scale-down-evening"
service_namespace = aws_appautoscaling_target.ecs.service_namespace
resource_id = aws_appautoscaling_target.ecs.resource_id
scalable_dimension = aws_appautoscaling_target.ecs.scalable_dimension
schedule = "cron(0 20 ? * MON-FRI *)" # 8 PM UTC
scalable_target_action {
min_capacity = 2 # Back to minimum at night
max_capacity = 20
}
}
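The two scheduled actions together define a simple capacity floor over the week: min 4 tasks from 08:00 to 20:00 UTC Monday through Friday, min 2 otherwise. The scheduled actions don't literally evaluate a function like this, but a small sketch makes the resulting schedule easy to test and reason about (`effectiveMinCapacity` is a hypothetical helper):

```typescript
// Effective minimum capacity implied by the two scheduled actions above.
// weekday: 1 = Monday … 7 = Sunday (ISO numbering, an assumption here).
export function effectiveMinCapacity(hourUtc: number, weekday: number): number {
  const isBusinessDay = weekday >= 1 && weekday <= 5;
  const isBusinessHours = hourUtc >= 8 && hourUtc < 20;
  return isBusinessDay && isBusinessHours ? 4 : 2;
}
```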
Scale-In Protection for Long-Running Tasks
// lib/ecs/scale-protection.ts
// Prevent ECS from killing a task mid-job
import {
ECSClient,
UpdateTaskProtectionCommand,
} from "@aws-sdk/client-ecs";
const ecs = new ECSClient({ region: process.env.AWS_REGION! });
const CLUSTER_ARN = process.env.ECS_CLUSTER_ARN!;
async function getOwnTaskArn(): Promise<string | null> {
// The task ARN comes from the ECS task metadata v4 endpoint, which ECS
// exposes to each container via ECS_CONTAINER_METADATA_URI_V4
const metadataUri = process.env.ECS_CONTAINER_METADATA_URI_V4;
if (!metadataUri) return null; // Not running in ECS (local dev)
const res = await fetch(`${metadataUri}/task`);
const metadata = (await res.json()) as { TaskARN?: string };
return metadata.TaskARN ?? null;
}
export async function enableScaleInProtection(
expiresAfterMinutes = 60
): Promise<void> {
const taskArn = await getOwnTaskArn();
if (!taskArn) return; // Not running in ECS (local dev)
await ecs.send(new UpdateTaskProtectionCommand({
cluster: CLUSTER_ARN,
tasks: [taskArn],
protectionEnabled: true,
expiresInMinutes: expiresAfterMinutes,
}));
}
export async function disableScaleInProtection(): Promise<void> {
const taskArn = await getOwnTaskArn();
if (!taskArn) return;
await ecs.send(new UpdateTaskProtectionCommand({
cluster: CLUSTER_ARN,
tasks: [taskArn],
protectionEnabled: false,
}));
}
// Usage in a long-running job:
// async function processLargeBatch() {
// await enableScaleInProtection(120); // Protect for up to 2 hours
// try {
// await doExpensiveWork();
// } finally {
// await disableScaleInProtection(); // Always release
// }
// }
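The enable/try/finally pattern in the usage comment is easy to get wrong in each new worker, so it can be factored into a small wrapper. `withScaleInProtection` is a hypothetical addition, not part of the AWS SDK; the enable/disable hooks are injected as parameters so the helper can be unit-tested without AWS credentials:

```typescript
// Hypothetical wrapper around the enable/try/finally pattern above.
// hooks.enable/hooks.disable are injected so the helper is testable.
export async function withScaleInProtection<T>(
  work: () => Promise<T>,
  hooks: {
    enable: () => Promise<void>;
    disable: () => Promise<void>;
  }
): Promise<T> {
  await hooks.enable();
  try {
    return await work();
  } finally {
    // Release protection even if the job throws
    await hooks.disable();
  }
}
```

In the service itself the hooks would be `() => enableScaleInProtection(120)` and `disableScaleInProtection` from the module above.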
CloudWatch Dashboard for ECS Scaling
resource "aws_cloudwatch_dashboard" "ecs" {
dashboard_name = "${var.app_name}-ecs-scaling"
dashboard_body = jsonencode({
widgets = [
{
type = "metric"
properties = {
title = "ECS Task Count"
metrics = [
["ECS/ContainerInsights", "RunningTaskCount",
"ClusterName", "${var.app_name}-cluster",
"ServiceName", "${var.app_name}-app"]
]
period = 60
}
},
{
type = "metric"
properties = {
title = "CPU and Memory Utilization"
metrics = [
["AWS/ECS", "CPUUtilization", "ClusterName", "${var.app_name}-cluster", "ServiceName", "${var.app_name}-app"],
["AWS/ECS", "MemoryUtilization", "ClusterName", "${var.app_name}-cluster", "ServiceName", "${var.app_name}-app"],
]
period = 60
}
}
]
})
}
Scaling Configuration Reference
| Setting | Value | Why |
|---|---|---|
| `min_capacity` | 2 | High availability — 1 AZ can fail |
| `max_capacity` | 20 | Cost ceiling — alert if this is hit |
| CPU target | 50% | Leaves headroom for traffic spikes before new tasks are ready |
| Memory target | 70% | Higher than CPU (memory leaks are gradual) |
| Scale-out cooldown | 60s | React quickly to load |
| Scale-in cooldown | 300s | Don't thrash — wait for stability |
| Task CPU/memory | 512 CPU / 1024 MiB | Right-size first; then scale out |
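The 50% CPU target in the table above has simple math behind it: at a steady-state target T, the fleet can absorb roughly 100/T times its current load before saturating. A hypothetical one-liner (not from any AWS library) makes the trade-off concrete:

```typescript
// Headroom multiplier: how much traffic growth the fleet can absorb
// before hitting 100% CPU, at a given target utilization.
export function headroomMultiplier(targetCpuPct: number): number {
  return 100 / targetCpuPct;
}

// At a 50% target the fleet tolerates a 2x spike; at 80% only 1.25x,
// which is why an 80% target often fails during scale-out lag.
```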
Cost and Timeline Estimates
| Scope | Team | Timeline | Cost Range |
|---|---|---|---|
| Basic target tracking (CPU) | 1 dev | Half a day | $150–300 |
| Full Terraform autoscaling module | 1 dev | 1–2 days | $400–800 |
| Step scaling + CloudWatch alarms | 1 dev | 1 day | $300–600 |
| Scale-in protection for job workers | 1 dev | Half a day | $200–400 |
See Also
- AWS ECS Fargate Deployment
- Terraform AWS Infrastructure
- AWS ECR Image Scanning
- AWS CloudFront Cache Policies
- AWS RDS Read Replicas
Working With Viprasol
ECS autoscaling misconfiguration is expensive in both directions: a scale-in cooldown that is too short causes thrashing (tasks added and removed repeatedly), while a CPU target of 80% leaves too little headroom (new tasks aren't ready by the time CPU spikes). Our Terraform module sets sensible defaults: CPU target 50%, scale-out cooldown 60s, scale-in cooldown 300s, and lifecycle ignore_changes on desired_count.
What we deliver:
- `aws_ecs_service`: Fargate, private subnets, ALB target group, `lifecycle { ignore_changes = [desired_count] }`
- `aws_appautoscaling_target`: min 2, max 20, `ecs:service:DesiredCount` dimension
- Target tracking: `ECSServiceAverageCPUUtilization` target 50%, `ECSServiceAverageMemoryUtilization` target 70%
- Step scaling: `PercentChangeInCapacity` at 70/80/90% CPU thresholds (25%/50%/100% increase)
- Scheduled: `cron(0 8 ? * MON-FRI *)` scale-up morning, `cron(0 20 ? * MON-FRI *)` scale-down evening
- `UpdateTaskProtectionCommand`: `enableScaleInProtection(minutes)` / `disableScaleInProtection()` for job workers
- CloudWatch dashboard: RunningTaskCount + CPU/Memory utilization
Talk to our team about your ECS infrastructure →
Or explore our cloud infrastructure services.
About the Author
Viprasol Tech Team
Custom Software Development Specialists
The Viprasol Tech team specialises in algorithmic trading software, AI agent systems, and SaaS development. With 100+ projects delivered across MT4/MT5 EAs, fintech platforms, and production AI systems, the team brings deep technical experience to every engagement. Based in India, serving clients globally.