AWS ECS Autoscaling: Target Tracking, Step Scaling, and Fargate Capacity Providers with Terraform
Configure AWS ECS autoscaling for Fargate workloads. Covers target tracking on CPU and memory, step scaling policies for burst traffic, ECS capacity providers, scale-in protection, cooldown periods, and complete Terraform configuration.
ECS Fargate autoscaling solves a billing problem and a reliability problem simultaneously. Without it, you either overprovision (paying for idle capacity) or underprovision (tasks crash under load). The key is choosing the right scaling policy: target tracking for steady-state workloads, step scaling for predictable burst patterns.
Scaling Policy Comparison
| Policy | Use Case | Behavior | Latency |
|---|---|---|---|
| Target tracking (CPU 50%) | Web APIs, SaaS apps | Continuously adjusts to maintain target | 60–120 seconds |
| Target tracking (request count) | ALB-fronted services | Scales per requests-per-task | 60–120 seconds |
| Step scaling | Known burst patterns (e.g., batch jobs) | Add N tasks when threshold exceeded | 30–60 seconds |
| Scheduled scaling | Predictable traffic (e.g., business hours) | Pre-scale before load arrives | Instant |
Terraform: ECS Service with Autoscaling
# terraform/ecs-service.tf
# ECS Cluster
resource "aws_ecs_cluster" "main" {
name = "${var.app_name}-cluster"
setting {
name = "containerInsights"
value = "enabled" # CloudWatch Container Insights
}
}
# ECS Service (Fargate)
resource "aws_ecs_service" "app" {
name = "${var.app_name}-app"
cluster = aws_ecs_cluster.main.id
task_definition = aws_ecs_task_definition.app.arn
desired_count = 2 # Starting count (autoscaling overrides this)
launch_type = "FARGATE"
network_configuration {
subnets = var.private_subnet_ids
security_groups = [aws_security_group.app.id]
assign_public_ip = false
}
load_balancer {
target_group_arn = aws_lb_target_group.app.arn
container_name = "app"
container_port = 3000
}
# Prevent Terraform from resetting desired_count on every apply
lifecycle {
ignore_changes = [desired_count]
}
depends_on = [aws_lb_listener.https]
}
# ─── Autoscaling Target ───────────────────────────────────────────────────────
resource "aws_appautoscaling_target" "ecs" {
max_capacity = 20 # Hard ceiling on task count
min_capacity = 2 # Always keep at least 2 tasks (HA)
resource_id = "service/${aws_ecs_cluster.main.name}/${aws_ecs_service.app.name}"
scalable_dimension = "ecs:service:DesiredCount"
service_namespace = "ecs"
}
# ─── Policy 1: Target Tracking on CPU (primary) ──────────────────────────────
resource "aws_appautoscaling_policy" "cpu_tracking" {
name = "${var.app_name}-cpu-target-tracking"
policy_type = "TargetTrackingScaling"
resource_id = aws_appautoscaling_target.ecs.resource_id
scalable_dimension = aws_appautoscaling_target.ecs.scalable_dimension
service_namespace = aws_appautoscaling_target.ecs.service_namespace
target_tracking_scaling_policy_configuration {
target_value = 50.0 # Keep average CPU at 50%
scale_in_cooldown = 300 # 5 min before scaling in (avoids flapping)
scale_out_cooldown = 60 # 1 min before adding tasks (fast response)
predefined_metric_specification {
predefined_metric_type = "ECSServiceAverageCPUUtilization"
}
}
}
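Target tracking behaves roughly like proportional control: it scales the current task count by the ratio of the observed metric to the target, clamped to the registered min/max. The sketch below is an approximation for building intuition, not the AWS algorithm itself (the real service adds CloudWatch alarm evaluation and cooldown logic); `approxDesiredTasks` is a hypothetical helper, not part of any AWS SDK.

```typescript
// Approximate target-tracking math: scale current capacity by
// (current metric / target metric), rounded up, clamped to min/max.
export function approxDesiredTasks(
  currentTasks: number,
  currentCpuPct: number,
  targetCpuPct: number,
  min = 2,
  max = 20
): number {
  const raw = Math.ceil(currentTasks * (currentCpuPct / targetCpuPct));
  return Math.min(max, Math.max(min, raw));
}

// Example: 4 tasks at 80% CPU with a 50% target → ceil(4 * 1.6) = 7 tasks
```

This is why a 50% target gives fast, roughly proportional scale-out: doubling load roughly doubles the fleet.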
# ─── Policy 2: Target Tracking on Memory ─────────────────────────────────────
resource "aws_appautoscaling_policy" "memory_tracking" {
name = "${var.app_name}-memory-target-tracking"
policy_type = "TargetTrackingScaling"
resource_id = aws_appautoscaling_target.ecs.resource_id
scalable_dimension = aws_appautoscaling_target.ecs.scalable_dimension
service_namespace = aws_appautoscaling_target.ecs.service_namespace
target_tracking_scaling_policy_configuration {
target_value = 70.0 # Scale out when memory hits 70%
scale_in_cooldown = 300
scale_out_cooldown = 60
predefined_metric_specification {
predefined_metric_type = "ECSServiceAverageMemoryUtilization"
}
}
}
# ─── Policy 3: Step Scaling on CPU (burst protection) ─────────────────────────
resource "aws_appautoscaling_policy" "request_step" {
name = "${var.app_name}-cpu-step-scaling"
policy_type = "StepScaling"
resource_id = aws_appautoscaling_target.ecs.resource_id
scalable_dimension = aws_appautoscaling_target.ecs.scalable_dimension
service_namespace = aws_appautoscaling_target.ecs.service_namespace
step_scaling_policy_configuration {
adjustment_type = "PercentChangeInCapacity" # Scale by % of current
cooldown = 60
metric_aggregation_type = "Average"
step_adjustment {
metric_interval_lower_bound = 0 # Bounds are relative to the 70% alarm threshold
metric_interval_upper_bound = 10 # CPU 70–80%: add 25%
scaling_adjustment = 25
}
step_adjustment {
metric_interval_lower_bound = 10 # CPU 80–90%: add 50%
metric_interval_upper_bound = 20
scaling_adjustment = 50
}
step_adjustment {
metric_interval_lower_bound = 20 # CPU >90%: double capacity
scaling_adjustment = 100
}
}
}
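The subtle part of step scaling is that the interval bounds are offsets from the CloudWatch alarm threshold (70% here), not absolute CPU values. This hypothetical helper (not an AWS API) resolves which step adjustment fires for a given CPU reading, mirroring the policy above:

```typescript
// Steps mirror the policy above: bounds are deltas from the 70% alarm
// threshold, lower bound inclusive, upper bound exclusive.
const ALARM_THRESHOLD = 70;
const steps = [
  { lower: 0, upper: 10, pct: 25 },         // CPU 70–80%: add 25%
  { lower: 10, upper: 20, pct: 50 },        // CPU 80–90%: add 50%
  { lower: 20, upper: Infinity, pct: 100 }, // CPU > 90%: double capacity
];

export function stepAdjustmentPercent(cpuPct: number): number {
  const delta = cpuPct - ALARM_THRESHOLD;
  if (delta < 0) return 0; // Alarm not breached; no scaling action
  const step = steps.find((s) => delta >= s.lower && delta < s.upper);
  return step ? step.pct : 0;
}
```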
# CloudWatch alarm that triggers step scaling
resource "aws_cloudwatch_metric_alarm" "cpu_high" {
alarm_name = "${var.app_name}-cpu-high"
comparison_operator = "GreaterThanThreshold"
evaluation_periods = 2
metric_name = "CPUUtilization"
namespace = "AWS/ECS"
period = 60
statistic = "Average"
threshold = 70.0
dimensions = {
ClusterName = aws_ecs_cluster.main.name
ServiceName = aws_ecs_service.app.name
}
alarm_actions = [aws_appautoscaling_policy.request_step.arn]
}
# ─── Policy 4: Scheduled Scaling (pre-scale before business hours) ────────────
resource "aws_appautoscaling_scheduled_action" "scale_up_morning" {
name = "${var.app_name}-scale-up-morning"
service_namespace = aws_appautoscaling_target.ecs.service_namespace
resource_id = aws_appautoscaling_target.ecs.resource_id
scalable_dimension = aws_appautoscaling_target.ecs.scalable_dimension
schedule = "cron(0 8 ? * MON-FRI *)" # 8 AM UTC Monday–Friday
scalable_target_action {
min_capacity = 4 # Pre-warm to 4 tasks
max_capacity = 20
}
}
resource "aws_appautoscaling_scheduled_action" "scale_down_evening" {
name = "${var.app_name}-scale-down-evening"
service_namespace = aws_appautoscaling_target.ecs.service_namespace
resource_id = aws_appautoscaling_target.ecs.resource_id
scalable_dimension = aws_appautoscaling_target.ecs.scalable_dimension
schedule = "cron(0 20 ? * MON-FRI *)" # 8 PM UTC
scalable_target_action {
min_capacity = 2 # Back to minimum at night
max_capacity = 20
}
}
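The two scheduled actions together define a simple capacity floor over the week: min 4 tasks from 08:00 to 20:00 UTC Monday through Friday, min 2 otherwise. The scheduled actions don't literally evaluate a function like this, but a small sketch makes the resulting schedule easy to test and reason about (`effectiveMinCapacity` is a hypothetical helper):

```typescript
// Effective minimum capacity implied by the two scheduled actions above.
// weekday: 1 = Monday … 7 = Sunday (ISO numbering, an assumption here).
export function effectiveMinCapacity(hourUtc: number, weekday: number): number {
  const isBusinessDay = weekday >= 1 && weekday <= 5;
  const isBusinessHours = hourUtc >= 8 && hourUtc < 20;
  return isBusinessDay && isBusinessHours ? 4 : 2;
}
```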
Scale-In Protection for Long-Running Tasks
// lib/ecs/scale-protection.ts
// Prevent ECS from killing a task mid-job
import {
ECSClient,
UpdateTaskProtectionCommand,
} from "@aws-sdk/client-ecs";
const ecs = new ECSClient({ region: process.env.AWS_REGION! });
const CLUSTER_ARN = process.env.ECS_CLUSTER_ARN!;
async function getOwnTaskArn(): Promise<string | null> {
// The task ARN comes from the ECS task metadata v4 endpoint, which ECS
// exposes to each container via ECS_CONTAINER_METADATA_URI_V4
const metadataUri = process.env.ECS_CONTAINER_METADATA_URI_V4;
if (!metadataUri) return null; // Not running in ECS (local dev)
const res = await fetch(`${metadataUri}/task`);
const metadata = (await res.json()) as { TaskARN?: string };
return metadata.TaskARN ?? null;
}
export async function enableScaleInProtection(
expiresAfterMinutes = 60
): Promise<void> {
const taskArn = await getOwnTaskArn();
if (!taskArn) return; // Not running in ECS (local dev)
await ecs.send(new UpdateTaskProtectionCommand({
cluster: CLUSTER_ARN,
tasks: [taskArn],
protectionEnabled: true,
expiresInMinutes: expiresAfterMinutes,
}));
}
export async function disableScaleInProtection(): Promise<void> {
const taskArn = await getOwnTaskArn();
if (!taskArn) return;
await ecs.send(new UpdateTaskProtectionCommand({
cluster: CLUSTER_ARN,
tasks: [taskArn],
protectionEnabled: false,
}));
}
// Usage in a long-running job:
// async function processLargeBatch() {
// await enableScaleInProtection(120); // Protect for up to 2 hours
// try {
// await doExpensiveWork();
// } finally {
// await disableScaleInProtection(); // Always release
// }
// }
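The enable/try/finally pattern in the usage comment is easy to get wrong in each new worker, so it can be factored into a small wrapper. `withScaleInProtection` is a hypothetical addition, not part of the AWS SDK; the enable/disable hooks are injected as parameters so the helper can be unit-tested without AWS credentials:

```typescript
// Hypothetical wrapper around the enable/try/finally pattern above.
// hooks.enable/hooks.disable are injected so the helper is testable.
export async function withScaleInProtection<T>(
  work: () => Promise<T>,
  hooks: {
    enable: () => Promise<void>;
    disable: () => Promise<void>;
  }
): Promise<T> {
  await hooks.enable();
  try {
    return await work();
  } finally {
    // Release protection even if the job throws
    await hooks.disable();
  }
}
```

In the service itself the hooks would be `() => enableScaleInProtection(120)` and `disableScaleInProtection` from the module above.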
CloudWatch Dashboard for ECS Scaling
resource "aws_cloudwatch_dashboard" "ecs" {
dashboard_name = "${var.app_name}-ecs-scaling"
dashboard_body = jsonencode({
widgets = [
{
type = "metric"
properties = {
title = "ECS Task Count"
metrics = [
["ECS/ContainerInsights", "RunningTaskCount",
"ClusterName", "${var.app_name}-cluster",
"ServiceName", "${var.app_name}-app"]
]
period = 60
}
},
{
type = "metric"
properties = {
title = "CPU and Memory Utilization"
metrics = [
["AWS/ECS", "CPUUtilization", "ClusterName", "${var.app_name}-cluster", "ServiceName", "${var.app_name}-app"],
["AWS/ECS", "MemoryUtilization", "ClusterName", "${var.app_name}-cluster", "ServiceName", "${var.app_name}-app"],
]
period = 60
}
}
]
})
}
Scaling Configuration Reference
| Setting | Value | Why |
|---|---|---|
| `min_capacity` | 2 | High availability — 1 AZ can fail |
| `max_capacity` | 20 | Cost ceiling — alert if this is hit |
| CPU target | 50% | Leaves headroom for traffic spikes before new tasks are ready |
| Memory target | 70% | Higher than CPU (memory leaks are gradual) |
| Scale-out cooldown | 60s | React quickly to load |
| Scale-in cooldown | 300s | Don't thrash — wait for stability |
| Task CPU/memory | 512 CPU / 1024 MiB | Right-size first; then scale out |
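The 50% CPU target in the table above has simple math behind it: at a steady-state target T, the fleet can absorb roughly 100/T times its current load before saturating. A hypothetical one-liner (not from any AWS library) makes the trade-off concrete:

```typescript
// Headroom multiplier: how much traffic growth the fleet can absorb
// before hitting 100% CPU, at a given target utilization.
export function headroomMultiplier(targetCpuPct: number): number {
  return 100 / targetCpuPct;
}

// At a 50% target the fleet tolerates a 2x spike; at 80% only 1.25x,
// which is why an 80% target often fails during scale-out lag.
```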
Cost and Timeline Estimates
| Scope | Team | Timeline | Cost Range |
|---|---|---|---|
| Basic target tracking (CPU) | 1 dev | Half a day | $150–300 |
| Full Terraform autoscaling module | 1 dev | 1–2 days | $400–800 |
| Step scaling + CloudWatch alarms | 1 dev | 1 day | $300–600 |
| Scale-in protection for job workers | 1 dev | Half a day | $200–400 |
See Also
- AWS ECS Fargate Deployment
- Terraform AWS Infrastructure
- AWS ECR Image Scanning
- AWS CloudFront Cache Policies
- AWS RDS Read Replicas
Working With Viprasol
ECS autoscaling misconfiguration is expensive in both directions: a scale-in cooldown that is too short causes thrashing (tasks added and removed repeatedly), while a CPU target of 80% leaves too little headroom (new tasks aren't ready by the time CPU spikes). Our Terraform module sets sensible defaults: CPU target 50%, scale-out cooldown 60s, scale-in cooldown 300s, and lifecycle ignore_changes on desired_count.
What we deliver:
- `aws_ecs_service`: Fargate, private subnets, ALB target group, `lifecycle { ignore_changes = [desired_count] }`
- `aws_appautoscaling_target`: min 2, max 20, `ecs:service:DesiredCount` dimension
- Target tracking: `ECSServiceAverageCPUUtilization` target 50%, `ECSServiceAverageMemoryUtilization` target 70%
- Step scaling: `PercentChangeInCapacity` at 70/80/90% CPU thresholds (25%/50%/100% increase)
- Scheduled: `cron(0 8 ? * MON-FRI *)` scale-up morning, `cron(0 20 ? * MON-FRI *)` scale-down evening
- `UpdateTaskProtectionCommand`: `enableScaleInProtection(minutes)` / `disableScaleInProtection()` for job workers
- CloudWatch dashboard: RunningTaskCount + CPU/Memory utilization
Talk to our team about your ECS infrastructure →
Or explore our cloud infrastructure services.
About the Author
Viprasol Tech Team
Custom Software Development Specialists
The Viprasol Tech team specialises in algorithmic trading software, AI agent systems, and SaaS development. With 100+ projects delivered across MT4/MT5 EAs, fintech platforms, and production AI systems, the team brings deep technical experience to every engagement. Based in India, serving clients globally.