AWS ECS Auto Scaling 2026

Q: What is AWS ECS Auto Scaling 2026?

> Quick answer. ECS Fargate autoscaling adjusts task count to match load, avoiding both overprovisioning and crashes under spikes. Use target tracking (for example, keep average CPU near 50%) for steady-state workloads, step scaling for predictable bursts, and scheduled scaling for known patterns. Set sensible min/max tasks and cooldowns so scaling stays stable rather than flapping.


ECS Fargate autoscaling solves a billing problem and a reliability problem at the same time. Without it, you either overprovision (paying for idle capacity) or underprovision (tasks crash under load). The key is choosing the right scaling policy: target tracking for steady-state workloads, step scaling for predictable burst patterns, and scheduled scaling for traffic that follows a known timetable.

Scaling Policy Comparison

Policy	Use Case	Behavior	Typical Reaction Time
Target tracking (CPU 50%)	Web APIs, SaaS apps	Continuously adjusts to maintain target	60–120 seconds
Target tracking (request count)	ALB-fronted services	Scales per requests-per-task	60–120 seconds
Step scaling	Known burst patterns (e.g., batch jobs)	Add N tasks when threshold exceeded	30–60 seconds
Scheduled scaling	Predictable traffic (e.g., business hours)	Pre-scale before load arrives	Instant
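
To make the target-tracking rows concrete, here is a rough TypeScript model of the proportional arithmetic behind them: when the average metric sits above the target, capacity grows by about the same ratio. This is a simplification for intuition only, not the actual Application Auto Scaling algorithm, which is driven by CloudWatch alarms and scales in more conservatively. The function name `estimateDesiredTasks` is hypothetical, and the default bounds of 2 and 20 tasks simply mirror the values used later in this article.

```typescript
// Simplified model of target tracking: scale capacity by (metric / target).
// Assumption: the real service is alarm-driven and more cautious when scaling in.
function estimateDesiredTasks(
  currentTasks: number,
  currentMetric: number, // e.g. average CPU percent across running tasks
  targetMetric: number,  // e.g. 50 for a 50% CPU target
  minTasks = 2,
  maxTasks = 20
): number {
  if (currentTasks <= 0 || targetMetric <= 0) {
    throw new Error("currentTasks and targetMetric must be positive");
  }
  // Round up so the estimate never leaves the service short of capacity.
  const raw = Math.ceil(currentTasks * (currentMetric / targetMetric));
  // Clamp to the configured floor and ceiling.
  return Math.min(maxTasks, Math.max(minTasks, raw));
}

// 4 tasks averaging 75% CPU against a 50% target: 4 * 75 / 50 = 6 tasks.
console.log(estimateDesiredTasks(4, 75, 50)); // 6
```

The clamp at the end is why hitting `maxTasks` deserves an alert: once the ceiling is reached, the metric can drift above target with no further capacity being added.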

Terraform: ECS Service with Autoscaling

# terraform/ecs-service.tf

# ECS Cluster
resource "aws_ecs_cluster" "main" {
  name = "${var.app_name}-cluster"

  setting {
    name  = "containerInsights"
    value = "enabled"  # CloudWatch Container Insights
  }
}

# ECS Service (Fargate)
resource "aws_ecs_service" "app" {
  name            = "${var.app_name}-app"
  cluster         = aws_ecs_cluster.main.id
  task_definition = aws_ecs_task_definition.app.arn
  desired_count   = 2  # Starting count (autoscaling overrides this)
  launch_type     = "FARGATE"

  network_configuration {
    subnets          = var.private_subnet_ids
    security_groups  = [aws_security_group.app.id]
    assign_public_ip = false
  }

  load_balancer {
    target_group_arn = aws_lb_target_group.app.arn
    container_name   = "app"
    container_port   = 3000
  }

  # Prevent Terraform from resetting desired_count on every apply
  lifecycle {
    ignore_changes = [desired_count]
  }

  depends_on = [aws_lb_listener.https]
}

# ─── Autoscaling Target ───────────────────────────────────────────────────────

resource "aws_appautoscaling_target" "ecs" {
  max_capacity       = 20   # Hard ceiling on task count
  min_capacity       = 2    # Always keep at least 2 tasks (HA)
  resource_id        = "service/${aws_ecs_cluster.main.name}/${aws_ecs_service.app.name}"
  scalable_dimension = "ecs:service:DesiredCount"
  service_namespace  = "ecs"
}

# ─── Policy 1: Target Tracking on CPU (primary) ──────────────────────────────

resource "aws_appautoscaling_policy" "cpu_tracking" {
  name               = "${var.app_name}-cpu-target-tracking"
  policy_type        = "TargetTrackingScaling"
  resource_id        = aws_appautoscaling_target.ecs.resource_id
  scalable_dimension = aws_appautoscaling_target.ecs.scalable_dimension
  service_namespace  = aws_appautoscaling_target.ecs.service_namespace

  target_tracking_scaling_policy_configuration {
    target_value       = 50.0   # Keep average CPU near 50%
    scale_in_cooldown  = 300    # Wait 5 min after a scale-in before the next one (avoids flapping)
    scale_out_cooldown = 60     # Wait 1 min after a scale-out before the next one (fast response)

    predefined_metric_specification {
      predefined_metric_type = "ECSServiceAverageCPUUtilization"
    }
  }
}

# ─── Policy 2: Target Tracking on Memory ─────────────────────────────────────

resource "aws_appautoscaling_policy" "memory_tracking" {
  name               = "${var.app_name}-memory-target-tracking"
  policy_type        = "TargetTrackingScaling"
  resource_id        = aws_appautoscaling_target.ecs.resource_id
  scalable_dimension = aws_appautoscaling_target.ecs.scalable_dimension
  service_namespace  = aws_appautoscaling_target.ecs.service_namespace

  target_tracking_scaling_policy_configuration {
    target_value       = 70.0   # Keep average memory near 70%
    scale_in_cooldown  = 300
    scale_out_cooldown = 60

    predefined_metric_specification {
      predefined_metric_type = "ECSServiceAverageMemoryUtilization"
    }
  }
}

# ─── Policy 3: Step Scaling on CPU (for burst protection) ─────────────────────
# Step bounds below are offsets from the alarm threshold (70% CPU).

resource "aws_appautoscaling_policy" "request_step" {
  name               = "${var.app_name}-request-step-scaling"
  policy_type        = "StepScaling"
  resource_id        = aws_appautoscaling_target.ecs.resource_id
  scalable_dimension = aws_appautoscaling_target.ecs.scalable_dimension
  service_namespace  = aws_appautoscaling_target.ecs.service_namespace

  step_scaling_policy_configuration {
    adjustment_type          = "PercentChangeInCapacity"  # Scale by % of current
    cooldown                 = 60
    metric_aggregation_type  = "Average"

    step_adjustment {
      metric_interval_lower_bound = 0    # CPU 70–80%: add 25%
      metric_interval_upper_bound = 10
      scaling_adjustment          = 25
    }
    step_adjustment {
      metric_interval_lower_bound = 10   # CPU 80–90%: add 50%
      metric_interval_upper_bound = 20
      scaling_adjustment          = 50
    }
    step_adjustment {
      metric_interval_lower_bound = 20   # CPU >90%: double capacity
      scaling_adjustment          = 100
    }
  }
}

# CloudWatch alarm that triggers step scaling
resource "aws_cloudwatch_metric_alarm" "cpu_high" {
  alarm_name          = "${var.app_name}-cpu-high"
  comparison_operator = "GreaterThanThreshold"
  evaluation_periods  = 2
  metric_name         = "CPUUtilization"
  namespace           = "AWS/ECS"
  period              = 60
  statistic           = "Average"
  threshold           = 70.0

  dimensions = {
    ClusterName = aws_ecs_cluster.main.name
    ServiceName = aws_ecs_service.app.name
  }

  alarm_actions = [aws_appautoscaling_policy.request_step.arn]
}

# ─── Policy 4: Scheduled Scaling (pre-scale before business hours) ────────────

resource "aws_appautoscaling_scheduled_action" "scale_up_morning" {
  name               = "${var.app_name}-scale-up-morning"
  service_namespace  = aws_appautoscaling_target.ecs.service_namespace
  resource_id        = aws_appautoscaling_target.ecs.resource_id
  scalable_dimension = aws_appautoscaling_target.ecs.scalable_dimension

  schedule = "cron(0 8 ? * MON-FRI *)"  # 8 AM UTC Monday–Friday

  scalable_target_action {
    min_capacity = 4   # Pre-warm to 4 tasks
    max_capacity = 20
  }
}

resource "aws_appautoscaling_scheduled_action" "scale_down_evening" {
  name               = "${var.app_name}-scale-down-evening"
  service_namespace  = aws_appautoscaling_target.ecs.service_namespace
  resource_id        = aws_appautoscaling_target.ecs.resource_id
  scalable_dimension = aws_appautoscaling_target.ecs.scalable_dimension

  schedule = "cron(0 20 ? * MON-FRI *)"  # 8 PM UTC

  scalable_target_action {
    min_capacity = 2   # Back to minimum at night
    max_capacity = 20
  }
}
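
The step-scaling policy above is the least intuitive part of the configuration, because its interval bounds are offsets from the alarm threshold rather than absolute CPU values. The TypeScript sketch below replays that lookup for the three steps defined in the Terraform. It is an illustrative model, not AWS code: the names `StepAdjustment` and `tasksAfterStep` are hypothetical, and the rounding (fractions below one count as one task, larger values round down) reflects my reading of the Application Auto Scaling documentation for `PercentChangeInCapacity`.

```typescript
interface StepAdjustment {
  lowerBound?: number; // offset from the alarm threshold, inclusive
  upperBound?: number; // offset from the alarm threshold, exclusive
  percentChange: number;
}

const ALARM_THRESHOLD = 70; // matches the cpu_high alarm threshold
const STEPS: StepAdjustment[] = [
  { lowerBound: 0, upperBound: 10, percentChange: 25 },  // CPU 70-80%
  { lowerBound: 10, upperBound: 20, percentChange: 50 }, // CPU 80-90%
  { lowerBound: 20, percentChange: 100 },                // CPU above 90%
];

function tasksAfterStep(currentTasks: number, cpuPercent: number, maxTasks = 20): number {
  // The alarm uses GreaterThanThreshold, so nothing happens at or below 70%.
  if (cpuPercent <= ALARM_THRESHOLD) return currentTasks;

  const offset = cpuPercent - ALARM_THRESHOLD;
  const step = STEPS.find(
    (s) =>
      (s.lowerBound === undefined || offset >= s.lowerBound) &&
      (s.upperBound === undefined || offset < s.upperBound)
  );
  if (!step) return currentTasks;

  // Assumed rounding: at least one task is added, larger values round down.
  const added = Math.max(1, Math.floor((currentTasks * step.percentChange) / 100));
  return Math.min(maxTasks, currentTasks + added);
}

console.log(tasksAfterStep(4, 75)); // 5 (25% of 4 = 1 task)
console.log(tasksAfterStep(4, 85)); // 6 (50% of 4 = 2 tasks)
console.log(tasksAfterStep(4, 95)); // 8 (100% of 4 = 4 tasks)
```

With small services, percentage steps add very few tasks (25% of 2 tasks rounds to a single task), which is one reason to consider setting a minimum adjustment magnitude on the policy.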


Scale-In Protection for Long-Running Tasks

// lib/ecs/scale-protection.ts
// Prevent ECS from killing a task mid-job

import {
  ECSClient,
  UpdateTaskProtectionCommand,
} from "@aws-sdk/client-ecs";

const ecs = new ECSClient({ region: process.env.AWS_REGION! });

const CLUSTER_ARN = process.env.ECS_CLUSTER_ARN!;

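function getOwnTaskArn(): string | null {
  // ECS does not set ECS_TASK_ARN itself. This assumes your entrypoint or task
  // definition exports it, for example from the TaskARN field returned by the
  // task metadata endpoint at ${ECS_CONTAINER_METADATA_URI_V4}/task.
  return process.env.ECS_TASK_ARN ?? null;
}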

export async function enableScaleInProtection(
  expiresAfterMinutes = 60
): Promise<void> {
  const taskArn = getOwnTaskArn();
  if (!taskArn) return; // Not running in ECS (local dev)

  await ecs.send(new UpdateTaskProtectionCommand({
    cluster:                CLUSTER_ARN,
    tasks:                  [taskArn],
    protectionEnabled:      true,
    expiresInMinutes:       expiresAfterMinutes,
  }));
}

export async function disableScaleInProtection(): Promise<void> {
  const taskArn = getOwnTaskArn();
  if (!taskArn) return;

  await ecs.send(new UpdateTaskProtectionCommand({
    cluster:           CLUSTER_ARN,
    tasks:             [taskArn],
    protectionEnabled: false,
  }));
}

// Usage in a long-running job:
// async function processLargeBatch() {
//   await enableScaleInProtection(120); // Protect for up to 2 hours
//   try {
//     await doExpensiveWork();
//   } finally {
//     await disableScaleInProtection(); // Always release
//   }
// }
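
The helper above reads the task ARN from an `ECS_TASK_ARN` environment variable, which has to be populated by something. One way to obtain the value is the task metadata endpoint (version 4): ECS sets `ECS_CONTAINER_METADATA_URI_V4` inside the container, and a GET on its `/task` path returns JSON that includes a `TaskARN` field. The sketch below shows that lookup under some assumptions: the function names are hypothetical, and the HTTP call is passed in as a `getJson` parameter so the snippet does not depend on a particular HTTP client.

```typescript
// Only the field we need from the task metadata v4 "/task" response.
interface TaskMetadata {
  TaskARN?: unknown;
}

// Pull the task ARN out of a parsed metadata response, or null if it is absent.
function extractTaskArn(metadata: unknown): string | null {
  if (typeof metadata !== "object" || metadata === null) return null;
  const arn = (metadata as TaskMetadata).TaskARN;
  return typeof arn === "string" && arn.startsWith("arn:") ? arn : null;
}

// Resolve the ARN of the task this process is running in.
// env:     the process environment (hypothetical parameter, for testability)
// getJson: any function that GETs a URL and returns the parsed JSON body
async function resolveOwnTaskArn(
  env: Record<string, string | undefined>,
  getJson: (url: string) => Promise<unknown>
): Promise<string | null> {
  const base = env.ECS_CONTAINER_METADATA_URI_V4;
  if (!base) return null; // Not running in ECS (local dev)
  return extractTaskArn(await getJson(`${base}/task`));
}
```

In a real task you would call `resolveOwnTaskArn(process.env, (url) => fetch(url).then((r) => r.json()))` on a Node.js version with a global `fetch`, then cache the result, since the ARN does not change for the lifetime of the task.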

CloudWatch Dashboard for ECS Scaling

resource "aws_cloudwatch_dashboard" "ecs" {
  dashboard_name = "${var.app_name}-ecs-scaling"

  dashboard_body = jsonencode({
    widgets = [
      {
        type = "metric"
        properties = {
          title  = "ECS Task Count"
          region = var.aws_region  # Assumed variable; metric widgets need a region
          # RunningTaskCount is a Container Insights metric (enabled on the cluster)
          metrics = [
            ["ECS/ContainerInsights", "RunningTaskCount",
             "ClusterName", aws_ecs_cluster.main.name,
             "ServiceName", aws_ecs_service.app.name]
          ]
          period = 60
        }
      },
      {
        type = "metric"
        properties = {
          title  = "CPU and Memory Utilization"
          region = var.aws_region  # Assumed variable, as above
          metrics = [
            ["AWS/ECS", "CPUUtilization",    "ClusterName", aws_ecs_cluster.main.name, "ServiceName", aws_ecs_service.app.name],
            ["AWS/ECS", "MemoryUtilization", "ClusterName", aws_ecs_cluster.main.name, "ServiceName", aws_ecs_service.app.name],
          ]
          period = 60
        }
      }
    ]
  })
}



Scaling Configuration Reference

Setting	Value	Why
`min_capacity`	2	High availability: two tasks can run in different AZs, so one AZ can fail
`max_capacity`	20	Cost ceiling; alert if this is hit
CPU target	50%	Leaves headroom for traffic spikes before new tasks are ready
Memory target	70%	Higher than CPU because memory usage tends to grow gradually rather than spike
Scale-out cooldown	60s	React quickly to load
Scale-in cooldown	300s	Don't thrash; wait for stability
Task CPU/memory	512 CPU units (0.5 vCPU) / 1024 MiB	Right-size first; then scale out
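
The reasoning in this table can be turned into a small pre-deploy check. The sketch below is a hypothetical lint, not part of any AWS tooling: the `ScalingConfig` shape and the `checkScalingConfig` name are made up for illustration, and the rules only encode the rationale from the table (at least two tasks, a valid ceiling, percentage targets in range, and a scale-in cooldown that is not shorter than the scale-out cooldown).

```typescript
interface ScalingConfig {
  minCapacity: number;
  maxCapacity: number;
  cpuTargetPercent: number;
  memoryTargetPercent: number;
  scaleOutCooldownSec: number;
  scaleInCooldownSec: number;
}

// Returns human-readable warnings; an empty array means no rule was violated.
function checkScalingConfig(c: ScalingConfig): string[] {
  const warnings: string[] = [];
  if (c.minCapacity < 2) {
    warnings.push("min_capacity below 2: a single task is a single point of failure");
  }
  if (c.maxCapacity < c.minCapacity) {
    warnings.push("max_capacity is lower than min_capacity");
  }
  if (c.cpuTargetPercent <= 0 || c.cpuTargetPercent >= 100) {
    warnings.push("CPU target must be between 0 and 100");
  }
  if (c.memoryTargetPercent <= 0 || c.memoryTargetPercent >= 100) {
    warnings.push("memory target must be between 0 and 100");
  }
  if (c.scaleInCooldownSec < c.scaleOutCooldownSec) {
    warnings.push("scale-in cooldown shorter than scale-out cooldown: risk of thrashing");
  }
  return warnings;
}

// The values from the table above pass every check.
const articleDefaults: ScalingConfig = {
  minCapacity: 2,
  maxCapacity: 20,
  cpuTargetPercent: 50,
  memoryTargetPercent: 70,
  scaleOutCooldownSec: 60,
  scaleInCooldownSec: 300,
};
console.log(checkScalingConfig(articleDefaults)); // []
```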

Pricing and Delivery Estimates

Scope	Team	Timeline	Cost Range
Basic target tracking (CPU)	1 dev	Half a day	$150–300
Full Terraform autoscaling module	1 dev	1–2 days	$400–800
Step scaling + CloudWatch alarms	1 dev	1 day	$300–600
Scale-in protection for job workers	1 dev	Half a day	$200–400

How Viprasol Helps

ECS autoscaling misconfiguration is expensive in both directions: a scale-in cooldown that is too short causes thrashing (tasks added and removed repeatedly), while a CPU target as high as 80% leaves too little headroom (CPU saturates before new tasks are ready). Our Terraform module sets sensible defaults: CPU target 50%, scale-out cooldown 60s, scale-in cooldown 300s (5 minutes), and lifecycle ignore_changes on desired_count.

What we deliver:

aws_ecs_service: Fargate, private subnets, ALB target group, lifecycle { ignore_changes = [desired_count] }
aws_appautoscaling_target: min 2, max 20, ecs:service:DesiredCount dimension
Target tracking: ECSServiceAverageCPUUtilization target 50%, ECSServiceAverageMemoryUtilization target 70%
Step scaling: PercentChangeInCapacity at 70/80/90% CPU thresholds (25%/50%/100% increase)
Scheduled: cron(0 8 ? * MON-FRI *) scale-up morning, cron(0 20 ? * MON-FRI *) scale-down evening
UpdateTaskProtectionCommand: enableScaleInProtection(minutes) / disableScaleInProtection() for job workers
CloudWatch dashboard: RunningTaskCount + CPU/Memory utilization

Talk to our team about your ECS infrastructure →

Or explore our cloud infrastructure services.

Getting ECS Fargate Auto Scaling Right in Production

ECS Fargate auto scaling removes the node-management burden of EC2 capacity providers, so you scale tasks instead of instances. Application Auto Scaling adjusts your service's desired task count based on CloudWatch metrics, and the policy you choose matters. Target tracking is the cleanest default for serverless ECS auto scaling: pick a metric like average CPU or memory utilization, set a target value, and AWS adds or removes Fargate tasks to hold that line. Reach for step scaling when you need tiered, predictable reactions to sharp traffic spikes, or schedule-based scaling for known daily patterns. Watch your cooldown periods and minimum task count so you do not flap or scale to zero unintentionally, and set sensible max limits to cap spend. Our senior engineers tune these policies end to end, load-test the thresholds, and own the result so your containerized workloads stay responsive and cost-efficient.
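
The flapping that cooldowns guard against is easy to see in a toy simulation. The TypeScript sketch below takes a per-minute series of task counts that a policy "wants" and applies one simplified rule: scale-out happens immediately, while scale-in is allowed only if enough minutes have passed since the previous scale-in. This is an illustration under stated assumptions, not a model of Application Auto Scaling itself, which also involves alarm evaluation periods and a scale-out cooldown; `simulateTaskCounts` is a hypothetical name.

```typescript
// wanted[i] is the task count the policy would like at minute i.
function simulateTaskCounts(
  wanted: number[],
  startTasks: number,
  scaleInCooldownMin: number
): number[] {
  const actual: number[] = [];
  let current = startTasks;
  let lastScaleInAt = Number.NEGATIVE_INFINITY; // no scale-in has happened yet

  wanted.forEach((target, minute) => {
    if (target > current) {
      current = target; // scale out right away
    } else if (target < current && minute - lastScaleInAt >= scaleInCooldownMin) {
      current = target; // scale in only once the cooldown has elapsed
      lastScaleInAt = minute;
    }
    actual.push(current);
  });
  return actual;
}

// Load that oscillates every minute, then settles at 2 tasks.
const wanted = [4, 2, 4, 2, 4, 2, 2, 2];

console.log(simulateTaskCounts(wanted, 4, 0)); // [4, 2, 4, 2, 4, 2, 2, 2]
console.log(simulateTaskCounts(wanted, 4, 5)); // [4, 2, 4, 4, 4, 4, 2, 2]
```

Without a cooldown the service changes size five times in eight minutes; with a five-minute scale-in cooldown it changes three times and holds the higher count through the oscillation, trading a few minutes of extra capacity for stability.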


Viprasol Tech Team
