AWS ECS Blue/Green Deployment: CodeDeploy, Traffic Shifting, and Rollback

Implement blue/green deployments on AWS ECS with CodeDeploy. Covers Terraform setup, ALB listener rules, canary and linear traffic shifting, automated rollback on CloudWatch alarms, and deployment hooks.

Viprasol Tech Team
April 2, 2027
13 min read

Rolling deployments replace instances one at a time — you get zero downtime but both old and new code serve traffic simultaneously during the rollout. Blue/green goes further: spin up a completely new environment (green), validate it, then shift traffic all at once (or gradually). If something's wrong, roll back in seconds by pointing traffic back to blue.

AWS ECS with CodeDeploy makes blue/green deployments surprisingly straightforward, though the Terraform setup has enough moving parts to get tangled in.

How ECS Blue/Green Works

  1. Blue = current production (running task definition N)
  2. Green = new deployment (running task definition N+1)
  3. CodeDeploy creates a replacement task set (green) inside the same ECS service, behind the same ALB
  4. ALB has two target groups: blue-tg and green-tg
  5. Traffic shifts from blue to green (all-at-once, canary, or linear)
  6. After validation period, blue tasks are terminated
  7. On alarm: roll back by shifting traffic back to blue
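
The traffic-shift step is easier to reason about with the schedule written out. The sketch below models the all-at-once, canary, and linear options as plain data. The names `ShiftConfig` and `trafficSchedule` are illustrative and not part of any AWS SDK, and the model assumes the first linear increment is applied at minute 0.

```typescript
// Illustrative model of traffic shifting; names are hypothetical, not an AWS API.
type ShiftConfig =
  | { kind: "allAtOnce" }
  | { kind: "canary"; percent: number; intervalMinutes: number }
  | { kind: "linear"; percent: number; intervalMinutes: number };

interface ShiftStep {
  minute: number;       // minutes since the traffic shift started
  greenPercent: number; // share of production traffic routed to green
}

function trafficSchedule(config: ShiftConfig): ShiftStep[] {
  switch (config.kind) {
    case "allAtOnce":
      return [{ minute: 0, greenPercent: 100 }];
    case "canary":
      // One small slice first, then everything after the bake interval.
      return [
        { minute: 0, greenPercent: config.percent },
        { minute: config.intervalMinutes, greenPercent: 100 },
      ];
    case "linear": {
      if (config.percent <= 0) throw new Error("percent must be positive");
      const steps: ShiftStep[] = [];
      let green = 0;
      // Add equal increments until green carries all traffic.
      for (let i = 0; green < 100; i++) {
        green = Math.min(100, green + config.percent);
        steps.push({ minute: i * config.intervalMinutes, greenPercent: green });
      }
      return steps;
    }
  }
}
```

For example, `trafficSchedule({ kind: "canary", percent: 10, intervalMinutes: 5 })` yields two steps: 10% at minute 0 and 100% at minute 5. An alarm firing at any point before the last step is what triggers the shift back to blue.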

Terraform Infrastructure

# terraform/ecs-blue-green.tf

# --- ALB setup ---
resource "aws_lb" "app" {
  name               = "${var.app_name}-alb"
  internal           = false
  load_balancer_type = "application"
  security_groups    = [aws_security_group.alb.id]
  subnets            = var.public_subnet_ids
}

# Production listener (port 443)
resource "aws_lb_listener" "https" {
  load_balancer_arn = aws_lb.app.arn
  port              = 443
  protocol          = "HTTPS"
  ssl_policy        = "ELBSecurityPolicy-TLS13-1-2-2021-06"
  certificate_arn   = aws_acm_certificate.app.arn

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.blue.arn
  }

  lifecycle {
    ignore_changes = [default_action]  # CodeDeploy manages this
  }
}

# Test listener (port 8080) — for green validation before traffic shift
resource "aws_lb_listener" "test" {
  load_balancer_arn = aws_lb.app.arn
  port              = 8080
  protocol          = "HTTP"

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.green.arn
  }

  lifecycle {
    ignore_changes = [default_action]
  }
}

# Blue target group
resource "aws_lb_target_group" "blue" {
  name        = "${var.app_name}-blue"
  port        = var.container_port
  protocol    = "HTTP"
  vpc_id      = var.vpc_id
  target_type = "ip"

  health_check {
    enabled             = true
    path                = "/health"
    interval            = 15
    timeout             = 5
    healthy_threshold   = 2
    unhealthy_threshold = 3
    matcher             = "200"
  }

  deregistration_delay = 30  # Drain existing connections before deregistering
}

# Green target group (identical to blue)
resource "aws_lb_target_group" "green" {
  name        = "${var.app_name}-green"
  port        = var.container_port
  protocol    = "HTTP"
  vpc_id      = var.vpc_id
  target_type = "ip"

  health_check {
    enabled             = true
    path                = "/health"
    interval            = 15
    timeout             = 5
    healthy_threshold   = 2
    unhealthy_threshold = 3
    matcher             = "200"
  }

  deregistration_delay = 30
}

# --- ECS setup ---
resource "aws_ecs_cluster" "app" {
  name = var.app_name

  setting {
    name  = "containerInsights"
    value = "enabled"
  }
}

resource "aws_ecs_task_definition" "app" {
  family                   = var.app_name
  network_mode             = "awsvpc"
  requires_compatibilities = ["FARGATE"]
  cpu                      = var.task_cpu
  memory                   = var.task_memory
  execution_role_arn       = aws_iam_role.ecs_execution.arn
  task_role_arn            = aws_iam_role.ecs_task.arn

  container_definitions = jsonencode([{
    name      = var.app_name
    image     = "${aws_ecr_repository.app.repository_url}:latest"
    essential = true

    portMappings = [{ containerPort = var.container_port, protocol = "tcp" }]

    environment = [
      { name = "NODE_ENV",  value = "production" },
      { name = "PORT",      value = tostring(var.container_port) },
    ]

    secrets = [
      { name = "DATABASE_URL", valueFrom = aws_ssm_parameter.database_url.arn },
      { name = "SECRET_KEY",   valueFrom = aws_ssm_parameter.secret_key.arn },
    ]

    logConfiguration = {
      logDriver = "awslogs"
      options = {
        "awslogs-group"         = aws_cloudwatch_log_group.app.name
        "awslogs-region"        = var.aws_region
        "awslogs-stream-prefix" = "ecs"
      }
    }

    # Assumes curl is installed in the container image
    healthCheck = {
      command     = ["CMD-SHELL", "curl -f http://localhost:${var.container_port}/health || exit 1"]
      interval    = 10
      timeout     = 5
      retries     = 3
      startPeriod = 30
    }
  }])
}

resource "aws_ecs_service" "app" {
  name            = var.app_name
  cluster         = aws_ecs_cluster.app.id
  task_definition = aws_ecs_task_definition.app.arn
  desired_count   = var.desired_count
  launch_type     = "FARGATE"

  network_configuration {
    subnets          = var.private_subnet_ids
    security_groups  = [aws_security_group.ecs_tasks.id]
    assign_public_ip = false
  }

  load_balancer {
    target_group_arn = aws_lb_target_group.blue.arn
    container_name   = var.app_name
    container_port   = var.container_port
  }

  # Required for CodeDeploy blue/green
  deployment_controller {
    type = "CODE_DEPLOY"
  }

  lifecycle {
    # CodeDeploy manages task definition and load_balancer changes
    ignore_changes = [task_definition, load_balancer, desired_count]
  }
}

# --- CodeDeploy setup ---
resource "aws_codedeploy_app" "app" {
  compute_platform = "ECS"
  name             = var.app_name
}

resource "aws_codedeploy_deployment_group" "app" {
  app_name               = aws_codedeploy_app.app.name
  deployment_group_name  = "${var.app_name}-dg"
  service_role_arn       = aws_iam_role.codedeploy.arn
  deployment_config_name = "CodeDeployDefault.ECSCanary10Percent5Minutes"
  # Other options:
  # "CodeDeployDefault.ECSAllAtOnce"          — instant (risky)
  # "CodeDeployDefault.ECSLinear10PercentEvery1Minutes" — 10% per minute
  # "CodeDeployDefault.ECSCanary10Percent5Minutes"      — 10% for 5 min, then 100%

  auto_rollback_configuration {
    enabled = true
    events  = ["DEPLOYMENT_FAILURE", "DEPLOYMENT_STOP_ON_ALARM"]
  }

  blue_green_deployment_config {
    deployment_ready_option {
      action_on_timeout    = "CONTINUE_DEPLOYMENT"
      wait_time_in_minutes = 0
    }

    terminate_blue_instances_on_deployment_success {
      action                           = "TERMINATE"
      termination_wait_time_in_minutes = 5  # Wait 5 min after traffic shift before killing blue
    }
  }

  deployment_style {
    deployment_option = "WITH_TRAFFIC_CONTROL"
    deployment_type   = "BLUE_GREEN"
  }

  ecs_service {
    cluster_name = aws_ecs_cluster.app.name
    service_name = aws_ecs_service.app.name
  }

  load_balancer_info {
    target_group_pair_info {
      prod_traffic_route {
        listener_arns = [aws_lb_listener.https.arn]
      }
      test_traffic_route {
        listener_arns = [aws_lb_listener.test.arn]
      }
      target_group {
        name = aws_lb_target_group.blue.name
      }
      target_group {
        name = aws_lb_target_group.green.name
      }
    }
  }

  # Alarm-based rollback
  alarm_configuration {
    alarms  = [aws_cloudwatch_metric_alarm.error_rate.alarm_name]
    enabled = true
  }
}

# CloudWatch alarm: trigger rollback if 5xx rate > 1%
resource "aws_cloudwatch_metric_alarm" "error_rate" {
  alarm_name          = "${var.app_name}-error-rate"
  comparison_operator = "GreaterThanThreshold"
  evaluation_periods  = 2
  threshold           = 1

  metric_query {
    id          = "error_rate"
    expression  = "errors / total * 100"
    label       = "Error Rate %"
    return_data = true
  }

  metric_query {
    id = "errors"
    metric {
      metric_name = "HTTPCode_Target_5XX_Count"
      namespace   = "AWS/ApplicationELB"
      period      = 60
      stat        = "Sum"
      dimensions  = { LoadBalancer = aws_lb.app.arn_suffix }
    }
  }

  metric_query {
    id = "total"
    metric {
      metric_name = "RequestCount"
      namespace   = "AWS/ApplicationELB"
      period      = 60
      stat        = "Sum"
      dimensions  = { LoadBalancer = aws_lb.app.arn_suffix }
    }
  }

  alarm_description  = "Rollback ECS deployment if error rate exceeds 1%"
  treat_missing_data = "notBreaching"
}
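
The alarm's metric math and evaluation rule can be checked by hand. The following is a simplified model, not the CloudWatch implementation: it assumes one datapoint per 60-second period and treats any period without a computable rate as not breaching, matching `treat_missing_data = "notBreaching"`. The names `MinuteSample`, `errorRatePercent`, and `wouldAlarm` are hypothetical.

```typescript
// Simplified model of the error-rate alarm; names are hypothetical.
interface MinuteSample {
  errors?: number; // HTTPCode_Target_5XX_Count (Sum over 60s); undefined if no datapoint
  total?: number;  // RequestCount (Sum over 60s); undefined if no datapoint
}

// Mirrors the metric math expression "errors / total * 100".
function errorRatePercent(sample: MinuteSample): number | undefined {
  if (sample.errors === undefined || sample.total === undefined || sample.total === 0) {
    return undefined; // no usable datapoint for this period
  }
  return (sample.errors / sample.total) * 100;
}

// ALARM only when each of the last `evaluationPeriods` samples exceeds the threshold.
// Missing datapoints count as not breaching.
function wouldAlarm(
  samples: MinuteSample[],
  thresholdPercent = 1,
  evaluationPeriods = 2,
): boolean {
  if (samples.length < evaluationPeriods) return false;
  return samples.slice(-evaluationPeriods).every((sample) => {
    const rate = errorRatePercent(sample);
    return rate !== undefined && rate > thresholdPercent;
  });
}
```

With `evaluation_periods = 2` and a 60-second period, a bad release needs roughly two minutes of elevated 5xx responses before the rollback starts, which fits inside the 5-minute canary window.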

AppSpec File

CodeDeploy needs an appspec.yaml to know what to deploy:

# appspec.yaml (stored in S3 or passed inline when the deployment is created)
version: 0.0
Resources:
  - TargetService:
      Type: AWS::ECS::Service
      Properties:
        TaskDefinition: "<TASK_DEFINITION>"      # Replaced by CI/CD
        LoadBalancerInfo:
          ContainerName: "your-app-name"         # Must match the container name in the task definition
          ContainerPort: 3000                    # Must match var.container_port
        PlatformVersion: "LATEST"
Hooks:
  # Each hook invokes a Lambda function. Account IDs are 12 digits; 123456789012 is a placeholder.
  - BeforeInstall: "arn:aws:lambda:us-east-1:123456789012:function:pre-deploy-hook"
  - AfterInstall: "arn:aws:lambda:us-east-1:123456789012:function:post-install-hook"
  - AfterAllowTestTraffic: "arn:aws:lambda:us-east-1:123456789012:function:smoke-test-hook"
  # BeforeAllowTraffic is not used here, so it is omitted rather than set to null
  - AfterAllowTraffic: "arn:aws:lambda:us-east-1:123456789012:function:post-deploy-hook"
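
Because the Hooks section is easy to get subtly wrong, a small pre-flight check can catch mistakes before CodeDeploy does. The sketch below is a hypothetical helper, not an AWS API: it takes the parsed Hooks list and reports unknown hook names, empty targets, and malformed Lambda ARNs. The ARN pattern is an assumption that covers only the standard `aws` partition.

```typescript
// Hypothetical pre-flight check for the Hooks section of an ECS AppSpec.
const ECS_HOOKS: string[] = [
  "BeforeInstall",
  "AfterInstall",
  "AfterAllowTestTraffic",
  "BeforeAllowTraffic",
  "AfterAllowTraffic",
];

// Assumed ARN shape: arn:aws:lambda:<region>:<12-digit account>:function:<name>[:<alias>]
const LAMBDA_ARN =
  /^arn:aws:lambda:[a-z0-9-]+:\d{12}:function:[A-Za-z0-9_-]+(:[A-Za-z0-9_$-]+)?$/;

function validateHooks(hooks: Array<Record<string, unknown>>): string[] {
  const problems: string[] = [];
  for (const entry of hooks) {
    for (const [name, target] of Object.entries(entry)) {
      if (!ECS_HOOKS.includes(name)) {
        problems.push(`${name}: not an ECS lifecycle hook`);
        continue;
      }
      if (typeof target !== "string" || target.length === 0) {
        // A null or empty target gives CodeDeploy nothing to invoke.
        problems.push(`${name}: expected a Lambda function name or ARN`);
      } else if (target.startsWith("arn:") && !LAMBDA_ARN.test(target)) {
        problems.push(`${name}: malformed Lambda ARN`);
      }
    }
  }
  return problems;
}
```

An empty result means the hooks passed these checks. It does not prove that the functions exist or that the CodeDeploy role is allowed to invoke them.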

Deployment Hook: Smoke Tests

// lambda/hooks/smoke-test.ts
// Runs against green environment (port 8080) before traffic shift
import {
  CodeDeployClient,
  PutLifecycleEventHookExecutionStatusCommand,
} from "@aws-sdk/client-codedeploy";

const codedeploy = new CodeDeployClient({});

export async function handler(event: {
  DeploymentId: string;
  LifecycleEventHookExecutionId: string;
}) {
  const { DeploymentId, LifecycleEventHookExecutionId } = event;
  let status: "Succeeded" | "Failed" = "Succeeded";

  try {
    // Test endpoint is on port 8080 (test listener routes to green)
    const ALB_DNS = process.env.ALB_DNS!;

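    // Time out each request so a hung endpoint fails the hook quickly
    // (AbortSignal.timeout requires Node.js 18+)
    const checks = await Promise.all([
      fetch(`http://${ALB_DNS}:8080/health`, { signal: AbortSignal.timeout(5000) }).then((r) => r.ok),
      fetch(`http://${ALB_DNS}:8080/api/v2/status`, { signal: AbortSignal.timeout(5000) }).then((r) => r.ok),
    ]);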

    if (!checks.every(Boolean)) {
      console.error("Smoke test failed:", checks);
      status = "Failed";
    } else {
      console.log("Smoke tests passed");
    }
  } catch (err) {
    console.error("Smoke test error:", err);
    status = "Failed";
  }

  await codedeploy.send(
    new PutLifecycleEventHookExecutionStatusCommand({
      deploymentId: DeploymentId,
      lifecycleEventHookExecutionId: LifecycleEventHookExecutionId,
      status,
    })
  );
}
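
The check logic in the handler can also be pulled into a function that takes its HTTP client as a parameter, which makes it testable without a real ALB. This is a sketch of one way to do that. `Fetcher`, `CheckResult`, and `runSmokeChecks` are hypothetical names, and in the Lambda itself the global `fetch` would be passed as the fetcher.

```typescript
// Hypothetical refactor: the fetcher is injected so the logic can be unit-tested.
type Fetcher = (url: string) => Promise<{ ok: boolean; status: number }>;

interface CheckResult {
  path: string;
  passed: boolean;
  detail: string;
}

async function runSmokeChecks(
  baseUrl: string,
  paths: string[],
  fetcher: Fetcher,
): Promise<{ status: "Succeeded" | "Failed"; results: CheckResult[] }> {
  const results = await Promise.all(
    paths.map(async (path): Promise<CheckResult> => {
      try {
        const res = await fetcher(`${baseUrl}${path}`);
        return { path, passed: res.ok, detail: `HTTP ${res.status}` };
      } catch (err) {
        // A network error fails this check without hiding the other results.
        return { path, passed: false, detail: String(err) };
      }
    }),
  );
  const status = results.every((r) => r.passed) ? "Succeeded" : "Failed";
  return { status, results };
}
```

The returned `status` uses the same two values the handler reports to CodeDeploy, and the per-path `detail` strings make the CloudWatch log easier to read when a deployment is rejected.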

CI/CD Pipeline: GitHub Actions

# .github/workflows/deploy.yml
name: Deploy to ECS

on:
  push:
    branches: [main]

env:
  AWS_REGION: us-east-1
  ECR_REPOSITORY: your-app
  ECS_CLUSTER: your-app
  ECS_SERVICE: your-app
  CODEDEPLOY_APP: your-app
  CODEDEPLOY_GROUP: your-app-dg

jobs:
  deploy:
    runs-on: ubuntu-latest
    permissions:
      id-token: write
      contents: read

    steps:
      - uses: actions/checkout@v4

      - name: Configure AWS credentials (OIDC)
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ secrets.AWS_DEPLOY_ROLE_ARN }}
          aws-region: ${{ env.AWS_REGION }}

      - name: Login to Amazon ECR
        id: login-ecr
        uses: aws-actions/amazon-ecr-login@v2

      - name: Build, tag, and push image
        id: build
        env:
          ECR_REGISTRY: ${{ steps.login-ecr.outputs.registry }}
          IMAGE_TAG: ${{ github.sha }}
        run: |
          docker build -t $ECR_REGISTRY/$ECR_REPOSITORY:$IMAGE_TAG \
            --build-arg BUILD_SHA=${{ github.sha }} .
          docker push $ECR_REGISTRY/$ECR_REPOSITORY:$IMAGE_TAG
          echo "image=$ECR_REGISTRY/$ECR_REPOSITORY:$IMAGE_TAG" >> $GITHUB_OUTPUT

      - name: Register new task definition
        id: task-def
        run: |
          TASK_DEF=$(aws ecs describe-task-definition \
            --task-definition ${{ env.ECS_SERVICE }} \
            --query 'taskDefinition' --output json)
          NEW_TASK_DEF=$(echo "$TASK_DEF" | jq \
            --arg IMAGE "${{ steps.build.outputs.image }}" \
            '.containerDefinitions[0].image = $IMAGE | del(.taskDefinitionArn, .revision, .status, .requiresAttributes, .compatibilities, .registeredAt, .registeredBy)')
          NEW_ARN=$(aws ecs register-task-definition \
            --cli-input-json "$NEW_TASK_DEF" \
            --query 'taskDefinition.taskDefinitionArn' --output text)
          echo "task_def_arn=$NEW_ARN" >> $GITHUB_OUTPUT

      - name: Create appspec and deploy
        run: |
          cat > appspec.json << EOF
          {
            "version": 0.0,
            "Resources": [{
              "TargetService": {
                "Type": "AWS::ECS::Service",
                "Properties": {
                  "TaskDefinition": "${{ steps.task-def.outputs.task_def_arn }}",
                  "LoadBalancerInfo": {
                    "ContainerName": "${{ env.ECS_SERVICE }}",
                    "ContainerPort": 3000
                  }
                }
              }
            }]
          }
          EOF

          # "content" must be a JSON string, so let jq escape the AppSpec
          REVISION=$(jq -n --arg content "$(jq -c . appspec.json)" \
            '{revisionType: "AppSpecContent", appSpecContent: {content: $content}}')

          aws deploy create-deployment \
            --application-name ${{ env.CODEDEPLOY_APP }} \
            --deployment-group-name ${{ env.CODEDEPLOY_GROUP }} \
            --revision "$REVISION"
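
The deploy step depends on the AppSpec being embedded as a string inside the revision JSON, so it ends up serialized twice. The sketch below shows that nesting in TypeScript. `buildAppSpec` and `buildRevision` are hypothetical helpers; the field names follow the AppSpec and revision shapes used in this post.

```typescript
// Hypothetical helpers showing how the AppSpec nests inside the revision payload.
interface AppSpec {
  version: number;
  Resources: Array<{
    TargetService: {
      Type: "AWS::ECS::Service";
      Properties: {
        TaskDefinition: string;
        LoadBalancerInfo: { ContainerName: string; ContainerPort: number };
      };
    };
  }>;
}

function buildAppSpec(
  taskDefinitionArn: string,
  containerName: string,
  containerPort: number,
): AppSpec {
  return {
    version: 0.0,
    Resources: [
      {
        TargetService: {
          Type: "AWS::ECS::Service",
          Properties: {
            TaskDefinition: taskDefinitionArn,
            LoadBalancerInfo: { ContainerName: containerName, ContainerPort: containerPort },
          },
        },
      },
    ],
  };
}

function buildRevision(appSpec: AppSpec) {
  return {
    revisionType: "AppSpecContent" as const,
    // content is a JSON *string*: the AppSpec is serialized here and escaped
    // again when the whole revision object is serialized.
    appSpecContent: { content: JSON.stringify(appSpec) },
  };
}
```

`JSON.stringify(buildRevision(appSpec))` produces a value in the shape the `--revision` flag expects, with the inner quotes escaped.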

Cost Estimates

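Component                      Cost (us-east-1, on-demand)
ECS Fargate (per vCPU-hour)    $0.04048
ECS Fargate (per GB-hour)      $0.004445
ALB (per hour)                 $0.0225
ALB (per LCU-hour)             $0.008
CodeDeploy                     Free for ECS
Blue/green extra cost          Double task capacity for the length of the deployment (about 10 min with the canary config and termination wait above, plus task startup)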
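
To put a number on the extra cost, the Fargate rates can be multiplied out for the period when blue and green run side by side. The sketch below uses the two Fargate rates from the table. The task size, task count, and 10-minute overlap (5-minute canary window plus 5-minute termination wait) are illustrative assumptions, and `doubleCapacityCost` is a hypothetical helper.

```typescript
// Rates copied from the cost table (USD); the workload figures are assumptions.
const FARGATE_VCPU_HOUR = 0.04048;
const FARGATE_GB_HOUR = 0.004445;

interface TaskSize {
  vcpu: number;
  memoryGb: number;
}

// Cost of the duplicate (green) task set while it overlaps with blue.
function doubleCapacityCost(
  task: TaskSize,
  taskCount: number,
  overlapMinutes: number,
): number {
  const perTaskHour = task.vcpu * FARGATE_VCPU_HOUR + task.memoryGb * FARGATE_GB_HOUR;
  return perTaskHour * taskCount * (overlapMinutes / 60);
}

// Example: 4 tasks of 0.5 vCPU / 1 GB overlapping for 10 minutes.
const extraPerDeploy = doubleCapacityCost({ vcpu: 0.5, memoryGb: 1 }, 4, 10);
console.log(extraPerDeploy.toFixed(4)); // prints "0.0165"
```

At under two cents per deployment for a workload of this size, the overlap is negligible next to the always-on ALB and task costs.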

Dev setup cost: 3–5 days for Terraform + CodeDeploy + CI/CD pipeline, $1,500–3,000.

Working With Viprasol

Blue/green deployments on ECS have more moving parts than most teams expect: the Terraform lifecycle ignore_changes rules are non-obvious, the AppSpec format is fiddly, and deployment hook permissions require exact IAM configuration. Our team has set up ECS blue/green pipelines with canary traffic shifting, CloudWatch alarm rollbacks, and GitHub Actions CI/CD for production SaaS applications.

What we deliver:

  • Terraform: ECS cluster, service (CODE_DEPLOY controller), blue/green target groups, ALB listeners
  • CodeDeploy deployment group with canary/linear traffic config
  • Automated rollback on CloudWatch 5xx alarm
  • Smoke test Lambda hook (validates green before traffic shift)
  • GitHub Actions pipeline: build → push → register task def → CodeDeploy deploy
