AWS ECS Blue/Green Deployment: CodeDeploy, Traffic Shifting, and Rollback
Implement blue/green deployments on AWS ECS with CodeDeploy. Covers Terraform setup, ALB listener rules, canary and linear traffic shifting, automated rollback on CloudWatch alarms, and deployment hooks.
Rolling deployments replace instances one at a time — you get zero downtime but both old and new code serve traffic simultaneously during the rollout. Blue/green goes further: spin up a completely new environment (green), validate it, then shift traffic all at once (or gradually). If something's wrong, roll back in seconds by pointing traffic back to blue.
AWS ECS with CodeDeploy makes blue/green deployments surprisingly straightforward, though the Terraform setup has enough moving parts to get tangled in.
How ECS Blue/Green Works
- Blue = current production (running task definition N)
- Green = new deployment (running task definition N+1)
- CodeDeploy creates a replacement task set (green) inside the same ECS service, behind the same ALB
- ALB has two target groups: blue-tg and green-tg
- Traffic shifts from blue to green (all-at-once, canary, or linear)
- After validation period, blue tasks are terminated
- On alarm: roll back by shifting traffic back to blue, which takes seconds as long as the blue tasks are still running
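To make the three shifting modes concrete, here is a minimal sketch that models how much production traffic green receives over time. The type and function names are hypothetical, and the model assumes the first increment is applied at minute 0; it only mirrors the behaviour implied by the predefined config names (for example, 10% for 5 minutes, then 100%).

```typescript
// Hypothetical model of CodeDeploy traffic shifting; not an AWS API.
type ShiftStrategy =
  | { kind: "allAtOnce" }
  | { kind: "canary"; percent: number; waitMinutes: number }
  | { kind: "linear"; percent: number; everyMinutes: number };

interface ShiftStep {
  minute: number; // minutes since the traffic shift started
  greenPercent: number; // share of production traffic routed to green
}

function trafficSchedule(strategy: ShiftStrategy): ShiftStep[] {
  switch (strategy.kind) {
    case "allAtOnce":
      return [{ minute: 0, greenPercent: 100 }];
    case "canary":
      // One small slice first, then everything after the wait
      return [
        { minute: 0, greenPercent: strategy.percent },
        { minute: strategy.waitMinutes, greenPercent: 100 },
      ];
    case "linear": {
      if (strategy.percent <= 0 || strategy.everyMinutes <= 0) {
        throw new Error("percent and everyMinutes must be positive");
      }
      // Equal increments at a fixed interval until green has 100%
      const steps: ShiftStep[] = [];
      let green = 0;
      let minute = 0;
      while (green < 100) {
        green = Math.min(100, green + strategy.percent);
        steps.push({ minute, greenPercent: green });
        minute += strategy.everyMinutes;
      }
      return steps;
    }
  }
}

// Canary: 10% of traffic for 5 minutes, then 100%
console.log(trafficSchedule({ kind: "canary", percent: 10, waitMinutes: 5 }));
```

Under these assumptions a linear 10%-per-minute shift produces ten steps, so a bad release is exposed to a growing share of users, while a canary caps exposure at its first slice until the wait ends.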
Terraform Infrastructure
# terraform/ecs-blue-green.tf

# --- ALB setup ---
resource "aws_lb" "app" {
  name               = "${var.app_name}-alb"
  internal           = false
  load_balancer_type = "application"
  security_groups    = [aws_security_group.alb.id]
  subnets            = var.public_subnet_ids
}

# Production listener (port 443)
resource "aws_lb_listener" "https" {
  load_balancer_arn = aws_lb.app.arn
  port              = 443
  protocol          = "HTTPS"
  ssl_policy        = "ELBSecurityPolicy-TLS13-1-2-2021-06"
  certificate_arn   = aws_acm_certificate.app.arn

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.blue.arn
  }

  lifecycle {
    ignore_changes = [default_action] # CodeDeploy manages this
  }
}

# Test listener (port 8080): validates green before the traffic shift
resource "aws_lb_listener" "test" {
  load_balancer_arn = aws_lb.app.arn
  port              = 8080
  protocol          = "HTTP"

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.green.arn
  }

  lifecycle {
    ignore_changes = [default_action]
  }
}
# Blue target group
resource "aws_lb_target_group" "blue" {
  name        = "${var.app_name}-blue"
  port        = var.container_port
  protocol    = "HTTP"
  vpc_id      = var.vpc_id
  target_type = "ip"

  health_check {
    enabled             = true
    path                = "/health"
    interval            = 15
    timeout             = 5
    healthy_threshold   = 2
    unhealthy_threshold = 3
    matcher             = "200"
  }

  deregistration_delay = 30 # Drain existing connections before deregistering
}

# Green target group (identical to blue)
resource "aws_lb_target_group" "green" {
  name        = "${var.app_name}-green"
  port        = var.container_port
  protocol    = "HTTP"
  vpc_id      = var.vpc_id
  target_type = "ip"

  health_check {
    enabled             = true
    path                = "/health"
    interval            = 15
    timeout             = 5
    healthy_threshold   = 2
    unhealthy_threshold = 3
    matcher             = "200"
  }

  deregistration_delay = 30
}
# --- ECS setup ---
resource "aws_ecs_cluster" "app" {
  name = var.app_name

  setting {
    name  = "containerInsights"
    value = "enabled"
  }
}

resource "aws_ecs_task_definition" "app" {
  family                   = var.app_name
  network_mode             = "awsvpc"
  requires_compatibilities = ["FARGATE"]
  cpu                      = var.task_cpu
  memory                   = var.task_memory
  execution_role_arn       = aws_iam_role.ecs_execution.arn
  task_role_arn            = aws_iam_role.ecs_task.arn

  container_definitions = jsonencode([{
    name         = var.app_name
    image        = "${aws_ecr_repository.app.repository_url}:latest"
    essential    = true
    portMappings = [{ containerPort = var.container_port, protocol = "tcp" }]
    environment = [
      { name = "NODE_ENV", value = "production" },
      { name = "PORT", value = tostring(var.container_port) },
    ]
    secrets = [
      { name = "DATABASE_URL", valueFrom = aws_ssm_parameter.database_url.arn },
      { name = "SECRET_KEY", valueFrom = aws_ssm_parameter.secret_key.arn },
    ]
    logConfiguration = {
      logDriver = "awslogs"
      options = {
        "awslogs-group"         = aws_cloudwatch_log_group.app.name
        "awslogs-region"        = var.aws_region
        "awslogs-stream-prefix" = "ecs"
      }
    }
    healthCheck = {
      command     = ["CMD-SHELL", "curl -f http://localhost:${var.container_port}/health || exit 1"]
      interval    = 10
      timeout     = 5
      retries     = 3
      startPeriod = 30
    }
  }])
}
resource "aws_ecs_service" "app" {
  name            = var.app_name
  cluster         = aws_ecs_cluster.app.id
  task_definition = aws_ecs_task_definition.app.arn
  desired_count   = var.desired_count
  launch_type     = "FARGATE"

  network_configuration {
    subnets          = var.private_subnet_ids
    security_groups  = [aws_security_group.ecs_tasks.id]
    assign_public_ip = false
  }

  load_balancer {
    target_group_arn = aws_lb_target_group.blue.arn
    container_name   = var.app_name
    container_port   = var.container_port
  }

  # Required for CodeDeploy blue/green
  deployment_controller {
    type = "CODE_DEPLOY"
  }

  lifecycle {
    # CodeDeploy manages task definition and load_balancer changes
    ignore_changes = [task_definition, load_balancer, desired_count]
  }
}
# --- CodeDeploy setup ---
resource "aws_codedeploy_app" "app" {
  compute_platform = "ECS"
  name             = var.app_name
}

resource "aws_codedeploy_deployment_group" "app" {
  app_name               = aws_codedeploy_app.app.name
  deployment_group_name  = "${var.app_name}-dg"
  service_role_arn       = aws_iam_role.codedeploy.arn
  deployment_config_name = "CodeDeployDefault.ECSCanary10Percent5Minutes"
  # Other options:
  # "CodeDeployDefault.ECSAllAtOnce": instant (risky)
  # "CodeDeployDefault.ECSLinear10PercentEvery1Minutes": 10% per minute
  # "CodeDeployDefault.ECSCanary10Percent5Minutes": 10% for 5 min, then 100%

  auto_rollback_configuration {
    enabled = true
    events  = ["DEPLOYMENT_FAILURE", "DEPLOYMENT_STOP_ON_ALARM"]
  }

  blue_green_deployment_config {
    deployment_ready_option {
      action_on_timeout    = "CONTINUE_DEPLOYMENT"
      wait_time_in_minutes = 0
    }

    terminate_blue_instances_on_deployment_success {
      action                           = "TERMINATE"
      termination_wait_time_in_minutes = 5 # Wait 5 min after traffic shift before killing blue
    }
  }

  deployment_style {
    deployment_option = "WITH_TRAFFIC_CONTROL"
    deployment_type   = "BLUE_GREEN"
  }

  ecs_service {
    cluster_name = aws_ecs_cluster.app.name
    service_name = aws_ecs_service.app.name
  }

  load_balancer_info {
    target_group_pair_info {
      prod_traffic_route {
        listener_arns = [aws_lb_listener.https.arn]
      }

      test_traffic_route {
        listener_arns = [aws_lb_listener.test.arn]
      }

      target_group {
        name = aws_lb_target_group.blue.name
      }

      target_group {
        name = aws_lb_target_group.green.name
      }
    }
  }

  # Alarm-based rollback
  alarm_configuration {
    alarms  = [aws_cloudwatch_metric_alarm.error_rate.alarm_name]
    enabled = true
  }
}
# CloudWatch alarm: trigger rollback if 5xx rate > 1%
resource "aws_cloudwatch_metric_alarm" "error_rate" {
  alarm_name          = "${var.app_name}-error-rate"
  comparison_operator = "GreaterThanThreshold"
  evaluation_periods  = 2
  threshold           = 1

  metric_query {
    id          = "error_rate"
    expression  = "errors / total * 100"
    label       = "Error Rate %"
    return_data = true
  }

  metric_query {
    id = "errors"
    metric {
      metric_name = "HTTPCode_Target_5XX_Count"
      namespace   = "AWS/ApplicationELB"
      period      = 60
      stat        = "Sum"
      dimensions  = { LoadBalancer = aws_lb.app.arn_suffix }
    }
  }

  metric_query {
    id = "total"
    metric {
      metric_name = "RequestCount"
      namespace   = "AWS/ApplicationELB"
      period      = 60
      stat        = "Sum"
      dimensions  = { LoadBalancer = aws_lb.app.arn_suffix }
    }
  }

  alarm_description  = "Rollback ECS deployment if error rate exceeds 1%"
  treat_missing_data = "notBreaching"
}
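The alarm's metric math can be reasoned about offline. The sketch below is a simplified model, not CloudWatch's actual evaluation engine: it computes `errors / total * 100` per one-minute sample and reports an alarm only when the last two samples both exceed the 1% threshold, treating missing data as not breaching. The names `MinuteSample`, `errorRatePercent`, and `wouldAlarm` are hypothetical.

```typescript
// Simplified, hypothetical model of the error-rate alarm defined above.
interface MinuteSample {
  errors?: number; // Sum of HTTPCode_Target_5XX_Count for the minute (absent if not reported)
  total?: number; // Sum of RequestCount for the minute
}

// Mirrors the expression "errors / total * 100"; undefined stands in for missing data
function errorRatePercent(sample: MinuteSample): number | undefined {
  if (sample.errors === undefined || sample.total === undefined || sample.total === 0) {
    return undefined;
  }
  return (sample.errors / sample.total) * 100;
}

// True only when each of the most recent `evaluationPeriods` samples breaches the threshold
function wouldAlarm(samples: MinuteSample[], threshold = 1, evaluationPeriods = 2): boolean {
  if (samples.length < evaluationPeriods) return false;
  return samples.slice(-evaluationPeriods).every((sample) => {
    const rate = errorRatePercent(sample);
    // Missing data counts as not breaching, like treat_missing_data = "notBreaching"
    return rate !== undefined && rate > threshold;
  });
}

console.log(wouldAlarm([{ errors: 5, total: 100 }, { errors: 3, total: 100 }]));
```

One consequence worth noting: with two 60-second evaluation periods, a rollback triggers no sooner than about two minutes into a bad deployment, so the canary wait should be longer than that.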
AppSpec File
CodeDeploy needs an appspec.yaml to know what to deploy:
# appspec.yaml (stored in S3 or passed inline when creating the deployment)
version: 0.0
Resources:
  - TargetService:
      Type: AWS::ECS::Service
      Properties:
        TaskDefinition: "<TASK_DEFINITION>" # Replaced by CI/CD
        LoadBalancerInfo:
          ContainerName: "your-app-name"
          ContainerPort: 3000
        PlatformVersion: "LATEST"
# Hooks are optional; list only the lifecycle events you need (BeforeAllowTraffic is omitted here).
# The account ID below is a 12-digit placeholder.
Hooks:
  - BeforeInstall: "arn:aws:lambda:us-east-1:123456789012:function:pre-deploy-hook"
  - AfterInstall: "arn:aws:lambda:us-east-1:123456789012:function:post-install-hook"
  - AfterAllowTestTraffic: "arn:aws:lambda:us-east-1:123456789012:function:smoke-test-hook"
  - AfterAllowTraffic: "arn:aws:lambda:us-east-1:123456789012:function:post-deploy-hook"
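Because the AppSpec is indentation-sensitive and usually templated by CI/CD, it can help to generate it from code. The following is a rough sketch, assuming a hypothetical `renderAppSpec` helper and input shape; it emits the same structure as the file above and includes only the hooks that are actually set.

```typescript
// Hypothetical generator for an ECS AppSpec; field names follow the YAML shown above.
const ECS_HOOKS = [
  "BeforeInstall",
  "AfterInstall",
  "AfterAllowTestTraffic",
  "BeforeAllowTraffic",
  "AfterAllowTraffic",
] as const;
type EcsHook = (typeof ECS_HOOKS)[number];

interface AppSpecInput {
  taskDefinitionArn: string;
  containerName: string;
  containerPort: number;
  hooks?: Partial<Record<EcsHook, string>>; // hook name -> Lambda function ARN
}

function renderAppSpec(input: AppSpecInput): string {
  const lines = [
    "version: 0.0",
    "Resources:",
    "  - TargetService:",
    "      Type: AWS::ECS::Service",
    "      Properties:",
    `        TaskDefinition: "${input.taskDefinitionArn}"`,
    "        LoadBalancerInfo:",
    `          ContainerName: "${input.containerName}"`,
    `          ContainerPort: ${input.containerPort}`,
  ];
  // Emit hooks in lifecycle order and skip any that are not configured
  const hookLines = ECS_HOOKS
    .filter((name) => input.hooks?.[name] !== undefined)
    .map((name) => `  - ${name}: "${input.hooks![name]}"`);
  if (hookLines.length > 0) {
    lines.push("Hooks:", ...hookLines);
  }
  return lines.join("\n") + "\n";
}

console.log(
  renderAppSpec({
    taskDefinitionArn: "arn:aws:ecs:us-east-1:123456789012:task-definition/your-app:42", // placeholder
    containerName: "your-app-name",
    containerPort: 3000,
    hooks: { AfterAllowTestTraffic: "arn:aws:lambda:us-east-1:123456789012:function:smoke-test-hook" },
  })
);
```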
Deployment Hook: Smoke Tests
// lambda/hooks/smoke-test.ts
// Runs against green environment (port 8080) before traffic shift
import {
  CodeDeployClient,
  PutLifecycleEventHookExecutionStatusCommand,
} from "@aws-sdk/client-codedeploy";

const codedeploy = new CodeDeployClient({});

export async function handler(event: {
  DeploymentId: string;
  LifecycleEventHookExecutionId: string;
}) {
  const { DeploymentId, LifecycleEventHookExecutionId } = event;
  let status: "Succeeded" | "Failed" = "Succeeded";

  try {
    // Test endpoint is on port 8080 (test listener routes to green)
    const ALB_DNS = process.env.ALB_DNS!;
    const checks = await Promise.all([
      fetch(`http://${ALB_DNS}:8080/health`).then((r) => r.ok),
      fetch(`http://${ALB_DNS}:8080/api/v2/status`).then((r) => r.ok),
    ]);

    if (!checks.every(Boolean)) {
      console.error("Smoke test failed:", checks);
      status = "Failed";
    } else {
      console.log("Smoke tests passed");
    }
  } catch (err) {
    console.error("Smoke test error:", err);
    status = "Failed";
  }

  await codedeploy.send(
    new PutLifecycleEventHookExecutionStatusCommand({
      deploymentId: DeploymentId,
      lifecycleEventHookExecutionId: LifecycleEventHookExecutionId,
      status,
    })
  );
}
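A single failed request right after green starts can fail the whole deployment, so a smoke test may benefit from a per-request timeout and a few retries. This is one possible sketch rather than part of the hook above; `Probe`, `checkWithRetry`, and `httpProbe` are hypothetical names, and the attempt count, delay, and timeout defaults are arbitrary assumptions.

```typescript
// Hypothetical retry wrapper for smoke checks; the probe is injected so it can be tested offline.
type Probe = (url: string) => Promise<boolean>;

async function checkWithRetry(
  url: string,
  probe: Probe,
  attempts = 3,
  delayMs = 2000
): Promise<boolean> {
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      if (await probe(url)) return true;
    } catch {
      // Network errors and timeouts count as a failed attempt
    }
    if (attempt < attempts) {
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  return false;
}

// Probe built on fetch that aborts the request after timeoutMs
const httpProbe: Probe = async (url) => {
  const timeoutMs = 5000;
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const res = await fetch(url, { signal: controller.signal });
    return res.ok;
  } finally {
    clearTimeout(timer);
  }
};
```

Inside the handler, a call such as `checkWithRetry(url, httpProbe)` could stand in for each bare `fetch(...)` call.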
CI/CD Pipeline: GitHub Actions
# .github/workflows/deploy.yml
name: Deploy to ECS

on:
  push:
    branches: [main]

env:
  AWS_REGION: us-east-1
  ECR_REPOSITORY: your-app
  ECS_CLUSTER: your-app
  ECS_SERVICE: your-app
  CODEDEPLOY_APP: your-app
  CODEDEPLOY_GROUP: your-app-dg

jobs:
  deploy:
    runs-on: ubuntu-latest
    permissions:
      id-token: write
      contents: read
    steps:
      - uses: actions/checkout@v4

      - name: Configure AWS credentials (OIDC)
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ secrets.AWS_DEPLOY_ROLE_ARN }}
          aws-region: ${{ env.AWS_REGION }}

      - name: Login to Amazon ECR
        id: login-ecr
        uses: aws-actions/amazon-ecr-login@v2

      - name: Build, tag, and push image
        id: build
        env:
          ECR_REGISTRY: ${{ steps.login-ecr.outputs.registry }}
          IMAGE_TAG: ${{ github.sha }}
        run: |
          docker build -t "$ECR_REGISTRY/$ECR_REPOSITORY:$IMAGE_TAG" \
            --build-arg BUILD_SHA=${{ github.sha }} .
          docker push "$ECR_REGISTRY/$ECR_REPOSITORY:$IMAGE_TAG"
          echo "image=$ECR_REGISTRY/$ECR_REPOSITORY:$IMAGE_TAG" >> "$GITHUB_OUTPUT"

      - name: Register new task definition
        id: task-def
        run: |
          TASK_DEF=$(aws ecs describe-task-definition \
            --task-definition ${{ env.ECS_SERVICE }} \
            --query 'taskDefinition' --output json)
          # Swap in the new image and drop read-only fields that register-task-definition rejects
          NEW_TASK_DEF=$(echo "$TASK_DEF" | jq \
            --arg IMAGE "${{ steps.build.outputs.image }}" \
            '.containerDefinitions[0].image = $IMAGE | del(.taskDefinitionArn, .revision, .status, .requiresAttributes, .compatibilities, .registeredAt, .registeredBy)')
          NEW_ARN=$(aws ecs register-task-definition \
            --cli-input-json "$NEW_TASK_DEF" \
            --query 'taskDefinition.taskDefinitionArn' --output text)
          echo "task_def_arn=$NEW_ARN" >> "$GITHUB_OUTPUT"

      - name: Create appspec and deploy
        run: |
          cat > appspec.json << EOF
          {
            "version": 0.0,
            "Resources": [{
              "TargetService": {
                "Type": "AWS::ECS::Service",
                "Properties": {
                  "TaskDefinition": "${{ steps.task-def.outputs.task_def_arn }}",
                  "LoadBalancerInfo": {
                    "ContainerName": "${{ env.ECS_SERVICE }}",
                    "ContainerPort": 3000
                  }
                }
              }
            }]
          }
          EOF
          # "content" must be a JSON string, so let jq escape the AppSpec instead of splicing it in raw
          REVISION=$(jq -n --arg content "$(cat appspec.json)" \
            '{revisionType: "AppSpecContent", appSpecContent: {content: $content}}')
          aws deploy create-deployment \
            --application-name ${{ env.CODEDEPLOY_APP }} \
            --deployment-group-name ${{ env.CODEDEPLOY_GROUP }} \
            --revision "$REVISION"
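The jq filter in the "Register new task definition" step is the least obvious part of the pipeline: the output of `describe-task-definition` contains read-only fields that cannot be sent back when registering a new revision. As an illustration only, the same transformation can be written as a small function; `toRegisterInput` and the interface names are hypothetical.

```typescript
// Hypothetical equivalent of the jq filter: set the new image and strip read-only fields.
const READ_ONLY_FIELDS = [
  "taskDefinitionArn",
  "revision",
  "status",
  "requiresAttributes",
  "compatibilities",
  "registeredAt",
  "registeredBy",
] as const;

interface ContainerDefinition {
  name: string;
  image: string;
  [key: string]: unknown;
}

interface DescribedTaskDefinition {
  containerDefinitions: ContainerDefinition[];
  [key: string]: unknown;
}

function toRegisterInput(
  described: DescribedTaskDefinition,
  image: string
): DescribedTaskDefinition {
  // Deep copy so the caller's object is left untouched
  const copy: DescribedTaskDefinition = JSON.parse(JSON.stringify(described));
  for (const field of READ_ONLY_FIELDS) {
    delete copy[field];
  }
  if (copy.containerDefinitions.length === 0) {
    throw new Error("task definition has no container definitions");
  }
  // Like the jq filter, only the first container's image is replaced
  copy.containerDefinitions[0].image = image;
  return copy;
}
```

Note that both versions update only the first container, so a task definition with sidecars listed first would need the target container selected by name instead.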
Cost Estimates
| Component | Cost (us-east-1) |
|---|---|
| ECS Fargate (per vCPU-hour) | $0.04048 |
| ECS Fargate (per GB-hour) | $0.004445 |
| ALB (per hour) | $0.0225 |
| ALB (per LCU-hour) | $0.008 |
| CodeDeploy | Free for ECS |
| Blue/green extra cost | Double task capacity from green launch until blue is terminated (about 10 minutes with the config above: 5-minute canary wait plus 5-minute termination wait) |
Dev setup cost: 3–5 days for Terraform + CodeDeploy + CI/CD pipeline, $1,500–3,000.
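The extra Fargate cost of running blue and green side by side follows directly from the per-hour rates in the table. The sketch below works through that arithmetic; the function name and the example task sizing (4 tasks, 0.5 vCPU, 1 GB, 10 minutes of overlap) are assumptions for illustration, and the rates are the table values, which vary by region and over time.

```typescript
// Rates copied from the cost table above
const VCPU_HOUR_USD = 0.04048;
const GB_HOUR_USD = 0.004445;

// Cost of the duplicate (green) task capacity while both environments are running
function doubleCapacityCostUsd(
  tasks: number,
  vcpuPerTask: number,
  gbPerTask: number,
  overlapMinutes: number
): number {
  const perTaskHour = vcpuPerTask * VCPU_HOUR_USD + gbPerTask * GB_HOUR_USD;
  return tasks * perTaskHour * (overlapMinutes / 60);
}

// Hypothetical sizing: 4 tasks at 0.5 vCPU / 1 GB, overlapping for 10 minutes
const perDeploy = doubleCapacityCostUsd(4, 0.5, 1, 10);
console.log(`Extra cost per deployment: $${perDeploy.toFixed(4)}`);
```

For this sizing the overlap costs under two cents per deployment, so the double capacity is rarely a reason to avoid blue/green on Fargate.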
See Also
- AWS ECS Fargate Production Setup
- AWS ECS Service Connect for Microservices
- Terraform State Management
- AWS CloudWatch Observability
- Kubernetes Helm Charts
Working With Viprasol
Blue/green deployments on ECS have more moving parts than most teams expect — the Terraform lifecycle ignore_changes are non-obvious, the AppSpec format is fiddly, and deployment hook permissions require exact IAM configuration. Our team has set up ECS blue/green pipelines with canary traffic shifting, CloudWatch alarm rollbacks, and GitHub Actions CI/CD for production SaaS applications.
What we deliver:
- Terraform: ECS cluster, service (CODE_DEPLOY controller), blue/green target groups, ALB listeners
- CodeDeploy deployment group with canary/linear traffic config
- Automated rollback on CloudWatch 5xx alarm
- Smoke test Lambda hook (validates green before traffic shift)
- GitHub Actions pipeline: build → push → register task def → CodeDeploy deploy
About the Author
Viprasol Tech Team
Custom Software Development Specialists
The Viprasol Tech team specialises in algorithmic trading software, AI agent systems, and SaaS development. With 100+ projects delivered across MT4/MT5 EAs, fintech platforms, and production AI systems, the team brings deep technical experience to every engagement. Based in India, serving clients globally.