
AWS Lambda Container Images: ECR, Multi-Stage Dockerfiles, and Cold Start Optimization

Deploy AWS Lambda functions as container images. Covers ECR repository setup, multi-stage Dockerfiles for Node.js and Python, Lambda base images, cold start optimization techniques, Terraform deployment, and size optimization.

Viprasol Tech Team
April 15, 2027
12 min read

Lambda container images solve the package size limit problem (50MB for zip, 10GB for containers) and bring the full Docker ecosystem to serverless — custom runtimes, native binaries, ML models, and consistent local-to-production environments. The tradeoff is cold starts: containers initialize slower than zip-based functions, requiring careful optimization.

This guide covers the Dockerfile patterns, ECR setup, cold start mitigation, and Terraform deployment for production Lambda containers.

When to Use Container Images vs Zip

Scenario | Recommendation
Function < 50MB dependencies | Zip deployment (simpler, faster cold start)
Native binaries (ffmpeg, pandoc, OpenCV) | Container ✅
Large ML models | Container ✅
Custom runtime | Container ✅
Monorepo with shared dependencies | Container ✅
Function > 50MB total | Container ✅
Consistent dev/prod environment | Container ✅

Node.js Multi-Stage Dockerfile

# Dockerfile — multi-stage for minimal final image

# Stage 1: Build dependencies
FROM node:22-alpine AS builder

WORKDIR /app

# Copy package files first for better layer caching
COPY package.json package-lock.json ./
RUN npm ci --omit=dev

# Copy source
COPY src/ ./src/
COPY tsconfig.json ./

# Build TypeScript
RUN npm install -D typescript @types/node && npx tsc --outDir dist

# Stage 2: Production image using Lambda base image
FROM public.ecr.aws/lambda/nodejs:22

# Copy compiled code
COPY --from=builder /app/dist ${LAMBDA_TASK_ROOT}

# Copy production node_modules (note: native addons compiled on Alpine/musl
# will not load on the AL2023 base image — build those on a glibc image)
COPY --from=builder /app/node_modules ${LAMBDA_TASK_ROOT}/node_modules

# Lambda handler: file.exportedFunction
CMD ["index.handler"]
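Before pushing to ECR, you can smoke-test the image locally: the AWS Lambda base images bundle the Runtime Interface Emulator, so `docker run -p 9000:8080 my-function:latest` exposes the documented invoke endpoint on localhost. A minimal Python client sketch (the image name and port mapping above are assumptions):

```python
# Smoke-test a locally running Lambda container via the Runtime Interface
# Emulator. Start the container first:
#   docker run -p 9000:8080 my-function:latest
import json
import urllib.request

RIE_URL = "http://localhost:9000/2015-03-31/functions/function/invocations"

def build_request(event: dict, url: str = RIE_URL) -> urllib.request.Request:
    """Build the POST request the emulator expects: the event as a JSON body."""
    return urllib.request.Request(
        url,
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def invoke_local(event: dict) -> dict:
    """Send an event to the local container and decode the handler's reply."""
    with urllib.request.urlopen(build_request(event)) as resp:
        return json.loads(resp.read())
```

Call `invoke_local({"ping": True})` while the container is running to see exactly what API Gateway or SQS would get back.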

☁️ Is Your Cloud Costing Too Much?

Most teams overspend 30–40% on cloud — wrong instance types, no reserved pricing, bloated storage. We audit, right-size, and automate your infrastructure.

  • AWS, GCP, Azure certified engineers
  • Infrastructure as Code (Terraform, CDK)
  • Docker, Kubernetes, GitHub Actions CI/CD
  • Typical audit recovers $500–$3,000/month in savings

Python Dockerfile for ML Workloads

# Python Lambda with large ML dependencies
FROM public.ecr.aws/lambda/python:3.12 AS base

# Install system dependencies if needed
# RUN dnf install -y libgomp  # OpenMP for some ML libraries

# Stage 1: Install Python dependencies
FROM base AS builder

# Install pip dependencies to a separate directory
COPY requirements.txt ./
RUN pip install --target /python-packages -r requirements.txt --no-cache-dir

# Stage 2: Final image
FROM base

# Copy dependencies
COPY --from=builder /python-packages ${LAMBDA_TASK_ROOT}

# Copy function code
COPY src/ ${LAMBDA_TASK_ROOT}/

# Lambda handler
CMD ["handler.lambda_handler"]

# src/handler.py
import json
import os
from typing import Any

# Import at module level (runs once, cached across warm invocations)
import boto3

s3 = boto3.client("s3")

def lambda_handler(event: dict, context: Any) -> dict:
    """Lambda entry point."""
    try:
        result = process_event(event)
        return {
            "statusCode": 200,
            "body": json.dumps(result),
        }
    except Exception as e:
        print(f"Error: {e}")
        return {
            "statusCode": 500,
            "body": json.dumps({"error": str(e)}),
        }

def process_event(event: dict) -> dict:
    # Your business logic here
    return {"message": "OK"}
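Because the handler is a plain function, both the success and error paths can be unit-tested before any image is built. A runnable sketch of that test pattern, using a stand-in `process_event` instead of the boto3-backed logic so it works anywhere (the `fail` flag is a hypothetical trigger for the error path):

```python
import json
from typing import Any

def process_event(event: dict) -> dict:
    # Stand-in for the real business logic
    if event.get("fail"):
        raise ValueError("boom")
    return {"message": "OK"}

def lambda_handler(event: dict, context: Any) -> dict:
    # Same shape as src/handler.py, minus the module-level boto3 client
    try:
        return {"statusCode": 200, "body": json.dumps(process_event(event))}
    except Exception as e:
        return {"statusCode": 500, "body": json.dumps({"error": str(e)})}

# Exercise both paths; context is unused here, so None suffices
ok = lambda_handler({}, None)
err = lambda_handler({"fail": True}, None)
```

The same assertions then run unchanged against the container via the Runtime Interface Emulator, which catches packaging mistakes that unit tests cannot.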

Custom Runtime Dockerfile

# Example: Bun runtime for Lambda
FROM public.ecr.aws/lambda/provided:al2023

# Install Bun
RUN curl -fsSL https://bun.sh/install | bash
ENV PATH="/root/.bun/bin:$PATH"

WORKDIR ${LAMBDA_TASK_ROOT}

COPY package.json bun.lockb ./
RUN bun install --production

COPY src/ ./src/

# Custom bootstrap script: Lambda calls this to start your runtime
COPY bootstrap ./
RUN chmod +x bootstrap

CMD ["src/index.handler"]

#!/bin/bash
# bootstrap — custom runtime entrypoint
set -euo pipefail

HANDLER_FILE="${_HANDLER%%.*}"
HANDLER_FUNC="${_HANDLER##*.}"

while true; do
  # Get next event from the Lambda runtime API — a single request; the
  # response body is the event, the headers carry the request ID
  HEADERS_FILE=$(mktemp)
  EVENT=$(curl -sS -LD "$HEADERS_FILE" \
    "http://${AWS_LAMBDA_RUNTIME_API}/2018-06-01/runtime/invocation/next")
  REQUEST_ID=$(grep -Fi "Lambda-Runtime-Aws-Request-Id" "$HEADERS_FILE" | tr -d '[:space:]' | cut -d: -f2)

  # Invoke handler using Bun (_HANDLER already contains the src/ prefix)
  RESULT=$(bun run "${LAMBDA_TASK_ROOT}/${HANDLER_FILE}.ts" <<< "$EVENT" 2>&1)

  # Send response
  curl -sS -X POST "http://${AWS_LAMBDA_RUNTIME_API}/2018-06-01/runtime/invocation/${REQUEST_ID}/response" \
    -d "$RESULT"
done

⚙️ DevOps Done Right — Zero Downtime, Full Automation

Ship faster without breaking things. We build CI/CD pipelines, monitoring stacks, and auto-scaling infrastructure that your team can actually maintain.

  • Staging + production environments with feature flags
  • Automated security scanning in the pipeline
  • Uptime monitoring + alerting + runbook automation
  • On-call support handover docs included

ECR Repository and CI/CD

# terraform/ecr.tf

resource "aws_ecr_repository" "app" {
  name                 = var.app_name
  image_tag_mutability = "MUTABLE"

  image_scanning_configuration {
    scan_on_push = true  # Automatic vulnerability scanning
  }

  encryption_configuration {
    encryption_type = "AES256"
  }
}

# Lifecycle policy: keep last 10 images, delete untagged older than 1 day
resource "aws_ecr_lifecycle_policy" "app" {
  repository = aws_ecr_repository.app.name

  policy = jsonencode({
    rules = [
      {
        rulePriority = 1
        description  = "Remove untagged images after 1 day"
        selection = {
          tagStatus   = "untagged"
          countType   = "sinceImagePushed"
          countUnit   = "days"
          countNumber = 1
        }
        action = { type = "expire" }
      },
      {
        rulePriority = 2
        description  = "Keep last 10 tagged images"
        selection = {
          tagStatus     = "tagged"
          tagPrefixList = ["v"]
          countType     = "imageCountMoreThan"
          countNumber   = 10
        }
        action = { type = "expire" }
      }
    ]
  })
}

output "ecr_repository_url" { value = aws_ecr_repository.app.repository_url }
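Outside Terraform, the same lifecycle policy can be applied directly with `aws ecr put-lifecycle-policy --lifecycle-policy-text file://policy.json` — useful for repositories managed by hand. A sketch generating the raw JSON that Terraform's `jsonencode` produces:

```python
import json

# Mirrors the Terraform lifecycle rules: expire untagged images after a day,
# keep only the last 10 images tagged with the "v" prefix
lifecycle_policy = {
    "rules": [
        {
            "rulePriority": 1,
            "description": "Remove untagged images after 1 day",
            "selection": {
                "tagStatus": "untagged",
                "countType": "sinceImagePushed",
                "countUnit": "days",
                "countNumber": 1,
            },
            "action": {"type": "expire"},
        },
        {
            "rulePriority": 2,
            "description": "Keep last 10 tagged images",
            "selection": {
                "tagStatus": "tagged",
                "tagPrefixList": ["v"],
                "countType": "imageCountMoreThan",
                "countNumber": 10,
            },
            "action": {"type": "expire"},
        },
    ]
}

policy_text = json.dumps(lifecycle_policy, indent=2)  # write this to policy.json
```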

Lambda Function Terraform

# terraform/lambda-container.tf

resource "aws_lambda_function" "app" {
  function_name = var.app_name
  role          = aws_iam_role.lambda.arn
  package_type  = "Image"

  # Image URI: repository_url:tag
  image_uri = "${aws_ecr_repository.app.repository_url}:${var.image_tag}"

  publish      = true  # Publish versions so an alias can pin one
  timeout      = 30
  memory_size  = 1024  # More memory = more CPU allocation
  architectures = ["arm64"]  # Graviton2: 20% cheaper, often faster

  # SnapStart (Java, Python, .NET) requires zip deployments and published
  # versions — it is not available for container images:
  # snap_start { apply_on = "PublishedVersions" }

  environment {
    variables = {
      NODE_ENV    = var.environment
      LOG_LEVEL   = "info"
    }
  }

  vpc_config {
    subnet_ids         = var.private_subnet_ids
    security_group_ids = [aws_security_group.lambda.id]
  }

  tracing_config {
    mode = "Active"  # X-Ray tracing
  }

  # Provisioned concurrency: pre-warm instances to eliminate cold starts
  # Use only for latency-critical functions — adds cost
  # provisioned_concurrency_config { ... }

  lifecycle {
    ignore_changes = [image_uri]  # CI/CD manages image updates
  }
}

# Static provisioned concurrency (pre-initialized instances, billed while allocated)
resource "aws_lambda_provisioned_concurrency_config" "app" {
  count                             = var.environment == "production" ? 1 : 0
  function_name                     = aws_lambda_function.app.function_name
  qualifier                         = aws_lambda_alias.live.name
  provisioned_concurrent_executions = 2  # Keep 2 instances warm
}

# Provisioned concurrency cannot target $LATEST — the alias must point at a
# published version (set publish = true on the function to create one)
resource "aws_lambda_alias" "live" {
  name             = "live"
  function_name    = aws_lambda_function.app.function_name
  function_version = aws_lambda_function.app.version
}

GitHub Actions: Build and Deploy

# .github/workflows/deploy-lambda.yml
name: Deploy Lambda Container

on:
  push:
    branches: [main]
    paths: ["functions/**", "Dockerfile"]

env:
  AWS_REGION: us-east-1
  ECR_REPOSITORY: your-function-name
  LAMBDA_FUNCTION: your-function-name

jobs:
  deploy:
    runs-on: ubuntu-latest
    permissions:
      id-token: write
      contents: read

    steps:
      - uses: actions/checkout@v4

      # Building linux/arm64 on the x86 runner needs QEMU emulation
      - name: Set up QEMU
        uses: docker/setup-qemu-action@v3

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ secrets.AWS_DEPLOY_ROLE_ARN }}
          aws-region: ${{ env.AWS_REGION }}

      - name: Login to ECR
        id: login-ecr
        uses: aws-actions/amazon-ecr-login@v2

      - name: Build and push image
        id: build
        env:
          ECR_REGISTRY: ${{ steps.login-ecr.outputs.registry }}
          IMAGE_TAG: ${{ github.sha }}
        run: |
          docker build \
            --platform linux/arm64 \
            --build-arg BUILD_SHA=$IMAGE_TAG \
            -t $ECR_REGISTRY/$ECR_REPOSITORY:$IMAGE_TAG \
            -t $ECR_REGISTRY/$ECR_REPOSITORY:latest \
            .
          docker push $ECR_REGISTRY/$ECR_REPOSITORY:$IMAGE_TAG
          docker push $ECR_REGISTRY/$ECR_REPOSITORY:latest
          echo "image=$ECR_REGISTRY/$ECR_REPOSITORY:$IMAGE_TAG" >> $GITHUB_OUTPUT

      - name: Update Lambda function
        run: |
          aws lambda update-function-code \
            --function-name ${{ env.LAMBDA_FUNCTION }} \
            --image-uri ${{ steps.build.outputs.image }} \
            --architectures arm64

          # Wait for update to complete
          aws lambda wait function-updated \
            --function-name ${{ env.LAMBDA_FUNCTION }}

          echo "Lambda updated successfully"

Cold Start Optimization

// Cold start best practices for Node.js Lambda containers

// 1. Initialize everything at module level (runs once, cached across invocations)
import { DynamoDB } from "@aws-sdk/client-dynamodb";
import { S3Client } from "@aws-sdk/client-s3";

// ✅ Module-level initialization — one cold start cost, reused across invocations
const dynamoDb = new DynamoDB({ region: process.env.AWS_REGION });
const s3 = new S3Client({ region: process.env.AWS_REGION });

// 2. Lazy database connections — don't connect until needed
// (connectToDatabase is a placeholder for your driver's connect call)
let dbConnection: any = null;

async function getDb() {
  if (!dbConnection) {
    dbConnection = await connectToDatabase();
  }
  return dbConnection;
}

// 3. Minimize import size — import only the commands you use
import { GetObjectCommand } from "@aws-sdk/client-s3"; // ✅ modular command import
// Modular @aws-sdk/client-* packages keep bundle size and init time small;
// avoid the monolithic v2 "aws-sdk" package in containers.

export const handler = async (event: any) => {
  const db = await getDb(); // Reused if warm
  // ...
};

Size Optimization

# Reduce image size:

# 1. Use alpine base where possible
FROM node:22-alpine AS builder

# 2. Skip dev dependencies (--only=production is deprecated; use --omit=dev)
RUN npm ci --omit=dev

# 3. Prune unnecessary files
RUN find node_modules -name "*.md" -delete && \
    find node_modules -name "*.d.ts" -delete && \
    (find node_modules -type d -name "test" -exec rm -rf {} + 2>/dev/null || true)

# 4. Use .dockerignore aggressively
# .dockerignore:
# node_modules
# .git
# *.test.ts
# coverage/
# .env*
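To confirm the pruning steps actually pay off, measure the tree before and after. A small helper sketch (the `dir_size` name is ours, not a standard API) that reports roughly what `du -sb` would:

```python
import os
import tempfile
from pathlib import Path

def dir_size(root: str) -> int:
    """Total bytes of regular files under root, skipping symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total

# Quick self-check on a throwaway tree simulating a prune of *.md files
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "README.md").write_bytes(b"x" * 100)
    Path(tmp, "index.js").write_bytes(b"y" * 50)
    before = dir_size(tmp)
    Path(tmp, "README.md").unlink()  # what `find -name "*.md" -delete` does
    after = dir_size(tmp)
```

Run it against `node_modules` before and after the prune `RUN` step to see the savings per rule.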

Cold Start Benchmarks

Package Type | Cold Start | Size
Zip (simple Node.js) | 200–400ms | <50MB
Container (Node.js, optimized) | 500–1,200ms | 100–300MB
Container (Python + pandas) | 1,500–3,000ms | 400–800MB
Provisioned concurrency | <100ms | N/A
SnapStart (Java/Python) | <1,000ms | N/A

For latency-critical container endpoints, two provisioned concurrency instances on a 1 GB function cost roughly $25/month running 24/7.
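The arithmetic behind that figure: provisioned concurrency is billed per GB-second while allocated. The rate below is an assumption based on published us-east-1 x86 pricing (arm64 is lower); check current pricing for your region:

```python
# Rough always-on provisioned concurrency cost, excluding per-invocation
# request and duration charges. Rate is an assumed us-east-1 x86 price.
RATE_PER_GB_SECOND = 0.0000041667   # USD per GB-second
SECONDS_PER_MONTH = 730 * 3600      # ~730 hours in an average month

def provisioned_cost(instances: int, memory_gb: float) -> float:
    """Monthly cost of keeping `instances` warm at `memory_gb` each, 24/7."""
    return instances * memory_gb * SECONDS_PER_MONTH * RATE_PER_GB_SECOND

monthly = provisioned_cost(instances=2, memory_gb=1.0)  # ≈ $21.90
```

That lands around $22 before invocation charges, consistent with the ~$25/month ballpark once request and duration costs are added.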

Working With Viprasol

Lambda container images unlock native binaries, large ML models, and custom runtimes that zip deployments can't support — but the cold start penalty needs careful management for production latency requirements. Our team has deployed Lambda containers for PDF generation (Puppeteer), image processing (Sharp), ML inference, and data pipeline functions with optimized Dockerfiles and provisioned concurrency where needed.

What we deliver:

  • Multi-stage Dockerfile for Node.js, Python, and custom runtimes
  • ECR repository with lifecycle policies and vulnerability scanning
  • Terraform: Lambda function, alias, and provisioned concurrency configuration
  • GitHub Actions pipeline: build for arm64, push to ECR, update Lambda
  • Cold start optimization: module-level init, connection reuse, size reduction

Talk to our team about your Lambda container architecture →

Or explore our cloud infrastructure services.

