
Serverless Cost Optimization: Lambda Cold Starts, Provisioned Concurrency, and Right-Sizing

Reduce AWS Lambda costs and eliminate cold starts — memory right-sizing, provisioned concurrency, ARM Graviton, Lambda layers, reserved concurrency, and when serverless costs more than EC2.

Viprasol Tech Team
May 2, 2026
12 min read


AWS Lambda pricing seems simple: pay per invocation and per GB-second of execution. In practice, serverless bills surprise teams constantly — either through unexpectedly high costs from inefficient functions or through cold start latency that degrades user experience.

This guide covers the techniques that cut Lambda costs 40–70% and eliminate cold start issues without abandoning serverless.


Lambda Pricing Basics (2026)

| Resource | Price |
|---|---|
| Requests | $0.20 per 1M requests |
| Duration (x86) | $0.0000166667 per GB-second |
| Duration (ARM/Graviton2) | $0.0000133334 per GB-second (20% cheaper) |
| Provisioned Concurrency | $0.0000041667 per GB-second (allocated) |
| Free tier | 1M requests + 400,000 GB-seconds per month |

Example cost: API handling 10M requests/month, 200ms avg duration, 512MB memory:

Requests: 10M × $0.20/1M = $2.00
Duration: 10M × 0.2s × 0.5GB × $0.0000166667 = $16.67
Total: ~$18.67/month

Same workload on ARM Graviton:

Duration: 10M × 0.2s × 0.5GB × $0.0000133334 = $13.33
Total: ~$15.33/month (18% cheaper, same compute)
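As a sanity check, the arithmetic above can be wrapped in a small helper (a sketch; the rate constants are the per-GB-second prices quoted in the table, and the free tier is deliberately ignored):

```python
# Published duration rates, per GB-second (see the pricing table above)
PRICE_X86 = 0.0000166667
PRICE_ARM = 0.0000133334
PRICE_PER_MILLION_REQUESTS = 0.20

def lambda_monthly_cost(invocations: int, avg_duration_s: float,
                        memory_gb: float, gb_second_price: float) -> float:
    """Estimate monthly Lambda cost (requests + duration), ignoring the free tier."""
    request_cost = invocations / 1_000_000 * PRICE_PER_MILLION_REQUESTS
    duration_cost = invocations * avg_duration_s * memory_gb * gb_second_price
    return request_cost + duration_cost

# 10M requests/month, 200ms average, 512MB
x86 = lambda_monthly_cost(10_000_000, 0.2, 0.5, PRICE_X86)  # ~$18.67
arm = lambda_monthly_cost(10_000_000, 0.2, 0.5, PRICE_ARM)  # ~$15.33
print(f"x86: ${x86:.2f}  arm64: ${arm:.2f}")
```

Plug in your own invocation counts and durations before committing to an architecture switch.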

Memory Right-Sizing

Lambda charges for memory × duration. The counterintuitive finding: more memory often costs less, because higher memory = more CPU = faster execution.

# benchmark_lambda.py — test your function at different memory settings
# Prefer AWS Lambda Power Tuning (a Step Functions state machine) for this:
# https://github.com/alexcasalboni/aws-lambda-power-tuning

import base64
import json
import re

import boto3

lambda_client = boto3.client('lambda', region_name='us-east-1')

def benchmark_memory(function_name: str, test_payload: dict, memory_sizes: list[int]):
    results = []

    for memory_mb in memory_sizes:
        # Update function memory and wait until the new config is active
        lambda_client.update_function_configuration(
            FunctionName=function_name,
            MemorySize=memory_mb,
        )
        lambda_client.get_waiter('function_updated').wait(FunctionName=function_name)

        # Run multiple invocations and average the billed duration
        durations = []
        for _ in range(10):
            response = lambda_client.invoke(
                FunctionName=function_name,
                Payload=json.dumps(test_payload),
                LogType='Tail',  # Include the log tail (base64) in the response
            )
            log = base64.b64decode(response['LogResult']).decode('utf-8')
            # Parse the REPORT line from the Lambda log:
            # REPORT RequestId: ... Duration: 45.23 ms Billed Duration: 46 ms ...
            match = re.search(r'Billed Duration:\s*([\d.]+)\s*ms', log)
            if match:
                durations.append(float(match.group(1)))

        avg_duration_ms = sum(durations) / len(durations)
        gb_seconds = (memory_mb / 1024) * (avg_duration_ms / 1000)
        cost_per_million = gb_seconds * 0.0000166667 * 1_000_000

        results.append({
            'memory_mb': memory_mb,
            'avg_duration_ms': avg_duration_ms,
            'cost_per_million_invocations': cost_per_million,
        })
        print(f"Memory: {memory_mb}MB | Duration: {avg_duration_ms:.1f}ms | Cost/1M: ${cost_per_million:.2f}")

    return results

Use AWS Lambda Power Tuning — it automates this benchmark across memory settings and produces a cost/performance graph. Most teams find their sweet spot is 512MB–1024MB for Node.js/Python, 1024MB–2048MB for JVM-based functions.
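Once you have benchmark results, picking a configuration is one line: the cheapest memory setting that still meets your latency budget. A sketch operating on the `results` list shape produced by the benchmark above (the example numbers are made up but internally consistent with the x86 rate):

```python
def cheapest_within_sla(results: list[dict], max_duration_ms: float) -> dict:
    """Cheapest memory setting whose average duration meets the latency budget."""
    candidates = [r for r in results if r['avg_duration_ms'] <= max_duration_ms]
    if not candidates:
        raise ValueError(f"No configuration meets the {max_duration_ms}ms budget")
    return min(candidates, key=lambda r: r['cost_per_million_invocations'])

# Example with hypothetical benchmark numbers:
results = [
    {'memory_mb': 256,  'avg_duration_ms': 420.0, 'cost_per_million_invocations': 1.75},
    {'memory_mb': 512,  'avg_duration_ms': 180.0, 'cost_per_million_invocations': 1.50},
    {'memory_mb': 1024, 'avg_duration_ms': 110.0, 'cost_per_million_invocations': 1.83},
]
best = cheapest_within_sla(results, max_duration_ms=200)
print(best)  # picks the 512MB row: faster than 256MB and cheaper than 1024MB
```

Note how the 256MB row is both slower and more expensive than 512MB, which is exactly the counterintuitive effect described above.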


☁️ Is Your Cloud Costing Too Much?

Most teams overspend 30–40% on cloud — wrong instance types, no reserved pricing, bloated storage. We audit, right-size, and automate your infrastructure.

  • AWS, GCP, Azure certified engineers
  • Infrastructure as Code (Terraform, CDK)
  • Docker, Kubernetes, GitHub Actions CI/CD
  • Typical audit recovers $500–$3,000/month in savings

Cold Starts: Root Causes and Solutions

A cold start happens when Lambda needs to initialize a new execution environment — download your code, start the runtime, run initialization code. This adds 100ms–5s of latency on top of your function's actual execution time.

Cold start latency by runtime (typical):

| Runtime | Cold Start | Warm Execution |
|---|---|---|
| Node.js 20 | 150–400ms | 5–50ms |
| Python 3.12 | 100–300ms | 5–30ms |
| Go 1.21 | 50–150ms | 1–10ms |
| Java 21 (with SnapStart) | 500ms–1s → ~100ms | 10–100ms |
| Java 21 (without SnapStart) | 3–10s | 10–100ms |
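Before optimizing, measure: module-level code runs once per execution environment, so a module-level flag tells you exactly which invocations hit a cold start. A minimal sketch (the `cold_start` and `env_age_s` response fields are our own convention, not a Lambda API):

```python
import time

# Module scope runs once per execution environment — i.e. only on cold start
_INIT_TIME = time.monotonic()
_is_cold = True

def handler(event, context):
    global _is_cold
    cold = _is_cold
    _is_cold = False  # Every later invocation in this environment is warm
    # In a real function, emit these as structured-log fields or metric dimensions
    return {
        'cold_start': cold,
        'env_age_s': round(time.monotonic() - _INIT_TIME, 1),
    }
```

Graph the `cold_start` rate before and after each change below; it tells you which optimization actually paid off.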

Solutions by approach:

1. Reduce Package Size

Smaller deployment packages initialize faster. The cold start is partly I/O — loading your code from S3.

# Audit your bundle
npx source-map-explorer dist/function.js

# Common wins:
# - Use bundler (esbuild/webpack) instead of deploying node_modules/
# - Tree-shake unused imports
# - Move large static assets to S3 (not the Lambda package)
# - Use Lambda Layers for shared dependencies

# Target: < 5MB for Node.js, < 50MB zipped total

// esbuild.config.ts — bundle to a single file
import { build } from 'esbuild';

await build({
  entryPoints: ['src/handler.ts'],
  bundle: true,
  platform: 'node',
  target: 'node20',
  outfile: 'dist/handler.js',
  external: [
    // Don't bundle AWS SDK v3 (available in Lambda runtime)
    '@aws-sdk/*',
  ],
  minify: true,
  sourcemap: 'external',
});

2. Move Heavy Init Outside the Handler

// ❌ Bad: a new DB connection created and torn down on every single invocation
export const handler = async (event: APIGatewayEvent) => {
  const db = new Pool({ connectionString: process.env.DATABASE_URL });
  const result = await db.query('SELECT * FROM users WHERE id = $1', [event.pathParameters?.id]);
  await db.end();
  return { statusCode: 200, body: JSON.stringify(result.rows[0]) };
};

// ✅ Good: DB connection created once, reused across invocations
import { Pool } from 'pg';

// Module-level initialization — runs once per execution environment
const db = new Pool({
  connectionString: process.env.DATABASE_URL,
  max: 2,  // Lambda: keep pool small (1-2 connections per function)
});

export const handler = async (event: APIGatewayEvent) => {
  const result = await db.query('SELECT * FROM users WHERE id = $1', [event.pathParameters?.id]);
  return { statusCode: 200, body: JSON.stringify(result.rows[0]) };
};

3. Provisioned Concurrency

For latency-sensitive functions (user-facing APIs), pre-warm a fixed number of execution environments:

# terraform/lambda.tf
resource "aws_lambda_function" "api" {
  function_name = "api-handler"
  runtime       = "nodejs20.x"
  architectures = ["arm64"]  # Graviton — 20% cheaper
  memory_size   = 512
  timeout       = 30

  # ... rest of config
}

# Provisioned concurrency — keeps N environments warm
resource "aws_lambda_provisioned_concurrency_config" "api" {
  function_name                  = aws_lambda_function.api.function_name
  qualifier                      = aws_lambda_alias.api_live.name
  provisioned_concurrent_executions = 5  # 5 warm environments
}

# Auto-scale provisioned concurrency with traffic patterns
resource "aws_appautoscaling_target" "lambda_concurrency" {
  max_capacity       = 50
  min_capacity       = 2
  resource_id        = "function:${aws_lambda_function.api.function_name}:live"
  scalable_dimension = "lambda:function:ProvisionedConcurrency"
  service_namespace  = "lambda"
}

resource "aws_appautoscaling_policy" "lambda_concurrency" {
  name               = "lambda-target-tracking"
  policy_type        = "TargetTrackingScaling"
  resource_id        = aws_appautoscaling_target.lambda_concurrency.resource_id
  scalable_dimension = aws_appautoscaling_target.lambda_concurrency.scalable_dimension
  service_namespace  = aws_appautoscaling_target.lambda_concurrency.service_namespace

  target_tracking_scaling_policy_configuration {
    predefined_metric_specification {
      predefined_metric_type = "LambdaProvisionedConcurrencyUtilization"
    }
    target_value = 0.7  # Scale up when 70% of provisioned capacity is in use
  }
}

Provisioned concurrency cost: ~$0.0000041667/GB-sec allocated (not invoked) — about 25% of execution cost. For 5 × 512MB functions running 24/7: 5 × 0.5GB × 86400s × $0.0000041667 = $0.90/day = $27/month. Worth it if cold starts cause user-facing latency.
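The same arithmetic is worth running for your own function count and memory size before enabling it everywhere (a sketch using the allocated-GB-second rate from the pricing table; invocation duration is still billed separately, at a reduced rate):

```python
PC_PRICE_PER_GB_S = 0.0000041667  # Provisioned concurrency, per GB-second allocated

def provisioned_concurrency_monthly(envs: int, memory_gb: float,
                                    hours_per_day: float = 24.0,
                                    days: int = 30) -> float:
    """Monthly allocation cost of keeping `envs` execution environments warm."""
    seconds = hours_per_day * 3600 * days
    return envs * memory_gb * seconds * PC_PRICE_PER_GB_S

# 5 warm 512MB environments, 24/7 — matches the ~$27/month figure above
print(f"${provisioned_concurrency_monthly(5, 0.5):.2f}/month")
# Scheduled scaling: only keep them warm 12h/day and the cost halves
print(f"${provisioned_concurrency_monthly(5, 0.5, hours_per_day=12):.2f}/month")
```

Combining provisioned concurrency with a schedule (scale to zero overnight) is often the cheapest way to get warm starts during business hours.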

4. Java SnapStart

For Java Lambda functions, SnapStart takes a snapshot of the initialized JVM state and restores it on cold start — reducing cold start from 3–10 seconds to ~100ms:

# AWS SAM template
Resources:
  JavaApiFunction:
    Type: AWS::Serverless::Function
    Properties:
      Runtime: java21
      SnapStart:
        ApplyOn: PublishedVersions
      # ... rest of config

Lambda Layers: Shared Dependencies

Lambda Layers let you share code and dependencies across functions without including them in every deployment package:

# Create a layer with shared dependencies
mkdir -p layer/nodejs
cd layer/nodejs
npm install pg ioredis zod  # Shared dependencies
cd ..
zip -r layer.zip nodejs/

aws lambda publish-layer-version \
  --layer-name shared-deps \
  --zip-file fileb://layer.zip \
  --compatible-runtimes nodejs20.x \
  --compatible-architectures arm64

# Attach the layer to functions (Terraform)
resource "aws_lambda_function" "api" {
  layers = [
    aws_lambda_layer_version.shared_deps.arn,
    "arn:aws:lambda:us-east-1:580247275435:layer:LambdaInsightsExtension-Arm64:20",
  ]
}

⚙️ DevOps Done Right — Zero Downtime, Full Automation

Ship faster without breaking things. We build CI/CD pipelines, monitoring stacks, and auto-scaling infrastructure that your team can actually maintain.

  • Staging + production environments with feature flags
  • Automated security scanning in the pipeline
  • Uptime monitoring + alerting + runbook automation
  • On-call support handover docs included

When Serverless Costs More Than EC2

Lambda is economical for spiky, unpredictable traffic. At sustained high volume, EC2 or ECS Fargate can be cheaper:

| Monthly Invocations | Lambda Cost | ECS on EC2 (t3.small) | Winner |
|---|---|---|---|
| 1M (spiky) | ~$2 | $15–20 | Lambda |
| 10M | ~$20 | $15–20 | Tie |
| 50M | ~$100 | $15–20 | ECS |
| 500M | ~$1,000 | $50–100 | ECS |

Rule of thumb: if your Lambda functions run > 50% of the time (sustained load), containerized compute is cheaper. Lambda's value is elasticity — scaling to zero and scaling to thousands of concurrent executions without pre-provisioning.
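The crossover in the table falls out of a simple comparison (a sketch; the $17.50/month container figure is just the midpoint of the t3.small range above, so treat the result as an order-of-magnitude estimate):

```python
PRICE_PER_GB_S = 0.0000166667          # x86 duration rate
PRICE_PER_MILLION_REQUESTS = 0.20

def lambda_cost(invocations: float, avg_duration_s: float, memory_gb: float) -> float:
    """Monthly Lambda cost for a given invocation volume (free tier ignored)."""
    return (invocations / 1_000_000 * PRICE_PER_MILLION_REQUESTS
            + invocations * avg_duration_s * memory_gb * PRICE_PER_GB_S)

def breakeven_invocations(container_monthly: float, avg_duration_s: float,
                          memory_gb: float) -> float:
    """Monthly invocations at which Lambda cost equals a fixed container bill."""
    per_invocation = (PRICE_PER_MILLION_REQUESTS / 1_000_000
                      + avg_duration_s * memory_gb * PRICE_PER_GB_S)
    return container_monthly / per_invocation

# 200ms @ 512MB vs a ~$17.50/month container
print(f"{breakeven_invocations(17.50, 0.2, 0.5) / 1e6:.1f}M invocations/month")
# ≈ 9.4M — consistent with the "Tie" row around 10M in the table
```

Above that volume, every additional invocation makes the fixed-price container look better; below it, Lambda's scale-to-zero wins.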


Cost Monitoring

# Get Lambda cost breakdown per function from AWS Cost Explorer
import boto3

ce = boto3.client('ce', region_name='us-east-1')

response = ce.get_cost_and_usage(
    TimePeriod={'Start': '2026-04-01', 'End': '2026-05-01'},
    Granularity='MONTHLY',
    Filter={
        'Dimensions': {
            'Key': 'SERVICE',
            'Values': ['AWS Lambda'],
        }
    },
    GroupBy=[{'Type': 'DIMENSION', 'Key': 'OPERATION'}],
    Metrics=['BlendedCost'],
)

for result in response['ResultsByTime']:
    for group in result['Groups']:
        operation = group['Keys'][0]
        cost = group['Metrics']['BlendedCost']['Amount']
        print(f"{operation}: ${float(cost):.2f}")

Set AWS Cost Anomaly Detection alerts on Lambda — unexpected cost spikes often indicate runaway recursion or misconfigured event triggers.


Working With Viprasol

We audit and optimize serverless architectures — identifying memory sizing opportunities, implementing provisioned concurrency for latency-sensitive paths, migrating high-volume workloads to more cost-effective compute, and setting up cost monitoring and alerting.

Talk to our cloud team about serverless cost optimization.



About the Author


Viprasol Tech Team

Custom Software Development Specialists

The Viprasol Tech team specialises in algorithmic trading software, AI agent systems, and SaaS development. With 100+ projects delivered across MT4/MT5 EAs, fintech platforms, and production AI systems, the team brings deep technical experience to every engagement. Based in India, serving clients globally.

MT4/MT5 EA Development · AI Agent Systems · SaaS Development · Algorithmic Trading
