
Serverless Cost Optimization: Lambda Cold Starts


Viprasol Tech Team
12 min read
Updated 2026

Serverless Cost Optimization: Lambda Cold Starts, Provisioned Concurrency, and Right-Sizing

Quick answer. Cut Lambda costs 40-70% by switching to ARM/Graviton2 (20% cheaper at $0.0000133334 per GB-second), right-sizing memory, and trimming execution time. Eliminate cold starts with provisioned concurrency for latency-sensitive functions. Lambda bills $0.20 per million requests plus duration, with 1M requests and 400,000 GB-seconds free monthly.

AWS Lambda pricing seems simple: pay per invocation and per GB-second of execution. In practice, serverless bills surprise teams constantly — either through unexpectedly high costs from inefficient functions or through cold start latency that degrades user experience.

This guide covers the techniques that cut Lambda costs 40–70% and eliminate cold start issues without abandoning serverless.


Lambda Pricing Basics (2026)

| Resource | Price |
| --- | --- |
| Requests | $0.20 per 1M requests |
| Duration (x86) | $0.0000166667 per GB-second |
| Duration (ARM/Graviton2) | $0.0000133334 per GB-second (20% cheaper) |
| Provisioned Concurrency | $0.0000041667 per GB-second (allocated) |
| Free tier | 1M requests + 400,000 GB-seconds per month |

Example cost: API handling 10M requests/month, 200ms avg duration, 512MB memory:

Requests: 10M × $0.20/1M = $2.00
Duration: 10M × 0.2s × 0.5GB × $0.0000166667 = $16.67
Total: ~$18.67/month

Same workload on ARM Graviton:

Duration: 10M × 0.2s × 0.5GB × $0.0000133334 = $13.33
Total: ~$15.33/month (18% cheaper, same compute)
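The arithmetic above can be wrapped in a small helper so you can plug in your own traffic numbers. This is a minimal sketch that uses the prices from the table in this guide and ignores the free tier; the function name `monthly_lambda_cost` and its parameters are illustrative, not part of any AWS SDK.

```python
# Prices copied from the pricing table in this guide (USD).
PRICE_PER_REQUEST = 0.20 / 1_000_000
PRICE_PER_GB_SECOND = {"x86": 0.0000166667, "arm": 0.0000133334}


def monthly_lambda_cost(requests: int, avg_duration_s: float,
                        memory_mb: int, arch: str = "x86") -> float:
    """Estimate monthly on-demand cost in USD, ignoring the free tier."""
    gb_seconds = requests * avg_duration_s * (memory_mb / 1024)
    request_cost = requests * PRICE_PER_REQUEST
    duration_cost = gb_seconds * PRICE_PER_GB_SECOND[arch]
    return request_cost + duration_cost


if __name__ == "__main__":
    # Same workload as the example: 10M requests, 200ms, 512MB
    for arch in ("x86", "arm"):
        cost = monthly_lambda_cost(10_000_000, 0.2, 512, arch)
        print(f"{arch}: ${cost:.2f}/month")
```

Run against the example workload, this reproduces the $18.67 (x86) and $15.33 (ARM) totals shown above.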

Memory Right-Sizing

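Lambda charges for memory × duration. The counterintuitive finding: more memory often costs less. Lambda allocates CPU in proportion to memory, so a higher memory setting usually finishes faster, and the shorter billed duration can outweigh the higher per-second rate.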

# benchmark_lambda.py — test your function at different memory settings
# Deploy with AWS Lambda Power Tuning (Step Functions state machine)
# https://github.com/alexcasalboni/aws-lambda-power-tuning

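import base64
import json
import re

import boto3

lambda_client = boto3.client('lambda', region_name='us-east-1')

# REPORT RequestId: ... Duration: 45.23 ms Billed Duration: 46 ms ...
BILLED_DURATION_RE = re.compile(r'Billed Duration: ([\d.]+) ms')

def benchmark_memory(function_name: str, test_payload: dict, memory_sizes: list[int]):
    results = []

    for memory_mb in memory_sizes:
        # Update function memory
        lambda_client.update_function_configuration(
            FunctionName=function_name,
            MemorySize=memory_mb,
        )
        # Block until the configuration update has finished applying
        lambda_client.get_waiter('function_updated').wait(FunctionName=function_name)

        # Run multiple invocations and average.
        # Note: the first call after a config change is a cold start.
        durations = []
        for _ in range(10):
            response = lambda_client.invoke(
                FunctionName=function_name,
                Payload=json.dumps(test_payload),
                LogType='Tail',  # Return the last 4 KB of logs, base64-encoded
            )
            log = base64.b64decode(response.get('LogResult', '')).decode('utf-8')
            # Parse the billed duration from the REPORT line in the Lambda logs
            match = BILLED_DURATION_RE.search(log)
            if match:
                durations.append(float(match.group(1)))

        if not durations:
            raise RuntimeError(f"No REPORT line found for {function_name} at {memory_mb}MB")
        avg_duration_ms = sum(durations) / len(durations)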
        gb_seconds = (memory_mb / 1024) * (avg_duration_ms / 1000)
        cost_per_million = gb_seconds * 0.0000166667 * 1_000_000

        results.append({
            'memory_mb': memory_mb,
            'avg_duration_ms': avg_duration_ms,
            'cost_per_million_invocations': cost_per_million,
        })
        print(f"Memory: {memory_mb}MB | Duration: {avg_duration_ms:.1f}ms | Cost/1M: ${cost_per_million:.2f}")

    return results

Use AWS Lambda Power Tuning — it automates this benchmark across memory settings and produces a cost/performance graph. Most teams find their sweet spot is 512MB–1024MB for Node.js/Python, 1024MB–2048MB for JVM-based functions.
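Once you have benchmark results in the shape returned by `benchmark_memory`, choosing a setting is a small selection problem: take the cheapest configuration that still meets your latency target. The sketch below assumes that result shape; the function name `pick_memory_setting`, the `max_duration_ms` parameter, and the sample numbers are made up for illustration, not measurements.

```python
from typing import Optional


def pick_memory_setting(results: list, max_duration_ms: Optional[float] = None) -> dict:
    """Return the cheapest benchmark result, optionally within a latency budget."""
    candidates = [
        r for r in results
        if max_duration_ms is None or r["avg_duration_ms"] <= max_duration_ms
    ]
    if not candidates:
        raise ValueError("No memory setting meets the latency budget")
    return min(candidates, key=lambda r: r["cost_per_million_invocations"])


# Hypothetical results (x86 rate): cost = GB × seconds × $0.0000166667 × 1M
sample_results = [
    {"memory_mb": 128, "avg_duration_ms": 800.0, "cost_per_million_invocations": 1.67},
    {"memory_mb": 512, "avg_duration_ms": 180.0, "cost_per_million_invocations": 1.50},
    {"memory_mb": 1024, "avg_duration_ms": 95.0, "cost_per_million_invocations": 1.58},
    {"memory_mb": 2048, "avg_duration_ms": 90.0, "cost_per_million_invocations": 3.00},
]

cheapest = pick_memory_setting(sample_results)
fast_enough = pick_memory_setting(sample_results, max_duration_ms=100)
```

In this made-up data, 512MB is cheapest overall, but a 100ms latency budget pushes the choice to 1024MB for a few cents more per million invocations.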


Cold Starts: Root Causes and Solutions

A cold start happens when Lambda needs to initialize a new execution environment — download your code, start the runtime, run initialization code. This adds 100ms–5s of latency on top of your function's actual execution time.

Cold start latency by runtime (typical):

| Runtime | Cold Start | Warm Execution |
| --- | --- | --- |
| Node.js 20 | 150–400ms | 5–50ms |
| Python 3.12 | 100–300ms | 5–30ms |
| Go 1.21 | 50–150ms | 1–10ms |
| Java 21 (with SnapStart) | 500ms–1s → ~100ms | 10–100ms |
| Java 21 (without SnapStart) | 3–10s | 10–100ms |
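Before optimizing, it helps to know how often cold starts actually happen. Because module-level code runs once per execution environment, a module-level flag can mark the first invocation in each environment. This is a minimal sketch; the handler name and the `coldStart` field in the response are illustrative choices, not a Lambda convention.

```python
# Module scope runs once per execution environment, i.e. once per cold start.
_is_cold_start = True


def handler(event, context):
    """Report whether this invocation was the first in its environment."""
    global _is_cold_start
    cold = _is_cold_start
    _is_cold_start = False  # Every later invocation in this environment is warm

    # In a real function you would log this or emit it as a metric
    return {"statusCode": 200, "coldStart": cold}
```

Logging the flag lets you count cold starts per function and decide whether provisioned concurrency is worth paying for.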

Solutions by approach:

1. Reduce Package Size

Smaller deployment packages initialize faster. The cold start is partly I/O — loading your code from S3.

# Audit your bundle
npx source-map-explorer dist/function.js

# Common wins:
# - Use bundler (esbuild/webpack) instead of deploying node_modules/
# - Tree-shake unused imports
# - Move large static assets to S3 (not the Lambda package)
# - Use Lambda Layers for shared dependencies

# Target: < 5MB for Node.js, < 50MB zipped total

// esbuild.config.ts: bundle to a single file
import { build } from 'esbuild';

await build({
  entryPoints: ['src/handler.ts'],
  bundle: true,
  platform: 'node',
  target: 'node20',
  outfile: 'dist/handler.js',
  external: [
    // Don't bundle AWS SDK v3 (available in Lambda runtime)
    '@aws-sdk/*',
  ],
  minify: true,
  sourcemap: 'external',
});

2. Move Heavy Init Outside the Handler

// ❌ Bad: a new DB pool is created and torn down on every invocation, warm or cold
export const handler = async (event: APIGatewayEvent) => {
  const db = new Pool({ connectionString: process.env.DATABASE_URL });
  const result = await db.query('SELECT * FROM users WHERE id = $1', [event.pathParameters?.id]);
  await db.end();
  return { statusCode: 200, body: JSON.stringify(result.rows[0]) };
};

// ✅ Good: DB connection created once, reused across invocations
import type { APIGatewayEvent } from 'aws-lambda';
import { Pool } from 'pg';

// Module-level initialization — runs once per execution environment
const db = new Pool({
  connectionString: process.env.DATABASE_URL,
  max: 2,  // Lambda: keep pool small (1-2 connections per function)
});

export const handler = async (event: APIGatewayEvent) => {
  const result = await db.query('SELECT * FROM users WHERE id = $1', [event.pathParameters?.id]);
  return { statusCode: 200, body: JSON.stringify(result.rows[0]) };
};
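The same idea applies in Python. A closely related variant is lazy initialization: build the expensive client on first use and cache it for the life of the execution environment. The sketch below uses `functools.lru_cache` for the caching; `FakeDbClient` is a stand-in invented for this example so the code runs without a database, and the event shape is assumed to match an API Gateway proxy event.

```python
import functools


class FakeDbClient:
    """Stand-in for a real database client; counts how often it is built."""

    instances_created = 0

    def __init__(self) -> None:
        FakeDbClient.instances_created += 1

    def query(self, user_id: str) -> dict:
        return {"id": user_id}


@functools.lru_cache(maxsize=1)
def get_db() -> FakeDbClient:
    # Built on the first call only; later calls return the cached instance
    return FakeDbClient()


def handler(event, context):
    user_id = event["pathParameters"]["id"]
    return {"statusCode": 200, "body": get_db().query(user_id)}
```

Lazy initialization keeps the init phase short for code paths that never touch the database, at the cost of the first request that does paying the connection setup.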

3. Provisioned Concurrency

For latency-sensitive functions (user-facing APIs), pre-warm a fixed number of execution environments:

# terraform/lambda.tf
resource "aws_lambda_function" "api" {
  function_name = "api-handler"
  runtime       = "nodejs20.x"
  architectures = ["arm64"]  # Graviton, 20% cheaper
  memory_size   = 512
  timeout       = 30
  publish       = true  # Provisioned concurrency needs a published version

  # ... rest of config
}

# Alias that provisioned concurrency and auto-scaling attach to
resource "aws_lambda_alias" "api_live" {
  name             = "live"
  function_name    = aws_lambda_function.api.function_name
  function_version = aws_lambda_function.api.version
}

# Provisioned concurrency: keeps N environments warm
resource "aws_lambda_provisioned_concurrency_config" "api" {
  function_name                     = aws_lambda_function.api.function_name
  qualifier                         = aws_lambda_alias.api_live.name
  provisioned_concurrent_executions = 5  # 5 warm environments
}

# Auto-scale provisioned concurrency with traffic patterns
resource "aws_appautoscaling_target" "lambda_concurrency" {
  max_capacity       = 50
  min_capacity       = 2
  resource_id        = "function:${aws_lambda_function.api.function_name}:${aws_lambda_alias.api_live.name}"
  scalable_dimension = "lambda:function:ProvisionedConcurrency"
  service_namespace  = "lambda"
}

resource "aws_appautoscaling_policy" "lambda_concurrency" {
  name               = "lambda-target-tracking"
  policy_type        = "TargetTrackingScaling"
  resource_id        = aws_appautoscaling_target.lambda_concurrency.resource_id
  scalable_dimension = aws_appautoscaling_target.lambda_concurrency.scalable_dimension
  service_namespace  = aws_appautoscaling_target.lambda_concurrency.service_namespace

  target_tracking_scaling_policy_configuration {
    predefined_metric_specification {
      predefined_metric_type = "LambdaProvisionedConcurrencyUtilization"
    }
    target_value = 0.7  # Scale up when 70% of provisioned capacity is in use
  }
}

Provisioned concurrency costs about $0.0000041667 per GB-second for as long as it is allocated, whether or not the function is invoked. That is about 25% of the on-demand duration rate. For 5 × 512MB environments kept warm 24/7: 5 × 0.5GB × 86400s × $0.0000041667 = $0.90/day = $27/month. Worth it if cold starts cause user-facing latency.
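To size this for your own function, the allocation charge can be computed directly. This is a rough sketch using the rate quoted in this guide; it covers only the charge for keeping environments allocated, not request or duration charges, and the function name and parameters are illustrative.

```python
# Allocation rate quoted in this guide (USD per GB-second)
PROVISIONED_PRICE_PER_GB_SECOND = 0.0000041667


def provisioned_concurrency_cost(environments: int, memory_mb: int,
                                 hours: float = 24.0) -> float:
    """Cost in USD of keeping `environments` warm for `hours`."""
    gb_seconds = environments * (memory_mb / 1024) * hours * 3600
    return gb_seconds * PROVISIONED_PRICE_PER_GB_SECOND


daily = provisioned_concurrency_cost(5, 512)               # 24 hours
monthly = provisioned_concurrency_cost(5, 512, 24 * 30)    # 30-day month
```

If traffic is concentrated in business hours, passing a smaller `hours` value shows how much scheduled scaling of provisioned concurrency could save.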

4. Java SnapStart

For Java Lambda functions, SnapStart takes a snapshot of the initialized JVM state and restores it on cold start — reducing cold start from 3–10 seconds to ~100ms:

# AWS SAM template
Resources:
  JavaApiFunction:
    Type: AWS::Serverless::Function
    Properties:
      Runtime: java21
      SnapStart:
        ApplyOn: PublishedVersions
      # SnapStart only applies to published versions, so publish one on deploy
      AutoPublishAlias: live
      # ... rest of config

Lambda Layers: Shared Dependencies

Lambda Layers let you share code and dependencies across functions without including them in every deployment package:

# Create a layer with shared dependencies
mkdir -p layer/nodejs
cd layer/nodejs
npm install pg ioredis zod  # Shared dependencies
cd ..
zip -r layer.zip nodejs/

aws lambda publish-layer-version \
  --layer-name shared-deps \
  --zip-file fileb://layer.zip \
  --compatible-runtimes nodejs20.x \
  --compatible-architectures arm64

# Attach layer to functions (Terraform)
resource "aws_lambda_function" "api" {
  layers = [
    aws_lambda_layer_version.shared_deps.arn,
    "arn:aws:lambda:us-east-1:580247275435:layer:LambdaInsightsExtension-Arm64:20",
  ]
}

When Serverless Costs More Than EC2

Lambda is economical for spiky, unpredictable traffic. At sustained high volume, EC2 or ECS Fargate can be cheaper:

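| Monthly Invocations | Lambda Cost | ECS Fargate (t3.small-equivalent) | Winner |
| --- | --- | --- | --- |
| 1M (spiky) | ~$2 | $15–20 | Lambda |
| 10M | ~$20 | $15–20 | Tie |
| 50M | ~$100 | $15–20 | ECS |
| 500M | ~$1,000 | $50–100 | ECS |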

Rule of thumb: if your Lambda functions run > 50% of the time (sustained load), containerized compute is cheaper. Lambda's value is elasticity — scaling to zero and scaling to thousands of concurrent executions without pre-provisioning.
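The crossover point in the table can be estimated by dividing a fixed monthly compute cost by Lambda's cost per invocation. The sketch below assumes the x86 rates from this guide, ignores the free tier, and assumes the fixed-cost option can actually handle the load; the function name and the $20 figure (the upper end of the container cost in the table) are illustrative.

```python
PRICE_PER_REQUEST = 0.20 / 1_000_000
X86_PRICE_PER_GB_SECOND = 0.0000166667


def break_even_invocations(fixed_monthly_cost: float, avg_duration_s: float,
                           memory_mb: int) -> float:
    """Monthly invocations at which Lambda costs as much as a fixed-price option."""
    cost_per_invocation = (
        PRICE_PER_REQUEST
        + avg_duration_s * (memory_mb / 1024) * X86_PRICE_PER_GB_SECOND
    )
    return fixed_monthly_cost / cost_per_invocation


# 200ms at 512MB versus a $20/month container
crossover = break_even_invocations(20.0, 0.2, 512)
print(f"Break-even at about {crossover / 1_000_000:.1f}M invocations/month")
```

For the 200ms, 512MB workload used throughout this guide, the crossover lands a little above 10M invocations per month, which is consistent with the "Tie" row in the table.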


Cost Monitoring

# Get Lambda cost breakdown by operation from AWS Cost Explorer
# (per-function costs require cost allocation tags on each function)
import boto3

ce = boto3.client('ce', region_name='us-east-1')

response = ce.get_cost_and_usage(
    TimePeriod={'Start': '2026-04-01', 'End': '2026-05-01'},
    Granularity='MONTHLY',
    Filter={
        'Dimensions': {
            'Key': 'SERVICE',
            'Values': ['AWS Lambda'],
        }
    },
    GroupBy=[{'Type': 'DIMENSION', 'Key': 'OPERATION'}],
    Metrics=['BlendedCost'],
)

for result in response['ResultsByTime']:
    for group in result['Groups']:
        operation = group['Keys'][0]
        cost = group['Metrics']['BlendedCost']['Amount']
        print(f"{operation}: ${float(cost):.2f}")

Set AWS Cost Anomaly Detection alerts on Lambda — unexpected cost spikes often indicate runaway recursion or misconfigured event triggers.



Related Topics

Understanding Provisioned Concurrency Pricing ($0.0000041667 per GB-second)

The $0.0000041667 figure is the per-second rate charged for each GB of memory you keep warm. It applies from the moment provisioned concurrency is enabled and continues whether or not requests arrive, which is what eliminates cold starts but adds a steady baseline cost. To estimate spend, multiply your function's allocated memory in GB by the seconds it stays provisioned by the number of concurrent instances. Add the standard request and duration charges to see the full picture, then compare the result against on-demand execution with right-sized memory so you only pay for warmth where latency genuinely matters.
