
Cloud Cost Engineering: Rightsizing, Reserved Instances, Spot Fleets, and Savings Plans

Cut AWS cloud costs 40–70% with systematic rightsizing, Compute Savings Plans, Spot Fleet strategies, container cost allocation, and FinOps practices that scale with your organization.

Viprasol Tech Team
September 8, 2026
14 min read

The average company wastes 32% of its cloud spend, according to Flexera's 2025 State of the Cloud Report. On a $500K/year AWS bill, that's $160K in waste — more than a mid-level engineer's salary.

Cloud cost engineering is not about cutting corners. It's about paying the right price for the capacity you actually use, and using the financial instruments AWS provides to reduce that price by 40–70%.


The Cost Reduction Stack

Work through these in order — each layer builds on the previous:

Layer 5: FinOps culture (showback, chargeback, unit economics)
Layer 4: Architecture optimization (caching, CDN, right services)
Layer 3: Commitment discounts (Savings Plans, Reserved Instances)
Layer 2: Rightsizing (match instance size to actual usage)
Layer 1: Waste elimination (idle resources, orphaned volumes, unused IPs)

Most teams skip to Layer 3 (commitments) without doing Layers 1–2, which means they're committing to the wrong amount of the wrong resource types.


Layer 1: Waste Elimination

Start with resources that are running and doing nothing:

#!/bin/bash
# scripts/find-waste.sh
# Find common categories of wasted AWS spend

echo "=== Unattached EBS Volumes ==="
aws ec2 describe-volumes \
  --filters Name=status,Values=available \
  --query 'Volumes[*].[VolumeId,Size,CreateTime,AvailabilityZone]' \
  --output table

echo ""
echo "=== Idle Load Balancers (0 healthy targets) ==="
aws elbv2 describe-load-balancers --query 'LoadBalancers[*].LoadBalancerArn' --output text | \
  tr '\t' '\n' | \
  while read -r LB_ARN; do
    HEALTHY=0
    # A load balancer ARN is not a target group ARN: look up its target groups first
    for TG_ARN in $(aws elbv2 describe-target-groups --load-balancer-arn "$LB_ARN" --query 'TargetGroups[*].TargetGroupArn' --output text); do
      COUNT=$(aws elbv2 describe-target-health --target-group-arn "$TG_ARN" --query 'length(TargetHealthDescriptions[?TargetHealth.State==`healthy`])' --output text 2>/dev/null || echo 0)
      HEALTHY=$((HEALTHY + COUNT))
    done
    [ "$HEALTHY" -eq 0 ] && echo "Idle: $LB_ARN"
  done

echo ""
echo "=== Elastic IPs Not Associated with Instances ==="
aws ec2 describe-addresses \
  --query 'Addresses[?AssociationId==null].[AllocationId,PublicIp]' \
  --output table

echo ""
echo "=== Snapshots Older Than 90 Days ==="
CUTOFF=$(date -d '90 days ago' --iso-8601=seconds)  # GNU date; on macOS use: date -v-90d +%Y-%m-%dT%H:%M:%S
aws ec2 describe-snapshots \
  --owner-ids self \
  --query "Snapshots[?StartTime<'${CUTOFF}'].[SnapshotId,VolumeSize,StartTime]" \
  --output table

echo ""
echo "=== Stopped EC2 Instances (still charging for EBS) ==="
aws ec2 describe-instances \
  --filters Name=instance-state-name,Values=stopped \
  --query 'Reservations[*].Instances[*].[InstanceId,InstanceType,Tags[?Key==`Name`].Value|[0],LaunchTime]' \
  --output table

Automated Waste Cleanup with AWS Lambda

// src/lambda/cost-cleanup/handler.ts
import { EC2Client, DescribeVolumesCommand } from "@aws-sdk/client-ec2";

const ec2 = new EC2Client({ region: process.env.AWS_REGION });

export async function handler(): Promise<void> {
  // Find available (unattached) volumes created more than 7 days ago.
  // CreateTime is a proxy: AWS does not record when a volume was detached.
  const { Volumes } = await ec2.send(
    new DescribeVolumesCommand({
      Filters: [{ Name: "status", Values: ["available"] }],
    })
  );

  const staleVolumes = (Volumes ?? []).filter((v) => {
    const createTime = new Date(v.CreateTime!);
    const daysSinceCreation = (Date.now() - createTime.getTime()) / (1000 * 60 * 60 * 24);
    return daysSinceCreation > 7;
  });

  console.log(`Found ${staleVolumes.length} stale volumes`);

  for (const volume of staleVolumes) {
    // Tag for deletion review instead of deleting immediately
    console.log(`Tagging volume ${volume.VolumeId} for deletion review`);
    // In production: tag with deletion-scheduled date, alert to Slack,
    // delete after 7-day review window
  }
}

Typical savings from waste elimination: 10–20% of total bill.
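
To put a dollar figure on what the script above finds, here's a quick sketch that totals the monthly cost of unattached volumes. The $0.08/GB-month gp3 rate is an assumption (roughly us-east-1 pricing); check current rates for your region and volume types.

```python
# Estimate monthly spend on unattached EBS volumes found by find-waste.sh.
# The rate below is illustrative, not authoritative pricing.

GP3_RATE_PER_GB_MONTH = 0.08  # assumed us-east-1 gp3 rate, USD

def monthly_ebs_waste(volume_sizes_gb: list[int],
                      rate: float = GP3_RATE_PER_GB_MONTH) -> float:
    """Total monthly cost of a list of unattached volume sizes (GB)."""
    return round(sum(volume_sizes_gb) * rate, 2)

# Example: five orphaned volumes from a describe-volumes run
print(monthly_ebs_waste([100, 500, 50, 1000, 200]))  # 1850 GB -> 148.0
```

Even a handful of forgotten volumes adds up to real money over a year — which is why the Lambda above tags them for review automatically.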


☁️ Is Your Cloud Costing Too Much?

Most teams overspend 30–40% on cloud — wrong instance types, no reserved pricing, bloated storage. We audit, right-size, and automate your infrastructure.

  • AWS, GCP, Azure certified engineers
  • Infrastructure as Code (Terraform, CDK)
  • Docker, Kubernetes, GitHub Actions CI/CD
  • Typical audit recovers $500–$3,000/month in savings

Layer 2: Rightsizing

Finding Over-Provisioned Instances

# scripts/rightsize_analysis.py
import boto3
from datetime import datetime, timedelta, timezone
from typing import NamedTuple

cloudwatch = boto3.client('cloudwatch', region_name='us-east-1')
ec2 = boto3.client('ec2', region_name='us-east-1')

class InstanceMetrics(NamedTuple):
    instance_id: str
    instance_type: str
    avg_cpu_percent: float
    max_cpu_percent: float
    avg_memory_percent: float
    recommendation: str

def analyze_instance(instance_id: str, instance_type: str) -> InstanceMetrics:
    end_time = datetime.now(timezone.utc)  # datetime.utcnow() is deprecated in Python 3.12+
    start_time = end_time - timedelta(days=14)

    def get_metric(metric_name: str, namespace: str = 'AWS/EC2') -> tuple[float, float]:
        response = cloudwatch.get_metric_statistics(
            Namespace=namespace,
            MetricName=metric_name,
            Dimensions=[{'Name': 'InstanceId', 'Value': instance_id}],
            StartTime=start_time,
            EndTime=end_time,
            Period=3600,  # 1-hour buckets
            Statistics=['Average', 'Maximum'],
        )
        datapoints = response['Datapoints']
        if not datapoints:
            return 0.0, 0.0
        avg = sum(d['Average'] for d in datapoints) / len(datapoints)
        maximum = max(d['Maximum'] for d in datapoints)
        return avg, maximum

    avg_cpu, max_cpu = get_metric('CPUUtilization')
    
    # Memory requires CloudWatch agent
    avg_mem, _ = get_metric('mem_used_percent', 'CWAgent')

    # Rightsizing recommendation
    recommendation = "OK"
    if avg_cpu < 10 and max_cpu < 40:
        recommendation = "DOWNSIZE: CPU consistently low"
    elif avg_cpu < 20 and avg_mem < 20:
        recommendation = "DOWNSIZE: Both CPU and memory underutilized"
    elif avg_cpu > 80:
        recommendation = "UPSIZE: CPU consistently high"

    return InstanceMetrics(
        instance_id=instance_id,
        instance_type=instance_type,
        avg_cpu_percent=round(avg_cpu, 1),
        max_cpu_percent=round(max_cpu, 1),
        avg_memory_percent=round(avg_mem, 1),
        recommendation=recommendation,
    )

def estimate_downsize_savings(instance_type: str) -> int:
    """Rough monthly savings from dropping one size tier.
    Prices here are illustrative; use the AWS Pricing API for real numbers."""
    monthly_cost = {'m6i.xlarge': 140, 'm6i.2xlarge': 280, 'r6i.xlarge': 184}
    cost = monthly_cost.get(instance_type, 100)
    return cost // 2  # one size down is roughly half the cost

# Run analysis
paginator = ec2.get_paginator('describe_instances')
for page in paginator.paginate(Filters=[{'Name': 'instance-state-name', 'Values': ['running']}]):
    for reservation in page['Reservations']:
        for instance in reservation['Instances']:
            metrics = analyze_instance(instance['InstanceId'], instance['InstanceType'])
            if 'DOWNSIZE' in metrics.recommendation:
                monthly_savings = estimate_downsize_savings(metrics.instance_type)
                print(f"{metrics.instance_id} ({metrics.instance_type}): {metrics.recommendation}")
                print(f"  CPU avg/max: {metrics.avg_cpu_percent}% / {metrics.max_cpu_percent}%")
                print(f"  Estimated monthly savings: ${monthly_savings}")
Typical savings from rightsizing: 15–30% of compute costs.


Layer 3: Commitment Discounts

Savings Plans vs Reserved Instances

SAVINGS PLANS (recommended for most):
├── Compute Savings Plans (most flexible)
│   ├── Applies to EC2, Fargate, Lambda
│   ├── Any region, any instance family, any OS
│   └── Up to 66% savings vs On-Demand
├── EC2 Instance Savings Plans
│   ├── Applies to specific instance family in one region
│   └── Up to 72% savings vs On-Demand
└── SageMaker Savings Plans
    └── SageMaker only, up to 64%

RESERVED INSTANCES (use for specific cases):
├── RDS instances (no Savings Plans option)
├── ElastiCache, Redshift, OpenSearch
└── EC2 if you need specific hardware guarantees

Calculating Optimal Commitment Level

// src/scripts/savings-plan-calculator.ts
import {
  CostExplorerClient,
  GetRightsizingRecommendationCommand,
  GetSavingsPlansPurchaseRecommendationCommand,
} from "@aws-sdk/client-cost-explorer";

const ce = new CostExplorerClient({ region: "us-east-1" });

interface SavingsPlanRecommendation {
  termInYears: 1 | 3;
  paymentOption: "NoUpfront" | "PartialUpfront" | "AllUpfront";
  hourlyCommitment: number;
  estimatedMonthlySavings: number;
  estimatedSavingsPercentage: number;
  estimatedROI: number;
}

export async function getSavingsPlanRecommendations(): Promise<
  SavingsPlanRecommendation[]
> {
  const response = await ce.send(
    new GetSavingsPlansPurchaseRecommendationCommand({
      SavingsPlansType: "COMPUTE_SP",
      TermInYears: "ONE_YEAR",
      PaymentOption: "NO_UPFRONT",
      LookbackPeriodInDays: "SIXTY_DAYS",
    })
  );

  return (
    response.SavingsPlansPurchaseRecommendation?.SavingsPlansPurchaseRecommendationDetails?.map(
      (detail) => ({
        termInYears: 1,
        paymentOption: "NoUpfront",
        hourlyCommitment: Number(
          detail.HourlyCommitmentToPurchase ?? 0
        ),
        estimatedMonthlySavings: Number(
          detail.EstimatedMonthlySavingsAmount ?? 0
        ),
        estimatedSavingsPercentage: Number(
          detail.EstimatedSavingsPercentage ?? 0
        ),
        estimatedROI: Number(detail.EstimatedROI ?? 0),
      })
    ) ?? []
  );
}

// Rule of thumb: commit to a low percentile (p20–p50) of your hourly
// Compute spend. This covers baseline load; burst stays On-Demand.

Commitment Strategy

Baseline (always running): Cover with Savings Plans (1yr No-Upfront)
┌────────────────────────────┐
│ Savings Plan commits:      │
│ $X/hour (30-day p20 spend) │
└────────────────────────────┘

Variable (predictable peaks): Cover with Savings Plans (3yr if >$50K)
┌────────────────────────────┐
│ Savings Plan covers:       │  
│ $Y/hour (30-day p80 spend) │
└────────────────────────────┘

Burst (spiky, unpredictable): On-Demand or Spot
┌────────────────────────────┐
│ On-Demand / Spot Fleet     │
│ for everything above p80   │
└────────────────────────────┘

⚙️ DevOps Done Right — Zero Downtime, Full Automation

Ship faster without breaking things. We build CI/CD pipelines, monitoring stacks, and auto-scaling infrastructure that your team can actually maintain.

  • Staging + production environments with feature flags
  • Automated security scanning in the pipeline
  • Uptime monitoring + alerting + runbook automation
  • On-call support handover docs included

Layer 4: Spot Fleet for Stateless Workloads

Spot instances are spare AWS capacity at 60–90% discount. They can be interrupted with 2-minute notice, making them ideal for stateless workloads:

# terraform/spot-fleet.tf
resource "aws_spot_fleet_request" "batch_workers" {
  iam_fleet_role  = aws_iam_role.spot_fleet.arn
  target_capacity = 10
  
  # Use multiple instance types for availability
  # If one pool runs out, Spot Fleet uses another
  launch_template_config {
    launch_template_specification {
      id      = aws_launch_template.worker.id
      version = "$Latest"
    }

    overrides {
      instance_type     = "m6i.xlarge"
      weighted_capacity = 1
      availability_zone = "us-east-1a"
    }
    overrides {
      instance_type     = "m6a.xlarge"
      weighted_capacity = 1
      availability_zone = "us-east-1b"
    }
    overrides {
      instance_type     = "m5.xlarge"
      weighted_capacity = 1
      availability_zone = "us-east-1c"
    }
    overrides {
      instance_type     = "r6i.large"
      weighted_capacity = 1
      availability_zone = "us-east-1a"
    }
  }

  # Replace interrupted instances automatically
  allocation_strategy         = "capacityOptimized"
  replace_unhealthy_instances = true

  # Mix Spot + On-Demand for reliability:
  # 2 of the 10 capacity units always run On-Demand
  on_demand_target_capacity       = 2
  instance_interruption_behaviour = "terminate"
  
  tags = {
    Name = "batch-workers-spot-fleet"
  }
}
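
Workers can watch for that two-minute notice themselves: the instance metadata service exposes a `spot/instance-action` path that returns 404 until an interruption is scheduled. A minimal IMDSv2 polling sketch — the `drain()` body is a placeholder for your own shutdown logic:

```python
# Poll the EC2 instance metadata service (IMDSv2) for a Spot interruption
# notice. /spot/instance-action 404s until an interruption is scheduled,
# then returns a JSON body with the action and time.
import time
import urllib.error
import urllib.request

IMDS = "http://169.254.169.254/latest"

def imds_fetch() -> bytes:
    """GET spot/instance-action using an IMDSv2 session token."""
    token_req = urllib.request.Request(
        f"{IMDS}/api/token", method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": "21600"},
    )
    token = urllib.request.urlopen(token_req, timeout=2).read().decode()
    req = urllib.request.Request(
        f"{IMDS}/meta-data/spot/instance-action",
        headers={"X-aws-ec2-metadata-token": token},
    )
    return urllib.request.urlopen(req, timeout=2).read()

def interruption_pending(fetch=imds_fetch) -> bool:
    """True once AWS has scheduled an interruption (200), False on 404."""
    try:
        fetch()
        return True
    except urllib.error.HTTPError:
        return False

def drain() -> None:
    """Placeholder: finish in-flight work, deregister from target groups, exit."""

def watch(poll_seconds: int = 5) -> None:
    while not interruption_pending():
        time.sleep(poll_seconds)
    drain()
```

Run `watch()` in a sidecar thread or process on each Spot worker so batch jobs checkpoint before the instance disappears.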

ECS Capacity Provider with Spot

# terraform/ecs-capacity-provider.tf
resource "aws_ecs_capacity_provider" "spot" {
  name = "spot-workers"

  auto_scaling_group_provider {
    auto_scaling_group_arn = aws_autoscaling_group.spot.arn

    managed_scaling {
      maximum_scaling_step_size = 10
      minimum_scaling_step_size = 1
      status                    = "ENABLED"
      target_capacity           = 90
    }
    managed_termination_protection = "DISABLED"
  }
}

resource "aws_ecs_cluster_capacity_providers" "main" {
  cluster_name = aws_ecs_cluster.main.name

  capacity_providers = [
    "FARGATE",           # Always-on baseline
    "FARGATE_SPOT",      # Cheap for batch/dev
    aws_ecs_capacity_provider.spot.name
  ]

  default_capacity_provider_strategy {
    capacity_provider = "FARGATE"
    weight            = 1
    base              = 2  # Minimum 2 Fargate tasks always
  }

  default_capacity_provider_strategy {
    capacity_provider = "FARGATE_SPOT"
    weight            = 4  # Beyond the base, 4 of every 5 tasks land on Spot
    base              = 0
  }
}

Layer 5: FinOps — Cost Allocation and Accountability

Without cost allocation, engineers have no incentive to optimize:

// src/lambda/cost-reporter/handler.ts
import {
  CostExplorerClient,
  GetCostAndUsageCommand,
} from "@aws-sdk/client-cost-explorer";

const ce = new CostExplorerClient({ region: "us-east-1" });

export async function weeklyTeamCostReport(): Promise<void> {
  const endDate = new Date().toISOString().split("T")[0];
  const startDate = new Date(Date.now() - 7 * 24 * 60 * 60 * 1000)
    .toISOString()
    .split("T")[0];

  const response = await ce.send(
    new GetCostAndUsageCommand({
      TimePeriod: { Start: startDate, End: endDate },
      Granularity: "DAILY",
      GroupBy: [{ Type: "TAG", Key: "team" }], // Requires tagging all resources
      Metrics: ["UnblendedCost"],
    })
  );

  // Build team cost report
  const teamCosts: Record<string, number> = {};
  for (const result of response.ResultsByTime ?? []) {
    for (const group of result.Groups ?? []) {
      const team = group.Keys?.[0]?.replace("team$", "") || "untagged";
      const cost = Number(group.Metrics?.UnblendedCost?.Amount ?? 0);
      teamCosts[team] = (teamCosts[team] ?? 0) + cost;
    }
  }

  // Post to Slack, sorted by highest spend (extend with week-over-week deltas)
  const message = Object.entries(teamCosts)
    .sort(([, a], [, b]) => b - a)
    .map(([team, cost]) => `${team}: $${cost.toFixed(2)}`)
    .join("\n");

  // postToSlack is a stand-in for your own Slack webhook/client wrapper
  await postToSlack({
    channel: "#engineering-costs",
    text: `Weekly AWS Cost by Team (${startDate} to ${endDate})\n\`\`\`\n${message}\n\`\`\``,
  });
}

Resource Tagging Policy

// scripts/enforce-tags.ts
// Run as a pre-commit hook or CI check

const REQUIRED_TAGS = ["team", "service", "environment", "cost-center"] as const;

interface TerraformPlan {
  resource_changes: Array<{
    type: string;
    change: {
      actions: string[];
      after: {
        tags?: Record<string, string>;
      };
    };
  }>;
}

export function validateTagCompliance(plan: TerraformPlan): string[] {
  const violations: string[] = [];
  const TAGGABLE_TYPES = new Set([
    "aws_instance",
    "aws_db_instance",
    "aws_elasticache_cluster",
    "aws_ecs_service",
    "aws_lambda_function",
    "aws_s3_bucket",
  ]);

  for (const resource of plan.resource_changes) {
    if (!TAGGABLE_TYPES.has(resource.type)) continue;
    if (!resource.change.actions.includes("create") && !resource.change.actions.includes("update")) continue;

    const tags = resource.change.after.tags ?? {};
    const missingTags = REQUIRED_TAGS.filter((tag) => !tags[tag]);

    if (missingTags.length > 0) {
      violations.push(
        `${resource.type}: missing required tags: ${missingTags.join(", ")}`
      );
    }
  }

  return violations;
}

Cost Savings Reference

Optimization                             Effort         Typical Savings
------------------------------------------------------------------------------
Eliminate idle resources                 Low (hours)    5–15%
Rightsize EC2/RDS                        Medium (days)  15–30%
1-yr Compute Savings Plan (No Upfront)   Low (minutes)  33–40% on committed spend
3-yr Compute Savings Plan (All Upfront)  Low (minutes)  60–66% on committed spend
Spot Fleet for batch workloads           Medium (days)  60–80% vs On-Demand
S3 Intelligent-Tiering                   Low (hours)    10–40% on storage
Reserved RDS (1-yr)                      Low (minutes)  30–35% on database
Fargate Spot for dev/staging             Low (hours)    60–70% on dev compute

Combined realistic savings on a $500K/year bill: $150K–$300K/year with 2–4 weeks of engineering effort.
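
The layers compound on the remaining bill rather than adding up, which is why working through them in order matters. A rough model — the rates are illustrative midpoints from the table above, and treating the Savings Plan discount as applying to the whole remaining bill is a simplification:

```python
# Model how savings layers compound: each rate applies to what's left after
# the previous layer, not to the original bill. Rates are illustrative.
def apply_layers(annual_bill: float, layer_rates: list[float]) -> float:
    remaining = annual_bill
    for rate in layer_rates:
        remaining *= (1 - rate)
    return round(remaining, 2)

bill = 500_000.0
# waste elimination 10%, rightsizing 20%, savings plan 35% on what's left
remaining = apply_layers(bill, [0.10, 0.20, 0.35])
print(f"Remaining: ${remaining:,.0f}  Saved: ${bill - remaining:,.0f}")
# -> Remaining: $234,000  Saved: $266,000
```

That $266K falls inside the $150K–$300K range quoted above — and it's smaller than the naive additive estimate (10% + 20% + 35% = 65%, or $325K), which overstates what stacked discounts deliver.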



Working With Viprasol

Our cloud engineering team has performed cost optimization engagements for SaaS companies spending $50K–$2M/year on AWS. We combine automated waste detection, rightsizing analysis, and commitment strategy to deliver 30–60% cost reductions — typically within 30 days.

Cloud engineering services → | Get a cost audit →


About the Author

Viprasol Tech Team

Custom Software Development Specialists

The Viprasol Tech team specialises in algorithmic trading software, AI agent systems, and SaaS development. With 100+ projects delivered across MT4/MT5 EAs, fintech platforms, and production AI systems, the team brings deep technical experience to every engagement. Based in India, serving clients globally.

MT4/MT5 EA Development · AI Agent Systems · SaaS Development · Algorithmic Trading

Need DevOps & Cloud Expertise?

Scale your infrastructure with confidence. AWS, GCP, Azure certified team.

Free consultation • No commitment • Response within 24 hours
