AWS VPC Design in 2026: Subnets, NAT Gateway, Security Groups, VPC Endpoints, and Terraform
Design a production AWS VPC: public/private subnets across AZs, NAT Gateway, security group rules, VPC endpoints for S3 and DynamoDB, and complete Terraform configuration.
A VPC is the network boundary for everything you run on AWS. Get it wrong and you pay for unnecessary NAT Gateway data transfer, expose resources accidentally, or can't add new AZs later. Get it right and it's invisible infrastructure that just works.
This post covers the standard production VPC design: CIDR planning, public and private subnets across three AZs, NAT Gateway with cost optimization, security groups with least-privilege rules, VPC endpoints that eliminate NAT Gateway charges for S3 and DynamoDB, and Terraform for the whole stack.
CIDR Planning
VPC CIDR: 10.0.0.0/16 (65,536 addresses; AWS reserves 5 in every subnet, so a /24 yields 251 usable IPs)
Public subnets (one per AZ — for ALB, NAT Gateway, bastion):
10.0.0.0/24 us-east-1a (251 IPs)
10.0.1.0/24 us-east-1b (251 IPs)
10.0.2.0/24 us-east-1c (251 IPs)
Private subnets (one per AZ — for ECS, Lambda, RDS):
10.0.10.0/24 us-east-1a (251 IPs)
10.0.11.0/24 us-east-1b (251 IPs)
10.0.12.0/24 us-east-1c (251 IPs)
Database subnets (one per AZ — isolated, no internet access):
10.0.20.0/24 us-east-1a (251 IPs)
10.0.21.0/24 us-east-1b (251 IPs)
10.0.22.0/24 us-east-1c (251 IPs)
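The same layout can be derived from the VPC block instead of typed out by hand. A minimal sketch, assuming the 10.0.0.0/16 range above and Terraform's built-in `cidrsubnet` function; the `derived_*` local names are illustrative and are not used by the configuration that follows.

```hcl
locals {
  vpc_cidr = "10.0.0.0/16"

  # cidrsubnet(prefix, newbits, netnum): adding 8 bits to a /16 yields a /24,
  # and netnum becomes the third octet of the resulting subnet.
  derived_public_cidrs   = [for i in range(3) : cidrsubnet(local.vpc_cidr, 8, i)]      # 10.0.0.0/24, 10.0.1.0/24, 10.0.2.0/24
  derived_private_cidrs  = [for i in range(3) : cidrsubnet(local.vpc_cidr, 8, 10 + i)] # 10.0.10.0/24, 10.0.11.0/24, 10.0.12.0/24
  derived_database_cidrs = [for i in range(3) : cidrsubnet(local.vpc_cidr, 8, 20 + i)] # 10.0.20.0/24, 10.0.21.0/24, 10.0.22.0/24
}
```

Starting each tier at a round offset (0, 10, 20) leaves unused /24 blocks after every tier, so a fourth AZ or a larger subnet can be added later without renumbering.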
Terraform: Complete VPC
# terraform/vpc.tf
locals {
azs = ["${var.region}a", "${var.region}b", "${var.region}c"]
public_subnet_cidrs = ["10.0.0.0/24", "10.0.1.0/24", "10.0.2.0/24"]
private_subnet_cidrs = ["10.0.10.0/24", "10.0.11.0/24", "10.0.12.0/24"]
database_subnet_cidrs = ["10.0.20.0/24", "10.0.21.0/24", "10.0.22.0/24"]
}
resource "aws_vpc" "main" {
cidr_block = "10.0.0.0/16"
enable_dns_support = true # Required for VPC endpoints and Route 53
enable_dns_hostnames = true # Required (together with DNS support) for private DNS on interface endpoints
tags = merge(var.common_tags, { Name = "${var.name}-${var.environment}" })
}
# Public subnets (have route to Internet Gateway)
resource "aws_subnet" "public" {
count = length(local.azs)
vpc_id = aws_vpc.main.id
cidr_block = local.public_subnet_cidrs[count.index]
availability_zone = local.azs[count.index]
# Instances in public subnet get public IPs automatically
map_public_ip_on_launch = true
tags = merge(var.common_tags, {
Name = "${var.name}-${var.environment}-public-${local.azs[count.index]}"
Tier = "public"
# Subnet discovery tag read by EKS (AWS Load Balancer Controller) for internet-facing load balancers
"kubernetes.io/role/elb" = "1"
})
}
# Private subnets (access internet via NAT Gateway)
resource "aws_subnet" "private" {
count = length(local.azs)
vpc_id = aws_vpc.main.id
cidr_block = local.private_subnet_cidrs[count.index]
availability_zone = local.azs[count.index]
tags = merge(var.common_tags, {
Name = "${var.name}-${var.environment}-private-${local.azs[count.index]}"
Tier = "private"
"kubernetes.io/role/internal-elb" = "1"
})
}
# Database subnets (no internet access — not even via NAT)
resource "aws_subnet" "database" {
count = length(local.azs)
vpc_id = aws_vpc.main.id
cidr_block = local.database_subnet_cidrs[count.index]
availability_zone = local.azs[count.index]
tags = merge(var.common_tags, {
Name = "${var.name}-${var.environment}-database-${local.azs[count.index]}"
Tier = "database"
})
}
# Internet Gateway (public subnets use this to reach internet)
resource "aws_internet_gateway" "main" {
vpc_id = aws_vpc.main.id
tags = merge(var.common_tags, { Name = "${var.name}-${var.environment}" })
}
# Elastic IP for NAT Gateway
resource "aws_eip" "nat" {
# Cost optimization: a single shared NAT GW (and EIP) outside production
# Production gets one NAT GW per AZ, so losing one AZ does not cut egress for the others
count = var.environment == "production" ? length(local.azs) : 1
domain = "vpc"
tags = merge(var.common_tags, { Name = "${var.name}-${var.environment}-nat-${count.index}" })
depends_on = [aws_internet_gateway.main]
}
# NAT Gateway (private subnets use this to reach internet)
resource "aws_nat_gateway" "main" {
count = var.environment == "production" ? length(local.azs) : 1
allocation_id = aws_eip.nat[count.index].id
subnet_id = aws_subnet.public[count.index].id # NAT GW lives in public subnet
tags = merge(var.common_tags, { Name = "${var.name}-${var.environment}-nat-${count.index}" })
depends_on = [aws_internet_gateway.main]
}
# Route tables
resource "aws_route_table" "public" {
vpc_id = aws_vpc.main.id
route {
cidr_block = "0.0.0.0/0"
gateway_id = aws_internet_gateway.main.id
}
tags = merge(var.common_tags, { Name = "${var.name}-${var.environment}-public" })
}
resource "aws_route_table" "private" {
count = var.environment == "production" ? length(local.azs) : 1
vpc_id = aws_vpc.main.id
route {
cidr_block = "0.0.0.0/0"
nat_gateway_id = aws_nat_gateway.main[count.index].id
}
tags = merge(var.common_tags, {
Name = "${var.name}-${var.environment}-private-${count.index}"
})
}
resource "aws_route_table" "database" {
vpc_id = aws_vpc.main.id
# No default route — database subnets cannot reach internet
tags = merge(var.common_tags, { Name = "${var.name}-${var.environment}-database" })
}
# Associate subnets with route tables
resource "aws_route_table_association" "public" {
count = length(local.azs)
subnet_id = aws_subnet.public[count.index].id
route_table_id = aws_route_table.public.id
}
resource "aws_route_table_association" "private" {
count = length(local.azs)
subnet_id = aws_subnet.private[count.index].id
route_table_id = aws_route_table.private[
var.environment == "production" ? count.index : 0
].id
}
resource "aws_route_table_association" "database" {
count = length(local.azs)
subnet_id = aws_subnet.database[count.index].id
route_table_id = aws_route_table.database.id
}
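The configuration above reads four input variables (`var.region`, `var.name`, `var.environment`, `var.common_tags`) that the post does not define. A possible `variables.tf`, offered as a sketch: the file name, descriptions, the default region, and the list of allowed environment names are assumptions; only the variable names come from the code above.

```hcl
# terraform/variables.tf (hypothetical; only the variable names come from vpc.tf)
variable "region" {
  type        = string
  description = "AWS region; vpc.tf assumes it has AZs with suffixes a, b, and c"
  default     = "us-east-1"
}

variable "name" {
  type        = string
  description = "Project name used as a prefix in resource names"
}

variable "environment" {
  type        = string
  description = "Deployment environment; \"production\" switches on one NAT Gateway per AZ"

  validation {
    # Assumed set of environments; adjust to your own naming
    condition     = contains(["development", "staging", "production"], var.environment)
    error_message = "Environment must be development, staging, or production."
  }
}

variable "common_tags" {
  type        = map(string)
  description = "Tags merged into every resource"
  default     = {}
}
```

Building AZ names by appending a, b, and c to the region works in us-east-1 but not in every region or account. Reading the names from the `aws_availability_zones` data source is the more portable option.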
VPC Endpoints (Eliminate NAT Gateway Charges)
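S3 and DynamoDB traffic from private subnets otherwise flows through the NAT Gateway, which bills $0.045 per GB processed (us-east-1 pricing). Gateway endpoints add routes for those two services to the route tables you list and carry no hourly or per-GB charge: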
# terraform/vpc-endpoints.tf
# S3 Gateway Endpoint (free — no data processing charge)
resource "aws_vpc_endpoint" "s3" {
vpc_id = aws_vpc.main.id
service_name = "com.amazonaws.${var.region}.s3"
vpc_endpoint_type = "Gateway"
route_table_ids = concat(
[aws_route_table.public.id],
aws_route_table.private[*].id,
[aws_route_table.database.id]
)
tags = merge(var.common_tags, { Name = "${var.name}-${var.environment}-s3" })
}
# DynamoDB Gateway Endpoint (free)
resource "aws_vpc_endpoint" "dynamodb" {
vpc_id = aws_vpc.main.id
service_name = "com.amazonaws.${var.region}.dynamodb"
vpc_endpoint_type = "Gateway"
route_table_ids = concat(
[aws_route_table.public.id],
aws_route_table.private[*].id
)
tags = merge(var.common_tags, { Name = "${var.name}-${var.environment}-dynamodb" })
}
# ECR Interface Endpoints (for ECS pulling images without NAT)
# Interface endpoints cost about $0.01/hour per AZ (three AZs here) plus $0.01/GB
resource "aws_vpc_endpoint" "ecr_api" {
vpc_id = aws_vpc.main.id
service_name = "com.amazonaws.${var.region}.ecr.api"
vpc_endpoint_type = "Interface"
subnet_ids = aws_subnet.private[*].id
security_group_ids = [aws_security_group.vpc_endpoints.id]
private_dns_enabled = true
tags = merge(var.common_tags, { Name = "${var.name}-${var.environment}-ecr-api" })
}
resource "aws_vpc_endpoint" "ecr_dkr" {
vpc_id = aws_vpc.main.id
service_name = "com.amazonaws.${var.region}.ecr.dkr"
vpc_endpoint_type = "Interface"
subnet_ids = aws_subnet.private[*].id
security_group_ids = [aws_security_group.vpc_endpoints.id]
private_dns_enabled = true
tags = merge(var.common_tags, { Name = "${var.name}-${var.environment}-ecr-dkr" })
}
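If the ECS tasks ship logs with the `awslogs` driver and should not depend on the NAT Gateway, a CloudWatch Logs interface endpoint is usually needed next to the two ECR endpoints (the image layers themselves are fetched from S3, which the gateway endpoint above already covers). A sketch that follows the same pattern; the resource name `logs` is an assumption and not part of the original configuration.

```hcl
# CloudWatch Logs interface endpoint (hypothetical addition, same pattern as the ECR endpoints)
resource "aws_vpc_endpoint" "logs" {
  vpc_id              = aws_vpc.main.id
  service_name        = "com.amazonaws.${var.region}.logs"
  vpc_endpoint_type   = "Interface"
  subnet_ids          = aws_subnet.private[*].id
  security_group_ids  = [aws_security_group.vpc_endpoints.id]
  private_dns_enabled = true
  tags = merge(var.common_tags, { Name = "${var.name}-${var.environment}-logs" })
}
```

Every extra interface endpoint is billed per AZ per hour, so it only pays off when it removes meaningful NAT traffic or when tasks must run without any internet route.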
Security Groups
# terraform/security-groups.tf
# ALB: accepts HTTPS from internet
resource "aws_security_group" "alb" {
name = "${var.name}-${var.environment}-alb"
description = "Application Load Balancer"
vpc_id = aws_vpc.main.id
ingress {
from_port = 443
to_port = 443
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
description = "HTTPS from internet"
}
ingress {
from_port = 80
to_port = 80
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
description = "HTTP redirect only"
}
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
description = "All outbound (to ECS tasks)"
}
tags = merge(var.common_tags, { Name = "${var.name}-${var.environment}-alb" })
}
# ECS tasks: accepts traffic from ALB only
resource "aws_security_group" "ecs_tasks" {
name = "${var.name}-${var.environment}-ecs-tasks"
description = "ECS Fargate tasks"
vpc_id = aws_vpc.main.id
ingress {
from_port = 3000
to_port = 3000
protocol = "tcp"
security_groups = [aws_security_group.alb.id]
description = "App port from ALB only"
}
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
description = "All outbound (internet via NAT, S3 via endpoint)"
}
tags = merge(var.common_tags, { Name = "${var.name}-${var.environment}-ecs-tasks" })
}
# RDS: accepts connections from ECS tasks only (add another ingress rule for a Lambda security group if needed)
resource "aws_security_group" "rds" {
name = "${var.name}-${var.environment}-rds"
description = "RDS PostgreSQL"
vpc_id = aws_vpc.main.id
ingress {
from_port = 5432
to_port = 5432
protocol = "tcp"
security_groups = [aws_security_group.ecs_tasks.id]
description = "PostgreSQL from ECS tasks"
}
# No egress rules — RDS doesn't initiate connections
tags = merge(var.common_tags, { Name = "${var.name}-${var.environment}-rds" })
}
# VPC Endpoints: accepts HTTPS from private subnets
resource "aws_security_group" "vpc_endpoints" {
name = "${var.name}-${var.environment}-vpc-endpoints"
description = "Interface VPC endpoints"
vpc_id = aws_vpc.main.id
ingress {
from_port = 443
to_port = 443
protocol = "tcp"
cidr_blocks = [for s in aws_subnet.private : s.cidr_block]
description = "HTTPS from private subnets"
}
tags = merge(var.common_tags, { Name = "${var.name}-${var.environment}-vpc-endpoints" })
}
Flow Logs (Security Audit)
# terraform/flow-logs.tf
resource "aws_cloudwatch_log_group" "vpc_flow_logs" {
name = "/aws/vpc/${var.name}-${var.environment}/flow-logs"
retention_in_days = 30
tags = var.common_tags
}
resource "aws_flow_log" "main" {
vpc_id = aws_vpc.main.id
traffic_type = "REJECT" # Only log rejected traffic (reduces cost vs ALL)
iam_role_arn = aws_iam_role.vpc_flow_logs.arn
log_destination = aws_cloudwatch_log_group.vpc_flow_logs.arn
tags = var.common_tags
}
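The flow log above references `aws_iam_role.vpc_flow_logs`, which is not defined anywhere in the post. One way to define it, as a sketch: the data source names and the policy name are assumptions, and the permissions follow the set AWS documents for publishing flow logs to CloudWatch Logs (minus log group creation, since Terraform creates the group).

```hcl
# terraform/flow-logs-iam.tf (hypothetical; only aws_iam_role.vpc_flow_logs is named in flow-logs.tf)

# Allow the VPC Flow Logs service to assume the role
data "aws_iam_policy_document" "flow_logs_assume" {
  statement {
    actions = ["sts:AssumeRole"]

    principals {
      type        = "Service"
      identifiers = ["vpc-flow-logs.amazonaws.com"]
    }
  }
}

resource "aws_iam_role" "vpc_flow_logs" {
  name               = "${var.name}-${var.environment}-vpc-flow-logs"
  assume_role_policy = data.aws_iam_policy_document.flow_logs_assume.json
  tags               = var.common_tags
}

# Permissions needed to write log streams and events
data "aws_iam_policy_document" "flow_logs_write" {
  statement {
    actions = [
      "logs:CreateLogStream",
      "logs:PutLogEvents",
      "logs:DescribeLogGroups",
      "logs:DescribeLogStreams",
    ]

    # Broad resource scope as in the AWS documentation example; narrow it if your policies require
    resources = ["*"]
  }
}

resource "aws_iam_role_policy" "vpc_flow_logs" {
  name   = "write-flow-logs"
  role   = aws_iam_role.vpc_flow_logs.id
  policy = data.aws_iam_policy_document.flow_logs_write.json
}
```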
Cost Breakdown
| Component | Monthly Cost (us-east-1, 730-hour month) |
|---|---|
| NAT Gateway (1 AZ, dev) | ~$33 + $0.045/GB data |
| NAT Gateway (3 AZ, prod) | ~$99 + $0.045/GB data |
| S3 Gateway Endpoint | Free |
| DynamoDB Gateway Endpoint | Free |
| ECR Interface Endpoints (2, each in 3 AZs) | ~$44 + $0.01/GB |
| VPC Flow Logs (REJECT only) | ~$0.50–$2 |
| Elastic IPs (1 per NAT) | ~$3.65 each (public IPv4 charge of $0.005/hour, attached or not) |
Each GB of S3 traffic moved from the NAT Gateway to the gateway endpoint saves $0.045 in data processing, so an environment that sends 250 GB to 1 TB per month to S3 saves roughly $11–$45 per month.
See Also
- AWS ECS Fargate Production — Deploying ECS into this VPC
- AWS RDS Aurora — RDS in database subnets
- Terraform State Management — Managing this VPC config
- AWS CloudWatch Observability — VPC flow log analysis
Working With Viprasol
We design and implement production AWS VPC architectures for SaaS products — from single-AZ development environments through multi-AZ production setups with Transit Gateway and VPC peering. Our cloud team has designed VPCs for products serving millions of requests per day.
What we deliver:
- CIDR block planning for current needs and future growth
- Public/private/database subnet layout across 3 AZs
- NAT Gateway with cost optimization (single vs multi-AZ)
- S3 and DynamoDB gateway endpoints (free NAT bypass)
- Security group rules with least-privilege ingress
- VPC flow logs for security audit and troubleshooting
See our cloud infrastructure services or contact us to design your VPC.
About the Author
Viprasol Tech Team
Custom Software Development Specialists
The Viprasol Tech team specialises in algorithmic trading software, AI agent systems, and SaaS development. With 100+ projects delivered across MT4/MT5 EAs, fintech platforms, and production AI systems, the team brings deep technical experience to every engagement. Based in India, serving clients globally.