AWS Kubernetes: EKS Production Architecture (2026)
Deploy AWS Kubernetes with EKS, Terraform, Docker, and CI/CD pipelines. An expert guide to production-grade container orchestration, autoscaling, and DevOps on AWS.
AWS Kubernetes (EKS): Setup, Scaling, and Cost Optimization (2026)
Kubernetes has become the de facto standard for container orchestration, and Amazon EKS (Elastic Kubernetes Service) is the managed Kubernetes offering that many organizations choose for its tight AWS integration and operational simplicity. At Viprasol, we've deployed and managed dozens of EKS clusters across production workloads, and we've learned what works—and what doesn't.
This guide covers EKS architecture, setup, scaling strategies, and cost optimization techniques that will help you run production-grade Kubernetes on AWS efficiently.
EKS Architecture and Components
Amazon EKS manages the Kubernetes control plane (API server, etcd, scheduler) while you manage worker nodes. This reduces operational burden compared to self-managed Kubernetes.
The Control Plane (AWS Managed)
AWS runs highly available control plane nodes across multiple availability zones. You don't provision, patch, or maintain them. AWS automatically handles:
- API server updates
- etcd backups and upgrades
- Control plane scaling
- Security patches
Worker Nodes
You provision worker nodes (EC2 instances or Fargate) where your containers run. You're responsible for:
- Node OS patches and updates
- Container runtime upgrades
- Node capacity planning
- Network configuration
Networking: VPC and CNI
EKS runs in your VPC. The AWS CNI (Container Network Interface) plugin assigns IP addresses from your VPC subnet to pods. Each pod gets a real VPC IP address, enabling direct VPC communication and security group integration.
Alternative CNI plugins (Calico, Cilium) offer advanced networking features but add operational complexity. Most organizations start with the AWS VPC CNI, which is fine for the vast majority of use cases; its main caveat is that every pod IP consumes VPC address space, so size your subnets generously.
Setting Up Your First EKS Cluster
Prerequisites
Before creating a cluster, prepare your AWS environment:
- IAM role for the control plane: the role that allows EKS to manage AWS resources on your behalf
- VPC and subnets: subnets in at least two Availability Zones (three recommended for HA)
- Security groups: control what traffic enters and exits your cluster
- AWS CLI and kubectl: command-line tools for interacting with AWS and the cluster
- eksctl: the official CLI for streamlined EKS cluster creation
Quick Start with eksctl
The fastest path to a working EKS cluster:
eksctl create cluster \
  --name my-cluster \
  --region us-east-1 \
  --nodegroup-name standard-nodes \
  --node-type t3.medium \
  --nodes 3 \
  --nodes-min 1 \
  --nodes-max 5
This command creates:
- A Kubernetes cluster with HA control plane
- A managed node group with 3 t3.medium EC2 instances
- Auto-scaling group configured for 1-5 nodes
- VPC, subnets, and security groups
- IAM roles and policies
The cluster is ready in 10-15 minutes. Verify connectivity:
kubectl cluster-info
kubectl get nodes
At Viprasol, we automate cluster creation through Infrastructure as Code (Terraform or CloudFormation) to maintain consistency across environments and enable easy cluster recreation.
☁️ Is Your Cloud Costing Too Much?
Most teams overspend 30–40% on cloud — wrong instance types, no reserved pricing, bloated storage. We audit, right-size, and automate your infrastructure.
- AWS, GCP, Azure certified engineers
- Infrastructure as Code (Terraform, CDK)
- Docker, Kubernetes, GitHub Actions CI/CD
- Typical audit recovers $500–$3,000/month in savings
Scaling Your EKS Cluster
Scaling is two-dimensional in Kubernetes: horizontal pod autoscaling (adding pods) and node autoscaling (adding nodes).
Horizontal Pod Autoscaling (HPA)
HPA automatically scales the number of pod replicas based on CPU, memory, or custom metrics.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
This configuration scales between 2 and 10 replicas, targeting 70% average CPU and 80% average memory utilization. The HPA controller evaluates metrics every 15 seconds by default.
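On noisy metrics, an HPA can flap between scale-ups and scale-downs. The autoscaling/v2 API supports an optional behavior stanza to dampen this; the values below (a 5-minute scale-down stabilization window, at most 2 pods removed per minute) are illustrative, not recommendations:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: app-hpa
spec:
  # ... scaleTargetRef, minReplicas, maxReplicas, metrics as above ...
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300  # wait 5 minutes before scaling down
      policies:
        - type: Pods
          value: 2                     # remove at most 2 pods
          periodSeconds: 60            # per 60-second window
```

Tune the window to your traffic pattern: longer windows mean fewer disruptions but slower cost recovery after a spike.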
Cluster Autoscaling
The Cluster Autoscaler adds nodes when pods can't be scheduled and removes nodes when they're underutilized.
Deploy via Helm:
helm repo add autoscaling https://kubernetes.github.io/autoscaler
helm install cluster-autoscaler autoscaling/cluster-autoscaler \
  --namespace kube-system \
  --set awsRegion=us-east-1 \
  --set autoDiscovery.clusterName=my-cluster
Cluster Autoscaler scales based on:
- Pod requests (CPU/memory)
- Node capacity
- Scheduling constraints (node selectors, affinity rules)
A Pending pod that requests more CPU or memory than any node can supply triggers a scale-up; draining and removing underutilized nodes saves costs during low-traffic periods.
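To see the trigger in practice, here is a sketch of a Deployment whose aggregate requests exceed a small node pool's free capacity; the name, image, and sizes are hypothetical:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: batch-worker            # hypothetical workload
spec:
  replicas: 6
  selector:
    matchLabels: { app: batch-worker }
  template:
    metadata:
      labels: { app: batch-worker }
    spec:
      containers:
        - name: worker
          image: busybox
          command: ["sleep", "3600"]
          resources:
            requests:
              cpu: "1"          # 6 replicas x 1 vCPU won't fit on a
              memory: "1Gi"     # couple of t3.medium nodes, so some
                                # pods stay Pending and the autoscaler
                                # provisions additional nodes
```

Watch it happen with `kubectl get pods -w` followed by `kubectl get nodes -w` as new nodes join.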
Karpenter: Advanced Node Provisioning
Karpenter is a newer alternative to Cluster Autoscaler offering faster scaling and better bin-packing (fitting pods efficiently onto nodes).
helm repo add karpenter https://charts.karpenter.sh
helm install karpenter karpenter/karpenter \
  --namespace karpenter --create-namespace \
  --set serviceAccount.annotations."eks\.amazonaws\.com/role-arn"=arn:aws:iam::ACCOUNT:role/KarpenterControllerRole
Define a provisioning policy. Note that newer Karpenter releases replace the v1alpha5 Provisioner API with NodePool; the legacy form looks like this:
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: default
spec:
  requirements:
    - key: karpenter.k8s.aws/instance-family
      operator: In
      values: [t3, t4g]
    - key: karpenter.sh/capacity-type
      operator: In
      values: [on-demand, spot]
  limits:
    resources:
      cpu: 100
      memory: 100Gi
Karpenter can mix on-demand and spot instances, consolidating pods onto fewer nodes. At Viprasol, we use Karpenter for cost-sensitive workloads, achieving 30-40% cost reductions through spot instance utilization.
Cost Optimization Strategies
Spot Instances
Spot instances are spare AWS capacity sold at 70-90% discounts. The catch: AWS can reclaim them with two minutes' notice, which makes them best suited to fault-tolerant workloads.
Configure node groups to use spot:
eksctl create nodegroup \
  --cluster my-cluster \
  --name spot-nodes \
  --instance-types t3.medium,t3.large \
  --spot
Or via Karpenter provisioner:
spec:
  requirements:
    - key: karpenter.sh/capacity-type
      operator: In
      values: [spot]
With spot instances, always run multiple replicas. Node termination doesn't equal pod loss if you have redundancy.
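A PodDisruptionBudget helps Kubernetes drain spot nodes gracefully by capping how many replicas can be down at once; the name and selector below are hypothetical and must match your own Deployment's labels:

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-app-pdb          # hypothetical name
spec:
  minAvailable: 2           # keep at least 2 replicas running
  selector:
    matchLabels:
      app: my-app           # must match the Deployment's pod labels
```

With this in place, node drains triggered by spot reclamation or consolidation will evict pods one at a time rather than taking the whole service down.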
Right-Sizing Instances
Oversizing nodes wastes money. Use Kubernetes metrics to identify the right instance type.
kubectl top nodes
kubectl top pods -A
If nodes consistently use only 20% capacity, downsize. A t3.large at 20% utilization should be a t3.medium or t3.small.
Reserved Instances
For baseline workloads, Reserved Instances offer substantial discounts over on-demand pricing (roughly 30% for a 1-year commitment, more for 3 years), locked in for the term.
Mix reserved instances (baseline capacity) with on-demand or spot (burst capacity). At Viprasol, we run RIs for predictable services and spot for analytics jobs.
Pod Resource Requests and Limits
Accurate resource requests enable efficient scheduling and cost optimization.
resources:
  requests:
    cpu: "100m"
    memory: "128Mi"
  limits:
    cpu: "500m"
    memory: "512Mi"
Requests tell Kubernetes the minimum resources a pod needs. Limits prevent runaway pods. Set these based on actual usage (measure in staging first).
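These values also determine the pod's QoS class: when requests equal limits for every container, the pod is Guaranteed and is evicted last under node memory pressure; when requests are lower than limits (as above), it is Burstable. A sketch of the Guaranteed form:

```yaml
resources:              # requests == limits -> QoS class "Guaranteed",
  requests:             # so this pod is among the last evicted when
    cpu: "250m"         # the node runs out of memory
    memory: "256Mi"
  limits:
    cpu: "250m"
    memory: "256Mi"
```

Use Guaranteed for latency-critical services and Burstable for workloads that can tolerate occasional throttling or eviction.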
Fargate for Cost Optimization
AWS Fargate eliminates node management—you pay only for pod compute time. Great for variable workloads where you don't need constant baseline capacity.
eksctl create fargateprofile \
  --cluster my-cluster \
  --name default \
  --namespace default
Fargate pricing is higher per CPU/memory hour than EC2, but you save on unused node capacity. Use for non-latency-sensitive batch jobs and dev/test workloads.
Compute Savings Plan
AWS Compute Savings Plans offer flexible discounts across EC2, Fargate, and Lambda. One plan can cover your EKS cluster and other AWS services.

⚙️ DevOps Done Right — Zero Downtime, Full Automation
Ship faster without breaking things. We build CI/CD pipelines, monitoring stacks, and auto-scaling infrastructure that your team can actually maintain.
- Staging + production environments with feature flags
- Automated security scanning in the pipeline
- Uptime monitoring + alerting + runbook automation
- On-call support handover docs included
Monitoring, Logging, and Troubleshooting
CloudWatch Integration
EKS integrates with CloudWatch. Enable control plane logging:
aws eks update-cluster-config \
  --name my-cluster \
  --logging '{"clusterLogging":[{"types":["api","audit","authenticator","controllerManager","scheduler"],"enabled":true}]}'
View logs in CloudWatch console. API logs show all requests to the Kubernetes API; audit logs show who changed what.
Container Insights
Container Insights provides dashboards for cluster, node, and pod metrics.
Deploy the CloudWatch agent:
kubectl apply -f https://raw.githubusercontent.com/aws-samples/amazon-cloudwatch-container-insights/latest/k8s-deployment-manifest-templates/deployment-mode/daemonset/container-insights-monitoring/quickstart/cwagent-fluentd-quickstart.yaml
View dashboards in CloudWatch console. Monitor CPU, memory, disk, and network per node and pod.
Debugging Common Issues
Pod stuck in Pending: Usually insufficient node resources. Check:
kubectl describe pod <pod-name>
kubectl get nodes
Add nodes or reduce resource requests.
Node NotReady: SSH to the node and check kubelet status:
systemctl status kubelet
journalctl -u kubelet -n 50
Restart if necessary: systemctl restart kubelet.
Networking issues: Verify security groups allow traffic. Check VPC CNI plugin:
kubectl get pods -n kube-system | grep aws-node
Key EKS Features and Services
| Feature | Purpose | Use Case |
|---|---|---|
| IRSA (IAM Roles for Service Accounts) | Fine-grained pod-level AWS permissions | Grant S3 access to specific pods without storing AWS keys |
| Network Policies | Control pod-to-pod traffic | Enforce network segmentation |
| Ingress Controllers | Route external traffic to services | Host multiple services on one load balancer |
| Managed Node Groups | AWS manages node lifecycle | Simplified node updates and patching |
| EKS Anywhere | Kubernetes on-premises | Extend EKS to data centers |
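As an illustration of the Network Policies row, a minimal policy that restricts ingress to pods labeled app: api so only pods labeled app: frontend can reach them (names and labels are hypothetical):

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: api-allow-frontend    # hypothetical name
spec:
  podSelector:
    matchLabels:
      app: api                # pods this policy protects
  policyTypes: ["Ingress"]
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend   # only frontend pods may connect
```

Note that a policy engine must be present to enforce this: recent versions of the AWS VPC CNI include network policy support, and Calico or Cilium also provide it.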
Quick Answers
Q1: Should I use EKS or self-managed Kubernetes?
EKS offloads control plane management, freeing your team for application work. Use EKS unless you have specific requirements for self-managed Kubernetes. At Viprasol, we prefer EKS for production because AWS handles security patches, HA, and scaling for the control plane.
Q2: How do I migrate from self-managed Kubernetes to EKS?
Create a new EKS cluster in the same VPC, configure node networking identically, then migrate workloads using kubectl or Helm. Test thoroughly in staging first. Most migrations take 1-2 weeks.
Q3: What's the right instance type for my workload?
Start with the t3 family (burstable, cost-effective). Measure actual CPU/memory usage in production, then move CPU-bound workloads to compute-optimized families (c5/c6i) and memory-bound workloads to r5/r6i. At Viprasol, we often use t3.large as a baseline with spot instances for burst.
Q4: How do I manage secrets in EKS?
Use AWS Secrets Manager or Parameter Store with IRSA to grant pod access:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: my-app              # example name
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::ACCOUNT:role/MyAppRole
Never store secrets in ConfigMaps or environment variables.
Q5: Can I run stateful applications on EKS?
Yes, using EBS volumes for persistent storage. For databases, consider RDS instead—managed services are simpler. StatefulSets manage pod identities and persistent volume claims:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql
spec:
  serviceName: "mysql"
  replicas: 1
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi
Q6: What's the EKS cost for a basic cluster?
EKS charges $0.10/hour per cluster ($73/month). Add EC2 instance costs, data transfer, and load balancers. A 3-node t3.medium cluster costs roughly $200-300/month including EKS fees. At Viprasol, we use our cloud solutions to reduce this through spot instances and consolidation.
Moving Forward with EKS and Cloud Solutions
EKS provides the best of both worlds: managed simplicity and Kubernetes flexibility. The key is disciplined scaling, cost optimization, and monitoring. At Viprasol, we run our AI agent systems and trading software on EKS, achieving 99.9% uptime and strong cost efficiency for compute-intensive workloads.
Start small, measure everything, and scale incrementally. Your production EKS cluster will handle demands you haven't anticipated yet—built on solid foundations.
About the Author
Viprasol Tech Team
Custom Software Development Specialists
The Viprasol Tech team specialises in algorithmic trading software, AI agent systems, and SaaS development. With 1000+ projects delivered across MT4/MT5 EAs, fintech platforms, and production AI systems, the team brings deep technical experience to every engagement.
Need DevOps & Cloud Expertise?
Scale your infrastructure with confidence. AWS, GCP, Azure certified team.
Free consultation • No commitment • Response within 24 hours
Making sense of your data at scale?
Viprasol builds end-to-end big data analytics solutions — ETL pipelines, data warehouses on Snowflake or BigQuery, and self-service BI dashboards. One reliable source of truth for your entire organisation.