Kubernetes Networking: CNI Plugins, NetworkPolicy, Service Mesh vs Native, and Ingress
Master Kubernetes networking: choose the right CNI plugin, write NetworkPolicy rules for zero-trust, decide between service mesh and native K8s networking, and configure production ingress with TLS.
Kubernetes networking is one of the most misunderstood parts of the platform. The abstractions are clean — Pods, Services, Ingress — but the underlying implementation varies dramatically between CNI plugins, and the security implications of that variance are significant.
By default, all pods in a cluster can communicate with all other pods. That's the opposite of zero-trust. NetworkPolicy is the tool that fixes it, but only if your CNI actually enforces it.
The Kubernetes Network Model
Three requirements every Kubernetes network must satisfy:
- Every pod gets its own IP address
- Pods can communicate with any other pod without NAT
- System agents on a node (e.g., the kubelet and node daemons) can communicate with all pods on that node
This flat network is intentional — it makes service discovery simple. The tradeoff: without NetworkPolicy, any compromised pod can reach any other pod in the cluster.
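You can observe the default allow-all behavior from any throwaway pod — the sketch below is illustrative (the `api`/`production` Service name and port are placeholders; substitute a real Service in your cluster):

```shell
# Launch a disposable busybox pod
kubectl run probe --image=busybox:1.36 --restart=Never -- sleep 3600

# With no NetworkPolicy anywhere, a cross-namespace request succeeds
kubectl exec probe -- wget -qO- --timeout=2 \
  http://api.production.svc.cluster.local:3000/

# Clean up
kubectl delete pod probe
```

If that request succeeds from a namespace that has no business reaching your API, you have a concrete demonstration of why a deny-all baseline matters.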
CNI Plugin Comparison
The Container Network Interface (CNI) is the plugin layer that implements pod networking. Your choice determines feature availability, performance, and operational complexity.
| CNI | Implementation | NetworkPolicy | eBPF | Multi-cluster | Best For |
|---|---|---|---|---|---|
| Flannel | VXLAN overlay | ❌ No enforcement | ❌ | ❌ | Simple clusters, learning |
| Calico | eBPF or iptables | ✅ Full + extended | ✅ Optional | ✅ | Production, policy-heavy |
| Cilium | eBPF native | ✅ L3/L4/L7 | ✅ Native | ✅ | High-performance, observability |
| Weave | VXLAN mesh | ✅ Basic | ❌ | ❌ | Legacy clusters (project archived) |
| AWS VPC CNI | Native VPC IPs | ✅ Native (v1.14+) or via Calico | Partial (policy agent) | Limited | EKS, VPC-native routing |
Recommendation for production: Cilium for new clusters. It's the only mainstream CNI with native L7 policy (HTTP method/path-level), ships Hubble for flow-level observability, and its eBPF data plane avoids the per-packet iptables overhead of older CNIs.
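If you adopt that recommendation, a minimal install with the official Cilium CLI looks like this (the version pin is an example — check the current release before copying):

```shell
# Requires the Cilium CLI: https://github.com/cilium/cilium-cli
cilium install --version 1.15.5

# Block until the agent and operator report ready
cilium status --wait

# Optional but worthwhile: the built-in end-to-end connectivity test suite
cilium connectivity test
```

The connectivity test deploys probe pods and exercises pod-to-pod, pod-to-service, and policy paths, which catches most misconfigurations before workloads land on the cluster.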
NetworkPolicy: Zero-Trust Pod Communication
Default K8s networking is deny-nothing. NetworkPolicy lets you implement deny-all and then explicitly allow what's needed.
Deny All Traffic by Default
# policy/deny-all.yaml
# Apply to every namespace — this is your baseline
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: deny-all-ingress-egress
namespace: production
spec:
podSelector: {} # Applies to ALL pods in namespace
policyTypes:
- Ingress
- Egress
# No ingress/egress rules = deny all
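After applying the baseline, verify enforcement — on a CNI that doesn't enforce NetworkPolicy (e.g., Flannel), the request below still succeeds and the policy is silently ignored. Pod and Service names here are illustrative:

```shell
kubectl apply -f policy/deny-all.yaml

# From a throwaway pod in the namespace, a request that previously
# worked should now time out
kubectl run probe -n production --image=busybox:1.36 --restart=Never -- sleep 3600
kubectl exec -n production probe -- \
  wget -qO- --timeout=2 http://api.production.svc.cluster.local:3000/ \
  || echo "blocked (expected)"

kubectl delete pod -n production probe
```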
Allow Specific Service Communication
# policy/api-to-database.yaml
# Only the API pods can talk to the database pods
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: allow-api-to-postgres
namespace: production
spec:
# This policy applies to postgres pods
podSelector:
matchLabels:
app: postgres
tier: database
policyTypes:
- Ingress
ingress:
- from:
# Only pods with this label can connect
- podSelector:
matchLabels:
app: api
tier: backend
ports:
- protocol: TCP
port: 5432
# policy/api-egress.yaml
# API pods can only talk to: database, redis, external APIs
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: api-egress-rules
namespace: production
spec:
podSelector:
matchLabels:
app: api
policyTypes:
- Egress
egress:
# Allow DNS resolution (required for all pods)
- to:
- namespaceSelector:
matchLabels:
kubernetes.io/metadata.name: kube-system
podSelector:
matchLabels:
k8s-app: kube-dns
ports:
- port: 53
protocol: UDP
- port: 53
protocol: TCP
# Allow connection to database (same namespace)
- to:
- podSelector:
matchLabels:
app: postgres
ports:
- port: 5432
protocol: TCP
# Allow connection to Redis
- to:
- podSelector:
matchLabels:
app: redis
ports:
- port: 6379
protocol: TCP
# Allow HTTPS to external services (Stripe, Sendgrid, etc.)
- to:
- ipBlock:
cidr: 0.0.0.0/0
except:
- 10.0.0.0/8 # Block internal VPC
- 172.16.0.0/12 # Block pod CIDR
- 192.168.0.0/16 # Block private ranges
ports:
- port: 443
protocol: TCP
Cilium L7 Policy (HTTP-level)
Standard K8s NetworkPolicy only works at L3/L4 (IP/port). Cilium extends this to L7:
# cilium-policy/api-http-policy.yaml
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
name: api-http-rules
namespace: production
spec:
endpointSelector:
matchLabels:
app: internal-service
ingress:
- fromEndpoints:
- matchLabels:
app: api
toPorts:
- ports:
- port: "8080"
protocol: TCP
rules:
http:
# Only allow these specific HTTP paths
- method: GET
path: /api/v1/users/.*
- method: POST
path: /api/v1/orders
# Block everything else at L7
Service Mesh: When You Need It
A service mesh adds a sidecar proxy (Envoy in Istio, a lightweight Rust proxy in Linkerd) to every pod, providing mTLS, traffic management, and observability without application changes.
Use a service mesh when you need:
- Automatic mTLS between services (zero-code encryption)
- Traffic splitting for canary deployments (10% → 50% → 100%)
- Circuit breaking at L7
- Distributed tracing without code changes
- Fine-grained retry and timeout policies per route
Don't use a service mesh when:
- Your cluster has fewer than 20 services
- You can't absorb the sidecar's added per-hop latency (roughly 1–15ms depending on the mesh) and resource overhead
- Your team doesn't have capacity to operate it
- Cilium + NetworkPolicy meets your security requirements
Istio vs Linkerd Comparison
| Factor | Istio | Linkerd |
|---|---|---|
| Proxy | Envoy | Linkerd-proxy (Rust) |
| CPU overhead | 15–30% | 5–10% |
| Memory overhead | 150–300MB/pod | 30–50MB/pod |
| Latency added | 5–15ms | 1–3ms |
| Feature set | Comprehensive | Focused/simpler |
| Learning curve | High | Medium |
| mTLS | ✅ | ✅ |
| Traffic splitting | ✅ | ✅ |
| L7 observability | ✅ | ✅ |
| WASM plugins | ✅ | ❌ |
Recommendation: Linkerd for most teams — lower overhead, simpler ops, covers 90% of service mesh use cases.
Linkerd Installation
# Install Linkerd CLI
curl --proto '=https' -sSfL https://run.linkerd.io/install | sh
# Validate cluster compatibility
linkerd check --pre
# Install Linkerd control plane
linkerd install --crds | kubectl apply -f -
linkerd install | kubectl apply -f -
# Verify
linkerd check
# Inject sidecar into a namespace (all pods get sidecar on restart)
kubectl annotate namespace production \
linkerd.io/inject=enabled
# Or inject into a specific deployment
kubectl get deploy api -o yaml | linkerd inject - | kubectl apply -f -
# Linkerd traffic split — canary deployment
# (SMI TrafficSplit; newer Linkerd releases use Gateway API HTTPRoute instead)
apiVersion: split.smi-spec.io/v1alpha2
kind: TrafficSplit
metadata:
name: api-canary
namespace: production
spec:
service: api
backends:
- service: api-stable # Current version
weight: 90
- service: api-canary # New version
weight: 10
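After applying the split, watch success rates on both backends before shifting more weight. The filename below is hypothetical; the deployment names match the TrafficSplit above:

```shell
kubectl apply -f traffic-split.yaml

# Per-deployment success rate, RPS, and latency as seen by the Linkerd proxies
linkerd viz stat deploy -n production

# Live request tap on the canary to spot errors early
linkerd viz tap deploy/api-canary -n production
```

Only promote the canary (90/10 → 50/50 → 0/100) once its success rate and latency match the stable backend under real traffic.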
Production Ingress Configuration
# ingress/nginx-ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: api-ingress
namespace: production
annotations:
# Rate limiting
nginx.ingress.kubernetes.io/limit-rps: "100"
nginx.ingress.kubernetes.io/limit-connections: "20"
# TLS configuration
cert-manager.io/cluster-issuer: "letsencrypt-prod"
# Security headers
nginx.ingress.kubernetes.io/configuration-snippet: |
more_set_headers "Strict-Transport-Security: max-age=63072000; includeSubDomains; preload";
more_set_headers "X-Frame-Options: DENY";
more_set_headers "X-Content-Type-Options: nosniff";
# Proxy settings
nginx.ingress.kubernetes.io/proxy-body-size: "10m"
nginx.ingress.kubernetes.io/proxy-read-timeout: "60"
nginx.ingress.kubernetes.io/proxy-send-timeout: "60"
# WebSocket support
nginx.ingress.kubernetes.io/proxy-http-version: "1.1"
nginx.ingress.kubernetes.io/proxy-set-headers: "production/websocket-headers"
spec:
ingressClassName: nginx
tls:
- hosts:
- api.viprasol.com
secretName: api-tls-cert # cert-manager creates this
rules:
- host: api.viprasol.com
http:
paths:
- path: /api
pathType: Prefix
backend:
service:
name: api
port:
number: 3000
- path: /ws
pathType: Prefix
backend:
service:
name: websocket-server
port:
number: 3001
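A quick way to confirm the TLS certificate and header annotations took effect once the Ingress is live (adjust the host to your domain; the `/api` path matches the rule above):

```shell
# Inspect the certificate cert-manager issued for the host
curl -sIv https://api.viprasol.com/api 2>&1 | grep -iE "subject:|expire"

# Security headers injected via the configuration-snippet annotation
curl -sI https://api.viprasol.com/api | \
  grep -iE "strict-transport-security|x-frame-options|x-content-type-options"
```

If the headers are missing, check whether your ingress-nginx deployment allows snippet annotations (`allow-snippet-annotations` is disabled by default in recent controller releases).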
cert-manager for Automatic TLS
# cert-manager/cluster-issuer.yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
name: letsencrypt-prod
spec:
acme:
server: https://acme-v02.api.letsencrypt.org/directory
email: devops@viprasol.com
privateKeySecretRef:
name: letsencrypt-prod-key
solvers:
# DNS-01 challenge via Route53 (more reliable for wildcard certs)
- dns01:
route53:
region: us-east-1
hostedZoneID: Z1234567890
accessKeyIDSecretRef:
name: route53-credentials
key: access-key-id
secretAccessKeySecretRef:
name: route53-credentials
key: secret-access-key
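cert-manager watches Ingress resources carrying the cluster-issuer annotation and creates the Certificate automatically. These commands trace issuance end to end (resource names follow the Ingress example above):

```shell
# The Certificate object cert-manager created from the Ingress annotation
kubectl get certificate -n production

# Follow the issuance chain if the certificate stays in a non-Ready state
kubectl describe certificate api-tls-cert -n production
kubectl get certificaterequest,order,challenge -n production
```

A stuck `challenge` usually means the DNS-01 record couldn't be created — check the Route53 credentials secret and the hosted zone ID first.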
Network Observability with Hubble (Cilium)
# Install Hubble (Cilium's observability layer)
cilium hubble enable --ui
# Watch live network flows
hubble observe --namespace production --follow
# Filter for dropped packets (policy violations)
hubble observe --namespace production \
--verdict DROPPED \
--follow
# Watch flows between specific pods
hubble observe \
--from-pod production/api-7d9f8b-xyz \
--to-pod production/postgres-5c8d9a-abc \
--follow
# Export metrics to Prometheus
cilium hubble enable --metrics "dns,drop,tcp,flow,port-distribution,icmp,http"
Terraform: Production EKS Networking
# terraform/eks-networking.tf
module "eks" {
source = "terraform-aws-modules/eks/aws"
version = "~> 20.0"
cluster_name = "production"
cluster_version = "1.30"
vpc_id = module.vpc.vpc_id
subnet_ids = module.vpc.private_subnets
# Cilium as CNI (replace aws-vpc-cni)
cluster_addons = {
# Skip vpc-cni — Cilium replaces it
coredns = {
most_recent = true
}
kube-proxy = {
most_recent = true
}
}
eks_managed_node_groups = {
main = {
instance_types = ["m6i.xlarge"]
min_size = 3
max_size = 10
desired_size = 3
# Taint for Cilium bootstrap
taints = [
{
key = "node.cilium.io/agent-not-ready"
value = "true"
effect = "NO_EXECUTE"
}
]
}
}
}
# Install Cilium via Helm after cluster creation
resource "helm_release" "cilium" {
name = "cilium"
repository = "https://helm.cilium.io"
chart = "cilium"
version = "1.15.5"
namespace = "kube-system"
set {
name = "eni.enabled"
value = "true"
}
set {
name = "ipam.mode"
value = "eni"
}
set {
name = "hubble.relay.enabled"
value = "true"
}
set {
name = "hubble.ui.enabled"
value = "true"
}
set {
name = "operator.replicas"
value = "2"
}
depends_on = [module.eks]
}
Networking Cost Reference (2026)
| Component | Option | Monthly Cost (10-node cluster) |
|---|---|---|
| CNI | Cilium (open source) | $0 |
| Service mesh | Linkerd (open source) | $0 |
| Service mesh | Istio (open source) | $0 |
| Ingress controller | nginx-ingress (open source) | $0 |
| Load balancer (AWS NLB) | Per LB | $16 + $0.006/LCU |
| TLS certificates | cert-manager + Let's Encrypt | $0 |
| TLS certificates | AWS ACM | $0 (free with ALB/NLB) |
| Network observability | Hubble (Cilium) | $0 |
| Network observability | Datadog NPM | $5–$10/host/month |
See Also
- Cloud-Native Security Best Practices — pod security standards
- Kubernetes Operators: Automating Complex Workloads — custom controllers
- Distributed Tracing with OpenTelemetry — service-level observability
- Zero-Trust Security Architecture — network zero-trust model
Working With Viprasol
Kubernetes networking misconfiguration is one of the most common causes of security incidents in cloud-native environments. Our platform engineers design cluster networking with zero-trust NetworkPolicy from day one, implement service mesh when the complexity is justified, and tune ingress for performance and security.
About the Author
Viprasol Tech Team
Custom Software Development Specialists
The Viprasol Tech team specialises in algorithmic trading software, AI agent systems, and SaaS development. With 100+ projects delivered across MT4/MT5 EAs, fintech platforms, and production AI systems, the team brings deep technical experience to every engagement. Based in India, serving clients globally.