Amazon EKS Cheat Sheet - Complete Kubernetes Service Guide 2025
- Author: QuizCld
Amazon Elastic Kubernetes Service (EKS) is a managed service for launching and running Kubernetes clusters on AWS.
Key Benefits:
- Faster deployment and lower operational overhead - EKS handles scheduling, cluster services, updates, and maintenance so teams can focus on applications
- Seamless scalability for both the AWS-managed control plane and the data plane - clusters support up to 100,000 worker nodes, enabling ultra-scale AI/ML workloads with up to 1.6M Trainium chips or 800,000 GPUs in a single cluster
- Improved security via deep AWS integration, including IAM for RBAC, Pod Identity, container image signing, automatic patching, and immutable operating systems
Two operation modes tailored to user needs:
- Standard EKS: AWS manages only the control plane and leaves node management to users
- EKS Auto Mode: AWS extends management to compute, networking, and storage - automatically provisioning, scaling, and patching the data plane with Bottlerocket nodes and managed EC2 instances (powered by Karpenter)
Features of Amazon EKS
Multiple Management Interfaces
EKS supports various tools to provision and manage clusters including: AWS Management Console, Amazon EKS API/SDKs, AWS CDK, AWS CLI, AWS CloudFormation and Terraform
Access Control and IAM Integration
- Supports Kubernetes-native RBAC integrated with AWS IAM, allowing fine-grained access control using IAM users and roles
- Enables Kubernetes workloads to assume AWS IAM roles using Kubernetes Service Accounts (IRSA / Pod Identity)
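As a rough illustration of how a service account is tied to an IAM role with IRSA, the sketch below builds the role's trust policy as a Python dictionary. The account ID, OIDC provider host, namespace, and service account name are all hypothetical placeholders; the helper name `irsa_trust_policy` is introduced here for illustration only.

```python
import json


def irsa_trust_policy(account_id, oidc_provider, namespace, service_account):
    """Build a trust policy that lets one Kubernetes service account assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # The cluster's OIDC provider must already be registered in IAM.
                "Principal": {
                    "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{oidc_provider}"
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {
                        f"{oidc_provider}:aud": "sts.amazonaws.com",
                        # Restrict the role to a single namespace/service account pair.
                        f"{oidc_provider}:sub": (
                            f"system:serviceaccount:{namespace}:{service_account}"
                        ),
                    }
                },
            }
        ],
    }


# Hypothetical values, for illustration only.
policy = irsa_trust_policy(
    account_id="111122223333",
    oidc_provider="oidc.eks.us-east-1.amazonaws.com/id/EXAMPLE0123456789",
    namespace="default",
    service_account="s3-reader",
)
print(json.dumps(policy, indent=2))
```

With EKS Pod Identity the trust policy is simpler: the principal is the `pods.eks.amazonaws.com` service and the role-to-service-account mapping is stored as a Pod Identity association rather than in the policy conditions.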
Compute Options & EC2 Support
- Full access to EC2 instance types, including x86 and Graviton (Arm) instances built on the Nitro System
- Supports EC2 On-Demand, Spot, Savings Plans, and AWS Fargate (serverless compute)
Storage Flexibility via CSI Integrations
- Standard mode allows you to manage storage classes manually
- In EKS Auto Mode, EBS block storage support is built in, but you still create a StorageClass that references the Auto Mode EBS provisioner (ebs.csi.eks.amazonaws.com)
- CSI drivers also enable use of Amazon S3, EFS, FSx, and File Cache for persistent storage needs
Embedded Security & Shared Responsibility Model
- AWS manages security of the control plane; users are responsible for securing worker nodes and workloads
- Supports compliance certifications: SOC, PCI, HIPAA‑eligible, ISO, FedRAMP, and more
- Integrations for container image signing, GuardDuty Runtime Monitoring, and secure networking via VPC policies
Observability and Monitoring Tools
- Native integration with CloudWatch Container Insights, Prometheus (via Amazon Managed Service for Prometheus), CloudTrail logs, and AWS Distro for OpenTelemetry (ADOT) Operator
- EKS Dashboard with curated control plane metrics, control plane log analysis, and alerts available in the AWS console at zero extra cost
Certified Kubernetes Compatibility & Version Support
- Conformant with upstream Kubernetes - existing tools and plugins are compatible
- Each Kubernetes version receives 14 months of standard support, followed by optional extended support for 12 more months (26 months in total)
- Automatic notification via AWS Health and optional auto-upgrade policies help manage version lifecycle
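To make the 26-month lifecycle concrete, here is a small sketch that projects support windows from a release date. It assumes the 26 months split into 14 months of standard support plus 12 months of extended support; the release date is hypothetical, and the authoritative end dates are the ones AWS publishes in its release calendar.

```python
from datetime import date


def add_months(d, months):
    """Return the date `months` calendar months after `d` (day clamped to 28 for simplicity)."""
    total = d.year * 12 + (d.month - 1) + months
    year, month_index = divmod(total, 12)
    return date(year, month_index + 1, min(d.day, 28))


def support_windows(eks_release, standard_months=14, extended_months=12):
    """Return (end of standard support, end of extended support)."""
    standard_end = add_months(eks_release, standard_months)
    extended_end = add_months(standard_end, extended_months)
    return standard_end, extended_end


# Hypothetical release date, for illustration only.
standard_end, extended_end = support_windows(date(2025, 1, 15))
print("standard support ends:", standard_end)   # 2026-03-15
print("extended support ends:", extended_end)   # 2027-03-15
```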
Amazon EKS Components
Cluster
- Amazon EKS uses a dedicated Kubernetes control plane, deployed across at least two API server nodes and three etcd instances in multiple Availability Zones (AZs) for high availability and resilience
- Your cluster runs in a VPC with public and private subnets spanning multiple AZs. You must provide at least two subnets in different AZs, each with ≥16 IP addresses recommended
- Cross-account ENIs created by EKS in your VPC allow secure communication between worker nodes and your control plane endpoint
- Pods receive VPC-native IP addresses via the AWS VPC CNI plugin, enabling them to participate in the VPC networking layer alongside EC2 instances
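The subnet requirements above can be checked before creating a cluster. The following is a minimal sketch using only the Python standard library; it looks at subnet size rather than currently free addresses (which would need an API call), and the CIDRs and zone names are made up.

```python
import ipaddress

AWS_RESERVED_PER_SUBNET = 5  # AWS reserves five addresses in every VPC subnet


def check_cluster_subnets(subnets, min_usable_ips=16):
    """subnets: list of (cidr, availability_zone) tuples. Returns a list of problems found."""
    problems = []
    zones = {az for _, az in subnets}
    if len(subnets) < 2 or len(zones) < 2:
        problems.append("need at least two subnets in different AZs")
    for cidr, az in subnets:
        usable = ipaddress.ip_network(cidr).num_addresses - AWS_RESERVED_PER_SUBNET
        if usable < min_usable_ips:
            problems.append(f"{cidr} ({az}) has only {usable} usable addresses")
    return problems


# Hypothetical subnets, for illustration only.
print(check_cluster_subnets([("10.0.1.0/24", "us-east-1a"), ("10.0.2.0/24", "us-east-1b")]))
print(check_cluster_subnets([("10.0.1.0/28", "us-east-1a")]))
```

The first call returns an empty list; the second reports both the missing second AZ and the undersized /28 subnet (16 addresses minus the 5 reserved leaves 11).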
Nodes
EKS worker nodes are typically EC2 instances, grouped in Auto Scaling Groups (ASGs) when run as Managed Node Groups or Self-managed Nodes. Fargate is the alternative: it runs Pods without any EC2 instances for you to manage.
Node Types Comparison:
Node Type | Description | Pros | Cons |
---|---|---|---|
Managed Node Groups | AWS-managed EC2 instances in Auto Scaling Groups. Provisioning, lifecycle, and rolling updates are automated | Easy to use • Auto-scaling support • Integration with Spot & Savings Plans | Less customization than self-managed nodes • Custom AMIs require a launch template |
Self-Managed Nodes | You provision and configure EC2 instances manually, then join them to the cluster yourself | Full control (AMI, kernel, networking) • Supports custom configurations | Requires manual setup • No auto patching or lifecycle automation |
AWS Fargate | Fully serverless, no EC2 nodes. Each Pod runs in its own isolated environment managed by AWS | No infrastructure to manage • Per-Pod isolation • Ideal for short-lived workloads | DaemonSets are not supported • No host-level access • Fewer customization options |
Networking & Load Balancing
- Use the VPC CNI plugin to provide each Pod with its own IP address from the VPC subnet; supports ENI warm pools and optional custom networking using secondary CIDR blocks via ENIConfig CRDs
- Load Balancing:
- Provision public or private ALBs/NLBs to expose services externally or internally
- In Fargate mode, ensure ALBs/NLBs target pods using IP mode rather than instance mode
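For the IP target mode mentioned above, a hedged sketch of an Ingress manifest follows, generated as JSON from Python. It assumes the AWS Load Balancer Controller is installed and that its IngressClass is named `alb`; the Ingress and Service names are hypothetical.

```python
import json


def alb_ingress(name, service_name, service_port, internet_facing=True):
    """Build an Ingress that asks for an ALB registering Pod IPs directly as targets."""
    scheme = "internet-facing" if internet_facing else "internal"
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "Ingress",
        "metadata": {
            "name": name,
            "annotations": {
                "alb.ingress.kubernetes.io/scheme": scheme,
                # "ip" targets Pods directly; required when Pods run on Fargate.
                "alb.ingress.kubernetes.io/target-type": "ip",
            },
        },
        "spec": {
            "ingressClassName": "alb",  # assumed IngressClass name
            "rules": [
                {
                    "http": {
                        "paths": [
                            {
                                "path": "/",
                                "pathType": "Prefix",
                                "backend": {
                                    "service": {
                                        "name": service_name,
                                        "port": {"number": service_port},
                                    }
                                },
                            }
                        ]
                    }
                }
            ],
        },
    }


# Hypothetical names; pipe the output to `kubectl apply -f -`.
manifest = alb_ingress("web", "web-svc", 80, internet_facing=False)
print(json.dumps(manifest, indent=2))
```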
Storage Integration
Pods are connected to volumes using StorageClass resources via Container Storage Interface (CSI) drivers.
Supported persistent storage options:
- Amazon EBS (block storage)
- Amazon EFS (shared file storage; works with both EC2 nodes and Fargate)
- Amazon FSx for Lustre
- FSx for NetApp ONTAP
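As an example of the StorageClass-to-CSI-driver link described above, this sketch emits a gp3 StorageClass as JSON. It assumes the EBS CSI driver provisioner name `ebs.csi.aws.com` for standard clusters and `ebs.csi.eks.amazonaws.com` for EKS Auto Mode; the class name and the choice to enable encryption are illustrative.

```python
import json


def gp3_storage_class(name="gp3", auto_mode=False):
    """Build a StorageClass that provisions encrypted gp3 EBS volumes."""
    provisioner = "ebs.csi.eks.amazonaws.com" if auto_mode else "ebs.csi.aws.com"
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        "provisioner": provisioner,
        # Delay volume creation until a Pod is scheduled, so the volume
        # is created in the same AZ as the node that will mount it.
        "volumeBindingMode": "WaitForFirstConsumer",
        "parameters": {"type": "gp3", "encrypted": "true"},
    }


storage_class = gp3_storage_class()
print(json.dumps(storage_class, indent=2))
```

`WaitForFirstConsumer` matters for EBS in particular because a volume lives in a single AZ and can only attach to nodes in that AZ.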
Amazon EKS Storage
Storage Option | Use Case | AZ Support | Access Modes | Best For |
---|---|---|---|---|
EBS (gp3) | High-performance block storage | Single AZ | ReadWriteOnce | Databases, transactional apps |
EFS (Standard/OneZone) | Shared file storage across Pods/AZs | Multi-AZ | ReadWriteMany | Shared storage, multi-Pod access |
FSx for Lustre | HPC/AI training, high-throughput compute | Single AZ | ReadWriteMany | Data-intensive workloads |
FSx for ONTAP | Enterprise-grade shared file systems | Single-AZ or Multi-AZ | ReadWriteMany | Enterprise workloads, rich features |
S3 CSI / File Cache | Object storage with caching | Regional (S3) | ReadWriteMany / ReadOnlyMany | Data lakes, large file access |
Amazon EKS Networking
Kubernetes Networking Requirements
- Pods can communicate with all other Pods, on the same node or on other nodes, without NAT
- All system daemons on a node can communicate with the Pods on that node
- Pods using the host network can reach Pods on all nodes without NAT
- These rules come from the Kubernetes networking model, so any conformant CNI plugin satisfies them
Amazon VPC CNI Components
- CNI binary: Manages Pod network setup on the host
- ipamd (IP Address Management Daemon):
- Allocates Elastic Network Interfaces (ENIs)
- Manages a warm pool of IP addresses
- Improves Pod startup performance by pre-allocating IPs
Operating Modes
Mode | Description | Use Case |
---|---|---|
Secondary IP | Default mode. Pods receive IPs from ENIs attached to EC2 nodes | Suitable for small to medium clusters |
Prefix Delegation | Assigns /28 IPv4 prefixes (16 addresses each) to an ENI instead of individual secondary IPs, enabling more Pods per node | Increases Pod density |
Custom Networking | Pods use IPs from separate subnets, typically in secondary VPC CIDR ranges such as 100.64.0.0/10 (CG-NAT space) or 198.19.0.0/16 | Prevents RFC1918 address exhaustion |
IPv6 Mode | Pods get IPv6 addresses from a managed VPC IPv6 range | Preferred for solving IPv4 exhaustion at scale |
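The difference in Pod density between Secondary IP mode and Prefix Delegation can be estimated with the commonly cited formula ENIs × (IPs per ENI − 1) + 2. The sketch below is an approximation under stated assumptions: the ENI and IP limits are passed in by the caller (the sample values match an instance with 3 ENIs and 10 IPv4 addresses per ENI), and the 110/250 cap under prefix delegation follows AWS's recommended maximums rather than a hard limit.

```python
def max_pods(enis, ips_per_eni, prefix_delegation=False, vcpus=2):
    """Estimate the maximum number of Pods a node can host with the VPC CNI."""
    # One address per ENI is the ENI's own primary IP; each remaining slot holds
    # either a single secondary IP or a /28 prefix (16 addresses).
    addresses_per_slot = 16 if prefix_delegation else 1
    # +2 accounts for host-network Pods (such as kube-proxy and the CNI itself).
    limit = enis * (ips_per_eni - 1) * addresses_per_slot + 2
    if prefix_delegation:
        # Assumed recommended cap: 110 Pods below 30 vCPUs, 250 otherwise.
        limit = min(limit, 110 if vcpus < 30 else 250)
    return limit


print(max_pods(3, 10))                          # 29
print(max_pods(3, 10, prefix_delegation=True))  # 110 (434 before the cap)
```

The numbers show why prefix delegation is mostly about removing the IP-per-ENI bottleneck: after enabling it, CPU and memory rather than addresses tend to bound Pod count.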
Network Security & Isolation
- Security Groups:
- By default, Pods inherit the EC2 node's primary ENI security group
- Optionally, enable per-Pod security groups for granular isolation
- Supports VPC Flow Logs, Routing Policies, and AWS-native Network ACLs
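Per-Pod security groups are configured through a `SecurityGroupPolicy` custom resource. The sketch below generates one as JSON; it assumes the feature is enabled on the VPC CNI and that nodes are Nitro-based instances, and the names, labels, and security group ID are hypothetical.

```python
import json


def security_group_policy(name, namespace, match_labels, group_ids):
    """Build a SecurityGroupPolicy that attaches security groups to matching Pods."""
    return {
        "apiVersion": "vpcresources.k8s.aws/v1beta1",
        "kind": "SecurityGroupPolicy",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # Pods in this namespace carrying these labels get the groups below
            # instead of the node's security groups.
            "podSelector": {"matchLabels": match_labels},
            "securityGroups": {"groupIds": list(group_ids)},
        },
    }


# Hypothetical identifiers, for illustration only.
sgp = security_group_policy(
    name="db-clients",
    namespace="payments",
    match_labels={"role": "db-client"},
    group_ids=["sg-0123456789abcdef0"],
)
print(json.dumps(sgp, indent=2))
```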
Subnet Planning & IP Management
- Ensure subnets in at least two Availability Zones
- Monitor subnet IP availability to avoid Pod scheduling failures
- Use Subnet Calculator tools to estimate subnet usage, evaluate ENI/IP configuration, and plan node density and scaling
Add-On Updates & Maintenance
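- VPC CNI can run as an Amazon EKS add-on (recommended) or as a self-managed add-on
- Not updated automatically when the EKS control plane is upgraded
- Update it yourself using the AWS Management Console, AWS CLI, or eksctl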
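The same update can be scripted through the EKS API. This is a minimal sketch assuming the VPC CNI is installed as an Amazon EKS add-on (not self-managed) and that a boto3 EKS client is passed in; the cluster name and version string in the usage comment are placeholders, not real values.

```python
def update_vpc_cni(eks_client, cluster_name, addon_version):
    """Request an update of the vpc-cni add-on, preserving existing custom settings."""
    return eks_client.update_addon(
        clusterName=cluster_name,
        addonName="vpc-cni",
        addonVersion=addon_version,
        # PRESERVE keeps any values you changed on the add-on; OVERWRITE resets them.
        resolveConflicts="PRESERVE",
    )


# Usage with real credentials (not executed here):
#   import boto3
#   eks = boto3.client("eks")
#   update_vpc_cni(eks, "my-cluster", "<version returned by describe_addon_versions>")
```

Taking the client as a parameter keeps the function easy to test with a stub and avoids hard-coding a Region or profile.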
Amazon EKS Security
Shared Responsibility Model
Security of the cloud (AWS responsibility):
- AWS secures the underlying infrastructure and the Kubernetes control plane, including its etcd database
- Includes physical data center security and global network
- Third-party audits through programs like SOC, PCI, ISO, FedRAMP, etc.
Security in the cloud (Your responsibility):
- Secure your own data plane, including node configuration and network access (security groups)
- Maintain container images, operating systems, and patch updates
- Define network controls (e.g., firewalls)
- Configure IAM roles, RBAC policies, Kubernetes namespaces, and Pod Security Standards (PodSecurityPolicy was removed in Kubernetes 1.25)
- Monitor activity using services like AWS CloudTrail, Amazon GuardDuty, and AWS Config
Compliance Certifications
Amazon EKS is in scope for the following compliance programs:
- SOC 1/2/3
- PCI-DSS
- ISO 27001/27017/27018
- FedRAMP Moderate
- IRAP
- ENS High
- HITRUST
- HIPAA eligible service
Amazon EKS Monitoring
Observability Dashboard (Built-in)
EKS Observability Dashboard in the AWS Console shows cluster-level health and performance:
- Metrics: API server latency, storage size, scheduler stats, pending pods, etc.
- Cluster health insights: configuration issues, Kubernetes version upgrade readiness
- Node health: detected via the EKS node monitoring agent and supports auto-repair for Auto Mode clusters
Metrics Collection & Analysis
- CloudWatch Container Insights provides aggregated metrics for EKS clusters, nodes, pods, and containers
- Use the CloudWatch agent with Prometheus integration to scrape and ingest cluster/application metrics into CloudWatch from Prometheus exporters
- For high-scale setups, use Amazon Managed Service for Prometheus (AMP) along with PromQL and Alertmanager for custom alerting workflows
- Complement metrics with kube-state-metrics (KSM) to expose Kubernetes object-level metrics (e.g. deployments, jobs, pod counts)
Logging & Tracing
- Enable control plane logging (API server, audit, authenticator, controller manager, and scheduler logs) to send these logs to CloudWatch Logs
- Use GuardDuty's EKS Audit Log Monitoring to continuously analyze audit logs for anomalous or unauthorized behavior (no manual agent needed)
- Forward application and system logs with Fluent Bit to Amazon OpenSearch Service or Elasticsearch/Kibana for search and visualization, and use AWS X-Ray for distributed tracing
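Control plane logging is off by default and is switched on per log type. The following sketch builds the `logging` payload for the EKS `update_cluster_config` API call and wraps the call in a helper; the client is expected to be a boto3 EKS client, and the helper names and cluster name are illustrative.

```python
CONTROL_PLANE_LOG_TYPES = ("api", "audit", "authenticator", "controllerManager", "scheduler")


def logging_config(enabled_types):
    """Build the clusterLogging payload: listed types enabled, all others disabled."""
    unknown = set(enabled_types) - set(CONTROL_PLANE_LOG_TYPES)
    if unknown:
        raise ValueError(f"unknown log types: {sorted(unknown)}")
    config = [{"types": list(enabled_types), "enabled": True}]
    disabled = [t for t in CONTROL_PLANE_LOG_TYPES if t not in enabled_types]
    if disabled:
        config.append({"types": disabled, "enabled": False})
    return {"clusterLogging": config}


def enable_control_plane_logs(eks_client, cluster_name, enabled_types=("api", "audit")):
    """Send the logging configuration to the cluster via the EKS API."""
    return eks_client.update_cluster_config(
        name=cluster_name, logging=logging_config(enabled_types)
    )


print(logging_config(("api", "audit")))

# Usage with real credentials (not executed here):
#   import boto3
#   enable_control_plane_logs(boto3.client("eks"), "my-cluster")
```

Enabling only the log types you need is a reasonable default, since each enabled type adds CloudWatch Logs ingestion and storage charges.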
Alerting Best Practices
Follow AWS prescriptive guidance:
- Define thresholds based on SLOs and historical trends
- Use severity tiers (Critical, High, Medium)
- Group related alerts, include context, and link runbooks
- Utilize Prometheus + Grafana + Alertmanager or integrate with AWS SNS or Slack for team notifications
- Implement anomaly detection and custom metrics auto-scaling
Security Monitoring
- Enable GuardDuty Runtime Monitoring to profile container behavior (file access, process execution, network patterns) within EKS workloads (works with both Managed and Auto Mode clusters)
- Configure custom Trusted IP lists or Threat IP lists to alert on suspicious network interactions
Amazon EKS On AWS Outposts
Amazon EKS can run on AWS Outposts in two ways:
- Local Clusters: Both the control plane and worker nodes run on the Outpost, offering low latency and local data processing even during internet outages
- Extended Clusters: The control plane runs in the AWS Region, while worker nodes operate on the Outpost, ideal for hybrid workloads
Choose based on connectivity needs, latency, and compliance requirements.
Amazon EKS Anywhere
Amazon EKS Anywhere lets you run Kubernetes clusters on your on-premises infrastructure using VMware vSphere or bare metal servers, with consistent tooling from Amazon EKS. It supports offline cluster management, built-in observability, and integrates with GitOps-based lifecycle tools. Ideal for hybrid and disconnected edge environments, it provides operational consistency and security across environments.
Amazon EKS Distro
Amazon EKS Distro is the open-source distribution of the Kubernetes components used by Amazon EKS. It provides the same versions of Kubernetes, etcd, and other dependencies that are tested and deployed on EKS, ensuring compatibility and long-term support. You can use EKS-D to build and operate your own Kubernetes clusters independent of AWS, while still benefiting from secure defaults, EKS compatibility, and regular updates.
Amazon EKS Pricing
Item | Pricing |
---|---|
EKS Cluster | $0.10 per hour per cluster on a Kubernetes version in standard support ($0.60 per hour in extended support) |
EC2-based Worker Nodes | Pay for EC2 instances, EBS volumes, and load balancers |
Fargate-based Pods | Pay for the vCPU and memory each Pod requests, billed per second |
EKS Anywhere / EKS Distro | Free to use (open source), but you pay for infrastructure you run it on |
Savings Plans & Spot Pricing | Supported for EC2-based workloads to reduce costs |
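A back-of-the-envelope estimate can combine the rows above. This sketch assumes 730 hours per month and the $0.10 per hour cluster fee from the table; the node hourly rate in the example is a hypothetical figure, and EBS, load balancer, and data transfer charges are deliberately left out.

```python
HOURS_PER_MONTH = 730  # common approximation: 365 days * 24 hours / 12 months


def monthly_cost(node_count, node_hourly_usd, cluster_hourly_usd=0.10):
    """Estimate monthly cost in USD for one cluster with identical EC2 worker nodes."""
    control_plane = cluster_hourly_usd * HOURS_PER_MONTH
    nodes = node_count * node_hourly_usd * HOURS_PER_MONTH
    return round(control_plane + nodes, 2)


print(monthly_cost(0, 0.0))    # 73.0   (control plane only)
print(monthly_cost(3, 0.096))  # 283.24 (three nodes at a hypothetical $0.096/hour)
```

The fixed control plane fee is small next to node cost for all but the tiniest clusters, which is why Spot and Savings Plans usually matter more than the per-cluster charge.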