I got access to the EKS preview last week and immediately dropped everything to kick the tires on it. Managed Kubernetes on AWS has been on my wishlist for a while — I've been hand-rolling kops clusters for clients and the operational overhead gets old fast. Here's what I found.
What EKS actually is
EKS gives you a managed Kubernetes control plane. AWS runs etcd and the API server across multiple availability zones; you don't see those nodes, you don't pay for them directly (you pay a flat $0.20/hour per cluster), and you don't have to patch them. You still manage your worker nodes — those are regular EC2 instances in your account — but at least the thing that's hardest to operate correctly (the control plane) is someone else's problem now.
This is different from GKE, where Google has been running managed Kubernetes for years. EKS is late to the party, but it's AWS, so everything integrates with IAM, VPC, ELB, and the rest. That matters if you're already all-in on AWS.
Creating a cluster
During preview you create clusters via the AWS CLI. eksctl from Weaveworks doesn't exist yet, so we're doing this the long way.
First, you need an IAM role for the control plane:
$ aws iam create-role \
--role-name eks-cluster-role \
--assume-role-policy-document file://eks-trust-policy.json
The trust policy lets EKS assume the role:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "eks.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}
Attach the managed policies:
$ aws iam attach-role-policy \
--role-name eks-cluster-role \
--policy-arn arn:aws:iam::aws:policy/AmazonEKSClusterPolicy
$ aws iam attach-role-policy \
--role-name eks-cluster-role \
--policy-arn arn:aws:iam::aws:policy/AmazonEKSServicePolicy
Then create the cluster. You need an existing VPC and subnets — EKS doesn't create these for you:
$ aws eks create-cluster \
--name my-cluster \
--role-arn arn:aws:iam::123456789012:role/eks-cluster-role \
--resources-vpc-config subnetIds=subnet-abc123,subnet-def456,securityGroupIds=sg-xyz789 \
--region us-east-1
Creating the control plane takes about 10 minutes. You can poll status with:
$ aws eks describe-cluster --name my-cluster --query cluster.status
"CREATING"
...
"ACTIVE"
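If you'd rather not re-run that by hand, a throwaway polling loop does the trick (same cluster name and region as the example above):

```shell
# Poll until the control plane reports ACTIVE (takes ~10 minutes)
while true; do
  STATUS=$(aws eks describe-cluster --name my-cluster --region us-east-1 \
    --query cluster.status --output text)
  echo "cluster status: ${STATUS}"
  [ "${STATUS}" = "ACTIVE" ] && break
  sleep 30
done
```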
Configuring kubectl
Once the cluster is active, you need to wire up kubectl. EKS uses a custom authentication mechanism that calls back into the AWS CLI:
$ aws eks update-kubeconfig --name my-cluster --region us-east-1
Added new context arn:aws:eks:us-east-1:123456789012:cluster/my-cluster to /Users/mharris/.kube/config
Under the hood this sets up an exec-based user in your kubeconfig that shells out to aws-iam-authenticator to mint a short-lived bearer token on every request, so you need the aws-iam-authenticator binary in your PATH for this to work. If you get error: You must be logged in to the server (Unauthorized), that's probably why.
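For reference, the user stanza it writes into your kubeconfig looks roughly like this (the exact apiVersion and field layout may vary between preview builds):

```yaml
users:
- name: arn:aws:eks:us-east-1:123456789012:cluster/my-cluster
  user:
    exec:
      apiVersion: client.authentication.k8s.io/v1alpha1
      command: aws-iam-authenticator
      args:
        - token
        - -i
        - my-cluster
```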
Worker nodes
Here's the part AWS doesn't do for you yet: the actual compute. You launch an EC2 Auto Scaling Group using the EKS-optimized AMI (find the right AMI ID for your region in the EKS docs — they publish a table). The worker nodes need a bootstrap script to register with the cluster:
#!/bin/bash
/etc/eks/bootstrap.sh my-cluster
The worker nodes also need an IAM instance profile with the EKS worker node policies:
AmazonEKSWorkerNodePolicy
AmazonEKS_CNI_Policy
AmazonEC2ContainerRegistryReadOnly
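The worker-node setup above can be scripted end to end. This is a sketch, not a battle-tested runbook; the role, profile, and trust-policy file names are my own placeholders, the AMI ID is a stand-in you need to replace with the EKS-optimized AMI for your region, and bootstrap.sh is the two-line script shown earlier:

```shell
# Worker node IAM role; ec2-trust-policy.json trusts ec2.amazonaws.com
aws iam create-role \
  --role-name eks-worker-node-role \
  --assume-role-policy-document file://ec2-trust-policy.json

# Attach the three worker policies listed above
for policy in AmazonEKSWorkerNodePolicy AmazonEKS_CNI_Policy AmazonEC2ContainerRegistryReadOnly; do
  aws iam attach-role-policy \
    --role-name eks-worker-node-role \
    --policy-arn "arn:aws:iam::aws:policy/${policy}"
done

# Wrap the role in an instance profile so EC2 instances can use it
aws iam create-instance-profile --instance-profile-name eks-worker-profile
aws iam add-role-to-instance-profile \
  --instance-profile-name eks-worker-profile \
  --role-name eks-worker-node-role

# Launch configuration + Auto Scaling Group for the workers
aws autoscaling create-launch-configuration \
  --launch-configuration-name eks-workers \
  --image-id ami-XXXXXXXX \
  --instance-type m5.large \
  --iam-instance-profile eks-worker-profile \
  --security-groups sg-xyz789 \
  --user-data file://bootstrap.sh

aws autoscaling create-auto-scaling-group \
  --auto-scaling-group-name eks-workers \
  --launch-configuration-name eks-workers \
  --min-size 2 --max-size 4 --desired-capacity 2 \
  --vpc-zone-identifier "subnet-abc123,subnet-def456"
```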
The aws-auth ConfigMap
This part bit me. Worker nodes authenticate to the control plane using their IAM role, but that role has to be explicitly allowed in the aws-auth ConfigMap. Without it your nodes never register at all — they simply don't show up in kubectl get nodes:
apiVersion: v1
kind: ConfigMap
metadata:
  name: aws-auth
  namespace: kube-system
data:
  mapRoles: |
    - rolearn: arn:aws:iam::123456789012:role/eks-worker-node-role
      username: system:node:{{EC2PrivateDNSName}}
      groups:
        - system:bootstrappers
        - system:nodes
Apply it with kubectl apply -f aws-auth.yaml and your nodes should come up within a minute:
$ kubectl get nodes
NAME                        STATUS    ROLES     AGE       VERSION
ip-10-0-1-45.ec2.internal   Ready     <none>    2m        v1.9.6
ip-10-0-2-22.ec2.internal   Ready     <none>    2m        v1.9.6
Deploying a sample app
Nothing fancy. Let's deploy nginx to confirm everything hangs together:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:1.13
          ports:
            - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: nginx
spec:
  type: LoadBalancer
  selector:
    app: nginx
  ports:
    - port: 80
      targetPort: 80
$ kubectl apply -f nginx.yaml
deployment.apps/nginx created
service/nginx created
$ kubectl get svc nginx
NAME    TYPE           CLUSTER-IP     EXTERNAL-IP                          PORT(S)        AGE
nginx   LoadBalancer   172.20.54.12   abc123.us-east-1.elb.amazonaws.com   80:31456/TCP   45s
The LoadBalancer type automatically provisions an AWS Classic ELB. That part works exactly like you'd expect.
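To sanity-check from outside the cluster, you can pull the hostname out of the Service and curl it. Expect the ELB to take a minute or two to pass health checks before it starts answering:

```shell
# Fetch the ELB hostname kubectl reported and hit it over HTTP
ELB=$(kubectl get svc nginx \
  -o jsonpath='{.status.loadBalancer.ingress[0].hostname}')
curl -s -o /dev/null -w "%{http_code}\n" "http://${ELB}"
```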
Verdict so far
Honestly? The setup is more manual than I'd like, especially the worker node bootstrapping and the aws-auth ConfigMap dance. The control plane itself is solid — I've been up at 2am restarting API server pods in kops clusters, and it's not fun, so offloading that is genuinely valuable. I expect the tooling around worker nodes will get better. For now it's a preview-quality UX wrapped around a production-quality control plane. I'll take it.