Kubernetes

How Kubernetes automates the deployment, scaling, and self-healing of containerized applications across a cluster of machines.

The Problem Docker Alone Cannot Solve

Docker is excellent for running containers on a single machine. But real production systems run across tens or hundreds of machines, need to recover automatically from crashes, scale up under load, and deploy new versions without downtime. Docker on its own does little to address these cluster-wide needs.

Kubernetes (often abbreviated K8s) fills this gap. It is a platform for running containers at scale: it decides which machine each container runs on, restarts containers that crash, and routes traffic between them.


The Core Idea: Desired State

Kubernetes operates on a simple principle: you tell it what you want ("three copies of my API running"), and it continuously works to make reality match that description. If a server crashes and takes two of those copies with it, Kubernetes notices and schedules two replacement pods on healthy machines. You do not intervene manually.

This is called declarative configuration, and it is the key mental shift when moving to Kubernetes.
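
The desired-state idea can be illustrated with a toy reconciliation loop. This is a minimal sketch in plain Python, not Kubernetes code: the Cluster class, fail_node, and reconcile are hypothetical names invented for this example, and the "least-loaded node" placement rule is a simplifying assumption rather than how the real scheduler works.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Cluster:
    """Toy model mapping pod name -> node name. Not a real Kubernetes API."""
    nodes: list[str]
    pods: dict[str, str] = field(default_factory=dict)
    _ids: count = field(default_factory=count, repr=False)

    def fail_node(self, node: str) -> None:
        """Simulate a machine crash: the node and every pod on it disappear."""
        self.nodes.remove(node)
        self.pods = {p: n for p, n in self.pods.items() if n != node}


def reconcile(cluster: Cluster, desired_replicas: int) -> None:
    """One pass of the control loop: compare actual state to desired, then act."""
    missing = desired_replicas - len(cluster.pods)
    for _ in range(max(missing, 0)):
        # Assumed placement rule: pick the node currently running the fewest pods.
        placed = list(cluster.pods.values())
        target = min(cluster.nodes, key=placed.count)
        cluster.pods[f"my-api-{next(cluster._ids)}"] = target
    # If there are more pods than desired, remove the surplus.
    for name in list(cluster.pods)[desired_replicas:]:
        del cluster.pods[name]


cluster = Cluster(nodes=["node-1", "node-2", "node-3"])
reconcile(cluster, desired_replicas=3)  # initial rollout: one pod per node
cluster.fail_node("node-1")             # the crash takes one replica with it
reconcile(cluster, desired_replicas=3)  # the loop restores the replica count
print(len(cluster.pods), sorted(set(cluster.pods.values())))
```

The caller only ever states the target (three replicas). The same reconcile call handles the first rollout and the recovery after a failure, which is the essence of the declarative model.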


Key Building Blocks

Figure: Kubernetes cluster structure. The control plane (API server, scheduler, controller manager, and etcd as the state store) manages three nodes. Node 1 runs the pod my-api (replica 1) and a worker-job pod, which is a different workload. Node 2 and Node 3 run my-api replicas 2 and 3. Each pod is a container plus shared network and storage. A Service sits in front of all pods, load-balancing traffic across replicas.

Pod: The smallest deployable unit in Kubernetes. A pod wraps one container (or sometimes a few tightly coupled ones) and gives them a shared IP address and shared storage volumes. You rarely create pods directly; a higher-level object such as a Deployment usually manages them for you.

Deployment: A blueprint that says "I want N replicas of this pod". Kubernetes creates the pods, spreads them across available nodes, and continuously reconciles if any are missing or unhealthy.

Service: A stable network endpoint that sits in front of a set of pods and load-balances traffic across them. Because pods are ephemeral and get new IP addresses when rescheduled, clients normally talk to a Service rather than to a pod directly.

Node: A physical or virtual machine in the cluster. The control plane schedules pods onto nodes based on available resources.

Control Plane: The cluster's brain. It watches the current state, compares it to the desired state, and issues instructions to make them match. You interact with it via kubectl commands or by applying YAML files.
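
To make these building blocks concrete, here is a sketch of what a Deployment and a Service for the my-api workload might look like. It is written as Python dictionaries serialized to JSON, which kubectl accepts as well as YAML. The field names follow the standard apps/v1 Deployment and v1 Service schemas; the image name, the label, and the port numbers are assumptions made up for illustration.

```python
import json

# Assumed label: it is what ties the Deployment, its pods, and the Service together.
APP_LABELS = {"app": "my-api"}

deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "my-api"},
    "spec": {
        "replicas": 3,  # the desired state: three copies of the pod
        "selector": {"matchLabels": APP_LABELS},
        "template": {  # the pod blueprint
            "metadata": {"labels": APP_LABELS},
            "spec": {
                "containers": [
                    {
                        "name": "my-api",
                        "image": "registry.example.com/my-api:1.0.0",  # hypothetical image
                        "ports": [{"containerPort": 8080}],  # assumed port
                    }
                ]
            },
        },
    },
}

service = {
    "apiVersion": "v1",
    "kind": "Service",
    "metadata": {"name": "my-api"},
    "spec": {
        "selector": APP_LABELS,  # routes to any pod carrying this label
        "ports": [{"port": 80, "targetPort": 8080}],
    },
}

for manifest in (deployment, service):
    print(json.dumps(manifest, indent=2))
```

Saving each printed object to its own file and passing it to kubectl apply -f would submit it to the control plane. Note that the Service never names individual pods: it selects them by label, which is why it keeps working as pods come and go.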


Self-Healing and Rolling Deployments

Two things Kubernetes handles automatically that teams used to do manually:

Self-healing: If a container crashes, the kubelet (an agent on each node) restarts it. If an entire node goes down, the control plane reschedules its pods onto surviving nodes. You do not get paged at 3am because a single pod crashed.

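Rolling deployments: When you update an image version, Kubernetes replaces pods gradually (by default a fraction of the replicas at a time), waiting for new pods to pass their readiness checks before removing old ones. With those checks configured, traffic never drops to zero. If the new version fails its checks, the rollout stalls instead of continuing: the remaining old pods keep serving traffic, and you can fix the image or roll back with kubectl rollout undo. Kubernetes does not roll back on its own.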
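
The rolling-update behavior can be sketched as a small simulation. This is a simplified model, not the Deployment controller's actual algorithm: it replaces exactly one pod per step, and is_healthy is a hypothetical callback standing in for a readiness check.

```python
from typing import Callable


def rolling_update(
    pods: list[str],
    new_version: str,
    is_healthy: Callable[[str], bool],
) -> tuple[list[str], bool]:
    """Replace pods one at a time; return (final pod versions, completed?).

    `pods` holds the version each replica is running. `is_healthy` is a
    hypothetical stand-in for a readiness check, not a Kubernetes API.
    """
    pods = list(pods)  # work on a copy so the caller's list is untouched
    for i, old_version in enumerate(pods):
        if old_version == new_version:
            continue  # this replica is already up to date
        # The replacement is checked before the old pod is removed,
        # so serving capacity never drops below len(pods).
        if not is_healthy(new_version):
            # The new pod never became ready: stop and keep the old pods serving.
            return pods, False
        pods[i] = new_version  # new pod is ready, so the old one can go
    return pods, True


current = ["v1", "v1", "v1"]
print(rolling_update(current, "v2", is_healthy=lambda version: True))
print(rolling_update(current, "v2-broken", is_healthy=lambda version: False))
```

In the failing case the function returns the original three v1 pods and False, mirroring a stalled rollout in which the old version continues to serve traffic.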


When You Do Not Need Kubernetes

Kubernetes is powerful but genuinely complex. For a single-service app or a small team, it often adds more operational overhead than it removes. Managed platforms like Railway, Render, or AWS App Runner give you container deployments with autoscaling without managing a cluster yourself.

Kubernetes becomes the right choice when you have many independent services, need fine-grained resource control, or are running at a scale where managed platforms become prohibitively expensive.

