Containerization

How containers package software with everything it needs to run, making deployments consistent across any machine.

The Deployment Problem

Imagine you write a service on your laptop using Node 20, a specific version of libssl, and environment variables set just right. It works perfectly. You ship it to a production server running Ubuntu 22.04, and it crashes on startup because the server has Node 18 and a different libssl version.

This is the classic "it works on my machine" problem. Before containers, teams solved it by writing lengthy setup scripts, maintaining identical server configurations by hand, or shipping entire virtual machines. All of these are slow, fragile, or expensive.

Containers solve it by bundling the application together with its entire environment into a single portable unit.

What a Container Actually Is

A container is a running process (or a small group of processes) that is isolated from the rest of the system. It has its own filesystem, its own network interface, and its own view of the process table. From inside the container, it looks like it is the only thing running on the machine.

Critically, containers are not virtual machines. A VM virtualizes an entire hardware layer and runs a complete guest operating system, kernel included, on top of it. A container shares the host machine's operating system kernel but is kept isolated using two Linux kernel features:

Namespaces: control what a process can see. A container gets its own namespaces for mount points (its view of the filesystem), network interfaces, process IDs, and users. So a process inside a container sees / as its root, has its own eth0, and runs as PID 1, even though from the host's perspective it is just process 4827.
cgroups (control groups): control what a process can use. The kernel enforces limits on how much CPU time, memory, and disk I/O the container's processes can consume.
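
Both features can be observed from an ordinary program. The sketch below is a minimal illustration, assuming a Linux host that exposes namespaces under /proc/self/ns and uses the cgroup v2 file formats for memory.max and cpu.max. The function names are invented for this example and are not part of any library.

```python
import os
from typing import Dict, Optional

NS_DIR = "/proc/self/ns"  # Linux-only; absent on other systems


def list_namespaces() -> Dict[str, str]:
    """Return the namespaces this process belongs to, keyed by type (pid, net, mnt, ...)."""
    if not os.path.isdir(NS_DIR):
        return {}  # not on Linux, or /proc is not mounted
    result = {}
    for name in sorted(os.listdir(NS_DIR)):
        try:
            # Each entry is a symlink whose target identifies the namespace.
            result[name] = os.readlink(os.path.join(NS_DIR, name))
        except OSError:
            continue  # entry not readable in this environment
    return result


def parse_memory_max(text: str) -> Optional[int]:
    """Parse a cgroup v2 memory.max value: a byte count, or 'max' for no limit."""
    value = text.strip()
    return None if value == "max" else int(value)


def parse_cpu_max(text: str) -> Optional[float]:
    """Parse a cgroup v2 cpu.max value ('<quota> <period>') into a number of CPUs."""
    quota, period = text.split()
    return None if quota == "max" else int(quota) / int(period)


if __name__ == "__main__":
    for name, ident in list_namespaces().items():
        print(name, ident)
    print(parse_memory_max("536870912"))  # a 512 MiB limit, expressed in bytes
    print(parse_cpu_max("50000 100000"))  # quota is half the period: half of one CPU
```

Two processes that print the same identifier for a namespace type share that namespace. A process inside a container normally prints different identifiers from a process on the host.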

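Because containers skip the guest OS entirely, they are much smaller (images are typically tens or hundreds of megabytes, where a VM disk image often runs to gigabytes) and usually start in well under a few seconds, where a VM can take tens of seconds to minutes to boot.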

Container Images

A container image is the blueprint: a read-only snapshot of the filesystem that a container starts from, plus metadata such as the default command to run. You define the image with a Dockerfile (in Docker's case), listing what base system to start from, what files to add, and what command to run.

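Images are built in layers. Each Dockerfile instruction that changes the filesystem (such as RUN, COPY, or ADD) adds a new layer on top of the previous ones; other instructions only record metadata. Layers are content-addressed and cached, so if only your app code changes, the build reruns from the first affected instruction onward and only those layers are rebuilt and transferred. The operating system and dependency layers beneath them are reused, which is why Dockerfiles usually copy application code as late as possible.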
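
The caching behavior can be sketched with a toy model. This is not Docker's actual algorithm. It assumes each layer's identity depends on its parent layer, the instruction text, and the content the instruction reads. The step lists and function names are hypothetical.

```python
import hashlib
from typing import List, Tuple

Step = Tuple[str, str]  # (instruction, content the instruction depends on)


def layer_ids(steps: List[Step]) -> List[str]:
    """Give each step an ID derived from its parent ID, instruction, and inputs."""
    ids = []
    parent = ""
    for instruction, content in steps:
        digest = hashlib.sha256(f"{parent}\n{instruction}\n{content}".encode()).hexdigest()
        ids.append(digest[:12])
        parent = digest  # the next layer's ID depends on this one
    return ids


def reused_layers(old: List[str], new: List[str]) -> int:
    """Count the leading layers that are identical and can come from the cache."""
    count = 0
    for a, b in zip(old, new):
        if a != b:
            break
        count += 1
    return count


build_1 = [
    ("FROM node:20", "base image"),
    ("COPY package.json .", '{"dependencies": {}}'),
    ("RUN npm install", ""),
    ("COPY src/ src/", "console.log('v1')"),
]
# Second build: only the application source changed.
build_2 = build_1[:3] + [("COPY src/ src/", "console.log('v2')")]

ids_1, ids_2 = layer_ids(build_1), layer_ids(build_2)
print(reused_layers(ids_1, ids_2))  # 3: base, package.json, and npm install are reused
```

In this model, changing package.json instead would leave only the base layer reusable, because every later layer inherits the changed parent ID even if its own instruction is untouched.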

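When you run an image, the container runtime adds a thin writable layer on top. Any changes the container makes (log files, temp files) go into that layer, not into the image. The layer survives a stop and restart but is discarded when the container is removed, so data that must outlive the container belongs in a volume or other external storage.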
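
A rough way to picture the writable layer is a stack of dictionaries in which lookups fall through to lower layers and writes always land on top. The sketch below uses Python's collections.ChainMap as a stand-in for a real union filesystem. The paths and file contents are made up for illustration.

```python
from collections import ChainMap

# Read-only image layers, modeled as path -> content mappings.
base_layer = {"/etc/os-release": "Ubuntu 22.04", "/usr/bin/node": "node v20 binary"}
app_layer = {"/app/server.js": "console.log('hello')"}


def start_container() -> ChainMap:
    """Stack a fresh, empty writable layer on top of the shared image layers."""
    return ChainMap({}, app_layer, base_layer)


container_a = start_container()
container_b = start_container()

# Reads fall through to the image layers; writes land in the top layer only.
container_a["/tmp/app.log"] = "request handled"
container_a["/app/server.js"] = "patched inside container A"

print(container_a["/app/server.js"])  # patched inside container A
print(container_b["/app/server.js"])  # console.log('hello')
print("/tmp/app.log" in container_b)  # False
print(app_layer["/app/server.js"])    # console.log('hello')
```

Both containers share the same image layers, yet neither can alter them. Dropping container_a's top dictionary corresponds to removing the container: its changes disappear and the image is untouched.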

Why It Matters

Consistency: The same image runs the same way on your laptop, a CI server, and a production cluster, provided the hosts have a compatible kernel and CPU architecture. The environment is part of the artifact.
Fast deployments: Starting a container is nearly instant. You can start dozens of them in the time a VM would finish booting.
Isolation: By default, containers cannot see or modify each other's processes or files. Dependencies for one service do not conflict with another's.
Scalability: Because containers are lightweight and fast to start, orchestrators like Kubernetes can spin up or kill them automatically in response to traffic.
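
To make the scalability point concrete, here is a minimal sketch of the kind of decision an orchestrator makes. It follows the proportional rule documented for Kubernetes' Horizontal Pod Autoscaler (desired replicas = ceil(current replicas × current metric / target metric)), but it is a simplification. The function name, the load units, and the default bounds are assumptions for this example.

```python
import math


def desired_replicas(current_replicas: int, current_load: float, target_load: float,
                     min_replicas: int = 1, max_replicas: int = 10) -> int:
    """Scale the replica count in proportion to load, clamped to a range."""
    if target_load <= 0:
        raise ValueError("target_load must be positive")
    wanted = math.ceil(current_replicas * current_load / target_load)
    return max(min_replicas, min(max_replicas, wanted))


# Four containers each at 90% CPU with a 60% target: scale out.
print(desired_replicas(4, current_load=90.0, target_load=60.0))  # 6
# Four containers each at 15% CPU with a 60% target: scale in.
print(desired_replicas(4, current_load=15.0, target_load=60.0))  # 1
```

A rule like this is only practical because adding or removing a container takes seconds at most. With slower-starting VMs, the traffic spike might be over before the new capacity is ready.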
