Have you ever wondered how Kubernetes works on the inside? How much magic and how many unicorns are hidden in there? Let's try to build a small, not-production-ready "cluster" from scratch, so that we can learn about Kubernetes internals.
Before we start, we need some nodes to work on; we will use DigitalOcean here. Let's start with one node:
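A node can be created with the `doctl` CLI, for example. This is a hypothetical sketch: the droplet name, size slug, region, and SSH key placeholder are all illustrative, and any Ubuntu 16.04 host will do.

```shell
# Create a single Ubuntu 16.04 droplet (names, size and region are
# examples -- adjust to your account; requires doctl to be authenticated).
doctl compute droplet create kube-node-1 \
  --image ubuntu-16-04-x64 \
  --size 2gb \
  --region fra1 \
  --ssh-keys <your-ssh-key-fingerprint>
```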
KUBELET

This is the first and most important component in Kubernetes. The kubelet's responsibility is to spawn and kill pods and containers on its node. It communicates directly with the Docker daemon, so we need to install Docker first. On Ubuntu 16.04 the default version of Docker is 1.12.6.
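Installing the distribution package is enough here, since the article relies on the stock 1.12.6 build:

```shell
# Install Docker from the default Ubuntu 16.04 repositories.
apt-get update
apt-get install -y docker.io
docker version
```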
So let’s download Kubernetes binaries and run kubelet.
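The exact release is an assumption here; any 1.x server tarball from the Docker 1.12 era should behave similarly:

```shell
# Download and unpack the Kubernetes server binaries
# (v1.7.0 is an illustrative version choice).
wget https://dl.k8s.io/v1.7.0/kubernetes-server-linux-amd64.tar.gz
tar xzf kubernetes-server-linux-amd64.tar.gz
cd kubernetes/server/bin
```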
We run the kubelet with the --pod-manifest-path option. This is the directory that the kubelet will watch for pod manifest YAML files.
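A minimal standalone invocation might look like this (the manifest directory path is an example):

```shell
# Run the kubelet in standalone mode, watching a local manifest directory.
mkdir -p /etc/kubernetes/manifests
./kubelet --pod-manifest-path=/etc/kubernetes/manifests
```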
Let's put a simple nginx pod manifest file in that directory and see what happens.
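A minimal pod manifest for this purpose could look like the following:

```yaml
# nginx.yaml -- a minimal single-container pod manifest
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
  - name: nginx
    image: nginx
    ports:
    - containerPort: 80
```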
Now we can check docker ps to see that our container has been added and try to curl it:
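The container IDs and the pod IP are placeholders below; look them up in your own `docker ps` output:

```shell
# List the running containers -- note the extra "pause" container
# that appears next to nginx.
docker ps
# The pod IP lives in the pause container's network namespace:
docker inspect --format '{{ .NetworkSettings.IPAddress }}' <pause-container-id>
curl http://<pod-ip>
```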
b603d65d8bfd is the id of a pause container. This is an infrastructure container that Kubernetes creates first when creating a pod. Using the pause container, Kubernetes acquires the pod's IP and sets up its network namespace. All other containers in the pod share the same IP address and network interface. Even when all your other containers die, the pause container is the one that keeps the whole network namespace alive.
This is what our node looks like now:
KUBE API SERVER
Kubernetes uses etcd, a distributed key-value store with a strong consistency model, to store the state of the whole cluster. The API Server is the only component that talks to etcd directly; all other components (including the kubelet) have to communicate through the API Server. Let's try to run the API Server together with the kubelet.
First we need etcd:
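A single-node etcd is enough for this experiment; the release version below is an assumption:

```shell
# Download and start a single-node etcd.
wget https://github.com/coreos/etcd/releases/download/v3.1.10/etcd-v3.1.10-linux-amd64.tar.gz
tar xzf etcd-v3.1.10-linux-amd64.tar.gz
cd etcd-v3.1.10-linux-amd64
./etcd --listen-client-urls=http://0.0.0.0:2379 \
       --advertise-client-urls=http://localhost:2379
```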
And the API Server:
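The insecure port and service CIDR below are illustrative defaults from Kubernetes releases of this era:

```shell
# Start the API Server against the local etcd.
./kube-apiserver \
  --etcd-servers=http://localhost:2379 \
  --service-cluster-ip-range=10.0.0.0/24 \
  --insecure-bind-address=0.0.0.0
```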
Now we can connect kubelet to API Server and check if it was discovered by the cluster.
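In kubelet versions of this era the connection was made with the --api-servers flag (newer kubelets use a kubeconfig instead):

```shell
# Restart the kubelet pointed at the API Server.
./kubelet --api-servers=http://localhost:8080 \
          --pod-manifest-path=/etc/kubernetes/manifests
# From another shell, check that the node registered itself:
./kubectl get nodes
```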
We don't have any pods yet, so let's create one with kubectl create -f nginx.yaml using the previous manifest file.
Notice that the pod hangs in Pending status. But why? This is because we don't yet have the Kubernetes component responsible for choosing a node for a pod: the Scheduler. We will talk about it later, but for now we can just create nginx2 with an updated manifest that specifies which node should be used.
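Pinning a pod to a node is done with spec.nodeName, which bypasses the scheduler entirely. A sketch of such a manifest (replace the node name with the one reported by kubectl get nodes):

```yaml
# nginx2.yaml -- the same pod, but pinned to a specific node,
# so no scheduler is needed to place it.
apiVersion: v1
kind: Pod
metadata:
  name: nginx2
spec:
  nodeName: <your-node-name>
  containers:
  - name: nginx
    image: nginx
    ports:
    - containerPort: 80
```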
Great, now we can see that the API Server and the kubelet work together. This is what our node looks like now:
KUBE SCHEDULER

The Scheduler is responsible for assigning pods to nodes. It watches for pods and assigns an available node to each pod that doesn't have one.
We still have the nginx pod in Pending state from the previous example. Let's run the scheduler and see what happens.
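Like the other control-plane components, the scheduler only needs to know where the API Server is:

```shell
# Start the scheduler against the API Server's insecure port.
./kube-scheduler --master=http://localhost:8080
# In another shell, watch the pending pod get a node assigned:
./kubectl get pods -o wide
```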
As you can see, the scheduler kicks in, finds the pod, and assigns it to the node. You can see its placement in our node schema:
KUBE CONTROLLER MANAGER
The Controller Manager is responsible for managing (among others) Replication Controllers and Replica Sets, so without it we can't use Kubernetes Deployments.
Here we are going to run it and create a deployment.
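A sketch of both steps; note that in kubectl versions of this era, kubectl run created a Deployment:

```shell
# Start the Controller Manager so Deployments and ReplicaSets
# get reconciled.
./kube-controller-manager --master=http://localhost:8080
# In another shell, create a two-replica nginx deployment:
./kubectl run nginx --image=nginx --replicas=2
./kubectl get deployments,pods
```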
An updated version of our node schema:
KUBE PROXY

The Kubernetes (network) proxy is responsible for managing Kubernetes Services, and thus for internal load balancing and for exposing pods to other pods and to external clients.
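Starting it follows the same pattern as the other components:

```shell
# Start kube-proxy, which programs the node's iptables rules
# to implement Services.
./kube-proxy --master=http://localhost:8080
```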
The nginx deployment is now exposed externally on port 30073; we can check that with curl.
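One way to get there is to expose the deployment as a NodePort Service; the assigned node port (30073 in this walkthrough) can be read from kubectl get svc:

```shell
# Expose the deployment as a NodePort Service.
./kubectl expose deployment nginx --port=80 --type=NodePort
./kubectl get svc nginx
# Then hit the assigned node port from outside:
curl http://<node-public-ip>:30073
```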
Our updated node schema:
We have something that we can't really call a cluster yet, but we have learned how the Kubernetes components work together. What is really astonishing is how well designed and decoupled the Kubernetes parts are. After understanding each part's role in the system, Kubernetes should no longer be a mystery to us.
In the next blog post I will describe how to add more nodes to our cluster and load-balance ingress traffic between them, so stay tuned!