What is Kubernetes (k8s)?
- Kubernetes is like the control panel (flight instruments) of an aircraft.
- Aviators have lots of components to control and manage, so there is a control panel for them.
- Developers have lots of computers to manage, so there is Kubernetes for us!
Imagine you have lots of services to manage…
You have Kafka, Cassandra, MySQL, Redis, Elasticsearch, and many other services. For each service, you need to:
- Deploy a newer version to certain machines, and roll back if the deployment fails
- Restart the service when it fails
- Scale up under heavy load
Why should we use Kubernetes?
- self-healing: automatically restarts services when they fail
- automated rollouts and rollbacks
- built-in load balancing
- easier service discovery
- secret and configuration management
- it abstracts away the hardware infrastructure and exposes your whole cluster as a single enormous computational resource
- better resource allocation: you specify the CPU & memory for each service, and k8s allocates them for you automatically
- storage orchestration: you don't need to think about where to store your data
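The resource-allocation point above can be sketched as a per-container resource spec. This is an illustrative fragment, not a recommendation; the image name and the request/limit values are placeholders:

```yaml
# Illustrative Pod with CPU & memory requests/limits (values are placeholders).
apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  containers:
    - name: app
      image: my-app:1.0        # hypothetical image
      resources:
        requests:              # what the scheduler reserves on a node
          cpu: "250m"
          memory: "256Mi"
        limits:                # hard cap enforced at runtime
          cpu: "500m"
          memory: "512Mi"
```

Given the requests, the scheduler picks a node with enough free CPU and memory; the limits prevent one service from starving its neighbors.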
Who invented Kubernetes?
- Google: 15 years of experience building production workloads, with its internal systems Borg and Omega
Before we dive into details
Let’s have a very simple practice today.
Core concepts of Kubernetes
- Container: a technique to package our application and run it in an isolated environment. You can think of it as packing our application and running it in a virtualized computer.
- Node: a real computer or a virtual machine (e.g. EC2, GCE) that has computing resources (CPU, memory)
- Cluster: a bunch of real computers or VMs managed by Kubernetes
- Pod: the basic working unit of Kubernetes
- Container: a container instance created by containerization technology, e.g. Docker
A cluster can contain many nodes, a node can contain many pods, and a pod can contain many containers
Important resource types
- Pod
- Service (NodePort, LoadBalancer) & Ingress
- ReplicaSet
- Deployment
- Persistent Volume
Pod
- Purpose: the basic working unit for scaling/deployment
- each container in a pod shares the same host and port space
- each pod inside the cluster has its own unique IP address
- we can specify multiple containers to use in a single pod
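As a minimal sketch, a Pod manifest looks like this (the name and image are placeholders):

```yaml
# A minimal Pod manifest (illustrative; names and image are placeholders).
apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  containers:
    - name: web
      image: nginx:1.25       # any container image works here
      ports:
        - containerPort: 80   # port the container listens on
```

In practice you rarely create bare Pods by hand; higher-level resources like ReplicaSet and Deployment create them for you.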
When should we use multiple containers in a pod?
Sidecar pattern: log collecting
Ambassador/Adapter pattern: proxy / adaptor
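The sidecar pattern above can be sketched as a two-container Pod sharing a volume: the app writes logs to the shared directory and the sidecar reads them. All names, images, and paths here are placeholder assumptions:

```yaml
# Sidecar pattern sketch: app writes logs to a shared emptyDir volume,
# a second container collects them. Images and paths are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: app-with-log-sidecar
spec:
  volumes:
    - name: logs
      emptyDir: {}                 # shared scratch space, lives as long as the pod
  containers:
    - name: app
      image: my-app:1.0            # hypothetical application image
      volumeMounts:
        - name: logs
          mountPath: /var/log/app
    - name: log-collector
      image: busybox:1.36
      command: ["sh", "-c", "tail -F /var/log/app/app.log"]  # stand-in for a real log shipper
      volumeMounts:
        - name: logs
          mountPath: /var/log/app
```

Because both containers share the pod's volumes and network namespace, the sidecar needs no extra configuration to find the app's output.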
When should we NOT use multiple containers in a pod?
When we want to scale different applications inside a pod at different rates, we should separate them into different pods.
Service
- Purpose: make the cluster's internal and external networking more convenient and feasible
- Pod-to-Pod communication
- Pod to connect to an external service
- External client to connect to an internal pod
- Pods can fail / be deleted / be created at any time, so it is not possible to set up our applications by IP address manually
- By using a Service, we can refer to one or more pods by a selector
- When a Service maps to multiple pods, it can load-balance the requests among them
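As an illustrative sketch (names and ports are placeholders), a Service that selects pods by label might look like:

```yaml
# A Service routing traffic to all pods labeled app=my-app.
# Names and ports are placeholders.
apiVersion: v1
kind: Service
metadata:
  name: my-app-svc
spec:
  selector:
    app: my-app          # matches pods carrying this label
  ports:
    - port: 80           # port clients inside the cluster connect to
      targetPort: 8080   # port the containers actually listen on
  type: ClusterIP        # use NodePort or LoadBalancer to expose externally
```

Other pods can then reach the service by the stable DNS name `my-app-svc` instead of tracking individual pod IPs.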
ReplicaSet
- Purpose: to maintain a stable number of replica pods running
- High Availability (HA): to ensure the service is always available
- Load balancing
- For example: login service, API service, …
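A hedged sketch of a ReplicaSet keeping three replicas of a hypothetical login service alive (all names and the image are placeholders):

```yaml
# ReplicaSet sketch: keeps exactly 3 pods matching app=login running.
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: login-rs
spec:
  replicas: 3                    # desired number of pods
  selector:
    matchLabels:
      app: login                 # pods it owns are found by this label
  template:                      # pod template used to create replicas
    metadata:
      labels:
        app: login
    spec:
      containers:
        - name: login
          image: login-service:1.0   # hypothetical image
```

If a pod crashes or its node dies, the ReplicaSet controller notices the count dropped below 3 and creates a replacement.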
Deployment
- Purpose: to deploy our application more gracefully
- Use case: almost every application
- Before introducing Deployment, let's see what kinds of strategies we can use for deploying a new version of an application (e.g. recreate, rolling update, blue-green, canary).
In fact, Deployment works on top of ReplicaSet: each rollout creates a new ReplicaSet and scales it up while scaling the old one down.
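A sketch of a Deployment using the rolling-update strategy (names, image, and numbers are placeholder assumptions):

```yaml
# Deployment sketch: rolls out a new version one pod at a time.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-deploy
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1    # at most one pod may be down during the rollout
      maxSurge: 1          # at most one extra pod may exist during the rollout
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
        - name: api
          image: api-service:2.0   # hypothetical new version
```

Changing the image and re-applying the manifest triggers a rollout; if the new pods fail, the Deployment keeps the old ReplicaSet around so it can roll back.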
Persistent Volume
Pods can be deleted/recreated at any time, and by default all files inside a pod disappear with it. However, for some services, persistent storage is crucial (e.g. database services).
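As a sketch, a pod can request durable storage through a PersistentVolumeClaim and mount it; the claim, names, and size here are placeholders:

```yaml
# PersistentVolumeClaim sketch: request 10Gi of durable storage,
# then mount it into a database pod. Names and sizes are placeholders.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: db-data
spec:
  accessModes: ["ReadWriteOnce"]   # mountable by a single node at a time
  resources:
    requests:
      storage: 10Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: db
spec:
  containers:
    - name: mysql
      image: mysql:8.0
      volumeMounts:
        - name: data
          mountPath: /var/lib/mysql   # MySQL's data directory survives pod restarts
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: db-data
```

If the pod is deleted and recreated, the claim (and the data behind it) outlives it.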
Important resource types recap
- Pod: basic working unit
- Service (NodePort, LoadBalancer) & Ingress: for networking
- ReplicaSet: to keep a stable number of replica pods
- Deployment: to deploy application more gracefully
- Persistent Volume: to save data persistently
Tips for inspecting issues in a Kubernetes cluster while on duty
Common tasks for the on-duty person:
- ensure applications are running correctly
- scale services up/down if needed
How do we monitor whether an application is running properly?
- Pod: status, age
Monitoring services like Grafana & Prometheus
- worker: understand the source of jobs, usually a queue.
  - how many tasks are there in the queue?
  - what is the producing / consuming velocity?
  - what is the oldest task in the queue?
- server: requests per second / latency (response time)
- cronjob: logs