One place for hosting & domains

      A Beginner's Guide to Kubernetes


      Updated by Linode Contributed by Linode

      Kubernetes, often referred to as k8s, is an open source container orchestration system that helps deploy and manage containerized applications. Developed by Google starting in 2014 and written in the Go language, Kubernetes is quickly becoming the standard way to architect horizontally-scalable applications. This guide will explain the major parts and concepts of Kubernetes.

      Containers

      Kubernetes is a container orchestration tool and, therefore, needs a container runtime installed to work. In practice, the default container runtime for Kubernetes is Docker, though other runtimes like rkt, and LXD will also work. With the advent of the Container Runtime Interface (CRI), which hopes to standardize the way Kubernetes interacts with containers, other options like containerd, cri-o, and Frakti have also become available. This guide assumes you have a working knowledge of containers and the examples will all use Docker as the container runtime.

      Kubernetes API

      Kubernetes is built around a robust RESTful API. Every action taken in Kubernetes, be it inter-component communication or user command, interacts in some fashion with the Kubernetes API. The goal of the API is to help facilitate the desired state of the Kubernetes cluster. If you want X instances of your application running and have Y currently active, the API will take the required steps to get to X, whether this means creating, or destroying resources. To create this desired state, you create objects, which are normally represented by YAML files called manifests, and apply them through the command line with the kubectl tool.

      kubectl

      kubectl is a command line tool used to interact with the Kubernetes cluster. It offers a host of features, including the ability to create, stop, and delete resources, describe active resources, and auto scale resources. For more information on the types of commands and resources you can use with kubectl, consult the Kubernetes kubectl documentation.

      Kubernetes Master, Nodes, and Control Plane

      At the highest level of Kubernetes, there exist two kinds of servers, a Master and a Node. These servers can be Linodes, VMs, or physical servers. Together, these servers form a cluster.

      Nodes

      Kubernetes Nodes are worker servers that run your application. The number of Nodes is determined by the user, and they are created by the user. In addition to running your application, each Node runs two processes:

      • kubelet receives descriptions of the desired state of a Pod from the API server, and ensures the Pod is healthy, and running on the Node.
      • kube-proxy is a networking proxy that proxies the UDP, TCP, and SCTP networking of each Node, and provides load balancing. This is only used to connect to Services.

      Kubernetes Master

      The Kubernetes Master is normally a separate server responsible for maintaining the desired state of the cluster. It does this by telling the Nodes how many instances of your application it should run and where. The Kubernetes Master runs three processes:

      • kube-apiserver is the front end for the Kubernetes API server.
      • kube-controller-manager is a daemon that manages the Kubernetes control loop. For more on Controllers, see the Controllers section.
      • kube-scheduler is a function that looks for newly created Pods that have no Nodes, and assigns them a Node based on a host of requirements. For more information on kube-scheduler, consult the Kubernetes kube-scheduler documentation.

      Additionally, the Kubernetes Master runs the database etcd. Etcd is a highly available key-value store that provides the backend database for Kubernetes.

      Together, kube-apiserver, kube-controller-manager, kube-scheduler, and etcd form what is known as the control plane. The control plane is responsible for making decisions about the cluster, and pushing it toward the desired state.

      Kubernetes Objects

      In Kubernetes, there are a number of objects that are abstractions of your Kubernetes system’s desired state. These objects represent your application, its networking, and disk resources – all of which together form your application.

      Pods

      In Kubernetes, all containers exist within Pods. Pods are the smallest unit of the Kubernetes architecture, and can be viewed as a kind of wrapper for your container. Each Pod is given its own IP address with which it can interact with other Pods within the cluster.

      Usually, a Pod contains only one container, but a Pod can contain multiple containers if those containers need to share resources. If there is more than one container in a Pod, these containers can communicate with one another via localhost.

      Pods in Kubernetes are “mortal,” which means that they are created, and destroyed depending on the needs of the application. For instance, you might have a web app backend that sees a spike in CPU usage. This might cause the cluster to scale up the amount of backend Pods from two to ten, in which case eight new Pods would be created. Once the traffic subsides, the Pods might scale back to two, in which case eight pods would be destroyed.

      It is important to note that Pods are destroyed without respect to which Pod was created first. And, while each Pod has its own IP address, this IP address will only be available for the life-cycle of the Pod.

      Below is an example of a Pod manifest:

      my-apache-pod.yaml
       1
       2
       3
       4
       5
       6
       7
       8
       9
      10
      
      apiVersion: v1
      kind: Pod
      metadata:
       name: apache-pod
       labels:
         app: web
      spec:
        containers:
        - name: apache-container
          image: httpd

      Each manifest has four necessary parts:

      • The version of the API in use
      • The kind of resource you’d like to define
      • Metadata about the resource
      • Though not required by all objects, a spec which describes the desired behavior of the resource is necessary for most objects and controllers.

      In the case of this example, the API in use is v1, and the kind is a Pod. The metadata field is used for applying a name, labels, and annotations. Names are used to differentiate resources, while labels are used to group like resources. Labels will come into play more when defining Services and Deployments. Annotations are for attaching arbitrary data to the resource.

      The spec is where the desired state of the resource is defined. In this case, a Pod with a single Apache container is desired, so the containers field is supplied with a name, ‘apache-container’, and an image, the latest version of Apache. The image is pulled from Docker Hub, as that is the default container registry for Kubernetes.

      For more information on the type of fields you can supply in a Pod manifest, refer to the Kubernetes Pod API documentation.

      Now that you have the manifest, you can create the Pod using the create command:

      kubectl create -f my-apache-pod.yaml
      

      To view a list of your pods, use the get pods command:

      kubectl get pods
      

      You should see output like the following:

      NAME         READY   STATUS    RESTARTS   AGE
      apache-pod   1/1     Running   0          16s
      

      To quickly view which Node the Pod exists on, issue the get pods command with the -o=wide flag:

      kubectl get pods -o=wide
      

      To retrieve information about the Pod, issue the describe command:

      kubcetl describe pod apache-pod
      

      You should see output like the following:

      ...
      Events:
      Type    Reason     Age    From                       Message
      ----    ------     ----   ----                       -------
      Normal  Scheduled  2m38s  default-scheduler          Successfully assigned default/apache-pod to mycluster-node-1
      Normal  Pulling    2m36s  kubelet, mycluster-node-1  pulling image "httpd"
      Normal  Pulled     2m23s  kubelet, mycluster-node-1  Successfully pulled image "httpd"
      Normal  Created    2m22s  kubelet, mycluster-node-1  Created container
      Normal  Started    2m22s  kubelet, mycluster-node-1  Started container
      

      To delete the Pod, issue the delete command:

      kubectl delete pod apache-pod
      

      Services

      Services group identical Pods together to provide a consistent means of accessing them. For instance, you might have three Pods that are all serving a website, and all of those Pods need to be accessible on port 80. A Service can ensure that all of the Pods are accessible at that port, and can load balance traffic between those Pods. Additionally, a Service can allow your application to be accessible from the internet. Each Service is given an IP address and a corresponding local DNS entry. Additionally, Services exist across Nodes. If you have two replica Pods on one Node and an additional replica Pod on another Node, the service can include all three Pods. There are four types of Service:

      • ClusterIP: Exposes the Service internally to the cluster. This is the default setting for a Service.
      • NodePort: Exposes the Service to the internet from the IP address of the Node at the specified port number. You can only use ports in the 30000-32767 range.
      • LoadBalancer: This will create a load balancer assigned to a fixed IP address in the cloud, so long as the cloud provider supports it. In the case of Linode, this is the responsibility of the Linode Cloud Controller Manager, which will create a NodeBalancer for the cluster. This is the best way to expose your cluster to the internet.
      • ExternalName: Maps the service to a DNS name by returning a CNAME record redirect. ExternalName is good for directing traffic to outside resources, such as a database that is hosted on another cloud.

      Below is an example of a Service manifest:

      my-apache-service.yaml
       1
       2
       3
       4
       5
       6
       7
       8
       9
      10
      11
      12
      13
      14
      
      apiVersion: v1
      kind: Service
      metadata:
        name: apache-service
        labels:
          app: web
      spec:
        type: NodePort
        ports:
        - port: 80
          targetPort: 80
          nodePort: 30020
        selector:
          app: web

      The above example Service uses the v1 API, and its kind is Service. Like the Pod example in the previous section, this manifest has a name and a label. Unlike the Pod example, this spec uses the ports field to define the exposed port on the container (port), and the target port on the Pod (targetPort). The type NodePort unlocks the use of nodePort field, which allows traffic on the host Node at that port. Lastly, the selector field is used to target only the Pods that have been assigned the app: web label.

      For more information on Services, visit the Kubernetes Service API documentation.

      To create the Service from the YAML file, issue the create command:

      kubectl create -f my-apache-service.yaml
      

      To view a list of running services, issue the get services command:

      kubectl get services
      

      You should see output like the following:

      NAME             TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)        AGE
      apache-service   NodePort    10.99.57.13   <none>        80:30020/TCP   54s
      kubernetes       ClusterIP   10.96.0.1     <none>        443/TCP        46h
      

      To retrieve more information about your Service, issue the describe command:

      kubectl describe service apache-service
      

      To delete the Service, issue the delete command:

      kubcetl delete service apache-service
      

      Volumes

      A Volume in Kubernetes is a way to share file storage between containers in a Pod. Kubernetes Volumes differ from Docker volumes because they exist inside the Pod rather than inside the container. When a container is restarted the Volume persists. Note, however, that these Volumes are still tied to the lifecycle of the Pod, so if the Pod is destroyed the Volume will be destroyed with it.

      Linode also offers a Container Storage Interface (CSI) driver that allows the cluster to persist data on a Block Storage volume.

      Below is an example of how to create and use a Volume by creating a Pod manifest:

      my-apache-pod-with-volume.yaml
       1
       2
       3
       4
       5
       6
       7
       8
       9
      10
      11
      12
      13
      14
      15
      
      apiVersion: v1
      kind: Pod
      metadata:
        name: apache-with-volume
      spec:
        volumes:
        - name: apache-storage-volume
          emptyDir: {}
      
        containers:
        - name: apache-container
          image: httpd
          volumeMounts:
          - name: apache-storage-volume
            mountPath: /data/apache-data

      A Volume has two unique aspects to its definition. In this example, the first aspect is the volumes block that defines the type of Volume you want to create, which in this case is a simple empty directory (emptyDir). The second aspect is the volumeMounts field within the container’s spec. This field is given the name of the Volume you are creating and a mount path within the container.

      There are a number of different Volume types you could create in addition to emptyDir depending on your cloud host. For more information on Volume types, visit the Kubernetes Volumes API documentation.

      Namespaces

      Namespaces are virtual clusters that exist within the Kubernetes cluster that help to group and organize objects. Every cluster has at least three namespaces: default, kube-system, and kube-public. When interacting with the cluster it is important to know which Namespace the object you are looking for is in, as many commands will default to only showing you what exists in the default namespace. Resources created without an explicit namespace will be added to the default namespace.

      Namespaces consist of alphanumeric characters, dashes (-), and periods (.).

      Here is an example of how to define a Namespace with a manifest:

      my-namespace.yaml
      1
      2
      3
      4
      
      apiVersion: v1
      kind: Namespace
      metadata:
        name: my-app

      To create the Namespace, issue the create command:

      kubcetl create -f my-namespace.yaml
      

      Below is an example of a Pod with a Namespace:

      my-apache-pod-with-namespace.yaml
       1
       2
       3
       4
       5
       6
       7
       8
       9
      10
      11
      
      apiVersion: v1
      kind: Pod
      metadata:
        name: apache-pod
        labels:
          app: web
        namespace: my-app
      spec:
        containers:
        - name: apache-container
          image: httpd

      To retrieve resources in a certain Namespace, use the -n flag.

      kubectl get pods -n my-app
      

      You should see a list of Pods within your namespace:

      NAME         READY   STATUS    RESTARTS   AGE
      apache-pod   1/1     Running   0          7s
      

      To view Pods in all Namespaces, use the --all-namespaces flag.

      kubectl get pods --all-namespaces
      

      To delete a Namespace, issue the delete namespace command. Note that this will delete all resources within that Namespace:

      kubectl delete namespace my-app
      

      For more information on Namespaces, visit the Kubernetes Namespaces API documentation

      Controllers

      A Controller is a control loop that continuously watches the Kubernetes API and tries to manage the desired state of certain aspects of the cluster. There are a number of controllers. Below is a short reference of the most popular controllers you might interact with.

      ReplicaSets

      As has been mentioned, Kubernetes allows an application to scale horizontally. A ReplicaSet is one of the controllers responsible for keeping a given number of replica Pods running. If one Pod goes down in a ReplicaSet, another will be created to replace it. In this way, Kubernetes is self-healing. However, for most use cases it is recommended to use a Deployment instead of a ReplicaSet.

      Below is an example of a ReplicaSet:

      my-apache-replicaset.yaml
       1
       2
       3
       4
       5
       6
       7
       8
       9
      10
      11
      12
      13
      14
      15
      16
      17
      18
      19
      
      apiVersion: apps/v1
      kind: ReplicaSet
      metadata:
        name: apache-replicaset
        labels:
          app: web
      spec:
        replicas: 5
        selector:
          matchLabels:
            app: web
        template:
          metadata:
            labels:
              app: web
          spec:
            containers:
            - name: apache-container
              image: httpd

      There are three main things to note in this ReplicaSet. The first is the apiVersion, which is apps/v1. This differs from the previous examples, which were all apiVersion: v1, because ReplicaSets do not exist in the v1 core. They instead reside in the apps group of v1. The second and third things to note are the replicas field and the selector field. The replicas field defines how many replica Pods you want to be running at any given time. The selector field defines which Pods, matched by their label, will be controlled by the ReplicaSet.

      To view your ReplicaSets, issue the get replicasets command:

      kubectl get replicasets
      

      You should see output like the following:

      NAME                DESIRED   CURRENT   READY   AGE
      apache-replicaset   5         5         0       5s
      

      This output shows that of the five desired replicas, there are 5 currently active, but zero of those replicas are available. This is because the Pods are still booting up. If you issue the command again, you will see that all five have become ready:

      NAME                DESIRED   CURRENT   READY   AGE
      apache-replicaset   5         5         5       86s
      

      You can view the Pods the ReplicaSet created by issuing the get pods command:

      NAME                      READY   STATUS    RESTARTS   AGE
      apache-replicaset-5rsx2   1/1     Running   0          31s
      apache-replicaset-8n52c   1/1     Running   0          31s
      apache-replicaset-jcgn8   1/1     Running   0          31s
      apache-replicaset-sj422   1/1     Running   0          31s
      apache-replicaset-z8g76   1/1     Running   0          31s
      

      To delete a ReplicaSet, issue the delete replicaset command:

      kubectl delete replicaset apache-replicaset
      

      If you issue the get pods command, you will see that the Pods the ReplicaSet created are in the process of terminating:

      NAME                      READY   STATUS        RESTARTS   AGE
      

      apache-replicaset-bm2pn 0/1 Terminating 0 3m54s

      In the above example, four of the Pods have already terminated, and one is in the process of terminating.

      For more information on ReplicaSets, view the Kubernetes ReplicaSets API documentation.

      Deployments

      A Deployment can manage a ReplicaSet, so it shares the ability to keep a defined number of replica pods up and running. A Deployment can also update those Pods to resemble the desired state by means of rolling updates. For example, if you wanted to update a container image to a newer version, you would create a Deployment, and the controller would update the container images one by one until the desired state is achieved. This ensures that there is no downtime when updating or altering your Pods.

      Below is an example of a Deployment:

      my-apache-deployment.yaml
       1
       2
       3
       4
       5
       6
       7
       8
       9
      10
      11
      12
      13
      14
      15
      16
      17
      18
      19
      
      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: apache-deployment
        labels:
          app: web
      spec:
        replicas: 5
        selector:
          matchLabels:
            app: web
        template:
          metadata:
            labels:
              app: web
          spec:
            containers:
            - name: apache-container
              image: httpd:2.4.35

      The only noticeable difference between this Deployment and the example given in the ReplicaSet section is the kind. In this example we have chosen to initially install Apache 2.4.35. If you wanted to update that image to Apache 2.4.38, you would issue the following command:

      kubectl --record deployment.apps/apache-deployment set image deployment.v1.apps/apache-deployment apache-container=httpd:2.4.38
      

      You’ll see a confirmation that the images have been updated:

      deployment.apps/apache-deployment image updated
      

      To see for yourself that the images have updated, you can grab the Pod name from the get pods list:

      kubectl get pods
      
      NAME                                 READY   STATUS    RESTARTS   AGE
      apache-deployment-574c8c4874-8zwgl   1/1     Running   0          8m36s
      apache-deployment-574c8c4874-9pr5j   1/1     Running   0          8m36s
      apache-deployment-574c8c4874-fbs46   1/1     Running   0          8m34s
      apache-deployment-574c8c4874-nn7dl   1/1     Running   0          8m36s
      apache-deployment-574c8c4874-pndgp   1/1     Running   0          8m33s
      

      Issue the describe command to view all of the available details of the Pod:

      kubectl describe pod apache-deployment-574c8c4874-pndgp
      

      You’ll see a long list of details, of which the container image is included:

      ....
      
      Containers:
        apache-container:
          Container ID:   docker://d7a65e7993ab5bae284f07f59c3ed422222100833b2769ff8ee14f9f384b7b94
          Image:          httpd:2.4.38
      
      ....
      

      For more information on Deployments, visit the Kubernetes Deployments API documentation

      Jobs

      A Job is a controller that manages a Pod that is created for a single, or set, of tasks. This is handy if you need to create a Pod that performs a single function, or calculates a value. The deletion of the Job will delete the Pod.

      Below is an example of a Job that simply prints “Hello World!” and ends:

      my-job.yaml
       1
       2
       3
       4
       5
       6
       7
       8
       9
      10
      11
      12
      13
      14
      15
      16
      17
      
      apiVersion: batch/v1
      kind: Job
      metadata:
        name: hello-world
      spec:
        template:
          metadata:
            name: hello-world
          spec:
            containers:
            - name: output
              image: debian
              command:
               - "bin/bash"
               - "-c"
               - "echo 'Hello World!'"
            restartPolicy: Never

      To create the Job, issue the create command:

      kubectl create -f my-job.yaml
      

      To see if the job has run, or is running, issue the get jobs command:

      kubectl get jobs
      

      You should see output like the following:

      NAME          COMPLETIONS   DURATION   AGE
      hello-world   1/1           9s         8m23s
      

      To get the Pod of the Job, issue the get pods command:

      kubectl get pods
      

      You should see an output like the following:

      NAME                               READY   STATUS             RESTARTS   AGE
      hello-world-4jzdm                  0/1     Completed          0          9m44s
      

      You can use the name of the Pod to inspect its output by consulting the log file for the Pod:

      kubectl get logs hello-world-4jzdm
      

      To delete the Job, and its Pod, issue the delete command:

      kubectl delete job hello-world
      

      Networking

      Networking in Kubernetes was designed to make it simple to port existing apps from VMs to containers, and subsequently, Pods. The basic requirements of the Kubernetes networking model are:

      1. Pods can communicate with each other across Nodes without the use of NAT
      2. Agents on a Node, like kubelet, can communicate with all of a Node’s Pods
      3. In the case of Linux, Pods in a Node’s host network can communicate to all other Pods without NAT.

      Though the rules of the Kubernetes networking model are simple, the implementation of those rules is an advanced topic. Because Kubernetes does not come with its own implementation, it is up to the user to provide a networking model.

      Two of the most popular options are Flannel and Calico. Flannel is a networking overlay that meets the functionality of the Kubernetes networking model by supplying a layer 3 network fabric, and is relatively easy to set up. Calico enables networking, and networking policy through the NetworkPolicy API to provide simple virtual networking.

      For more information on the Kubernetes networking model, and ways to implement it, consult the cluster networking documentation.

      Advanced Topics

      There are a number of advanced topics in Kubernetes. Below are a few you might find useful as you progress in Kubernetes:

      • StatefulSets can be used when creating stateful applications.
      • DaemonSets can be used to ensure each Node is running a certain Pod. This is useful for log collection, monitoring, and cluster storage.
      • Horizontal Pod Autoscaling can automatically scale your deployments based on CPU usage.
      • CronJobs can schedule Jobs to run at certain times.
      • ResourceQuotas are helpful when working with larger groups where there is a concern that some teams might take up too many resources.

      Next Steps

      Now that you are familiar with Kubernetes concepts and components, you can follow the Getting Started with Kubernetes: Use kubeadm to Deploy a Cluster on Linode guide. This guide provides a hands-on activity to continue learning about Kubernetes. If you would like to deploy a Kubernetes cluster on Linode for production use, we recommend using one of the following methods, instead:

      More Information

      You may wish to consult the following resources for additional information on this topic. While these are provided in the hope that they will be useful, please note that we cannot vouch for the accuracy or timeliness of externally hosted materials.

      Find answers, ask questions, and help others.

      This guide is published under a CC BY-ND 4.0 license.



      Source link


      Leave a Comment