How To Set Up a Ceph Cluster within Kubernetes Using Rook


      The author selected the Mozilla Foundation to receive a donation as part of the Write for DOnations program.

      Introduction

      Kubernetes containers are stateless as a core principle, but data must still be managed, preserved, and made accessible to other services. Stateless means that the container is running in isolation without any knowledge of past transactions, which makes it easy to replace, delete, or distribute the container. However, it also means that data will be lost for certain lifecycle events like restart or deletion.

      Rook is a storage orchestration tool that provides a cloud-native, open source solution for a diverse set of storage providers. Rook uses the power of Kubernetes to turn a storage system into self-managing services that provide a seamless experience for saving Kubernetes application or deployment data.

Ceph is a highly scalable distributed-storage solution offering object, block, and file storage. Ceph clusters are designed to run on any hardware using the CRUSH algorithm (Controlled Replication Under Scalable Hashing) to distribute data across the cluster.

      One main benefit of this deployment is that you get the highly scalable storage solution of Ceph without having to configure it manually using the Ceph command line, because Rook automatically handles it. Kubernetes applications can then mount block devices and filesystems from Rook to preserve and monitor their application data.

      In this tutorial, you will set up a Ceph cluster using Rook and use it to persist data for a MongoDB database as an example.

      Prerequisites

      Before you begin this guide, you’ll need the following:

      • A DigitalOcean Kubernetes cluster with at least three nodes that each have 2 vCPUs and 4 GB of Memory. To create a cluster on DigitalOcean and connect to it, see the Kubernetes Quickstart.
      • The kubectl command-line tool installed on a development server and configured to connect to your cluster. You can read more about installing kubectl in its official documentation.
      • A DigitalOcean block storage Volume with at least 100 GB for each node of the cluster you just created—for example, if you have three nodes you will need three Volumes. Select Manually Format rather than automatic and then attach your Volume to the Droplets in your node pool. You can follow the Volumes Quickstart to achieve this.

      Step 1 — Setting up Rook

After completing the prerequisites, you have a fully functional Kubernetes cluster with three nodes and three Volumes—you’re now ready to set up Rook.

      In this section, you will clone the Rook repository, deploy your first Rook operator on your Kubernetes cluster, and validate the given deployment status. A Rook operator is a container that automatically bootstraps the storage clusters and monitors the storage daemons to ensure the storage clusters are healthy.

      First, you will clone the Rook repository, so you have all the resources needed to start setting up your Rook cluster:

      • git clone --single-branch --branch release-1.3 https://github.com/rook/rook.git

This command will clone the Rook repository from GitHub and create a folder named rook in your current directory. Now enter the directory using the following command:

      • cd rook/cluster/examples/kubernetes/ceph

Next, you will create the common resources needed for your Rook deployment, which you can do by applying the Kubernetes config file that ships in this directory:

      • kubectl create -f common.yaml

      The resources you’ve created are mainly CustomResourceDefinitions (CRDs) and define new resources that the operator will later use. They contain resources like the ServiceAccount, Role, RoleBinding, ClusterRole, and ClusterRoleBinding.
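
If you want to spot-check that these resources were registered, you can list the new CRDs (the exact set varies by Rook version):

      • kubectl get crds | grep ceph.rook.io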

      Note: This standard file assumes that you will deploy the Rook operator and all Ceph daemons in the same namespace. If you want to deploy the operator in a separate namespace, see the comments throughout the common.yaml file.

      After the common resources are created, the next step is to create the Rook operator.

Before deploying the operator.yaml file, you will need to change the CSI_RBD_GRPC_METRICS_PORT variable, because your DigitalOcean Kubernetes cluster already uses the standard port by default. Open the file with the following command:

      • nano operator.yaml

      Then search for the CSI_RBD_GRPC_METRICS_PORT variable, uncomment it by removing the #, and change the value from port 9090 to 9093:

      operator.yaml

      kind: ConfigMap
      apiVersion: v1
      metadata:
        name: rook-ceph-operator-config
        namespace: rook-ceph
      data:
        ROOK_CSI_ENABLE_CEPHFS: "true"
        ROOK_CSI_ENABLE_RBD: "true"
        ROOK_CSI_ENABLE_GRPC_METRICS: "true"
        CSI_ENABLE_SNAPSHOTTER: "true"
        CSI_FORCE_CEPHFS_KERNEL_CLIENT: "true"
        ROOK_CSI_ALLOW_UNSUPPORTED_VERSION: "false"
        # Configure CSI CSI Ceph FS grpc and liveness metrics port
        # CSI_CEPHFS_GRPC_METRICS_PORT: "9091"
        # CSI_CEPHFS_LIVENESS_METRICS_PORT: "9081"
        # Configure CSI RBD grpc and liveness metrics port
        CSI_RBD_GRPC_METRICS_PORT: "9093"
        # CSI_RBD_LIVENESS_METRICS_PORT: "9080"
      

      Once you’re done, save and exit the file.

      Next, you can deploy the operator using the following command:

      • kubectl create -f operator.yaml

      The command will output the following:

      Output

configmap/rook-ceph-operator-config created
      deployment.apps/rook-ceph-operator created

Again, you’re using the kubectl create command with the -f flag to point at the file you want to apply. It will take a few seconds for the operator to start running. You can verify the status using the following command:

      • kubectl get pod -n rook-ceph

      You use the -n flag to get the pods of a specific Kubernetes namespace (rook-ceph in this example).

Once the operator deployment is ready, it will trigger the creation of the DaemonSets that are in charge of creating the rook-discovery agents on each worker node of your cluster. You’ll receive output similar to:

      Output

NAME                                  READY   STATUS    RESTARTS   AGE
      rook-ceph-operator-599765ff49-fhbz9   1/1     Running   0          92s
      rook-discover-6fhlb                   1/1     Running   0          55s
      rook-discover-97kmz                   1/1     Running   0          55s
      rook-discover-z5k2z                   1/1     Running   0          55s

      You have successfully installed Rook and deployed your first operator. Next, you will create a Ceph cluster and verify that it is working.

      Step 2 — Creating a Ceph Cluster

      Now that you have successfully set up Rook on your Kubernetes cluster, you’ll continue by creating a Ceph cluster within the Kubernetes cluster and verifying its functionality.

      First let’s review the most important Ceph components and their functionality:

      • Ceph Monitors, also known as MONs, are responsible for maintaining the maps of the cluster required for the Ceph daemons to coordinate with each other. There should always be more than one MON running to increase the reliability and availability of your storage service.

      • Ceph Managers, also known as MGRs, are runtime daemons responsible for keeping track of runtime metrics and the current state of your Ceph cluster. They run alongside your monitoring daemons (MONs) to provide additional monitoring and an interface to external monitoring and management systems.

      • Ceph Object Store Devices, also known as OSDs, are responsible for storing objects on a local file system and providing access to them over the network. These are usually tied to one physical disk of your cluster. Ceph clients interact with OSDs directly.

      To interact with the data of your Ceph storage, a client will first make contact with the Ceph Monitors (MONs) to obtain the current version of the cluster map. The cluster map contains the data storage location as well as the cluster topology. The Ceph clients then use the cluster map to decide which OSD they need to interact with.

Rook enables Ceph storage to run on your Kubernetes cluster. All of these components run in your Rook cluster and interact directly with the Rook agents. This provides a more streamlined experience for administering your Ceph cluster, since Rook hides Ceph components like placement groups and storage maps while still offering advanced configuration options.

      Now that you have a better understanding of what Ceph is and how it is used in Rook, you will continue by setting up your Ceph cluster.

      You can complete the setup by either running the example configuration, found in the examples directory of the Rook project, or by writing your own configuration. The example configuration is fine for most use cases and provides excellent documentation of optional parameters.

Now you’ll start the creation process of a Ceph cluster Kubernetes Object.

      First, you need to create a YAML file:

      • nano cephcluster.yaml

      The configuration defines how the Ceph cluster will be deployed. In this example, you will deploy three Ceph Monitors (MON) and enable the Ceph dashboard. The Ceph dashboard is out of scope for this tutorial, but you can use it later in your own individual project for visualizing the current status of your Ceph cluster.

      Add the following content to define the apiVersion and the Kubernetes Object kind as well as the name and the namespace the Object should be deployed in:

      cephcluster.yaml

      apiVersion: ceph.rook.io/v1
      kind: CephCluster
      metadata:
        name: rook-ceph
        namespace: rook-ceph
      

      After that, add the spec key, which defines the model that Kubernetes will use to create your Ceph cluster. You’ll first define the image version you want to use and whether you allow unsupported Ceph versions or not:

      cephcluster.yaml

      spec:
        cephVersion:
          image: ceph/ceph:v14.2.8
          allowUnsupported: false
      

      Then set the data directory where configuration files will be persisted using the dataDirHostPath key:

      cephcluster.yaml

        dataDirHostPath: /var/lib/rook
      

      Next, you define if you want to skip upgrade checks and when you want to upgrade your cluster using the following parameters:

      cephcluster.yaml

        skipUpgradeChecks: false
        continueUpgradeAfterChecksEvenIfNotHealthy: false
      

      You configure the number of Ceph Monitors (MONs) using the mon key. You also allow the deployment of multiple MONs per node:

      cephcluster.yaml

        mon:
          count: 3
          allowMultiplePerNode: false
      

      Options for the Ceph dashboard are defined under the dashboard key. This gives you options to enable the dashboard, customize the port, and prefix it when using a reverse proxy:

      cephcluster.yaml

        dashboard:
          enabled: true
          # serve the dashboard under a subpath (useful when you are accessing the dashboard via a reverse proxy)
          # urlPrefix: /ceph-dashboard
          # serve the dashboard at the given port.
          # port: 8443
          # serve the dashboard using SSL
          ssl: false
      

      You can also enable monitoring of your cluster with the monitoring key (monitoring requires Prometheus to be pre-installed):

      cephcluster.yaml

        monitoring:
          enabled: false
          rulesNamespace: rook-ceph
      

RBD stands for RADOS (Reliable Autonomic Distributed Object Store) Block Device. RBDs are thin-provisioned, resizable Ceph block devices that store data on multiple nodes.

      RBD images can be asynchronously shared between two Ceph clusters by enabling rbdMirroring. Since we’re working with one cluster in this tutorial, this isn’t necessary. The number of workers is therefore set to 0:

      cephcluster.yaml

        rbdMirroring:
          workers: 0
      

      You can enable the crash collector for the Ceph daemons:

      cephcluster.yaml

        crashCollector:
          disable: false
      

The cleanup policy is only relevant when you want to delete your cluster, so this option has to be left empty here:

      cephcluster.yaml

        cleanupPolicy:
          deleteDataDirOnHosts: ""
        removeOSDsIfOutAndSafeToRemove: false
      

      The storage key lets you define the cluster level storage options; for example, which node and devices to use, the database size, and how many OSDs to create per device:

      cephcluster.yaml

        storage:
          useAllNodes: true
          useAllDevices: true
          config:
            # metadataDevice: "md0" # specify a non-rotational storage so ceph-volume will use it as block db device of bluestore.
            # databaseSizeMB: "1024" # uncomment if the disks are smaller than 100 GB
            # journalSizeMB: "1024"  # uncomment if the disks are 20 GB or smaller
      

      You use the disruptionManagement key to manage daemon disruptions during upgrade or fencing:

      cephcluster.yaml

        disruptionManagement:
          managePodBudgets: false
          osdMaintenanceTimeout: 30
          manageMachineDisruptionBudgets: false
          machineDisruptionBudgetNamespace: openshift-machine-api
      

      These configuration blocks will result in the final following file:

      cephcluster.yaml

      apiVersion: ceph.rook.io/v1
      kind: CephCluster
      metadata:
        name: rook-ceph
        namespace: rook-ceph
      spec:
        cephVersion:
          image: ceph/ceph:v14.2.8
          allowUnsupported: false
        dataDirHostPath: /var/lib/rook
        skipUpgradeChecks: false
        continueUpgradeAfterChecksEvenIfNotHealthy: false
        mon:
          count: 3
          allowMultiplePerNode: false
        dashboard:
          enabled: true
          # serve the dashboard under a subpath (useful when you are accessing the dashboard via a reverse proxy)
          # urlPrefix: /ceph-dashboard
          # serve the dashboard at the given port.
          # port: 8443
          # serve the dashboard using SSL
          ssl: false
        monitoring:
          enabled: false
          rulesNamespace: rook-ceph
        rbdMirroring:
          workers: 0
        crashCollector:
          disable: false
        cleanupPolicy:
          deleteDataDirOnHosts: ""
        removeOSDsIfOutAndSafeToRemove: false
        storage:
          useAllNodes: true
          useAllDevices: true
          config:
            # metadataDevice: "md0" # specify a non-rotational storage so ceph-volume will use it as block db device of bluestore.
            # databaseSizeMB: "1024" # uncomment if the disks are smaller than 100 GB
            # journalSizeMB: "1024"  # uncomment if the disks are 20 GB or smaller
        disruptionManagement:
          managePodBudgets: false
          osdMaintenanceTimeout: 30
          manageMachineDisruptionBudgets: false
          machineDisruptionBudgetNamespace: openshift-machine-api
      

      Once you’re done, save and exit your file.

You can also customize your deployment by, for example, changing your database size or defining a custom port for the dashboard. You can find more options for your cluster deployment in the cluster example of the Rook repository.

      Next, apply this manifest in your Kubernetes cluster:

      • kubectl apply -f cephcluster.yaml

      Now check that the pods are running:

      • kubectl get pod -n rook-ceph

      This usually takes a couple of minutes, so just refresh until your output reflects something like the following:

      Output

NAME                                                   READY   STATUS    RESTARTS   AGE
      csi-cephfsplugin-lz6dn                                 3/3     Running   0          3m54s
      csi-cephfsplugin-provisioner-674847b584-4j9jw          5/5     Running   0          3m54s
      csi-cephfsplugin-provisioner-674847b584-h2cgl          5/5     Running   0          3m54s
      csi-cephfsplugin-qbpnq                                 3/3     Running   0          3m54s
      csi-cephfsplugin-qzsvr                                 3/3     Running   0          3m54s
      csi-rbdplugin-kk9sw                                    3/3     Running   0          3m55s
      csi-rbdplugin-l95f8                                    3/3     Running   0          3m55s
      csi-rbdplugin-provisioner-64ccb796cf-8gjwv             6/6     Running   0          3m55s
      csi-rbdplugin-provisioner-64ccb796cf-dhpwt             6/6     Running   0          3m55s
      csi-rbdplugin-v4hk6                                    3/3     Running   0          3m55s
      rook-ceph-crashcollector-pool-33zy7-68cdfb6bcf-9cfkn   1/1     Running   0          109s
      rook-ceph-crashcollector-pool-33zyc-565559f7-7r6rt     1/1     Running   0          53s
      rook-ceph-crashcollector-pool-33zym-749dcdc9df-w4xzl   1/1     Running   0          78s
      rook-ceph-mgr-a-7fdf77cf8d-ppkwl                       1/1     Running   0          53s
      rook-ceph-mon-a-97d9767c6-5ftfm                        1/1     Running   0          109s
      rook-ceph-mon-b-9cb7bdb54-lhfkj                        1/1     Running   0          96s
      rook-ceph-mon-c-786b9f7f4b-jdls4                       1/1     Running   0          78s
      rook-ceph-operator-599765ff49-fhbz9                    1/1     Running   0          6m58s
      rook-ceph-osd-prepare-pool-33zy7-c2hww                 1/1     Running   0          21s
      rook-ceph-osd-prepare-pool-33zyc-szwsc                 1/1     Running   0          21s
      rook-ceph-osd-prepare-pool-33zym-2p68b                 1/1     Running   0          21s
      rook-discover-6fhlb                                    1/1     Running   0          6m21s
      rook-discover-97kmz                                    1/1     Running   0          6m21s
      rook-discover-z5k2z                                    1/1     Running   0          6m21s

      You have now successfully set up your Ceph cluster and can continue by creating your first storage block.

      Step 3 — Adding Block Storage

      Block storage allows a single pod to mount storage. In this section, you will create a storage block that you can use later in your applications.

      Before Ceph can provide storage to your cluster, you first need to create a storageclass and a cephblockpool. This will allow Kubernetes to interoperate with Rook when creating persistent volumes:

      • kubectl apply -f ./csi/rbd/storageclass.yaml

      The command will output the following:

      Output

cephblockpool.ceph.rook.io/replicapool created
      storageclass.storage.k8s.io/rook-ceph-block created

      Note: If you’ve deployed the Rook operator in a namespace other than rook-ceph you need to change the prefix in the provisioner to match the namespace you use.
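
For reference, the provisioner field inside csi/rbd/storageclass.yaml looks like this in the release-1.3 branch, where the rook-ceph prefix is the namespace the operator runs in:

      storageclass.yaml

      provisioner: rook-ceph.rbd.csi.ceph.com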

      After successfully deploying the storageclass and cephblockpool, you will continue by defining the PersistentVolumeClaim (PVC) for your application. A PersistentVolumeClaim is a resource used to request storage from your cluster.

      For that, you first need to create a YAML file:

      • nano pvc-rook-ceph-block.yaml

      Add the following for your PersistentVolumeClaim:

      pvc-rook-ceph-block.yaml

      apiVersion: v1
      kind: PersistentVolumeClaim
      metadata:
        name: mongo-pvc
      spec:
        storageClassName: rook-ceph-block
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 5Gi
      

      First, you need to set an apiVersion (v1 is the current stable version). Then you need to tell Kubernetes which type of resource you want to define using the kind key (PersistentVolumeClaim in this case).

      The spec key defines the model that Kubernetes will use to create your PersistentVolumeClaim. Here you need to select the storage class you created earlier: rook-ceph-block. You can then define the access mode and limit the resources of the claim. ReadWriteOnce means the volume can only be mounted by a single node.

      Now that you have defined the PersistentVolumeClaim, it is time to deploy it using the following command:

      • kubectl apply -f pvc-rook-ceph-block.yaml

      You will receive the following output:

      Output

      persistentvolumeclaim/mongo-pvc created

You can now check the status of your PVC:

      • kubectl get pvc

      When the PVC is bound, you are ready:

      Output

NAME        STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS      AGE
      mongo-pvc   Bound    pvc-ec1ca7d1-d069-4d2a-9281-3d22c10b6570   5Gi        RWO            rook-ceph-block   16s

You have now successfully created a storage class and used it to create a PersistentVolumeClaim that you will mount to an application to persist data in the next section.

      Step 4 — Creating a MongoDB Deployment with a rook-ceph-block

      Now that you have successfully created a storage block and a persistent volume, you will put it to use by implementing it in a MongoDB application.

      The configuration will contain a few things:

      • A single container deployment based on the latest version of the mongo image.
      • A persistent volume to preserve the data of the MongoDB database.
      • A service to expose the MongoDB port on port 31017 of every node so you can interact with it later.

First, open the configuration file:

      • nano mongo.yaml

      Start the manifest with the Deployment resource:

      mongo.yaml

      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: mongo
      spec:
        selector:
          matchLabels:
            app: mongo
        template:
          metadata:
            labels:
              app: mongo
          spec:
            containers:
            - image: mongo:latest
              name: mongo
              ports:
              - containerPort: 27017
                name: mongo
              volumeMounts:
              - name: mongo-persistent-storage
                mountPath: /data/db
            volumes:
            - name: mongo-persistent-storage
              persistentVolumeClaim:
                claimName: mongo-pvc
      
      ...
      

For each resource in the manifest, you need to set an apiVersion. Deployments use apiVersion: apps/v1, a stable version, while the Service you will add next uses the core v1 API. Then, tell Kubernetes which resource you want to define using the kind key. Each definition should also have a name defined in metadata.name.

The spec section tells Kubernetes the desired state of the deployment. This definition requests that Kubernetes create a single replica of the pod.

      Labels are key-value pairs that help you organize and cross-reference your Kubernetes resources. You can define them using metadata.labels and you can later search for them using selector.matchLabels.
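
For example, once the deployment is applied, you can list its pods by the app label used above:

      • kubectl get pods -l app=mongo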

      The spec.template key defines the model that Kubernetes will use to create each of your pods. Here you will define the specifics of your pod’s deployment like the image name, container ports, and the volumes that should be mounted. The image will then automatically be pulled from an image registry by Kubernetes.

      Here you will use the PersistentVolumeClaim you created earlier to persist the data of the /data/db directory of the pods. You can also specify extra information like environment variables that will help you with further customizing your deployment.

      Next, add the following code to the file to define a Kubernetes Service that exposes the MongoDB port on port 31017 of every node in your cluster:

      mongo.yaml

      ...
      
      ---
      apiVersion: v1
      kind: Service
      metadata:
        name: mongo
        labels:
          app: mongo
      spec:
        selector:
          app: mongo
        type: NodePort
        ports:
          - port: 27017
            nodePort: 31017
      

      Here you also define an apiVersion, but instead of using the Deployment type, you define a Service. The service will receive connections on port 31017 and forward them to the pods’ port 27017, where you can then access the application.

      The service uses NodePort as the service type, which will expose the Service on each Node’s IP at a static port between 30000 and 32767 (31017 in this case).

      Now that you have defined the deployment, it is time to deploy it:

      • kubectl apply -f mongo.yaml

      You will see the following output:

      Output

deployment.apps/mongo created
      service/mongo created

      You can check the status of the deployment and service:

      • kubectl get svc,deployments

      The output will be something like this:

      Output

NAME                 TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)           AGE
      service/kubernetes   ClusterIP   10.245.0.1       <none>        443/TCP           33m
      service/mongo        NodePort    10.245.124.118   <none>        27017:31017/TCP   4m50s

      NAME                    READY   UP-TO-DATE   AVAILABLE   AGE
      deployment.apps/mongo   1/1     1            1           4m50s
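
Because mongo is exposed as a NodePort service, you could also reach the database from outside the cluster through any node's public IP, assuming the MongoDB shell is installed on your local machine and your network allows access to port 31017 (your_node_public_ip is a placeholder):

      • mongo --host your_node_public_ip --port 31017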

      After the deployment is ready, you can start saving data into your database. The easiest way to do so is by using the MongoDB shell, which is included in the MongoDB pod you just started. You can open it using kubectl.

For that you are going to need the name of the pod, which you can get using the following command:

      • kubectl get pods

      The output will be similar to this:

      Output

NAME                     READY   STATUS    RESTARTS   AGE
      mongo-7654889675-mjcks   1/1     Running   0          13m

      Now copy the name and use it in the exec command:

      • kubectl exec -it your_pod_name mongo

Now that you are in the MongoDB shell, let's continue by creating a database:

      • use test

      The use command switches between databases or creates them if they don’t exist.

      Output

      switched to db test

      Then insert some data into your new test database. You use the insertOne() method to insert a new document in the created database:

      • db.test.insertOne( {name: "test", number: 10 })

      Output

      { "acknowledged" : true, "insertedId" : ObjectId("5f22dd521ba9331d1a145a58") }

      The next step is retrieving the data to make sure it is saved, which can be done using the find command on your collection:

      • db.getCollection("test").find()

      The output will be similar to this:

      Output

{ "_id" : ObjectId("5f1b18e34e69b9726c984c51"), "name" : "test", "number" : 10 }

      Now that you have saved some data into the database, it will be persisted in the underlying Ceph volume structure. One big advantage of this kind of deployment is the dynamic provisioning of the volume. Dynamic provisioning means that applications only need to request the storage and it will be automatically provided by Ceph instead of developers creating the storage manually by sending requests to their storage providers.
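
You can see dynamic provisioning at work by listing the PersistentVolume that Ceph created behind the scenes to satisfy your claim (the generated volume name will differ in your cluster):

      • kubectl get pv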

      Let’s validate this functionality by restarting the pod and checking if the data is still there. You can do this by deleting the pod, because it will be restarted to fulfill the state defined in the deployment:

      • kubectl delete pod -l app=mongo

Now let’s validate that the data is still there by connecting to the MongoDB shell and printing out the data. For that you first need to get your pod’s name:

      • kubectl get pods

      The output will be similar to this:

      Output

NAME                     READY   STATUS    RESTARTS   AGE
      mongo-7654889675-mjcks   1/1     Running   0          13m

      Now copy the name and use it in the exec command:

      • kubectl exec -it your_pod_name mongo

      After that, you can retrieve the data by connecting to the database and printing the whole collection:

      • use test
      • db.getCollection("test").find()

      The output will look similar to this:

      Output

{ "_id" : ObjectId("5f1b18e34e69b9726c984c51"), "name" : "test", "number" : 10 }

As you can see, the data you saved earlier is still in the database even though you restarted the pod. Now that you have successfully set up Rook and Ceph and used them to persist the data of your deployment, let’s review the Rook Toolbox and what you can do with it.

      Step 5 — Setting Up the Rook Toolbox

      The Rook Toolbox is a tool that helps you get the current state of your Ceph deployment and troubleshoot problems when they arise. It also allows you to change your Ceph configurations like enabling certain modules, creating users, or pools.

      In this section, you will install the Rook Toolbox and use it to execute basic commands like getting the current Ceph status.

The toolbox can be started by deploying the toolbox.yaml file, which is in the rook/cluster/examples/kubernetes/ceph directory you entered in Step 1:

      • kubectl apply -f toolbox.yaml

      You will receive the following output:

      Output

      deployment.apps/rook-ceph-tools created

      Now check that the pod is running:

      • kubectl -n rook-ceph get pod -l "app=rook-ceph-tools"

      Your output will be similar to this:

      Output

NAME                               READY   STATUS    RESTARTS   AGE
      rook-ceph-tools-7c5bf67444-bmpxc   1/1     Running   0          9s

      Once the pod is running you can connect to it using the kubectl exec command:

      • kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" -o jsonpath="{.items[0].metadata.name}") bash

      Let’s break this command down for better understanding:

1. The kubectl exec command lets you execute commands in a pod, like setting an environment variable or starting a service. Here you use it to open a Bash shell in the pod. The command that you want to execute is defined at the end.
      2. You use the -n flag to specify the Kubernetes namespace the pod is running in.
      3. The -i (interactive) and -t (tty) flags tell Kubernetes that you want to run the command in interactive mode with tty enabled. This lets you interact with the terminal you open.
4. $() lets you define an expression in your command. That means the expression will be evaluated (executed) before the main command, and the resulting value will then be passed to the main command as an argument. Here we define another kubectl command to get the pod with the label app=rook-ceph-tools and read its name using jsonpath. We then use the name as an argument for our first command.

Note: As already mentioned, this command will open a terminal in the pod, so your prompt will change to reflect this.

Now that you are connected to the pod, you can execute Ceph commands for checking the current status or troubleshooting error messages. For example, the ceph status command will give you the current health status of your Ceph configuration and more information like the running MONs, the currently running data pools, the available and used storage, and the current I/O operations:

      • ceph status

      Here is the output of the command:

      Output

cluster:
        id:     71522dde-064d-4cf8-baec-2f19b6ae89bf
        health: HEALTH_OK

      services:
        mon: 3 daemons, quorum a,b,c (age 23h)
        mgr: a(active, since 23h)
        osd: 3 osds: 3 up (since 23h), 3 in (since 23h)

      data:
        pools:   1 pools, 32 pgs
        objects: 61 objects, 157 MiB
        usage:   3.4 GiB used, 297 GiB / 300 GiB avail
        pgs:     32 active+clean

      io:
        client: 5.3 KiB/s wr, 0 op/s rd, 0 op/s wr

You can also query the status of specific items like your OSDs using the following command:

      • ceph osd status

      This will print information about your OSDs, like the used and available storage and the current state of each OSD:

      Output

+----+------------+-------+-------+--------+---------+--------+---------+-----------+
      | id | host       |  used | avail | wr ops | wr data | rd ops | rd data | state     |
      +----+------------+-------+-------+--------+---------+--------+---------+-----------+
      | 0  | node-3jis6 | 1165M | 98.8G |    0   |     0   |    0   |     0   | exists,up |
      | 1  | node-3jisa | 1165M | 98.8G |    0   |  5734   |    0   |     0   | exists,up |
      | 2  | node-3jise | 1165M | 98.8G |    0   |     0   |    0   |     0   | exists,up |
      +----+------------+-------+-------+--------+---------+--------+---------+-----------+

      More information about the available commands and how you can use them to debug your Ceph deployment can be found in the official documentation.

      You have now successfully set up a complete Rook Ceph cluster on Kubernetes that helps you persist the data of your deployments and share their state between the different pods without having to use some kind of external storage or provision storage manually. You also learned how to start the Rook Toolbox and use it to debug and troubleshoot your Ceph deployment.

      Conclusion

      In this article, you configured your own Rook Ceph cluster on Kubernetes and used it to provide storage for a MongoDB application. You extracted useful terminology and became familiar with the essential concepts of Rook so you can customize your deployment.

      If you are interested in learning more, consider checking out the official Rook documentation and the example configurations provided in the repository for more configuration options and parameters.

      You can also try out the other kinds of storage Ceph provides like shared file systems if you want to mount the same volume to multiple pods at the same time.




      How To Set Up and Secure an etcd Cluster with Ansible on Ubuntu 18.04


      The author selected the Wikimedia Foundation to receive a donation as part of the Write for DOnations program.

      Introduction

etcd is a distributed key-value store relied on by many platforms and tools, including Kubernetes, Vulcand, and Doorman. Within Kubernetes, etcd is used as a global configuration store that stores the state of the cluster. Knowing how to administer etcd is essential to administering a Kubernetes cluster. Whilst there are many managed Kubernetes offerings, also known as Kubernetes-as-a-Service, that remove this administrative burden from you, many companies still choose to run self-managed Kubernetes clusters on-premises because of the flexibility it brings.

      The first half of this article will guide you through setting up a 3-node etcd cluster on Ubuntu 18.04 servers. The second half will focus on securing the cluster using Transport Layer Security, or TLS. To run each setup in an automated manner, we will use Ansible throughout. Ansible is a configuration management tool similar to Puppet, Chef, and SaltStack; it allows us to define each setup step in a declarative manner, inside files called playbooks.

      At the end of this tutorial, you will have a secure 3-node etcd cluster running on your servers. You will also have an Ansible playbook that allows you to repeatedly and consistently recreate the same setup on a fresh set of servers.

      Prerequisites

      Before you begin this guide you’ll need the following:

      • Python, pip, and the pyOpenSSL package installed on your local machine. To learn how to install Python3, pip, and Python packages, refer to How To Install Python 3 and Set Up a Local Programming Environment on Ubuntu 18.04.

      • Three Ubuntu 18.04 servers on the same local network, with at least 2GB of RAM and root SSH access. You should also configure the servers to have the hostnames etcd1, etcd2, and etcd3. The steps outlined in this article would work on any generic server, not necessarily DigitalOcean Droplets. However, if you’d like to host your servers on DigitalOcean, you can follow the How to Create a Droplet from the DigitalOcean Control Panel guide to fulfil this requirement. Note that you must enable the Private Networking option when creating your Droplet. To enable private networking on existing Droplets, refer to How to Enable Private Networking on Droplets.

      Warning: Since the purpose of this article is to provide an introduction to setting up an etcd cluster on a private network, the three Ubuntu 18.04 servers in this setup were not tested with a firewall and are accessed as the root user. In a production setup, any node exposed to the public internet would require a firewall and a sudo user to adhere to security best practices. For more information, check out the Initial Server Setup with Ubuntu 18.04 tutorial.

      Step 1 — Configuring Ansible for the Control Node

      Ansible is a tool used to manage servers. The servers Ansible is managing are called the managed nodes, and the machine that is running Ansible is called the control node. Ansible works by using the SSH keys on the control node to gain access to the managed nodes. Once an SSH session is established, Ansible will run a set of scripts to provision and configure the managed nodes. In this step, we will test that we are able to use Ansible to connect to the managed nodes and run the hostname command.

      A typical day for a system administrator may involve managing different sets of nodes. For instance, you may use Ansible to provision some new servers, but later on use it to reconfigure another set of servers. To allow administrators to better organize the set of managed nodes, Ansible provides the concept of host inventory (or inventory for short). You can define every node that you wish to manage with Ansible inside an inventory file, and organize them into groups. Then, when running the ansible and ansible-playbook commands, you can specify which hosts or groups the command applies to.

      By default, Ansible reads the inventory file from /etc/ansible/hosts; however, we can specify a different inventory file by using the --inventory flag (or -i for short).

      To get started, create a new directory on your local machine (the control node) to house all the files for this tutorial:

      • mkdir -p $HOME/playground/etcd-ansible

      Then, enter into the directory you just created:

      • cd $HOME/playground/etcd-ansible

      Inside the directory, create and open a blank inventory file named hosts using your editor:

      • nano $HOME/playground/etcd-ansible/hosts

Inside the hosts file, list out each of your managed nodes in the following format, replacing etcd1_public_ip, etcd2_public_ip, and etcd3_public_ip with the actual public IP addresses of your servers:

      ~/playground/etcd-ansible/hosts

      [etcd]
      etcd1 ansible_host=etcd1_public_ip  ansible_user=root
      etcd2 ansible_host=etcd2_public_ip  ansible_user=root
      etcd3 ansible_host=etcd3_public_ip  ansible_user=root
      

      The [etcd] line defines a group called etcd. Under the group definition, we list all our managed nodes. Each line begins with an alias (e.g., etcd1), which allows us to refer to each host using an easy-to-remember name instead of a long IP address. The ansible_host and ansible_user are Ansible variables. In this case, they are used to provide Ansible with the public IP addresses and SSH usernames to use when connecting via SSH.
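
To preview which hosts a pattern matches before running any tasks, you can use the --list-hosts flag:

      • ansible etcd -i hosts --list-hosts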

      To ensure Ansible is able to connect with our managed nodes, we can test for connectivity by using Ansible to run the hostname command on each of the hosts within the etcd group:

      • ansible etcd -i hosts -m command -a hostname

      Let us break down this command to learn what each part means:

      • etcd: specifies the host pattern to use to determine which hosts from the inventory are being managed with this command. Here, we are using the group name as the host pattern.
      • -i hosts: specifies the inventory file to use.
      • -m command: the functionality behind Ansible is provided by modules. The command module takes the argument passed in and executes it as a command on each of the managed nodes. This tutorial will introduce a few more Ansible modules as we progress.
      • -a hostname: the argument to pass into the module. The number and types of arguments depend on the module.

      After running the command, you will find the following output, which means Ansible is configured correctly:

      Output

etcd2 | CHANGED | rc=0 >>
      etcd2

      etcd3 | CHANGED | rc=0 >>
      etcd3

      etcd1 | CHANGED | rc=0 >>
      etcd1
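
As another quick connectivity check, you can use Ansible's built-in ping module, which verifies both SSH connectivity and the presence of a usable Python interpreter on each managed node:

      • ansible etcd -i hosts -m ping

      Each host should reply with "ping": "pong".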

      Each command that Ansible runs is called a task. Using ansible on the command line to run tasks is called running ad-hoc commands. The upside of ad-hoc commands is that they are quick and require little setup; the downside is that they run manually, and thus cannot be committed to a version control system like Git.

      A slight improvement would be to write a shell script and run our commands using Ansible’s script module. This would allow us to record the configuration steps we took into version control. However, shell scripts are imperative, which means we are responsible for figuring out the commands to run (the “how”s) to configure the system to the desired state. Ansible, on the other hand, advocates for a declarative approach, where we define “what” the desired state of our server should be inside configuration files, and Ansible is responsible for getting the server to that desired state.

      The declarative approach is preferred because the intent of the configuration file is immediately conveyed, meaning it’s easier to understand and maintain. It also places the onus of handling edge cases on Ansible instead of the administrator, saving us a lot of work.

      Now that you have configured the Ansible control node to communicate with the managed nodes, in the next step, we will introduce you to Ansible playbooks, which allow you to specify tasks in a declarative way.

      Step 2 — Getting the Hostnames of Managed Nodes with Ansible Playbooks

      In this step, we will replicate what was done in Step 1—printing out the hostnames of the managed nodes—but instead of running ad-hoc tasks, we will define each task declaratively as an Ansible playbook and run it. The purpose of this step is to demonstrate how Ansible playbooks work; we will carry out much more substantial tasks with playbooks in later steps.

      Inside your project directory, create a new file named playbook.yaml using your editor:

      • nano $HOME/playground/etcd-ansible/playbook.yaml

      Inside playbook.yaml, add the following lines:

      ~/playground/etcd-ansible/playbook.yaml

      - hosts: etcd
        tasks:
          - name: "Retrieve hostname"
            command: hostname
            register: output
          - name: "Print hostname"
            debug: var=output.stdout_lines
      

      Close and save the playbook.yaml file by pressing CTRL+X followed by Y.

      The playbook contains a list of plays; each play contains a list of tasks that should be run on all hosts matching the host pattern specified by the hosts key. In this playbook, we have one play that contains two tasks. The first task runs the hostname command using the command module and registers the output to a variable named output. In the second task, we use the debug module to print out the stdout_lines property of the output variable.

      We can now run this playbook using the ansible-playbook command:

      • ansible-playbook -i hosts playbook.yaml

      You will find the following output, which means your playbook is working correctly:

      Output

PLAY [etcd] ***********************************************************************************************************************

      TASK [Gathering Facts] ************************************************************************************************************
      ok: [etcd2]
      ok: [etcd3]
      ok: [etcd1]

      TASK [Retrieve hostname] **********************************************************************************************************
      changed: [etcd2]
      changed: [etcd3]
      changed: [etcd1]

      TASK [Print hostname] *************************************************************************************************************
      ok: [etcd1] => {
          "output.stdout_lines": [
              "etcd1"
          ]
      }
      ok: [etcd2] => {
          "output.stdout_lines": [
              "etcd2"
          ]
      }
      ok: [etcd3] => {
          "output.stdout_lines": [
              "etcd3"
          ]
      }

      PLAY RECAP ************************************************************************************************************************
      etcd1                      : ok=3    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0
      etcd2                      : ok=3    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0
      etcd3                      : ok=3    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0

      Note: ansible-playbook sometimes uses cowsay as a playful way to print the headings. If you find a lot of ASCII-art cows printed on your terminal, now you know why. To disable this feature, set the ANSIBLE_NOCOWS environment variable to 1 prior to running ansible-playbook by running export ANSIBLE_NOCOWS=1 in your shell.

      In this step, we’ve moved from running imperative ad-hoc tasks to running declarative playbooks. In the next step, we will replace these two demo tasks with tasks that will set up our etcd cluster.

      Step 3 — Installing etcd on the Managed Nodes

      In this step, we will show you the commands to install etcd manually and demonstrate how to translate these same commands into tasks inside our Ansible playbook.

      etcd and its client etcdctl are available as binaries, which we’ll download, extract, and move to a directory that’s part of the PATH environment variable. When configured manually, these are the steps we would take on each of the managed nodes:

      • mkdir -p /opt/etcd/bin
      • cd /opt/etcd/bin
      • wget -qO- https://storage.googleapis.com/etcd/v3.3.13/etcd-v3.3.13-linux-amd64.tar.gz | tar --extract --gzip --strip-components=1
      • echo 'export PATH="$PATH:/opt/etcd/bin"' >> ~/.profile
• echo 'export ETCDCTL_API=3' >> ~/.profile

The first three commands create the target directory, then download and extract the binaries into /opt/etcd/bin/; the fourth adds that directory to the PATH environment variable. By default, the etcdctl client will use API v2 to communicate with the etcd server. Since we are running etcd v3.x, the last command sets the ETCDCTL_API environment variable to 3.

Note: Here, we are using etcd v3.3.13 built for a machine with processors that use the AMD64 instruction set. You can find binaries for other systems and other versions on the official GitHub Releases page.

      To replicate the same steps in a standardized format, we can add tasks to our playbook. Open the playbook.yaml playbook file in your editor:

      • nano $HOME/playground/etcd-ansible/playbook.yaml

      Replace the entirety of the playbook.yaml file with the following contents:

      ~/playground/etcd-ansible/playbook.yaml

      - hosts: etcd
        become: True
        tasks:
          - name: "Create directory for etcd binaries"
            file:
              path: /opt/etcd/bin
              state: directory
              owner: root
              group: root
              mode: 0700
          - name: "Download the tarball into the /tmp directory"
            get_url:
              url: https://storage.googleapis.com/etcd/v3.3.13/etcd-v3.3.13-linux-amd64.tar.gz
              dest: /tmp/etcd.tar.gz
              owner: root
              group: root
              mode: 0600
              force: True
          - name: "Extract the contents of the tarball"
            unarchive:
              src: /tmp/etcd.tar.gz
              dest: /opt/etcd/bin/
              owner: root
              group: root
              mode: 0600
              extra_opts:
                - --strip-components=1
              decrypt: True
              remote_src: True
          - name: "Set permissions for etcd"
            file:
              path: /opt/etcd/bin/etcd
              state: file
              owner: root
              group: root
              mode: 0700
          - name: "Set permissions for etcdctl"
            file:
              path: /opt/etcd/bin/etcdctl
              state: file
              owner: root
              group: root
              mode: 0700
          - name: "Add /opt/etcd/bin/ to the $PATH environment variable"
            lineinfile:
              path: /etc/profile
              line: export PATH="$PATH:/opt/etcd/bin"
              state: present
              create: True
              insertafter: EOF
          - name: "Set the ETCDCTL_API environment variable to 3"
            lineinfile:
              path: /etc/profile
              line: export ETCDCTL_API=3
              state: present
              create: True
              insertafter: EOF
      

      Each task uses a module; for this set of tasks, we are making use of the following modules:

      • file: to create the /opt/etcd/bin directory, and to later set the file permissions for the etcd and etcdctl binaries.
      • get_url: to download the gzipped tarball onto the managed nodes.
      • unarchive: to extract and unpack the etcd and etcdctl binaries from the gzipped tarball.
• lineinfile: to add entries to the /etc/profile file.

      To apply these changes, close and save the playbook.yaml file by pressing CTRL+X followed by Y. Then, on the terminal, run the same ansible-playbook command again:

      • ansible-playbook -i hosts playbook.yaml

      The PLAY RECAP section of the output will show only ok and changed:

      Output

...

      PLAY RECAP ************************************************************************************************************************
      etcd1                      : ok=8    changed=7    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0
      etcd2                      : ok=8    changed=7    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0
      etcd3                      : ok=8    changed=7    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0

To confirm a correct installation of etcd, manually SSH into one of the managed nodes and run etcd and etcdctl:

      • ssh root@etcd1_public_ip

      etcd1_public_ip is the public IP address of the server named etcd1. Once you have gained SSH access, run etcd --version to print out the version of etcd installed:

      • etcd --version

      You will find output similar to what’s shown in the following, which means the etcd binary is successfully installed:

      Output

etcd Version: 3.3.13
      Git SHA: 98d3084
      Go Version: go1.10.8
      Go OS/Arch: linux/amd64

To confirm etcdctl is successfully installed, run etcdctl version:

      • etcdctl version

      You will find output similar to the following:

      Output

etcdctl version: 3.3.13
      API version: 3.3

      Note that the output says API version: 3.3, which also confirms that our ETCDCTL_API environment variable was set correctly.

      Exit out of the etcd1 server to return to your local environment.

      We have now successfully installed etcd and etcdctl on all of our managed nodes. In the next step, we will add more tasks to our play to run etcd as a background service.

      Step 4 — Creating a Unit File for etcd

      The quickest way to run etcd with Ansible may appear to be to use the command module to run /opt/etcd/bin/etcd. However, this will not work because it will make etcd run as a foreground process. Using the command module will cause Ansible to hang as it waits for the etcd command to return, which it never will. So in this step, we are going to update our playbook to run our etcd binary as a background service instead.

      Ubuntu 18.04 uses systemd as its init system, which means we can create new services by writing unit files and placing them inside the /etc/systemd/system/ directory.

First, inside our project directory, create a new directory named files/:

      • mkdir files

      Then, using your editor, create a new file named etcd.service within that directory:

      • nano files/etcd.service

      Next, copy the following code block into the files/etcd.service file:

      ~/playground/etcd-ansible/files/etcd.service

      [Unit]
      Description=etcd distributed reliable key-value store
      
      [Service]
      Type=notify
      ExecStart=/opt/etcd/bin/etcd
      Restart=always
      

      This unit file defines a service that runs the executable at /opt/etcd/bin/etcd, notifies systemd when it has finished initializing, and always restarts if it ever exits.
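
Later, once this file has been copied onto a node, you can optionally sanity-check its syntax on that node with systemd's built-in verifier:

      • systemd-analyze verify /etc/systemd/system/etcd.service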

      Note: If you’d like to understand more about systemd and unit files, or want to tailor the unit file to your needs, read the Understanding Systemd Units and Unit Files guide.

      Close and save the files/etcd.service file by pressing CTRL+X followed by Y.

Next, we need to add a task inside our playbook that will copy the local files/etcd.service file to /etc/systemd/system/etcd.service on every managed node. We can do this using the copy module.

      Open up your playbook:

      • nano $HOME/playground/etcd-ansible/playbook.yaml

      Append the following highlighted task to the end of our existing tasks:

      ~/playground/etcd-ansible/playbook.yaml

      - hosts: etcd
        become: True
        tasks:
          ...
          - name: "Set the ETCDCTL_API environment variable to 3"
            lineinfile:
              path: /etc/profile
              line: export ETCDCTL_API=3
              state: present
              create: True
              insertafter: EOF
          - name: "Create a etcd service"
            copy:
              src: files/etcd.service
              remote_src: False
              dest: /etc/systemd/system/etcd.service
              owner: root
              group: root
              mode: 0644
      

By copying the unit file into /etc/systemd/system/etcd.service, you define a new service.

      Save and exit the playbook.

      Run the same ansible-playbook command again to apply the new changes:

      • ansible-playbook -i hosts playbook.yaml

To confirm the changes have been applied, first SSH into one of the managed nodes:

      • ssh root@etcd1_public_ip

      Then, run systemctl status etcd to query systemd about the status of the etcd service:

      • systemctl status etcd

      You will find the following output, which states that the service is loaded:

      Output

● etcd.service - etcd distributed reliable key-value store
         Loaded: loaded (/etc/systemd/system/etcd.service; static; vendor preset: enabled)
         Active: inactive (dead)
      ...

      Note: The last line (Active: inactive (dead)) of the output states that the service is inactive, which means it would not be automatically run when the system starts. This is expected and not an error.

Press q to return to the shell, and then run exit to exit out of the managed node and back to your local shell:

      • exit

      In this step, we updated our playbook to run the etcd binary as a systemd service. In the next step, we will continue to set up etcd by providing it space to store its data.

      Step 5 — Configuring the Data Directory

      etcd is a key-value data store, which means we must provide it with space to store its data. In this step, we are going to update our playbook to define a dedicated data directory for etcd to use.

      Open up your playbook:

      • nano $HOME/playground/etcd-ansible/playbook.yaml

      Append the following task to the end of the list of tasks:

      ~/playground/etcd-ansible/playbook.yaml

      - hosts: etcd
        become: True
        tasks:
          ...
          - name: "Create a etcd service"
            copy:
              src: files/etcd.service
              remote_src: False
              dest: /etc/systemd/system/etcd.service
              owner: root
              group: root
              mode: 0644
          - name: "Create a data directory"
            file:
              path: /var/lib/etcd/{{ inventory_hostname }}.etcd
              state: directory
              owner: root
              group: root
              mode: 0755
      

      Here, we are using /var/lib/etcd/hostname.etcd as the data directory, where hostname is the hostname of the current managed node. inventory_hostname is a variable that represents the hostname of the current managed node; its value is populated by Ansible automatically. The curly-braces syntax (i.e., {{ inventory_hostname }}) is used for variable substitution, supported by the Jinja2 template engine, which is the default templating engine for Ansible.
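
As a standalone illustration of this substitution (not a task you need to add to the playbook), a debug task like the following would print the resolved data directory path on each node:

      - name: "Show the resolved data directory path"
        debug:
          msg: "Data directory: /var/lib/etcd/{{ inventory_hostname }}.etcd"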

      Close the text editor and save the file.

      Next, we need to instruct etcd to use this data directory. We do this by passing in the data-dir parameter to etcd. To set etcd parameters, we can use a combination of environment variables, command-line flags, and configuration files. For this tutorial, we will use a configuration file, as it is much neater to isolate all configurations into a file, rather than have configuration littered across our playbook.

In your project directory, create a new directory named templates/:

      • mkdir templates

      Then, using your editor, create a new file named etcd.conf.yaml.j2 within the directory:

      • nano templates/etcd.conf.yaml.j2

      Next, copy the following line and paste it into the file:

      ~/playground/etcd-ansible/templates/etcd.conf.yaml.j2

      data-dir: /var/lib/etcd/{{ inventory_hostname }}.etcd
      

      This file uses the same Jinja2 variable substitution syntax as our playbook. To substitute the variables and upload the result to each managed host, we can use the template module. It works in a similar way to copy, except it will perform variable substitution prior to upload.

      Exit from etcd.conf.yaml.j2, then open up your playbook:

      • nano $HOME/playground/etcd-ansible/playbook.yaml

      Append the following tasks to the list of tasks to create a directory and upload the templated configuration file into it:

      ~/playground/etcd-ansible/playbook.yaml

      - hosts: etcd
        become: True
        tasks:
          ...
          - name: "Create a data directory"
            file:
              ...
              mode: 0755
          - name: "Create directory for etcd configuration"
            file:
              path: /etc/etcd
              state: directory
              owner: root
              group: root
              mode: 0755
          - name: "Create configuration file for etcd"
            template:
              src: templates/etcd.conf.yaml.j2
              dest: /etc/etcd/etcd.conf.yaml
              owner: root
              group: root
              mode: 0600
      

      Save and close this file.

      Because we’ve made this change, we need to update our service’s unit file to pass it the location of our configuration file (i.e., /etc/etcd/etcd.conf.yaml).

      Open the files/etcd.service file on your local machine:

      • nano $HOME/playground/etcd-ansible/files/etcd.service

      Update it by adding the --config-file flag shown in the following:

      ~/playground/etcd-ansible/files/etcd.service

      [Unit]
      Description=etcd distributed reliable key-value store
      
      [Service]
      Type=notify
      ExecStart=/opt/etcd/bin/etcd --config-file /etc/etcd/etcd.conf.yaml
      Restart=always
      

      Save and close this file.

      In this step, we used our playbook to provide a data directory for etcd to store its data. In the next step, we will add a couple more tasks to restart the etcd service and have it run on startup.

      Step 6 — Enabling and Starting the etcd Service

      Whenever we make changes to the unit file of a service, we need to restart the service to have it take effect. We can do this by running the systemctl restart etcd command. Furthermore, to make the etcd service start automatically on system startup, we need to run systemctl enable etcd. In this step, we will run those two commands using the playbook.

      To run commands, we can use the command module:

      • nano $HOME/playground/etcd-ansible/playbook.yaml

      Append the following tasks to the end of the task list:

      ~/playground/etcd-ansible/playbook.yaml

      - hosts: etcd
        become: True
        tasks:
          ...
          - name: "Create configuration file for etcd"
            template:
              ...
              mode: 0600
          - name: "Enable the etcd service"
            command: systemctl enable etcd
          - name: "Start the etcd service"
            command: systemctl restart etcd
      

      Save and close the file.
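
      As an aside, Ansible also ships a systemd module that can express the same intent declaratively, including a daemon reload after unit file changes. The following task is a minimal sketch of an equivalent alternative, not something this tutorial requires:

      - name: "Enable and restart the etcd service"
        systemd:
          name: etcd
          enabled: True
          state: restarted
          daemon_reload: True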

      Run ansible-playbook -i hosts playbook.yaml once more:

      • ansible-playbook -i hosts playbook.yaml

      To check that the etcd service is now restarted and enabled, SSH into one of the managed nodes:

      • ssh root@etcd1_public_ip

      Then, run systemctl status etcd to check the status of the etcd service:

      • systemctl status etcd

      You will find enabled and active (running) in the output, as in the following; this means the changes we made in our playbook have taken effect:

      Output

      ● etcd.service - etcd distributed reliable key-value store
         Loaded: loaded (/etc/systemd/system/etcd.service; static; vendor preset: enabled)
         Active: active (running)
       Main PID: 19085 (etcd)
          Tasks: 11 (limit: 2362)

      In this step, we used the command module to run systemctl commands that restart and enable the etcd service on our managed nodes. Now that we have set up an etcd installation, we will, in the next step, test out its functionality by carrying out some basic create, read, update, and delete (CRUD) operations.

      Step 7 — Testing etcd

      Although we have a working etcd installation, it is insecure and not yet ready for production use. But before we secure our etcd setup in later steps, let’s first understand what etcd can do in terms of functionality. In this step, we are going to manually send requests to etcd to add, retrieve, update, and delete data from it.

      By default, etcd exposes an API that listens on port 2379 for client communication. This means we can send raw API requests to etcd using an HTTP client. However, it’s quicker to use the official etcd client etcdctl, which allows you to create/update, retrieve, and delete key-value pairs using the put, get, and del subcommands, respectively.

      Make sure you’re still inside the etcd1 managed node, and run the following etcdctl commands to confirm your etcd installation is working.

      First, create a new entry using the put subcommand.

      The put subcommand has the following syntax:

      etcdctl put key value
      

      On etcd1, run the following command:

      • etcdctl put foo "bar"

      The command we just ran instructs etcd to write the value "bar" to the key foo in the store.

      You will then find OK printed in the output, which indicates the data persisted:

      Output

      OK

      We can then retrieve this entry using the get subcommand, which has the syntax etcdctl get key:

      • etcdctl get foo

      You will find this output, which shows the key on the first line and the value you inserted earlier on the second line:

      Output

      foo
      bar

      We can delete the entry using the del subcommand, which has the syntax etcdctl del key:

      • etcdctl del foo

      You will find the following output, which indicates the number of entries deleted:

      Output

      1

      Now, let’s run the get subcommand once more in an attempt to retrieve a deleted key-value pair:

      • etcdctl get foo

      You will receive no output, which means etcdctl was unable to retrieve the key-value pair. This confirms that after an entry is deleted, it can no longer be retrieved.

      Now that you’ve tested the basic operations of etcd and etcdctl, let’s exit out of the managed node and back to your local environment:

      • exit

      In this step, we used the etcdctl client to send requests to etcd. At this point, we are running three separate instances of etcd, each acting independently of the others. However, etcd is designed as a distributed key-value store, which means multiple etcd instances can group up to form a single cluster; each instance then becomes a member of the cluster. After forming a cluster, you would be able to retrieve a key-value pair that was inserted by a different member of the cluster. In the next step, we will use our playbook to transform our three single-node clusters into a single three-node cluster.

      Step 8 — Forming a Cluster Using Static Discovery

      To create one 3-node cluster instead of three 1-node clusters, we must configure these etcd installations to communicate with each other. This means each one must know the IP addresses of the others. This process is called discovery. Discovery can be done using either static configuration or dynamic service discovery. In this step, we will discuss the difference between the two, as well as update our playbook to set up an etcd cluster using static discovery.

      Discovery by static configuration is the method that requires the least setup; this is where the endpoints of each member are passed into the etcd command before it is executed. To use static configuration, the following conditions must be met prior to the initialization of the cluster:

      • the number of members is known
      • the endpoints of each member are known
      • the IP addresses for all endpoints are static

      If these conditions cannot be met, then you can use a dynamic discovery service. With dynamic service discovery, all instances would register with the discovery service, which allows each member to retrieve information about the location of other members.

      Since we know we want a 3-node etcd cluster, and all our servers have static IP addresses, we will use static discovery. To initiate our cluster using static discovery, we must add several parameters to our configuration file. Use an editor to open up the templates/etcd.conf.yaml.j2 template file:

      • nano templates/etcd.conf.yaml.j2

      Then, add the following highlighted lines:

      ~/playground/etcd-ansible/templates/etcd.conf.yaml.j2

      data-dir: /var/lib/etcd/{{ inventory_hostname }}.etcd
      name: {{ inventory_hostname }}
      initial-advertise-peer-urls: http://{{ hostvars[inventory_hostname]['ansible_facts']['eth1']['ipv4']['address'] }}:2380
      listen-peer-urls: http://{{ hostvars[inventory_hostname]['ansible_facts']['eth1']['ipv4']['address'] }}:2380,http://127.0.0.1:2380
      advertise-client-urls: http://{{ hostvars[inventory_hostname]['ansible_facts']['eth1']['ipv4']['address'] }}:2379
      listen-client-urls: http://{{ hostvars[inventory_hostname]['ansible_facts']['eth1']['ipv4']['address'] }}:2379,http://127.0.0.1:2379
      initial-cluster-state: new
      initial-cluster: {% for host in groups['etcd'] %}{{ hostvars[host]['ansible_facts']['hostname'] }}=http://{{ hostvars[host]['ansible_facts']['eth1']['ipv4']['address'] }}:2380{% if not loop.last %},{% endif %}{% endfor %}
      

      Close and save the templates/etcd.conf.yaml.j2 file by pressing CTRL+X followed by Y.

      Here’s a brief explanation of each parameter:

      • name – a human-readable name for the member. By default, etcd uses a unique, randomly-generated ID to identify each member; however, a human-readable name allows us to reference it more easily inside configuration files and on the command line. Here, we will use the hostnames as the member names (i.e., etcd1, etcd2, and etcd3).
      • initial-advertise-peer-urls – a list of IP address/port combinations that other members can use to communicate with this member. In addition to the client API port (2379), etcd also exposes port 2380 for peer communication between etcd members, which allows them to send messages to each other and exchange data. Note that these URLs must be reachable by the member’s peers (and therefore must not be local addresses).
      • listen-peer-urls – a list of IP address/port combinations where the current member will listen for communication from other members. This must include all the URLs from the --initial-advertise-peer-urls flag, but also local URLs like 127.0.0.1:2380. The destination IP address/port of incoming peer messages must match one of the URLs listed here.
      • advertise-client-urls – a list of IP address/port combinations that clients should use to communicate with this member. These URLs must be reachable by the client (and not be a local address). If the client is accessing the cluster over the public internet, this must be a public IP address.
      • listen-client-urls – a list of IP address/port combinations where the current member will listen for communication from clients. This must include all the URLs from the --advertise-client-urls flag, but also local URLs like 127.0.0.1:2379. The destination IP address/port of incoming client messages must match one of the URLs listed here.
      • initial-cluster – a list of endpoints for each member of the cluster. Each endpoint must match one of the corresponding member’s initial-advertise-peer-urls URLs.
      • initial-cluster-state – either new or existing.

      To ensure consistency, etcd can only make decisions when a majority of the nodes are healthy. This is known as establishing quorum. In other words, in a three-member cluster, quorum is reached if two or more of the members are healthy.

      If the initial-cluster-state parameter is set to new, etcd knows that this is a new cluster being bootstrapped and allows members to start in parallel, without waiting for quorum to be reached. More concretely, after the first member starts, it does not have quorum, because one member out of three (33.3%) is not a majority. Normally, etcd would halt, refuse to commit any more actions, and the cluster would never be formed. However, with initial-cluster-state set to new, it ignores the initial lack of quorum.

      If set to existing, the member will try to join an existing cluster, and expects quorum to already be established.

      Note: You can find more details about all supported configuration flags in the Configuration section of etcd’s documentation.

      In the updated templates/etcd.conf.yaml.j2 template file, there are a few instances of hostvars. When Ansible runs, it collects variables from a variety of sources. We have already made use of the inventory_hostname variable, but there are many more available. These variables are available under hostvars[inventory_hostname]['ansible_facts']. Here, we are extracting the private IP address of each node and using it to construct our parameter values.

      Note: Because we enabled the Private Networking option when we created our servers, each server has three IP addresses associated with it:

      • A loopback IP address – an address that is only valid inside the same machine. It is used for the machine to refer to itself, e.g., 127.0.0.1
      • A public IP address – an address that is routable over the public internet, e.g., 178.128.169.51
      • A private IP address – an address that is routable only within the private network; in the case of DigitalOcean Droplets, there’s a private network within each datacenter, e.g., 10.131.82.225

      Each of these IP addresses is associated with a different network interface: the loopback address with the lo interface, the public IP address with the eth0 interface, and the private IP address with the eth1 interface. We are using the eth1 interface so that all traffic stays within the private network, without ever reaching the internet.

      Understanding of network interfaces is not required for this article, but if you’d like to learn more, An Introduction to Networking Terminology, Interfaces, and Protocols is a great place to start.

      The {% %} Jinja2 syntax defines the for loop structure that iterates through every node in the etcd group to build up the initial-cluster string into a format required by etcd.
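
      To make the rendered result concrete, with three nodes whose private IPs were (hypothetically) 10.131.0.1, 10.131.0.2, and 10.131.0.3, the loop would render initial-cluster to a single comma-separated line like this:

      initial-cluster: etcd1=http://10.131.0.1:2380,etcd2=http://10.131.0.2:2380,etcd3=http://10.131.0.3:2380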

      To form the new three-member cluster, you must first stop the etcd service and clear the data directory before launching the cluster. To do this, use an editor to open up the playbook.yaml file on your local machine:

      • nano $HOME/playground/etcd-ansible/playbook.yaml

      Then, before the "Create a data directory" task, add a task to stop the etcd service:

      ~/playground/etcd-ansible/playbook.yaml

      - hosts: etcd
        become: True
        tasks:
          ...
              group: root
              mode: 0644
          - name: "Stop the etcd service"
            command: systemctl stop etcd
          - name: "Create a data directory"
            file:
          ...
      

      Next, update the "Create a data directory" task to first delete the data directory and recreate it:

      ~/playground/etcd-ansible/playbook.yaml

      - hosts: etcd
        become: True
        tasks:
          ...
          - name: "Stop the etcd service"
            command: systemctl stop etcd
          - name: "Create a data directory"
            file:
              path: /var/lib/etcd/{{ inventory_hostname }}.etcd
              state: "{{ item }}"
              owner: root
              group: root
              mode: 0755
            with_items:
              - absent
              - directory
          - name: "Create directory for etcd configuration"
            file:
          ...
      

      The with_items property defines a list of strings that this task will iterate over. It is equivalent to repeating the same task twice but with different values for the state property. Here, we are iterating over the list with the items absent and directory, which ensures that the data directory is deleted first and then re-created.
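
      To make the iteration concrete, the looped task behaves like the following two sequential tasks. This is only an illustrative sketch; you do not need to add it to your playbook:

      # First iteration (state=absent): remove the data directory if it exists
      - name: "Create a data directory"
        file:
          path: /var/lib/etcd/{{ inventory_hostname }}.etcd
          state: absent
      # Second iteration (state=directory): recreate it fresh
      - name: "Create a data directory"
        file:
          path: /var/lib/etcd/{{ inventory_hostname }}.etcd
          state: directory
          owner: root
          group: root
          mode: 0755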

      Close and save the playbook.yaml file by pressing CTRL+X followed by Y. Then, run ansible-playbook again. Ansible will now create a single, 3-member etcd cluster:

      • ansible-playbook -i hosts playbook.yaml

      You can check this by SSH-ing into any etcd member node:

      Then run etcdctl endpoint health --cluster:

      • etcdctl endpoint health --cluster

      This will list out the health of each member of the cluster:

      Output

      http://etcd2_private_ip:2379 is healthy: successfully committed proposal: took = 2.517267ms
      http://etcd1_private_ip:2379 is healthy: successfully committed proposal: took = 2.153612ms
      http://etcd3_private_ip:2379 is healthy: successfully committed proposal: took = 2.639277ms

      We have now successfully created a 3-node etcd cluster. We can confirm this by adding an entry to etcd on one member node and retrieving it on another member node. On one of the member nodes, run etcdctl put:

      • etcdctl put foo "bar"

      Then, use a new terminal to SSH into a different member node:

      • ssh root@etcd2_public_ip

      Next, attempt to retrieve the same entry using the key:

      • etcdctl get foo

      You will be able to retrieve the entry, which proves that the cluster is working:

      Output

      foo
      bar

      Lastly, exit out of each of the managed nodes and back to your local machine:

      • exit

      In this step, we provisioned a new 3-node cluster. At the moment, communication between etcd members and their peers and clients is conducted over HTTP. This means the communication is unencrypted, and any party who can intercept the traffic can read the messages. This is not a big issue if the etcd cluster and clients are all deployed within a private network or virtual private network (VPN) which you fully control. However, if any of the traffic needs to travel through a shared network (private or public), then you should ensure this traffic is encrypted. Furthermore, a mechanism needs to be put in place for a client or peer to verify the authenticity of the server.

      In the next step, we will look at how to secure client-to-server as well as peer communication using TLS.

      Step 9 — Obtaining the Private IP Addresses of Managed Nodes

      To encrypt messages between member nodes, etcd uses Hypertext Transfer Protocol Secure, or HTTPS, which is a layer on top of the Transport Layer Security, or TLS, protocol. TLS uses a system of private keys, certificates, and trusted entities called Certificate Authorities (CAs) to authenticate with, and send encrypted messages to, each other.

      In this tutorial, each member node needs to generate a certificate to identify itself, and have this certificate signed by a CA. We will configure all member nodes to trust this CA, and thus also trust any certificates it signs. This allows member nodes to mutually authenticate with each other.

      The certificate that a member node generates must allow other member nodes to identify it. All certificates include the Common Name (CN) of the entity they are associated with, which is often used as the identity of the entity. However, when verifying a certificate, client implementations may compare whether the information they collected about the entity matches what is given in the certificate. For example, if a client downloads a TLS certificate with the subject CN=foo.bar.com, but actually connects to the server using an IP address (e.g., 167.71.129.110), then there’s a mismatch and the client may not trust the certificate. By specifying a subject alternative name (SAN) in the certificate, we inform the verifier that both names belong to the same entity.

      Because our etcd members are peering with each other using their private IP addresses, when we define our certificates, we’ll need to provide these private IP addresses as the subject alternative names.

      To find out the private IP address of a managed node, SSH into it:

      • ssh root@etcd1_public_ip

      Then run the following command:

      • ip -f inet addr show eth1

      You’ll find output similar to the following lines:

      Output

      3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
          inet 10.131.255.176/16 brd 10.131.255.255 scope global eth1
             valid_lft forever preferred_lft forever

      In our example output, 10.131.255.176 is the private IP address of the managed node, and the only piece of information we are interested in. To filter out everything apart from the private IP, we can pipe the output of the ip command to the sed utility, which is used to filter and transform text:

      • ip -f inet addr show eth1 | sed -En -e 's/.*inet ([0-9.]+).*/\1/p'

      Now, the only output is the private IP address itself:

      Output

      10.131.255.176

      Once you’re satisfied that the preceding command works, exit out of the managed node:

      • exit

      To incorporate the preceding commands into our playbook, first open up the playbook.yaml file:

      • nano $HOME/playground/etcd-ansible/playbook.yaml

      Then, add a new play with a single task before our existing play:

      ~/playground/etcd-ansible/playbook.yaml

      ...
      - hosts: etcd
        tasks:
          - shell: ip -f inet addr show eth1 | sed -En -e 's/.*inet ([0-9.]+).*/\1/p'
            register: privateIP
      - hosts: etcd
        tasks:
      ...
      

      The task uses the shell module to run the ip and sed commands, which fetches the private IP address of the managed node. It then registers the return value of the shell command inside a variable named privateIP, which we will use later.
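
      If you are curious about the registered variable’s structure, the shell module returns a dictionary, and the captured standard output lives under its stdout key. A temporary debug task like this sketch (purely illustrative) would print it during a run:

      - name: "Print the discovered private IP"
        debug:
          var: privateIP.stdout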

      In this step, we added a task to the playbook to obtain the private IP address of the managed nodes. In the next step, we are going to use this information to generate certificates for each member node, and have these certificates signed by a Certificate Authority (CA).

      Step 10 — Generating etcd Members’ Private Keys and CSRs

      In order for a member node to receive encrypted traffic, the sender must use the member node’s public key to encrypt the data, and the member node must use its private key to decrypt the ciphertext and retrieve the original data. The public key is packaged into a certificate and signed by a CA to ensure that it is genuine.

      Therefore, we will need to generate a private key and certificate signing request (CSR) for each etcd member node. To make it easier for us, we will generate all key pairs and sign all certificates locally, on the control node, and then copy the relevant files to the managed hosts.

      First, create a directory called artifacts/, where we’ll place the files (keys and certificates) generated during the process. Open the playbook.yaml file with an editor:

      • nano $HOME/playground/etcd-ansible/playbook.yaml

      In it, use the file module to create the artifacts/ directory:

      ~/playground/etcd-ansible/playbook.yaml

      ...
          - shell: ip -f inet addr show eth1 | sed -En -e 's/.*inet ([0-9.]+).*/\1/p'
            register: privateIP
      - hosts: localhost
        gather_facts: False
        become: False
        tasks:
          - name: "Create ./artifacts directory to house keys and certificates"
            file:
              path: ./artifacts
              state: directory
      - hosts: etcd
        tasks:
      ...
      

      Next, add another task to the end of the play to generate the private key:

      ~/playground/etcd-ansible/playbook.yaml

      ...
      - hosts: localhost
        gather_facts: False
        become: False
        tasks:
              ...
          - name: "Generate private key for each member"
            openssl_privatekey:
              path: ./artifacts/{{item}}.key
              type: RSA
              size: 4096
              state: present
              force: True
            with_items: "{{ groups['etcd'] }}"
      - hosts: etcd
        tasks:
      ...
      

      Creating private keys and CSRs can be done using the openssl_privatekey and openssl_csr modules, respectively.

      The force: True attribute ensures that the private key is regenerated each time, even if it exists already.

      Similarly, append the following new task to the same play to generate the CSRs for each member, using the openssl_csr module:

      ~/playground/etcd-ansible/playbook.yaml

      ...
      - hosts: localhost
        gather_facts: False
        become: False
        tasks:
          ...
          - name: "Generate private key for each member"
            openssl_privatekey:
              ...
            with_items: "{{ groups['etcd'] }}"
          - name: "Generate CSR for each member"
            openssl_csr:
              path: ./artifacts/{{item}}.csr
              privatekey_path: ./artifacts/{{item}}.key
              common_name: "{{item}}"
              key_usage:
                - digitalSignature
              extended_key_usage:
                - serverAuth
              subject_alt_name:
                - IP:{{ hostvars[item]['privateIP']['stdout']}}
                - IP:127.0.0.1
              force: True
            with_items: "{{ groups['etcd'] }}"
      

      We are specifying that this certificate can be involved in a digital signature mechanism for the purpose of server authentication. This certificate is associated with the hostname (e.g., etcd1), but the verifier should also treat each node’s private and local loopback IP addresses as alternative names. Note that we are using the privateIP variable that we registered in the previous play.

      Close and save the playbook.yaml file by pressing CTRL+X followed by Y. Then, run our playbook again:

      • ansible-playbook -i hosts playbook.yaml

      We will now find a new directory called artifacts within our project directory; use ls to list out its contents:

      • ls artifacts

      You will find the private keys and CSRs for each of the etcd members:

      Output

      etcd1.csr etcd1.key etcd2.csr etcd2.key etcd3.csr etcd3.key
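
      Optionally, you can decode one of the generated CSRs to confirm that the Common Name and subject alternative names were set as expected. OpenSSL can print its contents; etcd1.csr here is just one of the three files:

      • openssl req -in artifacts/etcd1.csr -noout -text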

      In this step, we used several Ansible modules to generate private keys and public key certificates for each of the member nodes. In the next step, we will look at how to sign a certificate signing request (CSR).

      Step 11 — Generating CA Certificates

      Within an etcd cluster, member nodes encrypt messages using the receiver’s public key. To ensure the public key is genuine, the receiver packages the public key into a certificate signing request (CSR) and has a trusted entity (i.e., the CA) sign the CSR. Since we control all the member nodes and the CAs they trust, we don’t need to use an external CA and can act as our own CA. In this step, we are going to act as our own CA, which means we’ll need to generate a private key and a self-signed certificate to function as the CA.

      First, open the playbook.yaml file with your editor:

      • nano $HOME/playground/etcd-ansible/playbook.yaml

      Then, similar to the previous step, append a task to the localhost play to generate a private key for the CA:

      ~/playground/etcd-ansible/playbook.yaml

      - hosts: localhost
        ...
        tasks:
          ...
        - name: "Generate CSR for each member"
          ...
          with_items: "{{ groups['etcd'] }}"
          - name: "Generate private key for CA"
            openssl_privatekey:
              path: ./artifacts/ca.key
              type: RSA
              size: 4096
              state: present
              force: True
      - hosts: etcd
        become: True
        tasks:
          - name: "Create directory for etcd binaries"
      ...
      

      Next, use the openssl_csr module to generate a new CSR. This is similar to the previous step, but in this CSR, we are adding the basic constraint and key usage extension to indicate that this certificate can be used as a CA certificate:

      ~/playground/etcd-ansible/playbook.yaml

      - hosts: localhost
        ...
        tasks:
          ...
          - name: "Generate private key for CA"
            openssl_privatekey:
              path: ./artifacts/ca.key
              type: RSA
              size: 4096
              state: present
              force: True
          - name: "Generate CSR for CA"
            openssl_csr:
              path: ./artifacts/ca.csr
              privatekey_path: ./artifacts/ca.key
              common_name: ca
              organization_name: "Etcd CA"
              basic_constraints:
                - CA:TRUE
                - pathlen:1
              basic_constraints_critical: True
              key_usage:
                - keyCertSign
                - digitalSignature
              force: True
      - hosts: etcd
        become: True
        tasks:
          - name: "Create directory for etcd binaries"
      ...
      

      Lastly, use the openssl_certificate module to self-sign the CSR:

      ~/playground/etcd-ansible/playbook.yaml

      - hosts: localhost
        ...
        tasks:
          ...
          - name: "Generate CSR for CA"
            openssl_csr:
              path: ./artifacts/ca.csr
              privatekey_path: ./artifacts/ca.key
              common_name: ca
              organization_name: "Etcd CA"
              basic_constraints:
                - CA:TRUE
                - pathlen:1
              basic_constraints_critical: True
              key_usage:
                - keyCertSign
                - digitalSignature
              force: True
          - name: "Generate self-signed CA certificate"
            openssl_certificate:
              path: ./artifacts/ca.crt
              privatekey_path: ./artifacts/ca.key
              csr_path: ./artifacts/ca.csr
              provider: selfsigned
              force: True
      - hosts: etcd
        become: True
        tasks:
          - name: "Create directory for etcd binaries"
      ...
      

      Close and save the playbook.yaml file by pressing CTRL+X followed by Y. Then, run our playbook again to apply the changes:

      • ansible-playbook -i hosts playbook.yaml

      You can also run ls to check the contents of the artifacts/ directory:

      You will now find the freshly generated CA certificate (ca.crt):

      Output

      ca.crt ca.csr ca.key etcd1.csr etcd1.key etcd2.csr etcd2.key etcd3.csr etcd3.key

      In this step, we generated a private key and a self-signed certificate for the CA. In the next step, we will use the CA certificate to sign each member’s CSR.

      Step 12 — Signing the etcd Members’ CSRs

      In this step, we are going to sign each member node’s CSR. This will be similar to how we used the openssl_certificate module to self-sign the CA certificate, but instead of using the selfsigned provider, we will use the ownca provider, which allows us to sign using our own CA certificate.

      Open up your playbook:

      • nano $HOME/playground/etcd-ansible/playbook.yaml

      Append the following task after the "Generate self-signed CA certificate" task:

      ~/playground/etcd-ansible/playbook.yaml

      - hosts: localhost
        ...
        tasks:
          ...
          - name: "Generate self-signed CA certificate"
            openssl_certificate:
              path: ./artifacts/ca.crt
              privatekey_path: ./artifacts/ca.key
              csr_path: ./artifacts/ca.csr
              provider: selfsigned
              force: True
          - name: "Generate an `etcd` member certificate signed with our own CA certificate"
            openssl_certificate:
              path: ./artifacts/{{item}}.crt
              csr_path: ./artifacts/{{item}}.csr
              ownca_path: ./artifacts/ca.crt
              ownca_privatekey_path: ./artifacts/ca.key
              provider: ownca
              force: True
            with_items: "{{ groups['etcd'] }}"
      - hosts: etcd
        become: True
        tasks:
          - name: "Create directory for etcd binaries"
      ...
      

      Close and save the playbook.yaml file by pressing CTRL+X followed by Y. Then, run the playbook again to apply the changes:

      • ansible-playbook -i hosts playbook.yaml

      Now, list out the contents of the artifacts/ directory:

      • ls artifacts

      You will find the private key, CSR, and certificate for every etcd member and the CA:

      Output

      ca.crt ca.csr ca.key etcd1.crt etcd1.csr etcd1.key etcd2.crt etcd2.csr etcd2.key etcd3.crt etcd3.csr etcd3.key
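
      As an optional sanity check, you can verify a member certificate against the CA certificate locally with OpenSSL, again using etcd1 as an example:

      • openssl verify -CAfile artifacts/ca.crt artifacts/etcd1.crt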

      In this step, we signed each member node’s CSR using the CA’s key. In the next step, we are going to copy the relevant files to each managed node, so that etcd has access to the keys and certificates it needs to set up TLS connections.

      Step 13 — Copying Private Keys and Certificates

      Every node needs to have a copy of the CA’s self-signed certificate (ca.crt). Each etcd member node also needs to have its own private key and certificate. In this step, we are going to upload these files and place them in a new /etc/etcd/ssl/ directory.

      To start, open the playbook.yaml file with your editor:

      • nano $HOME/playground/etcd-ansible/playbook.yaml

      First, update the path property of the Create directory for etcd configuration task so that it creates the /etc/etcd/ssl/ directory as well:

      ~/playground/etcd-ansible/playbook.yaml

      - hosts: etcd
        ...
        tasks:
          ...
            with_items:
              - absent
              - directory
          - name: "Create directory for etcd configuration"
            file:
              path: "{{ item }}"
              state: directory
              owner: root
              group: root
              mode: 0755
            with_items:
              - /etc/etcd
              - /etc/etcd/ssl
          - name: "Create configuration file for etcd"
            template:
      ...
      

      Then, following the modified task, add three more tasks to copy the files over:

      ~/playground/etcd-ansible/playbook.yaml

      - hosts: etcd
        ...
        tasks:
          ...
          - name: "Copy over the CA certificate"
            copy:
              src: ./artifacts/ca.crt
              remote_src: False
              dest: /etc/etcd/ssl/ca.crt
              owner: root
              group: root
              mode: 0644
          - name: "Copy over the `etcd` member certificate"
            copy:
              src: ./artifacts/{{inventory_hostname}}.crt
              remote_src: False
              dest: /etc/etcd/ssl/server.crt
              owner: root
              group: root
              mode: 0644
          - name: "Copy over the `etcd` member key"
            copy:
              src: ./artifacts/{{inventory_hostname}}.key
              remote_src: False
              dest: /etc/etcd/ssl/server.key
              owner: root
              group: root
              mode: 0600
          - name: "Create configuration file for etcd"
            template:
      ...
      

      Close and save the playbook.yaml file by pressing CTRL+X followed by Y.

      Run ansible-playbook again to make these changes:

      • ansible-playbook -i hosts playbook.yaml

      In this step, we have successfully uploaded the private keys and certificates to the managed nodes. Having copied the files over, we now need to update our etcd configuration file to make use of them.

      Step 14 — Enabling TLS on etcd

      In the last step of this tutorial, we are going to update some Ansible configurations to enable TLS in an etcd cluster.

      First, open up the templates/etcd.conf.yaml.j2 template file using your editor:

      • nano $HOME/playground/etcd-ansible/templates/etcd.conf.yaml.j2

      Once inside, change all URLs to use https as the protocol instead of http. Additionally, add a section at the end of the template to specify the location of the CA certificate, server certificate, and server key:

      ~/playground/etcd-ansible/templates/etcd.conf.yaml.j2

      data-dir: /var/lib/etcd/{{ inventory_hostname }}.etcd
      name: {{ inventory_hostname }}
      initial-advertise-peer-urls: https://{{ hostvars[inventory_hostname]['ansible_facts']['eth1']['ipv4']['address'] }}:2380
      listen-peer-urls: https://{{ hostvars[inventory_hostname]['ansible_facts']['eth1']['ipv4']['address'] }}:2380,https://127.0.0.1:2380
      advertise-client-urls: https://{{ hostvars[inventory_hostname]['ansible_facts']['eth1']['ipv4']['address'] }}:2379
      listen-client-urls: https://{{ hostvars[inventory_hostname]['ansible_facts']['eth1']['ipv4']['address'] }}:2379,https://127.0.0.1:2379
      initial-cluster-state: new
      initial-cluster: {% for host in groups['etcd'] %}{{ hostvars[host]['ansible_facts']['hostname'] }}=https://{{ hostvars[host]['ansible_facts']['eth1']['ipv4']['address'] }}:2380{% if not loop.last %},{% endif %}{% endfor %}
      
      client-transport-security:
        cert-file: /etc/etcd/ssl/server.crt
        key-file: /etc/etcd/ssl/server.key
        trusted-ca-file: /etc/etcd/ssl/ca.crt
      peer-transport-security:
        cert-file: /etc/etcd/ssl/server.crt
        key-file: /etc/etcd/ssl/server.key
        trusted-ca-file: /etc/etcd/ssl/ca.crt
      

      Close and save the templates/etcd.conf.yaml.j2 file.

      Next, run your Ansible playbook:

      • ansible-playbook -i hosts playbook.yaml

      Then, SSH into one of the managed nodes:

      • ssh root@etcd1_public_ip

      Once inside, run the etcdctl endpoint health command to check whether the endpoints are using HTTPS, and if all members are healthy:

      • etcdctl --cacert /etc/etcd/ssl/ca.crt endpoint health --cluster

      Because our CA certificate is not, by default, a trusted root CA certificate installed in the /etc/ssl/certs/ directory, we need to pass it to etcdctl using the --cacert flag.

      This will give the following output:

      Output

      https://etcd3_private_ip:2379 is healthy: successfully committed proposal: took = 19.237262ms
      https://etcd1_private_ip:2379 is healthy: successfully committed proposal: took = 4.769088ms
      https://etcd2_private_ip:2379 is healthy: successfully committed proposal: took = 5.953599ms
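
      As an aside, if you would rather not repeat the --cacert flag on every command, etcdctl (v3) can also read its flags from ETCDCTL_-prefixed environment variables. Assuming the etcd v3 release installed in this tutorial, you could export the CA path once per shell session:

      • export ETCDCTL_CACERT=/etc/etcd/ssl/ca.crt

      Subsequent etcdctl commands in that session can then omit the --cacert flag.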

      To confirm that the etcd cluster is actually working, we can, once again, create an entry on one member node, and retrieve it from another member node:

      • etcdctl --cacert /etc/etcd/ssl/ca.crt put foo "bar"

      Use a new terminal to SSH into a different node:

      • ssh root@etcd2_public_ip

      Now retrieve the same entry using the key foo:

      • etcdctl --cacert /etc/etcd/ssl/ca.crt get foo

      This will return the entry, showing the output below:

      Output

      foo
      bar

      You can do the same on the third node to ensure all three members are operational.

      Conclusion

      You have now successfully provisioned a 3-node etcd cluster, secured it with TLS, and confirmed that it is working.

      etcd is a tool originally created by CoreOS. To understand etcd’s usage in relation to CoreOS, you can read How To Use Etcdctl and Etcd, CoreOS’s Distributed Key-Value Store. The article also guides you through setting up a dynamic discovery model, something which was discussed but not demonstrated in this tutorial.

      As mentioned at the beginning of this tutorial, etcd is an important part of the Kubernetes ecosystem. To learn more about Kubernetes and etcd’s role within it, you can read An Introduction to Kubernetes. If you are deploying etcd as part of a Kubernetes cluster, know that there are other tools available, such as kubespray and kubeadm. For more details on the latter, you can read How To Create a Kubernetes Cluster Using Kubeadm on Ubuntu 18.04.

      Finally, this tutorial made use of many tools but could not dive into each in much detail. Each tool’s official documentation provides a more detailed examination.




      Getting Started with Load Balancing on a Linode Kubernetes Engine (LKE) Cluster



      The Linode Kubernetes Engine (LKE) is Linode’s managed Kubernetes service. When you deploy an LKE cluster, you receive a Kubernetes Master which runs your cluster’s control plane components, at no additional cost. The control plane includes Linode’s Cloud Controller Manager (CCM), which provides a way for your cluster to access additional Linode services. Linode’s CCM provides access to Linode’s load balancing service, Linode NodeBalancers.

      NodeBalancers provide your Kubernetes cluster with a reliable way of exposing resources to the public internet. The LKE control plane handles the creation and deletion of the NodeBalancer, and correctly identifies the resources, and their networking, that the NodeBalancer will route traffic to. Whenever a Kubernetes Service of the LoadBalancer type is created, your Kubernetes cluster will create a Linode NodeBalancer service with the help of the Linode CCM.

      Note

      Adding external Linode NodeBalancers to your LKE cluster will incur additional costs. See Linode’s Pricing page for details.

      Note

      All existing LKE clusters receive CCM updates automatically every two weeks when a new LKE release is deployed. See the LKE Changelog for information on the latest LKE release.

      Note

      The Linode Terraform K8s module also deploys a Kubernetes cluster with the Linode CCM installed by default. Any Kubernetes cluster with a Linode CCM installation can make use of Linode NodeBalancers in the ways described in this guide.

      In this Guide

      This guide will show you how to add Linode NodeBalancers to your LKE cluster, view their details, configure them with annotations (including TLS termination and session affinity), and remove them.

      Before You Begin

      This guide assumes you have a working Kubernetes cluster that was deployed using the Linode Kubernetes Engine (LKE), for example through the Linode Cloud Manager or the Linode API.

      Adding Linode NodeBalancers to your Kubernetes Cluster

      To add an external load balancer to your Kubernetes cluster you can add the example lines to a new configuration file, or more commonly, to a Service file. When the configuration is applied to your cluster, Linode NodeBalancers will be created, and added to your Kubernetes cluster. Your cluster will be accessible via a public IP address and the NodeBalancers will route external traffic to a Service running on healthy nodes in your cluster.

      Note

      Billing for Linode NodeBalancers begins as soon as the example configuration is successfully applied to your Kubernetes cluster.

      
      spec:
        type: LoadBalancer
        ports:
        - name: http
          port: 80
          protocol: TCP
          targetPort: 80
      • The spec.type of LoadBalancer is responsible for telling Kubernetes to create a Linode NodeBalancer.
      • The remaining lines provide port definitions for your Service’s Pods and map an incoming port to a container’s targetPort.
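
      For context, here is a minimal, self-contained Service manifest of the kind this section describes. The example-service name and the app: example-app selector are illustrative assumptions; applying a file like this (for instance with kubectl apply -f example-service.yaml) is what triggers the CCM to provision a NodeBalancer:

      apiVersion: v1
      kind: Service
      metadata:
        name: example-service
      spec:
        type: LoadBalancer
        selector:
          app: example-app
        ports:
        - name: http
          port: 80
          protocol: TCP
          targetPort: 80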

      Viewing Linode NodeBalancer Details

      To view details about running NodeBalancers on your cluster:

      1. Get the services running on your cluster:

        kubectl get services
        

        You will see a similar output:

          
        NAME            TYPE           CLUSTER-IP      EXTERNAL-IP    PORT(S)        AGE
        kubernetes      ClusterIP      10.128.0.1      <none>         443/TCP        3h5m
        example-service LoadBalancer   10.128.171.88   45.79.246.55   80:30028/TCP   36m
              
        
        • Viewing the entry for the example-service, you can find your NodeBalancer’s public IP under the EXTERNAL-IP column.
        • The PORT(S) column displays the example-service incoming port and NodePort.
      2. View details about the example-service to retrieve information about the deployed NodeBalancers:

        kubectl describe service example-service
        
          
        Name:                     nginx-service
        Namespace:                default
        Labels:                   app=nginx
        Annotations:              service.beta.kubernetes.io/linode-loadbalancer-throttle: 4
        Selector:                 app=nginx
        Type:                     LoadBalancer
        IP:                       10.128.171.88
        LoadBalancer Ingress:     192.0.2.0
        Port:                     http  80/TCP
        TargetPort:               80/TCP
        NodePort:                 http  30028/TCP
        Endpoints:                10.2.1.2:80,10.2.1.3:80,10.2.2.2:80
        Session Affinity:         None
        External Traffic Policy:  Cluster
        Events:                   <none>
        

      Configuring your Linode NodeBalancers with Annotations

      The Linode CCM accepts annotations that configure the behavior and settings of your cluster’s underlying NodeBalancers.

      • The table below provides a list of all available annotation suffixes.
      • Each annotation must be prefixed with service.beta.kubernetes.io/linode-loadbalancer-. For example, the complete value for the throttle annotation is service.beta.kubernetes.io/linode-loadbalancer-throttle.
      • Annotation values such as http are case-sensitive.

      Annotations Reference

      • throttle: integer between 0 and 20 (0 disables the throttle). Default: 20. The client connection throttle limits the number of new connections per second from the same client IP.

      • default-protocol: string; one of tcp, http, or https. Default: tcp. Specifies the protocol for the NodeBalancer to use.

      • port-*: a JSON object of port configurations, for example { "tls-secret-name": "prod-app-tls", "protocol": "https" }. Default: none. Specifies a NodeBalancer port to configure, e.g., port-443. Ports 1-65534 are available for balancing. The available port configurations are:

        "tls-secret-name": use this key to provide a Kubernetes Secret name when setting up TLS termination for a service to be accessed over HTTPS. The Secret type should be kubernetes.io/tls.

        "protocol": specifies the protocol to use for this port, i.e., tcp, http, or https. The default protocol is tcp, unless you provided a different configuration for the default-protocol annotation.

      • check-type: string; one of none, connection, http, or http_body. Default: none. The type of health check to perform on Nodes to ensure that they are serving requests. The behavior for each check is the following:

        none: no check is performed

        connection: checks for a valid TCP handshake

        http: checks for a 2xx or 3xx response code

        http_body: checks for a specific string within the response body of the health check URL. Use the check-body annotation to provide the string to use for the check.

      • check-path: string. Default: none. The URL path that the NodeBalancer will use to check on the health of the back-end Nodes.

      • check-body: string. Default: none. The string that must be present in the response body of the URL path used for health checks. You must have a check-type annotation configured for an http_body check.

      • check-interval: integer. Default: none. The duration, in seconds, between health checks.

      • check-timeout: integer between 1 and 30. Default: none. Duration, in seconds, to wait for a health check to succeed before it is considered a failure.

      • check-attempts: integer between 1 and 30. Default: none. Number of health checks to perform before removing a back-end Node from service.

      • check-passive: boolean. Default: false. When true, 5xx status codes will cause the health check to fail.

      • preserve: boolean. Default: false. When true, deleting a LoadBalancer Service does not delete the underlying NodeBalancer.


      Configuring Linode NodeBalancers for TLS Encryption

      This section describes how to set up TLS termination on your Linode NodeBalancers so a Kubernetes Service can be accessed over HTTPS.

      Generating a TLS type Secret

      Kubernetes allows you to store sensitive information in a Secret object for use within your cluster. This is useful for storing things like passwords and API tokens. In this section, you will create a Kubernetes secret to store Transport Layer Security (TLS) certificates and keys that you will then use to configure TLS termination on your Linode NodeBalancers.

      In the context of the Linode CCM, Secrets are useful for storing Transport Layer Security (TLS) certificates and keys. The linode-loadbalancer-tls annotation requires TLS certificates and keys to be stored as Kubernetes Secrets with the type tls. Follow the steps in this section to create a Kubernetes TLS Secret.
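
      For reference, a Secret of type kubernetes.io/tls has the following shape once created. The example-secret name is illustrative, and the data values are base64-encoded placeholders rather than real key material:

      apiVersion: v1
      kind: Secret
      metadata:
        name: example-secret
      type: kubernetes.io/tls
      data:
        tls.crt: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0t...
        tls.key: LS0tLS1CRUdJTiBQUklWQVRFIEtFWS0tLS0t...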


      1. Generate a TLS key and certificate using a TLS toolkit like OpenSSL. Be sure to change the CN and O values to those of your own website domain.

        openssl req -newkey rsa:4096 \
            -x509 \
            -sha256 \
            -days 3650 \
            -nodes \
            -out tls.crt \
            -keyout tls.key \
            -subj "/CN=mywebsite.com/O=mywebsite.com"
        
      2. Create the secret using the create secret tls command. Ensure you replace $SECRET_NAME with the name you’d like to give to your secret. This will be how you reference the secret in your Service manifest.

        kubectl create secret tls $SECRET_NAME --cert tls.crt --key tls.key
        
      3. You can check to make sure your Secret has been successfully stored by using describe:

        kubectl describe secret $SECRET_NAME
        

        You should see output like the following:

          
        kubectl describe secret docteamdemosite
        Name:         my-secret
        Namespace:    default
        Labels:       <none>
        Annotations:  <none>
        
        Type:  kubernetes.io/tls
        
        Data
        ====
        tls.crt:  1164 bytes
        tls.key:  1704 bytes
        
        

        If your key is not formatted correctly you’ll receive an error stating that there is no PEM formatted data within the key file.

      Configuring TLS within a Service

      In order to use https you’ll need to instruct the Service to use the correct port using the required annotations. You can add the following code snippet to a Service file to enable TLS termination on your NodeBalancers:

      example-service.yaml
      
      ...
      metadata:
        annotations:
          service.beta.kubernetes.io/linode-loadbalancer-default-protocol: http
          service.beta.kubernetes.io/linode-loadbalancer-port-443: '{ "tls-secret-name": "example-secret", "protocol": "https" }'
      ...
      • The service.beta.kubernetes.io/linode-loadbalancer-default-protocol annotation configures the NodeBalancer’s default protocol.

      • service.beta.kubernetes.io/linode-loadbalancer-port-443 specifies port 443 as the port to be configured. The value of this annotation is a JSON object designating the TLS secret name to use (example-secret) and the protocol to use for the port being configured (https).

      If you have multiple Secrets and ports for different environments (testing, staging, etc.), you can define more than one secret and port pair:

      example-service.yaml
      
      ...
      metadata:
        annotations:
          service.beta.kubernetes.io/linode-loadbalancer-default-protocol: http
          service.beta.kubernetes.io/linode-loadbalancer-port-443: '{ "tls-secret-name": "example-secret", "protocol": "https" }'
          service.beta.kubernetes.io/linode-loadbalancer-port-8443: '{ "tls-secret-name": "example-secret-staging", "protocol": "https" }'
      ...

      Configuring Session Affinity for Cluster Pods

      kube-proxy will always attempt to proxy traffic to a random backend Pod. To direct traffic to the same Pod, you can use the sessionAffinity mechanism. When set to ClientIP, sessionAffinity ensures that all traffic from the same IP will be directed to the same Pod. You can add the example lines to a Service configuration file to configure session affinity for your cluster:

      
      spec:
        type: LoadBalancer
        selector:
          app: example-app
        sessionAffinity: ClientIP
        sessionAffinityConfig:
          clientIP:
            timeoutSeconds: 100

      Removing Linode NodeBalancers from your Kubernetes Cluster

      To delete a NodeBalancer and the Service that it represents, you can use the Service manifest file you used to create the NodeBalancer. Simply use the delete command and supply your file name with the -f flag:

      kubectl delete -f example-service.yaml
      

      Similarly, you can delete the Service by name:

      kubectl delete service example-service
      

      After deleting your service, its corresponding NodeBalancer will be removed from your Linode account.

      Note

      If your Service file used the preserve annotation, the underlying NodeBalancer will not be removed from your Linode account. See the annotations reference for details.

      This guide is published under a CC BY-ND 4.0 license.


