
      How To Install and Configure an Apache ZooKeeper Cluster on Ubuntu 18.04


      The author selected Wikimedia Foundation Inc. to receive a donation as part of the Write for DOnations program.

      Introduction

      Apache ZooKeeper is open-source software that enables resilient and highly reliable distributed coordination. It is commonly used in distributed systems to manage configuration information, naming services, distributed synchronization, quorum, and state. In addition, distributed systems rely on ZooKeeper to implement consensus, leader election, and group management.

      In this guide, you will install and configure Apache ZooKeeper 3.4.13 on Ubuntu 18.04. To achieve resilience and high availability, ZooKeeper is intended to be replicated over a set of hosts, called an ensemble. First, you will create a standalone installation of a single-node ZooKeeper server and then add in details for setting up a multi-node cluster. The standalone installation is useful in development and testing environments, but a cluster is the most practical solution for production environments.

      Prerequisites

      Before you begin this installation and configuration guide, you’ll need the following:

      • The standalone installation needs one Ubuntu 18.04 server with a minimum of 4GB of RAM set up by following the Ubuntu 18.04 initial server setup guide, including a non-root user with sudo privileges and a firewall. You need two additional servers, set up by following the same steps, for the multi-node cluster.
      • OpenJDK 8 installed on your server, as ZooKeeper requires Java to run. To do this, follow the “Install Specific Versions of OpenJDK” step from the How To Install Java with `apt` on Ubuntu 18.04 guide.

      Because ZooKeeper keeps data in memory to achieve high throughput and low latency, production systems work best with 8GB of RAM. Lower amounts of RAM may lead to JVM swapping, which could cause ZooKeeper server latency. High ZooKeeper server latency could result in issues like client session timeouts that would have an adverse impact on system functionality.

      Step 1 — Creating a User for ZooKeeper

      A dedicated user should run services that handle requests over a network and consume resources. This practice creates segregation and control that will improve your environment’s security and manageability. In this step, you’ll create a non-root sudo user, named zk in this tutorial, to run the ZooKeeper service.

      First, log in as the non-root sudo user that you created in the prerequisites.

      ssh sammy@your_server_ip
      

Create the user that will run the ZooKeeper service:

• sudo useradd -m zk

Passing the -m flag to the useradd command will create a home directory for this user. The home directory for zk will be /home/zk by default.

      Set bash as the default shell for the zk user:

      • sudo usermod --shell /bin/bash zk

Set a password for this user:

• sudo passwd zk

Next, you will add the zk user to the sudo group so it can run commands in a privileged mode:

• sudo usermod -aG sudo zk

      In terms of security, it is recommended that you allow SSH access to as few users as possible. Logging in remotely as sammy and then using su to switch to the desired user creates a level of separation between credentials for accessing the system and running processes. You will disable SSH access for both your zk and root user in this step.

      Open your sshd_config file:

      • sudo nano /etc/ssh/sshd_config

      Locate the PermitRootLogin line and set the value to no to disable SSH access for the root user:

      /etc/ssh/sshd_config

      PermitRootLogin no
      

      Under the PermitRootLogin value, add a DenyUsers line and set the value as any user who should have SSH access disabled:

      /etc/ssh/sshd_config

      DenyUsers zk
      

      Save and exit the file and then restart the SSH daemon to activate the changes.

      • sudo systemctl restart sshd

Switch to the zk user:

• su -l zk

      The -l flag invokes a login shell after switching users. A login shell resets environment variables and provides a clean start for the user.

      Enter the password at the prompt to authenticate the user.

      Now that you have created, configured, and logged in as the zk user, you will create a directory to store your ZooKeeper data.

      Step 2 — Creating a Data Directory for ZooKeeper

      ZooKeeper persists all configuration and state data to disk so it can survive a reboot. In this step, you will create a data directory that ZooKeeper will use to read and write data. You can create the data directory on the local filesystem or on a remote storage drive. This tutorial will focus on creating the data directory on your local filesystem.

      Create a directory for ZooKeeper to use:

      • sudo mkdir -p /data/zookeeper

      Grant your zk user ownership to the directory:

      • sudo chown zk:zk /data/zookeeper

      chown changes the ownership and group of the /data/zookeeper directory so that the user zk, who belongs to the group zk, owns the data directory.

      You have successfully created and configured the data directory. When you move on to configure ZooKeeper, you will specify this path as the data directory that ZooKeeper will use to store its files.

      Step 3 — Downloading and Extracting the ZooKeeper Binaries

      In this step, you will manually download and extract the ZooKeeper binaries to the /opt directory. You can use the Advanced Packaging Tool, apt, to download ZooKeeper, but it may install an older version with different features. Installing ZooKeeper manually will give you full control to choose which version you would like to use.

Since you are downloading these files manually, start by changing to the /opt directory:

• cd /opt

      From your local machine, navigate to the Apache download page. This page will automatically provide you with the mirror closest to you for the fastest download. Click the link to the suggested mirror site, then scroll down and click zookeeper/ to view the available releases. Select the version of ZooKeeper that you would like to install. This tutorial will focus on using 3.4.13. Once you select the version, right click the binary file ending with .tar.gz and copy the link address.

      From your server, use the wget command along with the copied link to download the ZooKeeper binaries:

      • sudo wget http://apache.osuosl.org/zookeeper/zookeeper-3.4.13/zookeeper-3.4.13.tar.gz

      Extract the binaries from the compressed archive:

      • sudo tar -xvf zookeeper-3.4.13.tar.gz

      The .tar.gz extension represents a combination of TAR packaging followed by a GNU zip (gzip) compression. You will notice that you passed the flag -xvf to the command to extract the archive. The flag x stands for extract, v enables verbose mode to show the extraction progress, and f allows specifying the input, in our case zookeeper-3.4.13.tar.gz, as opposed to STDIN.

      Next, give the zk user ownership of the extracted binaries so that it can run the executables. You can change ownership like so:

      • sudo chown zk:zk -R zookeeper-3.4.13

      Next, you will configure a symbolic link to ensure that your ZooKeeper directory will remain relevant across updates. You can also use symbolic links to shorten directory names, which can lessen the time it takes to set up your configuration files.

      Create a symbolic link using the ln command.

      • sudo ln -s zookeeper-3.4.13 zookeeper

      Change the ownership of that link to zk:zk. Notice that you have passed a -h flag to change the ownership of the link itself. Not specifying -h changes the ownership of the target of the link, which you explicitly did in the previous step.

      • sudo chown -h zk:zk zookeeper

      With the symbolic links created, your directory paths in the configurations will remain relevant and unchanged through future upgrades. You can now configure ZooKeeper.
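
For example, if you later upgrade to a newer release (the version number below is hypothetical), you can extract it alongside the old one in /opt and repoint the link without touching any configuration that references /opt/zookeeper:

• sudo ln -sfn zookeeper-3.4.14 zookeeper
• sudo chown -h zk:zk zookeeper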

      Step 4 — Configuring ZooKeeper

      Now that you've set up your environment, you are ready to configure ZooKeeper.

      The configuration file will live in the /opt/zookeeper/conf directory. This directory contains a sample configuration file that comes with the ZooKeeper distribution. This sample file, named zoo_sample.cfg, contains the most common configuration parameter definitions and sample values for these parameters. Some of the common parameters are as follows:

      • tickTime: Sets the length of a tick in milliseconds. A tick is a time unit used by ZooKeeper to measure the length between heartbeats. Minimum session timeouts are twice the tickTime.
      • dataDir: Specifies the directory used to store snapshots of the in-memory database and the transaction log for updates. You could choose to specify a separate directory for transaction logs.
      • clientPort: The port used to listen for client connections.
      • maxClientCnxns: Limits the maximum number of client connections.

      Create a configuration file named zoo.cfg at /opt/zookeeper/conf. You can create and open a file using nano or your favorite editor:

      • nano /opt/zookeeper/conf/zoo.cfg

      Add the following set of properties and values to that file:

      /opt/zookeeper/conf/zoo.cfg

      tickTime=2000
      dataDir=/data/zookeeper
      clientPort=2181
      maxClientCnxns=60
      

      A tickTime of 2000 milliseconds is the suggested interval between heartbeats. A shorter interval could lead to system overhead with limited benefits. The dataDir parameter points to the path defined by the symbolic link you created in the previous section. Conventionally, ZooKeeper uses port 2181 to listen for client connections. In most situations, 60 allowed client connections are plenty for development and testing.
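
If you later decide to keep transaction logs on a separate disk, as mentioned earlier, ZooKeeper supports an optional dataLogDir parameter in the same file. The path below is only an example; you would need to create it and grant the zk user ownership, just as you did for /data/zookeeper:

/opt/zookeeper/conf/zoo.cfg

dataLogDir=/datalog/zookeeper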

      Save the file and exit the editor.

      You have configured ZooKeeper and are ready to start the server.

      Step 5 — Starting ZooKeeper and Testing the Standalone Installation

      You've configured all the components needed to run ZooKeeper. In this step, you will start the ZooKeeper service and test your configuration by connecting to the service locally.

Navigate back to the /opt/zookeeper directory:

• cd /opt/zookeeper

Start ZooKeeper with the zkServer.sh command:

• bin/zkServer.sh start

      You will see the following on your standard output:

      Output

ZooKeeper JMX enabled by default
Using config: /opt/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED

      Connect to the local ZooKeeper server with the following command:

      • bin/zkCli.sh -server 127.0.0.1:2181

      You will get a prompt with the label CONNECTED. This confirms that you have a successful local, standalone ZooKeeper installation. If you encounter errors, you will want to verify that the configuration is correct.

      Output

Connecting to 127.0.0.1:2181
...
...
[zk: 127.0.0.1:2181(CONNECTED) 0]

      Type help on this prompt to get a list of commands that you can execute from the client. The output will be as follows:

      Output

[zk: 127.0.0.1:2181(CONNECTED) 0] help
ZooKeeper -server host:port cmd args
    stat path [watch]
    set path data [version]
    ls path [watch]
    delquota [-n|-b] path
    ls2 path [watch]
    setAcl path acl
    setquota -n|-b val path
    history
    redo cmdno
    printwatches on|off
    delete path [version]
    sync path
    listquota path
    rmr path
    get path [watch]
    create [-s] [-e] path data acl
    addauth scheme auth
    quit
    getAcl path
    close
    connect host:port

After you've done some testing, close the client session by typing quit at the prompt. The ZooKeeper service will continue running after you close the client session. Shut down the ZooKeeper service, as you'll configure it as a systemd service in the next step:

• bin/zkServer.sh stop

      You have now installed, configured, and tested a standalone ZooKeeper service. This setup is useful to familiarize yourself with ZooKeeper, but is also helpful for developmental and testing environments. Now that you know the configuration works, you will configure systemd to simplify the management of your ZooKeeper service.

      Step 6 — Creating and Using a Systemd Unit File

systemd, the system and service manager, is an init system used to bootstrap the user space and to manage system processes after boot. You can create a daemon for starting and checking the status of ZooKeeper using systemd.

      Systemd Essentials is a great introductory resource for learning more about systemd and its constituent components.

      Use your editor to create a .service file named zk.service at /etc/systemd/system/.

      • sudo nano /etc/systemd/system/zk.service

      Add the following lines to the file to define the ZooKeeper Service:

      /etc/systemd/system/zk.service

      [Unit]
      Description=Zookeeper Daemon
      Documentation=http://zookeeper.apache.org
      Requires=network.target
      After=network.target
      
      [Service]    
      Type=forking
      WorkingDirectory=/opt/zookeeper
      User=zk
      Group=zk
      ExecStart=/opt/zookeeper/bin/zkServer.sh start /opt/zookeeper/conf/zoo.cfg
      ExecStop=/opt/zookeeper/bin/zkServer.sh stop /opt/zookeeper/conf/zoo.cfg
      ExecReload=/opt/zookeeper/bin/zkServer.sh restart /opt/zookeeper/conf/zoo.cfg
      TimeoutSec=30
      Restart=on-failure
      
      [Install]
WantedBy=multi-user.target
      

      The Service section in the unit file configuration specifies the working directory, the user under which the service would run, and the executable commands to start, stop, and restart the ZooKeeper service. For additional information on all the unit file configuration options, you can read the Understanding Systemd Units and Unit Files article.

      Save the file and exit the editor.

Now that your systemd configuration is in place, you can start the service:

• sudo systemctl start zk

Once you've confirmed that your systemd file can successfully start the service, you will enable the service to start on boot:

• sudo systemctl enable zk

      This output confirms the creation of the symbolic link:

      Output

      Created symlink /etc/systemd/system/multi-user.target.wants/zk.service → /etc/systemd/system/zk.service.

Check the status of the ZooKeeper service using:

• sudo systemctl status zk

Stop the ZooKeeper service using systemctl:

• sudo systemctl stop zk

      Finally, to restart the daemon, use the following command:

      • sudo systemctl restart zk

systemd has become the init system of choice on many Linux distributions. Now that you've configured systemd to manage ZooKeeper, you can leverage this fast and flexible init model to start, stop, and restart the ZooKeeper service.

      Step 7 — Configuring a Multi-Node ZooKeeper Cluster

      While the standalone ZooKeeper server is useful for development and testing, every production environment should have a replicated multi-node cluster.

Nodes in the ZooKeeper cluster that work together as an application form a quorum. Quorum refers to the minimum number of nodes that need to agree on a transaction before it's committed. A quorum needs an odd number of nodes so that it can establish a majority. An even number of nodes may result in a tie, which would mean the nodes would not reach a majority or consensus. For example, a three-node ensemble tolerates the failure of one node, while a five-node ensemble tolerates two.

      In a production environment, you should run each ZooKeeper node on a separate host. This prevents service disruption due to host hardware failure or reboots. This is an important and necessary architectural consideration for building a resilient and highly available distributed system.

      In this tutorial, you will install and configure three nodes in the quorum to demonstrate a multi-node setup. Before you configure a three-node cluster, you will spin up two additional servers with the same configuration as your standalone ZooKeeper installation. Ensure that the two additional nodes meet the prerequisites, and then follow steps one through six to set up a running ZooKeeper instance.

      Once you've followed steps one through six for the new nodes, open zoo.cfg in the editor on each node.

      • sudo nano /opt/zookeeper/conf/zoo.cfg

      All nodes in a quorum will need the same configuration file. In your zoo.cfg file on each of the three nodes, add the additional configuration parameters and values for initLimit, syncLimit, and the servers in the quorum, at the end of the file.

      /opt/zookeeper/conf/zoo.cfg

      tickTime=2000
      dataDir=/data/zookeeper
      clientPort=2181
      maxClientCnxns=60
      initLimit=10
      syncLimit=5
      server.1=your_zookeeper_node_1:2888:3888
      server.2=your_zookeeper_node_2:2888:3888
      server.3=your_zookeeper_node_3:2888:3888
      

initLimit specifies the number of ticks that the initial synchronization phase can take. This is the time within which each of the nodes in the quorum needs to connect to the leader. syncLimit specifies the number of ticks that can pass between sending a request and receiving an acknowledgment. This is the maximum time nodes can be out of sync from the leader. With a tickTime of 2000 milliseconds, initLimit=10 allows 20 seconds for initial synchronization and syncLimit=5 allows 10 seconds between request and acknowledgment. ZooKeeper nodes use a pair of ports, :2888 and :3888, for follower nodes to connect to the leader node and for leader election, respectively.

      Once you've updated the file on each node, you will save and exit the editor.

      To complete your multi-node configuration, you will specify a node ID on each of the servers. To do this, you will create a myid file on each node. Each file will contain a number that correlates to the server number assigned in the configuration file.

      On your_zookeeper_node_1, create the myid file that will specify the node ID:

      • sudo nano /data/zookeeper/myid

      Since your_zookeeper_node_1 is identified as server.1, you will enter 1 to define the node ID. After adding the value, your file will look like this:

      your_zookeeper_node_1 /data/zookeeper/myid

      1

      Follow the same steps for the remaining nodes. The myid file on each node should be as follows:

      your_zookeeper_node_1 /data/zookeeper/myid

      1

      your_zookeeper_node_2 /data/zookeeper/myid

      2

      your_zookeeper_node_3 /data/zookeeper/myid

      3
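
If you prefer not to open an editor on every node, you can write each myid file with a single command instead. For example, on your_zookeeper_node_2:

• echo 2 | sudo tee /data/zookeeper/myid

Use 3 in the same command on your_zookeeper_node_3.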

      You have now configured a three-node ZooKeeper cluster. Next, you will run the cluster and test your installation.

      Step 8 — Running and Testing the Multi-Node Installation

      With each node configured to work as a cluster, you are ready to start a quorum. In this step, you will start the quorum on each node and then test your cluster by creating sample data in ZooKeeper.

To start a quorum node, first change to the /opt/zookeeper directory on each node:

• cd /opt/zookeeper

      Start each node with the following command:

      • java -cp zookeeper-3.4.13.jar:lib/log4j-1.2.17.jar:lib/slf4j-log4j12-1.7.25.jar:lib/slf4j-api-1.7.25.jar:conf org.apache.zookeeper.server.quorum.QuorumPeerMain conf/zoo.cfg

      As nodes start up, you will intermittently see some connection errors followed by a stage where they join the quorum and elect a leader among themselves. After a few seconds of initialization, you can start testing your installation.

      Log in via SSH to your_zookeeper_node_3 as the non-root user you configured in the prerequisites:

      • ssh sammy@your_zookeeper_node_3

      Once logged in, switch to your zk user:

• su -l zk

      Enter the password for the zk user. Once logged in, change the directory to /opt/zookeeper:

• cd /opt/zookeeper

      You will now start a ZooKeeper command line client and connect to ZooKeeper on your_zookeeper_node_1:


      • bin/zkCli.sh -server your_zookeeper_node_1:2181

      In the standalone installation, both the client and server were running on the same host. This allowed you to establish a client connection with the ZooKeeper server using localhost. Since the client and server are running on different nodes in your multi-node cluster, in the previous step you needed to specify the IP address of your_zookeeper_node_1 to connect to it.

      You will see the familiar prompt with the CONNECTED label, similar to what you saw in Step 5.

      Next, you will create, list, and then delete a znode. The znodes are the fundamental abstractions in ZooKeeper that are analogous to files and directories on a file system. ZooKeeper maintains its data in a hierarchical namespace, and znodes are the data registers of this namespace.

      Testing that you can successfully create, list, and then delete a znode is essential to establishing that your ZooKeeper cluster is installed and configured correctly.

      Create a znode named zk_znode_1 and associate the string sample_data with it.

      • create /zk_znode_1 sample_data

      You will see the following output once created:

      Output

      Created /zk_znode_1

List the newly created znode:

• ls /

Get the data associated with it:

• get /zk_znode_1

      ZooKeeper will respond like so:

      Output

[zk: your_zookeeper_node_1:2181(CONNECTED)] ls /
[zk_znode_1, zookeeper]
[zk: your_zookeeper_node_1:2181(CONNECTED)] get /zk_znode_1
sample_data
cZxid = 0x100000002
ctime = Tue Nov 06 19:47:41 UTC 2018
mZxid = 0x100000002
mtime = Tue Nov 06 19:47:41 UTC 2018
pZxid = 0x100000002
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 11
numChildren = 0

The output confirms the value, sample_data, that you associated with zk_znode_1. ZooKeeper also provides additional information about creation time, ctime, and modification time, mtime. ZooKeeper is a versioned data store, so it also presents you with metadata about the data version.
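
If you would like to see the versioning in action before cleaning up, you can optionally update the znode and fetch it again; the value below is only an example:

• set /zk_znode_1 new_sample_data
• get /zk_znode_1

The dataVersion field in the response increments from 0 to 1, and mtime reflects the time of the update.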

Delete the zk_znode_1 znode:

• delete /zk_znode_1

      In this step, you successfully tested connectivity between two of your ZooKeeper nodes. You also learned basic znode management by creating, listing, and deleting znodes. Your multi-node configuration is complete, and you are ready to start using ZooKeeper.

      Conclusion

      In this tutorial, you configured and tested both a standalone and multi-node ZooKeeper environment. Now that your multi-node ZooKeeper deployment is ready to use, you can review the official ZooKeeper documentation for additional information and projects.




      How to Back Up and Restore a Kubernetes Cluster on DigitalOcean Using Heptio Ark


      Introduction

      Heptio Ark is a convenient backup tool for Kubernetes clusters that compresses and backs up Kubernetes objects to object storage. It also takes snapshots of your cluster’s Persistent Volumes using your cloud provider’s block storage snapshot features, and can then restore your cluster’s objects and Persistent Volumes to a previous state.

      StackPointCloud’s DigitalOcean Ark Plugin allows you to use DigitalOcean block storage to snapshot your Persistent Volumes, and Spaces to back up your Kubernetes objects. When running a Kubernetes cluster on DigitalOcean, this allows you to quickly back up your cluster’s state and restore it should disaster strike.

      In this tutorial we’ll set up and configure the Ark client on a local machine, and deploy the Ark server into our Kubernetes cluster. We’ll then deploy a sample Nginx app that uses a Persistent Volume for logging, and simulate a disaster recovery scenario.

      Prerequisites

      Before you begin this tutorial, you should have the following available to you:

On your local computer:

• The kubectl command-line tool, configured to connect to your cluster
• The git command-line utility, which you will use to clone the Ark and plugin repositories

      In your DigitalOcean account:

      • A DigitalOcean Kubernetes cluster, or a Kubernetes cluster (version 1.7.5 or later) on DigitalOcean Droplets
• A DNS server running inside of your cluster. If you are using DigitalOcean Kubernetes, this is running by default. To learn more about configuring a Kubernetes DNS service, consult Customizing DNS Service from the official Kubernetes documentation.
      • A DigitalOcean Space that will store your backed-up Kubernetes objects. To learn how to create a Space, consult the Spaces product documentation.
      • An access key pair for your DigitalOcean Space. To learn how to create a set of access keys, consult How to Manage Administrative Access to Spaces.
      • A personal access token for use with the DigitalOcean API. To learn how to create a personal access token, consult How to Create a Personal Access Token.

      Once you have all of this set up, you’re ready to begin with this guide.

      Step 1 — Installing the Ark Client

      The Heptio Ark backup tool consists of a client installed on your local computer and a server that runs in your Kubernetes cluster. To begin, we’ll install the local Ark client.

      In your web browser, navigate to the Ark GitHub repo releases page, find the latest release corresponding to your OS and system architecture, and copy the link address. For the purposes of this guide, we’ll use an Ubuntu 18.04 server on an x86-64 (or AMD64) processor as our local machine.

Then, from the command line on your local computer, change into the temporary /tmp directory:

• cd /tmp

      Use wget and the link you copied earlier to download the release tarball:

      • wget https://link_copied_from_release_page

      Once the download completes, extract the tarball using tar (note the filename may differ depending on the current release version and your OS):

      • tar -xvzf ark-v0.9.6-linux-amd64.tar.gz

      The /tmp directory should now contain the extracted ark binary as well as the tarball you just downloaded.

Verify that you can run the ark client by executing the binary:

• ./ark --help

      You should see the following help output:

      Output

Heptio Ark is a tool for managing disaster recovery, specifically for Kubernetes cluster resources. It provides a simple, configurable, and operationally robust way to back up your application state and associated data.

If you're familiar with kubectl, Ark supports a similar model, allowing you to execute commands such as 'ark get backup' and 'ark create schedule'. The same operations can also be performed as 'ark backup get' and 'ark schedule create'.

Usage:
  ark [command]

Available Commands:
  backup      Work with backups
  client      Ark client related commands
  completion  Output shell completion code for the specified shell (bash or zsh)
  create      Create ark resources
  delete      Delete ark resources
  describe    Describe ark resources
  get         Get ark resources
  help        Help about any command
  plugin      Work with plugins
  restic      Work with restic
  restore     Work with restores
  schedule    Work with schedules
  server      Run the ark server
  version     Print the ark version and associated image
. . .

At this point you should move the ark executable out of the temporary /tmp directory and add it to your PATH. To add it to your PATH on an Ubuntu system, simply move it to /usr/local/bin:

      • sudo mv ark /usr/local/bin/ark

      You're now ready to configure the Ark server and deploy it to your Kubernetes cluster.

      Step 2 — Installing and Configuring the Ark Server

      Before we deploy Ark into our Kubernetes cluster, we'll first create Ark's prerequisite objects. Ark's prerequisites consist of:

      • A heptio-ark Namespace

      • The ark Service Account

      • Role-based access control (RBAC) rules to grant permissions to the ark Service Account

      • Custom Resources (CRDs) for the Ark-specific resources: Backup, Schedule, Restore, Config

      A YAML file containing the specs for the above Kubernetes objects can be found in the official Ark Git repository. While still in the /tmp directory, download the Ark repo using git:

      • git clone https://github.com/heptio/ark.git

Once downloaded, navigate into the ark directory:

• cd ark

      The prerequisite resources listed above can be found in the examples/common/00-prereqs.yaml YAML file. We'll create these resources in our Kubernetes cluster by using kubectl apply and passing in the file:

      • kubectl apply -f examples/common/00-prereqs.yaml

      You should see the following output:

      Output

customresourcedefinition.apiextensions.k8s.io/backups.ark.heptio.com created
customresourcedefinition.apiextensions.k8s.io/schedules.ark.heptio.com created
customresourcedefinition.apiextensions.k8s.io/restores.ark.heptio.com created
customresourcedefinition.apiextensions.k8s.io/configs.ark.heptio.com created
customresourcedefinition.apiextensions.k8s.io/downloadrequests.ark.heptio.com created
customresourcedefinition.apiextensions.k8s.io/deletebackuprequests.ark.heptio.com created
customresourcedefinition.apiextensions.k8s.io/podvolumebackups.ark.heptio.com created
customresourcedefinition.apiextensions.k8s.io/podvolumerestores.ark.heptio.com created
customresourcedefinition.apiextensions.k8s.io/resticrepositories.ark.heptio.com created
customresourcedefinition.apiextensions.k8s.io/backupstoragelocations.ark.heptio.com created
namespace/heptio-ark created
serviceaccount/ark created
clusterrolebinding.rbac.authorization.k8s.io/ark created

      Now that we've created the necessary Ark Kubernetes objects in our cluster, we can download and install the Ark DigitalOcean Plugin, which will allow us to use DigitalOcean Spaces as a backupStorageProvider (for Kubernetes objects), and DigitalOcean Block Storage as a persistentVolumeProvider (for Persistent Volume backups).

      Move back out of the ark directory and fetch the plugin from StackPointCloud's repo using git:

      • cd ..
      • git clone https://github.com/StackPointCloud/ark-plugin-digitalocean.git

      Move into the plugin directory:

      • cd ark-plugin-digitalocean

      We'll now save the access keys for our DigitalOcean Space as a Kubernetes Secret. First, open up the examples/credentials-ark file using your favorite editor:

      • nano examples/credentials-ark

      Replace <AWS_ACCESS_KEY_ID> and <AWS_SECRET_ACCESS_KEY> with your Spaces access key and secret key:

      examples/credentials-ark

      [default]
      aws_access_key_id=your_spaces_access_key_here
      aws_secret_access_key=your_spaces_secret_key_here
      

      Save and close the file.

      Now, create the cloud-credentials Secret using kubectl, inserting your Personal Access Token using the digitalocean_token data item:

• kubectl create secret generic cloud-credentials \
• --namespace heptio-ark \
• --from-file cloud=examples/credentials-ark \
• --from-literal digitalocean_token=your_personal_access_token

      You should see the following output:

      Output

      secret/cloud-credentials created

      To confirm that the cloud-credentials Secret was created successfully, you can describe it using kubectl:

      • kubectl describe secrets/cloud-credentials --namespace heptio-ark

      You should see the following output describing the cloud-credentials secret:

      Output

Name:         cloud-credentials
Namespace:    heptio-ark
Labels:       <none>
Annotations:  <none>

Type:  Opaque

Data
====
cloud:               115 bytes
digitalocean_token:  64 bytes

      We can now move on to creating an Ark Config object named default. To do this, we'll edit a YAML configuration file and then create the object in our Kubernetes cluster.

      Open examples/10-ark-config.yaml in your favorite editor:

      • nano examples/10-ark-config.yaml

      Insert your Space's name and region in the highlighted fields:

      examples/10-ark-config.yaml

      ---
      apiVersion: ark.heptio.com/v1
      kind: Config
      metadata:
        namespace: heptio-ark
        name: default
      persistentVolumeProvider:
        name: digitalocean
      backupStorageProvider:
        name: aws
        bucket: space_name_here
        config:
          region: space_region_here
          s3ForcePathStyle: "true"
          s3Url: https://space_region_here.digitaloceanspaces.com
      backupSyncPeriod: 30m
      gcSyncPeriod: 30m
      scheduleSyncPeriod: 1m
      restoreOnlyMode: false
      

persistentVolumeProvider sets DigitalOcean Block Storage as the provider for Persistent Volume backups. These will be Block Storage Volume Snapshots.

      backupStorageProvider sets DigitalOcean Spaces as the provider for Kubernetes object backups. Ark will create a tarball of all your Kubernetes objects (or some, depending on how you execute it), and upload this tarball to Spaces.

      When you're done, save and close the file.

      Create the object in your cluster using kubectl apply:

      • kubectl apply -f examples/10-ark-config.yaml

      You should see the following output:

      Output

      config.ark.heptio.com/default created

      At this point, we've finished configuring the Ark server and can create its Kubernetes deployment, found in the examples/20-deployment.yaml configuration file. Let's take a quick look at this file:

      • cat examples/20-deployment.yaml

      You should see the following text:

      examples/20-deployment.yaml

      ---
      apiVersion: apps/v1beta1
      kind: Deployment
      metadata:
        namespace: heptio-ark
        name: ark
      spec:
        replicas: 1
        template:
          metadata:
            labels:
              component: ark
            annotations:
              prometheus.io/scrape: "true"
              prometheus.io/port: "8085"
              prometheus.io/path: "/metrics"
          spec:
            restartPolicy: Always
            serviceAccountName: ark
            containers:
              - name: ark
                image: gcr.io/heptio-images/ark:latest
                command:
                  - /ark
                args:
                  - server
                volumeMounts:
                  - name: cloud-credentials
                    mountPath: /credentials
                  - name: plugins
                    mountPath: /plugins
                  - name: scratch
                    mountPath: /scratch
                env:
                  - name: AWS_SHARED_CREDENTIALS_FILE
                    value: /credentials/cloud
                  - name: ARK_SCRATCH_DIR
                    value: /scratch
                  - name: DIGITALOCEAN_TOKEN
                    valueFrom:
                      secretKeyRef:
                        key: digitalocean_token
                        name: cloud-credentials
            volumes:
              - name: cloud-credentials
                secret:
                  secretName: cloud-credentials
              - name: plugins
                emptyDir: {}
              - name: scratch
                emptyDir: {}
      

      We observe here that we're creating a Deployment called ark that consists of a single replica of the gcr.io/heptio-images/ark:latest container. The Pod is configured using the cloud-credentials secret we created earlier.

      Create the Deployment using kubectl apply:

      • kubectl apply -f examples/20-deployment.yaml

      You should see the following output:

      Output

      deployment.apps/ark created

We can double-check that the Deployment has been successfully created using kubectl get on the heptio-ark Namespace:

      • kubectl get deployments --namespace=heptio-ark

      You should see the following output:

      Output

NAME      DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
ark       1         1         1            0           8m

      The Ark server Pod may not start correctly until you install the Ark DigitalOcean plugin. To install the ark-blockstore-digitalocean plugin, use the ark client we installed earlier:

      • ark plugin add quay.io/stackpoint/ark-blockstore-digitalocean:latest

      You can specify the kubeconfig to use with the --kubeconfig flag. If you don't use this flag, ark will check the KUBECONFIG environment variable and then fall back to the kubectl default (~/.kube/config).
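
For example, if your kubeconfig lives at the default path, you could export the environment variable once and then run any ark command against that cluster; the path below is an assumption, so adjust it to your setup:

• export KUBECONFIG=$HOME/.kube/config
• ark version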

      At this point Ark is running and fully configured, and ready to back up and restore your Kubernetes cluster objects and Persistent Volumes to DigitalOcean Spaces and Block Storage.

      In the next section, we'll run a quick test to make sure that the backup and restore functionality works as expected.

      Step 3 — Testing Backup and Restore Procedure

      Now that we've successfully installed and configured Ark, we can create a test Nginx Deployment and Persistent Volume, and run through a backup and restore drill to ensure that everything is working properly.

      The ark-plugin-digitalocean repository contains a sample Nginx deployment called nginx-pv.yaml.

      Let's take a quick look:

      • cat examples/nginx-pv.yaml

      You should see the following text:

      Output

---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: nginx-logs
  namespace: nginx-example
  labels:
    app: nginx
spec:
  storageClassName: do-block-storage
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
---
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: nginx-deployment
  namespace: nginx-example
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: nginx
    spec:
      volumes:
        - name: nginx-logs
          persistentVolumeClaim:
            claimName: nginx-logs
      containers:
        - image: nginx:1.7.9
          name: nginx
          ports:
            - containerPort: 80
          volumeMounts:
            - mountPath: "/var/log/nginx"
              name: nginx-logs
              readOnly: false
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: nginx
  name: my-nginx
  namespace: nginx-example
spec:
  ports:
    - port: 80
      targetPort: 80
  selector:
    app: nginx
  type: LoadBalancer

      In this file, we observe specs for:

      • An Nginx Deployment consisting of a single replica of the nginx:1.7.9 container image
      • A 5Gi Persistent Volume Claim (called nginx-logs), using the do-block-storage StorageClass
      • A LoadBalancer Service that exposes port 80

      Create the deployment using kubectl apply:

• kubectl apply -f examples/nginx-pv.yaml

      You should see the following output:

      Output

namespace/nginx-example created
persistentvolumeclaim/nginx-logs created
deployment.apps/nginx-deployment created
service/my-nginx created

      Check that the Deployment succeeded:

      • kubectl get deployments --namespace=nginx-example

      You should see the following output:

      Output

NAME               DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
nginx-deployment   1         1         1            1           1h

      Once Available reaches 1, fetch the Nginx load balancer’s external IP using kubectl get:

      • kubectl get services --namespace=nginx-example

      You should see both the internal CLUSTER-IP and EXTERNAL-IP for the my-nginx Service:

      Output

NAME       TYPE           CLUSTER-IP   EXTERNAL-IP   PORT(S)        AGE
my-nginx   LoadBalancer   10.32.27.0   203.0.113.0   80:30754/TCP   3m

      Note the EXTERNAL-IP and navigate to it using your web browser.

      You should see the following NGINX welcome page:

      Nginx Welcome Page

      This indicates that your Nginx Deployment and Service are up and running.

      Before we simulate our disaster scenario, let’s first check the Nginx access logs (stored on a Persistent Volume attached to the Nginx Pod):

      Fetch the Pod’s name using kubectl get:

      • kubectl get pods --namespace nginx-example

      Output

NAME                                READY     STATUS    RESTARTS   AGE
nginx-deployment-77d8f78fcb-zt4wr   1/1       Running   0          29m

      Now, exec into the running Nginx container to get a shell inside of it:

      • kubectl exec -it nginx-deployment-77d8f78fcb-zt4wr --namespace nginx-example -- /bin/bash

      Once inside the Nginx container, cat the Nginx access logs:

      • cat /var/log/nginx/access.log

      You should see some Nginx access entries:

      Output

10.244.17.1 - - [01/Oct/2018:21:47:01 +0000] "GET / HTTP/1.1" 200 612 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/203.0.113.11 Safari/537.36" "-"
10.244.17.1 - - [01/Oct/2018:21:47:01 +0000] "GET /favicon.ico HTTP/1.1" 404 570 "http://203.0.113.0/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/203.0.113.11 Safari/537.36" "-"

      Note these down (especially the timestamps), as we will use them to confirm the success of the restore procedure.

      We can now perform the backup procedure to copy all nginx Kubernetes objects to Spaces and take a Snapshot of the Persistent Volume we created when deploying Nginx.

      We'll create a backup called nginx-backup using the ark client:

      • ark backup create nginx-backup --selector app=nginx

      The --selector app=nginx instructs the Ark server to only back up Kubernetes objects with the app=nginx Label Selector.

      You should see the following output:

      Output

      Backup request "nginx-backup" submitted successfully. Run `ark backup describe nginx-backup` for more details.

      Running ark backup describe nginx-backup should provide the following output after a short delay:

      Output

Name:         nginx-backup
Namespace:    heptio-ark
Labels:       <none>
Annotations:  <none>

Phase:  Completed

Namespaces:
  Included:  *
  Excluded:  <none>

Resources:
  Included:        *
  Excluded:        <none>
  Cluster-scoped:  auto

Label selector:  app=nginx

Snapshot PVs:  auto

TTL:  720h0m0s

Hooks:  <none>

Backup Format Version:  1

Started:    2018-09-26 00:14:30 -0400 EDT
Completed:  2018-09-26 00:14:34 -0400 EDT

Expiration:  2018-10-26 00:14:30 -0400 EDT

Validation errors:  <none>

Persistent Volumes:
  pvc-e4862eac-c2d2-11e8-920b-92c754237aeb:
    Snapshot ID:        2eb66366-c2d3-11e8-963b-0a58ac14428b
    Type:               ext4
    Availability Zone:
    IOPS:               <N/A>

      This output indicates that nginx-backup completed successfully.
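
You can also list your backups at any time from the client; as the help output in Step 1 showed, both ark backup get and ark get backup are accepted:

• ark backup get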

      From the DigitalOcean Cloud Control Panel, navigate to the Space containing your Kubernetes backup files.

      You should see a new directory called nginx-backup containing the Ark backup files.

      Using the left-hand navigation bar, go to Images and then Snapshots. Within Snapshots, navigate to Volumes. You should see a Snapshot corresponding to the PVC listed in the above output.

      We can now test the restore procedure.

      Let's first delete the nginx-example Namespace. This will delete everything in the Namespace, including the Load Balancer and Persistent Volume:

      • kubectl delete namespace nginx-example

      Verify that you can no longer access Nginx at the Load Balancer endpoint, and that the nginx-example Deployment is no longer running:

      • kubectl get deployments --namespace=nginx-example

      Output

      No resources found.

      We can now perform the restore procedure, once again using the ark client:

      • ark restore create --from-backup nginx-backup

      Here we use create to create an Ark Restore object from the nginx-backup object.

      You should see the following output:

      Output

Restore request "nginx-backup-20180926143828" submitted successfully.
Run `ark restore describe nginx-backup-20180926143828` for more details.

      Check the status of the restored Deployment:

      • kubectl get deployments --namespace=nginx-example

      Output

NAME               DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
nginx-deployment   1         1         1            1           1m

      Check for the creation of a Persistent Volume:

      • kubectl get pvc --namespace=nginx-example

      Output

NAME         STATUS    VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS       AGE
nginx-logs   Bound     pvc-e4862eac-c2d2-11e8-920b-92c754237aeb   5Gi        RWO            do-block-storage   3m

      Navigate to the Nginx Service’s external IP once again to confirm that Nginx is up and running.

      Finally, check the logs on the restored Persistent Volume to confirm that the log history has been preserved post-restore.

      To do this, once again fetch the Pod’s name using kubectl get:

      • kubectl get pods --namespace nginx-example

      Output

NAME                                READY     STATUS    RESTARTS   AGE
nginx-deployment-77d8f78fcb-zt4wr   1/1       Running   0          29m

      Then exec into it:

      • kubectl exec -it nginx-deployment-77d8f78fcb-zt4wr --namespace nginx-example -- /bin/bash

      Once inside the Nginx container, cat the Nginx access logs:

      • cat /var/log/nginx/access.log

      Output

10.244.17.1 - - [01/Oct/2018:21:47:01 +0000] "GET / HTTP/1.1" 200 612 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/203.0.113.11 Safari/537.36" "-"
10.244.17.1 - - [01/Oct/2018:21:47:01 +0000] "GET /favicon.ico HTTP/1.1" 404 570 "http://203.0.113.0/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/203.0.113.11 Safari/537.36" "-"

      You should see the same pre-backup access attempts (note the timestamps), confirming that the Persistent Volume restore was successful. Note that there may be additional attempts in the logs if you visited the Nginx landing page after you performed the restore.

      At this point, we've successfully backed up our Kubernetes objects to DigitalOcean Spaces, and our Persistent Volumes using Block Storage Volume Snapshots. We simulated a disaster scenario, and restored service to the test Nginx application.

      Conclusion

      In this guide we installed and configured the Ark Kubernetes backup tool on a DigitalOcean-based Kubernetes cluster. We configured the tool to back up Kubernetes objects to DigitalOcean Spaces, and back up Persistent Volumes using Block Storage Volume Snapshots.

      Ark can also be used to schedule regular backups of your Kubernetes cluster. To do this, you can use the ark schedule command. It can also be used to migrate resources from one cluster to another. To learn more about these two use cases, consult the official Ark documentation.
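
As a hedged illustration, a nightly schedule for the same Nginx objects might look like the following; the flags and cron expression here are assumptions based on the client's scheduling support, so check ark schedule create --help for the exact options in your release:

• ark schedule create nginx-daily --schedule="0 1 * * *" --selector app=nginx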

      To learn more about DigitalOcean Spaces, consult the official Spaces documentation. To learn more about Block Storage Volumes, consult the Block Storage Volume documentation.

      This tutorial builds on the README found in StackPointCloud's ark-plugin-digitalocean GitHub repo.




      How To Create a Multi-Node MySQL Cluster on Ubuntu 18.04


      Introduction

      The MySQL Cluster distributed database provides high availability and throughput for your MySQL database management system. A MySQL Cluster consists of one or more management nodes (ndb_mgmd) that store the cluster’s configuration and control the data nodes (ndbd), where cluster data is stored. After communicating with the management node, clients (MySQL clients, servers, or native APIs) connect directly to these data nodes.

      With MySQL Cluster there is typically no replication of data, but instead data node synchronization. For this purpose a special data engine must be used — NDBCluster (NDB). It’s helpful to think of the cluster as a single logical MySQL environment with redundant components. Thus, a MySQL Cluster can participate in replication with other MySQL Clusters.

      MySQL Cluster works best in a shared-nothing environment. Ideally, no two components should share the same hardware. For simplicity and demonstration purposes, we’ll limit ourselves to using only three servers. We will set up two servers as data nodes which sync data between themselves. The third server will be used for the Cluster Manager and also for the MySQL server/client. If you spin up additional servers, you can add more data nodes to the cluster, decouple the cluster manager from the MySQL server/client, and configure more servers as Cluster Managers and MySQL servers/clients.

      Prerequisites

      To complete this tutorial, you will need a total of three servers: two servers for the redundant MySQL data nodes (ndbd), and one server for the Cluster Manager (ndb_mgmd) and MySQL server/client (mysqld and mysql).

In the same DigitalOcean data center, create three Ubuntu 18.04 Droplets with private networking enabled, each configured with a non-root user with sudo privileges: two for the redundant data nodes and one for the Cluster Manager and MySQL server/client.

      Be sure to note down the private IP addresses of your three Droplets. In this tutorial our cluster nodes have the following private IP addresses:

      • 198.51.100.0 will be the first MySQL Cluster data node
      • 198.51.100.1 will be the second data node
      • 198.51.100.2 will be the Cluster Manager & MySQL server node

      Once you’ve spun up your Droplets, configured a non-root user, and noted down the IP addresses for the 3 nodes, you’re ready to begin with this tutorial.

      Step 1 — Installing and Configuring the Cluster Manager

      We’ll first begin by downloading and installing the MySQL Cluster Manager, ndb_mgmd.

To install the Cluster Manager, we first need to fetch the appropriate .deb installer file from the official MySQL Cluster download page.

      From this page, under Select Operating System, choose Ubuntu Linux. Then, under Select OS Version, choose Ubuntu Linux 18.04 (x86, 64-bit).

      Scroll down until you see DEB Package, NDB Management Server, and click on the Download link for the one that does not contain dbgsym (unless you require debug symbols). You will be brought to a Begin Your Download page. Here, right click on No thanks, just start my download. and copy the link to the .deb file.

      Now, log in to your Cluster Manager Droplet (in this tutorial, 198.51.100.2), and download this .deb file:

      • cd ~
      • wget https://dev.mysql.com/get/Downloads/MySQL-Cluster-7.6/mysql-cluster-community-management-server_7.6.6-1ubuntu18.04_amd64.deb

      Install ndb_mgmd using dpkg:

      • sudo dpkg -i mysql-cluster-community-management-server_7.6.6-1ubuntu18.04_amd64.deb

      We now need to configure ndb_mgmd before first running it; proper configuration will ensure correct synchronization and load distribution among the data nodes.

      The Cluster Manager should be the first component launched in any MySQL cluster. It requires a configuration file, passed in as an argument to its executable. We’ll create and use the following configuration file: /var/lib/mysql-cluster/config.ini.

      On the Cluster Manager Droplet, create the /var/lib/mysql-cluster directory where this file will reside:

      • sudo mkdir /var/lib/mysql-cluster

      Then create and edit the configuration file using your preferred text editor:

      • sudo nano /var/lib/mysql-cluster/config.ini

      Paste the following text into your editor:

      /var/lib/mysql-cluster/config.ini

      [ndbd default]
      # Options affecting ndbd processes on all data nodes:
      NoOfReplicas=2  # Number of replicas
      
      [ndb_mgmd]
      # Management process options:
      hostname=198.51.100.2 # Hostname of the manager
      datadir=/var/lib/mysql-cluster  # Directory for the log files
      
      [ndbd]
      hostname=198.51.100.0 # Hostname/IP of the first data node
      NodeId=2            # Node ID for this data node
      datadir=/usr/local/mysql/data   # Remote directory for the data files
      
      [ndbd]
      hostname=198.51.100.1 # Hostname/IP of the second data node
      NodeId=3            # Node ID for this data node
      datadir=/usr/local/mysql/data   # Remote directory for the data files
      
      [mysqld]
      # SQL node options:
      hostname=198.51.100.2 # In our case the MySQL server/client is on the same Droplet as the cluster manager
      

After pasting in this text, be sure to replace the hostname values above with the correct IP addresses of the Droplets you've configured. Setting this hostname parameter is an important security measure that prevents other servers from connecting to the Cluster Manager.

      Save the file and close your text editor.

      This is a pared-down, minimal configuration file for a MySQL Cluster. You should customize the parameters in this file depending on your production needs. For a sample, fully configured ndb_mgmd configuration file, consult the MySQL Cluster documentation.

      In the above file you can add additional components like data nodes (ndbd) or MySQL server nodes (mysqld) by appending instances to the appropriate section.
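
For example, because NoOfReplicas is set to 2, data nodes are added in pairs; registering a hypothetical second pair of data nodes on two more Droplets (the addresses below are placeholders) would look like this:

/var/lib/mysql-cluster/config.ini

[ndbd]
hostname=198.51.100.3 # Hostname/IP of the third data node (placeholder)
NodeId=4            # Node ID for this data node
datadir=/usr/local/mysql/data   # Remote directory for the data files

[ndbd]
hostname=198.51.100.4 # Hostname/IP of the fourth data node (placeholder)
NodeId=5            # Node ID for this data node
datadir=/usr/local/mysql/data   # Remote directory for the data files

Note that the management server caches its configuration, so consult the MySQL Cluster documentation for the recommended way to reload it after editing this file.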

      We can now start the manager by executing the ndb_mgmd binary and specifying its config file using the -f flag:

      • sudo ndb_mgmd -f /var/lib/mysql-cluster/config.ini

      You should see the following output:

      Output

MySQL Cluster Management Server mysql-5.7.22 ndb-7.6.6
2018-07-25 21:48:39 [MgmtSrvr] INFO -- The default config directory '/usr/mysql-cluster' does not exist. Trying to create it...
2018-07-25 21:48:39 [MgmtSrvr] INFO -- Successfully created config directory

      This indicates that the MySQL Cluster Management server has successfully been installed and is now running on your Droplet.

      Ideally, we’d like to start the Cluster Management server automatically on boot. To do this, we’re going to create and enable a systemd service.

Before we create the service, we need to kill the running server:

• sudo pkill -f ndb_mgmd

      Now, open and edit the following systemd Unit file using your favorite editor:

      • sudo nano /etc/systemd/system/ndb_mgmd.service

      Paste in the following code:

      /etc/systemd/system/ndb_mgmd.service

      [Unit]
      Description=MySQL NDB Cluster Management Server
      After=network.target auditd.service
      
      [Service]
      Type=forking
      ExecStart=/usr/sbin/ndb_mgmd -f /var/lib/mysql-cluster/config.ini
      ExecReload=/bin/kill -HUP $MAINPID
      KillMode=process
      Restart=on-failure
      
      [Install]
      WantedBy=multi-user.target
      

      Here, we’ve added a minimal set of options instructing systemd on how to start, stop and restart the ndb_mgmd process. To learn more about the options used in this unit configuration, consult the systemd manual.

      Save and close the file.

      Now, reload systemd’s manager configuration using daemon-reload:

      • sudo systemctl daemon-reload

      We’ll enable the service we just created so that the MySQL Cluster Manager starts on reboot:

      • sudo systemctl enable ndb_mgmd

      Finally, we’ll start the service:

      • sudo systemctl start ndb_mgmd

      You can verify that the NDB Cluster Management service is running:

      • sudo systemctl status ndb_mgmd

      You should see the following output:

      ● ndb_mgmd.service - MySQL NDB Cluster Management Server
         Loaded: loaded (/etc/systemd/system/ndb_mgmd.service; enabled; vendor preset: enabled)
         Active: active (running) since Thu 2018-07-26 21:23:37 UTC; 3s ago
        Process: 11184 ExecStart=/usr/sbin/ndb_mgmd -f /var/lib/mysql-cluster/config.ini (code=exited, status=0/SUCCESS)
       Main PID: 11193 (ndb_mgmd)
          Tasks: 11 (limit: 4915)
         CGroup: /system.slice/ndb_mgmd.service
                 └─11193 /usr/sbin/ndb_mgmd -f /var/lib/mysql-cluster/config.ini
      

This indicates that the ndb_mgmd MySQL Cluster Management server is now running as a systemd service.

      The final step for setting up the Cluster Manager is to allow incoming connections from other MySQL Cluster nodes on our private network.

      If you did not configure the ufw firewall when setting up this Droplet, you can skip ahead to the next section.

      We’ll add rules to allow local incoming connections from both data nodes:

      • sudo ufw allow from 198.51.100.0
      • sudo ufw allow from 198.51.100.1

      After entering these commands, you should see the following output:

      Output

      Rule added

      The Cluster Manager should now be up and running, and able to communicate with other Cluster nodes over the private network.
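
If you want to double-check the rules you just added, you can list the active firewall configuration:

• sudo ufw status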

      Step 2 — Installing and Configuring the Data Nodes

      Note: All the commands in this section should be executed on both data nodes.

      In this step, we'll install the ndbd MySQL Cluster data node daemon, and configure the nodes so they can communicate with the Cluster Manager.

      To install the data node binaries we first need to fetch the appropriate .deb installer file from the official MySQL download page.

      From this page, under Select Operating System, choose Ubuntu Linux. Then, under Select OS Version, choose Ubuntu Linux 18.04 (x86, 64-bit).

      Scroll down until you see DEB Package, NDB Data Node Binaries, and click on the Download link for the one that does not contain dbgsym (unless you require debug symbols). You will be brought to a Begin Your Download page. Here, right click on No thanks, just start my download. and copy the link to the .deb file.

      Now, log in to your first data node Droplet (in this tutorial, 198.51.100.0), and download this .deb file:

      • cd ~
      • wget https://dev.mysql.com/get/Downloads/MySQL-Cluster-7.6/mysql-cluster-community-data-node_7.6.6-1ubuntu18.04_amd64.deb

      Before we install the data node binary, we need to install a dependency, libclass-methodmaker-perl:

      • sudo apt update
      • sudo apt install libclass-methodmaker-perl

We can now install the data node binary using dpkg:

      • sudo dpkg -i mysql-cluster-community-data-node_7.6.6-1ubuntu18.04_amd64.deb

The data nodes pull their configuration from MySQL’s standard location, /etc/my.cnf. Create this file using your favorite text editor and begin editing it:

• sudo nano /etc/my.cnf

      Add the following configuration parameter to the file:

      /etc/my.cnf

      [mysql_cluster]
      # Options for NDB Cluster processes:
      ndb-connectstring=198.51.100.2  # location of cluster manager
      

      Specifying the location of the Cluster Manager node is the only configuration needed for ndbd to start. The rest of the configuration will be pulled from the manager directly.

      Save and exit the file.

      In our example, the data node will find out that its data directory is /usr/local/mysql/data, per the manager's configuration. Before starting the daemon, we’ll create this directory on the node:

      • sudo mkdir -p /usr/local/mysql/data

      Now we can start the data node using the following command:

      • sudo ndbd

      You should see the following output:

      Output

      2018-07-18 19:48:21 [ndbd] INFO     -- Angel connected to '198.51.100.2:1186'
      2018-07-18 19:48:21 [ndbd] INFO     -- Angel allocated nodeid: 2

      The NDB data node daemon has been successfully installed and is now running on your server.
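      If you'd like to double-check that the daemon is up (ndbd runs an angel process alongside the main worker process), you can list the running ndbd processes:

      • pgrep -a ndbd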

      We also need to allow incoming connections from other MySQL Cluster nodes over the private network.

      If you did not configure the ufw firewall when setting up this Droplet, you can skip ahead to setting up the systemd service for ndbd.

      We’ll add rules to allow incoming connections from the Cluster Manager and other data nodes:

      • sudo ufw allow from 198.51.100.0
      • sudo ufw allow from 198.51.100.2

      After entering these commands, you should see the following output:

      Output

      Rule added

      Your MySQL data node Droplet can now communicate with both the Cluster Manager and the other data node over the private network.

      Finally, we’d also like the data node daemon to start up automatically when the server boots. We’ll follow the same procedure used for the Cluster Manager, and create a systemd service.

      Before we create the service, we’ll kill the running ndbd process:

      • sudo pkill -f ndbd

      Now, open and edit the following systemd Unit file using your favorite editor:

      • sudo nano /etc/systemd/system/ndbd.service

      Paste in the following code:

      /etc/systemd/system/ndbd.service

      [Unit]
      Description=MySQL NDB Data Node Daemon
      After=network.target auditd.service
      
      [Service]
      Type=forking
      ExecStart=/usr/sbin/ndbd
      ExecReload=/bin/kill -HUP $MAINPID
      KillMode=process
      Restart=on-failure
      
      [Install]
      WantedBy=multi-user.target
      

      Here, we’ve added a minimal set of options instructing systemd on how to start, stop and restart the ndbd process. To learn more about the options used in this unit configuration, consult the systemd manual.

      Save and close the file.

      Now, reload systemd’s manager configuration using daemon-reload:

      • sudo systemctl daemon-reload

      We’ll now enable the service we just created so that the data node daemon starts on reboot:

      • sudo systemctl enable ndbd

      Finally, we’ll start the service:

      • sudo systemctl start ndbd

      You can verify that the NDB data node service is running:

      • sudo systemctl status ndbd

      You should see the following output:

      Output

      ● ndbd.service - MySQL NDB Data Node Daemon
         Loaded: loaded (/etc/systemd/system/ndbd.service; enabled; vendor preset: enabled)
         Active: active (running) since Thu 2018-07-26 20:56:29 UTC; 8s ago
        Process: 11972 ExecStart=/usr/sbin/ndbd (code=exited, status=0/SUCCESS)
       Main PID: 11984 (ndbd)
          Tasks: 46 (limit: 4915)
         CGroup: /system.slice/ndbd.service
                 ├─11984 /usr/sbin/ndbd
                 └─11987 /usr/sbin/ndbd

      This output indicates that the ndbd MySQL Cluster data node daemon is now running as a systemd service. Your data node should now be fully functional and able to connect to the MySQL Cluster Manager.

      Once you’ve finished setting up the first data node, repeat the steps in this section on the other data node (198.51.100.1 in this tutorial).

      Step 3 — Configuring and Starting the MySQL Server and Client

      A standard MySQL server, such as the one available in Ubuntu's APT repository, does not support the MySQL Cluster engine NDB. This means we need to install the custom SQL server packaged with the other MySQL Cluster software we’ve installed in this tutorial.

      We’ll once again grab the MySQL Cluster Server binary from the official MySQL Cluster download page.

      From this page, under Select Operating System, choose Ubuntu Linux. Then, under Select OS Version, choose Ubuntu Linux 18.04 (x86, 64-bit).

      Scroll down until you see DEB Bundle, and click on the Download link (it should be the first one in the list). You will be brought to a Begin Your Download page. Here, right click on No thanks, just start my download. and copy the link to the .tar archive.

      Now, log in to the Cluster Manager Droplet (in this tutorial, 198.51.100.2), and download this .tar archive (recall that we are installing MySQL Server on the same node as our Cluster Manager – in a production setting you should run these daemons on different nodes):

      • cd ~
      • wget https://dev.mysql.com/get/Downloads/MySQL-Cluster-7.6/mysql-cluster_7.6.6-1ubuntu18.04_amd64.deb-bundle.tar

      We’ll now extract this archive into a directory called install. First, create the directory:

      • mkdir install

      Now extract the archive into this directory:

      • tar -xvf mysql-cluster_7.6.6-1ubuntu18.04_amd64.deb-bundle.tar -C install/

      Move into this directory, which contains the extracted MySQL Cluster component binaries:

      • cd install

      Before we install the MySQL server binary, we need to install a couple of dependencies:

      • sudo apt update
      • sudo apt install libaio1 libmecab2

      Now, we need to install the MySQL Cluster dependencies bundled in the tar archive we just extracted:

      • sudo dpkg -i mysql-common_7.6.6-1ubuntu18.04_amd64.deb
      • sudo dpkg -i mysql-cluster-community-client_7.6.6-1ubuntu18.04_amd64.deb
      • sudo dpkg -i mysql-client_7.6.6-1ubuntu18.04_amd64.deb
      • sudo dpkg -i mysql-cluster-community-server_7.6.6-1ubuntu18.04_amd64.deb

      When installing mysql-cluster-community-server, a configuration prompt should appear, asking you to set a password for the root account of your MySQL database. Choose a strong, secure password, and hit <Ok>. Re-enter this root password when prompted, and hit <Ok> once again to complete installation.

      We can now install the MySQL server binary using dpkg:

      • sudo dpkg -i mysql-server_7.6.6-1ubuntu18.04_amd64.deb

      We now need to configure this MySQL server installation.

      The configuration for MySQL Server is stored in the default /etc/mysql/my.cnf file.

      Open this configuration file using your favorite editor:

      • sudo nano /etc/mysql/my.cnf

      You should see the following text:

      /etc/mysql/my.cnf

      # Copyright (c) 2015, 2016, Oracle and/or its affiliates. All rights reserved.
      #
      # This program is free software; you can redistribute it and/or modify
      # it under the terms of the GNU General Public License as published by
      # the Free Software Foundation; version 2 of the License.
      #
      # This program is distributed in the hope that it will be useful,
      # but WITHOUT ANY WARRANTY; without even the implied warranty of
      # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
      # GNU General Public License for more details.
      #
      # You should have received a copy of the GNU General Public License
      # along with this program; if not, write to the Free Software
      # Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA  02110-1301 USA
      
      #
      # The MySQL Cluster Community Server configuration file.
      #
      # For explanations see
      # http://dev.mysql.com/doc/mysql/en/server-system-variables.html
      
      # * IMPORTANT: Additional settings that can override those from this file!
      #   The files must end with '.cnf', otherwise they'll be ignored.
      #
      !includedir /etc/mysql/conf.d/
      !includedir /etc/mysql/mysql.conf.d/
      

      Append the following configuration to it:

      /etc/mysql/my.cnf

      . . .
      [mysqld]
      # Options for mysqld process:
      ndbcluster                      # run NDB storage engine
      
      [mysql_cluster]
      # Options for NDB Cluster processes:
      ndb-connectstring=198.51.100.2  # location of management server
      

      Save and exit the file.

      Restart the MySQL server for these changes to take effect:

      • sudo systemctl restart mysql

      By default, MySQL should start automatically when your server reboots. If it doesn’t, the following command will enable this behavior:

      • sudo systemctl enable mysql
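      You can optionally confirm that the MySQL server is active:

      • sudo systemctl status mysql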

      A SQL server should now be running on your Cluster Manager / MySQL Server Droplet.

      In the next step, we’ll run a few commands to verify that our MySQL Cluster installation is functioning as expected.

      Step 4 — Verifying MySQL Cluster Installation

      To verify your MySQL Cluster installation, log in to your Cluster Manager / SQL Server node.

      We’ll open the MySQL client from the command line and connect to the root account we just configured by entering the following command:

      • mysql -u root -p

      Enter your password when prompted, and hit ENTER.

      You should see an output similar to:

      Output

      Welcome to the MySQL monitor.  Commands end with ; or \g.
      Your MySQL connection id is 3
      Server version: 5.7.22-ndb-7.6.6 MySQL Cluster Community Server (GPL)
      
      Copyright (c) 2000, 2018, Oracle and/or its affiliates. All rights reserved.
      
      Oracle is a registered trademark of Oracle Corporation and/or its
      affiliates. Other names may be trademarks of their respective
      owners.
      
      Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
      
      mysql>

      Once inside the MySQL client, run the following command:

      • SHOW ENGINE NDB STATUS \G

      You should now see information about the NDB cluster engine, beginning with connection parameters:

      Output

      *************************** 1. row ***************************
        Type: ndbcluster
        Name: connection
      Status: cluster_node_id=4, connected_host=198.51.100.2, connected_port=1186, number_of_data_nodes=2, number_of_ready_data_nodes=2, connect_count=0
      . . .

      This indicates that you’ve successfully connected to your MySQL Cluster.

      Notice the value number_of_ready_data_nodes=2 here. This redundancy allows your MySQL cluster to continue operating even if one of the data nodes fails. It also means that your SQL queries will be load balanced across the two data nodes.

      You can try shutting down one of the data nodes to test cluster stability. The simplest test would be to restart the data node Droplet in order to fully test the recovery process. You should see the value of number_of_ready_data_nodes change to 1 and back up to 2 again as the node reboots and reconnects to the Cluster Manager.
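      For a more controlled test, you could instead stop a single data node from a separate terminal on the Cluster Manager Droplet and then start it again on the data node itself. This is just a sketch, and it assumes the node you stop has node ID 2, as shown by the SHOW command in the management console later in this step:

      • ndb_mgm -e "2 STOP"

      Then, on that data node, bring the daemon back up using the systemd service created earlier:

      • sudo systemctl start ndbd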

      To exit the MySQL prompt, simply type quit or press CTRL-D.

      This is the first test that indicates that the MySQL cluster, server, and client are working. We'll now go through an additional test to confirm that the cluster is functioning properly.

      Open the Cluster management console, ndb_mgm, using the command:

      • ndb_mgm

      You should see the following output:

      Output

      -- NDB Cluster -- Management Client --
      ndb_mgm>

      Once inside the console, enter the command SHOW and hit ENTER:

      • SHOW

      You should see the following output:

      Output

      Connected to Management Server at: 198.51.100.2:1186
      Cluster Configuration
      ---------------------
      [ndbd(NDB)]     2 node(s)
      id=2    @198.51.100.0  (mysql-5.7.22 ndb-7.6.6, Nodegroup: 0, *)
      id=3    @198.51.100.1  (mysql-5.7.22 ndb-7.6.6, Nodegroup: 0)
      
      [ndb_mgmd(MGM)] 1 node(s)
      id=1    @198.51.100.2  (mysql-5.7.22 ndb-7.6.6)
      
      [mysqld(API)]   1 node(s)
      id=4    @198.51.100.2  (mysql-5.7.22 ndb-7.6.6)

      The above shows that there are two data nodes connected with node-ids 2 and 3. There is also one management node with node-id 1 and one MySQL server with node-id 4. You can display more information about each id by typing its number with the command STATUS, as follows:

      • 2 STATUS

      The above command shows you the status, MySQL version, and NDB version of node 2:

      Output

      Node 2: started (mysql-5.7.22 ndb-7.6.6)

      To exit the management console type quit, and then hit ENTER.

      The management console is very powerful and gives you many other options for administering the cluster and its data, including creating an online backup. For more information consult the official MySQL documentation.
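      For example, an online backup of all data nodes can be started with the management client's START BACKUP command; each data node writes its backup files locally. Treat the following as a starting point and review the backup documentation before relying on it:

      • ndb_mgm -e "START BACKUP"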

      At this point, you’ve fully tested your MySQL Cluster installation. The concluding step of this guide shows you how to create and insert test data into this MySQL Cluster.

      Step 5 — Inserting Data into MySQL Cluster

      To demonstrate the cluster’s functionality, let's create a new table using the NDB engine and insert some sample data into it. Note that in order to use cluster functionality, the engine must be specified explicitly as NDB. If you use InnoDB (default) or any other engine, you will not make use of the cluster.

      First, let's create a database called clustertest with the command:

      • CREATE DATABASE clustertest;

      Next, switch to the new database:

      • USE clustertest;

      Now, create a simple table called test_table like this:

      • CREATE TABLE test_table (name VARCHAR(20), value VARCHAR(20)) ENGINE=ndbcluster;

      We have explicitly specified the engine ndbcluster in order to make use of the cluster.
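      If you'd like to confirm that the table really was created with the NDB engine, you can inspect its definition; the output should include ENGINE=ndbcluster:

      • SHOW CREATE TABLE test_table\G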

      Now, we can start inserting data using this SQL query:

      • INSERT INTO test_table (name,value) VALUES('some_name','some_value');

      To verify that the data has been inserted, run the following select query:

      • SELECT * FROM test_table;
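      You should see the row you just inserted, with output similar to the following:

      Output

      +-----------+------------+
      | name      | value      |
      +-----------+------------+
      | some_name | some_value |
      +-----------+------------+
      1 row in set (0.00 sec)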

      When you insert data into and select data from an ndbcluster table, the cluster load balances queries between all the available data nodes. This improves the stability and performance of your MySQL database installation.

      You can also set the default storage engine to ndbcluster in the my.cnf file that we edited previously. If you do this, you won’t need to specify the ENGINE option when creating tables. To learn more, consult the MySQL Reference Manual.
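      As a sketch, this would mean adding the default_storage_engine server variable to the [mysqld] section of /etc/mysql/my.cnf (and restarting MySQL afterwards for it to take effect):

      /etc/mysql/my.cnf

      . . .
      [mysqld]
      # Options for mysqld process:
      ndbcluster                          # run NDB storage engine
      default_storage_engine=ndbcluster   # create new tables with NDB by default
      . . .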

      Conclusion

      In this tutorial, we’ve demonstrated how to set up and configure a MySQL Cluster on Ubuntu 18.04 servers. It’s important to note that this is a minimal, pared-down architecture used to demonstrate the installation procedure, and there are many advanced options and features worth learning about before deploying MySQL Cluster in production (for example, performing backups). To learn more, consult the official MySQL Cluster documentation.


