
      December 2019

      How to Use Ansible to Install and Set Up WordPress with LAMP on Ubuntu 18.04


      Introduction

      Server automation now plays an essential role in systems administration, due to the disposable nature of modern application environments. Configuration management tools such as Ansible are typically used to streamline the process of automating server setup by establishing standard procedures for new servers while also reducing human error associated with manual setups.

      Ansible offers a simple architecture that doesn’t require special software to be installed on nodes. It also provides a robust set of features and built-in modules which facilitate writing automation scripts.

      This guide explains how to use Ansible to automate the steps contained in our guide on How To Install WordPress with LAMP on Ubuntu 18.04. WordPress is the most popular CMS (content management system) on the internet, allowing users to set up flexible blogs and websites on top of a MySQL backend with PHP processing. After setup, almost all administration can be done through the web frontend.

      Prerequisites

      In order to execute the automated setup provided by the playbook we’re discussing in this guide, you’ll need:

      • One Ansible control node: an Ubuntu 18.04 machine with Ansible installed and configured to connect to your Ansible hosts using SSH keys.
      • One or more Ansible hosts: one or more remote Ubuntu 18.04 servers.

      Before proceeding, you first need to make sure your Ansible control node is able to connect and execute commands on your Ansible host(s). For a connection test, please check step 3 of How to Install and Configure Ansible on Ubuntu 18.04.
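
      A quick way to perform that connection test from the control node is Ansible’s built-in ping module. This is a minimal sketch, assuming your inventory is already configured and that sammy is your remote user:

      • ansible all -m ping -u sammy

      If the connection is working, each host will reply with a pong message.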

      What Does this Playbook Do?

      This Ansible playbook provides an alternative to manually running through the procedure outlined in our guide on How To Install WordPress with LAMP on Ubuntu 18.04.

      Running this playbook will perform the following actions on your Ansible hosts:

      1. Install aptitude, which is preferred by Ansible as an alternative to the apt package manager.
      2. Install the required LAMP packages and PHP extensions.
      3. Create and enable a new Apache VirtualHost for the WordPress website.
      4. Enable the Apache rewrite (mod_rewrite) module.
      5. Disable the default Apache website.
      6. Set the password for the MySQL root user.
      7. Remove anonymous MySQL accounts and the test database.
      8. Create a new MySQL database and user for the WordPress website.
      9. Set up UFW to allow HTTP traffic on the configured port (80 by default).
      10. Download and unpack WordPress.
      11. Set up correct directory ownership and permissions.
      12. Set up the wp-config.php file using the provided template.

      Once the playbook has finished running, you will have a WordPress installation running on top of a LAMP environment, based on the options you defined within your configuration variables.

      How to Use this Playbook

      The first thing we need to do is obtain the WordPress on LAMP playbook and its dependencies from the do-community/ansible-playbooks repository. We need to clone this repository to a local folder on the Ansible control node.

      In case you have cloned this repository before while following a different guide, access your existing ansible-playbooks copy and run a git pull command to make sure you have updated contents:

      • cd ~/ansible-playbooks
      • git pull

      If this is your first time using the do-community/ansible-playbooks repository, you should start by cloning the repository to your home folder with:

      • cd ~
      • git clone https://github.com/do-community/ansible-playbooks.git
      • cd ansible-playbooks

      The files we’re interested in are located inside the wordpress-lamp_ubuntu1804 folder, which has the following structure:

      wordpress-lamp_ubuntu1804
      ├── files
      │   ├── apache.conf.j2
      │   └── wp-config.php.j2
      ├── vars
      │   └── default.yml
      ├── playbook.yml
      └── readme.md
      

      Here’s what each of these files does:

      • files/apache.conf.j2: Template file for setting up the Apache VirtualHost.
      • files/wp-config.php.j2: Template file for setting up WordPress’s configuration file.
      • vars/default.yml: Variable file for customizing playbook settings.
      • playbook.yml: The playbook file, containing the tasks to be executed on the remote server(s).
      • readme.md: A text file containing information about this playbook.

      We’ll edit the playbook’s variable file to customize its options. Access the wordpress-lamp_ubuntu1804 directory and open the vars/default.yml file using your command line editor of choice:

      • cd wordpress-lamp_ubuntu1804
      • nano vars/default.yml

      This file contains a few variables that require your attention:

      vars/default.yml

      ---
      #System Settings
      php_modules: [ 'php-curl', 'php-gd', 'php-mbstring', 'php-xml', 'php-xmlrpc', 'php-soap', 'php-intl', 'php-zip' ]
      
      #MySQL Settings
      mysql_root_password: "mysql_root_password"
      mysql_db: "wordpress"
      mysql_user: "sammy"
      mysql_password: "password"
      
      #HTTP Settings
      http_host: "your_domain"
      http_conf: "your_domain.conf"
      http_port: "80"
      

      The following list contains a brief explanation of each of these variables and how you might want to change them:

      • php_modules: An array containing PHP extensions that should be installed to support your WordPress setup. You don’t need to change this variable, but you might want to include new extensions to the list if your specific setup requires it.
      • mysql_root_password: The desired password for the root MySQL account.
      • mysql_db: The name of the MySQL database that should be created for WordPress.
      • mysql_user: The name of the MySQL user that should be created for WordPress.
      • mysql_password: The password for the new MySQL user.
      • http_host: Your domain name.
      • http_conf: The name of the configuration file that will be created within Apache.
      • http_port: HTTP port for this virtual host, where 80 is the default.

      Once you’re done updating the variables inside vars/default.yml, save and close this file. If you used nano, do so by pressing CTRL + X, Y, then ENTER.
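
      Before running the playbook against your servers, you can optionally ask Ansible to validate its syntax and list the tasks it would execute. Both of the flags used below are standard ansible-playbook options:

      • ansible-playbook playbook.yml --syntax-check
      • ansible-playbook playbook.yml --list-tasks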

      You’re now ready to run this playbook on one or more servers. By default, most playbooks are configured to execute on every server in your inventory. We can use the -l flag to make sure that only a subset of servers, or a single server, is affected by the playbook. We can also use the -u flag to specify which user on the remote server we’re using to connect and execute the playbook commands on the remote hosts.

      To execute the playbook only on server1, connecting as sammy, you can use the following command:

      • ansible-playbook playbook.yml -l server1 -u sammy

      You will get output similar to this:

      Output

      PLAY [all] **************************************************************

      TASK [Gathering Facts] **************************************************
      ok: [server1]

      TASK [Install prerequisites] ********************************************
      ok: [server1]

      …

      TASK [Download and unpack latest WordPress] *****************************
      changed: [server1]

      TASK [Set ownership] ****************************************************
      changed: [server1]

      TASK [Set permissions for directories] **********************************
      changed: [server1]

      TASK [Set permissions for files] ****************************************
      changed: [server1]

      TASK [Set up wp-config] *************************************************
      changed: [server1]

      RUNNING HANDLER [Reload Apache] *****************************************
      changed: [server1]

      RUNNING HANDLER [Restart Apache] ****************************************
      changed: [server1]

      PLAY RECAP **************************************************************
      server1 : ok=22 changed=18 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0

      Note: For more information on how to run Ansible playbooks, check our Ansible Cheat Sheet Guide.

      When the playbook is finished running, you can go to your web browser to finish WordPress’s installation from there.

      Navigate to your server’s domain name or public IP address:

      http://server_host_or_IP
      

      You will see a page like this:

      WordPress language selection page

      After selecting the language you’d like to use for your WordPress installation, you’ll be presented with a final step to set up your WordPress user and password so you can log into your control panel:

      WordPress Setup

      When you click ahead, you will be taken to a page that prompts you to log in:

      WP login prompt

      Once you log in, you will be taken to the WordPress administration dashboard:

      WP Admin Panel

      Some common next steps for customizing your WordPress installation include choosing the permalinks setting for your posts (can be found in Settings > Permalinks) and selecting a new theme (in Appearance > Themes).

      The Playbook Contents

      You can find the WordPress on LAMP server setup featured in this tutorial in the wordpress-lamp_ubuntu1804 folder inside the DigitalOcean Community Playbooks repository. To copy or download the script contents directly, click the Raw button towards the top of each script.

      The full contents of the playbook as well as its associated files are also included here for your convenience.

      vars/default.yml

      The default.yml variable file contains values that will be used within the playbook tasks, such as the database settings and the domain name to configure within Apache.

      vars/default.yml

      #System Settings
      php_modules: [ 'php-curl', 'php-gd', 'php-mbstring', 'php-xml', 'php-xmlrpc', 'php-soap', 'php-intl', 'php-zip' ]
      
      #MySQL Settings
      mysql_root_password: "mysql_root_password"
      mysql_db: "wordpress"
      mysql_user: "sammy"
      mysql_password: "password"
      
      #HTTP Settings
      http_host: "your_domain"
      http_conf: "your_domain.conf"
      http_port: "80"
      

      files/apache.conf.j2

      The apache.conf.j2 file is a Jinja 2 template file that configures a new Apache VirtualHost. The variables used within this template are defined in the vars/default.yml variable file.

      files/apache.conf.j2

      <VirtualHost *:{{ http_port }}>
         ServerAdmin webmaster@localhost
         ServerName {{ http_host }}
         ServerAlias www.{{ http_host }}
         DocumentRoot /var/www/{{ http_host }}
         ErrorLog ${APACHE_LOG_DIR}/error.log
         CustomLog ${APACHE_LOG_DIR}/access.log combined
      
         <Directory /var/www/{{ http_host }}>
               Options -Indexes
         </Directory>
      
         <IfModule mod_dir.c>
             DirectoryIndex index.php index.html index.cgi index.pl  index.xhtml index.htm
         </IfModule>
      
      </VirtualHost>
      

      files/wp-config.php.j2

      The wp-config.php.j2 file is another Jinja template, used to set up the main configuration file used by WordPress. The variables used within this template are defined in the vars/default.yml variable file. Unique authentication keys and salts are generated using a hash function.

      files/wp-config.php.j2

      <?php
      /**
       * The base configuration for WordPress
       *
       * The wp-config.php creation script uses this file during the
       * installation. You don't have to use the web site, you can
       * copy this file to "wp-config.php" and fill in the values.
       *
       * This file contains the following configurations:
       *
       * * MySQL settings
       * * Secret keys
       * * Database table prefix
       * * ABSPATH
       *
       * @link https://codex.wordpress.org/Editing_wp-config.php
       *
       * @package WordPress
       */
      
      // ** MySQL settings - You can get this info from your web host ** //
      /** The name of the database for WordPress */
      define( 'DB_NAME', '{{ mysql_db }}' );
      
      /** MySQL database username */
      define( 'DB_USER', '{{ mysql_user }}' );
      
      /** MySQL database password */
      define( 'DB_PASSWORD', '{{ mysql_password }}' );
      
      /** MySQL hostname */
      define( 'DB_HOST', 'localhost' );
      
      /** Database Charset to use in creating database tables. */
      define( 'DB_CHARSET', 'utf8' );
      
      /** The Database Collate type. Don't change this if in doubt. */
      define( 'DB_COLLATE', '' );
      
      /** Filesystem access **/
      define('FS_METHOD', 'direct');
      
      /**#@+
       * Authentication Unique Keys and Salts.
       *
       * Change these to different unique phrases!
       * You can generate these using the {@link https://api.wordpress.org/secret-key/1.1/salt/ WordPress.org secret-key service}
       * You can change these at any point in time to invalidate all existing cookies. This will force all users to have to log in again.
       *
       * @since 2.6.0
       */
      define( 'AUTH_KEY',         '{{ ansible_date_time.iso8601_micro | hash('sha256') }}' );
      define( 'SECURE_AUTH_KEY',  '{{ ansible_date_time.iso8601_micro | hash('sha256') }}' );
      define( 'LOGGED_IN_KEY',    '{{ ansible_date_time.iso8601_micro | hash('sha256') }}' );
      define( 'NONCE_KEY',        '{{ ansible_date_time.iso8601_micro | hash('sha256') }}' );
      define( 'AUTH_SALT',        '{{ ansible_date_time.iso8601_micro | hash('sha256') }}' );
      define( 'SECURE_AUTH_SALT', '{{ ansible_date_time.iso8601_micro | hash('sha256') }}' );
      define( 'LOGGED_IN_SALT',   '{{ ansible_date_time.iso8601_micro | hash('sha256') }}' );
      define( 'NONCE_SALT',       '{{ ansible_date_time.iso8601_micro | hash('sha256') }}' );
      
      /**#@-*/
      
      /**
       * WordPress Database Table prefix.
       *
       * You can have multiple installations in one database if you give each
       * a unique prefix. Only numbers, letters, and underscores please!
       */
      $table_prefix = 'wp_';
      
      /**
       * For developers: WordPress debugging mode.
       *
       * Change this to true to enable the display of notices during development.
       * It is strongly recommended that plugin and theme developers use WP_DEBUG
       * in their development environments.
       *
       * For information on other constants that can be used for debugging,
       * visit the Codex.
       *
       * @link https://codex.wordpress.org/Debugging_in_WordPress
       */
      define( 'WP_DEBUG', false );
      
      /* That's all, stop editing! Happy publishing. */
      
      /** Absolute path to the WordPress directory. */
      if ( ! defined( 'ABSPATH' ) ) {
          define( 'ABSPATH', dirname( __FILE__ ) . '/' );
      }
      
      /** Sets up WordPress vars and included files. */
      require_once( ABSPATH . 'wp-settings.php' );
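
      One side effect of this template is worth noting: because ansible_date_time.iso8601_micro is evaluated once per host during fact gathering, all eight keys and salts above will receive the same hash value. If you would rather use unique values, one option is to fetch a fresh set from the WordPress secret-key service referenced in the template comments and paste them into the template manually:

      • curl -s https://api.wordpress.org/secret-key/1.1/salt/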
      
      

      playbook.yml

      The playbook.yml file is where all tasks from this setup are defined. It starts by defining the group of servers that should be the target of this setup (all), after which it uses become: true to define that tasks should be executed with privilege escalation (sudo) by default. Then, it includes the vars/default.yml variable file to load configuration options.

      playbook.yml

      ---
      - hosts: all
        become: true
        vars_files:
          - vars/default.yml
      
        tasks:
          - name: Install prerequisites
            apt: name=aptitude update_cache=yes state=latest force_apt_get=yes
            tags: [ system ]
      
          - name: Install LAMP Packages
            apt: name={{ item }} update_cache=yes state=latest
            loop: [ 'apache2', 'mysql-server', 'python3-pymysql', 'php', 'php-mysql', 'libapache2-mod-php' ]
            tags: [ system ]
      
          - name: Install PHP Extensions
            apt: name={{ item }} update_cache=yes state=latest
            loop: "{{ php_modules }}"
            tags: [ system ]
      
        # Apache Configuration
          - name: Create document root
            file:
              path: "/var/www/{{ http_host }}"
              state: directory
              owner: "www-data"
              group: "www-data"
              mode: '0755'
            tags: [ apache ]
      
          - name: Set up Apache VirtualHost
            template:
              src: "files/apache.conf.j2"
              dest: "/etc/apache2/sites-available/{{ http_conf }}"
            notify: Reload Apache
            tags: [ apache ]
      
          - name: Enable rewrite module
            shell: /usr/sbin/a2enmod rewrite
            notify: Reload Apache
            tags: [ apache ]
      
          - name: Enable new site
            shell: /usr/sbin/a2ensite {{ http_conf }}
            notify: Reload Apache
            tags: [ apache ]
      
          - name: Disable default Apache site
            shell: /usr/sbin/a2dissite 000-default.conf
            notify: Restart Apache
            tags: [ apache ]
      
        # MySQL Configuration
          - name: Set the root password
            mysql_user:
              name: root
              password: "{{ mysql_root_password }}"
              login_unix_socket: /var/run/mysqld/mysqld.sock
            tags: [ mysql, mysql-root ]
      
          - name: Remove all anonymous user accounts
            mysql_user:
              name: ''
              host_all: yes
              state: absent
              login_user: root
              login_password: "{{ mysql_root_password }}"
            tags: [ mysql ]
      
          - name: Remove the MySQL test database
            mysql_db:
              name: test
              state: absent
              login_user: root
              login_password: "{{ mysql_root_password }}"
            tags: [ mysql ]
      
          - name: Creates database for WordPress
            mysql_db:
              name: "{{ mysql_db }}"
              state: present
              login_user: root
              login_password: "{{ mysql_root_password }}"
            tags: [ mysql ]
      
          - name: Create MySQL user for WordPress
            mysql_user:
              name: "{{ mysql_user }}"
              password: "{{ mysql_password }}"
              priv: "{{ mysql_db }}.*:ALL"
              state: present
              login_user: root
              login_password: "{{ mysql_root_password }}"
            tags: [ mysql ]
      
        # UFW Configuration
          - name: "UFW - Allow HTTP on port {{ http_port }}"
            ufw:
              rule: allow
              port: "{{ http_port }}"
              proto: tcp
            tags: [ system ]
      
        # WordPress Configuration
          - name: Download and unpack latest WordPress
            unarchive:
              src: https://wordpress.org/latest.tar.gz
              dest: "/var/www/{{ http_host }}"
              remote_src: yes
              creates: "/var/www/{{ http_host }}/wordpress"
            tags: [ wordpress ]
      
          - name: Set ownership
            file:
              path: "/var/www/{{ http_host }}"
              state: directory
              recurse: yes
              owner: www-data
              group: www-data
            tags: [ wordpress ]
      
          - name: Set permissions for directories
            shell: "/usr/bin/find /var/www/{{ http_host }}/wordpress/ -type d -exec chmod 750 {} \;"
            tags: [ wordpress ]
      
          - name: Set permissions for files
            shell: "/usr/bin/find /var/www/{{ http_host }}/wordpress/ -type f -exec chmod 640 {} \;"
            tags: [ wordpress ]
      
          - name: Set up wp-config
            template:
              src: "files/wp-config.php.j2"
              dest: "/var/www/{{ http_host }}/wordpress/wp-config.php"
            tags: [ wordpress ]
      
        handlers:
          - name: Reload Apache
            service:
              name: apache2
              state: reloaded
      
          - name: Restart Apache
            service:
              name: apache2
              state: restarted
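
      Because every task in this playbook is tagged, you can also re-run just a subset of the setup later on without executing the entire playbook. For example, to re-apply only the MySQL-related tasks:

      • ansible-playbook playbook.yml --tags mysql -l server1 -u sammy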
      

      Feel free to modify these files to best suit your individual needs within your own workflow.

      Conclusion

      In this guide, we used Ansible to automate the process of installing and setting up a WordPress website with LAMP on an Ubuntu 18.04 server.

      If you’d like to include other tasks in this playbook to further customize your server setup, please refer to our introductory Ansible guide Configuration Management 101: Writing Ansible Playbooks.




      Containerizing a Ruby on Rails Application for Development with Docker Compose


      Introduction

      If you are actively developing an application, using Docker can simplify your workflow and the process of deploying your application to production. Working with containers in development offers the following benefits:

      • Environments are consistent, meaning that you can choose the languages and dependencies you want for your project without worrying about system conflicts.
      • Environments are isolated, making it easier to troubleshoot issues and onboard new team members.
      • Environments are portable, allowing you to package and share your code with others.

      This tutorial will show you how to set up a development environment for a Ruby on Rails application using Docker. You will create multiple containers – for the application itself, the PostgreSQL database, Redis, and a Sidekiq service – with Docker Compose. The setup will do the following:

      • Synchronize the application code on the host with the code in the container to facilitate changes during development.
      • Persist application data between container restarts.
      • Configure Sidekiq workers to process jobs as expected.

      At the end of this tutorial, you will have a working shark information application running on Docker containers:

      Sidekiq App Home

      Prerequisites

      To follow this tutorial, you will need:

      • A local machine or development server running Ubuntu 18.04.
      • Docker and Docker Compose installed on that machine or server.

      Step 1 — Cloning the Project and Adding Dependencies

      Our first step will be to clone the rails-sidekiq repository from the DigitalOcean Community GitHub account. This repository includes the code from the setup described in How To Add Sidekiq and Redis to a Ruby on Rails Application, which explains how to add Sidekiq to an existing Rails 5 project.

      Clone the repository into a directory called rails-docker:

      • git clone https://github.com/do-community/rails-sidekiq.git rails-docker

      Navigate to the rails-docker directory:

      • cd rails-docker

      In this tutorial we will use PostgreSQL as a database. In order to work with PostgreSQL instead of SQLite 3, you will need to add the pg gem to the project’s dependencies, which are listed in its Gemfile. Open that file for editing using nano or your favorite editor:

      • nano Gemfile

      Add the gem anywhere in the main project dependencies (above development dependencies):

      ~/rails-docker/Gemfile

      . . . 
      # Reduces boot times through caching; required in config/boot.rb
      gem 'bootsnap', '>= 1.1.0', require: false
      gem 'sidekiq', '~>6.0.0'
      gem 'pg', '~>1.1.3'
      
      group :development, :test do
      . . .
      

      We can also comment out the sqlite gem, since we won’t be using it anymore:

      ~/rails-docker/Gemfile

      . . . 
      # Use sqlite3 as the database for Active Record
      # gem 'sqlite3'
      . . .
      

      Finally, comment out the spring-watcher-listen gem under development:

      ~/rails-docker/Gemfile

      . . . 
      gem 'spring'
      # gem 'spring-watcher-listen', '~> 2.0.0'
      . . .
      

      If we do not disable this gem, we will see persistent error messages when accessing the Rails console. These error messages derive from the fact that this gem has Rails use listen to watch for changes in development, rather than polling the filesystem for changes. Because this gem watches the root of the project, including the node_modules directory, it will throw error messages about which directories are being watched, cluttering the console. If you are concerned about conserving CPU resources, however, disabling this gem may not work for you. In this case, it may be a good idea to upgrade your Rails application to Rails 6.

      Save and close the file when you are finished editing.

      With your project repository in place, the pg gem added to your Gemfile, and the spring-watcher-listen gem commented out, you are ready to configure your application to work with PostgreSQL.

      Step 2 — Configuring the Application to Work with PostgreSQL and Redis

      To work with PostgreSQL and Redis in development, we will want to do the following:

      • Configure the application to work with PostgreSQL as the default adapter.
      • Add an .env file to the project with our database username and password and Redis host.
      • Create an init.sql script to create a sammy user for the database.
      • Add an initializer for Sidekiq so that it can work with our containerized redis service.
      • Add the .env file and other relevant files to the project’s gitignore and dockerignore files.
      • Create database seeds so that our application has some records for us to work with when we start it up.

      First, open your database configuration file, located at config/database.yml:

      • nano config/database.yml

      Currently, the file includes the following default settings, which are applied in the absence of other settings:

      ~/rails-docker/config/database.yml

      default: &default
        adapter: sqlite3
        pool: <%= ENV.fetch("RAILS_MAX_THREADS") { 5 } %>
        timeout: 5000
      

      We need to change these to reflect the fact that we will use the postgresql adapter, since we will be creating a PostgreSQL service with Docker Compose to persist our application data.

      Delete the code that sets SQLite as the adapter and replace it with the following settings, which will set the adapter appropriately and the other variables necessary to connect:

      ~/rails-docker/config/database.yml

      default: &default
        adapter: postgresql
        encoding: unicode
        database: <%= ENV['DATABASE_NAME'] %>
        username: <%= ENV['DATABASE_USER'] %>
        password: <%= ENV['DATABASE_PASSWORD'] %>
        port: <%= ENV['DATABASE_PORT'] || '5432' %>
        host: <%= ENV['DATABASE_HOST'] %>
        pool: <%= ENV.fetch("RAILS_MAX_THREADS") { 5 } %>
        timeout: 5000
      . . .
      

      Next, we’ll modify the setting for the development environment, since this is the environment we’re using in this setup.

      Delete the existing SQLite database configuration so that the section looks like this:

      ~/rails-docker/config/database.yml

      . . . 
      development:
        <<: *default
      . . .
      

      Finally, delete the database settings from the test and production environments as well, so that those sections look like this:

      ~/rails-docker/config/database.yml

      . . . 
      test:
        <<: *default
      
      production:
        <<: *default
      . . . 
      

      These modifications to our default database settings will allow us to set our database information dynamically using environment variables defined in .env files, which will not be committed to version control.

      Save and close the file when you are finished editing.

      Note that if you are creating a Rails project from scratch, you can set the adapter with the rails new command, as described in Step 3 of How To Use PostgreSQL with Your Ruby on Rails Application on Ubuntu 18.04. This will set your adapter in config/database.yml and automatically add the pg gem to the project.

      Now that we have referenced our environment variables, we can create a file for them with our preferred settings. Extracting configuration settings in this way is part of the 12 Factor approach to application development, which defines best practices for application resiliency in distributed environments. Now, when we are setting up our production and test environments in the future, configuring our database settings will involve creating additional .env files and referencing the appropriate file in our Docker Compose files.

      Open an .env file:

      • nano .env

      Add the following values to the file:

      ~/rails-docker/.env

      DATABASE_NAME=rails_development
      DATABASE_USER=sammy
      DATABASE_PASSWORD=shark
      DATABASE_HOST=database
      REDIS_HOST=redis
      

      In addition to setting our database name, user, and password, we’ve also set a value for DATABASE_HOST. The value, database, refers to the PostgreSQL service we will create using Docker Compose, which will be named database. We’ve also set a REDIS_HOST to specify our redis service.

      Save and close the file when you are finished editing.

      To create the sammy database user, we can write an init.sql script that we can then mount to the database container when it starts.

      Open the script file:

      • nano init.sql

      Add the following code to create a sammy user with administrative privileges:

      ~/rails-docker/init.sql

      CREATE USER sammy;
      ALTER USER sammy WITH SUPERUSER;
      

      This script will create the appropriate user on the database and grant this user administrative privileges.

      Set appropriate permissions on the script:

      • chmod +x init.sql

      Next, we’ll configure Sidekiq to work with our containerized redis service. We can do this by adding an initializer to the config/initializers directory, the location where Rails looks for configuration settings once frameworks and plugins are loaded. The initializer will set a value for the Redis host.

      Open a sidekiq.rb file to specify these settings:

      • nano config/initializers/sidekiq.rb

      Add the following code to the file to specify values for a REDIS_HOST and REDIS_PORT:

      ~/rails-docker/config/initializers/sidekiq.rb

      Sidekiq.configure_server do |config|
        config.redis = {
          host: ENV['REDIS_HOST'],
          port: ENV['REDIS_PORT'] || '6379'
        }
      end
      
      Sidekiq.configure_client do |config|
        config.redis = {
          host: ENV['REDIS_HOST'],
          port: ENV['REDIS_PORT'] || '6379'
        }
      end
      

      Much like our database configuration settings, these settings give us the ability to set our host and port parameters dynamically, allowing us to substitute the appropriate values at runtime without having to modify the application code itself. In addition to a REDIS_HOST, we have a default value set for REDIS_PORT in case it is not set elsewhere.

      Save and close the file when you are finished editing.
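
      Once the services defined in Step 4 are up and running, you can verify this connection end to end with redis-cli, which ships in the Redis image. This assumes the service is named redis, as it will be in our Compose file:

      • docker-compose exec redis redis-cli ping

      A PONG response confirms that the host Sidekiq will connect to is reachable.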

      Next, to ensure that our application’s sensitive data is not copied to version control, we can add .env to our project’s .gitignore file, which tells Git which files to ignore in our project. Open the file for editing:

      • nano .gitignore

      At the bottom of the file, add an entry for .env:

      ~/rails-docker/.gitignore

      yarn-debug.log*
      .yarn-integrity
      .env
      

      Save and close the file when you are finished editing.

      Next, we’ll create a .dockerignore file to set what should not be copied to our containers. Open the file for editing:

      • nano .dockerignore

      Add the following code to the file, which tells Docker to ignore some of the things we don’t need copied to our containers:

      ~/rails-docker/.dockerignore

      .DS_Store
      .bin
      .git
      .gitignore
      .bundleignore
      .bundle
      .byebug_history
      .rspec
      tmp
      log
      test
      config/deploy
      public/packs
      public/packs-test
      node_modules
      yarn-error.log
      coverage/
      

      Add .env to the bottom of this file as well:

      ~/rails-docker/.dockerignore

      . . .
      yarn-error.log
      coverage/
      .env
      

      Save and close the file when you are finished editing.

      As a final step, we will create some seed data so that our application has a few records when we start it up.

      Open a file for the seed data in the db directory:

      • nano db/seeds.rb

      Add the following code to the file to create four demo sharks and one sample post:

      ~/rails-docker/db/seeds.rb

      # Adding demo sharks
      sharks = Shark.create([{ name: 'Great White', facts: 'Scary' }, { name: 'Megalodon', facts: 'Ancient' }, { name: 'Hammerhead', facts: 'Hammer-like' }, { name: 'Speartooth', facts: 'Endangered' }])
      Post.create(body: 'These sharks are misunderstood', shark: sharks.first)
      

      This seed data will create four sharks and one post that is associated with the first shark.

      Save and close the file when you are finished editing.

      With your application configured to work with PostgreSQL and your environment variables created, you are ready to write your application Dockerfile.

      Step 3 — Writing the Dockerfile and Entrypoint Scripts

      Your Dockerfile specifies what will be included in your application container when it is created. Using a Dockerfile allows you to define your container environment and avoid discrepancies with dependencies or runtime versions.

      Following these guidelines on building optimized containers, we will make our image as efficient as possible by using an Alpine base and attempting to minimize our image layers.

      Open a Dockerfile in your current directory:

      • nano Dockerfile

      Docker images are created using a succession of layered images that build on one another. Our first step will be to add the base image for our application, which will form the starting point of the application build.

      Add the following code to the file to add the Ruby alpine image as a base:

      ~/rails-docker/Dockerfile

      FROM ruby:2.5.1-alpine
      

      The alpine image is derived from the Alpine Linux project, and will help us keep our image size down. For more information about whether or not the alpine image is the right choice for your project, please see the full discussion under the Image Variants section of the Docker Hub Ruby image page.

      Some factors to consider when using alpine in development:

      • Keeping image size down will decrease page and resource load times, particularly if you also keep volumes to a minimum. This helps keep your user experience in development quick and closer to what it would be if you were working locally in a non-containerized environment.
      • Having parity between development and production images facilitates successful deployments. Since teams often opt to use Alpine images in production for speed benefits, developing with an Alpine base helps offset issues when moving to production.

      Next, set an environment variable to specify the Bundler version:

      ~/rails-docker/Dockerfile

      . . .
      ENV BUNDLER_VERSION=2.0.2
      

      This is one of the steps we will take to avoid version conflicts between the default bundler version available in our environment and our application code, which requires Bundler 2.0.2.

      Next, add the packages that you need to work with the application to the Dockerfile:

      ~/rails-docker/Dockerfile

      . . . 
      RUN apk add --update --no-cache \
            binutils-gold \
            build-base \
            curl \
            file \
            g++ \
            gcc \
            git \
            less \
            libstdc++ \
            libffi-dev \
            libc-dev \
            linux-headers \
            libxml2-dev \
            libxslt-dev \
            libgcrypt-dev \
            make \
            netcat-openbsd \
            nodejs \
            openssl \
            pkgconfig \
            postgresql-dev \
            python \
            tzdata \
            yarn
      

      These packages include nodejs and yarn, among others. Since our application serves assets with webpack, we need to include Node.js and Yarn for the application to work as expected.

      Keep in mind that the alpine image is extremely minimal: the packages listed here are not exhaustive of what you might want or need in development when you are containerizing your own application.

      Next, install the appropriate bundler version:

      ~/rails-docker/Dockerfile

      . . . 
      RUN gem install bundler -v 2.0.2
      

      This step will guarantee parity between our containerized environment and the specifications in this project’s Gemfile.lock file.
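
      If you want to confirm the Bundler version the project expects, the final lines of Gemfile.lock record it; for this project, the following should print BUNDLED WITH followed by 2.0.2:

      • tail -n 2 Gemfile.lock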

      Now set the working directory for the application on the container:

      ~/rails-docker/Dockerfile

      . . .
      WORKDIR /app
      

      Copy over your Gemfile and Gemfile.lock:

      ~/rails-docker/Dockerfile

      . . .
      COPY Gemfile Gemfile.lock ./
      

      Copying these files as an independent step, followed by bundle install, means that the project gems do not need to be rebuilt every time you make changes to your application code. This will work in conjunction with the gem volume that we will include in our Compose file, which will mount gems to your application container in cases where the service is recreated but project gems remain the same.

      Next, set the configuration options for the nokogiri gem build:

      ~/rails-docker/Dockerfile

      . . . 
      RUN bundle config build.nokogiri --use-system-libraries
      . . .
      

      This step builds nokogiri with the libxml2 and libxslt library versions that we added to the application container in the RUN apk add… step above.

      Next, install the project gems:

      ~/rails-docker/Dockerfile

      . . . 
      RUN bundle check || bundle install
      

      This instruction checks that the gems are not already installed before installing them.

      Next, we’ll repeat the same procedure that we used with gems with our JavaScript packages and dependencies. First we’ll copy package metadata, then we’ll install dependencies, and finally we’ll copy the application code into the container image.

      To get started with the JavaScript section of our Dockerfile, copy package.json and yarn.lock from your current project directory on the host to the container:

      ~/rails-docker/Dockerfile

      . . . 
      COPY package.json yarn.lock ./
      

      Then install the required packages with yarn install:

      ~/rails-docker/Dockerfile

      . . . 
      RUN yarn install --check-files
      

      This instruction includes a --check-files flag with the yarn command, a feature that makes sure any previously installed files have not been removed. As in the case of our gems, we will manage the persistence of the packages in the node_modules directory with a volume when we write our Compose file.
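
      Keep in mind for later that if you change the project’s Node dependencies, the node_modules volume we will define in Step 4 needs to be removed so it can be rebuilt. Assuming the default Compose project name of rails-docker, that cleanup would look like this:

      • docker-compose down
      • docker volume rm rails-docker_node_modules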

      Finally, copy over the rest of the application code and start the application with an entrypoint script:

      ~/rails-docker/Dockerfile

      . . . 
      COPY . ./ 
      
      ENTRYPOINT ["./entrypoints/docker-entrypoint.sh"]
      

      Using an entrypoint script allows us to run the container as an executable.

      The final Dockerfile will look like this:

      ~/rails-docker/Dockerfile

      FROM ruby:2.5.1-alpine
      
      ENV BUNDLER_VERSION=2.0.2
      
      RUN apk add --update --no-cache \
            binutils-gold \
            build-base \
            curl \
            file \
            g++ \
            gcc \
            git \
            less \
            libstdc++ \
            libffi-dev \
            libc-dev \
            linux-headers \
            libxml2-dev \
            libxslt-dev \
            libgcrypt-dev \
            make \
            netcat-openbsd \
            nodejs \
            openssl \
            pkgconfig \
            postgresql-dev \
            python \
            tzdata \
            yarn
      
      RUN gem install bundler -v 2.0.2
      
      WORKDIR /app
      
      COPY Gemfile Gemfile.lock ./
      
      RUN bundle config build.nokogiri --use-system-libraries
      
      RUN bundle check || bundle install 
      
      COPY package.json yarn.lock ./
      
      RUN yarn install --check-files
      
      COPY . ./ 
      
      ENTRYPOINT ["./entrypoints/docker-entrypoint.sh"]
      

      Save and close the file when you are finished editing.

      Next, create a directory called entrypoints for the entrypoint scripts:

      • mkdir entrypoints

      This directory will include our main entrypoint script and a script for our Sidekiq service.

      Open the file for the application entrypoint script:

      • nano entrypoints/docker-entrypoint.sh

      Add the following code to the file:

      rails-docker/entrypoints/docker-entrypoint.sh

      #!/bin/sh
      
      set -e
      
      if [ -f tmp/pids/server.pid ]; then
        rm tmp/pids/server.pid
      fi
      
      bundle exec rails s -b 0.0.0.0
      

      The first important line is set -e, which tells the /bin/sh shell that runs the script to fail fast if there are any problems later in the script. Next, the script checks that tmp/pids/server.pid is not present to ensure that there won’t be server conflicts when we start the application. Finally, the script starts the Rails server with the bundle exec rails s command. We use the -b option with this command to bind the server to all IP addresses rather than to the default, localhost. This invocation makes the Rails server route incoming requests to the container IP rather than to the default localhost.

      Save and close the file when you are finished editing.

      Make the script executable:

      • chmod +x entrypoints/docker-entrypoint.sh
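
      If you would like to sanity-check the script at this point, sh can parse it without executing anything; no output means the script parses cleanly:

      • sh -n entrypoints/docker-entrypoint.sh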

      Next, we will create a script to start our sidekiq service, which will process our Sidekiq jobs. For more information about how this application uses Sidekiq, please see How To Add Sidekiq and Redis to a Ruby on Rails Application.

      Open a file for the Sidekiq entrypoint script:

      • nano entrypoints/sidekiq-entrypoint.sh

      Add the following code to the file to start Sidekiq:

      ~/rails-docker/entrypoints/sidekiq-entrypoint.sh

      #!/bin/sh
      
      set -e
      
      if [ -f tmp/pids/server.pid ]; then
        rm tmp/pids/server.pid
      fi
      
      bundle exec sidekiq
      

      This script starts Sidekiq in the context of our application bundle.

      Save and close the file when you are finished editing. Make it executable:

      • chmod +x entrypoints/sidekiq-entrypoint.sh

      With your entrypoint scripts and Dockerfile in place, you are ready to define your services in your Compose file.

      Step 4 — Defining Services with Docker Compose

      Using Docker Compose, we will be able to run the multiple containers required for our setup. We will define our Compose services in our main docker-compose.yml file. A service in Compose is a running container, and service definitions — which you will include in your docker-compose.yml file — contain information about how each container image will run. The Compose tool allows you to define multiple services to build multi-container applications.

      Our application setup will include the following services:

      • The application itself
      • The PostgreSQL database
      • Redis
      • Sidekiq

      We will also include a bind mount as part of our setup, so that any code changes we make during development will be immediately synchronized with the containers that need access to this code.

      Note that we are not defining a test service, since testing is outside of the scope of this tutorial and series, but you could do so by following the precedent we are using here for the sidekiq service.

      Open the docker-compose.yml file:

      • nano docker-compose.yml

      First, add the application service definition:

      ~/rails-docker/docker-compose.yml

      version: '3.4'
      
      services:
        app: 
          build:
            context: .
            dockerfile: Dockerfile
          depends_on:
            - database
            - redis
          ports: 
            - "3000:3000"
          volumes:
            - .:/app
            - gem_cache:/usr/local/bundle/gems
            - node_modules:/app/node_modules
          env_file: .env
          environment:
            RAILS_ENV: development
      

      The app service definition includes the following options:

      • build: This defines the configuration options, including the context and dockerfile, that will be applied when Compose builds the application image. If you wanted to use an existing image from a registry like Docker Hub, you could use the image instruction instead, with information about your username, repository, and image tag.
      • context: This defines the build context for the image build — in this case, the current project directory.
      • dockerfile: This specifies the Dockerfile in your current project directory as the file Compose will use to build the application image.
      • depends_on: This sets up the database and redis containers first so that they are up and running before app.
      • ports: This maps port 3000 on the host to port 3000 on the container.
      • volumes: We are including two types of mounts here:
        • The first is a bind mount that mounts our application code on the host to the /app directory on the container. This will facilitate rapid development, since any changes you make to your host code will be populated immediately in the container.
        • The second is a named volume, gem_cache. When the bundle install instruction runs in the container, it will install the project gems. Adding this volume means that if you recreate the container, the gems will be mounted to the new container. This mount presumes that there haven’t been any changes to the project, so if you do make changes to your project gems in development, you will need to remember to delete this volume before recreating your application service.
        • The third volume is a named volume for the node_modules directory. Rather than having node_modules mounted to the host, which can lead to package discrepancies and permissions conflicts in development, this volume will ensure that the packages in this directory are persisted and reflect the current state of the project. Again, if you modify the project’s Node dependencies, you will need to remove and recreate this volume.
      • env_file: This tells Compose that we would like to add environment variables from a file called .env located in the build context.
      • environment: Using this option allows us to set a non-sensitive environment variable, passing information about the Rails environment to the container.

      Next, below the app service definition, add the following code to define your database service:

      ~/rails-docker/docker-compose.yml

      . . .
        database:
          image: postgres:12.1
          volumes:
            - db_data:/var/lib/postgresql/data
            - ./init.sql:/docker-entrypoint-initdb.d/init.sql
      

      Unlike the app service, the database service pulls a postgres image directly from Docker Hub. Note that we’re also pinning the version here, rather than setting it to latest or not specifying it (which defaults to latest). This way, we can ensure that this setup works with the versions specified here and avoid surprises from breaking changes to the image.

      We are also including a db_data volume here, which will persist our application data in between container starts. Additionally, we’ve mounted our init.sql startup script to the appropriate directory, docker-entrypoint-initdb.d/ on the container, in order to create our sammy database user. After the image entrypoint creates the default postgres user and database, it will run any scripts found in the docker-entrypoint-initdb.d/ directory, which you can use for necessary initialization tasks. For more details, see the Initialization scripts section of the PostgreSQL image documentation.
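
      Once the services are running in Step 5, you can confirm that this script executed by listing the database roles. The postgres superuser below is created by the image itself, and sammy should appear alongside it:

      • docker-compose exec database psql -U postgres -c '\du'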

      Next, add the redis service definition:

      ~/rails-docker/docker-compose.yml

      . . .
        redis:
          image: redis:5.0.7
      

      Like the database service, the redis service uses an image from Docker Hub. In this case, we are not persisting the Sidekiq job cache.

      Finally, add the sidekiq service definition:

      ~/rails-docker/docker-compose.yml

      . . .
        sidekiq:
          build:
            context: .
            dockerfile: Dockerfile
          depends_on:
            - app      
            - database
            - redis
          volumes:
            - .:/app
            - gem_cache:/usr/local/bundle/gems
            - node_modules:/app/node_modules
          env_file: .env
          environment:
            RAILS_ENV: development
          entrypoint: ./entrypoints/sidekiq-entrypoint.sh
      

      Our sidekiq service resembles our app service in a few respects: it uses the same build context and image, environment variables, and volumes. However, it is dependent on the app, redis, and database services, and so will be the last to start. Additionally, it uses an entrypoint that will override the entrypoint set in the Dockerfile. This entrypoint setting points to entrypoints/sidekiq-entrypoint.sh, which includes the appropriate command to start the sidekiq service.

      As a final step, add the volume definitions below the sidekiq service definition:

      ~/rails-docker/docker-compose.yml

      . . .
      volumes:
        gem_cache:
        db_data:
        node_modules:
      

      Our top-level volumes key defines the volumes gem_cache, db_data, and node_modules. When Docker creates volumes, the contents of the volume are stored in a part of the host filesystem, /var/lib/docker/volumes/, that’s managed by Docker. The contents of each volume are stored in a directory under /var/lib/docker/volumes/ and get mounted to any container that uses the volume. In this way, the shark information data that our users will create will persist in the db_data volume even if we remove and recreate the database service.
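
      You can inspect these volumes once they exist. Compose prefixes volume names with the project name, which defaults to the name of the project directory, so for this setup the data volume will likely be named rails-docker_db_data:

      • docker volume ls
      • docker volume inspect rails-docker_db_data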

      The finished file will look like this:

      ~/rails-docker/docker-compose.yml

      version: '3.4'
      
      services:
        app: 
          build:
            context: .
            dockerfile: Dockerfile
          depends_on:     
            - database
            - redis
          ports: 
            - "3000:3000"
          volumes:
            - .:/app
            - gem_cache:/usr/local/bundle/gems
            - node_modules:/app/node_modules
          env_file: .env
          environment:
            RAILS_ENV: development
      
        database:
          image: postgres:12.1
          volumes:
            - db_data:/var/lib/postgresql/data
            - ./init.sql:/docker-entrypoint-initdb.d/init.sql
      
        redis:
          image: redis:5.0.7
      
        sidekiq:
          build:
            context: .
            dockerfile: Dockerfile
          depends_on:
            - app      
            - database
            - redis
          volumes:
            - .:/app
            - gem_cache:/usr/local/bundle/gems
            - node_modules:/app/node_modules
          env_file: .env
          environment:
            RAILS_ENV: development
          entrypoint: ./entrypoints/sidekiq-entrypoint.sh
      
      volumes:
        gem_cache:
        db_data:
        node_modules:     
      

      Save and close the file when you are finished editing.
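
      Before starting the services, you can optionally have Compose validate the file and print the fully resolved configuration, including the values substituted in from .env:

      • docker-compose config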

      With your service definitions written, you are ready to start the application.

      Step 5 — Testing the Application

      With your docker-compose.yml file in place, you can create your services with the docker-compose up command and seed your database. You can also test that your data will persist by stopping and removing your containers with docker-compose down and recreating them.

      First, build the container images and create the services by running docker-compose up with the -d flag, which will run the containers in the background:

      • docker-compose up -d

      You will see output that your services have been created:

      Output

      Creating rails-docker_database_1 ... done
      Creating rails-docker_redis_1    ... done
      Creating rails-docker_app_1      ... done
      Creating rails-docker_sidekiq_1  ... done

      You can also get more detailed information about the startup processes by displaying the log output from the services:

      • docker-compose logs

      You will see something like this if everything has started correctly:

      Output

      sidekiq_1 | 2019-12-19T15:05:26.365Z pid=6 tid=grk7r6xly INFO: Booting Sidekiq 6.0.3 with redis options {:host=>"redis", :port=>"6379", :id=>"Sidekiq-server-PID-6", :url=>nil}
      sidekiq_1 | 2019-12-19T15:05:31.097Z pid=6 tid=grk7r6xly INFO: Running in ruby 2.5.1p57 (2018-03-29 revision 63029) [x86_64-linux-musl]
      sidekiq_1 | 2019-12-19T15:05:31.097Z pid=6 tid=grk7r6xly INFO: See LICENSE and the LGPL-3.0 for licensing details.
      sidekiq_1 | 2019-12-19T15:05:31.097Z pid=6 tid=grk7r6xly INFO: Upgrade to Sidekiq Pro for more features and support: http://sidekiq.org
      app_1 | => Booting Puma
      app_1 | => Rails 5.2.3 application starting in development
      app_1 | => Run `rails server -h` for more startup options
      app_1 | Puma starting in single mode...
      app_1 | * Version 3.12.1 (ruby 2.5.1-p57), codename: Llamas in Pajamas
      app_1 | * Min threads: 5, max threads: 5
      app_1 | * Environment: development
      app_1 | * Listening on tcp://0.0.0.0:3000
      app_1 | Use Ctrl-C to stop
      . . .
      database_1 | PostgreSQL init process complete; ready for start up.
      database_1 |
      database_1 | 2019-12-19 15:05:20.160 UTC [1] LOG: starting PostgreSQL 12.1 (Debian 12.1-1.pgdg100+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 8.3.0-6) 8.3.0, 64-bit
      database_1 | 2019-12-19 15:05:20.160 UTC [1] LOG: listening on IPv4 address "0.0.0.0", port 5432
      database_1 | 2019-12-19 15:05:20.160 UTC [1] LOG: listening on IPv6 address "::", port 5432
      database_1 | 2019-12-19 15:05:20.163 UTC [1] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
      database_1 | 2019-12-19 15:05:20.182 UTC [63] LOG: database system was shut down at 2019-12-19 15:05:20 UTC
      database_1 | 2019-12-19 15:05:20.187 UTC [1] LOG: database system is ready to accept connections
      . . .
      redis_1 | 1:M 19 Dec 2019 15:05:18.822 * Ready to accept connections

      You can also check the status of your containers with docker-compose ps:

      • docker-compose ps

      You will see output indicating that your containers are running:

      Output

      Name                      Command                          State   Ports
      -----------------------------------------------------------------------------------------
      rails-docker_app_1        ./entrypoints/docker-resta ...   Up      0.0.0.0:3000->3000/tcp
      rails-docker_database_1   docker-entrypoint.sh postgres    Up      5432/tcp
      rails-docker_redis_1      docker-entrypoint.sh redis ...   Up      6379/tcp
      rails-docker_sidekiq_1    ./entrypoints/sidekiq-entr ...   Up

      Next, create and seed your database and run migrations on it with the following docker-compose exec command:

      • docker-compose exec app bundle exec rake db:setup db:migrate

      The docker-compose exec command allows you to run commands in your services; we are using it here to run rake db:setup and db:migrate in the context of our application bundle to create and seed the database and run migrations. As you work in development, docker-compose exec will prove useful to you when you want to run migrations against your development database.

      You will see the following output after running this command:

      Output

      Created database 'rails_development'
      Database 'rails_development' already exists
      -- enable_extension("plpgsql")
         -> 0.0140s
      -- create_table("endangereds", {:force=>:cascade})
         -> 0.0097s
      -- create_table("posts", {:force=>:cascade})
         -> 0.0108s
      -- create_table("sharks", {:force=>:cascade})
         -> 0.0050s
      -- enable_extension("plpgsql")
         -> 0.0173s
      -- create_table("endangereds", {:force=>:cascade})
         -> 0.0088s
      -- create_table("posts", {:force=>:cascade})
         -> 0.0128s
      -- create_table("sharks", {:force=>:cascade})
         -> 0.0072s
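
      As you continue working, you can use docker-compose exec for other one-off tasks in the same way. For example, to check which migrations have been applied at any point, run the standard migration status task:

      • docker-compose exec app bundle exec rake db:migrate:status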

      With your services running, you can visit localhost:3000 or http://your_server_ip:3000 in the browser. You will see a landing page that looks like this:

      Sidekiq App Home

      We can now test data persistence. Click on the Get Shark Info button, which will take you to the sharks/index route:

      Sharks Index Page with Seeded Data

      To verify that the application is working, we can add some demo information to it. Click on New Shark. You will be prompted for a username (sammy) and password (shark), thanks to the project’s authentication settings.

      On the New Shark page, input “Mako” into the Name field and “Fast” into the Facts field.

      Click on the Create Shark button to create the shark. Once you have created the shark, click Home on the site’s navbar to get back to the main application landing page. We can now test that Sidekiq is working.

      Click on the Which Sharks Are in Danger? button. Since you have not uploaded any endangered sharks, this will take you to the endangered index view:

      Endangered Index View

      Click on Import Endangered Sharks to import the sharks. You will see a status message telling you that the sharks have been imported:

      Begin Import

      You will also see the beginning of the import. Refresh your page to see the entire table:

      Refresh Table

      Thanks to Sidekiq, our large batch upload of endangered sharks has succeeded without locking up the browser or interfering with other application functionality.

      Click on the Home button at the bottom of the page, which will bring you back to the application main page:

      Sidekiq App Home

      From here, click on Which Sharks Are in Danger? again. You will see the uploaded sharks once again.

      Now that we know our application is working properly, we can test our data persistence.

Back at your terminal, type the following command to stop and remove your containers:

      • docker-compose down

      Note that we are not including the --volumes option; hence, our db_data volume is not removed.
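
      For reference, if you ever do want to discard the saved data as well, adding the --volumes flag to the same command removes the named volumes declared in the Compose file along with the containers:

      • docker-compose down --volumes

      Avoid that here, since removing db_data would defeat the persistence test that follows.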

      The following output confirms that your containers and network have been removed:

      Output

Stopping rails-docker_sidekiq_1  ... done
      Stopping rails-docker_app_1      ... done
      Stopping rails-docker_database_1 ... done
      Stopping rails-docker_redis_1    ... done
      Removing rails-docker_sidekiq_1  ... done
      Removing rails-docker_app_1      ... done
      Removing rails-docker_database_1 ... done
      Removing rails-docker_redis_1    ... done
      Removing network rails-docker_default

Recreate the containers, running them in detached mode so that you can keep using your terminal:

      • docker-compose up -d

      Open the Rails console on the app container with docker-compose exec and bundle exec rails console:

      • docker-compose exec app bundle exec rails console

At the prompt, inspect the last Shark record in the database:

      • Shark.last

      You will see the record you just created:

      IRB session

Shark Load (1.0ms)  SELECT "sharks".* FROM "sharks" ORDER BY "sharks"."id" DESC LIMIT $1  [["LIMIT", 1]]
      => #<Shark id: 5, name: "Mako", facts: "Fast", created_at: "2019-12-20 14:03:28", updated_at: "2019-12-20 14:03:28">

You can then check to see that your Endangered sharks have been persisted with the following command:

      • Endangered.all.count

      IRB session

(0.8ms)  SELECT COUNT(*) FROM "endangereds"
      => 73

      Your db_data volume was successfully mounted to the recreated database service, making it possible for your app service to access the saved data. If you navigate directly to the index shark page by visiting localhost:3000/sharks or http://your_server_ip:3000/sharks you will also see that record displayed:

      Sharks Index Page with Mako

      Your endangered sharks will also be at the localhost:3000/endangered/data or http://your_server_ip:3000/endangered/data view:

      Refresh Table
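
      If you would like to confirm at the Docker level that the volume survived docker-compose down, you can list the volumes Docker manages:

      • docker volume ls

      Compose prefixes volume names with the project directory name, so you should see an entry along the lines of rails-docker_db_data; the exact name depends on your project directory and is an assumption here.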

      Your application is now running on Docker containers with data persistence and code synchronization enabled. You can go ahead and test out local code changes on your host, which will be synchronized to your container thanks to the bind mount we defined as part of the app service.
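
      For reference, that synchronization comes from a volumes entry on the app service in docker-compose.yml that maps the project directory into the container. The following is a minimal sketch of the relevant lines, not the project's exact file; the container path /app is an assumption:

      services:
        app:
          # ... image/build, ports, and other keys omitted
          volumes:
            - .:/app   # bind mount: host project directory -> container

      With a mapping like this, any file you edit on the host under the project directory is immediately visible to the Rails server running in the container.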

      Conclusion

      By following this tutorial, you have created a development setup for your Rails application using Docker containers. You’ve made your project more modular and portable by extracting sensitive information and decoupling your application’s state from your code. You have also configured a boilerplate docker-compose.yml file that you can revise as your development needs and requirements change.

As you develop, you may be interested in learning more about designing applications for containerized and Cloud Native workflows. Please see Architecting Applications for Kubernetes and Modernizing Applications for Kubernetes for more information on these topics. Or, if you would like to invest in a Kubernetes learning sequence, please have a look at our Kubernetes for Full-Stack Developers curriculum.

      To learn more about the application code itself, please see the other tutorials in this series:




      How To Optimize MySQL Queries with ProxySQL Caching on Ubuntu 16.04


      The author selected the Free Software Foundation to receive a donation as part of the Write for DOnations program.

      Introduction

      ProxySQL is a SQL-aware proxy server that can be positioned between your application and your database. It offers many features, such as load-balancing between multiple MySQL servers and serving as a caching layer for queries. This tutorial will focus on ProxySQL’s caching feature, and how it can optimize queries for your MySQL database.

MySQL caching occurs when the result of a query is stored so that, when that query is repeated, the result can be returned without needing to search through the database again. This can significantly increase the speed of common queries. But with many caching methods, developers must modify the code of their application, which can introduce bugs into the codebase. To avoid this error-prone practice, ProxySQL allows you to set up transparent caching.

      In transparent caching, only database administrators need to change the ProxySQL configuration to enable caching for the most common queries, and these changes can be done through the ProxySQL admin interface. All the developer needs to do is connect to the protocol-aware proxy, and the proxy will decide if the query can be served from the cache without hitting the back-end server.

      In this tutorial, you will use ProxySQL to set up transparent caching for a MySQL server on Ubuntu 16.04. You will then test its performance using mysqlslap with and without caching to demonstrate the effect of caching and how much time it can save when executing many similar queries.

      Prerequisites

      Before you begin this guide you’ll need the following:

      Step 1 — Installing and Setting Up the MySQL Server

      First, you will install MySQL server and configure it to be used by ProxySQL as a back-end server for serving client queries.

      On Ubuntu 16.04, mysql-server can be installed using this command:

      • sudo apt-get install mysql-server

      Press Y to confirm the installation.

      You will then be prompted for your MySQL root user password. Enter a strong password and save it for later use.

Now that you have your MySQL server ready, you will configure it for ProxySQL to work correctly. You need to add a monitor user that ProxySQL will use to check on the MySQL server: rather than probing with a bare TCP connection or HTTP GET requests, ProxySQL speaks the SQL protocol to the back-end server, so monitor will open a dummy SQL connection to determine whether the server is alive.

First, log in to the MySQL shell:

      • mysql -uroot -p

      -uroot logs you in using the MySQL root user, and -p prompts for the root user’s password. This root user is different from your server’s root user, and the password is the one you entered when installing the mysql-server package.

      Enter the root password and press ENTER.

Now you will create two users: one named monitor, for ProxySQL, and another named sammy, which you will use to execute client queries. You will then grant each user the appropriate privileges.

      Create the monitor user:

      • CREATE USER 'monitor'@'%' IDENTIFIED BY 'monitor_password';

      The CREATE USER query is used to create a new user that can connect from specific IPs. Using % denotes that the user can connect from any IP address. IDENTIFIED BY sets the password for the new user; enter whatever password you like, but make sure to remember it for later use.
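
      If you wanted to tighten this later, you could restrict the account to a single client host instead of %. A hedged example, using a placeholder address rather than anything from this setup:

      • CREATE USER 'monitor'@'203.0.113.10' IDENTIFIED BY 'monitor_password';

      For this tutorial, % keeps the setup simple.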

      With the user monitor created, next make the sammy user:

      • CREATE USER 'sammy'@'%' IDENTIFIED BY 'sammy_password';

      Next, grant privileges to your new users. Run the following command to configure monitor:

      • GRANT SELECT ON sys.* TO 'monitor'@'%';

      The GRANT query is used to give privileges to users. Here you granted only SELECT on all tables in the sys database to the monitor user; it only needs this privilege to listen to the back-end server.

      Now grant all privileges to all databases to the user sammy:

      • GRANT ALL PRIVILEGES on *.* TO 'sammy'@'%';

      This will allow sammy to make the necessary queries to test your database later.
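
      Granting ALL PRIVILEGES on *.* is convenient for benchmarking but broader than a production application should have. As a more restrictive sketch, you could scope the grant to a single database once it exists; the employees database used here is not created until Step 3:

      • GRANT ALL PRIVILEGES ON employees.* TO 'sammy'@'%';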

Apply the privilege changes by running the following:

      • FLUSH PRIVILEGES;

Finally, exit the mysql shell:

      • exit;

      You’ve now installed mysql-server and created a user to be used by ProxySQL to monitor your MySQL server, and another one to execute client queries. Next you will install and configure ProxySQL.

      Step 2 — Installing and Configuring ProxySQL Server

Now you can install ProxySQL server, which will be used as a caching layer for your queries. A caching layer sits between your application servers and your database back-end servers; it connects to the database and saves the results of some queries in memory for fast access later.

The ProxySQL releases page on GitHub offers installation files for common Linux distributions. For this tutorial, you will use wget to download the ProxySQL version 2.0.4 Debian installation file:

      • wget https://github.com/sysown/proxysql/releases/download/v2.0.4/proxysql_2.0.4-ubuntu16_amd64.deb

      Next, install the package using dpkg:

      • sudo dpkg -i proxysql_2.0.4-ubuntu16_amd64.deb

      Once it is installed, start ProxySQL with this command:

      • sudo systemctl start proxysql

      You can check if ProxySQL started correctly with this command:

      • sudo systemctl status proxysql

      You will get an output similar to this:

      Output

● proxysql.service - LSB: High Performance Advanced Proxy for MySQL
         Loaded: loaded (/etc/init.d/proxysql; bad; vendor preset: enabled)
         Active: active (exited) since Wed 2019-06-12 21:32:50 UTC; 6 months 7 days ago
           Docs: man:systemd-sysv-generator(8)
          Tasks: 0
         Memory: 0B
            CPU: 0

      Now it is time to connect your ProxySQL server to the MySQL server. For this purpose, use the ProxySQL admin SQL interface, which by default listens to port 6032 on localhost and has admin as its username and password.

      Connect to the interface by running the following:

      • mysql -uadmin -p -h 127.0.0.1 -P6032

      Enter admin when prompted for the password.

      -uadmin sets the username as admin, and the -h flag specifies the host as localhost. The port is 6032, specified using the -P flag.

Here you had to specify the host and port explicitly because, by default, the MySQL client connects using a local socket file and port 3306.

      Now that you are logged into the mysql shell as admin, configure the monitor user so that ProxySQL can use it. First, use standard SQL queries to set the values of two global variables:

      • UPDATE global_variables SET variable_value='monitor' WHERE variable_name='mysql-monitor_username';
      • UPDATE global_variables SET variable_value='monitor_password' WHERE variable_name='mysql-monitor_password';

      The variable mysql-monitor_username specifies the MySQL username that will be used to check if the back-end server is alive or not. The variable mysql-monitor_password points to the password that will be used when connecting to the back-end server. Use the password you created for the monitor username.
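
      If you want to double-check the values before loading them, you can read them back from the same table:

      • SELECT variable_name, variable_value FROM global_variables WHERE variable_name IN ('mysql-monitor_username', 'mysql-monitor_password');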

Every time you make a change in the ProxySQL admin interface, you need to run the appropriate LOAD command to apply the changes to the running ProxySQL instance. You changed MySQL global variables, so load them to RUNTIME to apply the changes:

      • LOAD MYSQL VARIABLES TO RUNTIME;

      Next, SAVE the changes to the on-disk database to persist changes between restarts. ProxySQL uses its own SQLite local database to store its own tables and variables:

      • SAVE MYSQL VARIABLES TO DISK;

      Now, you will tell ProxySQL about the back-end server. The table mysql_servers holds information about each back-end server where ProxySQL can connect and execute queries, so add a new record using a standard SQL INSERT statement with the following values for hostgroup_id, hostname, and port:

      • INSERT INTO mysql_servers(hostgroup_id, hostname, port) VALUES (1, '127.0.0.1', 3306);

      To apply the changes, run LOAD and SAVE again:

      • LOAD MYSQL SERVERS TO RUNTIME;
      • SAVE MYSQL SERVERS TO DISK;
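
      To verify that the record is now active, you can query the runtime copy of the table, which ProxySQL exposes as runtime_mysql_servers:

      • SELECT hostgroup_id, hostname, port, status FROM runtime_mysql_servers;

      A status of ONLINE indicates that the monitor user can reach the back-end server.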

      Finally, you will tell ProxySQL which user will connect to the back-end server; set sammy as the user, and replace sammy_password with the password you created earlier:

      • INSERT INTO mysql_users(username, password, default_hostgroup) VALUES ('sammy', 'sammy_password', 1);

      The table mysql_users holds information about users used to connect to the back-end servers; you specified the username, password, and default_hostgroup.

      LOAD and SAVE the changes:

      • LOAD MYSQL USERS TO RUNTIME;
      • SAVE MYSQL USERS TO DISK;

Then exit the mysql shell:

      • exit;

      To test that you can connect to your back-end server using ProxySQL, execute the following test query:

      • mysql -usammy -h127.0.0.1 -p -P6033 -e "SELECT @@HOSTNAME as hostname"

      In this command, you used the -e flag to execute a query and close the connection. The query prints the hostname of the back-end server.

      Note: ProxySQL uses port 6033 by default for listening to incoming connections.

      The output will look like this, with your_hostname replaced by your hostname:

      Output

+----------------------------+
      | hostname                   |
      +----------------------------+
      | your_hostname              |
      +----------------------------+

      To learn more about ProxySQL configuration, see Step 3 of How To Use ProxySQL as a Load Balancer for MySQL on Ubuntu 16.04.

So far, you have configured ProxySQL to use your MySQL server as a backend and verified that you can connect to that backend through ProxySQL. Now, you are ready to use mysqlslap to benchmark query performance without caching.

      Step 3 — Testing Using mysqlslap Without Caching

      In this step, you will download a test database so you can execute queries against it with mysqlslap to test the latency without caching, setting a benchmark for the speed of your queries. You will also explore how ProxySQL keeps records of queries in the stats_mysql_query_digest table.

      mysqlslap is a load emulation client that is used as a load testing tool for MySQL. It can test a MySQL server with auto-generated queries or with some custom queries executed on a database. It comes installed with the MySQL client package, so you do not need to install it; instead, you will download a database for testing purposes only, on which you can use mysqlslap.

In this tutorial, you will use a sample employee database, because it features a large data set that can illustrate differences in query optimization. The database has only six tables, but it holds more than 300,000 employee records; this will help you emulate a large-scale production workload.

To download the database, first clone the GitHub repository using this command:

      • git clone https://github.com/datacharmer/test_db.git

      Then enter the test_db directory and load the database into the MySQL server using these commands:

      • cd test_db
      • mysql -uroot -p < employees.sql

This command uses shell redirection to read the SQL statements in the employees.sql file and execute them on the MySQL server, creating the database structure and loading the data.

      You will see output like this:

      Output

INFO CREATING DATABASE STRUCTURE
      INFO storage engine: InnoDB
      INFO LOADING departments
      INFO LOADING employees
      INFO LOADING dept_emp
      INFO LOADING dept_manager
      INFO LOADING titles
      INFO LOADING salaries
      data_load_time_diff
      00:00:32

      Once the database is loaded into your MySQL server, test that mysqlslap is working with the following query:

      • mysqlslap -usammy -p -P6033 -h127.0.0.1 --auto-generate-sql --verbose

      mysqlslap has similar flags to the mysql client; here are the ones used in this command:

      • -u specifies the user used to connect to the server.
      • -p prompts for the user’s password.
      • -P connects using the specified port.
      • -h connects to the specified host.
      • --auto-generate-sql lets MySQL perform load testing using its own generated queries.
      • --verbose makes the output show more information.

      You will get output similar to the following:

      Output

Benchmark
              Average number of seconds to run all queries: 0.015 seconds
              Minimum number of seconds to run all queries: 0.015 seconds
              Maximum number of seconds to run all queries: 0.015 seconds
              Number of clients running queries: 1
              Average number of queries per client: 0

In this output, you can see the average, minimum, and maximum number of seconds spent executing all queries. This gives you an indication of how long the queries take for a given number of clients. In this output, only one client was used to execute queries.

      Next, find out what queries mysqlslap executed in the last command by looking at ProxySQL’s stats_mysql_query_digest. This will give us information like the digest of the queries, which is a normalized form of the SQL statement that can be referenced later to enable caching.

      Enter the ProxySQL admin interface with this command:

      • mysql -uadmin -p -h 127.0.0.1 -P6032

      Then execute this query to find information in the stats_mysql_query_digest table:

      • SELECT count_star,sum_time,hostgroup,digest,digest_text FROM stats_mysql_query_digest ORDER BY sum_time DESC;

      You will see output similar to the following:

      +------------+----------+-----------+--------------------+----------------------------------+
      | count_star | sum_time | hostgroup | digest             | digest_text                      |
      +------------+----------+-----------+--------------------+----------------------------------+
      | 1          | 598      | 1         | 0xF8F780C47A8D1D82 | SELECT @@HOSTNAME as hostname    |
      | 1          | 0        | 1         | 0x226CD90D52A2BA0B | select @@version_comment limit ? |
      +------------+----------+-----------+--------------------+----------------------------------+
      2 rows in set (0.01 sec)
      

      The previous query selects data from the stats_mysql_query_digest table, which contains information about all executed queries in ProxySQL. Here you have five columns selected:

      • count_star: The number of times this query was executed.
      • sum_time: Total time in milliseconds that this query took to execute.
      • hostgroup: The hostgroup used to execute the query.
      • digest: A digest of the executed query.
      • digest_text: The actual query. In this tutorial’s example, the second query is parameterized using ? marks in place of variable parameters. select @@version_comment limit 1 and select @@version_comment limit 2, therefore, are grouped together as the same query with the same digest.

Now that you know how to check query data in the stats_mysql_query_digest table, exit the mysql shell:

      • exit;

      The database you downloaded contains some tables with demo data. You will now test queries on the dept_emp table by selecting any records whose from_date is greater than 2000-04-20 and recording the average execution time.

      Use this command to run the test:

      • mysqlslap -usammy -P6033 -p -h127.0.0.1 --concurrency=100 --iterations=20 --create-schema=employees --query="SELECT * from dept_emp WHERE from_date>'2000-04-20'" --verbose

      Here you are using some new flags:

      • --concurrency=100: This sets the number of users to simulate, in this case 100.
      • --iterations=20: This causes the test to run 20 times and calculate results from all of them.
      • --create-schema=employees: Here you selected the employees database.
      • --query="SELECT * from dept_emp WHERE from_date>'2000-04-20'": Here you specified the query executed in the test.

      The test will take a few minutes. After it is done, you will get results similar to the following:

      Output

Benchmark
              Average number of seconds to run all queries: 18.117 seconds
              Minimum number of seconds to run all queries: 8.726 seconds
              Maximum number of seconds to run all queries: 22.697 seconds
              Number of clients running queries: 100
              Average number of queries per client: 1

Your numbers could be a little different. Keep them somewhere so you can compare them with the results after you enable caching.

      After testing ProxySQL without caching, it is time to run the same test again, but this time with caching enabled.

      Step 4 — Testing Using mysqlslap With Caching

In this step, you will enable caching to decrease latency when executing similar queries. You will identify the queries executed, take their digests from ProxySQL's stats_mysql_query_digest table, and use them to enable caching. Then, you will run the benchmark again to check the difference.

      To enable caching, you need to know the digests of the queries that will be cached. Log in to the ProxySQL admin interface using this command:

      • mysql -uadmin -p -h127.0.0.1 -P6032

      Then execute this query again to get a list of queries executed and their digests:

      • SELECT count_star,sum_time,hostgroup,digest,digest_text FROM stats_mysql_query_digest ORDER BY sum_time DESC;

      You will get a result similar to this:

      Output

+------------+-------------+-----------+--------------------+------------------------------------------+
      | count_star | sum_time    | hostgroup | digest             | digest_text                              |
      +------------+-------------+-----------+--------------------+------------------------------------------+
      | 2000       | 33727110501 | 1         | 0xC5DDECD7E966A6C4 | SELECT * from dept_emp WHERE from_date>? |
      | 1          | 601         | 1         | 0xF8F780C47A8D1D82 | SELECT @@HOSTNAME as hostname            |
      | 1          | 0           | 1         | 0x226CD90D52A2BA0B | select @@version_comment limit ?         |
      +------------+-------------+-----------+--------------------+------------------------------------------+
      3 rows in set (0.00 sec)

Look at the first row: it shows the query that was executed 2000 times, which is the benchmark query you ran previously. Take note of its digest; you will use it to add a query rule for caching.

The next query will add a new query rule to ProxySQL that matches the digest of the previous query and sets a cache_ttl value for it. cache_ttl is the number of milliseconds that the result will be cached in memory:

      • INSERT INTO mysql_query_rules(active, digest, cache_ttl, apply) VALUES(1,'0xC5DDECD7E966A6C4',2000,1);

In this command you are adding a new record to the mysql_query_rules table; this table holds all the rules applied before executing a query. Here you set the cache_ttl column so that any query matching the given digest will have its result cached for that many milliseconds. You put 1 in the apply column to make sure that the rule is applied to queries.
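
      Before loading the rule, you can optionally read it back to confirm that the digest and TTL were recorded as expected:

      • SELECT rule_id, active, digest, cache_ttl, apply FROM mysql_query_rules;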

      LOAD and SAVE these changes, then exit the mysql shell:

      • LOAD MYSQL QUERY RULES TO RUNTIME;
      • SAVE MYSQL QUERY RULES TO DISK;
      • exit;

Now that caching is enabled, re-run the test to check the result:

      • mysqlslap -usammy -P6033 -p -h127.0.0.1 --concurrency=100 --iterations=20 --create-schema=employees --query="SELECT * from dept_emp WHERE from_date>'2000-04-20'" --verbose

      This will give output similar to the following:

      Output

Benchmark
              Average number of seconds to run all queries: 7.020 seconds
              Minimum number of seconds to run all queries: 0.274 seconds
              Maximum number of seconds to run all queries: 23.014 seconds
              Number of clients running queries: 100
              Average number of queries per client: 1

Here you can see the big difference in average execution time: it dropped from 18.117 seconds to 7.020 seconds.
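
      If you reconnect to the ProxySQL admin interface, you can also see the cache working from ProxySQL's side. The stats_mysql_global table exposes Query_Cache counters; treat this as a sketch, since the exact counter names can vary slightly between ProxySQL versions:

      • SELECT * FROM stats_mysql_global WHERE variable_name LIKE 'Query_Cache%';

      A non-zero Query_Cache_count_GET_OK value indicates requests that were answered directly from the cache without reaching MySQL.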

      Conclusion

      In this article, you set up transparent caching with ProxySQL to cache database query results. You also tested the query speed with and without caching to see the difference that caching can make.

      You’ve used one level of caching in this tutorial. You could also try, web caching, which sits in front of a web server and caches the responses to similar requests, sending the response back to the client without hitting the back-end servers. This is very similar to ProxySQL caching but at a different level. To learn more about web caching, check out our Web Caching Basics: Terminology, HTTP Headers, and Caching Strategies primer.

      MySQL server also has its own query cache; you can learn more about it in our How To Optimize MySQL with Query Cache on Ubuntu 18.04 tutorial.


