
      How To Set Up a JupyterLab Environment on Ubuntu 18.04


      The author selected the United Nations Foundation to receive a donation as part of the Write for DOnations program.

      Introduction

      JupyterLab is a feature-rich UI that makes it easy for users, particularly in the fields of Data Science and AI, to perform their tasks. The JupyterLab environment is a productivity-focused redesign of Jupyter Notebook: it introduces tools such as a built-in HTML viewer and a CSV viewer, and it unifies several discrete features of Jupyter Notebook onto the same screen.

      In this tutorial, you’ll install and set up JupyterLab on your Ubuntu 18.04 server. You’ll also configure your server so that you can connect to the JupyterLab instance securely from any web browser, using a domain name.

      Prerequisites

      In order to complete this tutorial, you’ll need:

      • An Ubuntu 18.04 server with a non-root user account with sudo privileges, set up by following this Initial Server Setup Guide.
      • An installation of the Python Anaconda Distribution on your server. You can use the How To Install the Anaconda Python Distribution on Ubuntu 18.04 tutorial.
      • A registered domain name or sub-domain where you have access to edit DNS records. This tutorial will use your_domain throughout. You can purchase domains on Namecheap, get a free domain at Freenom, or register a new domain with any registrar of your choice.
      • The following DNS records set up for your domain:
        • An A record with your_domain pointing to your server’s public IP address.
        • An A record with www.your_domain pointing to your server’s public IP address.
          This How to Create, Edit, and Delete DNS Records documentation can help you in setting up these records.

      Step 1 — Setting Up Your Password

      In this step you’ll set up a password on your JupyterLab installation. It is important to have a password in place since your instance will be publicly accessible.

      First, make sure your Anaconda environment is activated. As per the prerequisite tutorial, the environment is called base.

      To activate the environment, use the following command:
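
      • conda activate base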

      Your terminal prompt will now begin with (base), reflecting the activated default Anaconda environment.

      All future commands in this tutorial will be run within the base environment.

      With your Anaconda environment activated, you’re ready to set up a password for JupyterLab on your server.

      First, let’s generate a configuration file for Jupyter:

      • jupyter notebook --generate-config

      You’ll receive the following output:

      Output

      Writing default config to: /home/sammy/.jupyter/jupyter_notebook_config.py

      Both JupyterLab and Jupyter Notebook share the same configuration file.

      Now, use the following command to set a password for accessing your JupyterLab instance remotely:

      • jupyter notebook password

      Jupyter will prompt you to provide a password of your choice:

      Output

      Enter password:
      Verify password:
      [NotebookPasswordApp] Wrote hashed password to /home/sammy/.jupyter/jupyter_notebook_config.json

      Jupyter stores the password in a hashed format at /home/sammy/.jupyter/jupyter_notebook_config.json. You’ll need this hashed value in the future.

      Finally, use the cat command on the file generated by the previous command to view the hashed password:

      • cat /home/sammy/.jupyter/jupyter_notebook_config.json

      You’ll receive an output similar to the following:

      /home/sammy/.jupyter/jupyter_notebook_config.json

      {
        "NotebookApp": {
          "password": "sha1:your_hashed_password"
        }
      }
      

      Copy the value in the password key of the JSON and store it temporarily.

      You’ve set up a password for your JupyterLab instance. In the next step you’ll create a Let’s Encrypt certificate for your server.

      Step 2 — Configuring Let’s Encrypt

      In this step, you’ll create a Let’s Encrypt certificate for your domain. This will secure your data when you access your JupyterLab environment from your browser.

      First, you’ll install Certbot to your server. Begin by adding its repository to the apt sources:

      • sudo add-apt-repository ppa:certbot/certbot

      On executing the command, you’ll be asked to press ENTER to complete adding the PPA:

      Output

      This is the PPA for packages prepared by Debian Let's Encrypt Team and backported for Ubuntu.

      Note: Packages are only provided for currently supported Ubuntu releases. More info: https://launchpad.net/~certbot/+archive/ubuntu/certbot

      Press [ENTER] to continue or Ctrl-c to cancel adding it.

      Press ENTER to continue adding the PPA.

      Once the command has finished executing, refresh the apt sources using the apt update command:
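
      • sudo apt update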

      Next, you’ll install Certbot:
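
      • sudo apt install certbot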

      Before you run Certbot to generate certificates for your instance, you’ll allow access on port :80 and port :443 of your server so that Certbot can use these ports to verify your domain name. Port :80 handles http requests to the server, while port :443 is used for https requests. Certbot will make an http request first and then, after obtaining the certificates for your server, an https request, which verifies that your certificate installation succeeded.

      First, allow access to port :80:
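
      • sudo ufw allow 80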

      You will receive the following output:

      Output

      Rule added
      Rule added (v6)

      Next, allow access to port :443:
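
      • sudo ufw allow 443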

      Output

      Rule added
      Rule added (v6)

      Finally, run Certbot to generate certificates for your instance using the following command:

      • sudo certbot certonly --standalone

      The --standalone flag directs Certbot to run a temporary web server for the duration of the verification process.

      It will prompt you for your email:

      Output

      Saving debug log to /var/log/letsencrypt/letsencrypt.log
      Plugins selected: Authenticator standalone, Installer None
      Enter email address (used for urgent renewal and security notices)
       (Enter 'c' to cancel): your_email

      Enter a working email and press ENTER.

      Next, it will ask you to review and agree to the Terms of Service for Certbot and Let’s Encrypt. Review the terms, type A if you accept, and press ENTER:

      Output

      - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
      Please read the Terms of Service at
      https://letsencrypt.org/documents/LE-SA-v1.2-November-15-2017.pdf. You must
      agree in order to register with the ACME server at
      https://acme-v02.api.letsencrypt.org/directory
      - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
      (A)gree/(C)ancel: A

      It will now prompt you to share your email with the Electronic Frontier Foundation. Type your answer and press ENTER:

      Output

      - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
      Would you be willing to share your email address with the Electronic Frontier
      Foundation, a founding partner of the Let's Encrypt project and the non-profit
      organization that develops Certbot? We'd like to send you email about our work
      encrypting the web, EFF news, campaigns, and ways to support digital freedom.
      - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
      (Y)es/(N)o: Y/N

      Finally, you’ll be asked to enter your domain name. Type in your domain name without any protocol specification:

      Output

      Please enter in your domain name(s) (comma and/or space separated)  (Enter 'c'
      to cancel): your_domain
      Obtaining a new certificate
      Performing the following challenges:
      http-01 challenge for your_domain
      Waiting for verification...
      Cleaning up challenges

      IMPORTANT NOTES:
       - Congratulations! Your certificate and chain have been saved at:
         /etc/letsencrypt/live/your_domain/fullchain.pem
         Your key file has been saved at:
         /etc/letsencrypt/live/your_domain/privkey.pem
         Your cert will expire on 2020-09-28. To obtain a new or tweaked
         version of this certificate in the future, simply run certbot
         again. To non-interactively renew *all* of your certificates, run
         "certbot renew"
       - Your account credentials have been saved in your Certbot
         configuration directory at /etc/letsencrypt. You should make a
         secure backup of this folder now. This configuration directory will
         also contain certificates and private keys obtained by Certbot so
         making regular backups of this folder is ideal.
       - If you like Certbot, please consider supporting our work by:
         Donating to ISRG / Let's Encrypt:   https://letsencrypt.org/donate
         Donating to EFF:                    https://eff.org/donate-le

      Certbot will perform domain verification and generate certificates and keys for your domain and store them at /etc/letsencrypt/live/your_domain.

      Now that you have set up your Let’s Encrypt certificate, you’ll update your JupyterLab configuration file.

      Step 3 — Configuring JupyterLab

      In this step, you will edit the JupyterLab configuration to make sure it uses the Let’s Encrypt certificate you generated in Step 2. You will also make it accessible using the password you set up in Step 1.

      First, you need to edit the JupyterLab configuration file at /home/sammy/.jupyter/jupyter_notebook_config.py:

      • nano /home/sammy/.jupyter/jupyter_notebook_config.py

      Now, navigate to the line defining the value for c.NotebookApp.certfile and update it as follows:

      /home/sammy/.jupyter/jupyter_notebook_config.py

      ...
      ## The full path to an SSL/TLS certificate file.
      c.NotebookApp.certfile = "/etc/letsencrypt/live/your_domain/fullchain.pem"
      ...
      

      Next, find the c.NotebookApp.keyfile variable and set it as shown:

      /home/sammy/.jupyter/jupyter_notebook_config.py

      ...
      ## The full path to a private key file for usage with SSL/TLS.
      c.NotebookApp.keyfile = "/etc/letsencrypt/live/your_domain/privkey.pem"
      ...
      

      c.NotebookApp.certfile and c.NotebookApp.keyfile point to the SSL certificate and private key that will be served when you access your server remotely over the https protocol.

      Next, navigate to the line defining the c.NotebookApp.ip variable and update as follows:

      /home/sammy/.jupyter/jupyter_notebook_config.py

      ...
      ## The IP address the notebook server will listen on.
      c.NotebookApp.ip = '*'
      ...
      

      c.NotebookApp.ip defines the IP addresses the notebook server listens on. Setting it to the * wildcard makes JupyterLab reachable from any computer you need to access it from.

      Next, find the c.NotebookApp.open_browser configuration and update it as follows:

      /home/sammy/.jupyter/jupyter_notebook_config.py

      ...
      ## Whether to open in a browser after starting. The specific browser used is
      #  platform dependent and determined by the python standard library `webbrowser`
      #  module, unless it is overridden using the --browser (NotebookApp.browser)
      #  configuration option.
      c.NotebookApp.open_browser = False
      ...
      

      By default, JupyterLab attempts to automatically initiate a browser session when it starts running. Since we do not have a browser on the remote server, it is necessary to turn this off to avoid errors.

      Next, navigate down to the c.NotebookApp.password variable and change to the following:

      /home/sammy/.jupyter/jupyter_notebook_config.py

      ...
      ## Hashed password to use for web authentication.
      #
      #  To generate, type in a python/IPython shell:
      #
      #    from notebook.auth import passwd; passwd()
      #
      #  The string should be of the form type:salt:hashed-password.
      c.NotebookApp.password = 'your_hashed_password'
      ...
      

      JupyterLab will use this hashed password configuration to check the password you enter for access in your browser.

      Finally, navigate further through the file and update the c.NotebookApp.port entry:

      /home/sammy/.jupyter/jupyter_notebook_config.py

      ...
      ## The port the notebook server will listen on.
      c.NotebookApp.port = 9000
      ...
      

      c.NotebookApp.port sets a fixed port for accessing your JupyterLab runtime. This way, you only need to allow access to a single port in the ufw firewall.

      Once you’re done, save and exit the file.

      Finally, allow traffic on port 9000:
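
      • sudo ufw allow 9000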

      You’ll receive the following output:

      Output

      Rule added
      Rule added (v6)

      Now that you’ve set all your configuration, you’ll run JupyterLab.

      Step 4 — Running JupyterLab

      In this step, you’ll perform a test run of the JupyterLab instance.

      First, change your current working directory to the user’s home directory:
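
      • cd ~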

      Now, modify the access permissions of the certificate files to allow JupyterLab to access them. Change the permissions of the /etc/letsencrypt folder to the following:

      • sudo chmod 750 -R /etc/letsencrypt
      • sudo chown sammy:sammy -R /etc/letsencrypt

      Then, invoke your JupyterLab instance to start using the following command:
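
      • jupyter lab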

      This command accepts several configuration parameters; however, since you have already set them in the configuration file, you do not need to provide them here explicitly. If you do pass them as arguments, they will override the values in the configuration file.

      You can now navigate to https://your_domain:9000 to check that you receive JupyterLab’s login screen.

      If you log in with the password you set up for JupyterLab in Step 1, you’ll be presented with the JupyterLab interface.

      JupyterLab interface after login

      Finally, press CTRL+C twice to stop the JupyterLab server.

      In the next step, you’ll set up a system service so that the JupyterLab server can be run in the background continuously.

      Step 5 — Setting Up a systemd Service

      In this step, you will create a systemd service that allows JupyterLab to keep running even when the terminal window is exited. You can read more about systemd services and units in this guide on systemd essentials.

      First, you’ll have to create a .service file, using the following command:

      • sudo nano /etc/systemd/system/jupyterlab.service

      Add the following content to the /etc/systemd/system/jupyterlab.service file:

      /etc/systemd/system/jupyterlab.service

      [Unit]
      Description=Jupyter Lab Server
      
      [Service]
      User=sammy
      Group=sammy
      Type=simple
      WorkingDirectory=/home/sammy/
      ExecStart=/home/sammy/anaconda3/bin/jupyter-lab --config=/home/sammy/.jupyter/jupyter_notebook_config.py
      StandardOutput=null
      Restart=always
      RestartSec=10
      
      [Install]
      WantedBy=multi-user.target
      

      Save and exit the editor once you’re done.

      The service file automatically registers itself in the system as a daemon. However, it is not running by default.

      Use the systemctl command to start the service:

      • sudo systemctl start jupyterlab

      This starts the JupyterLab server in the background. You can check if the server has started by using the following command:

      • sudo systemctl status jupyterlab

      You’ll receive the following output:

      Output

      ● jupyterlab.service - Jupyter Lab Server
         Loaded: loaded (/etc/systemd/system/jupyterlab.service; disabled; vendor preset: enabled)
         Active: active (running) since Sun 2020-04-26 20:58:29 UTC; 5s ago
       Main PID: 5654 (jupyter-lab)
          Tasks: 1 (limit: 1152)
         CGroup: /system.slice/jupyterlab.service
                 └─5654 /home/sammy/anaconda3/bin/python3.7 /home/sammy/anaconda3/bin/jupyter-lab --config=/home/

      Press Q to exit the service status output.

      You can now head to https://your_domain:9000 in any browser of your choice, provide the password you set up in Step 1, and access the JupyterLab environment running on your server.

      Step 6 — Configuring Renewal of Your Let’s Encrypt Certificate

      In this final step, you will configure your Let’s Encrypt SSL certificates, which expire every 90 days, to renew automatically and then restart the server to load the new certificates.

      While Certbot takes care of renewing the certificates for your installation, it does not automatically restart the server. To configure the server to restart with the new certificates, you will have to provide a renew_hook in the Certbot configuration for your domain.

      You’ll need to edit the /etc/letsencrypt/renewal/your_domain.conf file and add a renew_hook to the end of the configuration file.

      First, use the following command to open the /etc/letsencrypt/renewal/your_domain.conf file in an editor:

      • sudo nano /etc/letsencrypt/renewal/your_domain.conf

      Then, at the bottom of this file, add the following line:

      /etc/letsencrypt/renewal/your_domain.conf

      ...
      renew_hook = systemctl reload jupyterlab
      

      Save and exit the file.

      Finally, run a dry-run of the renewal process to verify that your configuration file is valid:

      • sudo certbot renew --dry-run

      If the command runs without any errors, your Certbot renewal has been set up successfully and will automatically renew the certificates and restart your server when they are near their expiry date.

      Conclusion

      In this article, you set up a JupyterLab environment on your server and made it accessible remotely. Now you can access your machine learning or data science projects from any browser and rest assured that all exchanges are happening with SSL encryption in place. Along with that, your environment has all the benefits of cloud-based servers.




      How To Set Up a Ruby on Rails GraphQL API


      The author selected Free Press to receive a donation as part of the Write for DOnations program.

      Introduction

      GraphQL is a strongly typed query language for APIs and a server-side runtime for executing those queries with your existing data. GraphQL allows clients to fetch multiple resources from the server in a single request by giving clients the ability to specify the exact data needed in the query. This removes the need for multiple API calls. GraphQL is language and database independent, and thus can be implemented in almost every programming language alongside any database of choice.

      In this tutorial, you will build a GraphQL-powered Ruby on Rails API for taking notes. When you are finished, you will be able to create and view notes from the API using GraphQL.

      GraphiQL IDE

      If you would like to take a look at the code for this tutorial, check out the companion repository for this tutorial on the DigitalOcean Community GitHub.

      Prerequisites

      To follow this tutorial, you’ll need:

      • The Ruby programming language and the Ruby on Rails framework installed on your development machine. This tutorial was tested on version 2.6.3 of Ruby and version 6.0.2.1 of Rails, so make sure to specify these versions when following an installation tutorial for your platform.
      • PostgreSQL installed. To follow this tutorial, use PostgreSQL version 11.2, which you can install by following Steps 1 and 2 of How To Use PostgreSQL with Your Ruby on Rails Application on Ubuntu 18.04 or on macOS.

      Step 1 — Setting Up a New Rails API Application

      In this step, you will set up a new Rails API application and connect it to a PostgreSQL database. This will serve as the foundation for the note-taking API.

      Rails provides commands that make building modern web applications faster for developers. These commands can perform actions that range from creating a new Rails application to generating files required for app development. For a full list of these commands and what they do, run the following command in your terminal window:
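
      • rails -h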

      This command yields an extensive list of options you can use to set the parameters of your application. One of the commands listed is the new command, which accepts an APP_PATH and creates a new Rails application at the specified path.

      Create a new Rails application using the new generator. Run the following command in your terminal window:

      • rails new rails_graphql -d=postgresql -T --api

      This creates a new Rails application in a directory named rails_graphql and installs the required dependencies. Let’s go over the flags associated with the new command:

      • The -d flag pre-configures the application with the specified database.
      • The -T flag instructs Rails not to generate test files, since you won’t be writing tests in this tutorial. You can also use this flag if you plan to use a testing framework other than the one provided by Rails.
      • The --api flag configures a Rails application with only the files required for building an API with Rails. It skips configuring settings needed for browser applications.

      Once the command is done running, switch to the newly created rails_graphql directory, which is the application’s root directory:
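
      • cd rails_graphql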

      Now that you have successfully set up a new Rails API application, you have to connect it to a database before you can run the app. Rails provides a database.yml file found in config/database.yml, which contains configurations for connecting your app to a different database for different development environments. Rails specifies a database name for different development environments by appending an underscore (_) followed by the environment name to your app’s name. You can always change any environment database name to whatever you choose.

      Note: You can alter config/database.yml to choose the PostgreSQL role you would like Rails to use to create your database. If you created a role that is secured by a password, follow the instructions in Step 4 of How To Use PostgreSQL with Your Ruby on Rails Application on Ubuntu 18.04 or How To Use PostgreSQL with Your Ruby on Rails Application on macOS to configure your role.

      Rails includes commands for creating and working with databases. With your database credentials in place, run the following command in your terminal window to create your databases:
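
      • rails db:create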

      The db:create command creates a development and test database based on the information provided in the config/database.yml file. Running the command yields the following output:

      Output

      Created database 'rails_graphql_development'
      Created database 'rails_graphql_test'

      With your application now successfully connected to a database, you can test the application to ensure it works. Start your server with the following command if you are working locally:
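
      • rails server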

      If you are working on a development server, you can start your application by specifying the IP address the server should bind to:

      • bundle exec rails server --binding=your_server_ip

      Note: The server listens on port 3000. If you’re working on a development server, ensure that you have opened port 3000 in your firewall to allow connections.

      The rails server command launches Puma, a web server for Ruby distributed with Rails. The --binding=your_server_ip flag binds the server to any IP you provide.

      Once you run this command, your command prompt will be replaced with the following output:

      Output

      => Booting Puma
      => Rails 6.0.2.1 application starting in development
      => Run `rails server --help` for more startup options
      Puma starting in single mode...
      * Version 4.3.1 (ruby 2.6.3-p62), codename: Mysterious Traveller
      * Min threads: 5, max threads: 5
      * Environment: development
      * Listening on tcp://127.0.0.1:3000
      * Listening on tcp://[::1]:3000
      Use Ctrl-C to stop

      To run your application, navigate to localhost:3000 or http://your_server_ip:3000 in your browser. You’ll see the Rails default welcome page:

      Rails welcome page

      The welcome page means you have properly set up your Rails application.

      To stop the server, press CTRL+C in the terminal window where the server is running.

      You have successfully set up a Rails API application for a note-taking API. In the next step, you will set up your Rails API application to receive and execute GraphQL queries.

      Step 2 — Setting Up GraphQL for Rails

      In this step, you will configure your Rails API application to work with GraphQL. You will install and set up the necessary gems required for GraphQL development in Rails.

      As previously mentioned, GraphQL is language agnostic and is implemented in many programming languages. The graphql-ruby gem is the Ruby implementation for GraphQL. GraphQL also provides an interactive in-browser IDE known as GraphiQL for running GraphQL queries. The graphiql-rails gem helps you add GraphiQL to your development environment.

      To install these dependencies, open the project’s Gemfile for editing, using nano or your favorite text editor:
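
      • nano Gemfile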

      Add the graphql and graphiql-rails gems to your Gemfile. You can add the graphql gem anywhere, but the graphiql-rails gem should be added under the development dependencies:

      ~/rails_graphql/Gemfile

      ...
      group :development do
        gem 'listen', '>= 3.0.5', '< 3.2'
        # Spring speeds up development by keeping your application running in the background. Read more: https://github.com/rails/spring
        gem 'spring'
        gem 'spring-watcher-listen', '~> 2.0.0'
        gem 'graphiql-rails'
      end
      
      gem 'graphql', '1.9.18'
      ...
      

      Save and close the file when you are done adding the gems.

      In your terminal window, use the following command to install the gems:
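
      • bundle install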

      The output shows that the gems are installed.

      The graphql gem provides generators to create various files. To view the available generators, run the following command in your terminal window:
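
      • rails generate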

      The generators prefixed with graphql: are the ones associated with the graphql gem.

      You will use the graphql:install command to add graphql-ruby boilerplate code to the application and mount GraphiQL in your development environment. The boilerplate code includes all the files and directories needed for the graphql-ruby gem to work with Rails.

      In your terminal window, run the following command:
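
      • rails generate graphql:install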

      This command generates several files, including a graphql_controller.rb file located at app/controllers/graphql_controller.rb and a graphql directory at app/graphql which contains files required to get started with GraphQL in Rails. It also adds a /graphql HTTP POST route in the routes file located at config/routes.rb. This route is mapped to the app/controllers/graphql_controller.rb#execute method which handles all queries to the GraphQL server.

      Before you can test the GraphQL endpoint, you need to mount the GraphiQL engine in the routes file so you can access the GraphiQL in-browser IDE. To do this, open the routes file located at config/routes.rb:

      • nano ~/rails_graphql/config/routes.rb

      Add the following code to the file to mount the GraphiQL engine in the development environment:

      ~/rails_graphql/config/routes.rb

      Rails.application.routes.draw do
        if Rails.env.development?
          mount GraphiQL::Rails::Engine, at: "/graphiql", graphql_path: "graphql#execute"
        end
        post "/graphql", to: "graphql#execute"
        # For details on the DSL available within this file, see https://guides.rubyonrails.org/routing.html
      end
      

      This mounts the GraphiQL engine to the /graphiql path and directs all queries to the graphql#execute method.

      Since this is an API application created with the --api flag, it does not expect to render any page in the browser. To make the GraphiQL editor show up in the browser, you need to make a couple of small changes to your application’s configuration.

      First, open the application.rb file located at config/application.rb:

      • nano ~/rails_graphql/config/application.rb

      Next, uncomment the require "sprockets/railtie" line:

      ~/rails_graphql/config/application.rb

      require_relative 'boot'
      
      require "rails"
      # Pick the frameworks you want:
      require "active_model/railtie"
      require "active_job/railtie"
      require "active_record/railtie"
      require "active_storage/engine"
      require "action_controller/railtie"
      require "action_mailer/railtie"
      require "action_mailbox/engine"
      require "action_text/engine"
      require "action_view/railtie"
      require "action_cable/engine"
      require "sprockets/railtie"
      # require "rails/test_unit/railtie"
      
      ...
      

      Save and close the file after uncommenting the line.

      Now create a config directory at app/assets:

      • mkdir -p app/assets/config

      Next, create a manifest.js file in the newly created config directory. The manifest.js file specifies additional assets to be compiled and made available to the browser:

      • nano app/assets/config/manifest.js

      Add the following code to the file which tells Rails to precompile the graphiql/rails/application.css and graphiql/rails/application.js files so Rails can serve them to your browser:

      ~/rails_graphql/app/assets/config/manifest.js

      //= link graphiql/rails/application.css
      //= link graphiql/rails/application.js
      

      Save and close the file.

      With that done, you can test your GraphQL endpoint. Restart your development server, and in your browser, navigate to localhost:3000/graphiql or http://your_server_ip:3000/graphiql. The GraphiQL query editor displays in your browser:

      GraphiQL IDE

      The left side of the GraphiQL IDE accepts GraphQL queries and the right side displays the results of the executed query. The GraphiQL query editor also has a syntax highlighter and a typeahead hinter powered by your GraphQL schema. Together, these help you write valid queries.

      To try a Hello World example, clear out the default text in the editor’s left pane and type in the following query:

      query {
          testField
      }
      

      Click the Play icon button in the header and you’ll receive a successful response on the screen, as shown in the following figure:

      GraphiQL IDE successful response

      You have successfully set up your Rails API application to work with GraphQL and tested your GraphQL endpoint to confirm it works. In the next step, you will create GraphQL types for your application.

      Step 3 — Creating Types for the Application

      GraphQL depends on its Types and Schema to validate and respond to queries. In this step, you will create a Note model and the GraphQL types required in your note-taking API.

      A GraphQL type consists of fields and arguments which, in turn, define the fields and arguments that can appear in any GraphQL query that operates on that type. These types make up a GraphQL Schema. GraphQL defines the following types:

      • The Query and Mutation types: These are special types that define the entry point of every GraphQL query. Every GraphQL service has a query type and may or may not have a mutation type.
      • Object types: These are the basic components of a GraphQL schema. These represent the objects you can fetch from a GraphQL service and the fields each object holds.
      • Scalar types: These are default types that come with GraphQL out of the box. They include Int, Float, String, Boolean, and ID.
      • Enumeration types: These are types that define a particular set of allowed values.
      • Input types: These are similar to object types, with the only difference being that they define objects that you can pass to queries as arguments.

      There are other types, including Union, List, Non-Null, and Interface. You can find a list of available GraphQL types in the official GraphQL documentation.

      For this application, you will create a Note model and a Note object and input type. The Note model will represent the database table that will store your notes, while the Note object and input type will define the fields and arguments that exist on a Note object.

      First, create a Note model using the generate model subcommand provided by Rails and specify the name of the model along with its columns and data types. Run the following command in your terminal window:

      • rails generate model note title:string:index body:text

      This command creates a Note model with two fields: title, with the type string, and body, with the type text. The command also adds a database index on the title column. It generates these two files:

      • A note.rb file located at app/models/note.rb. This file will hold all model-related logic.
      • A 20200617173228_create_notes.rb file (the number at the beginning of the file will differ, depending on the date you run the command) located at db/migrate/20200617173228_create_notes.rb. This is a migration file that holds the instruction for creating a corresponding notes table in the database.

      To execute the instructions in the migration file, you’ll use the db:migrate subcommand, which runs the instructions in your migration files. Run the following command in your terminal window:
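
      • rails db:migrate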

      Once the command runs successfully, you will see output similar to the following:

      Output

      == 20200617173228 CreateNotes: migrating ======================================
      -- create_table(:notes)
         -> 0.0134s
      -- add_index(:notes, :title)
         -> 0.0073s
      == 20200617173228 CreateNotes: migrated (0.0208s) =============================

      With the Note model in place, next you’ll create a NoteType. A valid note object is expected to have an id, a title, and a body. Run the following command in your terminal window to create a NoteType:

      • rails generate graphql:object Note id:ID! title:String! body:String!

      The command instructs Rails to create a GraphQL object type called Note with three fields: an id field with a type of ID, and the title and body fields, each with a String type. The exclamation point (!) appended to the field type indicates that the field should be non-nullable, meaning that the field should never return a null value. Non-nullable fields are important, as they serve as a form of validation that guarantees which fields must be present whenever GraphQL objects are queried.

      Running the preceding command creates a note_type.rb file located at app/graphql/types/note_type.rb containing a Types::NoteType class with three non-nullable fields.
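
      As a reference, the generated note_type.rb file should look similar to the following sketch. The exact boilerplate can vary slightly between versions of the graphql gem, so treat this as illustrative rather than something to copy verbatim:

      ~/rails_graphql/app/graphql/types/note_type.rb

      module Types
        class NoteType < Types::BaseObject
          field :id, ID, null: false
          field :title, String, null: false
          field :body, String, null: false
        end
      end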

      Lastly, you will create a NoteInput type to define the arguments required to create a note. Start by creating an input directory under app/graphql/types. The input directory will house input types:

      • mkdir ~/rails_graphql/app/graphql/types/input

      Note: It’s not a requirement to create input types in the input directory; it is merely a common convention. You can decide to keep all your types under the types directory and exclude nesting the class under an Input module whenever you’re accessing it.

      In the ~/rails_graphql/app/graphql/types/input directory, create a note_input_type.rb file:

      • nano ~/rails_graphql/app/graphql/types/input/note_input_type.rb

      Add the following code to the file to define the fields for the Input type:

      ~/rails_graphql/app/graphql/types/input/note_input_type.rb

      module Types
        module Input
          class NoteInputType < Types::BaseInputObject
            argument :title, String, required: true
            argument :body, String, required: true
          end
        end
      end
      

      In the note_input_type.rb file, you added a Types::Input::NoteInputType class that inherits from the Types::BaseInputObject class and accepts two required arguments: title and body, both of the String type.

      You’ve created a model and two GraphQL types for your note-taking app. In the next step, you will create queries to fetch existing notes.

      Step 4 — Creating Queries for the Application

      Your GraphQL-powered API is gradually coming together. In this step you’ll create two queries: one to fetch a single note by id and another to fetch all notes. The GraphQL query type handles the fetching of data and can be likened to a GET request in REST.

      First, you’ll create a query to fetch all notes. To start, create a queries directory to house all queries:

      • mkdir ~/rails_graphql/app/graphql/queries

      In the app/graphql/queries directory, create a base_query.rb file from which all other query classes will inherit:

      • nano ~/rails_graphql/app/graphql/queries/base_query.rb

      Add the following code to the base_query.rb file to create a BaseQuery class that other query classes will inherit from:

      ~/rails_graphql/app/graphql/queries/base_query.rb

      module Queries
        class BaseQuery < GraphQL::Schema::Resolver
        end
      end
      

      In the base_query.rb file, you added a Queries::BaseQuery class that inherits from the GraphQL::Schema::Resolver class. The GraphQL::Schema::Resolver class is a container that can hold logic belonging to a field. It can be attached to a field with the resolver: keyword.

      The Queries::BaseQuery class can also contain any code you intend to reuse across multiple query classes.

      Next, create a fetch_notes.rb file in the queries directory. This file will hold the logic for fetching all existing notes, and will be attached to a field in the query type file:

      • nano ~/rails_graphql/app/graphql/queries/fetch_notes.rb

      Add the following code to the file to define the return object type and resolve the requested notes:

      ~/rails_graphql/app/graphql/queries/fetch_notes.rb

      module Queries
        class FetchNotes < Queries::BaseQuery
      
          type [Types::NoteType], null: false
      
          def resolve
            Note.all.order(created_at: :desc)
          end
        end
      end
      

      In the fetch_notes.rb file, you created a Queries::FetchNotes class that inherits from the Queries::BaseQuery class you created previously. The class declares that the data returned by this query should be an array of the already created NoteType.

      The Queries::FetchNotes class also contains a resolve method that returns an array of all existing notes sorted by their creation date in descending order.

      The FetchNotes query is ready to receive and return requests for notes, but GraphQL is still unaware of its existence. To fix that, open the GraphQL query type file located at app/graphql/types/query_type.rb:

      • nano ~/rails_graphql/app/graphql/types/query_type.rb

      The query_type.rb file is the entry point for all GraphQL query types. It holds the query fields, and their respective resolver methods. Replace the sample code in the file with the following:

      ~/rails_graphql/app/graphql/types/query_type.rb

      module Types
        class QueryType < Types::BaseObject
          # Add root-level fields here.
          # They will be entry points for queries on your schema.
      
          field :fetch_notes, resolver: Queries::FetchNotes
        end
      end
      

      In the query_type.rb file, you added a fetch_notes field and attached it to the Queries::FetchNotes class using the resolver: keyword. This way, whenever the fetch_notes query is called, it executes the logic in the resolve method of the Queries::FetchNotes class.

      In order to test your query, you need some data to fetch, but you currently don’t have any notes in your database. You can fix that by adding some seed data to your database. Open the seeds.rb file located at db/seeds.rb:

      • nano ~/rails_graphql/db/seeds.rb

      Add the following code to the file to create five notes:

      ~/rails_graphql/db/seeds.rb

      5.times do |i|
        Note.create(title: "Note #{i + 1}", body: 'Lorem ipsum saves lives')
      end
      

      Save and close the file after adding the code.

      Open your project’s root directory in another terminal window and run the following command to execute the code in the seeds.rb file:
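
      • rails db:seed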

      This creates 5 notes in the database.

      With data in your database, and your development server running, navigate to localhost:3000/graphiql or http://your_server_ip:3000/graphiql in your browser to open your GraphiQL IDE. In the left side of the editor, type in the following query:

      query {
        fetchNotes {
          id
          title
          body
        }
      }
      

      This GraphQL query declares a query operation, indicating you want to make a query request. In the query operation, you called a fetchNotes field that matches the fetch_notes query field declared in the API, and included the fields on a note that you want to be returned in your response.

      Click the Play icon button in the header. You’ll see a response similar to the following in the output pane:

      {
        "data": {
          "fetchNotes": [
            {
              "id": "5",
              "title": "Note 5",
              "body": "Lorem ipsum saves lives"
            },
            {
              "id": "4",
              "title": "Note 4",
              "body": "Lorem ipsum saves lives"
            },
            {
              "id": "3",
              "title": "Note 3",
              "body": "Lorem ipsum saves lives"
            },
            {
              "id": "2",
              "title": "Note 2",
              "body": "Lorem ipsum saves lives"
            },
            {
              "id": "1",
              "title": "Note 1",
              "body": "Lorem ipsum saves lives"
            }
          ]
        }
      }
      

      The response contains an array of 5 notes that match the fields declared in the query on the left. If you remove some fields in the query on the left side of the editor and re-run the query, you get a response with only the fields you requested. That’s the power of GraphQL.

      Next, you’ll create another query to fetch notes by id. This query will be similar to the fetch_notes query, except that it accepts an id argument. Go ahead and create a fetch_note.rb file in the queries directory:

      • nano ~/rails_graphql/app/graphql/queries/fetch_note.rb

      Add the following code to the file to find and return a note with the provided id:

      ~/rails_graphql/app/graphql/queries/fetch_note.rb

      module Queries
        class FetchNote < Queries::BaseQuery
          type Types::NoteType, null: false
          argument :id, ID, required: true
      
          def resolve(id:)
            Note.find(id)
          rescue ActiveRecord::RecordNotFound => _e
            GraphQL::ExecutionError.new('Note does not exist.')
          rescue ActiveRecord::RecordInvalid => e
            GraphQL::ExecutionError.new("Invalid attributes for #{e.record.class}:"
              " #{e.record.errors.full_messages.join(', ')}")
          end
        end
      end
      

      This defines a Queries::FetchNote class that inherits from the Queries::BaseQuery class. This class not only returns a single item that must be of a NoteType, it also accepts an id argument with an ID type. The resolve method receives the provided id argument, then finds and returns a note with the provided id. If no note exists or an error occurs, it is rescued and returned as a GraphQL::ExecutionError.

      Next, you will attach the Queries::FetchNote class to a query field in the query type file. Open the query_type.rb file in your editor:

      • nano ~/rails_graphql/app/graphql/types/query_type.rb

      Add the following code to the file, which adds a fetch_note field with its corresponding resolver:

      ~/rails_graphql/app/graphql/types/query_type.rb

      module Types
        class QueryType < Types::BaseObject
          # Add root-level fields here.
          # They will be entry points for queries on your schema.
      
          field :fetch_notes, resolver: Queries::FetchNotes
          field :fetch_note, resolver: Queries::FetchNote
        end
      end
      

      To test your new query, ensure your server is running and navigate to localhost:3000/graphiql or http://your_server_ip:3000/graphiql in your browser to open your GraphiQL IDE. In the left side of the editor, type in the following query:

      query {
        fetchNote(id: 1) {
          id
          title
          body
        }
      }
      

      This query operation requests a fetchNote field, which corresponds to the fetch_note query field, and is passed an id argument. It specifies that we want three fields to be returned in the response.

      Run the query by clicking the Play icon button in the header. You will get a response like the following in the output pane:

      {
        "data": {
          "fetchNote": {
            "id": "1",
            "title": "Note 1",
            "body": "Lorem ipsum saves lives"
          }
        }
      }
      

      The response contains a single note that matches the requested id with fields matching the ones in the request.

      In this step, you created GraphQL queries to fetch notes from your API. Next you’ll write mutations to create notes.

      Step 5 — Creating GraphQL Mutations to Modify Notes

      In addition to queries, GraphQL also defines a mutation type for operations that modify server-side data. Just as REST provides POST, PUT, PATCH, and DELETE requests for creating, updating and deleting resources, GraphQL’s mutation type defines a convention for operations that cause writes on the server-side. In this step, you’ll create a mutation for adding new notes.

      GraphQL-Ruby includes two classes for writing mutations. They are:

      • GraphQL::Schema::Mutation: This is the generic base class for writing mutations. If you don’t want an input argument required in your mutations, you should use this class.
      • GraphQL::Schema::RelayClassicMutation: This is a base class with some conventions: an argument called clientMutationId that is always inserted into the response, and mutations that accept one argument called input. This class is used by default when you use the install generator to add boilerplate GraphQL files to your project.

      Create an add_note.rb file in the mutations directory located at app/graphql/mutations:

      • nano ~/rails_graphql/app/graphql/mutations/add_note.rb

      Add the following code to the file to define the mutation for adding new notes:

      ~/rails_graphql/app/graphql/mutations/add_note.rb

      module Mutations
        class AddNote < Mutations::BaseMutation
          argument :params, Types::Input::NoteInputType, required: true
      
          field :note, Types::NoteType, null: false
      
          def resolve(params:)
            note_params = Hash params
      
            begin
              note = Note.create!(note_params)
      
              { note: note }
            rescue ActiveRecord::RecordInvalid => e
              GraphQL::ExecutionError.new("Invalid attributes for #{e.record.class}:"
                " #{e.record.errors.full_messages.join(', ')}")
            end
          end
        end
      end
      

      This defines a Mutations::AddNote class that inherits from the Mutations::BaseMutation class, which is one of the classes created when you ran the install generator while installing the GraphQL-Ruby gem. The Mutations::AddNote class receives an argument with the name params and a type of NoteInputType, which you created in Step 3. It also returns a field called note that must be a non-null NoteType type.

      The resolve method of the class receives the params and converts it to a hash, which it uses to create a new note; it then returns a hash containing that note. If there’s an error while creating the note, the error is rescued and returned as a GraphQL::ExecutionError.

      Note: The resolve method in a mutation must return a hash whose symbol keys match the field names.

      Like with queries, the Mutations::AddNote mutation has to be attached to a mutation field using the mutation: keyword.

      Open the mutation type file located at app/graphql/types/mutation_type.rb in your editor:

      • nano ~/rails_graphql/app/graphql/types/mutation_type.rb

      Replace the code in the file with the following code, which adds a field for the add_note with its corresponding mutation class:

      ~/rails_graphql/app/graphql/types/mutation_type.rb

      module Types
        class MutationType < Types::BaseObject
          field :add_note, mutation: Mutations::AddNote
        end
      end
      

      In this code, you added an add_note field to the mutation type file and attached it to the Mutations::AddNote class using the mutation: keyword. When the add_note mutation is called, it runs the code in the resolve method of the Mutations::AddNote class.

      To test your new mutation, navigate to localhost:3000/graphiql or http://your_server_ip:3000/graphiql in your browser to open your GraphiQL IDE. In the left side of the editor, type in the following query:

      mutation {
        addNote(input: { params: { title: "GraphQL notes", body: "A long body of text about GraphQL"  }}) {
          note {
            id
            title
            body
          }
        }
      }
      

      This declares a mutation operation with an addNote field that accepts a single input argument, which in turn accepts a params object with keys that match the NoteInputType. The mutation operation also includes a note field that matches the note field returned by the Mutations::AddNote class.

      Run the mutation in GraphiQL and you’ll see the following results in the output pane:

      {
        "data": {
          "addNote": {
            "note": {
              "id": "6",
              "title": "GraphQL notes",
              "body": "A long body of text about GraphQL"
            }
          }
        }
      }
      

      The response returned is the newly created note with the fields requested in the mutation request.

      With your add_note mutation now working, your API can fetch and create notes using GraphQL queries and mutations.

      Conclusion

      In this tutorial, you created a note-taking API application with Ruby on Rails using PostgreSQL as your database and GraphQL as your API query language. You can learn more about GraphQL on its official website. The GraphQL-Ruby gem website also contains some guides to help you work with GraphQL in Rails.




      How To Set Up a Ceph Cluster within Kubernetes Using Rook


      The author selected the Mozilla Foundation to receive a donation as part of the Write for DOnations program.

      Introduction

      Kubernetes containers are stateless as a core principle, but data must still be managed, preserved, and made accessible to other services. Stateless means that the container is running in isolation without any knowledge of past transactions, which makes it easy to replace, delete, or distribute the container. However, it also means that data will be lost for certain lifecycle events like restart or deletion.

      Rook is a storage orchestration tool that provides a cloud-native, open source solution for a diverse set of storage providers. Rook uses the power of Kubernetes to turn a storage system into self-managing services that provide a seamless experience for saving Kubernetes application or deployment data.

      Ceph is a highly scalable distributed-storage solution offering object, block, and file storage. Ceph clusters are designed to run on any hardware using the so-called CRUSH algorithm (Controlled Replication Under Scalable Hashing).

      One main benefit of this deployment is that you get the highly scalable storage solution of Ceph without having to configure it manually using the Ceph command line, because Rook automatically handles it. Kubernetes applications can then mount block devices and filesystems from Rook to preserve and monitor their application data.

      In this tutorial, you will set up a Ceph cluster using Rook and use it to persist data for a MongoDB database as an example.

      Prerequisites

      Before you begin this guide, you’ll need the following:

      • A DigitalOcean Kubernetes cluster with at least three nodes that each have 2 vCPUs and 4 GB of Memory. To create a cluster on DigitalOcean and connect to it, see the Kubernetes Quickstart.
      • The kubectl command-line tool installed on a development server and configured to connect to your cluster. You can read more about installing kubectl in its official documentation.
      • A DigitalOcean block storage Volume with at least 100 GB for each node of the cluster you just created—for example, if you have three nodes you will need three Volumes. Select Manually Format rather than automatic and then attach your Volume to the Droplets in your node pool. You can follow the Volumes Quickstart to achieve this.

      Step 1 — Setting up Rook

      After completing the prerequisites, you have a fully functional Kubernetes cluster with three nodes and three Volumes, so you’re now ready to set up Rook.

      In this section, you will clone the Rook repository, deploy your first Rook operator on your Kubernetes cluster, and validate the given deployment status. A Rook operator is a container that automatically bootstraps the storage clusters and monitors the storage daemons to ensure the storage clusters are healthy.

      First, you will clone the Rook repository, so you have all the resources needed to start setting up your Rook cluster:

      • git clone --single-branch --branch release-1.3 https://github.com/rook/rook.git

      This command will clone the Rook repository from GitHub and create a folder named rook in your current directory. Now enter the directory using the following command:

      • cd rook/cluster/examples/kubernetes/ceph

      Next, you will create the common resources needed for your Rook deployment, which you can do by deploying the Kubernetes config file that is available by default in the directory:

      • kubectl create -f common.yaml

      The resources you’ve created are mainly CustomResourceDefinitions (CRDs) and define new resources that the operator will later use. They contain resources like the ServiceAccount, Role, RoleBinding, ClusterRole, and ClusterRoleBinding.

      Note: This standard file assumes that you will deploy the Rook operator and all Ceph daemons in the same namespace. If you want to deploy the operator in a separate namespace, see the comments throughout the common.yaml file.

      After the common resources are created, the next step is to create the Rook operator.

      Before deploying the operator.yaml file, you will need to change the CSI_RBD_GRPC_METRICS_PORT variable because your DigitalOcean Kubernetes cluster already uses the standard port by default. Open the file with the following command:
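
      • nano operator.yaml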

      Then search for the CSI_RBD_GRPC_METRICS_PORT variable, uncomment it by removing the #, and change the value from port 9001 to 9093:

      operator.yaml

      kind: ConfigMap
      apiVersion: v1
      metadata:
        name: rook-ceph-operator-config
        namespace: rook-ceph
      data:
        ROOK_CSI_ENABLE_CEPHFS: "true"
        ROOK_CSI_ENABLE_RBD: "true"
        ROOK_CSI_ENABLE_GRPC_METRICS: "true"
        CSI_ENABLE_SNAPSHOTTER: "true"
        CSI_FORCE_CEPHFS_KERNEL_CLIENT: "true"
        ROOK_CSI_ALLOW_UNSUPPORTED_VERSION: "false"
        # Configure CSI CSI Ceph FS grpc and liveness metrics port
        # CSI_CEPHFS_GRPC_METRICS_PORT: "9091"
        # CSI_CEPHFS_LIVENESS_METRICS_PORT: "9081"
        # Configure CSI RBD grpc and liveness metrics port
        CSI_RBD_GRPC_METRICS_PORT: "9093"
        # CSI_RBD_LIVENESS_METRICS_PORT: "9080"
      

      Once you’re done, save and exit the file.

      Next, you can deploy the operator using the following command:

      • kubectl create -f operator.yaml

      The command will output the following:

      Output

      configmap/rook-ceph-operator-config created
      deployment.apps/rook-ceph-operator created

      Again, you’re using the kubectl create command with the -f flag to assign the file that you want to apply. It will take a few seconds for the operator to start running. You can verify the status using the following command:

      • kubectl get pod -n rook-ceph

      You use the -n flag to get the pods of a specific Kubernetes namespace (rook-ceph in this example).

      Once the operator deployment is ready, it will trigger the creation of the DaemonSets that are in charge of creating the rook-discover agents on each worker node of your cluster. You’ll receive output similar to:

      Output

      NAME                                  READY   STATUS    RESTARTS   AGE
      rook-ceph-operator-599765ff49-fhbz9   1/1     Running   0          92s
      rook-discover-6fhlb                   1/1     Running   0          55s
      rook-discover-97kmz                   1/1     Running   0          55s
      rook-discover-z5k2z                   1/1     Running   0          55s
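
      If you also want to see the DaemonSet that creates those rook-discover pods, you can optionally list the DaemonSets in the namespace. This is a quick check, and the exact name can vary between Rook releases:

      • kubectl get daemonsets -n rook-ceph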

      You have successfully installed Rook and deployed your first operator. Next, you will create a Ceph cluster and verify that it is working.

      Step 2 — Creating a Ceph Cluster

      Now that you have successfully set up Rook on your Kubernetes cluster, you’ll continue by creating a Ceph cluster within the Kubernetes cluster and verifying its functionality.

      First let’s review the most important Ceph components and their functionality:

      • Ceph Monitors, also known as MONs, are responsible for maintaining the maps of the cluster required for the Ceph daemons to coordinate with each other. There should always be more than one MON running to increase the reliability and availability of your storage service.

      • Ceph Managers, also known as MGRs, are runtime daemons responsible for keeping track of runtime metrics and the current state of your Ceph cluster. They run alongside your monitoring daemons (MONs) to provide additional monitoring and an interface to external monitoring and management systems.

      • Ceph Object Store Devices, also known as OSDs, are responsible for storing objects on a local file system and providing access to them over the network. These are usually tied to one physical disk of your cluster. Ceph clients interact with OSDs directly.

      To interact with the data of your Ceph storage, a client will first make contact with the Ceph Monitors (MONs) to obtain the current version of the cluster map. The cluster map contains the data storage location as well as the cluster topology. The Ceph clients then use the cluster map to decide which OSD they need to interact with.

      Rook enables Ceph storage to run on your Kubernetes cluster. All of these components are running in your Rook cluster and will directly interact with the Rook agents. This provides a more streamlined experience for administering your Ceph cluster by hiding Ceph components like placement groups and storage maps while still providing the options of advanced configurations.

      Now that you have a better understanding of what Ceph is and how it is used in Rook, you will continue by setting up your Ceph cluster.

      You can complete the setup by either running the example configuration, found in the examples directory of the Rook project, or by writing your own configuration. The example configuration is fine for most use cases and provides excellent documentation of optional parameters.

      Now you’ll start creating the CephCluster Kubernetes object.

      First, create a YAML file for the cluster definition:

      • nano cephcluster.yaml

      The configuration defines how the Ceph cluster will be deployed. In this example, you will deploy three Ceph Monitors (MON) and enable the Ceph dashboard. The Ceph dashboard is out of scope for this tutorial, but you can use it later in your own individual project for visualizing the current status of your Ceph cluster.

      Add the following content to define the apiVersion and the Kubernetes Object kind as well as the name and the namespace the Object should be deployed in:

      cephcluster.yaml

      apiVersion: ceph.rook.io/v1
      kind: CephCluster
      metadata:
        name: rook-ceph
        namespace: rook-ceph
      

      After that, add the spec key, which defines the model that Kubernetes will use to create your Ceph cluster. You’ll first define the image version you want to use and whether you allow unsupported Ceph versions or not:

      cephcluster.yaml

      spec:
        cephVersion:
          image: ceph/ceph:v14.2.8
          allowUnsupported: false
      

      Then set the data directory where configuration files will be persisted using the dataDirHostPath key:

      cephcluster.yaml

        dataDirHostPath: /var/lib/rook
      

      Next, you define whether you want to skip upgrade checks and whether an upgrade should continue even if some checks report the cluster as not healthy, using the following parameters:

      cephcluster.yaml

        skipUpgradeChecks: false
        continueUpgradeAfterChecksEvenIfNotHealthy: false
      

      You configure the number of Ceph Monitors (MONs) using the mon key. The allowMultiplePerNode setting controls whether more than one MON may run on the same node; here it is left at false so the MONs are spread across different nodes:

      cephcluster.yaml

        mon:
          count: 3
          allowMultiplePerNode: false
      

      Options for the Ceph dashboard are defined under the dashboard key. This gives you options to enable the dashboard, customize the port, and prefix it when using a reverse proxy:

      cephcluster.yaml

        dashboard:
          enabled: true
          # serve the dashboard under a subpath (useful when you are accessing the dashboard via a reverse proxy)
          # urlPrefix: /ceph-dashboard
          # serve the dashboard at the given port.
          # port: 8443
          # serve the dashboard using SSL
          ssl: false
      

      You can also enable monitoring of your cluster with the monitoring key (monitoring requires Prometheus to be pre-installed):

      cephcluster.yaml

        monitoring:
          enabled: false
          rulesNamespace: rook-ceph
      

      RBD stands for RADOS (Reliable Autonomic Distributed Object Store) Block Device. RBD images are thin-provisioned, resizable Ceph block devices that store data across multiple nodes.

      RBD images can be asynchronously shared between two Ceph clusters by enabling rbdMirroring. Since we’re working with one cluster in this tutorial, this isn’t necessary. The number of workers is therefore set to 0:

      cephcluster.yaml

        rbdMirroring:
          workers: 0
      

      You can enable the crash collector for the Ceph daemons:

      cephcluster.yaml

        crashCollector:
          disable: false
      

      The cleanup policy only matters when you want to delete your cluster, which is why this option should be left empty for now:

      cephcluster.yaml

        cleanupPolicy:
          deleteDataDirOnHosts: ""
        removeOSDsIfOutAndSafeToRemove: false
      

      The storage key lets you define the cluster level storage options; for example, which node and devices to use, the database size, and how many OSDs to create per device:

      cephcluster.yaml

        storage:
          useAllNodes: true
          useAllDevices: true
          config:
            # metadataDevice: "md0" # specify a non-rotational storage so ceph-volume will use it as block db device of bluestore.
            # databaseSizeMB: "1024" # uncomment if the disks are smaller than 100 GB
            # journalSizeMB: "1024"  # uncomment if the disks are 20 GB or smaller
      

      You use the disruptionManagement key to manage daemon disruptions during upgrade or fencing:

      cephcluster.yaml

        disruptionManagement:
          managePodBudgets: false
          osdMaintenanceTimeout: 30
          manageMachineDisruptionBudgets: false
          machineDisruptionBudgetNamespace: openshift-machine-api
      

      These configuration blocks will result in the following final file:

      cephcluster.yaml

      apiVersion: ceph.rook.io/v1
      kind: CephCluster
      metadata:
        name: rook-ceph
        namespace: rook-ceph
      spec:
        cephVersion:
          image: ceph/ceph:v14.2.8
          allowUnsupported: false
        dataDirHostPath: /var/lib/rook
        skipUpgradeChecks: false
        continueUpgradeAfterChecksEvenIfNotHealthy: false
        mon:
          count: 3
          allowMultiplePerNode: false
        dashboard:
          enabled: true
          # serve the dashboard under a subpath (useful when you are accessing the dashboard via a reverse proxy)
          # urlPrefix: /ceph-dashboard
          # serve the dashboard at the given port.
          # port: 8443
          # serve the dashboard using SSL
          ssl: false
        monitoring:
          enabled: false
          rulesNamespace: rook-ceph
        rbdMirroring:
          workers: 0
        crashCollector:
          disable: false
        cleanupPolicy:
          deleteDataDirOnHosts: ""
        removeOSDsIfOutAndSafeToRemove: false
        storage:
          useAllNodes: true
          useAllDevices: true
          config:
            # metadataDevice: "md0" # specify a non-rotational storage so ceph-volume will use it as block db device of bluestore.
            # databaseSizeMB: "1024" # uncomment if the disks are smaller than 100 GB
            # journalSizeMB: "1024"  # uncomment if the disks are 20 GB or smaller
        disruptionManagement:
          managePodBudgets: false
          osdMaintenanceTimeout: 30
          manageMachineDisruptionBudgets: false
          machineDisruptionBudgetNamespace: openshift-machine-api
      

      Once you’re done, save and exit your file.

      You can also customize your deployment by, for example, changing your database size or defining a custom port for the dashboard. You can find more options for your cluster deployment in the cluster example of the Rook repository.
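      If you edit the file further, it can help to validate the manifest before creating anything. As a small optional sketch, recent kubectl versions support a client-side dry run (older versions use the plain --dry-run flag instead):

      • kubectl apply -f cephcluster.yaml --dry-run=client

      This only checks that kubectl can parse the YAML and build the request locally; it does not create the cluster.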

      Next, apply this manifest in your Kubernetes cluster:

      • kubectl apply -f cephcluster.yaml

      Now check that the pods are running:

      • kubectl get pod -n rook-ceph

      This usually takes a couple of minutes, so re-run the command until your output reflects something like the following:

      Output

      NAME                                                   READY   STATUS    RESTARTS   AGE
      csi-cephfsplugin-lz6dn                                 3/3     Running   0          3m54s
      csi-cephfsplugin-provisioner-674847b584-4j9jw          5/5     Running   0          3m54s
      csi-cephfsplugin-provisioner-674847b584-h2cgl          5/5     Running   0          3m54s
      csi-cephfsplugin-qbpnq                                 3/3     Running   0          3m54s
      csi-cephfsplugin-qzsvr                                 3/3     Running   0          3m54s
      csi-rbdplugin-kk9sw                                    3/3     Running   0          3m55s
      csi-rbdplugin-l95f8                                    3/3     Running   0          3m55s
      csi-rbdplugin-provisioner-64ccb796cf-8gjwv             6/6     Running   0          3m55s
      csi-rbdplugin-provisioner-64ccb796cf-dhpwt             6/6     Running   0          3m55s
      csi-rbdplugin-v4hk6                                    3/3     Running   0          3m55s
      rook-ceph-crashcollector-pool-33zy7-68cdfb6bcf-9cfkn   1/1     Running   0          109s
      rook-ceph-crashcollector-pool-33zyc-565559f7-7r6rt     1/1     Running   0          53s
      rook-ceph-crashcollector-pool-33zym-749dcdc9df-w4xzl   1/1     Running   0          78s
      rook-ceph-mgr-a-7fdf77cf8d-ppkwl                       1/1     Running   0          53s
      rook-ceph-mon-a-97d9767c6-5ftfm                        1/1     Running   0          109s
      rook-ceph-mon-b-9cb7bdb54-lhfkj                        1/1     Running   0          96s
      rook-ceph-mon-c-786b9f7f4b-jdls4                       1/1     Running   0          78s
      rook-ceph-operator-599765ff49-fhbz9                    1/1     Running   0          6m58s
      rook-ceph-osd-prepare-pool-33zy7-c2hww                 1/1     Running   0          21s
      rook-ceph-osd-prepare-pool-33zyc-szwsc                 1/1     Running   0          21s
      rook-ceph-osd-prepare-pool-33zym-2p68b                 1/1     Running   0          21s
      rook-discover-6fhlb                                    1/1     Running   0          6m21s
      rook-discover-97kmz                                    1/1     Running   0          6m21s
      rook-discover-z5k2z                                    1/1     Running   0          6m21s
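
      As an additional, optional check you can query the CephCluster resource itself. The exact columns differ between Rook versions, but the health field should eventually report HEALTH_OK:

      • kubectl get cephcluster -n rook-ceph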

      You have now successfully set up your Ceph cluster and can continue by creating your first storage block.

      Step 3 — Adding Block Storage

      Block storage allows a single pod to mount storage. In this section, you will create a storage block that you can use later in your applications.

      Before Ceph can provide storage to your cluster, you first need to create a storageclass and a cephblockpool. This will allow Kubernetes to interoperate with Rook when creating persistent volumes:

      • kubectl apply -f ./csi/rbd/storageclass.yaml

      The command will output the following:

      Output

      cephblockpool.ceph.rook.io/replicapool created
      storageclass.storage.k8s.io/rook-ceph-block created

      Note: If you’ve deployed the Rook operator in a namespace other than rook-ceph you need to change the prefix in the provisioner to match the namespace you use.
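      If you want to confirm that both resources exist before moving on, you can optionally list them. The names replicapool and rook-ceph-block come from the example storageclass.yaml you just applied:

      • kubectl get storageclass rook-ceph-block
      • kubectl -n rook-ceph get cephblockpool replicapool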

      After successfully deploying the storageclass and cephblockpool, you will continue by defining the PersistentVolumeClaim (PVC) for your application. A PersistentVolumeClaim is a resource used to request storage from your cluster.

      For that, you first need to create a YAML file:

      • nano pvc-rook-ceph-block.yaml

      Add the following for your PersistentVolumeClaim:

      pvc-rook-ceph-block.yaml

      apiVersion: v1
      kind: PersistentVolumeClaim
      metadata:
        name: mongo-pvc
      spec:
        storageClassName: rook-ceph-block
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 5Gi
      

      First, you need to set an apiVersion (v1 is the current stable version). Then you need to tell Kubernetes which type of resource you want to define using the kind key (PersistentVolumeClaim in this case).

      The spec key defines the model that Kubernetes will use to create your PersistentVolumeClaim. Here you need to select the storage class you created earlier: rook-ceph-block. You can then define the access mode and limit the resources of the claim. ReadWriteOnce means the volume can only be mounted by a single node.

      Now that you have defined the PersistentVolumeClaim, it is time to deploy it using the following command:

      • kubectl apply -f pvc-rook-ceph-block.yaml

      You will receive the following output:

      Output

      persistentvolumeclaim/mongo-pvc created

      You can now check the status of your PVC:

      • kubectl get pvc

      When the PVC is bound, you are ready:

      Output

      NAME        STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS      AGE
      mongo-pvc   Bound    pvc-ec1ca7d1-d069-4d2a-9281-3d22c10b6570   5Gi        RWO            rook-ceph-block   16s

      You have now successfully created a storage class and used it to create a PersistentVolumeClaim that you will mount to an application to persist data in the next section.

      Step 4 — Creating a MongoDB Deployment with a rook-ceph-block

      Now that you have successfully created a storage block and a persistent volume, you will put it to use by implementing it in a MongoDB application.

      The configuration will contain a few things:

      • A single container deployment based on the latest version of the mongo image.
      • A persistent volume to preserve the data of the MongoDB database.
      • A service to expose the MongoDB port on port 31017 of every node so you can interact with it later.

      First, open a new configuration file called mongo.yaml:

      • nano mongo.yaml

      Start the manifest with the Deployment resource:

      mongo.yaml

      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: mongo
      spec:
        selector:
          matchLabels:
            app: mongo
        template:
          metadata:
            labels:
              app: mongo
          spec:
            containers:
            - image: mongo:latest
              name: mongo
              ports:
              - containerPort: 27017
                name: mongo
              volumeMounts:
              - name: mongo-persistent-storage
                mountPath: /data/db
            volumes:
            - name: mongo-persistent-storage
              persistentVolumeClaim:
                claimName: mongo-pvc
      
      ...
      

      For each resource in the manifest, you need to set an apiVersion. For Deployments, use apiVersion: apps/v1, which is a stable version. Then, tell Kubernetes which resource you want to define using the kind key. Each definition should also have a name defined in metadata.name.

      The spec section tells Kubernetes the desired state of your Deployment. This definition asks Kubernetes to create one pod with one replica.

      Labels are key-value pairs that help you organize and cross-reference your Kubernetes resources. You can define them using metadata.labels and you can later search for them using selector.matchLabels.
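      For example, once the pod from this Deployment is running, you could list only the MongoDB pods by the label defined above. This is a small illustration; it returns nothing until the Deployment from this step has been applied:

      • kubectl get pods -l app=mongo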

      The spec.template key defines the model that Kubernetes will use to create each of your pods. Here you will define the specifics of your pod’s deployment like the image name, container ports, and the volumes that should be mounted. The image will then automatically be pulled from an image registry by Kubernetes.

      Here you will use the PersistentVolumeClaim you created earlier to persist the data of the /data/db directory of the pods. You can also specify extra information like environment variables that will help you with further customizing your deployment.
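      As a hedged illustration of such extra customization, the official mongo image documents environment variables like MONGO_INITDB_ROOT_USERNAME and MONGO_INITDB_ROOT_PASSWORD for enabling authentication. The snippet below is optional and not used in the rest of this tutorial, because enabling authentication would change how you connect to the shell later; the values shown are placeholders:

              env:                                 # optional, hypothetical example
              - name: MONGO_INITDB_ROOT_USERNAME   # creates a root user on first start
                value: "admin"
              - name: MONGO_INITDB_ROOT_PASSWORD   # placeholder value; prefer a Kubernetes Secret
                value: "your_password"

      If you did use credentials, storing them in a Kubernetes Secret and referencing them with valueFrom.secretKeyRef would be the more idiomatic approach.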

      Next, add the following code to the file to define a Kubernetes Service that exposes the MongoDB port on port 31017 of every node in your cluster:

      mongo.yaml

      ...
      
      ---
      apiVersion: v1
      kind: Service
      metadata:
        name: mongo
        labels:
          app: mongo
      spec:
        selector:
          app: mongo
        type: NodePort
        ports:
          - port: 27017
            nodePort: 31017
      

      Here you also define an apiVersion, but instead of the Deployment type, you define a Service. The Service will receive connections on port 31017 of each node and forward them to the pods’ port 27017, where you can then access the application.

      The service uses NodePort as the service type, which will expose the Service on each Node’s IP at a static port between 30000 and 32767 (31017 in this case).
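      Once you deploy the manifest in the next step, one optional way to reach MongoDB from outside the cluster (assuming the MongoDB shell is installed on your local machine and your firewall allows the port) is to look up a node’s public IP and connect to the NodePort:

      • kubectl get nodes -o wide
      • mongo --host your_node_ip --port 31017

      Here your_node_ip is a placeholder for one of the EXTERNAL-IP values from the first command.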

      Now that you have defined the deployment, it is time to deploy it:

      • kubectl apply -f mongo.yaml

      You will see the following output:

      Output

      deployment.apps/mongo created
      service/mongo created

      You can check the status of the deployment and service:

      • kubectl get svc,deployments

      The output will be something like this:

      Output

      NAME                 TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)           AGE
      service/kubernetes   ClusterIP   10.245.0.1       <none>        443/TCP           33m
      service/mongo        NodePort    10.245.124.118   <none>        27017:31017/TCP   4m50s

      NAME                    READY   UP-TO-DATE   AVAILABLE   AGE
      deployment.apps/mongo   1/1     1            1           4m50s

      After the deployment is ready, you can start saving data into your database. The easiest way to do so is by using the MongoDB shell, which is included in the MongoDB pod you just started. You can open it using kubectl.

      For that, you will need the name of the pod, which you can get using the following command:

      • kubectl get pods

      The output will be similar to this:

      Output

      NAME                     READY   STATUS    RESTARTS   AGE
      mongo-7654889675-mjcks   1/1     Running   0          13m

      Now copy the name and use it in the exec command:

      • kubectl exec -it your_pod_name mongo

      Now that you are in the MongoDB shell, let’s continue by creating a database:

      • use test

      The use command switches between databases or creates them if they don’t exist.

      Output

      switched to db test

      Then insert some data into your new test database. You use the insertOne() method to insert a new document into the database:

      • db.test.insertOne( {name: "test", number: 10 })

      Output

      { "acknowledged" : true, "insertedId" : ObjectId("5f22dd521ba9331d1a145a58") }

      The next step is retrieving the data to make sure it is saved, which can be done using the find command on your collection:

      • db.getCollection("test").find()

      The output will be similar to this:

      Output

      NAME READY STATUS RESTARTS AGE { "_id" : ObjectId("5f1b18e34e69b9726c984c51"), "name" : "test", "number" : 10 }

      Now that you have saved some data into the database, it will be persisted in the underlying Ceph volume structure. One big advantage of this kind of deployment is the dynamic provisioning of the volume. Dynamic provisioning means that applications only need to request the storage and it will be automatically provided by Ceph instead of developers creating the storage manually by sending requests to their storage providers.
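      You can see dynamic provisioning at work by listing the PersistentVolumes in your cluster. Ceph created the underlying volume automatically when the claim was bound, and the exact volume name will differ in your output:

      • kubectl get pv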

      Let’s validate this functionality by restarting the pod and checking if the data is still there. You can do this by deleting the pod, because it will be restarted to fulfill the state defined in the deployment:

      • kubectl delete pod -l app=mongo

      Now let’s validate that the data is still there by connecting to the MongoDB shell and printing out the data. For that, you first need to get your pod’s name and then use the exec command to open the MongoDB shell:

      • kubectl get pods

      The output will be similar to this:

      Output

      NAME                     READY   STATUS    RESTARTS   AGE
      mongo-7654889675-mjcks   1/1     Running   0          13m

      Now copy the name and use it in the exec command:

      • kubectl exec -it your_pod_name mongo

      After that, you can retrieve the data by connecting to the database and printing the whole collection:

      • use test
      • db.getCollection("test").find()

      The output will look similar to this:

      Output

      NAME READY STATUS RESTARTS AGE { "_id" : ObjectId("5f1b18e34e69b9726c984c51"), "name" : "test", "number" : 10 }

      As you can see, the data you saved earlier is still in the database even though you restarted the pod. Now that you have successfully set up Rook and Ceph and used them to persist the data of your deployment, let’s review the Rook Toolbox and what you can do with it.

      Step 5 — Setting Up the Rook Toolbox

      The Rook Toolbox is a tool that helps you get the current state of your Ceph deployment and troubleshoot problems when they arise. It also allows you to change your Ceph configuration, for example by enabling certain modules or creating users and pools.

      In this section, you will install the Rook Toolbox and use it to execute basic commands like getting the current Ceph status.

      The toolbox can be started by deploying the toolbox.yaml file, which is in the same cluster/examples/kubernetes/ceph directory you entered earlier:

      • kubectl apply -f toolbox.yaml

      You will receive the following output:

      Output

      deployment.apps/rook-ceph-tools created

      Now check that the pod is running:

      • kubectl -n rook-ceph get pod -l "app=rook-ceph-tools"

      Your output will be similar to this:

      Output

      NAME                               READY   STATUS    RESTARTS   AGE
      rook-ceph-tools-7c5bf67444-bmpxc   1/1     Running   0          9s

      Once the pod is running you can connect to it using the kubectl exec command:

      • kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" -o jsonpath="{.items[0].metadata.name}") bash

      Let’s break this command down for better understanding:

      1. The kubectl exec command lets you execute commands in a pod, like setting an environment variable or starting a service. Here you use it to open the Bash shell in the pod. The command that you want to execute is defined at the end of the command.
      2. You use the -n flag to specify the Kubernetes namespace the pod is running in.
      3. The -i (interactive) and -t (tty) flags tell Kubernetes that you want to run the command in interactive mode with tty enabled. This lets you interact with the terminal you open.
      4. $() lets you define an expression in your command. That means the expression will be evaluated (executed) before the main command and the resulting value will then be passed to the main command as an argument. Here we define another kubectl command that gets the pod with the label app=rook-ceph-tools and reads its name using jsonpath. We then use the name as an argument for our first command, as shown in the standalone example after this list.
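      To see what the inner expression resolves to on its own, you can optionally run just that part; it prints only the name of the toolbox pod:

      • kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" -o jsonpath="{.items[0].metadata.name}"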

      Note: As already mentioned this command will open a terminal in the pod, so your prompt will change to reflect this.

      Now that you are connected to the pod, you can execute Ceph commands to check the current status or to troubleshoot error messages. For example, the ceph status command will give you the current health status of your Ceph configuration and more information like the running MONs, the currently running data pools, the available and used storage, and the current I/O operations:

      • ceph status

      Here is the output of the command:

      Output

        cluster:
          id:     71522dde-064d-4cf8-baec-2f19b6ae89bf
          health: HEALTH_OK

        services:
          mon: 3 daemons, quorum a,b,c (age 23h)
          mgr: a(active, since 23h)
          osd: 3 osds: 3 up (since 23h), 3 in (since 23h)

        data:
          pools:   1 pools, 32 pgs
          objects: 61 objects, 157 MiB
          usage:   3.4 GiB used, 297 GiB / 300 GiB avail
          pgs:     32 active+clean

        io:
          client:   5.3 KiB/s wr, 0 op/s rd, 0 op/s wr

      You can also query the status of specific items, like your OSDs, using the following command:

      • ceph osd status

      This will print information about your OSDs, like the used and available storage and the current state of each OSD:

      Output

      +----+------------+-------+-------+--------+---------+--------+---------+-----------+
      | id |    host    |  used | avail | wr ops | wr data | rd ops | rd data |   state   |
      +----+------------+-------+-------+--------+---------+--------+---------+-----------+
      | 0  | node-3jis6 | 1165M | 98.8G |    0   |     0   |    0   |     0   | exists,up |
      | 1  | node-3jisa | 1165M | 98.8G |    0   |  5734   |    0   |     0   | exists,up |
      | 2  | node-3jise | 1165M | 98.8G |    0   |     0   |    0   |     0   | exists,up |
      +----+------------+-------+-------+--------+---------+--------+---------+-----------+

      More information about the available commands and how you can use them to debug your Ceph deployment can be found in the official documentation.

      You have now successfully set up a complete Rook Ceph cluster on Kubernetes that helps you persist the data of your deployments and share their state between the different pods without having to use some kind of external storage or provision storage manually. You also learned how to start the Rook Toolbox and use it to debug and troubleshoot your Ceph deployment.

      Conclusion

      In this article, you configured your own Rook Ceph cluster on Kubernetes and used it to provide storage for a MongoDB application. Along the way, you learned the essential terminology and concepts of Rook so you can customize your deployment.

      If you are interested in learning more, consider checking out the official Rook documentation and the example configurations provided in the repository for more configuration options and parameters.

      You can also try out the other kinds of storage Ceph provides like shared file systems if you want to mount the same volume to multiple pods at the same time.


