      How To Back Up, Restore, and Migrate a MongoDB Database on CentOS 8



      The author selected the COVID-19 Relief Fund to receive a donation as part of the Write for DOnations program.

      Introduction

      MongoDB is one of the most popular NoSQL database engines. It is famous for being scalable, robust, reliable, and easy to use. In this article, you will back up, restore, and migrate a sample MongoDB database.

      Importing and exporting a database means dealing with data in a human-readable format that is compatible with other software products. In contrast, MongoDB’s backup and restore operations create or use MongoDB-specific binary data, which preserves not only the consistency and integrity of your data but also its specific MongoDB attributes. Thus, for migration, it’s usually preferable to use backup and restore as long as the source and target systems are compatible.

      Prerequisites

      Before following this tutorial, please make sure you complete the following prerequisites:

      • One CentOS 8 Droplet set up per the CentOS 8 initial server setup guide, including a sudo non-root user and a firewall.
      • MongoDB installed and configured using the article How to Install MongoDB on CentOS 8.
• Example MongoDB database imported using the instructions in How To Import and Export a MongoDB Database. That article was written for Ubuntu, but it can be used on any Linux distribution because the import and export commands work the same way. For CentOS 8, just ensure that you have wget installed, which you can do with the command sudo dnf install wget.

Except where otherwise noted, all of the commands that require root privileges in this tutorial should be run as a non-root user with sudo privileges.

      Step 1 — Using JSON and BSON in MongoDB

      Before continuing further with this article, some basic understanding of the matter is needed. If you have experience with other NoSQL database systems such as Redis, you may find some similarities when working with MongoDB.

      MongoDB uses JSON and BSON (binary JSON) formats for storing its information. JSON is the human-readable format that is perfect for exporting and, eventually, importing your data. You can further manage your exported data with any tool that supports JSON, including a simple text editor.

An example JSON document looks like this:

      Example of JSON Format

      {"address":[
          {"building":"1007", "street":"Park Ave"},
          {"building":"1008", "street":"New Ave"},
      ]}
      

JSON is convenient to work with, but it does not support all the data types available in BSON. This means that exporting to JSON causes a so-called ‘loss of fidelity’ in the information. For backing up and restoring, it’s better to use the binary BSON format.
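If you ever need to inspect a BSON dump file by eye, the bsondump tool that ships with MongoDB renders its contents as JSON. A minimal sketch, where /path/to/collection.bson is a placeholder for a file such as the ones produced by the backup steps later in this tutorial:

• bsondump /path/to/collection.bson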

      Second, you don’t have to worry about explicitly creating a MongoDB database. If the database you specify for import doesn’t already exist, it is automatically created. Even better is the case with the collections’ (database tables) structure. In contrast to other database engines, in MongoDB, the structure is again automatically created upon the first document (database row) insert.
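For example, here is a minimal sketch of that behavior using the mongo shell; the database demo and collection people are hypothetical names and need not exist before the command runs:

• mongo demo --eval 'db.people.insertOne({name: "Sammy"})'

MongoDB creates both the database and the collection on this first insert.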

Third, in MongoDB, reading or inserting large amounts of data, such as the tasks in this article, can be resource-intensive and consume much of your CPU, memory, and disk space. This is critical considering that MongoDB is frequently used for large databases and Big Data. The simplest solution to this problem is to run the exports and backups during the night or other non-peak hours.

      Fourth, information consistency could be problematic if you have a busy MongoDB server where the information changes during the database export or backup process. One possible solution for this problem is replication, which you may consider when you advance in the MongoDB topic.

While you can use the import and export functions to back up and restore your data, there are better ways to ensure the full integrity of your MongoDB databases. To back up your data, you should use the command mongodump. For restoring, use mongorestore. Let’s see how they work.

      Step 2 — Using mongodump to Back Up a MongoDB Database

      Let’s cover backing up your MongoDB database first.

An essential argument to mongodump is --db, which specifies the name of the database you want to back up. If you don’t specify a database name, mongodump backs up all of your databases. The second important argument is --out, which defines the directory into which the data will be dumped. For example, let’s back up the newdb database and store it in the /var/backups/mongobackups directory. Ideally, each backup will live in a directory named with the current date, like /var/backups/mongobackups/10-29-20.

First, create the /var/backups/mongobackups directory:

      • sudo mkdir -p /var/backups/mongobackups

      Then run mongodump:

      • sudo mongodump --db newdb --out /var/backups/mongobackups/`date +"%m-%d-%y"`

      You will see an output like this:

      Output

2020-10-29T19:22:36.886+0000	writing newdb.restaurants to 
2020-10-29T19:22:36.969+0000	done dumping newdb.restaurants (25359 documents)

Note that in the above directory path, we used date +"%m-%d-%y", which automatically inserts the current date. This allows us to keep each backup inside a directory named for the day it was taken, like /var/backups/mongobackups/10-29-20/. This is especially convenient when we automate the backups.

      At this point you have a complete backup of the newdb database in the directory /var/backups/mongobackups/10-29-20/newdb/. This backup has everything to restore the newdb properly and preserve its so-called “fidelity.”

      As a general rule, you should make regular backups and preferably when the server is least loaded. Thus, you can set the mongodump command as a cron job so that it runs regularly, e.g., every day at 03:03 AM.

To accomplish this, open crontab, cron’s editor:

• sudo crontab -e

Note that when you run sudo crontab, you will be editing the cron jobs for the root user. This is recommended because if you set the cron jobs under your own user, they might not execute properly, especially if your sudo profile requires password verification.

      Inside the crontab prompt, insert the following mongodump command:

      crontab

      3 3 * * * mongodump --out /var/backups/mongobackups/`date +"%m-%d-%y"`
      

      In the above command, we omit the --db argument on purpose because you will typically want to have all of your databases backed up.

      Depending on your MongoDB database sizes, you may soon run out of disk space with too many backups. That’s why it’s also recommended to clean the old backups regularly or to compress them.
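For the compression side, mongodump supports a --gzip flag that compresses the dumped files as they are written, and mongorestore accepts the same flag when reading such a backup. As a sketch, here is the earlier backup command with compression enabled:

• sudo mongodump --gzip --db newdb --out /var/backups/mongobackups/`date +"%m-%d-%y"`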

      For example, to delete all the backups older than seven days, you can use the following bash command:

• find /var/backups/mongobackups/ -mtime +7 -exec rm -rf {} \;

Similarly to the previous mongodump command, you can also add this as a cron job. It should run just before you start the next backup, e.g., at 03:01 AM. For this purpose, open crontab again:

• sudo crontab -e

      After that insert the following line:

      crontab

1 3 * * * find /var/backups/mongobackups/ -mtime +7 -exec rm -rf {} \;
      

Save and close the file.

      Completing all the tasks in this step will ensure a proper backup solution for your MongoDB databases.

      Step 3 — Using mongorestore to Restore and Migrate a MongoDB Database

      When you restore your MongoDB database from a previous backup, you have the exact copy of your MongoDB information taken at a particular time, including all the indexes and data types. This is especially useful when you want to migrate your MongoDB databases. For restoring MongoDB, we’ll use the command mongorestore, which works with the binary backups that mongodump produces.

Let’s continue our examples with the newdb database and see how we can restore it from the previously taken backup. We’ll first specify the name of the database with the --db argument. Note that recent versions of mongorestore deprecate --db in favor of --nsInclude, which takes a namespace pattern: newdb.* restores all of the database’s collections, while newdb.restaurants restores only the single restaurants collection.

      Then, using --drop, we’ll make sure that the target database is first dropped so that the backup is restored in a clean database. As a final argument we’ll specify the directory of the last backup, which will look something like this: /var/backups/mongobackups/10-29-20/newdb/.

      Once you have a timestamped backup, you can restore it using this command:

      • sudo mongorestore --db newdb --drop /var/backups/mongobackups/10-29-20/newdb/

      You will see an output like this:

      Output

2020-10-29T19:25:45.825+0000	the --db and --collection args should only be used when restoring from a BSON file. Other uses are deprecated and will not exist in the future; use --nsInclude instead
2020-10-29T19:25:45.826+0000	building a list of collections to restore from /var/backups/mongobackups/10-29-20/newdb dir
2020-10-29T19:25:45.829+0000	reading metadata for newdb.restaurants from /var/backups/mongobackups/10-29-20/newdb/restaurants.metadata.json
2020-10-29T19:25:45.834+0000	restoring newdb.restaurants from /var/backups/mongobackups/10-29-20/newdb/restaurants.bson
2020-10-29T19:25:46.130+0000	no indexes to restore
2020-10-29T19:25:46.130+0000	finished restoring newdb.restaurants (25359 documents)
2020-10-29T19:25:46.130+0000	done

      In the above case, we are restoring the data on the same server where we created the backup. If you wish to migrate the data to another server and use the same technique, you should copy the backup directory, which is /var/backups/mongobackups/10-29-20/newdb/ in our case, to the other server.
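A minimal sketch of that copy step using rsync over SSH, where sammy and target_server_ip are placeholders for your own user and server:

• rsync -av /var/backups/mongobackups/10-29-20/newdb/ sammy@target_server_ip:/var/backups/mongobackups/10-29-20/newdb/

On the target server, you would then run the same mongorestore command as above against the copied directory.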

      Conclusion

      You have now performed some essential tasks related to backing up, restoring, and migrating your MongoDB databases. No production MongoDB server should ever run without a reliable backup strategy, such as the one described here.




      How To Install and Configure Elasticsearch on CentOS 8



      The author selected the COVID-19 Relief Fund to receive a donation as part of the Write for DOnations program.

      Introduction

      Elasticsearch is a platform for the distributed search and analysis of data in real time. Its popularity is due to its ease of use, powerful features, and scalability.

      Elasticsearch supports RESTful operations. This means that you can use HTTP methods (GET, POST, PUT, DELETE, etc.) in combination with an HTTP URI (/collection/entry) to manipulate your data. The intuitive RESTful approach is both developer and user friendly, which is one of the reasons for Elasticsearch’s popularity.
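As a quick illustration of this mapping (runnable only once the service is installed later in this tutorial, and with my_index/_doc/1 as a hypothetical resource), each HTTP verb corresponds to a data operation:

• curl -X PUT 'http://localhost:9200/my_index/_doc/1' -H 'Content-Type: application/json' -d '{"field": "value"}'
• curl -X GET 'http://localhost:9200/my_index/_doc/1'
• curl -X DELETE 'http://localhost:9200/my_index/_doc/1'

The PUT creates or replaces the document, the GET reads it, and the DELETE removes it.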

      Elasticsearch is free and open-source software with a solid company behind it — Elastic. This combination makes it suitable for many use cases, from personal testing to corporate integration.

      This article will introduce you to Elasticsearch and show you how to install, configure, and start using it.

      Prerequisites

To follow this tutorial you will need the following:

• One CentOS 8 server set up per the CentOS 8 initial server setup guide, including a non-root user with sudo privileges and a firewall.

      Step 1 — Installing Java on CentOS 8

Elasticsearch is written in the Java programming language. Your first task, then, is to install a Java Runtime Environment (JRE) on your server. You will use the native CentOS OpenJDK package for the JRE. This JRE is free, well-supported, and automatically managed through the CentOS DNF package manager.

      Install the latest version of OpenJDK 8:

      • sudo dnf install java-1.8.0-openjdk.x86_64 -y

Now verify your installation:

• java -version

The command will produce output like this:

      Output

openjdk version "1.8.0_262"
OpenJDK Runtime Environment (build 1.8.0_262-b10)
OpenJDK 64-Bit Server VM (build 25.262-b10, mixed mode)

      When you advance in using Elasticsearch and you start looking for better Java performance and compatibility, you may opt to install Oracle’s proprietary Java (Oracle JDK 8). For more information, reference our article on How To Install Java on CentOS and Fedora.

      Step 2 — Downloading and Installing Elasticsearch on CentOS 8

      You can download Elasticsearch directly from elastic.co in zip, tar.gz, deb, or rpm packages. For CentOS, it’s best to use the native rpm package, which will install everything you need to run Elasticsearch.

      At the time of this writing, the latest Elasticsearch version is 7.9.2.

      From a working directory, download the program:

      • sudo rpm -ivh https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.9.2-x86_64.rpm

Elasticsearch will install in /usr/share/elasticsearch/, with its configuration files placed in /etc/elasticsearch and its systemd service unit registered with the system.

To make sure Elasticsearch starts and stops automatically with the server, enable its systemd service:

      • sudo systemctl daemon-reload && sudo systemctl enable elasticsearch.service

      With Elasticsearch installed, you will now configure a few important settings.

      Step 3 — Configuring Elasticsearch on CentOS 8

      Now that you have installed Elasticsearch and its Java dependency, it is time to configure Elasticsearch.

      The Elasticsearch configuration files are in the /etc/elasticsearch directory. The ones we’ll review and edit are:

      • elasticsearch.yml — Configures the Elasticsearch server settings. This is where most options are stored, which is why we are mostly interested in this file.

      • jvm.options — Provides configuration for the JVM such as memory settings.

      The first variables to customize on any Elasticsearch server are node.name and cluster.name in elasticsearch.yml. Let’s do that now.

As their names suggest, node.name specifies the name of the server (node), and cluster.name the cluster to which that node belongs. If you don’t customize these variables, node.name will be set automatically based on the server’s hostname, and cluster.name will be set to the name of the default cluster.

      The cluster.name value is used by the auto-discovery feature of Elasticsearch to automatically discover and associate Elasticsearch nodes to a cluster. Thus, if you don’t change the default value, you might have unwanted nodes, found on the same network, in your cluster.

      Let’s start editing the main elasticsearch.yml configuration file.

      Open it using nano or your preferred text editor:

      • sudo nano /etc/elasticsearch/elasticsearch.yml

      Remove the # character at the beginning of the lines for node.name and cluster.name to uncomment them, and then change their values. Your first configuration changes in the /etc/elasticsearch/elasticsearch.yml file will look like this:

      /etc/elasticsearch/elasticsearch.yml

      ...
      node.name: "My First Node"
      cluster.name: mycluster1
      ...
      

      The networking settings are also found in elasticsearch.yml. By default, Elasticsearch will listen on localhost on port 9200 so that only clients from the same server can connect. You should leave these settings unchanged from a security point of view, because the open source and free edition of Elasticsearch doesn’t offer authentication features.

Another important setting is the node.roles property. You can set this to master-eligible (written simply as master in the configuration), data, or ingest.

The master-eligible role is responsible for the cluster’s health and stability. In large deployments with many cluster nodes, it’s recommended to have more than one dedicated node with the master role only. Typically, a dedicated master node neither stores data nor creates indices, so there is no risk of it becoming overloaded and endangering the cluster’s health.

      The data role defines the nodes that will store the data. Even if a data node is overloaded, the cluster health shouldn’t be affected seriously, provided there are other nodes to take the additional load.

      Lastly, the ingest role allows a node to accept and process data streams. In larger setups, there should be dedicated ingest nodes in order to avoid possible overload on the master and data nodes.

Note: one node may have one or more roles, allowing for scalability, redundancy, and high availability of the Elasticsearch setup. By default, all of these roles are assigned to the node. This is suitable for a single-node Elasticsearch, as in the example scenario described in this article, so you don’t have to change the roles. Still, if you want to change them, such as dedicating a node as a master, you can do so by changing /etc/elasticsearch/elasticsearch.yml like this:

      /etc/elasticsearch/elasticsearch.yml

      ...
      node.roles: [ master ]
      ...
      

      Another setting to consider changing is path.data. This determines the path where data is stored, and the default path is /var/lib/elasticsearch. In a production environment it’s recommended that you use a dedicated partition and mount point for storing Elasticsearch data. In the best case, this dedicated partition will be a separate storage media that will provide better performance and data isolation. You can specify a different path.data path by uncommenting the path.data line and changing its value like this:

      /etc/elasticsearch/elasticsearch.yml

      ...
      path.data: /media/different_media
      ...
      

      Now that you have made all your changes, save and close elasticsearch.yml.

      You must also edit your configurations in jvm.options.

Recall that Elasticsearch runs inside a JVM; essentially, it’s a Java application. So, like any Java application, it has JVM settings that can be configured in the file /etc/elasticsearch/jvm.options. Two of the most important settings, especially in regards to performance, are Xms and Xmx, which define the minimum (Xms) and maximum (Xmx) memory allocation.

      By default, both are set to 1GB, but that is almost never optimal. Not only that, but if your server only has 1GB of RAM, you won’t be able to start Elasticsearch with the default settings. This is because the operating system takes at least 100MB so it will not be possible to dedicate 1GB to Elasticsearch.

      Unfortunately, there is no universal formula for calculating the memory settings. Naturally, the more memory you allocate, the better your performance, but make sure that there is enough memory left for the rest of the processes on the server. For example, if your machine has 1GB of RAM, you could set both Xms and Xmx to 512MB, thus allowing another 512MB for the rest of the processes. Note that usually both Xms and Xmx are set to the same value in order to avoid the performance penalty of the JVM garbage collection.

      If your server only has 1GB of RAM, you must edit this setting.

      Open jvm.options:

      • sudo nano /etc/elasticsearch/jvm.options

      Now change the Xms and Xmx values to 512MB:

      /etc/elasticsearch/jvm.options

      ...
      -Xms512m
      -Xmx512m
      ...
      

      Save and exit the file.

      Now start Elasticsearch for the first time:

      • sudo systemctl start elasticsearch.service

      Allow at least 10 seconds for Elasticsearch to start before you attempt to use it. Otherwise, you may get a connection error.
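If you’re scripting the startup, a small sketch like the following waits until the HTTP endpoint actually responds rather than sleeping for a fixed interval:

• until curl -s 'http://localhost:9200' > /dev/null; do sleep 1; done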

      Note: You should know that not all Elasticsearch settings are set and kept in configuration files. Instead, some settings are set via its API, like index.number_of_shards and index.number_of_replicas. The first determines into how many pieces (shards) the index will split. The second defines the number of replicas that will be distributed across the cluster. Having more shards improves the indexing performance, while having more replicas makes searching faster.

      Assuming that you are still exploring and testing Elasticsearch on a single node, you can play with these settings and alter them by executing the following curl command:

      • curl -XPUT -H 'Content-Type: application/json' 'http://localhost:9200/_all/_settings?preserve_existing=true' -d '{
      • "index.number_of_replicas" : "0",
      • "index.number_of_shards" : "1"
      • }'
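You can read the current values back with a GET request against the same endpoint; the pretty parameter just formats the JSON response:

• curl -XGET 'http://localhost:9200/_all/_settings?pretty'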

      With Elasticsearch installed and configured, you will now secure and test the server.

      Step 4 — (Optional) Securing Elasticsearch on CentOS 8

      Elasticsearch has no built-in security and anyone who can access the HTTP API can control it. This section is not a comprehensive guide to securing Elasticsearch. Take whatever measures are necessary to prevent unauthorized access to it and the server/virtual machine on which it is running.

      By default, Elasticsearch is configured to listen only on the localhost network interface, i.e. remote connections are not possible. You should leave this setting unchanged unless you have taken one or both of the following measures:

• You have limited access to TCP port 9200 to trusted hosts only, using a firewall such as firewalld or iptables.
• You have created a VPN between your trusted hosts and you are going to expose Elasticsearch on one of the VPN’s virtual interfaces.

Only once you have done the above should you consider allowing Elasticsearch to listen on other network interfaces besides localhost. Such a change might be considered when you need to connect to Elasticsearch from another host, for example.
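As a hedged sketch of the first measure using firewalld (the firewall from the prerequisite setup), the following rich rule admits a single trusted host, where 203.0.113.5 is a placeholder for that host’s IP address:

• sudo firewall-cmd --permanent --add-rich-rule='rule family="ipv4" source address="203.0.113.5" port port="9200" protocol="tcp" accept'
• sudo firewall-cmd --reload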

      To change the network exposure, open the file elasticsearch.yml:

      • sudo nano /etc/elasticsearch/elasticsearch.yml

      In this file find the line that contains network.host, uncomment it by removing the # character at the beginning of the line, and then change the value to the IP address of the secured network interface. The line will look something like this:

      /etc/elasticsearch/elasticsearch.yml

      ...
      network.host: 10.0.0.1
      ...
      

      Warning: Because Elasticsearch doesn’t have any built-in security, it is very important that you do not set this to any IP address that is accessible to any servers that you do not control or trust. Do not bind Elasticsearch to a public or shared private network IP address.

      Also, for additional security you can disable scripts that are used to evaluate custom expressions. By crafting a custom malicious expression, an attacker might be able to compromise your environment.

      To disable custom expressions, add the following line at the end of the /etc/elasticsearch/elasticsearch.yml file:

      /etc/elasticsearch/elasticsearch.yml

      ...
      script.allowed_types: none
      ...
      

      For the above changes to take effect, you will have to restart Elasticsearch.

      Restart Elasticsearch now:

• sudo systemctl restart elasticsearch.service

      In this step you took some measures to secure your Elasticsearch server. Now you are ready to test the application.

      Step 5 — Testing Elasticsearch on CentOS 8

      By now, Elasticsearch should be running on port 9200. You can test this using curl, the command-line tool for client-side URL transfers.

      To test the service, make a GET request like this:

      • curl -X GET 'http://localhost:9200'

      You will see the following response:

      Output

{
  "name" : "My First Node",
  "cluster_name" : "mycluster1",
  "cluster_uuid" : "R23U2F87Q_CdkEI2zGhLGw",
  "version" : {
    "number" : "7.9.2",
    "build_flavor" : "default",
    "build_type" : "rpm",
    "build_hash" : "d34da0ea4a966c4e49417f2da2f244e3e97b4e6e",
    "build_date" : "2020-09-23T00:45:33.626720Z",
    "build_snapshot" : false,
    "lucene_version" : "8.6.2",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}

      If you see a similar response, Elasticsearch is working properly. If not, recheck the installation instructions and allow some time for Elasticsearch to fully start.

      Your Elasticsearch server is now operational. In the next step you will add and retrieve some data from the application.

      Step 6 — Using Elasticsearch on CentOS 8

      In this step you will add some data to Elasticsearch and then make some manual queries.

      Use curl to add your first entry:

      • curl -H 'Content-Type: application/json' -X POST 'http://localhost:9200/tutorial/helloworld/1' -d '{ "message": "Hello World!" }'

      You will see the following output:

      Output

{"_index":"tutorial","_type":"helloworld","_id":"1","_version":3,"result":"updated","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":2,"_primary_term":4}

Using curl, you sent an HTTP POST request to the Elasticsearch server. The URI of the request was /tutorial/helloworld/1. Let’s take a closer look at those parameters:

      • tutorial is the index of the data in Elasticsearch.
      • helloworld is the type.
      • 1 is the id of our entry under the above index and type.

Note that it’s also required to set the content type of all POST requests to JSON with the argument -H 'Content-Type: application/json'. If you do not do this, Elasticsearch will reject your request.

      Now retrieve your first entry using an HTTP GET request:

      • curl -X GET 'http://localhost:9200/tutorial/helloworld/1'

      The result will look like this:

      Output

      {"_index":"tutorial","_type":"helloworld","_id":"1","_version":3,"_seq_no":2,"_primary_term":4,"found":true,"_source":{ "message": "Hello World!" }}

      To modify an existing entry you can use an HTTP PUT request like this:

      • curl -H 'Content-Type: application/json' -X PUT 'localhost:9200/tutorial/helloworld/1?pretty' -d '
      • {
      • "message": "Hello People!"
      • }'

      Elasticsearch will acknowledge successful modification like this:

      Output

{
  "_index" : "tutorial",
  "_type" : "helloworld",
  "_id" : "1",
  "_version" : 2,
  "result" : "updated",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 1,
  "_primary_term" : 1
}

      In the above example you modified the message of the first entry to "Hello People!". With that, the version number increased to 2.

      To make the output of your GET operations more human-readable, you can also “prettify” your results by adding the pretty argument:

      • curl -X GET 'http://localhost:9200/tutorial/helloworld/1?pretty'

      Now the response will output in a more readable format:

      Output

{
  "_index" : "tutorial",
  "_type" : "helloworld",
  "_id" : "1",
  "_version" : 2,
  "_seq_no" : 1,
  "_primary_term" : 1,
  "found" : true,
  "_source" : {
    "message" : "Hello People!"
  }
}

      This is how you can add and query data in Elasticsearch. To learn about the other operations you can check the Elasticsearch API documentation.

      Conclusion

      In this tutorial you installed, configured, and began using Elasticsearch on CentOS 8. Once you are comfortable with manual queries, your next task will be to start using the service from your applications.




      How To Install MongoDB on CentOS 8


      An earlier version of this tutorial was written by Melissa Anderson.

      Introduction

      MongoDB, also known as Mongo, is an open-source document database used in many modern web applications. It’s classified as a NoSQL database because it doesn’t rely on a traditional table-based relational database structure.

      Instead, it uses JSON-like documents with dynamic schemas, meaning that, unlike relational databases, MongoDB does not require a predefined schema before you add data to a database. You can alter the schema at any time and as often as is necessary without having to set up a new database with an updated schema.
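For example, once MongoDB is installed and running (later in this tutorial), the following mongo shell sketch stores two differently shaped documents in the same hypothetical users collection without any schema change:

• mongo --eval 'db.users.insertOne({name: "Sammy"}); db.users.insertOne({name: "Jo", roles: ["admin", "ops"]})'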

      In this tutorial you’ll install MongoDB on a CentOS 8 server, test it, and learn how to manage it as a systemd service.

      Prerequisites

      To complete this tutorial, you will need a server running CentOS 8. This server should have a non-root user with administrative privileges and a firewall configured with firewalld. To set this up, follow our Initial Server Setup guide for CentOS 8.

      Step 1 — Installing MongoDB

      There’s no official MongoDB package available in the standard CentOS repositories. In order to install Mongo on your server, you’ll need to add a repository file that points to MongoDB’s official repo. Your package manager will then read this file when searching for packages, and will be able to use it to install Mongo and any of its dependencies.

      In this guide, we’ll install Mongo with the DNF package manager, so you’ll need to add the repository file to the /etc/yum.repos.d/ directory. DNF checks any files in this directory that end with the .repo suffix when searching for package sources.

      You can use vi — a widely-used text editor that’s installed on CentOS systems by default — to create the repository file, but vi can be somewhat unintuitive for users who aren’t experienced with it. As an alternative, we recommend nano, a more user-friendly editor available from the standard CentOS repositories.

To install nano with DNF, run the following command:

• sudo dnf install nano

      During this installation process, the system will ask you to confirm that you want to install the software. To do so, press y, and then ENTER:

      Output

Transaction Summary
================================================================================
Install  1 Package

Total download size: 581 k
Installed size: 2.2 M
Is this ok [y/N]: y

      Once the installation is done, run the following command to create and open a repository file for editing:

      • sudo nano /etc/yum.repos.d/mongodb-org.repo

      Then add the following content to the empty file. This will install version 4.4 of MongoDB (the latest version at the time of this writing):

      /etc/yum.repos.d/mongodb-org.repo

      [mongodb-org]
      name=MongoDB Repository
      baseurl=https://repo.mongodb.org/yum/redhat/$releasever/mongodb-org/4.4/x86_64/
      gpgcheck=1
      enabled=1
      gpgkey=https://www.mongodb.org/static/pgp/server-4.4.asc
      

      Note: You can check whether there’s a newer version of MongoDB available by consulting the database’s official documentation. In a web browser, navigate to the Configure the package management system section of MongoDB’s RedHat and CentOS installation instructions.

      There, you’ll find a code block with the repository information for the latest version of MongoDB. If different from the previous file contents, you can copy that configuration and add it to your .repo file instead.

Here’s what each of these directives does:

      • [mongodb-org]: the first line of a .repo file is a single string of characters, wrapped in brackets, that serves as an identifier for the repository
      • name: this directive defines a human-readable name to describe the repository. You could enter whatever name you’d like here, but for clarity you can enter MongoDB Repository
      • baseurl: this points to the URL of a directory where the repository’s repodata directory, which contains the repository’s metadata, can be found
      • gpgcheck: setting this directive to 1 tells the DNF package manager to enable GPG signature-checking on this repository, requiring it to verify whether any packages you want to install from it have been corrupted or tampered with
      • enabled: setting this directive to 1 will tell DNF to include this repository as a package source; setting it to 0 would disable this behavior
      • gpgkey: this directive specifies the URL of the GPG key that should be imported to verify the signatures of packages from this repository

      After adding the repository information, save and close the file. If you used nano to create the repo file, do so by pressing CTRL + X, Y, then ENTER.

Before continuing, you can test whether DNF is able to find and use this repository by running the program’s repolist command:

• sudo dnf repolist

      If the repository is available to your server’s package manager, you’ll find it listed in the output:

      Output

repo id              repo name
AppStream            CentOS-8 - AppStream
BaseOS               CentOS-8 - Base
extras               CentOS-8 - Extras
mongodb-org          MongoDB Repository

      Following that, you can install the mongodb-org package with this command:

      • sudo dnf install mongodb-org

      Again, you’ll be asked to confirm that you want to install the package by pressing y then ENTER. DNF may also ask you to confirm the import of Mongo’s signing key; if this is the case, do so by once more pressing y and then ENTER.

      Once the command finishes, MongoDB will be installed on your server. Before you start up the database, though, the MongoDB documentation recommends that you disable Transparent Huge Pages on your server to optimize performance.

      Step 2 — Disabling Transparent Huge Pages to Improve Performance

      CentOS enables Transparent Huge Pages (THP), a Linux memory management system, by default. THP uses extra large memory pages to reduce the impact of Translation Lookaside Buffer lookups on machines with large amounts of memory. However, this system can have a negative impact on database performance and the MongoDB documentation recommends that you disable THP.

      To disable Transparent Huge Pages on CentOS 8, you can create a systemd unit file that will disable it at boot. systemd is an init system and software suite used in many Linux operating systems with which you can control many aspects of a server. In systemd, a unit refers to any resource that the system knows how to operate on and manage. A unit file is a special configuration file that defines one of these resources.

      Create this file in the /etc/systemd/system/ directory with sudo privileges:

      • sudo nano /etc/systemd/system/disable-thp.service

      In the file, you’ll need to add a few separate sections. The first you’ll add is a [Unit] section which holds some general information about the service unit.

      Add the following highlighted lines:

      /etc/systemd/system/disable-thp.service

      [Unit]
      Description=Disable Transparent Huge Pages (THP)
      After=sysinit.target local-fs.target
      Before=mongod.service
      

      After the [Unit] section header are the following options:

      • Description: this defines a human-readable name for the unit
      • After: this ensures that the disable-thp service unit will start up only after the two specified targets — predefined groups of systemd units — finish starting up
      • Before: this option ensures that the disable-thp service will always finish starting up before mongod, the MongoDB service unit, starts up

      Next, add the highlighted [Service] section to the file:

      /etc/systemd/system/disable-thp.service

      . . .
      Before=mongod.service
      
      [Service]
      Type=oneshot
      ExecStart=/bin/sh -c 'echo never | tee /sys/kernel/mm/transparent_hugepage/enabled > /dev/null' 
      

      This section’s Type option defines that this unit will be a oneshot type process, meaning that it will be a one-off task and systemd should wait for the process to exit before continuing.

      The ExecStart option is used to specify the full path and any arguments of the command that will start the process. The command included here will open up a shell process and then run the command between the single quotes. This command echoes the string never and then pipes it into the following tee command. The tee command then rewrites the /sys/kernel/mm/transparent_hugepage/enabled file with never as its only contents and passes any output to /dev/null, a null device which immediately discards any information written to it. This is what will actually disable THP.

      Following that, add this highlighted [Install] section:

      /etc/systemd/system/disable-thp.service

      . . .
      ExecStart=/bin/sh -c 'echo never | tee /sys/kernel/mm/transparent_hugepage/enabled > /dev/null'
      
      [Install]
      WantedBy=basic.target
      

      [Install] sections carry installation information for the unit. This section only has one option, WantedBy, which will cause the disable-thp unit to start when basic.target is started.

      Note: If you’d like to learn more about systemd units and services, we encourage you to check out our guide on Understanding Systemd Units and Unit Files.

      The entire service unit file should look like this when you’ve finished adding all the lines:

      /etc/systemd/system/disable-thp.service

      [Unit]
      Description=Disable Transparent Huge Pages (THP)
      After=sysinit.target local-fs.target
      Before=mongod.service
      
      [Service]
      Type=oneshot
      ExecStart=/bin/sh -c 'echo never | tee /sys/kernel/mm/transparent_hugepage/enabled > /dev/null'
      
      [Install]
      WantedBy=basic.target
      

      Save and close the file when finished. Then, reload systemd to make your system aware of the new disable-thp service:

      • sudo systemctl daemon-reload

      Next, start the disable-thp service:

      • sudo systemctl start disable-thp.service

      You can confirm that THP has been disabled by checking the contents of the /sys/kernel/mm/transparent_hugepage/enabled file:

      • cat /sys/kernel/mm/transparent_hugepage/enabled

      If THP was successfully disabled, never will be wrapped in brackets in the output:

      Output

      always madvise [never]

      Following that, enable the disable-thp service so that it will start up automatically and disable THP whenever the server boots up. Notice that this command doesn’t include .service in the service file definition. systemctl will append this suffix to whatever argument you pass automatically if it isn’t already present, so it isn’t necessary to include it:

      • sudo systemctl enable disable-thp

The disable-thp service will now start up whenever your server boots and will disable THP before the MongoDB service starts. However, there’s one more step you’ll need to take to ensure that THP remains disabled on your system.

      By default, CentOS 8 also has tuned — a kernel tuning tool — installed and enabled. tuned uses a number of preconfigured tuning profiles that can improve performance for a number of specific use cases. You can edit these profiles or create new ones customized for your system.

      The tuned tool can affect the THP setting on your system, so the MongoDB documentation also recommends that you also create a custom profile to ensure that THP doesn’t get enabled unexpectedly.

      Get started with this by creating a new directory to hold the custom tuned profile:

      • sudo mkdir /etc/tuned/no-thp

      Within this directory, create a configuration file named tuned.conf:

      • sudo nano /etc/tuned/no-thp/tuned.conf

      Add the following two sections to the file:

      /etc/tuned/no-thp/tuned.conf

      [main]
      include=virtual-guest
      
      [vm]
      transparent_hugepages=never
      

      The first section, [main], must be included in every tuned configuration file. Here, we only specify one include statement which will cause the no-thp profile to inherit the characteristics of another tuned profile named virtual-guest.

      Note: Inheriting the characteristics of the virtual-guest profile like this will work in many cases, but it may not be optimal for every case. We encourage you to review this documentation on provided tuned profiles to determine whether inheriting the characteristics of another profile would be more appropriate for your system.

      The next section specifies a special plugin — vm — which is used specifically to enable or disable THP based on the value following the transparent_hugepages boolean option.

      After adding these lines, save and close the file. Then enable the new profile:

      • sudo tuned-adm profile no-thp
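You can confirm that the profile was applied with tuned-adm’s active subcommand, which should report no-thp as the current active profile:

• sudo tuned-adm active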

      With that, you’ve disabled THP on your server. You can now start the MongoDB service and test the database’s functionality.

      Step 3 — Starting the MongoDB Service and Testing the Database

      The installation process described in Step 1 automatically configures MongoDB to run as a daemon controlled by systemd, meaning you can manage MongoDB using the various systemctl commands. However, this installation procedure doesn’t automatically start the service.

      Run the following systemctl command to start the MongoDB service:

      • sudo systemctl start mongod

      Then check the service’s status:

      • sudo systemctl status mongod

      This command will return output like the following, indicating that the service is up and running:

      Output

● mongod.service - MongoDB Database Server
   Loaded: loaded (/usr/lib/systemd/system/mongod.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2020-10-01 20:31:26 UTC; 19s ago
     Docs: https://docs.mongodb.org/manual
  Process: 14208 ExecStart=/usr/bin/mongod $OPTIONS (code=exited, status=0/SUCCESS)
  Process: 14205 ExecStartPre=/usr/bin/chmod 0755 /var/run/mongodb (code=exited, status=0/SUCCESS)
  Process: 14203 ExecStartPre=/usr/bin/chown mongod:mongod /var/run/mongodb (code=exited, status=0/SUCCESS)
  Process: 14201 ExecStartPre=/usr/bin/mkdir -p /var/run/mongodb (code=exited, status=0/SUCCESS)
 Main PID: 14210 (mongod)
   Memory: 66.6M
   CGroup: /system.slice/mongod.service
           └─14210 /usr/bin/mongod -f /etc/mongod.conf

      After confirming that the service is running as expected, enable the MongoDB service to start up at boot:

      • sudo systemctl enable mongod

      You can further verify that the database is operational by connecting to the database server and executing a diagnostic command. The following command will connect to the database and output its current version, server address, and port. It will also return the result of MongoDB’s internal connectionStatus command:

      • mongo --eval 'db.runCommand({ connectionStatus: 1 })'

      connectionStatus will check and return the status of the database connection. A value of 1 for the ok field in the response indicates that the server is working as expected:

      Output

MongoDB shell version v4.4.1
connecting to: mongodb://127.0.0.1:27017/?compressors=disabled&gssapiServiceName=mongodb
Implicit session: session { "id" : UUID("460fe822-2881-477c-b095-aa3ccb49702d") }
MongoDB server version: 4.4.1
{
	"authInfo" : {
		"authenticatedUsers" : [ ],
		"authenticatedUserRoles" : [ ]
	},
	"ok" : 1
}

      Also, note that the database is running on port 27017 on 127.0.0.1, the local loopback address representing localhost. This is MongoDB’s default port number.
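To double-check this from the operating system’s side, a quick sketch with the ss utility lists the listening socket:

• sudo ss -tlnp | grep 27017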

      Next, we’ll look at how to manage the MongoDB server instance with systemd.

      Step 4 — Managing the MongoDB Service

      As mentioned previously, the installation process described in Step 1 configures MongoDB to run as a systemd service. This means that you can manage it using standard systemctl commands as you would with other CentOS system services.

      Recall that the systemctl status command checks the status of the MongoDB service:

      • sudo systemctl status mongod

      You can stop the service anytime by typing:

      • sudo systemctl stop mongod

      To start the service when it’s stopped, run:

      • sudo systemctl start mongod

      You can also restart the server when it’s already running:

      • sudo systemctl restart mongod

      In Step 3, you enabled MongoDB to start automatically with the server. If you ever wish to disable this automatic startup, type:

      • sudo systemctl disable mongod

      Then to re-enable it to start up at boot, run the enable command again:

      • sudo systemctl enable mongod

      For more information on how to manage systemd services, check out Systemd Essentials: Working with Services, Units, and the Journal.

      Conclusion

      In this tutorial, you added the official MongoDB repository to your list of DNF repos and installed the latest version of the database. You then disabled Transparent Huge Pages to optimize the database’s performance, tested Mongo’s functionality, and practiced some systemctl commands.

      As an immediate next step, we strongly recommend that you harden your MongoDB installation’s security by following our guide on How To Secure MongoDB on CentOS 8. Once it’s secured, you could then configure MongoDB to accept remote connections.

      You can find more tutorials on how to configure and use MongoDB in these DigitalOcean community articles. We also encourage you to check out the official MongoDB documentation, as it’s a great resource on the possibilities that MongoDB provides.


