One place for hosting & domains

      Replication

      How To Migrate Redis Data with Replication on Ubuntu 18.04


      Introduction

      Redis is an in-memory, key-value data store known for its flexibility, performance, wide language support, and built-in features like replication. Replication is the practice of regularly copying data from one database to another in order to have a replica that always remains an exact duplicate of the primary instance. One common use of Redis replication is to migrate an existing Redis data store to a new server, as one might do when scaling up their infrastructure for better performance.

      This tutorial outlines the process of using Redis’s built-in replication features to migrate data from one Ubuntu 18.04 server (the “source”) to another (the “target”). This involves making a few configuration changes to each server, setting the target server to function as a replica of the source, and then promoting the replica back to a primary after the migration is completed.

      Prerequisites

      To complete this tutorial, you will need:

      Step 1 — (Optional) Loading Your Source Redis Instance with Sample Data

      This optional step involves loading your source Redis instance with some sample data so you can experiment with migrating data to your target instance. If you already have data that you want to migrate over to your target, you can move ahead to Step 2 which will go over how to back it up.

      To begin, connect to the Ubuntu server you’ll use as your source Redis instance as your non-root user:

      • ssh sammy@source_server_ip

      Then run the following command to access your Redis server:

      If you’ve configured your Redis server to require password authentication, run the auth command followed by your Redis password:

      • auth source_redis_password

      Next, run the following commands. These will create a number of keys holding a few strings, a hash, a list, and a set:

      • mset string1 "Redis" string2 "is" string3 "fun!"
      • hmset hash1 field1 "Redis" field2 "is" field3 "fast!"
      • rpush list1 "Redis" "is" "feature-rich!"
      • sadd set1 "Redis" "is" "free!"

      Additionally, run the following expire commands to provide a few of these keys with a timeout. This will make them volatile, meaning that Redis will delete them after a specified amount of time (7500 seconds, in this case):

      • expire string2 7500
      • expire hash1 7500
      • expire set1 7500

      With that, you have some example data you can export to your target Redis instance. Keep the redis-cli prompt open for now, since we will run a few more commands from it in the next step to back this data up.

      Step 2 — Backing Up Your Source Redis Instance

      Any time you plan to move data from one server to another, there’s a risk that something could go wrong and you could lose data as a result. Even though this risk is small, we will use Redis’s bgsave command to create a backup of your source Redis database in case you encounter an error during the replication process.

      If you don’t already have it open, start by opening up the Redis command line interface:

      Also, if you’ve configured your Redis server to require password authentication, run the auth command followed by your Redis password:

      Next, run the bgsave command. This will create a snapshot of your current data set and export it to a dump file held in Redis’s working directory:

      Note: You can take a snapshot of your Redis database with either the save or bgsave commands. The reason we use the bgsave command here, though, is that the save command runs synchronously, meaning it will block any other clients connected to the database. Because of this, the save command documentation recommends that you should almost never run it in a production environment.

      Instead, it suggests using the bgsave command which runs asynchronously. This will cause Redis to fork the database into two processes: the parent process will continue to serve clients while the child saves the database before exiting:

      Note that if clients add or modify data while the bgsave operation is running, these changes won’t be captured in the snapshot.

      Following that, you can close the connection to your Redis instance by running the exit command:

      If you need it in the future, you can find the data dump file in your Redis instance’s working directory. Recall how in the prerequisite Redis installation tutorial you set your Redis instance to use /var/lib/redis as its working directory.

      List the contents of your Redis working directory to confirm that it’s holding the data dump file:

      If the dump file was exported correctly, you will see it in this command’s output. By default, this file is named dump.rdb:

      Output

      dump.rdb

      After confirming that your data was backed up correctly, you’re all set to configure your source Redis server to accept external connections and allow for replication.

      Step 3 — Configuring Your Source Redis Instance

      By default, Redis isn’t configured to listen for external connections, meaning that any replicas you configure won’t be able to sync with your source instance unless you update its configuration. Here, we will update the source instance’s configuration file to allow for external connections and also set a password which the target instance will use to authenticate once replication begins. After that, we’ll add a firewall rule to allow connections to the port on which Redis is running.

      Open up your source Redis instance’s configuration file with your preferred text editor. Here, we’ll use nano:

      • sudo nano /etc/redis/redis.conf

      Navigate to the line that begins with the bind directive. It will look like this by default:

      /etc/redis/redis.conf

      . . .
      bind 127.0.0.1
      . . .
      

      This directive binds Redis to 127.0.0.1, an IPv4 loopback address that represents localhost. This means that this Redis instance is configured to only listen for connections that originate from the same server as the one where it’s installed. To allow your source instance to accept any connection made to its public IP address, such as those made from your target instance, add your source Redis server’s IP address after the 127.0.0.1. Note that you shouldn’t include any commas after 127.0.0.1:

      /etc/redis/redis.conf

      . . .
      bind 127.0.0.1 source_server_IP
      . . .
      

      Next, if you haven’t already done so, use the requirepass directive to configure a password which users must enter before they can interact with the data on the source instance. Do so by uncommenting the directive and setting it to a complex password or passphrase:

      /etc/redis/redis.conf

      . . .
      requirepass source_redis_password
      . . .
      

      Be sure to take note of the password you set here, as you will need it when you configure the target server.

      Following that change, you can save and close the Redis configuration file. If you edited it with nano, do so by pressing CTRL+X, Y, then ENTER.

      Then, restart the Redis service to put these changes into effect:

      • sudo systemctl restart redis

      That’s all you need to do in terms of configuring Redis, but if you configured a firewall on your server it will continue to block any attempts by your target server to connect with the source. Assuming you configured your firewall with ufw, you could update it to allow connections to the port on which Redis is running with the following command. Note that Redis is configured to use port 6379 by default:

      After making that final change you’re all done configuring your source Redis server. Continue on to configure your target Redis instance to function as a replica of the source.

      Step 4 — Configuring your Target Redis Instance

      By this point you’ve configured your source Redis instance to accept external connections. However, because you’ve locked down access to the source by uncommenting the requirepass directive, your target instance won’t be able to replicate the data held on the source. Here, you will configure your target Redis instance to be able to authenticate its connection to the source, thereby allowing replication.

      Begin by connecting to your target Redis server as your non-root user:

      • ssh sammy@target_server_ip

      Next, open up your target server’s Redis configuration file:

      • sudo nano /etc/redis/redis.conf

      If you haven’t done so already, you should configure a password for your target Redis instance with the requirepass directive:

      /etc/redis/redis.conf

      . . .
      requirepass target_redis_password
      . . .
      

      Next, uncomment the masterauth directive and set it to your source Redis instance’s authentication password. By doing this, your target server will be able to authenticate to the source instance after you enable replication:

      /etc/redis/redis.conf

      . . .
      masterauth source_redis_password
      . . .
      

      Lastly, if you have clients writing information to your source instance, you will want to configure them to write data to your target instance as well. This way, if a client writes any data after you promote the target back to being a primary instance, it won’t get lost.

      To do this, though, you will need to adjust the replica-read-only directive. This is set to yes by default, which means that it’s configured to become a “read-only” replica which clients won’t be able to write to. Set this directive to no to allow clients to write to it:

      /etc/redis/redis.conf

      . . .
      replica-read-only no
      . . .
      

      Those are all the changes you need to make to the target’s configuration file, so you can save and close it.

      Then, restart the Redis service to put these changes into effect:

      • sudo systemctl restart redis

      After restarting the Redis service your target server will be ready to become a replica of the source. All you’ll need to do to turn it into one is to run a single command, which we’ll do shortly.

      Note: If you have any clients writing data to your source Redis instance, now would be a good time to configure them to also write data to your target.

      Step 5 — Starting and Verifying Replication

      By this point, you have configured your source Redis instance to accept connections from your target server and you’ve configured your target Redis instance to be able to authenticate to the source as a replica. With these pieces in place, you’re ready to turn your target instance into a replica of the source.

      Begin by opening up the Redis command line interface on your target Redis server:

      Run the auth command to authenticate the connection:

      Next, turn the target instance into a replica of the source with the replicaof command. Be sure to replace source_server_ip with your source instance’s public IP address and source_port with the port used by Redis on your source instance:

      • replicaof source_server_ip source_port

      From the prompt, run the following scan command. This will return all the keys currently held by the replica:

      If replication is working as expected, you will see all the keys from your source instance held in the replica. If you loaded your source with the sample data in Step 1, the scan command’s output will look like this:

      Output

      1) "0" 2) 1) "string3" 2) "string1" 3) "set1" 4) "string2" 5) "hash1" 6) "list1"

      Note: Be aware that this command may return the keys in a different order than what’s shown in this example.

      However, if this command doesn’t return the same keys held on your source Redis instance, it may be that there is an error in one of your servers’ configuration files preventing the target database from connecting to the source. In this case, close the connection to your target Redis instance, and double check that you’ve edited the configuration files on both your source and target Redis servers correctly.

      While you have the connection open, you can also confirm that the keys you set to expire are still volatile. Do so by running the ttl command with one of these keys as an argument:

      This will return the number of seconds before this key will be deleted:

      Output

      5430

      Once you’ve confirmed that the data on your source instance was correctly synced to your target, you can promote the target back to being a primary instance by running the replicaof command once again. This time, however, instead of following replicaof with an IP address and port, follow it with no one. This will cause the target instance to stop syncing with the source immediately:

      To confirm that the data replicated from the source persist on the target, rerun the scan command you entered previously:

      scan 0
      

      You should see the same keys in this command’s output as when you ran the scan command when the target was still replicating the source:

      Output

      1) "0" 2) 1) "string3" 2) "string1" 3) "set1" 4) "string2" 5) "hash1" 6) "list1"

      With that, you’ve successfully migrated all the data from your source Redis instance to your target. If you have any clients that are still writing data to the source instance, now would be a good time to configure them to only write to the target.

      Conclusion

      There are several methods besides replication you can use to migrate data from one Redis instance to another, but replication has the advantages of requiring relatively few configuration changes to work and only a single command to initiate or stop.

      If you’d like to learn more about working with Redis, we encourage you to check out our tutorial series on How To Manage a Redis Database. Also, if you want to move your Redis data to a Redis instance managed by DigitalOcean, follow our guide on how to do so.



      Source link

      How To Set Up Logical Replication with PostgreSQL 10 on Ubuntu 18.04


      Introduction

      When setting up an application for production, it’s often useful to have multiple copies of your database in place. The process of keeping database copies in sync is called replication. Replication can provide high-availability horizontal scaling for high volumes of simultaneous read operations, along with reduced read latencies. It also allows for peer-to-peer replication between geographically distributed database servers.

      PostgreSQL is an open-source object-relational database system that is highly extensible and compliant with ACID (Atomicity, Consistency, Isolation, Durability) and the SQL standard. Version 10.0 of PostgreSQL introduced support for logical replication, in addition to physical replication. In a logical replication scheme, high-level write operations are streamed from a master database server into one or more replica database servers. In a physical replication scheme, binary write operations are instead streamed from master to replica, producing a byte-for-byte exact copy of the original content. In cases where you would like to target a particular subset of data, such as off-load reporting, patching, or upgrading, logical replication can offer speed and flexibility.

      In this tutorial, you will configure logical replication with PostgreSQL 10 on two Ubuntu 18.04 servers, with one server acting as the master and the other as the replica. By the end of the tutorial you will be able to replicate data from the master server to the replica using logical replication.

      Prerequisites

      To follow this tutorial, you will need:

      Step 1 — Configuring PostgreSQL for Logical Replication

      There are several configuration settings you will need to modify to enable logical replication between your servers. First, you’ll configure Postgres to listen on the private network interface instead of the public one, as exposing data over the public network is a security risk. Then you’ll configure the appropriate settings to allow replication to db-replica.

      On db-master, open /etc/postgresql/10/main/postgresql.conf, the main server configuration file:

      • sudo nano /etc/postgresql/10/main/postgresql.conf

      Find the following line:

      /etc/postgresql/10/main/postgresql.conf

      ...
      #listen_addresses = 'localhost'         # what IP address(es) to listen on;
      ...
      

      Uncomment it by removing the #, and add your db_master_private_ip_address to enable connections on the private network:

      Note: In this step and the steps that follow, make sure to use the private IP addresses of your servers, and not their public IPs. Exposing a database server to the public internet is a considerable security risk.

      /etc/postgresql/10/main/postgresql.conf

      ...
      listen_addresses = 'localhost, db_master_private_ip_address'
      ...
      

      This makes db-master listen for incoming connections on the private network in addition to the loopback interface.

      Next, find the following line:

      /etc/postgresql/10/main/postgresql.conf

      ...
      #wal_level = replica                    # minimal, replica, or logical
      ...
      

      Uncomment it, and change it to set the PostgreSQL Write Ahead Log (WAL) level to logical. This increases the volume of entries in the log, adding the necessary information for extracting discrepancies or changes to particular data sets:

      /etc/postgresql/10/main/postgresql.conf

      ...
      wal_level = logical
      ...
      

      The entries on this log will be consumed by the replica server, allowing for the replication of the high-level write operations from the master.

      Save the file and close it.

      Next, let’s edit /etc/postgresql/10/main/pg_hba.conf, the file that controls allowed hosts, authentication, and access to databases:

      • sudo nano /etc/postgresql/10/main/pg_hba.conf

      After the last line, let’s add a line to allow incoming network connections from db-replica. We’ll use db-replica‘s private IP address, and specify that connections are allowed from all users and databases:

      /etc/postgresql/10/main/pg_hba.conf

      ...
      # TYPE      DATABASE        USER            ADDRESS                               METHOD
      ...
      host         all            all             db_replica_private_ip_address/32      md5
      

      Incoming network connections will now be allowed from db-replica, authenticated by a password hash (md5).

      Save the file and close it.

      Next, let’s set our firewall rules to allow traffic from db-replica to port 5432 on db-master:

      • sudo ufw allow from db_replica_private_ip_address to any port 5432

      Finally, restart the PostgreSQL server for the changes to take effect:

      • sudo systemctl restart postgresql

      With your configuration set to allow logical replication, you can now move on to creating a database, user role, and table.

      Step 2 — Setting Up a Database, User Role, and Table

      To test the functionality of your replication settings, let’s create a database, table, and user role. You will create an example database with a sample table, which you can then use to test logical replication between your servers. You will also create a dedicated user and assign them privileges over both the database and the table.

      First, open the psql prompt as the postgres user with the following command on both db-master and db-replica:

      Create a new database called example on both hosts:

      Note: The final ; in these commands is required. On interactive sessions, PostgreSQL will not execute SQL commands until you terminate them with a semicolon. Meta-commands (those starting with a backslash, like q and c) directly control the psql client itself, and are therefore exempt from this rule. For more on meta-commands and the psql client, please refer to the PostgreSQL documentation.

      Using the connect meta-command, connect to the databases you just created on each host:

      Create a new table called widgets with arbitrary fields on both hosts:

      • CREATE TABLE widgets
      • (
      • id SERIAL,
      • name TEXT,
      • price DECIMAL,
      • CONSTRAINT widgets_pkey PRIMARY KEY (id)
      • );
      • CREATE TABLE widgets
      • (
      • id SERIAL,
      • name TEXT,
      • price DECIMAL,
      • CONSTRAINT widgets_pkey PRIMARY KEY (id)
      • );

      The table on db-replica does not need to be identical to its db-master counterpart. However, it must contain every single column present on the table at db-master. Additional columns must not have NOT NULL or other constraints. If they do, replication will fail.

      On db-master, let's create a new user role with the REPLICATION option and a login password. The REPLICATION attribute must be assigned to any role used for replication. We will call our user sammy, but you can replace this with your own username. Make sure to also replace my_password with your own secure password:

      • CREATE ROLE sammy WITH REPLICATION LOGIN PASSWORD 'my_password';

      Make a note of your password, as you will use it later on db-replica to set up replication.

      Still on db-master, grant full privileges on the example database to the user role you just created:

      • GRANT ALL PRIVILEGES ON DATABASE example TO sammy;

      Next, grant privileges on all of the tables contained in the database to your user:

      • GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA public TO sammy;

      The public schema is a default schema in each database into which tables are automatically placed.

      With these privileges set, you can now move on to making the tables in your example database available for replication.

      Step 3 — Setting Up a Publication

      Publications are the mechanism that PostgreSQL uses to make tables available for replication. The database server will keep track internally of the connection and replication status of any replica servers associated with a given publication. On db-master, you will create a publication, my_publication, that will function as a master copy of the data that will be sent to your subscribers — in our case, db-replica.

      On db-master, create a publication called my_publication:

      • CREATE PUBLICATION my_publication;

      Add the widgets table you created previously to it:

      • ALTER PUBLICATION my_publication ADD TABLE widgets;

      With your publication in place, you can now add a subscriber that will pull data from it.

      Step 4 — Creating a Subscription

      Subscriptions are used by PostgreSQL to connect to existing publications. A publication can have many subscriptions across different replica servers, and replica servers can also have their own publications with subscribers. To access the data from the table you created on db-master, you will need to create a subscription to the publication you created in the previous step, my_publication.

      On db-replica, let's create a subscription called my_subscription. The CREATE SUBSCRIPTION command will name the subscription, while the CONNECTION parameter will define the connection string to the publisher. This string will include the master server's connection details and login credentials, including the username and password you defined earlier, along with the name of the example database. Once again, remember to use db-master's private IP address, and replace my_password with your own password:

      • CREATE SUBSCRIPTION my_subscription CONNECTION 'host=db_master_private_ip_address port=5432 password=my_password user=sammy dbname=example' PUBLICATION my_publication;

      You will see the following output confirming the subscription:

      Output

      NOTICE: created replication slot "my_subscription" on publisher CREATE SUBSCRIPTION

      Upon creating a subscription, PostgreSQL will automatically sync any pre-existing data from the master to the replica. In our case there is no data to sync since the widgets table is empty, but this is a useful feature when adding new subscriptions to an existing database.

      With a subscription in place, let's test the setup by adding some demo data to the widgets table.

      Step 5 — Testing and Troubleshooting

      To test replication between our master and replica, let's add some data to the widgets table and verify that it replicates correctly.

      On db-master, insert the following data on the widgets table:

      • INSERT INTO widgets (name, price) VALUES ('Hammer', 4.50), ('Coffee Mug', 6.20), ('Cupholder', 3.80);

      On db-replica, run the following query to fetch all the entries on this table:

      You should now see:

      Output

      id | name | price ----+------------+------- 1 | Hammer | 4.50 2 | Coffee Mug | 6.20 3 | Cupholder | 3.80 (3 rows)

      Success! The entries have been successfully replicated from db-master to db-replica. From now on, all INSERT, UPDATE, and DELETE queries will be replicated across servers unidirectionally.

      One thing to note about write queries on replica servers is that they are not replicated back to the master server. PostgreSQL currently has limited support for resolving conflicts when the data between servers diverges. If there is a conflict, the replication will stop and PostgreSQL will wait until the issue is manually fixed by the database administrator. For that reason, most applications will direct all write operations to the master server, and distribute reads among available replica servers.

      You can now exit the psql prompt on both servers:

      Now that you have finished testing your setup, you can add and replicate data on your own.

      Troubleshooting

      If replication doesn't seem to be working, a good first step is checking the PostgreSQL log on db-replica for any possible errors:

      • tail /var/log/postgresql/postgresql-10-main.log

      Here are some common problems that can prevent replication from working:

      • Private networking is not enabled on both servers, or the servers are on different networks;
      • db-master is not configured to listen for connections on the correct private network IP;
      • The Write Ahead Log level on db-master is incorrectly configured (it must be set to logical);
      • db-master is not configured to accept incoming connections from the correct db-replica private IP address;
      • A firewall like UFW is blocking incoming PostgreSQL connections on port 5432;
      • There are mismatched table names or fields between db-master and db-replica;
      • The sammy database role is missing the required permissions to access the example database on db-master;
      • The sammy database role is missing the REPLICATION option on db-master;
      • The sammy database role is missing the required permissions to access the widgets table on db-master;
      • The table wasn't added to the publication on db-master.

      After resolving the existing problem(s), replication should take place automatically. If it doesn't, use following command to remove the existing subscription before recreating it:

      • DROP SUBSCRIPTION my_subscription;

      Conclusion

      In this tutorial you've successfully installed PostgreSQL 10 on two Ubuntu 18.04 servers and configured logical replication between them.

      You now have the required knowledge to experiment with horizontal read scaling, high availability, and the geographical distribution of your PostgreSQL database by adding additional replica servers.

      To learn more about logical replication in PostgreSQL 10, you can read the chapter on the topic on the official PostgreSQL documentation, as well as the manual entries on the CREATE PUBLICATION and CREATE SUBSCRIPTION commands.



      Source link