      How To Troubleshoot Issues in Redis


      Introduction

      Redis is an open-source, in-memory key-value data store. It comes with several commands that can help with troubleshooting and debugging issues. Because of Redis’s nature as an in-memory key-value store, many of these commands focus on memory management, but there are others that are valuable for providing an overview of the state of your Redis server. This tutorial will provide details on how to use some of these commands to help diagnose and resolve issues you may run into as you use Redis.

      How To Use This Guide
      This guide is written as a cheat sheet with self-contained examples. We encourage you to jump to any section that is relevant to the task you’re trying to complete.

      The commands and outputs shown in this guide were tested on an Ubuntu 18.04 server running Redis version 4.0.9. To obtain a similar setup, you can follow Step 1 of our guide on How To Install and Secure Redis on Ubuntu 18.04. We will demonstrate how these commands behave by running them with redis-cli, the Redis command line interface. Note that if you’re using a different Redis interface — Redli, for example — the exact outputs of certain commands may differ.

      Alternatively, you could provision a managed Redis database instance to test these commands, but note that depending on the level of control allowed by your database provider, some commands in this guide may not work as described. To provision a DigitalOcean Managed Database, follow our Managed Databases product documentation. Then, you must either install Redli or set up a TLS tunnel in order to connect to the Managed Database over TLS.

      Troubleshooting Memory-related Issues

      memory usage tells you how much memory is currently being used by a single key. It takes the name of a key as an argument and outputs the number of bytes it uses:

      • memory usage key_meaningOfLife

      Output

      (integer) 42
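
      To compare several keys at once, the same command can be run in a loop. The following is a minimal sketch rather than part of the tested setup: the function name key_sizes and the REDIS_CLI variable are hypothetical, and it assumes redis-cli can reach your server with its default connection settings. When its output is captured or piped, redis-cli prints the bare number instead of the (integer) 42 form shown above.

```shell
# Assumption: REDIS_CLI names the client to run; add flags such as -h or -a
# here if your server needs them.
REDIS_CLI="${REDIS_CLI:-redis-cli}"

# key_sizes KEY...: print "<bytes><TAB><key>" for each key, largest first.
key_sizes() {
  for key in "$@"; do
    printf '%s\t%s\n' "$($REDIS_CLI memory usage "$key")" "$key"
  done | sort -rn
}
```

      For example, key_sizes key_meaningOfLife key_1 would list the two keys with the larger one first. A key that does not exist produces an empty size and sorts last.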

      For a more general understanding of how your Redis server is using memory, you can run the memory stats command:

      • memory stats

      This command outputs an array of memory-related metrics and their values. The following are the metrics reported by memory stats:

      • peak.allocated: The peak number of bytes consumed by Redis
      • total.allocated: The total number of bytes allocated by Redis
      • startup.allocated: The initial number of bytes consumed by Redis at startup
      • replication.backlog: The size of the replication backlog, in bytes
      • clients.slaves: The total size of all replica overheads (the output and query buffers and connection contexts)
      • clients.normal: The total size of all client overheads
      • aof.buffer: The total size of the current and rewrite append-only file buffers
      • db.0: The overheads of the main and expiry dictionaries for each database in use on the server, reported in bytes
      • overhead.total: The sum of all overheads used to manage Redis’s keyspace
      • keys.count: The total number of keys stored in all the databases on the server
      • keys.bytes-per-key: The ratio of the server’s net memory usage and keys.count
      • dataset.bytes: The size of the dataset, in bytes
      • dataset.percentage: The percentage of Redis’s net memory usage taken by dataset.bytes
      • peak.percentage: The percentage of peak.allocated taken out of total.allocated
      • fragmentation: The ratio of the physical memory the operating system has assigned to Redis (its resident set size) to the memory Redis has allocated; values well above 1 suggest fragmentation

      memory malloc-stats provides an internal statistics report from jemalloc, the memory allocator used by Redis on Linux systems:

      • memory malloc-stats

      If it seems like you’re running into memory-related issues, but parsing the output of the previous commands proves to be unhelpful, you can try running memory doctor:

      • memory doctor

      This feature will output any memory consumption issues that it can find and suggest potential solutions.

      Getting General Information about Your Redis Instance

      A debugging command that isn’t directly related to memory management is monitor. This command allows you to see a constant stream of every command processed by the Redis server:

      • monitor

      Output

      OK
      1566157213.896437 [0 127.0.0.1:47740] "auth" "foobared"
      1566157215.870306 [0 127.0.0.1:47740] "set" "key_1" "878"
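
      On a busy server the monitor stream scrolls too quickly to read, so it can help to summarize it. The sketch below is an illustration only: monitor_counts is a hypothetical helper that counts how often each command name appears, assuming the line layout shown in the output above (timestamp, bracketed database and client address, then the quoted command).

```shell
# monitor_counts: read MONITOR output on stdin and print how many times each
# command name appeared, most frequent first.
monitor_counts() {
  awk '
    $1 ~ /^[0-9]+\.[0-9]+$/ {      # skip the initial OK and any stray lines
      cmd = tolower($4)            # 4th field is the quoted command name
      gsub(/"/, "", cmd)
      count[cmd]++
    }
    END { for (c in count) print count[c], c }
  ' | sort -rn
}
```

      One way to use it is redis-cli monitor | head -n 1000 | monitor_counts, which stops after 1000 lines. Keep in mind that monitor prints command arguments verbatim, including passwords such as the one in the auth line above, and that it adds load to the server while it runs.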

      Another command useful for debugging is info, which returns several blocks of information and statistics about the server:

      • info

      Output

      # Server
      redis_version:4.0.9
      redis_git_sha1:00000000
      redis_git_dirty:0
      redis_build_id:9435c3c2879311f3
      redis_mode:standalone
      os:Linux 4.15.0-52-generic x86_64
      . . .

      This command returns a lot of information. If you only want to see one info block, you can specify it as an argument to info:

      • info CPU

      Output

      # CPU
      used_cpu_sys:173.16
      used_cpu_user:70.89
      used_cpu_sys_children:0.01
      used_cpu_user_children:0.04

      Note that the information returned by the info command will depend on which version of Redis you’re using.
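
      Because info prints one name:value pair per line, a single value is easy to pull out in a script. The helper below is a hedged sketch, not part of redis-cli: info_field is a hypothetical name, and it assumes the field you ask for exists in your Redis version. It strips the carriage returns that Redis includes at the end of each line.

```shell
# info_field NAME: read INFO output on stdin and print the value of one field.
# Exits with a non-zero status if the field is not present.
info_field() {
  tr -d '\r' | awk -v name="$1" '
    index($0, name ":") == 1 { print substr($0, length(name) + 2); found = 1 }
    END { exit !found }'
}
```

      For example, redis-cli info CPU | info_field used_cpu_user would print only the user CPU time from the block shown above.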

      Using the keys Command

      The keys command is helpful in cases where you’ve forgotten the name of a key, or perhaps you’ve created one but accidentally misspelled its name. keys looks for keys that match a pattern:

      • keys pattern

      The following glob-style patterns are supported:

      • ? is a wildcard standing for any single character, so s?mmy matches sammy, sommy, and sqmmy
      • * is a wildcard that stands for any number of characters, including no characters at all, so sa*y matches sammy, say, sammmmmmy, and salmony
      • You can specify two or more characters that the pattern can include by wrapping them in brackets, so s[ai]mmy will match sammy and simmy, but not summy
      • To set a wildcard that excludes one or more letters, wrap them in brackets and precede them with a caret (^), so s[^oi]mmy will match sammy and sxmmy, but not sommy or simmy
      • To set a wildcard that includes a range of letters, separate the beginning and end of the range with a hyphen and wrap it in brackets, so s[a-o]mmy will match sammy, skmmy, and sommy, but not srmmy

      Warning: The Redis documentation warns that keys should almost never be used in a production environment, since it can have a major negative impact on performance.
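
      On a production server, the usual alternative is the scan command, which walks the keyspace in small batches instead of in one blocking call. redis-cli exposes it through its --scan and --pattern options, which accept the same glob-style patterns. The following is a minimal sketch under that assumption; the function names and the REDIS_CLI variable are hypothetical.

```shell
# Assumption: REDIS_CLI names the client to run; add connection flags if needed.
REDIS_CLI="${REDIS_CLI:-redis-cli}"

# scan_keys PATTERN: list matching key names using incremental SCAN calls
# instead of a single blocking KEYS call.
scan_keys() {
  $REDIS_CLI --scan --pattern "$1"
}

# count_keys PATTERN: print how many distinct keys match. SCAN may return a
# key more than once, so duplicates are removed before counting.
count_keys() {
  scan_keys "$1" | sort -u | awk 'END { print NR }'
}
```

      For example, scan_keys 's[ai]mmy' would list sammy and simmy if both exist. Quote the pattern so that your shell does not expand it before redis-cli sees it.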

      Conclusion

      This guide details a number of commands that are helpful for troubleshooting and resolving issues one might encounter as they work with Redis. If there are other related commands, arguments, or procedures you’d like to see outlined in this guide, please ask or make suggestions in the comments below.

      For more information on Redis commands, see our tutorial series on How to Manage a Redis Database.




      Troubleshooting Basic Connection Issues


      Updated by Linode Written by Linode

      This guide presents troubleshooting strategies for Linodes that are unresponsive to any network access. A Linode may become unresponsive after a distribution upgrade or other broad software update, because those changes can cause unexpected problems for your core system components.

      Similarly, your server may be unresponsive after maintenance was applied by Linode to your server’s host (frequently, this is correlated with software/distribution upgrades performed on your deployment prior to the host’s maintenance). This guide is designed as a useful resource for either of these scenarios.

      If you can ping your Linode, but you cannot access SSH or other services, this guide will not assist with troubleshooting those services. Instead, refer to the Troubleshooting SSH or Troubleshooting Web Servers, Databases, and Other Services guides.

      Where to go for help outside this guide

      This guide explains how to use different troubleshooting commands on your Linode. These commands can produce diagnostic information and logs that may expose the root of your connection issues. For some specific examples of diagnostic information, this guide also explains the corresponding cause of the issue and presents solutions for it.

      If the information and logs you gather do not match a solution outlined here, consider searching the Linode Community Site for posts that match your system’s symptoms. Or, post a new question in the Community Site and include your commands’ output.

      Linode is not responsible for the configuration or installation of software on your Linode. Refer to Linode’s Scope of Support for a description of which issues Linode Support can help with.

      Before You Begin

      There are a few core troubleshooting tools you should familiarize yourself with that are used when diagnosing connection problems.

      The Linode Shell (Lish)

      Lish is a shell that provides access to your Linode’s serial console. Lish does not establish a network connection to your Linode, so you can use it when your networking is down or SSH is inaccessible. Much of your troubleshooting for basic connection issues will be performed from the Lish console.

      To learn about Lish in more detail, and for instructions on how to connect to your Linode via Lish, review the Using the Linode Shell (Lish) guide. In particular, using your web browser is a fast and simple way to access Lish.

      MTR

      When your network traffic travels from your computer to your Linode, it passes through a series of routers that are administered by your internet service provider, by Linode’s transit providers, and by the various organizations that form the Internet’s backbone. You can analyze the route that your traffic takes for possible service interruptions using a tool called MTR.

      MTR is similar to the traceroute tool, in that it will trace and display your traffic’s route. MTR also runs several iterations of its tracing algorithm, which means that it can report statistics like average packet loss and latency over the period that the MTR test runs.

      Review the installation instructions in Linode’s Diagnosing Network Issues with MTR guide and install MTR on your computer.

      Is your Linode Running?

      Log in to the Linode Manager and inspect the Linode’s dashboard. If the Linode is powered off, turn it on.

      Inspect the Lish Console

      If the Linode is listed as running in the Manager, or after you boot it from the Manager, open the Lish console and look for a login prompt. If a login prompt exists, try logging in with your root user credentials (or any other Linux user credentials that you previously created on the server).

      Note

      The root user is available in Lish even if root user login is disabled in your SSH configuration.

      1. If you can log in at the Lish console, move on to the diagnose network connection issues section of this guide.

        If you see a login prompt, but you have forgotten the credentials for your Linode, follow the instructions for resetting your root password and then attempt to log in at the Lish console again.

      2. If you do not see a login prompt, your Linode may have issues with booting.

      Troubleshoot Booting Issues

      If your Linode isn’t booting normally, you will not be able to rely on the Lish console to troubleshoot your deployment directly. To continue, you will first need to reboot your Linode into Rescue Mode, which is a special recovery environment that Linode provides.

      When you boot into Rescue Mode, you are booting your Linode into the Finnix recovery Linux distribution. This Finnix image includes a working network configuration, and you will be able to mount your Linode’s disks from this environment, which means that you will be able to access your files.

      1. Review the Rescue and Rebuild guide for instructions and boot into Rescue Mode. If your Linode does not reboot into Rescue Mode successfully, please contact Linode Support.

      2. Connect to Rescue Mode via the Lish console as you would normally. You will not be required to enter a username or password to start using the Lish console while in Rescue Mode.

      Perform a File System Check

      If your Linode can’t boot, then it may have experienced filesystem corruption.

      1. Review the Rescue and Rebuild guide for instructions on running a filesystem check.

        Caution

        Never run a filesystem check on a disk that is mounted.

      2. If your filesystem check reports errors that cannot be fixed, you may need to rebuild your Linode.

      3. If the filesystem check reports errors that it has fixed, try rebooting your Linode under your normal configuration profile. After you reboot, you may find that your connection issues are resolved. If you still cannot connect as normal, restart the troubleshooting process from the beginning of this guide.

      4. If the filesystem check does not report any errors, there may be another reason for your booting issues. Continue to inspecting your system and kernel logs.

      Inspect System and Kernel Logs

      In addition to being able to mount your Linode’s disks, you can also change root (sometimes abbreviated as chroot) within Rescue Mode. Chrooting will make Rescue Mode’s working environment emulate your normal Linux distribution. This means your files and logs will appear where you normally expect them, and you will be able to work with tools like your standard package manager and other system utilities.

      To proceed, review the Rescue and Rebuild guide’s instructions on changing root. Once you have chrooted, you can then investigate your Linode’s logs for messages that may describe the cause of your booting issues.

      In systemd Linux distributions (like Debian 8+, Ubuntu 16.04+, CentOS 7+, and recent releases of Arch), you can run the journalctl command to view system and kernel logs. In these and other distributions, you may also find system log messages in the following files:

      • /var/log/messages

      • /var/log/syslog

      • /var/log/kern.log

      • /var/log/dmesg

      You can use the less command to review the contents of these files (e.g. less /var/log/syslog). Try pasting your log messages into a search engine or searching in the Linode Community Site to see if anyone else has run into similar issues. If you don’t find any results, you can try asking about your issues in a new post on the Linode Community Site. If it becomes difficult to find a solution, you may need to rebuild your Linode.
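
      Paging through every log by hand can be slow, so a first pass with grep may help narrow things down. The function below is only a sketch: recent_errors is a hypothetical helper, the keyword list is an assumption about what is worth looking at, and it simply skips any of the four files above that your distribution does not write.

```shell
# recent_errors [N] [FILE...]: for each readable log file, print the last N
# lines (default 20) that mention an error, failure, or panic.
recent_errors() {
  n="${1:-20}"
  if [ "$#" -gt 0 ]; then shift; fi
  if [ "$#" -eq 0 ]; then
    set -- /var/log/messages /var/log/syslog /var/log/kern.log /var/log/dmesg
  fi
  for f in "$@"; do
    [ -r "$f" ] || continue        # not every distribution writes every file
    printf '== %s ==\n' "$f"
    grep -iE 'error|fail|panic' "$f" | tail -n "$n"
  done
}
```

      Running recent_errors 10 after you have chrooted would print up to ten matching lines per log file, which you can then paste into a search engine as described above.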

      Quick Tip for Ubuntu and Debian Systems

      After you have chrooted inside Rescue Mode, the following command may help with issues related to your package manager’s configuration:

      dpkg --configure -a
      

      After running this command, try rebooting your Linode into your normal configuration profile. If your issues persist, you may need to investigate and research your system logs further, or consider rebuilding your Linode.

      Diagnose Network Connection Issues

      If you can boot your Linode normally and access the Lish console, you can continue investigating network issues. Networking issues may have two causes:

      • There may be a network routing problem between your computer and your Linode.

      • If traffic is routed properly, your Linode’s network configuration may be at fault.

      Check for Network Route Problems

      To diagnose routing problems, run and analyze an MTR report from your computer to your Linode. For instructions on how to use MTR, review Linode’s MTR guide. It is useful to run your MTR report for 100 cycles in order to get a good sample size (note that running a report with this many cycles will take more time to complete). This recommended command includes other helpful options:

      mtr -rwbzc 100 -i 0.2 <Linode's IP address>
      

      Once you have generated this report, compare it with the following example scenarios.

      Note

      If you are located in China, and the output of your MTR report shows high packet loss or an improperly configured router, then your IP address may have been blacklisted by the GFW (Great Firewall of China). Linode is not able to change your IP address if it has been blacklisted by the GFW. If you have this issue, review this community post for troubleshooting help.
      • High Packet Loss

        root@localhost:~# mtr --report www.google.com
        HOST: localhost                   Loss%   Snt   Last   Avg  Best  Wrst StDev
        1. 63.247.74.43                   0.0%    10    0.3   0.6   0.3   1.2   0.3
        2. 63.247.64.157                  0.0%    10    0.4   1.0   0.4   6.1   1.8
        3. 209.51.130.213                60.0%    10    0.8   2.7   0.8  19.0   5.7
        4. aix.pr1.atl.google.com        60.0%    10    6.7   6.8   6.7   6.9   0.1
        5. 72.14.233.56                  50.0%    10    7.2   8.3   7.1  16.4   2.9
        6. 209.85.254.247                40.0%    10   39.1  39.4  39.1  39.7   0.2
        7. 64.233.174.46                 40.0%    10   39.6  40.4  39.4  46.9   2.3
        8. gw-in-f147.1e100.net          40.0%    10   39.6  40.5  39.5  46.7   2.2
        

        This example report shows high persistent packet loss starting mid-way through the route at hop 3, which indicates an issue with the router at hop 3. If your report looks like this, open a support ticket with your MTR results for further troubleshooting assistance.

        Note

        If your route only shows packet loss at certain routers, and not through to the end of the route, then it is likely that those routers are purposefully limiting ICMP responses. This is generally not a problem for your connection. Linode’s MTR guide provides more context for packet loss issues.


      • Improperly Configured Router

        root@localhost:~# mtr --report www.google.com
        HOST: localhost                   Loss%   Snt   Last   Avg  Best  Wrst StDev
        1. 63.247.74.43                  0.0%    10    0.3   0.6   0.3   1.2   0.3
        2. 63.247.64.157                 0.0%    10    0.4   1.0   0.4   6.1   1.8
        3. 209.51.130.213                0.0%    10    0.8   2.7   0.8  19.0   5.7
        4. aix.pr1.atl.google.com        0.0%    10    6.7   6.8   6.7   6.9   0.1
        5. ???                           0.0%    10    0.0   0.0   0.0   0.0   0.0
        6. ???                           0.0%    10    0.0   0.0   0.0   0.0   0.0
        7. ???                           0.0%    10    0.0   0.0   0.0   0.0   0.0
        8. ???                           0.0%    10    0.0   0.0   0.0   0.0   0.0
        9. ???                           0.0%    10    0.0   0.0   0.0   0.0   0.0
        10. ???                           0.0%    10    0.0   0.0   0.0   0.0   0.0
        

        If your report shows question marks instead of the hostnames (or IP addresses) of the routers, and if these question marks persist to the end of the route, then the report indicates an improperly configured router. If your report looks like this, open a support ticket with your MTR results for further troubleshooting assistance.

        Note

        If your route only shows question marks for certain routers, and not through to the end of the route, then it is likely that those routers are purposefully blocking ICMP responses. This is generally not a problem for your connection. Linode’s MTR guide provides more information about router configuration issues.
      • Destination Host Networking Improperly Configured

        root@localhost:~# mtr --report www.google.com
        HOST: localhost                   Loss%   Snt   Last   Avg  Best  Wrst StDev
        1. 63.247.74.43                  0.0%    10    0.3   0.6   0.3   1.2   0.3
        2. 63.247.64.157                 0.0%    10    0.4   1.0   0.4   6.1   1.8
        3. 209.51.130.213                0.0%    10    0.8   2.7   0.8  19.0   5.7
        4. aix.pr1.atl.google.com        0.0%    10    6.7   6.8   6.7   6.9   0.1
        5. 72.14.233.56                  0.0%    10    7.2   8.3   7.1  16.4   2.9
        6. 209.85.254.247                0.0%    10   39.1  39.4  39.1  39.7   0.2
        7. 64.233.174.46                 0.0%    10   39.6  40.4  39.4  46.9   2.3
        8. gw-in-f147.1e100.net         100.0    10    0.0   0.0   0.0   0.0   0.0
        

        If your report shows no packet loss or low packet loss (or non-persistent packet loss isolated to certain routers) until the end of the route, and 100% loss at your Linode, then the report indicates that your Linode’s network interface is not configured correctly. If your report looks like this, move down to confirming network configuration issues from Rescue Mode.

      Note

      If your report does not look like any of the previous examples, read through the MTR guide for other potential scenarios.
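
      The distinction drawn in the scenarios above, between loss that persists to the final hop and loss that appears only at intermediate routers, can be checked mechanically. The following is a rough sketch, not a replacement for reading the report: lossy_hops is a hypothetical helper, the default threshold of 10 percent is an arbitrary assumption, and it relies on Loss% being the seventh column from the right, as it is in the example reports.

```shell
# lossy_hops [MIN]: read an mtr report on stdin and print every hop whose
# Loss% is at least MIN (default 10), then say whether the final hop is lossy.
lossy_hops() {
  awk -v min="${1:-10}" '
    $1 ~ /^[0-9]+\./ && NF >= 9 {
      loss = $(NF - 6)             # Loss% is the 7th column from the right
      sub(/%/, "", loss)
      loss += 0
      if (loss >= min) printf "hop %d %s %.1f%%\n", $1, $2, loss
      last = loss
      seen = 1
    }
    END {
      if (!seen) exit 1            # no hop lines were found in the input
      if (last >= min) print "final hop is lossy: the loss may be real"
      else print "final hop is clean: earlier loss is likely ICMP rate limiting"
    }'
}
```

      For example, piping a saved report through lossy_hops 30 lists only the hops with at least 30 percent loss. Treat the final line as a hint for which scenario to read, not as a diagnosis.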

      Confirm Network Configuration Issues from Rescue Mode

      If your MTR indicates a configuration issue within your Linode, you can confirm the problem by using Rescue Mode:

      1. Reboot your Linode into Rescue Mode.

      2. Run another MTR report from your computer to your Linode’s IP address.

      3. As noted earlier, Rescue Mode boots with a working network configuration. If your new MTR report does not show the same packet loss that it did before, this result confirms that your deployment’s network configuration needs to be fixed. Continue to troubleshooting network configuration issues.

      4. If your new MTR report still shows the same packet loss at your Linode, this result indicates issues outside of your configuration. Open a support ticket with your MTR results for further troubleshooting assistance.

      Open a Support Ticket with your MTR Results

      Before opening a support ticket, you should also generate a reverse MTR report. A reverse MTR is run from your Linode and targets the public IP address of your local network, whether you’re on your home LAN, for example, or public WiFi. To run an MTR from your Linode, log in to your Lish console. To find your network’s public IP address, visit a website like https://www.whatismyip.com/.

      Once you have generated your original MTR and your reverse MTR, open a Linode support ticket, and include your reports and a description of the troubleshooting you’ve performed so far. Linode Support will try to help further diagnose the routing issue.

      Troubleshoot Network Configuration Issues

      If you have determined that your network configuration is the cause of the problem, review the following troubleshooting suggestions. If you make any changes in an attempt to fix the issue, you can test those changes with these steps:

      1. Run another MTR report (or ping the Linode) from your computer to your Linode’s IP.

      2. If the report shows no packet loss but you still can’t access SSH or other services, this result indicates that your network connection is up again, but the other services are still down. Move on to troubleshooting SSH or troubleshooting other services.

      3. If the report still shows the same packet loss, review the remaining troubleshooting suggestions in this section.

      If the recommendations in this section do not resolve your issue, try pasting your diagnostic commands’ output into a search engine or searching for your output in the Linode Community Site to see if anyone else has run into similar issues. If you don’t find any results, you can try asking about your issues in a new post on the Linode Community Site. If it becomes difficult to find a solution, you may need to rebuild your Linode.

      Try Enabling Network Helper

      A quick fix may be to enable Linode’s Network Helper tool. Network Helper will attempt to generate the appropriate static networking configuration for your Linux distribution. After you enable Network Helper, reboot your Linode for the changes to take effect. If Network Helper was already enabled, continue to the remaining troubleshooting suggestions in this section.

      Did You Upgrade to Ubuntu 18.04+ From an Earlier Version?

      If you performed an inline upgrade from an earlier version of Ubuntu to Ubuntu 18.04+, you may need to enable the systemd-networkd service:

      sudo systemctl enable systemd-networkd
      

      Afterwards, reboot your Linode.

      Run Diagnostic Commands

      To learn more about your network configuration, gather the output of the diagnostic commands appropriate for your distribution:

      Network diagnostic commands

      • Debian 7, Ubuntu 14.04

        sudo service network status
        cat /etc/network/interfaces
        ip a
        ip r
        sudo ifdown eth0 && sudo ifup eth0
        
      • Debian 8 and 9, Ubuntu 16.04

        sudo systemctl status networking.service -l
        sudo journalctl -u networking --no-pager | tail -20
        cat /etc/network/interfaces
        ip a
        ip r
        sudo ifdown eth0 && sudo ifup eth0
        
      • Ubuntu 18.04

        sudo networkctl status
        sudo systemctl status systemd-networkd -l
        sudo journalctl -u systemd-networkd --no-pager | tail -20
        cat /etc/systemd/network/05-eth0.network
        ip a
        ip r
        sudo netplan apply
        
      • Arch, CoreOS

        sudo systemctl status systemd-networkd -l
        sudo journalctl -u systemd-networkd --no-pager | tail -20
        cat /etc/systemd/network/05-eth0.network
        ip a
        ip r
        
      • CentOS 6

        sudo service network status
        cat /etc/sysconfig/network-scripts/ifcfg-eth0
        ip a
        ip r
        sudo ifdown eth0 && sudo ifup eth0
        
      • CentOS 7, Fedora

        sudo systemctl status NetworkManager -l
        sudo journalctl -u NetworkManager --no-pager | tail -20
        sudo nmcli
        cat /etc/sysconfig/network-scripts/ifcfg-eth0
        ip a
        ip r
        sudo ifdown eth0 && sudo ifup eth0
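
      If you plan to post this output to the Community Site or attach it to a support ticket, it can be convenient to capture everything in one file. The helper below is a sketch only: collect_diag is a hypothetical name, and which commands you pass to it depends on your distribution as listed above.

```shell
# collect_diag OUTFILE CMD...: append each command and its output to OUTFILE.
# A failing or missing command is recorded rather than aborting the run.
collect_diag() {
  out="$1"
  shift
  for cmd in "$@"; do
    {
      printf '### %s\n' "$cmd"
      sh -c "$cmd" 2>&1 || printf '(exit status %s)\n' "$?"
      printf '\n'
    } >> "$out"
  done
}
```

      For example, collect_diag ~/net-diag.txt 'ip a' 'ip r' 'cat /etc/network/interfaces' writes the three outputs under labeled headings. Leave commands that change state, such as the ifdown and ifup pair, out of the list and run them by hand.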
        

      Inspect Error Messages

      Your commands’ output may show error messages, including generic errors like Failed to start Raise network interfaces. There may also be more specific errors that appear. Two common errors that can appear are related to Sendmail and iptables:

      Sendmail

      If you find a message similar to the following, it is likely that a broken Sendmail update is at fault:

        
      /etc/network/if-up.d/sendmail: 44: .: Can't open /usr/share/sendmail/dynamic
      run-parts: /etc/network/if-up.d/sendmail exited with return code 2
      
      

      The Sendmail issue can usually be resolved by running the following commands and then restarting your Linode:

      sudo mv /etc/network/if-up.d/sendmail ~
      sudo ifdown -a && sudo ifup -a
      

      Note

      Read more about the Sendmail bug here.

      iptables

      Malformed rules in your iptables ruleset can sometimes cause issues for your network scripts. An error similar to the following can appear in your logs if this is the case:

        
      Apr 06 01:03:17 xlauncher ifup[6359]: run-parts: failed to exec /etc/network/if-
      Apr 06 01:03:17 xlauncher ifup[6359]: run-parts: /etc/network/if-up.d/iptables e
      
      

      Run the following command and restart your Linode to resolve this issue:

      sudo mv /etc/network/if-up.d/iptables ~
      

      Please note that your firewall will be down at this point, so you will need to re-enable it manually. Review the Control Network Traffic with iptables guide for help with managing iptables.

      Was your Interface Renamed?

      In your commands’ output, you might notice that your eth0 interface is missing and replaced with another name (for example, enp0s3 or ens3). This behavior can be caused by systemd’s Predictable Network Interface Names feature.

      1. Disable the use of Predictable Network Interface Names with these commands:

        ln -s /dev/null /etc/systemd/network/99-default.link
        ln -s /dev/null /etc/udev/rules.d/80-net-setup-link.rules
        
      2. Reboot your Linode for the changes to take effect.
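
      To check quickly whether an interface named eth0 exists, before or after making this change, you can list only the interface names. This is a small sketch, assuming the one-line-per-interface output of ip -o link show; iface_names is a hypothetical helper.

```shell
# iface_names: read `ip -o link show` output on stdin and print one interface
# name per line, without the "@parent" suffix some virtual interfaces carry.
iface_names() {
  awk -F': ' '{ sub(/@.*/, "", $2); print $2 }'
}
```

      For example, ip -o link show | iface_names | grep -qx eth0 || echo "eth0 is missing" prints a message only when no interface has that exact name.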

      Review Firewall Rules

      If your interface is up but your networking is still down, your firewall (which is likely implemented by the iptables software) may be blocking all connections, including basic ping requests. To review your current firewall ruleset, run:

      sudo iptables -L # displays IPv4 rules
      sudo ip6tables -L # displays IPv6 rules
      

      Note

      Your deployment may be running FirewallD or UFW, which are frontend software packages used to more easily manage your iptables rules. Run these commands to find out if you are running either package:

      sudo ufw status
      sudo firewall-cmd --state
      

      Review How to Configure a Firewall with UFW and Introduction to FirewallD on CentOS to learn how to manage and inspect your firewall rules with those packages.

      Firewall rulesets can vary widely. Review our Control Network Traffic with iptables guide to analyze your rules and determine if they are blocking connections.
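
      One thing worth checking early is whether any built-in chain has a default policy of DROP, since that policy blocks all traffic that no rule explicitly accepts. The sketch below assumes the text format written by iptables-save, in which a table starts with a line like *filter and each chain policy appears as a line like :INPUT DROP [0:0]; drop_policies is a hypothetical helper.

```shell
# drop_policies: read iptables-save output on stdin and print "<table> <chain>"
# for every chain whose default policy is DROP.
drop_policies() {
  awk '
    /^\*/ { table = substr($0, 2) }                        # e.g. *filter
    /^:/ && $2 == "DROP" { print table, substr($1, 2) }    # e.g. :INPUT DROP
  '
}
```

      For example, sudo iptables-save | drop_policies might print filter INPUT. A DROP policy is common and not a problem by itself; it only tells you that the chain’s ACCEPT rules deserve a closer look.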

      Disable Firewall Rules

      In addition to analyzing your firewall ruleset, you can also temporarily disable your firewall to test if it is interfering with your connections. Leaving your firewall disabled increases your security risk, so we recommend re-enabling it afterwards with a modified ruleset that will accept your connections. Review Control Network Traffic with iptables for help with this subject.

      1. Create a temporary backup of your current iptables ruleset:

        sudo iptables-save > ~/iptables.txt
        
      2. Set the INPUT, FORWARD and OUTPUT packet policies as ACCEPT:

        sudo iptables -P INPUT ACCEPT
        sudo iptables -P FORWARD ACCEPT
        sudo iptables -P OUTPUT ACCEPT
        
      3. Flush the nat table that is consulted when a packet that creates a new connection is encountered:

        sudo iptables -t nat -F
        
      4. Flush the mangle table that is used for specialized packet alteration:

        sudo iptables -t mangle -F
        
      5. Flush all the chains in the default filter table:

        sudo iptables -F
        
      6. Delete every non-built-in chain in the default filter table:

        sudo iptables -X
        
      7. Repeat these steps with the ip6tables command to flush your IPv6 rules. Be sure to assign a different name to the IPv6 rules file (e.g. ~/ip6tables.txt).

      Next Steps

      If you are able to restore basic networking, but you still can’t access SSH or other services, refer to the Troubleshooting SSH or Troubleshooting Web Servers, Databases, and Other Services guides.

      If your connection issues were the result of maintenance performed by Linode, review the Reboot Survival Guide for methods to prepare a Linode for any future maintenance.


      This guide is published under a CC BY-ND 4.0 license.




      How To Troubleshoot Issues in MySQL


      Introduction

      MySQL is an open-source relational database management system (RDBMS), the most popular of its kind in the world. As is the case when working with any software, both newcomers and experienced users can run into confusing error messages or difficult-to-diagnose problems.

      This guide will serve as a troubleshooting resource and starting point as you diagnose your MySQL setup. We’ll go over some of the issues that many MySQL users encounter and provide guidance for troubleshooting specific problems. We will also include links to DigitalOcean tutorials and the official MySQL documentation that may be useful in certain cases.

      Please note that this guide assumes the setup described in How To Install MySQL on Ubuntu 18.04, and the linked tutorials throughout the guide reflect this configuration. If your server is running another distribution, however, you can find a guide specific to that distro in the Tutorial Version Menu at the top of the linked tutorials when one is available.

      How To Get Started with MySQL

      The place where many first-time users of MySQL run into a problem is during the installation and configuration process. Our guide on How To Install MySQL on Ubuntu 18.04 provides instructions on how to set up a basic configuration and may be helpful to those new to MySQL.

      Another reason some users run into issues is that their application requires database features that are only present in the latest releases, but the version of MySQL available in the default repositories of some Linux distributions — including Ubuntu — isn’t the latest version. For this reason, the MySQL developers maintain their own software repository, which you can use to install the latest version and keep it up to date. Our tutorial “How To Install the Latest MySQL on Ubuntu 18.04” provides instructions on how to do this.

      How to Access MySQL Error Logs

      Oftentimes, the root cause of slowdowns, crashes, or other unexpected behavior in MySQL can be determined by analyzing its error logs. On Ubuntu systems, the default location for the MySQL error log is /var/log/mysql/error.log. In many cases, the error logs are most easily read with the less program, a command line utility that allows you to view files but not edit them:

      • sudo less /var/log/mysql/error.log

      If MySQL isn’t behaving as expected, you can obtain more information about the source of the trouble by running this command and diagnosing the error based on the log’s contents.
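
      When the log is long, filtering it down to error entries can be faster than paging through it. The following is a minimal sketch, assuming that error entries carry the literal tag [ERROR], as they do in MySQL's standard log format. The function name recent_errors is illustrative and not part of MySQL:

```shell
# Print the most recent [ERROR] lines from a MySQL error log.
# Usage: recent_errors [LOG_FILE] [COUNT]
# LOG_FILE defaults to the Ubuntu location named in this guide.
recent_errors() {
    log_file="${1:-/var/log/mysql/error.log}"
    count="${2:-20}"
    # -F treats the pattern as a fixed string, so the brackets are literal.
    grep -F '[ERROR]' "$log_file" | tail -n "$count"
}
```

      For example, recent_errors /var/log/mysql/error.log 5 prints the last five error entries. Reading the real log may require root privileges.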

      Resetting the root MySQL User’s Password

      If you’ve set a password for your MySQL installation’s root user but have since forgotten it, you could be locked out of your databases. As long as you have access to the server on which your database is hosted, though, you should be able to reset it.

      This process differs from resetting the password for a standard Linux username. Check out our guide on How To Reset Your MySQL or MariaDB Root Password to walk through and understand this process.

      Troubles with Queries

      Sometimes users run into problems once they begin issuing queries on their data. In some database systems, including MySQL, query statements must end in a semicolon (;) for the query to complete.

      If you fail to include a semicolon at the end of your query, the prompt will continue on a new line until you complete the query by entering a semicolon and pressing ENTER.

      Some users may find that their queries are exceedingly slow. One way to find which query statement is the cause of a slowdown is to enable and view MySQL's slow query log. To do this, open your mysqld.cnf file, which is used to configure options for the MySQL server. This file is typically stored within the /etc/mysql/mysql.conf.d/ directory:

      • sudo nano /etc/mysql/mysql.conf.d/mysqld.cnf

      Scroll through the file until you see the following lines:

      /etc/mysql/mysql.conf.d/mysqld.cnf

      . . .
      #slow_query_log         = 1
      #slow_query_log_file    = /var/log/mysql/mysql-slow.log
      #long_query_time = 2
      #log-queries-not-using-indexes
      . . .
      

      These commented-out directives provide MySQL's default configuration options for the slow query log. Specifically, here's what each of them does:

      • slow_query_log: Setting this to 1 enables the slow query log.
      • slow_query_log_file: This defines the file where MySQL will log any slow queries. In this case, it points to the /var/log/mysql/mysql-slow.log file.
      • long_query_time: Setting this directive to 2 configures MySQL to log any queries that take longer than 2 seconds to complete.
      • log-queries-not-using-indexes: This tells MySQL to also log any queries that run without indexes to the /var/log/mysql/mysql-slow.log file. This setting isn't required for the slow query log to function, but it can be helpful for spotting inefficient queries.

      Uncomment each of these lines by removing the leading pound signs (#). The section will now look like this:

      /etc/mysql/mysql.conf.d/mysqld.cnf

      . . .
      slow_query_log         = 1
      slow_query_log_file    = /var/log/mysql/mysql-slow.log
      long_query_time = 2
      log-queries-not-using-indexes
      . . .
      

      Note: If you're running MySQL 8+, these commented lines will not be in the mysqld.cnf file by default. In this case, add the following lines to the bottom of the file:

      /etc/mysql/mysql.conf.d/mysqld.cnf

      . . .
      slow_query_log = 1
      slow_query_log_file = /var/log/mysql/mysql-slow.log
      long_query_time = 2
      log_queries_not_using_indexes
      

      After enabling the slow query log, save and close the file. Then restart the MySQL service:

      • sudo systemctl restart mysql
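
      If the slow query log stays empty after the restart, it can help to confirm that the directive really is uncommented in the file. The following is a rough sketch, not an official MySQL tool. The function name slow_log_enabled is hypothetical, it inspects only the one file it is given (not any other included option files), and it recognizes only an explicit value of 1 or ON:

```shell
# Report whether an option file explicitly enables the slow query log.
# Usage: slow_log_enabled [CONFIG_FILE]
slow_log_enabled() {
    config_file="${1:-/etc/mysql/mysql.conf.d/mysqld.cnf}"
    # Match an uncommented "slow_query_log = 1" (or ON). MySQL accepts
    # dashes and underscores interchangeably in option names, and the
    # pattern does not match the longer slow_query_log_file directive.
    if grep -Eiq '^[[:space:]]*slow[-_]query[-_]log[[:space:]]*=[[:space:]]*(1|on)[[:space:]]*$' "$config_file"; then
        echo "enabled"
    else
        echo "disabled"
    fi
}
```

      Running slow_log_enabled with no arguments checks the mysqld.cnf path used throughout this guide.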

      With these settings in place, you can find problematic query statements by viewing the slow query log. You can do so with less, like this:

      • sudo less /var/log/mysql/mysql-slow.log
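
      In a busy log, it can be quicker to pull out the longest execution times first. Each entry in the slow query log is preceded by a comment line beginning with "# Query_time:" followed by the duration in seconds. The sketch below relies on that header. The function name slowest_query_times and the default path are assumptions for illustration:

```shell
# List the longest query times (in seconds) recorded in a slow query log.
# Usage: slowest_query_times [LOG_FILE] [COUNT]
slowest_query_times() {
    log_file="${1:-/var/log/mysql/mysql-slow.log}"
    count="${2:-5}"
    # The third whitespace-separated field of each header is the duration.
    awk '/^# Query_time:/ { print $3 }' "$log_file" \
        | LC_ALL=C sort -rn \
        | head -n "$count"
}
```

      Once you know which durations stand out, you can search for them in less to find the statements themselves. MySQL also ships the mysqldumpslow utility, which summarizes this log in more detail.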

      Once you've singled out the queries causing the slowdown, you may find our guide on How To Optimize Queries and Tables in MySQL and MariaDB on a VPS to be helpful with optimizing them.

      Additionally, MySQL includes the EXPLAIN statement, which provides information about how MySQL executes queries. This page from the official MySQL documentation provides insight on how to use EXPLAIN to highlight inefficient queries.

      For help with understanding basic query structures, see our Introduction to MySQL Queries.

      Allowing Remote Access

      Many websites and applications start off with their web server and database backend hosted on the same machine. With time, though, a setup like this can become cumbersome and difficult to scale. A common solution is to separate these functions by setting up a remote database, allowing the server and database to grow at their own pace on their own machines.

      One of the more common problems that users run into when trying to set up a remote MySQL database is that their MySQL instance is only configured to listen for local connections. This is MySQL's default setting, but it won't work for a remote database setup since MySQL must be able to listen on an external IP address where the server can be reached. To enable this, open up your mysqld.cnf file:

      • sudo nano /etc/mysql/mysql.conf.d/mysqld.cnf

      Navigate to the line that begins with the bind-address directive. It will look like this:

      /etc/mysql/mysql.conf.d/mysqld.cnf

      . . .
      lc-messages-dir = /usr/share/mysql
      skip-external-locking
      #
      # Instead of skip-networking the default is now to listen only on
      # localhost which is more compatible and is not less secure.
      bind-address            = 127.0.0.1
      . . .
      

      By default, this value is set to 127.0.0.1, meaning that the server will only look for local connections. You will need to change this directive to reference an external IP address. For the purposes of troubleshooting, you could set this directive to a wildcard IP address, either *, ::, or 0.0.0.0:

      /etc/mysql/mysql.conf.d/mysqld.cnf

      . . .
      lc-messages-dir = /usr/share/mysql
      skip-external-locking
      #
      # Instead of skip-networking the default is now to listen only on
      # localhost which is more compatible and is not less secure.
      bind-address            = 0.0.0.0
      . . .
      

      Note: If you're running MySQL 8+, the bind-address directive will not be in the mysqld.cnf file by default. In this case, add the following bind-address line to the bottom of the file, beneath the existing [mysqld] settings:

      /etc/mysql/mysql.conf.d/mysqld.cnf

      . . .
      [mysqld]
      pid-file        = /var/run/mysqld/mysqld.pid
      socket          = /var/run/mysqld/mysqld.sock
      datadir         = /var/lib/mysql
      log-error       = /var/log/mysql/error.log
      bind-address            = 0.0.0.0
      

      After changing this line, save and close the file and then restart the MySQL service:

      • sudo systemctl restart mysql

      Following this, try accessing your database remotely from another machine:

      • mysql -u user -h database_server_ip -p

      If you're able to access your database, it confirms that the bind-address directive in your configuration file was the issue. Please note, though, that setting bind-address to 0.0.0.0 is insecure as it allows connections to your server from any IP address. On the other hand, if you're still unable to access the database remotely, then something else may be causing the issue. In either case, you may find it helpful to follow our guide on How To Set Up a Remote Database to Optimize Site Performance with MySQL on Ubuntu 18.04 to set up a more secure remote database configuration.
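
      To double check which address the configuration file actually sets, you can extract the value rather than scrolling through the file. This is a minimal sketch, assuming the directive lives in the single file passed in. The function name configured_bind_address is hypothetical:

```shell
# Print the bind-address configured in an option file, or "not set".
# Usage: configured_bind_address [CONFIG_FILE]
configured_bind_address() {
    config_file="${1:-/etc/mysql/mysql.conf.d/mysqld.cnf}"
    # Ignore commented-out lines and keep the last match, since a later
    # setting overrides an earlier one in the same file.
    value="$(grep -E '^[[:space:]]*bind[-_]address[[:space:]]*=' "$config_file" \
        | tail -n 1 | cut -d= -f2 | tr -d '[:space:]')"
    echo "${value:-not set}"
}
```

      A result of 127.0.0.1 means that MySQL is still limited to local connections, while "not set" means that this file leaves the server's built-in default in place.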

      MySQL Stops Unexpectedly or Fails to Start

      The most common cause of crashes in MySQL is that it stopped or failed to start due to insufficient memory. To check this, you will need to review the MySQL error log after a crash.

      First, attempt to start the MySQL server by typing:

      • sudo systemctl start mysql

      Then review the error logs to see what's causing MySQL to crash. You can use less to review your logs, one page at a time:

      • sudo less /var/log/mysql/error.log

      Some common messages that would indicate an insufficient amount of memory are Out of memory or mmap can't allocate.
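
      Rather than scanning the log by eye, you can search it for those two messages directly. The following sketch assumes the messages appear in the log with the wording given above. The function name memory_errors is illustrative only:

```shell
# Print error-log lines that point to insufficient memory.
# Usage: memory_errors [LOG_FILE]
memory_errors() {
    log_file="${1:-/var/log/mysql/error.log}"
    # -i ignores case; -E enables alternation between the two messages.
    grep -iE "out of memory|mmap can't allocate" "$log_file"
}
```

      If this prints nothing, the crash likely has another cause, and the rest of the error log is the next place to look.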

      Potential solutions to an inadequate amount of memory are:

      • Optimizing your MySQL configuration. A great open-source tool for this is MySQLTuner. Running the MySQLTuner script will output a set of recommended adjustments to your MySQL configuration file (mysqld.cnf). Note that the longer your server has been running before using MySQLTuner, the more accurate its suggestions will be. To get a memory usage estimate of both your current settings and those proposed by MySQLTuner, use this MySQL Calculator.

      • Reducing your web application’s reliance on MySQL for page loads. This can usually be done by adding static caching to your application. Examples for this include Joomla, which has caching as a built-in feature that can be enabled, and WP Super Cache, a WordPress plugin that adds this kind of functionality.

      • Upgrading to a larger VPS. We recommend at least 1GB of RAM for any server using a MySQL database, but the size and type of your data can significantly affect memory requirements.

      Take note that even though upgrading your server is a potential solution, it's only recommended after you investigate and weigh all of your other options. An upgraded server with more resources will likewise cost more money, so you should only go through with resizing if it truly ends up being your best option. Also note that the MySQL documentation includes a number of other suggestions for diagnosing and preventing crashes.

      Corrupted Tables

      Occasionally, MySQL tables can become corrupted, meaning that an error has occurred and the data held within them is unreadable. Attempts to read from a corrupted table will usually lead to the server crashing.

      Some common causes of corrupted tables are:

      • The MySQL server stops in the middle of a write.
      • An external program modifies a table that's simultaneously being modified by the server.
      • The machine is shut down unexpectedly.
      • The computer hardware fails.
      • There's a software bug somewhere in the MySQL code.

      If you suspect that one of your tables has been corrupted, you should make a backup of your data directory before troubleshooting or attempting to fix the table. This will help to minimize the risk of data loss.

      First, stop the MySQL service:

      • sudo systemctl stop mysql

      Then copy all of your data into a new backup directory. On Ubuntu systems, the default data directory is /var/lib/mysql/:

      • sudo cp -r /var/lib/mysql /var/lib/mysql_bkp
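
      If you make such backups more than once, a small wrapper can guard against overwriting an earlier copy. This is a sketch rather than a required step, and the function name backup_data_dir is hypothetical. Note that it swaps in cp -a, which also preserves ownership, permissions, and timestamps:

```shell
# Copy a data directory to a backup location without overwriting
# an existing backup.
# Usage: backup_data_dir [SOURCE_DIR] [BACKUP_DIR]
backup_data_dir() {
    src="${1:-/var/lib/mysql}"
    dest="${2:-/var/lib/mysql_bkp}"
    if [ -e "$dest" ]; then
        echo "refusing to overwrite existing $dest" >&2
        return 1
    fi
    # -a copies recursively and keeps ownership, modes, and timestamps.
    cp -a "$src" "$dest"
}
```

      Copying the real data directory requires root privileges, so this would need to run from a root shell.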

      After making the backup, start the MySQL service again with sudo systemctl start mysql so that you can investigate whether the table is in fact corrupted. If the table uses the MyISAM storage engine, you can check whether it's corrupted by running a CHECK TABLE statement from the MySQL prompt:

      • CHECK TABLE table_name;

      A message will appear in this statement's output letting you know whether or not it's corrupted. If the MyISAM table is indeed corrupted, it can usually be repaired by issuing a REPAIR TABLE statement:

      • REPAIR TABLE table_name;

      Assuming the repair was successful, you will see a message like the following in your output:

      Output

      +--------------------------+--------+----------+----------+
      | Table                    | Op     | Msg_type | Msg_text |
      +--------------------------+--------+----------+----------+
      | database_name.table_name | repair | status   | OK       |
      +--------------------------+--------+----------+----------+

      If the table is still corrupted, though, the MySQL documentation suggests a few alternative methods for repairing corrupted tables.

      On the other hand, if the corrupted table uses the InnoDB storage engine, then the process for repairing it will be different. InnoDB is the default storage engine in MySQL as of version 5.5, and it features automated corruption checking and repair operations. InnoDB checks for corrupted pages by performing checksums on every page it reads, and if it finds a checksum discrepancy it will automatically stop the MySQL server.

      There is rarely a need to repair InnoDB tables, as InnoDB features a crash recovery mechanism that can resolve most issues when the server is restarted. However, if you do encounter a situation where you need to rebuild a corrupted InnoDB table, the MySQL documentation recommends using the "Dump and Reload" method. This involves regaining access to the corrupted table, using the mysqldump utility to create a logical backup of the table, which will retain the table structure and the data within it, and then reloading the table back into the database.

      With that in mind, try restarting the MySQL service to see if doing so will allow you access to the server:

      • sudo systemctl restart mysql

      If the server remains crashed or otherwise inaccessible, then it may be helpful to enable InnoDB's force_recovery option. You can do this by editing the mysqld.cnf file:

      • sudo nano /etc/mysql/mysql.conf.d/mysqld.cnf

      In the [mysqld] section, add the following line:

      /etc/mysql/mysql.conf.d/mysqld.cnf

      . . .
      [mysqld]
      . . .
      innodb_force_recovery=1
      

      Save and close the file, and then try restarting the MySQL service again. If you can successfully access the corrupted table, use the mysqldump utility to dump your table data to a new file. You can name this file whatever you like, but here we'll name it out.sql:

      • mysqldump database_name table_name > out.sql

      Then drop the table from the database. To avoid having to reopen the MySQL prompt, you can use the following syntax:

      • mysql -u user -p --execute="DROP TABLE database_name.table_name"

      Following this, restore the table with the dump file you just created:

      • mysql -u user -p database_name < out.sql

      Note that the InnoDB storage engine is generally more fault-tolerant than the older MyISAM engine. Tables using InnoDB can still be corrupted, but because of its auto-recovery features the risk of table corruption and crashes is decidedly lower.

      Socket Errors

      MySQL manages connections to the database server through the use of a socket file, a special kind of file that facilitates communications between different processes. The MySQL server's socket file is named mysqld.sock and on Ubuntu systems it's usually stored in the /var/run/mysqld/ directory. This file is created by the MySQL service automatically.

      Sometimes, changes to your system or your MySQL configuration can result in MySQL being unable to read the socket file, preventing you from gaining access to your databases. The most common socket error looks like this:

      Output

      ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)

      There are a few reasons why this error may occur, and a few potential ways to resolve it.

      One common cause of this error is that the MySQL service is stopped or did not start to begin with, meaning that it was unable to create the socket file in the first place. To find out if this is the reason you're seeing this error, try starting the service with systemctl:

      • sudo systemctl start mysql

      Then try accessing the MySQL prompt again. If you still receive the socket error, double check the location where your MySQL installation is looking for the socket file. This information can be found in the mysqld.cnf file:

      • sudo nano /etc/mysql/mysql.conf.d/mysqld.cnf

      Look for the socket parameter in the [mysqld] section of this file. It will look like this:

      /etc/mysql/mysql.conf.d/mysqld.cnf

      . . .
      [mysqld]
      user            = mysql
      pid-file        = /var/run/mysqld/mysqld.pid
      socket          = /var/run/mysqld/mysqld.sock
      port            = 3306
      . . .
      

      Close this file, then ensure that the mysqld.sock file exists by running an ls command on the directory where MySQL expects to find it:

      • ls -a /var/run/mysqld/

      If the socket file exists, you will see it in this command's output:

      Output

      . .. mysqld.pid mysqld.sock mysqld.sock.lock
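
      A directory listing shows that a file with the right name exists, but not whether it is really a socket. The following is a minimal sketch that uses the shell's -S test for that purpose. The function name socket_status is illustrative:

```shell
# Report whether a path exists and is a socket file.
# Usage: socket_status [SOCKET_PATH]
socket_status() {
    socket_path="${1:-/var/run/mysqld/mysqld.sock}"
    if [ -S "$socket_path" ]; then
        echo "socket present"
    elif [ -e "$socket_path" ]; then
        echo "exists but is not a socket"
    else
        echo "missing"
    fi
}
```

      If the result is "missing", either the service is not running or it could not create the file.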

      If the file does not exist, the reason may be that MySQL is trying to create it, but does not have adequate permissions to do so. You can ensure that the correct permissions are in place by changing the directory's ownership to the mysql user and group:

      • sudo chown mysql:mysql /var/run/mysqld/

      Then ensure that the mysql user has the appropriate permissions over the directory. Setting these to 755 will work in most cases:

      • sudo chmod -R 755 /var/run/mysqld/

      Finally, restart the MySQL service so it can attempt to create the socket file again:

      • sudo systemctl restart mysql
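
      To confirm that the ownership and permission changes took effect, you can print just the relevant fields of the directory listing. This is a small sketch built on ls -ld. The function name dir_owner_and_mode is hypothetical:

```shell
# Print the permission string, owner, and group of a directory.
# Usage: dir_owner_and_mode [DIRECTORY]
dir_owner_and_mode() {
    dir="${1:-/var/run/mysqld}"
    # In "ls -ld" output, field 1 is the mode, field 3 the owner,
    # and field 4 the group.
    ls -ld "$dir" | awk '{ print $1, $3, $4 }'
}
```

      For the socket directory, you would expect a mode beginning with drwxr-xr-x (the symbolic form of 755), followed by mysql as both owner and group.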

      Then try accessing the MySQL prompt once again. If you still encounter the socket error, there's likely a deeper issue with your MySQL instance, in which case you should review the error log to see if it can provide any clues.

      Conclusion

      MySQL serves as the backbone of countless data-driven applications and websites. With so many use cases, there are as many potential causes of errors. Likewise, there are also many different ways to resolve such errors. We've covered some of the most frequently encountered errors in this guide, but there are many more that could come up depending on how your own application works with MySQL.

      If you weren't able to find a solution to your particular problem, we hope that this guide will at least give you some background into MySQL troubleshooting and help you find the source of your errors. For more information, you can look at the official MySQL documentation, which covers the topics we have discussed here as well as other troubleshooting strategies.

      Additionally, if your MySQL database is hosted on a DigitalOcean Droplet, you can contact our Support team for further assistance.


