
      How to Deploy the Elastic Stack on Kubernetes


      Updated by Linode. Written by Tyler Langlois.

      What is the Elastic Stack?

      The Elastic Stack is a collection of open source projects from Elastic that help collect and visualize a wide variety of data sources. Elasticsearch can store and aggregate data such as log files, container metrics, and more. The products in the stack include Elasticsearch, Logstash, Kibana, and Beats.


      At the end of this guide, you will have an Elastic Stack deployment installed and configured that you can use to collect application logs or to monitor Kubernetes itself.

      Caution

      This guide’s example instructions will create the following billable resources on your Linode account: four (4) Linodes and three (3) Block Storage volumes. If you do not want to keep using the example cluster that you create, be sure to delete the cluster Linodes and volumes when you have finished the guide.

      If you remove the resources afterward, you will only be billed for the hour(s) that the resources were present on your account. Consult the Billing and Payments guide for detailed information about how hourly billing works and for a table of plan pricing.

      Before You Begin

      Note

      This guide uses Kubernetes services which are private by default. Local listeners are opened so that you can access the services from your local browser; however, web servers and NodeBalancers are out of scope for this guide. Because of this, you should complete the steps of this guide from your local computer or from a computer that gives you access to a web browser. If you wish to access these services from a public domain, please see our guide on Getting Started with NodeBalancers.
      1. Install the Kubernetes CLI (kubectl) on your computer, if it is not already installed.

      2. Follow the How to Deploy Kubernetes on Linode with the k8s-alpha CLI guide to set up a Kubernetes cluster. This guide uses a cluster with three worker nodes and one master node. You can use the following Linode k8s-alpha CLI command to create your cluster:

        linode-cli k8s-alpha create example-cluster --node-type g6-standard-2 --nodes 3 --master-type g6-standard-2 --region us-east --ssh-public-key ~/.ssh/id_rsa.pub
        
        • You should use this guide instead of a manual installation via a method such as kubeadm, as the k8s-alpha tool will set up support for persistent volume claims.

        • Node sizes are important when configuring Elasticsearch, and this guide assumes 4GB Linode instances.

        • This guide also assumes that your cluster has role-based access control (RBAC) enabled. This feature became available in Kubernetes 1.6. It is enabled on clusters created via the k8s-alpha Linode CLI.

      3. You should also make sure that your Kubernetes CLI is using the right cluster context. Run the get-contexts subcommand to check:

        kubectl config get-contexts
        
      4. Set up Helm in your Kubernetes cluster by following the How to Install Apps on Kubernetes with Helm guide, and stop following the steps in that guide upon reaching the Use Helm Charts to Install Apps section.

      Configure Helm

      After following the prerequisites for this guide, you should have a Kubernetes cluster with Helm installed and configured.
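
      To confirm that Helm can reach the cluster before continuing, you can optionally check that both a client and a server (Tiller) version are reported and that the Tiller pod is running. This assumes a Helm 2 installation with Tiller, as set up by the linked guide:

        helm version
        kubectl get pods --namespace kube-system -l name=tiller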

      1. Add the elastic chart repository to your local installation of Helm:

        helm repo add elastic https://helm.elastic.co
        
      2. Fetch the updated list of charts from all configured chart repositories:

        helm repo update
        
      3. Search for the official elasticsearch chart to confirm Helm has been configured correctly. Note that this chart released by Elastic differs from the elasticsearch chart available in Helm's default stable repository.

        helm search elasticsearch --version 7
        

        This command should return results similar to the following. Note that your exact version numbers may be different.

        NAME                    CHART VERSION   APP VERSION     DESCRIPTION
        elastic/elasticsearch   7.3.2           7.3.2           Official Elastic helm chart for Elasticsearch
        

      Your Helm environment is now prepared to install the official Elasticsearch charts into your Kubernetes cluster.

      Install Charts

      Install Elasticsearch

      Before installing the chart, ensure that resources are set appropriately. By default, the elasticsearch chart allocates 1G of memory to the JVM heap and sets Kubernetes resource requests and limits to 2G. Using a Linode 4GB instance is compatible with these defaults, but if you are using a different instance type, you will need to provide different values to the chart at install time in order to ensure that running pods are within the resource constraints of the node sizes you have chosen.
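
      If you are on smaller nodes, the defaults can be overridden at install time with --set flags (or a custom values file). The command below is only an illustrative sketch; the esJavaOpts and resources keys come from the chart's values.yaml, so confirm them with helm inspect values elastic/elasticsearch before adapting it, and skip it entirely if you are using the default 4GB nodes and the install command in the next step.

        helm install --name elasticsearch --wait --timeout=600 elastic/elasticsearch \
          --set esJavaOpts="-Xmx512m -Xms512m" \
          --set resources.requests.memory=1Gi \
          --set resources.limits.memory=1Gi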

      1. Install the elasticsearch chart. This command will wait to complete until all pods are started and ready:

        helm install --name elasticsearch --wait --timeout=600 elastic/elasticsearch
        
      2. A three-node Elasticsearch cluster is now configured and available locally to the Kubernetes cluster. To confirm this, first port-forward a local port to the Elasticsearch service. You should leave this command running in a terminal window or tab in the background for the remainder of this tutorial.

        kubectl port-forward svc/elasticsearch-master 9200:9200
        
      3. In another terminal window, send a request to this port:

        curl http://localhost:9200/
        
      4. You should see a response similar to the following:

        {
          "name" : "elasticsearch-master-0",
          "cluster_name" : "elasticsearch",
          "cluster_uuid" : "o66WYOm5To2znbZ0kOkDUw",
          "version" : {
            "number" : "7.3.2",
            "build_flavor" : "default",
            "build_type" : "docker",
            "build_hash" : "1c1faf1",
            "build_date" : "2019-09-06T14:40:30.409026Z",
            "build_snapshot" : false,
            "lucene_version" : "8.1.0",
            "minimum_wire_compatibility_version" : "6.8.0",
            "minimum_index_compatibility_version" : "6.0.0-beta1"
          },
          "tagline" : "You Know, for Search"
        }
        

      Note that your specific version numbers and dates may be different in this JSON response. Elasticsearch is operational, but it is not yet receiving or serving any data.

      Install Filebeat

      In order to start processing data, deploy the filebeat chart to your Kubernetes cluster. This will collect all pod logs and store them in Elasticsearch, after which they can be searched and used in visualizations within Kibana.

      1. Deploy the filebeat chart. No custom values.yaml file should be necessary:

        helm install --name filebeat --wait --timeout=600 elastic/filebeat
        
      2. Confirm that Filebeat has started to index documents into Elasticsearch by sending a request to the locally-forwarded Elasticsearch service port:

        curl http://localhost:9200/_cat/indices
        

        At least one filebeat index should be present, and output should be similar to the following:

          
        green open filebeat-7.3.2-2019.09.30-000001 peGIaeQRQq-bfeSG3s0RWA 1 1 9886 0 5.7mb 2.8mb

      Install Kibana

      Kibana will provide a frontend to Elasticsearch and the data collected by Filebeat.

      1. Deploy the kibana chart:

        helm install --name kibana --wait --timeout=600 elastic/kibana
        
      2. Port-forward the kibana-kibana service in order to access Kibana locally. Leave this command running in the background as well for the remainder of this tutorial.

        kubectl port-forward svc/kibana-kibana 5601:5601
        

      Configure Kibana

      Before visualizing pod logs, Kibana must be configured with an index pattern for Filebeat’s indices.
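
      The steps below use the Kibana UI. As an alternative sketch, Kibana 7.x also exposes a saved objects API that can create the same index pattern from the command line; the object ID at the end of the URL is arbitrary, and the request assumes the port-forward from the previous section is still running:

        curl -X POST "http://localhost:5601/api/saved_objects/index-pattern/filebeat-pattern" \
          -H "kbn-xsrf: true" -H "Content-Type: application/json" \
          -d '{"attributes": {"title": "filebeat-*", "timeFieldName": "@timestamp"}}'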

      1. With the previous port-forward command running in another terminal window, open your browser and navigate to http://localhost:5601

      2. A page similar to the following should render in your browser.

        Initial Kibana Page

      3. To begin configuring index patterns, scroll down until the Index Patterns button appears, and click it.

        Kibana Home Page Index Patterns

      4. The Index Patterns page should be displayed. Click the Create index pattern button to begin.

        Kibana Index Patterns Page

      5. From this page, enter “filebeat-*” into the Index pattern text box, then click the Next step button.

        Kibana Create Index Pattern

      6. In the following page, select @timestamp from the Time Filter field name dropdown menu, then click the Create index pattern button.

        Kibana Create Index Pattern

      7. A page with the index pattern details will then be shown. Click the Discover compass icon from the sidebar to view incoming logs.

        Kibana Select Discover

      8. The Discover page provides a realtime view of logs as they are ingested by Elasticsearch from your Kubernetes cluster. The histogram provides a view of log volume over time, which, by default, spans the last 15 minutes. The sidebar on the left side of the user interface displays various fields parsed from the JSON documents that Filebeat sends to Elasticsearch.

      9. Use the Filters box to search only for logs arriving from Kibana pods by filtering for kubernetes.container.name : "kibana". Click the Update button to apply the search filter.

        Note

        When searching in the filters box, field names and values are auto-populated.

        Kibana Filter

      10. In order to expand a log event, click the arrow next to an event in the user interface.

        Kibana Open Log Event

      11. Scroll down to view the entire log document in Kibana. Observe the fields provided by Filebeat, including the message field, which contains standard output and standard error messages from the container, as well as the Kubernetes node and pod names in fields prefixed with kubernetes.

        Kibana Log Document

      12. Look closely at the message field in the log representation and note that its text is formatted as JSON. While the terms in this field can be matched with free-text searches in Kibana, parsing the field will generally yield better results. The following section explains how to configure Filebeat and Kibana to achieve this.

      Update Stack Configuration

      At this point, the Elastic stack is functional and provides an interface to visualize and create dashboards for your logs from Kubernetes. This section will explain how to further configure the various components of the stack for greater visibility into your Kubernetes environment.

      1. Create a values file for Filebeat. This configuration will add the ability to provide autodiscover hints. Instead of changing the Filebeat configuration each time parsing differences are encountered, autodiscover hints permit fragments of Filebeat configuration to be defined at the pod level dynamically so that applications can instruct Filebeat as to how their logs should be parsed.

        filebeat-values.yml
        ---
        filebeatConfig:
          filebeat.yml: |
            filebeat.autodiscover:
              providers:
                - type: kubernetes
                  hints.enabled: true
            output.elasticsearch:
              hosts: '${ELASTICSEARCH_HOSTS:elasticsearch-master:9200}'
      2. Upgrade the filebeat deployment to use this new configuration file:

        helm upgrade --values filebeat-values.yml --wait --timeout=600 filebeat elastic/filebeat
        
      3. Once this command completes, Filebeat’s DaemonSet will have successfully updated all running pods.

      4. Next, create a Kibana values file to append annotations to the Kibana Deployment that will indicate that Filebeat should parse certain fields as JSON values. This configuration file will instruct Filebeat to parse the message field as JSON and store the parsed object underneath the kibana field.

        kibana-values.yml
        ---
        podAnnotations:
          co.elastic.logs/processors.decode_json_fields.fields: message
          co.elastic.logs/processors.decode_json_fields.target: kibana
      5. Upgrade the Kibana Helm release in your Kubernetes cluster, passing this file as an argument for the Chart values.

        helm upgrade --values kibana-values.yml --wait --timeout=600 kibana elastic/kibana
        
      6. Note that triggering a rolling pod update of Kibana will cause the previous port-forward to lose track of the running pods. Terminate the previous Kibana port-forward command in the background terminal with Ctrl-C and start the command again:

        kubectl port-forward svc/kibana-kibana 5601:5601
        
      7. Open a browser window to http://localhost:5601 and navigate to the same Index Patterns page again:

        Kibana Home Page Index Patterns

      8. From the Index Patterns page, select the filebeat-* index pattern.

        Kibana Filebeat Index Pattern

      9. From the index pattern page for filebeat-*, select the Refresh field list button.

        Kibana Refresh Fields

      10. Confirm this action by selecting the Refresh button in the pop-up dialog.

        Kibana Refresh Fields Confirm

      11. Navigate to the “Discover” page.

        Kibana Select Discover

      12. Filter for kibana containers again, scroll down, and expand a log document. Note that various fields have been parsed into the kibana field, such as kibana.req.method, which indicates the HTTP verb used for a request to Kibana.

        Kibana Parsed Fields

      Metricbeat

      In addition to collecting logs with Filebeat, Metricbeat can collect pod and node metrics in order to visualize information such as resource utilization.

      Install Metricbeat

      1. Deploy the metricbeat chart.

        helm install --name metricbeat --wait --timeout=600 elastic/metricbeat
        
      2. Confirm that Metricbeat has started to index documents into Elasticsearch by sending a request to the locally-forwarded Elasticsearch service port:

        curl http://localhost:9200/_cat/indices
        
      3. At least one metricbeat index should be present, similar to the following:

        green open metricbeat-7.3.2-2019.09.30-000001 N75uVk_hTpmVbDKZE0oeIw 1 1   455  0   1.1mb 567.9kb
        

      Load Dashboards

      Metricbeat can install default Dashboards into Kibana to provide out-of-the-box visualizations for data collected from Kubernetes.

      Before following these steps, ensure that the port-forward command to expose Kibana over port 5601 locally is still running.

      Run the following commands on your local machine. This will communicate with Kibana over 127.0.0.1:5601 to import default Dashboards that will be populated by data from Metricbeat.
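
      By default, the setup command tries to reach Elasticsearch and Kibana on localhost, which matches the port-forwards used in this guide. If your forwarded ports differ, the Beats binaries accept -E overrides on the command line; a hedged variant of the setup command used in the steps below might look like this:

        ./metricbeat setup --dashboards \
          -E setup.kibana.host=localhost:5601 \
          -E 'output.elasticsearch.hosts=["localhost:9200"]'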

      Note

      Your commands should use the same version of Metricbeat deployed to your Kubernetes cluster. You can find this version by issuing the following command:

      helm get values --all metricbeat | grep imageTag
      

      For Linux

      1. Get the Metricbeat package.

        wget https://artifacts.elastic.co/downloads/beats/metricbeat/metricbeat-7.3.2-linux-x86_64.tar.gz
        
      2. Extract the archive.

        tar xvzf metricbeat-7.3.2-linux-x86_64.tar.gz
        
      3. Navigate to the directory.

        cd metricbeat-7.3.2-linux-x86_64
        
      4. Set up the dashboards.

        ./metricbeat setup --dashboards
        

      For macOS

      1. Get the Metricbeat package.

        wget https://artifacts.elastic.co/downloads/beats/metricbeat/metricbeat-7.3.2-darwin-x86_64.tar.gz
        
      2. Extract the archive.

        tar xvzf metricbeat-7.3.2-darwin-x86_64.tar.gz
        
      3. Navigate to the directory.

        cd metricbeat-7.3.2-darwin-x86_64
        
      4. Set up the dashboards.

        ./metricbeat setup --dashboards
        

      Explore Dashboards

      1. Open a browser window to http://localhost:5601 and click the Dashboards icon on the left sidebar.

        Kibana Dashboards Link

      2. In the search box, enter “kubernetes” and press Enter. Select the [Metricbeat Kubernetes] Overview ECS dashboard.

        Kibana Dashboards

      3. The following dashboard displays several types of metrics about your Kubernetes cluster.

        Kibana Kubernetes Dashboards

      4. You can explore the various visualizations on this page in order to view metrics about pods, nodes, and the overall health of the Kubernetes cluster.

      Next Steps

      From this point onward, any additional workloads started in Kubernetes will be processed by Filebeat and Metricbeat in order to collect logs and metrics for later introspection within Kibana. As Kubernetes nodes are added or removed, the Filebeat and Metricbeat DaemonSets will automatically scale out pods to monitor nodes as they join the Kubernetes cluster.
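
      You can observe this behavior with kubectl: the DESIRED and READY counts of the Filebeat and Metricbeat DaemonSets should track the number of schedulable nodes reported by the first command.

        kubectl get nodes
        kubectl get daemonsets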


      How To Analyze Managed Redis Database Statistics Using the Elastic Stack on Ubuntu 18.04


      The author selected the Free and Open Source Fund to receive a donation as part of the Write for DOnations program.

      Introduction

      Database monitoring is the continuous process of systematically tracking various metrics that show how the database is performing. By observing performance data, you can gain valuable insights and identify possible bottlenecks, as well as find additional ways of improving database performance. Such systems often implement alerting that notifies administrators when things go wrong. Gathered statistics can be used to not only improve the configuration and workflow of the database, but also those of client applications.

      The benefit of using the Elastic Stack (ELK stack) for monitoring your managed database is its excellent support for searching and the ability to ingest new data very quickly. It does not excel at updating the data, but this trade-off is acceptable for monitoring and logging purposes, where past data is almost never changed. Elasticsearch offers a powerful means of querying the data, which you can use through Kibana to get a better understanding of how the database fares through different time periods. This will allow you to correlate database load with real-life events to gain insight into how the database is being used.

      In this tutorial, you’ll import database metrics, generated by the Redis INFO command, into Elasticsearch via Logstash. This entails configuring Logstash to periodically run the command, parse its output and send it to Elasticsearch for indexing immediately afterward. The imported data can later be analyzed and visualized in Kibana. By the end of the tutorial, you’ll have an automated system pulling in Redis statistics for later analysis.

      Prerequisites

      Step 1 — Installing and Configuring Logstash

      In this section, you will install Logstash and configure it to pull statistics from your Redis database cluster, then parse them to send to Elasticsearch for indexing.

      Start off by installing Logstash with the following command:

      • sudo apt install logstash -y

      Once Logstash is installed, enable the service to automatically start on boot:

      • sudo systemctl enable logstash

      Before configuring Logstash to pull the statistics, let’s see what the data itself looks like. To connect to your Redis database, head over to your Managed Database Control Panel, and under the Connection details panel, select Flags from the dropdown:

      Managed Database Control Panel

      You’ll be shown a preconfigured command for the Redli client, which you’ll use to connect to your database. Click Copy and run the following command on your server, replacing redis_flags_command with the command you have just copied, followed by the info argument:

      • redis_flags_command info

      Since the output from this command is long, we’ll explain it broken down into its different sections:

      In the output of the Redis info command, sections are marked with #, which signifies a comment. The values are populated in the form of key:value, which makes them relatively easy to parse.

      Output

      # Server redis_version:5.0.4 redis_git_sha1:ab60b2b1 redis_git_dirty:1 redis_build_id:7909f4de3561dc50 redis_mode:standalone os:Linux 5.2.14-200.fc30.x86_64 x86_64 arch_bits:64 multiplexing_api:epoll atomicvar_api:atomic-builtin gcc_version:9.1.1 process_id:72 run_id:ddb7b96c93bbd0c369c6d06ce1c02c78902e13cc tcp_port:25060 uptime_in_seconds:1733 uptime_in_days:0 hz:10 configured_hz:10 lru_clock:8687593 executable:/usr/bin/redis-server config_file:/etc/redis.conf # Clients connected_clients:3 client_recent_max_input_buffer:2 client_recent_max_output_buffer:0 blocked_clients:0 . . .

      The Server section contains technical information about the Redis build, such as its version and the Git commit it’s based on, while the Clients section provides the number of currently open connections.

      Output

      . . . # Memory used_memory:941560 used_memory_human:919.49K used_memory_rss:4931584 used_memory_rss_human:4.70M used_memory_peak:941560 used_memory_peak_human:919.49K used_memory_peak_perc:100.00% used_memory_overhead:912190 used_memory_startup:795880 used_memory_dataset:29370 used_memory_dataset_perc:20.16% allocator_allocated:949568 allocator_active:1269760 allocator_resident:3592192 total_system_memory:1030356992 total_system_memory_human:982.62M used_memory_lua:37888 used_memory_lua_human:37.00K used_memory_scripts:0 used_memory_scripts_human:0B number_of_cached_scripts:0 maxmemory:463470592 maxmemory_human:442.00M maxmemory_policy:allkeys-lru allocator_frag_ratio:1.34 allocator_frag_bytes:320192 allocator_rss_ratio:2.83 allocator_rss_bytes:2322432 rss_overhead_ratio:1.37 rss_overhead_bytes:1339392 mem_fragmentation_ratio:5.89 mem_fragmentation_bytes:4093872 mem_not_counted_for_evict:0 mem_replication_backlog:0 mem_clients_slaves:0 mem_clients_normal:116310 mem_aof_buffer:0 mem_allocator:jemalloc-5.1.0 active_defrag_running:0 lazyfree_pending_objects:0 . . .

      Here, the Memory section confirms how much RAM Redis has allocated for itself, as well as the maximum amount of memory it can possibly use. If it starts running out of memory, it will free up keys using the strategy you specified in the Control Panel (shown in the maxmemory_policy field in this output).

      Output

      . . . # Persistence loading:0 rdb_changes_since_last_save:0 rdb_bgsave_in_progress:0 rdb_last_save_time:1568966978 rdb_last_bgsave_status:ok rdb_last_bgsave_time_sec:0 rdb_current_bgsave_time_sec:-1 rdb_last_cow_size:217088 aof_enabled:0 aof_rewrite_in_progress:0 aof_rewrite_scheduled:0 aof_last_rewrite_time_sec:-1 aof_current_rewrite_time_sec:-1 aof_last_bgrewrite_status:ok aof_last_write_status:ok aof_last_cow_size:0 # Stats total_connections_received:213 total_commands_processed:2340 instantaneous_ops_per_sec:1 total_net_input_bytes:39205 total_net_output_bytes:776988 instantaneous_input_kbps:0.02 instantaneous_output_kbps:2.01 rejected_connections:0 sync_full:0 sync_partial_ok:0 sync_partial_err:0 expired_keys:0 expired_stale_perc:0.00 expired_time_cap_reached_count:0 evicted_keys:0 keyspace_hits:0 keyspace_misses:0 pubsub_channels:0 pubsub_patterns:0 latest_fork_usec:353 migrate_cached_sockets:0 slave_expires_tracked_keys:0 active_defrag_hits:0 active_defrag_misses:0 active_defrag_key_hits:0 active_defrag_key_misses:0 . . .

      In the Persistence section, you can see the last time Redis saved the keys it stores to disk, and if it was successful. The Stats section provides numbers related to client and in-cluster connections, the number of times the requested key was (or wasn’t) found, and so on.

      Output

      . . . # Replication role:master connected_slaves:0 master_replid:9c1d345a46d29d08537981c4fc44e312a21a160b master_replid2:0000000000000000000000000000000000000000 master_repl_offset:0 second_repl_offset:-1 repl_backlog_active:0 repl_backlog_size:46137344 repl_backlog_first_byte_offset:0 repl_backlog_histlen:0 . . .

      Note: The Redis project uses the terms “master” and “slave” in its documentation and in various commands. DigitalOcean generally prefers the alternative terms “primary” and “replica.”
      This guide will default to the terms “primary” and “replica” whenever possible, but note that there are a few instances where the terms “master” and “slave” unavoidably come up.

      By looking at the role under Replication, you’ll know if you’re connected to a primary or replica node. The rest of the section provides the number of currently connected replicas and the amount of data that the replica is missing relative to the primary. There may be additional fields if the instance you are connected to is a replica.

      Output

      . . . # CPU used_cpu_sys:1.972003 used_cpu_user:1.765318 used_cpu_sys_children:0.000000 used_cpu_user_children:0.001707 # Cluster cluster_enabled:0 # Keyspace

      Under CPU, you’ll see the amount of system (used_cpu_sys) and user (used_cpu_user) CPU time Redis has consumed. The Cluster section contains only one field, cluster_enabled, which indicates whether Redis Cluster mode is in use.
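
      Before wiring this up in Logstash, you can also pull individual metrics ad hoc by piping the same command through grep. The example below reuses the redis_flags_command placeholder for the command copied from your Control Panel:

      • redis_flags_command info | grep -E 'connected_clients|used_memory:'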

      Logstash will be tasked with periodically running the info command on your Redis database (similar to what you just did), parsing the results, and sending them to Elasticsearch. You’ll then be able to access them later from Kibana.

      You’ll store the configuration for indexing Redis statistics in Elasticsearch in a file named redis.conf under the /etc/logstash/conf.d directory, where Logstash stores configuration files. When started as a service, it will automatically run them in the background.

      Create redis.conf using your favorite editor (for example, nano):

      • sudo nano /etc/logstash/conf.d/redis.conf

      Add the following lines:

      /etc/logstash/conf.d/redis.conf

      input {
          exec {
              command => "redis_flags_command info"
              interval => 10
              type => "redis_info"
          }
      }
      
      filter {
          kv {
              value_split => ":"
              field_split => "rn"
              remove_field => [ "command", "message" ]
          }
      
          ruby {
              code =>
              "
              event.to_hash.keys.each { |k|
                  if event.get(k).to_i.to_s == event.get(k) # is integer?
                      event.set(k, event.get(k).to_i) # convert to integer
                  end
                  if event.get(k).to_f.to_s == event.get(k) # is float?
                      event.set(k, event.get(k).to_f) # convert to float
                  end
              }
              puts 'Ruby filter finished'
              "
          }
      }
      
      output {
          elasticsearch {
              hosts => "http://localhost:9200"
              index => "%{type}"
          }
      }
      

      Remember to replace redis_flags_command with the command shown in the control panel that you used earlier in the step.

      You define an input, a set of filters that will run on the collected data, and an output that will send the filtered data to Elasticsearch. The input consists of the exec command, which will run a command on the server periodically, after a set time interval (expressed in seconds). It also specifies a type parameter that defines the document type when indexed in Elasticsearch. The exec block passes down an object containing two fields: command and message. The command field will contain the command that was run, and the message field will contain its output.

      There are two filters that will run sequentially on the data collected from the input. The kv filter stands for key-value filter and is built in to Logstash. It is used for parsing data in the general form of key value_separator value, and provides parameters for specifying what are considered the value and field separators. The field separator refers to the strings that separate entries in this general form from each other. In the case of the output of the Redis INFO command, the field separator (field_split) is a new line, and the value separator (value_split) is :. Lines that do not follow the defined form will be discarded, including comments.

      To configure the kv filter, you pass : to the value_split parameter, and \r\n (signifying a new line) to the field_split parameter. You also order it to remove the command and message fields from the current data object by passing them to remove_field as elements of an array, because they contain data that is now useless.

      The kv filter represents the values it parses as the string (text) type by design. This raises an issue because Kibana can’t easily process string types, even if the value is actually a number. To solve this, you’ll use custom Ruby code to convert the number-only strings to numbers, where possible. The second filter is a ruby block that provides a code parameter accepting a string containing the code to be run.

      event is a variable that Logstash provides to your code, and contains the current data in the filter pipeline. As was noted before, filters run one after another, meaning that the Ruby filter will receive the parsed data from the kv filter. The Ruby code itself converts the event to a Hash and traverses through the keys, then checks if the value associated with the key could be represented as an integer or as a float (a number with decimals). If it can, the string value is replaced with the parsed number. When the loop finishes, it prints out a message (Ruby filter finished) to report progress.

      The output sends the processed data to Elasticsearch for indexing. The resulting document will be stored in the redis_info index, defined in the input and passed in as a parameter to the output block.

      Save and close the file.

      You’ve installed Logstash using apt and configured it to periodically request statistics from Redis, process them, and send them to your Elasticsearch instance.
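
      Before running the full pipeline in the next step, you can optionally ask Logstash to validate only the syntax of the new file; this uses Logstash's --config.test_and_exit flag against the same path as above:

      • sudo /usr/share/logstash/bin/logstash --config.test_and_exit -f /etc/logstash/conf.d/redis.conf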

      Step 2 — Testing the Logstash Configuration

      Now you’ll test the configuration by running Logstash to verify it will properly pull the data.

      Logstash supports running a specific configuration by passing its file path to the -f parameter. Run the following command to test your new configuration from the last step:

      • sudo /usr/share/logstash/bin/logstash -f /etc/logstash/conf.d/redis.conf

      It may take some time to show the output, but you’ll soon see something similar to the following:

      Output

      WARNING: Could not find logstash.yml which is typically located in $LS_HOME/config or /etc/logstash. You can specify the path using --path.settings. Continuing using the defaults Could not find log4j2 configuration at path /usr/share/logstash/config/log4j2.properties. Using default config which logs errors to the console [WARN ] 2019-09-20 11:59:53.440 [LogStash::Runner] multilocal - Ignoring the 'pipelines.yml' file because modules or command line options are specified [INFO ] 2019-09-20 11:59:53.459 [LogStash::Runner] runner - Starting Logstash {"logstash.version"=>"6.8.3"} [INFO ] 2019-09-20 12:00:02.543 [Converge PipelineAction::Create<main>] pipeline - Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>2, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50} [INFO ] 2019-09-20 12:00:03.331 [[main]-pipeline-manager] elasticsearch - Elasticsearch pool URLs updated {:changes=>{:removed=>[], :added=>[http://localhost:9200/]}} [WARN ] 2019-09-20 12:00:03.727 [[main]-pipeline-manager] elasticsearch - Restored connection to ES instance {:url=>"http://localhost:9200/"} [INFO ] 2019-09-20 12:00:04.015 [[main]-pipeline-manager] elasticsearch - ES Output version determined {:es_version=>6} [WARN ] 2019-09-20 12:00:04.020 [[main]-pipeline-manager] elasticsearch - Detected a 6.x and above cluster: the `type` event field won't be used to determine the document _type {:es_version=>6} [INFO ] 2019-09-20 12:00:04.071 [[main]-pipeline-manager] elasticsearch - New Elasticsearch output {:class=>"LogStash::Outputs::ElasticSearch", :hosts=>["http://localhost:9200"]} [INFO ] 2019-09-20 12:00:04.100 [Ruby-0-Thread-5: :1] elasticsearch - Using default mapping template [INFO ] 2019-09-20 12:00:04.146 [Ruby-0-Thread-5: :1] elasticsearch - Attempting to install template {:manage_template=>{"template"=>"logstash-*", "version"=>60001, "settings"=>{"index.refresh_interval"=>"5s"}, "mappings"=>{"_default_"=>{"dynamic_templates"=>[{"message_field"=>{"path_match"=>"message", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false}}}, {"string_fields"=>{"match"=>"*", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false, "fields"=>{"keyword"=>{"type"=>"keyword", "ignore_above"=>256}}}}}], "properties"=>{"@timestamp"=>{"type"=>"date"}, "@version"=>{"type"=>"keyword"}, "geoip"=>{"dynamic"=>true, "properties"=>{"ip"=>{"type"=>"ip"}, "location"=>{"type"=>"geo_point"}, "latitude"=>{"type"=>"half_float"}, "longitude"=>{"type"=>"half_float"}}}}}}}} [INFO ] 2019-09-20 12:00:04.295 [[main]-pipeline-manager] exec - Registering Exec Input {:type=>"redis_info", :command=>"...", :interval=>10, :schedule=>nil} [INFO ] 2019-09-20 12:00:04.315 [Converge PipelineAction::Create<main>] pipeline - Pipeline started successfully {:pipeline_id=>"main", :thread=>"#<Thread:0x73adceba run>"} [INFO ] 2019-09-20 12:00:04.483 [Ruby-0-Thread-1: /usr/share/logstash/lib/bootstrap/environment.rb:6] agent - Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]} [INFO ] 2019-09-20 12:00:05.318 [Api Webserver] agent - Successfully started Logstash API endpoint {:port=>9600} Ruby filter finished Ruby filter finished Ruby filter finished ...

      You’ll see the Ruby filter finished message being printed at regular intervals (set to 10 seconds in the previous step), which means that the statistics are being shipped to Elasticsearch.
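
      From another terminal, you can also confirm that documents are arriving in the redis_info index defined in the output block; the exact count on your system will differ:

      • curl -s 'http://localhost:9200/redis_info/_count?pretty'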

      You can exit Logstash by pressing CTRL + C on your keyboard. As previously mentioned, Logstash will automatically run all config files found under /etc/logstash/conf.d in the background when started as a service. Run the following command to start it:

      • sudo systemctl start logstash

      You’ve run Logstash to check if it can connect to your Redis cluster and gather data. Next, you’ll explore some of the statistical data in Kibana.

      Step 3 — Exploring Imported Data in Kibana

      In this section, you’ll explore and visualize the statistical data describing your database’s performance in Kibana.

      In your web browser, navigate to the domain where you exposed Kibana as part of the prerequisites. You’ll see the default welcome page:

      Kibana - Welcome Page

      Before exploring the data Logstash is sending to Elasticsearch, you’ll first need to add the redis_info index to Kibana. To do so, click on Management from the left-hand vertical sidebar, and then on Index Patterns under the Kibana section.

      Kibana - Index Pattern Creation

      You’ll see a form for creating a new Index Pattern. Index Patterns in Kibana provide a way to pull in data from multiple Elasticsearch indexes at once, and can also be used to explore a single index.

      Beneath the Index pattern text field, you’ll see the redis_info index listed. Type it in the text field and then click on the Next step button.

      You’ll then be asked to choose a timestamp field, so you’ll later be able to narrow your searches by a time range. Logstash automatically adds one, called @timestamp. Select it from the dropdown and click on Create index pattern to finish adding the index to Kibana.

      Kibana - Index Pattern Timestamp Selection

      To create and see existing visualizations, click on the Visualize item in the left-hand vertical menu. You’ll see the following page:

      Kibana - Visualizations

      To create a new visualization, click on the Create a visualization button, then select Line from the list of types that will pop up. Then, select the redis_info* index pattern you have just created as the data source. You’ll see an empty visualization:

      Kibana - Empty Visualization

      The left-side panel provides a form for editing parameters that Kibana will use to draw the visualization, which will be shown on the central part of the screen. On the upper-right hand side of the screen is the date range picker. If the @timestamp field is being used in the visualization, Kibana will only show the data belonging to the time interval specified in the range picker.

      You’ll now visualize the average Redis memory usage during a specified time interval. Click on Y-Axis under Metrics in the panel on the left to unfold it, then select Average as the Aggregation and select used_memory as the Field. This will populate the Y axis of the plot with the average values.

      Next, click on X-Axis under Buckets. For the Aggregation, choose Date Histogram. @timestamp should be automatically selected as the Field. Then, show the visualization by clicking on the blue play button on the top of the panel. If your database is brand new and unused, you won’t see a very long line. In all cases, however, you will see an accurate portrayal of average memory usage. Here is how the resulting visualization may look after little to no usage:

      Kibana - Redis Memory Usage Visualization

      In this step, you have visualized memory usage of your managed Redis database, using Kibana. You can also use other plot types Kibana offers, such as the Visual Builder, to create more complicated graphs that portray more than one field at the same time. This will allow you to gain a better understanding of how your database is being used, which will help you optimize client applications, as well as your database itself.

      Conclusion

      You now have the Elastic stack installed on your server and configured to pull statistics data from your managed Redis database on a regular basis. You can analyze and visualize the data using Kibana, or some other suitable software, which will help you gather valuable insights and real-world correlations into how your database is performing.

      For more information about what you can do with your Redis Managed Database, visit the product docs. If you’d like to present the database statistics using another visualization type, check out the Kibana docs for further instructions.




      How To Analyze Managed PostgreSQL Database Statistics Using the Elastic Stack on Ubuntu 18.04


      The author selected the Free and Open Source Fund to receive a donation as part of the Write for DOnations program.

      Introduction

      Database monitoring is the continuous process of systematically tracking various metrics that show how the database is performing. By observing the performance data, you can gain valuable insights and identify possible bottlenecks, as well as find additional ways of improving database performance. Such systems often implement alerting that notifies administrators when things go wrong. The gathered statistics can be used not only to improve the configuration and workflow of the database, but also those of the client applications.

      The benefit of using the Elastic Stack (ELK stack) for monitoring your managed database is its excellent support for searching and the ability to ingest new data very quickly. It does not excel at updating the data, but this trade-off is acceptable for monitoring and logging purposes, where past data is almost never changed. Elasticsearch offers powerful means of querying the data, which you can use through Kibana to get a better understanding of how the database fares through different time periods. This will allow you to correlate database load with real-life events to gain insight into how the database is being used.

      In this tutorial, you'll import database metrics generated by the PostgreSQL statistics collector into Elasticsearch via Logstash. This entails configuring Logstash to pull data from the database using the PostgreSQL JDBC connector and send it to Elasticsearch for indexing immediately afterward. The imported data can later be analyzed and visualized in Kibana. Then, if your database is brand new, you'll use pgbench, a PostgreSQL benchmarking tool, to create more interesting visualizations. In the end, you'll have an automated system pulling in PostgreSQL statistics for later analysis.

      Prerequisites

      Step 1 — Setting Up Logstash and the PostgreSQL JDBC Driver

      In this section, you will install Logstash and download the PostgreSQL JDBC driver so that Logstash will be able to connect to your managed database.

      Start off by installing Logstash with the following command:

      • sudo apt install logstash -y

      Once Logstash is installed, enable the service to automatically start on boot:

      • sudo systemctl enable logstash

      Logstash is written in Java, so in order to connect to PostgreSQL it requires the PostgreSQL JDBC (Java Database Connectivity) library to be available on the system it is running on. Because of an internal limitation, Logstash will properly load the library only if it is found under the /usr/share/logstash/logstash-core/lib/jars directory, where it stores the third-party libraries it uses.

      Head over to the download page of the JDBC library and copy the link to the latest version. Then, download it using curl by running the following command:

      • sudo curl https://jdbc.postgresql.org/download/postgresql-42.2.6.jar -o /usr/share/logstash/logstash-core/lib/jars/postgresql-jdbc.jar

      At the time of writing, the latest version of the library was 42.2.6, with Java 8 as the supported runtime version. Ensure you download the latest version, pairing it with the correct Java version that both JDBC and Logstash support.

      Logstash stores its configuration files under /etc/logstash/conf.d, and is itself stored under /usr/share/logstash/bin. Before you create a configuration that will collect statistics from your database, you'll need to enable the JDBC plugin in Logstash by running the following command:

      • sudo /usr/share/logstash/bin/logstash-plugin install logstash-input-jdbc

      You've installed Logstash using apt and downloaded the PostgreSQL JDBC library so that Logstash can use it to connect to your managed database. In the next step, you will configure Logstash to pull statistical data from it.

      Step 2 — Configuring Logstash To Pull Statistics

      In this section, you will configure Logstash to pull metrics from your managed PostgreSQL database.

      You'll configure Logstash to watch over three system databases in PostgreSQL, namely (a quick way to inspect them directly is sketched after this list):

      • pg_stat_database: provides statistics about each database, including its name, number of connections, transactions, rollbacks, rows returned by querying the database, deadlocks, and so on. It has a stats_reset field, which specifies when the statistics were last reset.
      • pg_stat_user_tables: provides statistics about each table created by the user, such as the number of rows inserted, deleted, and updated.
      • pg_stat_user_indexes: collects data about all indexes on user-created tables, such as the number of times a particular index has been scanned.
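
      If you want to see what these views contain before automating anything, you can query one of them directly. The command below is only a sketch; replace the connection flags with the host, port, and username values from your Control Panel:

      • psql -h host -p port -U username -d defaultdb -c 'SELECT datname, numbackends, xact_commit, xact_rollback FROM pg_stat_database;'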

      You'll store the configuration for indexing PostgreSQL statistics in Elasticsearch in a file named postgresql.conf in the /etc/logstash/conf.d directory, where Logstash stores configuration files. When started as a service, it will automatically run it in the background.

      Create postgresql.conf using your favorite editor (for example, nano):

      • sudo nano /etc/logstash/conf.d/postgresql.conf

      Add the following lines:

      /etc/logstash/conf.d/postgresql.conf

      input {
              # pg_stat_database
              jdbc {
                      jdbc_driver_library => ""
                      jdbc_driver_class => "org.postgresql.Driver"
                      jdbc_connection_string => "jdbc:postgresql://host:port/defaultdb"
                      jdbc_user => "username"
                      jdbc_password => "password"
                      statement => "SELECT * FROM pg_stat_database"
                      schedule => "* * * * *"
                      type => "pg_stat_database"
              }
      
              # pg_stat_user_tables
              jdbc {
                      jdbc_driver_library => ""
                      jdbc_driver_class => "org.postgresql.Driver"
                      jdbc_connection_string => "jdbc:postgresql://host:port/defaultdb"
                      jdbc_user => "username"
                      jdbc_password => "password"
                      statement => "SELECT * FROM pg_stat_user_tables"
                      schedule => "* * * * *"
                      type => "pg_stat_user_tables"
              }
      
              # pg_stat_user_indexes
              jdbc {
                      jdbc_driver_library => ""
                      jdbc_driver_class => "org.postgresql.Driver"
                      jdbc_connection_string => "jdbc:postgresql://host:port/defaultdb"
                      jdbc_user => "username"
                      jdbc_password => "password"
                      statement => "SELECT * FROM pg_stat_user_indexes"
                      schedule => "* * * * *"
                      type => "pg_stat_user_indexes"
              }
      }
      
      output {
              elasticsearch {
                      hosts => "http://localhost:9200"
                      index => "%{type}"
              }
      }
      

      Remember to replace host with your host address, port with the port you can use to connect to your database, username with the database user's username, and password with its password. All of these values can be found in the Control Panel of your managed database.

      In this configuration, you define three JDBC inputs and one Elasticsearch output. The three inputs pull data from the pg_stat_database, pg_stat_user_tables, and pg_stat_user_indexes databases, respectively. They all set the jdbc_driver_library parameter to an empty string, because the PostgreSQL JDBC library is in a folder that Logstash loads automatically.

      Then, they set the jdbc_driver_class, whose value is specific to the JDBC library, and provide a jdbc_connection_string, which details how to connect to the database. The jdbc: part signifies that it is a JDBC connection, while postgres:// indicates that the target database is PostgreSQL. Next come the host and port of the database and, after the forward slash, you also specify a database to connect to; this is because PostgreSQL requires you to be connected to a database in order to issue any queries. Here, it is set to the default database that always exists and can not be deleted, aptly named defaultdb.

      Next, they set the username and password of the user through which the database will be accessed. The statement parameter contains a SQL query that should return the data you wish to process; in this configuration, it selects all rows from the appropriate database.

      The schedule parameter accepts a string in cron syntax that defines when Logstash should run this input; omitting it entirely will make Logstash run it only once. Specifying * * * * *, as you have done here, tells Logstash to run it every minute. You can specify your own cron string if you wish to collect data at different intervals (for example, */5 * * * * would run the input every five minutes).

      There is only one output, which accepts data from the three inputs. They all send data to Elasticsearch, which is running locally and is reachable at http://localhost:9200. The index parameter defines the Elasticsearch index to which the data will be sent, and its value is passed in from the input's type field.

      When you are done editing, save and close the file.

      You've configured Logstash to gather data from various PostgreSQL statistics tables and send it to Elasticsearch for storage and indexing. Next, you'll run Logstash to test the configuration.

      Step 3 — Testing the Logstash Configuration

      In this section, you'll test the configuration by running Logstash to verify that it will properly pull the data. You'll then make this configuration run in the background by setting it up as a Logstash pipeline.

      Logstash supports running a specific configuration by passing its file path to the -f parameter. Run the following command to test your new configuration from the last step:

      • sudo /usr/share/logstash/bin/logstash -f /etc/logstash/conf.d/postgresql.conf

      It may take some time before it shows any output, which will look similar to this:

      Output

      Thread.exclusive is deprecated, use Thread::Mutex WARNING: Could not find logstash.yml which is typically located in $LS_HOME/config or /etc/logstash. You can specify the path using --path.settings. Continuing using the defaults Could not find log4j2 configuration at path /usr/share/logstash/config/log4j2.properties. Using default config which logs errors to the console [WARN ] 2019-08-02 18:29:15.123 [LogStash::Runner] multilocal - Ignoring the 'pipelines.yml' file because modules or command line options are specified [INFO ] 2019-08-02 18:29:15.154 [LogStash::Runner] runner - Starting Logstash {"logstash.version"=>"7.3.0"} [INFO ] 2019-08-02 18:29:18.209 [Converge PipelineAction::Create<main>] Reflections - Reflections took 77 ms to scan 1 urls, producing 19 keys and 39 values [INFO ] 2019-08-02 18:29:20.195 [[main]-pipeline-manager] elasticsearch - Elasticsearch pool URLs updated {:changes=>{:removed=>[], :added=>[http://localhost:9200/]}} [WARN ] 2019-08-02 18:29:20.667 [[main]-pipeline-manager] elasticsearch - Restored connection to ES instance {:url=>"http://localhost:9200/"} [INFO ] 2019-08-02 18:29:21.221 [[main]-pipeline-manager] elasticsearch - ES Output version determined {:es_version=>7} [WARN ] 2019-08-02 18:29:21.230 [[main]-pipeline-manager] elasticsearch - Detected a 6.x and above cluster: the `type` event field won't be used to determine the document _type {:es_version=>7} [INFO ] 2019-08-02 18:29:21.274 [[main]-pipeline-manager] elasticsearch - New Elasticsearch output {:class=>"LogStash::Outputs::ElasticSearch", :hosts=>["http://localhost:9200"]} [INFO ] 2019-08-02 18:29:21.337 [[main]-pipeline-manager] elasticsearch - Elasticsearch pool URLs updated {:changes=>{:removed=>[], :added=>[http://localhost:9200/]}} [WARN ] 2019-08-02 18:29:21.369 [[main]-pipeline-manager] elasticsearch - Restored connection to ES instance {:url=>"http://localhost:9200/"} [INFO ] 2019-08-02 18:29:21.386 [[main]-pipeline-manager] elasticsearch - ES Output version determined {:es_version=>7} [WARN ] 2019-08-02 18:29:21.386 [[main]-pipeline-manager] elasticsearch - Detected a 6.x and above cluster: the `type` event field won't be used to determine the document _type {:es_version=>7} [INFO ] 2019-08-02 18:29:21.409 [[main]-pipeline-manager] elasticsearch - New Elasticsearch output {:class=>"LogStash::Outputs::ElasticSearch", :hosts=>["http://localhost:9200"]} [INFO ] 2019-08-02 18:29:21.430 [[main]-pipeline-manager] elasticsearch - Elasticsearch pool URLs updated {:changes=>{:removed=>[], :added=>[http://localhost:9200/]}} [WARN ] 2019-08-02 18:29:21.444 [[main]-pipeline-manager] elasticsearch - Restored connection to ES instance {:url=>"http://localhost:9200/"} [INFO ] 2019-08-02 18:29:21.465 [[main]-pipeline-manager] elasticsearch - ES Output version determined {:es_version=>7} [WARN ] 2019-08-02 18:29:21.466 [[main]-pipeline-manager] elasticsearch - Detected a 6.x and above cluster: the `type` event field won't be used to determine the document _type {:es_version=>7} [INFO ] 2019-08-02 18:29:21.468 [Ruby-0-Thread-7: :1] elasticsearch - Using default mapping template [INFO ] 2019-08-02 18:29:21.538 [Ruby-0-Thread-5: :1] elasticsearch - Using default mapping template [INFO ] 2019-08-02 18:29:21.545 [[main]-pipeline-manager] elasticsearch - New Elasticsearch output {:class=>"LogStash::Outputs::ElasticSearch", :hosts=>["http://localhost:9200"]} [INFO ] 2019-08-02 18:29:21.589 [Ruby-0-Thread-9: :1] elasticsearch - Using default mapping template [INFO ] 2019-08-02 18:29:21.696 [Ruby-0-Thread-5: 
:1] elasticsearch - Attempting to install template {:manage_template=>{"index_patterns"=>"logstash-*", "version"=>60001, "settings"=>{"index.refresh_interval"=>"5s", "number_of_shards"=>1}, "mappings"=>{"dynamic_templates"=>[{"message_field"=>{"path_match"=>"message", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false}}}, {"string_fields"=>{"match"=>"*", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false, "fields"=>{"keyword"=>{"type"=>"keyword", "ignore_above"=>256}}}}}], "properties"=>{"@timestamp"=>{"type"=>"date"}, "@version"=>{"type"=>"keyword"}, "geoip"=>{"dynamic"=>true, "properties"=>{"ip"=>{"type"=>"ip"}, "location"=>{"type"=>"geo_point"}, "latitude"=>{"type"=>"half_float"}, "longitude"=>{"type"=>"half_float"}}}}}}} [INFO ] 2019-08-02 18:29:21.769 [Ruby-0-Thread-7: :1] elasticsearch - Attempting to install template {:manage_template=>{"index_patterns"=>"logstash-*", "version"=>60001, "settings"=>{"index.refresh_interval"=>"5s", "number_of_shards"=>1}, "mappings"=>{"dynamic_templates"=>[{"message_field"=>{"path_match"=>"message", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false}}}, {"string_fields"=>{"match"=>"*", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false, "fields"=>{"keyword"=>{"type"=>"keyword", "ignore_above"=>256}}}}}], "properties"=>{"@timestamp"=>{"type"=>"date"}, "@version"=>{"type"=>"keyword"}, "geoip"=>{"dynamic"=>true, "properties"=>{"ip"=>{"type"=>"ip"}, "location"=>{"type"=>"geo_point"}, "latitude"=>{"type"=>"half_float"}, "longitude"=>{"type"=>"half_float"}}}}}}} [INFO ] 2019-08-02 18:29:21.771 [Ruby-0-Thread-9: :1] elasticsearch - Attempting to install template {:manage_template=>{"index_patterns"=>"logstash-*", "version"=>60001, "settings"=>{"index.refresh_interval"=>"5s", "number_of_shards"=>1}, "mappings"=>{"dynamic_templates"=>[{"message_field"=>{"path_match"=>"message", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false}}}, {"string_fields"=>{"match"=>"*", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false, "fields"=>{"keyword"=>{"type"=>"keyword", "ignore_above"=>256}}}}}], "properties"=>{"@timestamp"=>{"type"=>"date"}, "@version"=>{"type"=>"keyword"}, "geoip"=>{"dynamic"=>true, "properties"=>{"ip"=>{"type"=>"ip"}, "location"=>{"type"=>"geo_point"}, "latitude"=>{"type"=>"half_float"}, "longitude"=>{"type"=>"half_float"}}}}}}} [WARN ] 2019-08-02 18:29:21.871 [[main]-pipeline-manager] LazyDelegatingGauge - A gauge metric of an unknown type (org.jruby.specialized.RubyArrayOneObject) has been create for key: cluster_uuids. This may result in invalid serialization. It is recommended to log an issue to the responsible developer/development team. 
[INFO ] 2019-08-02 18:29:21.878 [[main]-pipeline-manager] javapipeline - Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>1, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50, "pipeline.max_inflight"=>125, :thread=>"#<Thread:0x470bf1ca run>"} [INFO ] 2019-08-02 18:29:22.351 [[main]-pipeline-manager] javapipeline - Pipeline started {"pipeline.id"=>"main"} [INFO ] 2019-08-02 18:29:22.721 [Ruby-0-Thread-1: /usr/share/logstash/lib/bootstrap/environment.rb:6] agent - Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]} [INFO ] 2019-08-02 18:29:23.798 [Api Webserver] agent - Successfully started Logstash API endpoint {:port=>9600} /usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/rufus-scheduler-3.0.9/lib/rufus/scheduler/cronline.rb:77: warning: constant ::Fixnum is deprecated /usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/rufus-scheduler-3.0.9/lib/rufus/scheduler/cronline.rb:77: warning: constant ::Fixnum is deprecated /usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/rufus-scheduler-3.0.9/lib/rufus/scheduler/cronline.rb:77: warning: constant ::Fixnum is deprecated [INFO ] 2019-08-02 18:30:02.333 [Ruby-0-Thread-22: /usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/rufus-scheduler-3.0.9/lib/rufus/scheduler/jobs.rb:284] jdbc - (0.042932s) SELECT * FROM pg_stat_user_indexes [INFO ] 2019-08-02 18:30:02.340 [Ruby-0-Thread-23: /usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/rufus-scheduler-3.0.9/lib/rufus/scheduler/jobs.rb:331] jdbc - (0.043178s) SELECT * FROM pg_stat_user_tables [INFO ] 2019-08-02 18:30:02.340 [Ruby-0-Thread-24: :1] jdbc - (0.036469s) SELECT * FROM pg_stat_database ...

      If Logstash shows no errors and successfully logs the SELECT results from the three pg_stat views, your metrics are being shipped to Elasticsearch. If you get an error instead, double-check all the values in the configuration file to make sure that the machine running Logstash can connect to the managed database.

      Logstash will keep importing the data at the scheduled times. You can safely stop it by pressing CTRL+C.
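
      If you would like to sanity-check the pipeline file itself before starting Logstash again, Logstash can validate a configuration without running it. The sketch below is illustrative only: it assumes the pipeline was saved under /etc/logstash/conf.d/ with a hypothetical name of postgresql.conf, so adjust the path to whatever you actually named your file.

        # Parse the configuration, report any syntax errors, and exit without collecting data
        sudo -u logstash /usr/share/logstash/bin/logstash \
          --path.settings /etc/logstash \
          -f /etc/logstash/conf.d/postgresql.conf \
          --config.test_and_exit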

      As mentioned earlier, when started as a service, Logstash automatically runs all configuration files found under /etc/logstash/conf.d in the background. Run the following command to start it as a service:

      • sudo systemctl start logstash
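
      If you want Logstash to come back up after a reboot, and to confirm that the service started cleanly, the usual systemd commands apply:

        # Start Logstash automatically at boot
        sudo systemctl enable logstash
        # Check that the service is active
        sudo systemctl status logstash
        # Follow the service logs, useful if the pipeline fails to start
        sudo journalctl -u logstash -f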

      In this step, you ran Logstash to verify that it can connect to your database and gather data. Next, you will visualize and explore some of that statistical data in Kibana.

      Step 4 - Exploring the Imported Data in Kibana

      In this section, you will explore the statistical data that describes your database's performance in Kibana.

      In your browser, navigate to the Kibana installation you set up as a prerequisite. You will see the default welcome page.

      Kibana - Default Welcome Page

      To interact with Elasticsearch indexes in Kibana, you will need to create an index pattern. Index patterns specify which indexes Kibana should operate on. To create one, press the last icon (the wrench) in the left vertical sidebar to open the Management page. Then, in the menu on the left, click Index Patterns under Kibana. You will see a dialog for creating an index pattern.

      Kibana - Add Index Pattern

      The three indexes to which Logstash is sending statistics are listed here. Type pg_stat_database into the Index Pattern input box and press Next step. You will be asked to select a field that stores time, so that you can later narrow your data down to a time range. From the dropdown, select @timestamp.

      Kibana - Index Pattern Timestamp Field

      Click Create index pattern to finish creating the index pattern. You will now be able to explore it using Kibana. To create a visualization, click the second icon in the sidebar, and then Create new visualization. Select the Line visualization when the form appears, and choose the index pattern you just created (pg_stat_database). You will see an empty visualization.

      Kibana - Empty Visualisation

      The central part of the screen shows the resulting plot, while the panel on the left controls how it is generated; this is where you define the data for the X and Y axes. In the upper-right corner of the screen is the time range picker. Unless you specifically configure another range when setting up the data, that range is what the plot will show.

      You will now visualize the average number of data tuples INSERTed per minute over the given interval. Click Y-Axis under Metrics in the panel on the left to expand it. Select Average as the Aggregation and tup_inserted as the Field. This will populate the Y axis of the plot with the average values.

      Next, click X-Axis under Buckets. For the Aggregation, choose Date Histogram. The Field should automatically be set to @timestamp. Then, press the blue play button at the top of the panel to generate your plot. If your database is brand new and not being used, you will not see much yet. In any case, however, the plot gives you an accurate portrayal of database usage.

      Kibana supports many other visualization types; you can explore them in the Kibana documentation. You can also add the two remaining indexes, mentioned in Step 2, to Kibana so that you can visualize them as well.
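
      If you prefer not to click through the Management screens again for those, Kibana also exposes a saved objects API that can create index patterns programmatically. The following is only a minimal sketch, assuming your Kibana instance is reachable at localhost:5601 and is not behind additional authentication; it creates a pattern for pg_stat_user_tables with @timestamp as its time field.

        # The kbn-xsrf header is required for calls to Kibana's API
        curl -X POST "http://localhost:5601/api/saved_objects/index-pattern/pg_stat_user_tables" \
          -H "kbn-xsrf: true" \
          -H "Content-Type: application/json" \
          -d '{"attributes": {"title": "pg_stat_user_tables", "timeFieldName": "@timestamp"}}'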

      In this step, you have learned how to visualize some of the PostgreSQL statistical data using Kibana.

      Step 5 - (Optional) Benchmarking Using pgbench

      If you have not yet done any work in your database beyond this tutorial, you can complete this step to create more interesting visualizations by using pgbench to benchmark your database. pgbench runs the same SQL commands over and over, simulating real-world database use by an actual client.

      You will first need to install pgbench by running the following command:

      • sudo apt install postgresql-contrib -y
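
      You can quickly confirm that the pgbench client is available before moving on:

      • pgbench --version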

      Because pgbench will insert and update test data, you will need to create a separate database for it. To do so, go to the Users & Databases tab in the control panel of your managed database and scroll down to the Databases section. Type in pgbench as the name of the new database and press Save. You will pass this name, along with the host, port, and username information, to pgbench.

      Accessing Databases section in DO control panel
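
      If your database user has the CREATEDB privilege, you could instead create this database from the command line with psql. This is only a sketch using the same host, port, and username placeholders as below, plus database as the name of any existing database you can already connect to; whether this works depends on the privileges your managed database provider grants.

      • psql -h host -p port -U username -d database -c "CREATE DATABASE pgbench;"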

      Before actually running pgbench, you will need to run it with the -i flag to initialize its database:

      • pgbench -h host -p port -U username -i pgbench

      You will need to replace host with your host address, port with the port on which you can connect to your database, and username with the database user's name. You can find all of these values in the control panel of your managed database.

      Notice that pgbench does not have a password argument; instead, you will be asked for the password every time you run it.
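
      If you plan to script repeated runs and want to skip the interactive prompt, pgbench, like other libpq-based tools, honors the standard PGPASSWORD environment variable. A minimal sketch, using the same placeholders and a placeholder password, exporting it only for the current shell session:

        # Make the password available to pgbench for this shell session only
        export PGPASSWORD=your_database_password
        # Initialize the pgbench tables without being prompted
        pgbench -h host -p port -U username -i pgbench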

      The output will look like the following:

      Output

      NOTICE: table "pgbench_history" does not exist, skipping NOTICE: table "pgbench_tellers" does not exist, skipping NOTICE: table "pgbench_accounts" does not exist, skipping NOTICE: table "pgbench_branches" does not exist, skipping creating tables... 100000 of 100000 tuples (100%) done (elapsed 0.16 s, remaining 0.00 s) vacuum... set primary keys... done.

      pgbench has created four tables, which it will use for benchmarking, and has populated them with some example rows. You will now be able to run benchmarks.

      The two most important arguments that limit how long the benchmark runs are -t, which specifies the number of transactions to complete, and -T, which defines for how many seconds the benchmark should run. These two options are mutually exclusive. At the end of each benchmark, you will receive statistics such as the number of transactions per second (tps).

      Now, start a benchmark that will last for 30 seconds by running the following command:

      • pgbench -h host -p port -U username pgbench -T 30

      The output will look similar to:

      Output

      starting vacuum...end.
      transaction type: <builtin: TPC-B (sort of)>
      scaling factor: 1
      query mode: simple
      number of clients: 1
      number of threads: 1
      duration: 30 s
      number of transactions actually processed: 7602
      latency average = 3.947 ms
      tps = 253.382298 (including connections establishing)
      tps = 253.535257 (excluding connections establishing)

      In this output you can see general information about the benchmark, such as the total number of transactions executed. The effect of these benchmarks is that the statistics Logstash ships to Elasticsearch will reflect that activity, which in turn will make the visualizations in Kibana more interesting and closer to real-world graphs. You can run the previous command a few more times, possibly altering the duration.
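
      To generate heavier load for the graphs, you can also raise the number of simulated clients and worker threads with pgbench's -c and -j flags. For example, a 60-second run with eight client connections spread over two worker threads, using the same placeholders as before:

        # 60-second benchmark with 8 concurrent clients and 2 worker threads
        pgbench -h host -p port -U username pgbench -T 60 -c 8 -j 2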

      When you are done, head over to Kibana and click Refresh in the upper-right corner. You will now see a line different from before, showing the number of INSERTs. Feel free to change the time range of the data shown by changing the values in the picker positioned above the refresh button. Here is how the graph may look after several benchmarks of varying duration:

      Kibana - Visualization After Benchmarks

      You have used pgbench to benchmark your database and evaluated the resulting graphs in Kibana.

      Conclusion

      You now have the Elastic stack installed on your server and configured to regularly gather statistical data from your managed PostgreSQL database. You can analyze and visualize the data using Kibana or other suitable software, which will help you gather valuable insights and real-world correlations about how your database is performing.

      For more information about what you can do with your managed PostgreSQL database, visit the product documentation.


