      How to Install and Configure pgAdmin 4 in Server Mode


      Introduction

      pgAdmin is an open-source administration and development platform for PostgreSQL and its related database management systems. Written in Python and jQuery, it supports all the features found in PostgreSQL. You can use pgAdmin to do everything from writing basic SQL queries to monitoring your databases and configuring advanced database architectures.

      In this tutorial, we’ll walk through the process of installing and configuring the latest version of pgAdmin onto an Ubuntu 18.04 server, accessing pgAdmin through a web browser, and connecting it to a PostgreSQL database on your server.

      Prerequisites

      To complete this tutorial, you will need:

      Step 1 — Installing pgAdmin and its Dependencies

      As of this writing, the most recent version of pgAdmin is pgAdmin 4, while the most recent version available through the official Ubuntu repositories is pgAdmin 3. pgAdmin 3 is no longer supported though, and the project maintainers recommend installing pgAdmin 4. In this step, we will go over the process of installing the latest version of pgAdmin 4 within a virtual environment (as recommended by the project’s development team) and installing its dependencies using apt.

      To begin, update your server’s package index if you haven’t done so recently:

      • sudo apt update

      Next, install the following dependencies. These include libgmp3-dev, a multiprecision arithmetic library; libpq-dev, which includes header files and a static library that helps communication with a PostgreSQL backend; and libapache2-mod-wsgi-py3, an Apache module that allows you to host Python-based web applications within Apache:

      • sudo apt install libgmp3-dev libpq-dev libapache2-mod-wsgi-py3

      Following this, create a few directories where pgAdmin will store its sessions data, storage data, and logs:

      • sudo mkdir -p /var/lib/pgadmin4/sessions
      • sudo mkdir /var/lib/pgadmin4/storage
      • sudo mkdir /var/log/pgadmin4

      Then, change ownership of these directories to your non-root user and group. This is necessary because they are currently owned by your root user, but we will install pgAdmin from a virtual environment owned by your non-root user, and the installation process involves creating some files within these directories. After the installation, however, we will change the ownership over to the www-data user and group so it can be served to the web:

      • sudo chown -R sammy:sammy /var/lib/pgadmin4
      • sudo chown -R sammy:sammy /var/log/pgadmin4

      Next, open up your virtual environment. Navigate to the directory your programming environment is in and activate it. Following the naming conventions of the prerequisite Python 3 tutorial, we’ll go to the environments directory and activate the my_env environment:

      • cd environments/
      • source my_env/bin/activate

      Following this, download the pgAdmin 4 source code onto your machine. To find the latest version of the source code, navigate to the pgAdmin 4 (Python Wheel) Download page and click the link for the latest version (v3.4, as of this writing). This will take you to a Downloads page on the PostgreSQL website. Once there, copy the file link that ends with .whl — the standard built-package format used for Python distributions. Then go back to your terminal and run the following wget command, making sure to replace the link with the one you copied from the PostgreSQL site, which will download the .whl file to your server:

      • wget https://ftp.postgresql.org/pub/pgadmin/pgadmin4/v3.4/pip/pgadmin4-3.4-py2.py3-none-any.whl
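
If you expect to repeat this download for a later release, one option is to build the link from a version variable. The following is a minimal sketch, assuming that other releases follow the same URL pattern as the v3.4 link above; the variable names are illustrative, and you should still confirm the link on the download page before using it.

```shell
# Version to download; 3.4 is the release used in this tutorial.
PGADMIN_VERSION="3.4"

# Assumption: other releases follow the same file and directory naming scheme.
WHEEL_FILE="pgadmin4-${PGADMIN_VERSION}-py2.py3-none-any.whl"
WHEEL_URL="https://ftp.postgresql.org/pub/pgadmin/pgadmin4/v${PGADMIN_VERSION}/pip/${WHEEL_FILE}"

# Print the link so it can be checked before downloading.
echo "$WHEEL_URL"
```

You could then pass the printed link to wget in place of the hard-coded one.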

      Next, install the wheel package, the reference implementation of the wheel packaging standard. A Python library, this package serves as an extension for building wheels and includes a command line tool for working with .whl files:

      • python -m pip install wheel

      Then install the pgAdmin 4 package with the following command:

      • python -m pip install pgadmin4-3.4-py2.py3-none-any.whl

      That takes care of installing pgAdmin and its dependencies. Before connecting it to your database, though, there are a few changes you’ll need to make to the program’s configuration.

      Step 2 — Configuring pgAdmin 4

      Although pgAdmin has been installed on your server, there are still a few steps you must go through to ensure it has the permissions and configurations needed to allow it to correctly serve the web interface.

      pgAdmin’s main configuration file, config.py, is read before any other configuration file. Its contents can be used as a reference point for further configuration settings that can be specified in pgAdmin’s other config files, but to avoid unforeseen errors, you should not edit the config.py file itself. We will add some configuration changes to a new file, named config_local.py, which will be read after the primary one.

      Create this file now using your preferred text editor. Here, we will use nano:

      • nano my_env/lib/python3.6/site-packages/pgadmin4/config_local.py

      In your editor, add the following content:

      environments/my_env/lib/python3.6/site-packages/pgadmin4/config_local.py

      LOG_FILE = '/var/log/pgadmin4/pgadmin4.log'
      SQLITE_PATH = '/var/lib/pgadmin4/pgadmin4.db'
      SESSION_DB_PATH = '/var/lib/pgadmin4/sessions'
      STORAGE_DIR = '/var/lib/pgadmin4/storage'
      SERVER_MODE = True
      

      Here is what these five directives do:

      • LOG_FILE: this defines the file in which pgAdmin’s logs will be stored.
      • SQLITE_PATH: pgAdmin stores user-related data in an SQLite database, and this directive points the pgAdmin software to this configuration database. Because this file is located under the persistent directory /var/lib/pgadmin4/, your user data will not be lost after you upgrade.
      • SESSION_DB_PATH: specifies which directory will be used to store session data.
      • STORAGE_DIR: defines where pgAdmin will store other data, like backups and security certificates.
      • SERVER_MODE: setting this directive to True tells pgAdmin to run in Server mode, as opposed to Desktop mode.

      Notice that each of these file paths points to the directories you created in Step 1.
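
Before running the setup script, it can help to confirm that those directories actually exist. The following is a minimal sketch; check_dirs is a hypothetical helper written for this example and is not part of pgAdmin.

```shell
# check_dirs (hypothetical helper): report whether each given directory exists.
# The return status is the number of missing directories, so 0 means all present.
check_dirs() {
    missing=0
    for dir in "$@"; do
        if [ -d "$dir" ]; then
            echo "ok       $dir"
        else
            echo "missing  $dir"
            missing=$((missing + 1))
        fi
    done
    return "$missing"
}

# The three directories created in Step 1.
check_dirs /var/lib/pgadmin4/sessions /var/lib/pgadmin4/storage /var/log/pgadmin4 \
    || echo "Create the missing directories before continuing."
```

The sketch only checks for existence; ownership is handled by the chown commands in this tutorial.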

      After adding these lines, save and close the file (press CTRL + X, followed by Y and then ENTER). With those configurations in place, run the pgAdmin setup script to set your login credentials:

      • python my_env/lib/python3.6/site-packages/pgadmin4/setup.py

      After running this command, you will see a prompt asking for your email address and a password. These will serve as your login credentials when you access pgAdmin later on, so be sure to remember or take note of what you enter here:

      Output

      . . .
      Enter the email address and password to use for the initial pgAdmin user account:

      Email address: sammy@example.com
      Password:
      Retype password:

      Following this, deactivate your virtual environment:

      • deactivate

      Recall the file paths you specified in the config_local.py file. These files are held within the directories you created in Step 1, which are currently owned by your non-root user. They must, however, be accessible by the user and group running your web server. By default on Ubuntu 18.04, these are the www-data user and group, so update the permissions on the following directories to give www-data ownership over both of them:

      • sudo chown -R www-data:www-data /var/lib/pgadmin4/
      • sudo chown -R www-data:www-data /var/log/pgadmin4/

      With that, pgAdmin is fully configured. However, the program isn't yet being served from your server, so it remains inaccessible. To resolve this, we will configure Apache to serve pgAdmin so you can access its user interface through a web browser.

      Step 3 — Configuring Apache

      The Apache web server uses virtual hosts to encapsulate configuration details and host more than one domain from a single server. If you followed the prerequisite Apache tutorial, you may have set up an example virtual host file under the name example.com.conf, but in this step we will create a new one from which we can serve the pgAdmin web interface.

      To begin, make sure you're in your root directory:

      • cd /

      Then create a new file in your /sites-available/ directory called pgadmin4.conf. This will be your server’s virtual host file:

      • sudo nano /etc/apache2/sites-available/pgadmin4.conf

      Add the following content to this file, being sure to update your_server_ip, the sammy username, and the python3.6 path segments to align with your own configuration:

      /etc/apache2/sites-available/pgadmin4.conf

      <VirtualHost *>
          ServerName your_server_ip
      
          WSGIDaemonProcess pgadmin processes=1 threads=25 python-home=/home/sammy/environments/my_env
          WSGIScriptAlias / /home/sammy/environments/my_env/lib/python3.6/site-packages/pgadmin4/pgAdmin4.wsgi
      
          <Directory "/home/sammy/environments/my_env/lib/python3.6/site-packages/pgadmin4/">
              WSGIProcessGroup pgadmin
              WSGIApplicationGroup %{GLOBAL}
              Require all granted
          </Directory>
      </VirtualHost>
      

      Save and close the virtual host file. Next, use the a2dissite script to disable the default virtual host file, 000-default.conf:

      • sudo a2dissite 000-default.conf

      Note: If you followed the prerequisite Apache tutorial, you may have already disabled 000-default.conf and set up an example virtual host configuration file (named example.com.conf in the prerequisite). If this is the case, you will need to disable the example.com.conf virtual host file with the following command:

      • sudo a2dissite example.com.conf

      Then use the a2ensite script to enable your pgadmin4.conf virtual host file. This will create a symbolic link from the virtual host file in the /sites-available/ directory to the /sites-enabled/ directory:

      • sudo a2ensite pgadmin4.conf
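
To make the symbolic link idea concrete, the following sketch reproduces the layout in a temporary directory instead of /etc/apache2, so it can be run without sudo. It assumes that a2ensite creates a relative link of this form, which you can confirm on your server by listing /etc/apache2/sites-enabled/.

```shell
# Stand-in directory tree; on a real server the base would be /etc/apache2.
base="$(mktemp -d)"
mkdir -p "$base/sites-available" "$base/sites-enabled"
touch "$base/sites-available/pgadmin4.conf"

# Roughly what a2ensite does: link the file into sites-enabled.
ln -s ../sites-available/pgadmin4.conf "$base/sites-enabled/pgadmin4.conf"

# Show where the link points.
target="$(readlink "$base/sites-enabled/pgadmin4.conf")"
echo "$target"
```

Because the virtual host file itself stays in sites-available, disabling a site with a2dissite only removes the link.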

      Following this, test that your configuration file’s syntax is correct:

      • apachectl configtest

      If your configuration file is all in order, you will see Syntax OK. If you see an error in the output, reopen the pgadmin4.conf file and double check that your IP address and file paths are all correct, then rerun the configtest.

      Once you see Syntax OK in your output, restart the Apache service so it reads your new virtual host file:

      • sudo systemctl restart apache2

      pgAdmin is now fully installed and configured. Next, we'll go over how to access pgAdmin from a browser before connecting it to your PostgreSQL database.

      Step 4 — Accessing pgAdmin

      On your local machine, open up your preferred web browser and navigate to your server’s IP address:

      http://your_server_ip
      

      Once there, you’ll be presented with a login screen similar to the following:

      pgAdmin login screen

      Enter the login credentials you defined in Step 2, and you’ll be taken to the pgAdmin Welcome Screen:

      pgAdmin Welcome Page

      Now that you've confirmed you can access the pgAdmin interface, all that's left to do is to connect pgAdmin to your PostgreSQL database. Before doing so, though, you'll need to make one minor change to your PostgreSQL superuser's configuration.

      Step 5 — Configuring your PostgreSQL User

      If you followed the prerequisite PostgreSQL tutorial, you should already have PostgreSQL installed on your server with a new superuser role and database set up.

      By default in PostgreSQL, you authenticate as database users using the "Identification Protocol," or "ident," authentication method. This involves PostgreSQL taking the client's Ubuntu username and using it as the allowed database username. This can allow for greater security in many cases, but it can also cause issues in instances where you'd like an outside program, such as pgAdmin, to connect to one of your databases. To resolve this, we will set a password for this PostgreSQL role which will allow pgAdmin to connect to your database.

      From your terminal, open the PostgreSQL prompt under your superuser role:

      • sudo -u sammy psql

      From the PostgreSQL prompt, update the user profile to have a strong password of your choosing:

      • ALTER USER sammy PASSWORD 'password';

      Then exit the PostgreSQL prompt:

      • \q

      Next, go back to the pgAdmin 4 interface in your browser, and locate the Browser menu on the left-hand side. Right-click on Servers to open a context menu, hover your mouse over Create, and click Server….

      Create Server context menu

      This will cause a window to pop up in your browser in which you'll enter info about your server, role, and database.

      In the General tab, enter the name for this server. This can be anything you'd like, but you may find it helpful to make it something descriptive. In our example, the server is named Sammy-server-1.

      Create Server - General tab

      Next, click on the Connection tab. In the Host name/address field, enter localhost. The Port should be set to 5432 by default, which will work for this setup, as that's the default port used by PostgreSQL.

      In the Maintenance database field, enter the name of the database you'd like to connect to. Note that this database must already be created on your server. Then, enter the PostgreSQL username and password you configured previously in the Username and Password fields, respectively.

      Create Server - Connection tab

      The empty fields in the other tabs are optional, and it's only necessary that you fill them in if you have a specific setup in mind in which they're required. Click the Save button, and the database will appear under the Servers in the Browser menu.

      You've successfully connected pgAdmin 4 to your PostgreSQL database. You can do just about anything from the pgAdmin dashboard that you would from the PostgreSQL prompt. To illustrate this, we will create an example table and populate it with some sample data through the web interface.

      Step 6 — Creating a Table in the pgAdmin Dashboard

      From the pgAdmin dashboard, locate the Browser menu on the left-hand side of the window. Click on the plus sign (+) next to Servers (1) to expand the tree menu within it. Next, click the plus sign to the left of the server you added in the previous step (Sammy-server-1 in our example), then expand Databases, the name of the database you added (sammy, in our example), and then Schemas (1). You should see a tree menu like the following:

      Expanded Browser tree menu

      Right-click the Tables list item, then hover your cursor over Create and click Table….

      Create Table context menu

      This will open up a Create-Table window. Under the General tab of this window, enter a name for the table. This can be anything you'd like, but to keep things simple we'll refer to it as table-01.

      Create Table - General tab

      Then navigate to the Columns tab and click the + sign in the upper right corner of the window to add some columns. When adding a column, you're required to give it a Name and a Data type, and you may need to choose a Length if it's required by the data type you've selected.

      Additionally, the official PostgreSQL documentation states that adding a primary key to a table is usually best practice. A primary key is a constraint that indicates a specific column or set of columns that can be used as a special identifier for rows in the table. This isn't a requirement, but if you'd like to set one or more of your columns as the primary key, toggle the switch at the far right from No to Yes.

      Click the Save button to create the table.

      Create Table - Columns Tab with Primary Key turned on

      By this point, you've created a table and added a couple of columns to it. However, the columns don't yet contain any data. To add data to your new table, right-click the name of the table in the Browser menu, hover your cursor over Scripts, and click on INSERT Script.

      INSERT script context menu

      This will open a new panel on the dashboard. At the top you'll see a partially-completed INSERT statement, with the appropriate table and column names. Go ahead and replace the question marks (?) with some dummy data, being sure that the data you add aligns with the data types you selected for each column. Note that you can also add multiple rows of data by adding each row in a new set of parentheses, with each set of parentheses separated by a comma as shown in the following example.

      If you'd like, feel free to replace the partially-completed INSERT script with this example INSERT statement:

      INSERT INTO public."table-01"(
          col1, col2, col3)
          VALUES ('Juneau', 14, 337), ('Bismarck', 90, 2334), ('Lansing', 51, 556);
      

      Example INSERT statement

      Click on the lightning bolt icon to execute the INSERT statement. To view the table and all the data within it, right-click the name of your table in the Browser menu once again, hover your cursor over View/Edit Data, and select All Rows.

      View/Edit Data, All Rows context menu

      This will open another new panel. In its lower section, the Data Output tab shows all the data held within that table.

      View Data - example data output

      With that, you've successfully created a table and populated it with some data through the pgAdmin web interface. Of course, this is just one method you can use to create a table through pgAdmin. For example, it's possible to create and populate a table using SQL instead of the GUI-based method described in this step.

      Conclusion

      In this guide, you learned how to install pgAdmin 4 from a Python virtual environment, configure it, serve it to the web with Apache, and connect it to a PostgreSQL database. Additionally, this guide went over one method that can be used to create and populate a table, but pgAdmin can be used for much more than just creating and editing tables.

      For more information on how to get the most out of all of pgAdmin's features, we encourage you to review the project's documentation. You can also learn more about PostgreSQL through our Community tutorials on the subject.




      How to Install Hadoop in Stand-Alone Mode on Debian 9


      Introduction

      Hadoop is a Java-based programming framework that supports the processing and storage of extremely large datasets on a cluster of inexpensive machines. It was the first major open source project in the big data playing field and is sponsored by the Apache Software Foundation.

      Hadoop consists of four main layers:

      • Hadoop Common is the collection of utilities and libraries that support other Hadoop modules.
      • HDFS, which stands for Hadoop Distributed File System, is responsible for persisting data to disk.
      • YARN, short for Yet Another Resource Negotiator, is the “operating system” for HDFS.
      • MapReduce is the original processing model for Hadoop clusters. It distributes work within the cluster or map, then organizes and reduces the results from the nodes into a response to a query. Many other processing models are available for the 3.x version of Hadoop.

      Hadoop clusters are relatively complex to set up, so the project includes a stand-alone mode which is suitable for learning about Hadoop, performing simple operations, and debugging.

      In this tutorial, you’ll install Hadoop in stand-alone mode and run one of the example MapReduce programs it includes to verify the installation.

      Before you begin, you might also like to take a look at An Introduction to Big Data Concepts and Terminology or An Introduction to Hadoop.

      Prerequisites

      To follow this tutorial, you will need:

      Step 1 — Installing Hadoop

      To install Hadoop, first visit the Apache Hadoop Releases page to find the most recent stable release.

      Navigate to the binary for the release you’d like to install. In this guide, we’ll install Hadoop 3.0.3.

      Screenshot of the Hadoop releases page highlighting the link to the latest stable binary

      On the next page, right-click and copy the link to the release binary.

      Screenshot of the Hadoop mirror page

      On your server, use wget to fetch it:

      • wget http://www-us.apache.org/dist/hadoop/common/hadoop-3.0.3/hadoop-3.0.3.tar.gz

      Note: The Apache website will direct you to the best mirror dynamically, so your URL may not match the URL above.

      In order to ensure that the file you downloaded hasn’t been altered, do a quick check using SHA-256. Return to the releases page, then right-click and copy the link to the checksum file for the release binary you downloaded:

      Screenshot highlighting the .mds file

      Again, use wget on your server to download the file:

      • wget https://dist.apache.org/repos/dist/release/hadoop/common/hadoop-3.0.3/hadoop-3.0.3.tar.gz.mds

      Then run the verification:

      • sha256sum hadoop-3.0.3.tar.gz

      Output

      db96e2c0d0d5352d8984892dfac4e27c0e682d98a497b7e04ee97c3e2019277a hadoop-3.0.3.tar.gz

      Compare this value with the SHA-256 value in the .mds file:

      • cat hadoop-3.0.3.tar.gz.mds | grep SHA256

      ~/hadoop-3.0.3.tar.gz.mds

      ...
      SHA256 = DB96E2C0 D0D5352D 8984892D FAC4E27C 0E682D98 A497B7E0 4EE97C3E 2019277A
      

      You can safely ignore the difference in case and the spaces. The output of the command you ran against the file we downloaded from the mirror should match the value in the file you downloaded from apache.org.
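
If you would rather not compare the two values by eye, the following sketch normalizes both strings by stripping whitespace and lowercasing the letters, then compares them. It uses the two values shown above as literal strings; normalize_sum is a hypothetical helper name, and the sketch assumes the published value has been copied onto a single line.

```shell
# normalize_sum (hypothetical helper): remove whitespace and lowercase the hex digits.
normalize_sum() {
    printf '%s' "$1" | tr -d '[:space:]' | tr '[:upper:]' '[:lower:]'
}

# Value printed by sha256sum for the downloaded archive.
computed="db96e2c0d0d5352d8984892dfac4e27c0e682d98a497b7e04ee97c3e2019277a"
# Value copied from the SHA256 line of the .mds file.
published="DB96E2C0 D0D5352D 8984892D FAC4E27C 0E682D98 A497B7E0 4EE97C3E 2019277A"

if [ "$(normalize_sum "$computed")" = "$(normalize_sum "$published")" ]; then
    result="match"
else
    result="MISMATCH"
fi
echo "$result"
```

If the result is a mismatch, download the archive again rather than extracting it.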

      Now that you’ve verified that the file wasn’t corrupted or changed, use the tar command with the -x flag to extract, -z to uncompress, -v for verbose output, and -f to specify that you’re extracting the archive from a file. Use tab-completion or substitute the correct version number in the command below:

      • tar -xzvf hadoop-3.0.3.tar.gz
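
To try these flags without touching the Hadoop download, the following sketch builds a small throwaway archive in a temporary directory, lists it, and extracts it. The demo-1.0 names are made up for illustration. The -t flag lists an archive's contents instead of extracting them, and -C tells tar which directory to work in.

```shell
# Build a small throwaway archive in a temporary directory.
workdir="$(mktemp -d)"
mkdir -p "$workdir/demo-1.0/bin"
echo "hello" > "$workdir/demo-1.0/bin/tool"
tar -czf "$workdir/demo-1.0.tar.gz" -C "$workdir" demo-1.0
rm -r "$workdir/demo-1.0"

# -t lists the archive's contents without extracting anything.
listing="$(tar -tzf "$workdir/demo-1.0.tar.gz")"
echo "$listing"

# -x extracts, -z handles gzip compression, -f names the archive file.
tar -xzf "$workdir/demo-1.0.tar.gz" -C "$workdir"
extracted="$(cat "$workdir/demo-1.0/bin/tool")"
echo "$extracted"
```

Listing an archive first is a quick way to see which top-level directory it will create.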

      Finally, move the extracted files into /usr/local, the appropriate place for locally installed software. Change the version number, if needed, to match the version you downloaded.

      • sudo mv hadoop-3.0.3 /usr/local/hadoop

      With the software in place, we’re ready to configure its environment.

      Step 3 — Running Hadoop

      Let’s make sure Hadoop runs. Execute the following command to launch Hadoop and display its help options:

      • /usr/local/hadoop/bin/hadoop

      You’ll see the following output, which lets you know you’ve successfully configured Hadoop to run in stand-alone mode.

      Output

      Usage: hadoop [OPTIONS] SUBCOMMAND [SUBCOMMAND OPTIONS]
       or    hadoop [OPTIONS] CLASSNAME [CLASSNAME OPTIONS]
        where CLASSNAME is a user-provided Java class

        OPTIONS is none or any of:

      --config dir                     Hadoop config directory
      --debug                          turn on shell script debug mode
      --help                           usage information
      buildpaths                       attempt to add class files from build tree
      hostnames list[,of,host,names]   hosts to use in slave mode
      hosts filename                   list of hosts to use in slave mode
      loglevel level                   set the log4j level for this command
      workers                          turn on worker mode

        SUBCOMMAND is one of:
      . . .

      We’ll ensure that it is functioning properly by running the example MapReduce program it ships with. To do so, create a directory called input in your home directory and copy Hadoop’s configuration files into it to use those files as our data.

      • mkdir ~/input
      • cp /usr/local/hadoop/etc/hadoop/*.xml ~/input

      Next, we’ll run the MapReduce hadoop-mapreduce-examples program, a Java archive with several options. We’ll invoke its grep program, one of the many examples included in hadoop-mapreduce-examples, followed by the input directory, input, and the output directory, grep_example. The MapReduce grep program will count the matches of a literal word or regular expression. Finally, we’ll supply the regular expression allowed[.]* to find occurrences of the word allowed within or at the end of a declarative sentence. The expression is case-sensitive, so we wouldn’t find the word if it were capitalized at the beginning of a sentence.
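
To get a feel for what this regular expression matches, the following sketch applies it to a few made-up sentences with the ordinary command line grep rather than with Hadoop. It only approximates the MapReduce job: it prints each match, then counts identical matches.

```shell
# Sample text standing in for Hadoop's XML configuration files.
sample="This operation is allowed.
Allowed values are listed below.
Only admins are allowed to write.
Anonymous access is not allowed."

# -o prints each match on its own line; -E enables extended regular expressions.
matches="$(printf '%s\n' "$sample" | grep -oE 'allowed[.]*')"

# Count identical matches, most frequent first.
printf '%s\n' "$matches" | sort | uniq -c | sort -rn
```

In this sample, the word followed by a period is matched twice and the bare word once. The capitalized Allowed on the second line is not matched.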

      Execute the following command:

      • /usr/local/hadoop/bin/hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.0.3.jar grep ~/input ~/grep_example 'allowed[.]*'

      When the task completes, it provides a summary of what has been processed and errors it has encountered, but this doesn’t contain the actual results:

      Output

      . . .
              File System Counters
                      FILE: Number of bytes read=1330690
                      FILE: Number of bytes written=3128841
                      FILE: Number of read operations=0
                      FILE: Number of large read operations=0
                      FILE: Number of write operations=0
              Map-Reduce Framework
                      Map input records=2
                      Map output records=2
                      Map output bytes=33
                      Map output materialized bytes=43
                      Input split bytes=115
                      Combine input records=0
                      Combine output records=0
                      Reduce input groups=2
                      Reduce shuffle bytes=43
                      Reduce input records=2
                      Reduce output records=2
                      Spilled Records=4
                      Shuffled Maps =1
                      Failed Shuffles=0
                      Merged Map outputs=1
                      GC time elapsed (ms)=3
                      Total committed heap usage (bytes)=478150656
              Shuffle Errors
                      BAD_ID=0
                      CONNECTION=0
                      IO_ERROR=0
                      WRONG_LENGTH=0
                      WRONG_MAP=0
                      WRONG_REDUCE=0
              File Input Format Counters
                      Bytes Read=147
              File Output Format Counters
                      Bytes Written=34

      The results are stored in the ~/grep_example directory.

      If this output directory already exists, the program will fail, and rather than seeing the summary, you’ll see something like this:

      Output

      . . .
          at java.base/java.lang.reflect.Method.invoke(Method.java:564)
          at org.apache.hadoop.util.RunJar.run(RunJar.java:244)
          at org.apache.hadoop.util.RunJar.main(RunJar.java:158)
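
One way to avoid this failure when rerunning the example is to move any existing output directory aside first. The following is a minimal sketch; backup_output_dir is a hypothetical helper written for this example and is not part of Hadoop.

```shell
# backup_output_dir (hypothetical helper): rename an existing output directory
# by appending a timestamp, so the job can create a fresh directory.
backup_output_dir() {
    dir="$1"
    if [ -e "$dir" ]; then
        mv "$dir" "${dir}.$(date +%Y%m%d%H%M%S)"
    fi
}

# The output directory used in this tutorial.
backup_output_dir "$HOME/grep_example"
```

Renaming keeps the earlier results available for comparison. If you don't need them, deleting the directory works as well.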

      Check the results by running cat on the output directory:

      • cat ~/grep_example/*

      You'll see this output:

      Output

      19      allowed.
      1       allowed

      The MapReduce task found 19 occurrences of the word allowed followed by a period and one occurrence where it was not. Running the example program has verified that our stand-alone installation is working properly and that non-privileged users on the system can run Hadoop for exploration or debugging.

      Conclusion

      In this tutorial, we've installed Hadoop in stand-alone mode and verified it by running an example program it provided. To learn how to write your own MapReduce programs, visit Apache Hadoop's MapReduce tutorial which walks through the code behind the example you used in this tutorial. When you're ready to set up a cluster, see the Apache Foundation Hadoop Cluster Setup guide.


