      How To Perform Full-text Search in MongoDB


      The author selected the Open Internet/Free Speech Fund to receive a donation as part of the Write for DOnations program.

      Introduction

      MongoDB queries that filter data by searching for exact matches, using greater-than or less-than comparisons, or by using regular expressions will work well enough in many situations. However, these methods fall short when it comes to filtering against fields containing rich textual data.

      Imagine you typed “coffee recipe” into a web search engine but it only returned pages that contained that exact phrase. In this case, you may not find exactly what you were looking for since most popular websites with coffee recipes may not contain the exact phrase “coffee recipe.” If you were to enter that phrase into a real search engine, though, you might find pages with titles like “Great Coffee Drinks (with Recipes!)” or “Coffee Shop Drinks and Treats You Can Make at Home.” In these examples, the word “coffee” is present but the titles contain another form of the word “recipe” or exclude it entirely.

      This level of flexibility in matching text to a search query is typical for full-text search engines that specialize in searching textual data. There are multiple specialized open-source tools for such applications in use, with ElasticSearch being an especially popular choice. However, for scenarios that don’t require the robust search features found in dedicated search engines, some general-purpose database management systems offer their own full-text search capabilities.

      In this tutorial, you’ll learn by example how to create a text index in MongoDB and use it to search the documents in the database against common full-text search queries and filters.

      Prerequisites

      To follow this tutorial, you will need:

      Note: The linked tutorials on how to configure your server, install MongoDB, and secure the MongoDB installation refer to Ubuntu 20.04. This tutorial concentrates on MongoDB itself, not the underlying operating system. It will generally work with any MongoDB installation regardless of the operating system as long as authentication has been enabled.

      Step 1 — Preparing the Test Data

      To help you learn how to perform full-text searches in MongoDB, this step outlines how to open the MongoDB shell to connect to your locally-installed MongoDB instance. It also explains how to create a sample collection and insert a few sample documents into it. This sample data will be used in commands and examples throughout this guide to help explain how to use MongoDB to search text data.

      To create this sample collection, connect to the MongoDB shell as your administrative user. This tutorial follows the conventions of the prerequisite MongoDB security tutorial and assumes the name of this administrative user is AdminSammy and its authentication database is admin. Be sure to change these details in the following command to reflect your own setup, if different:

      • mongo -u AdminSammy -p --authenticationDatabase admin

      Enter the password you set during installation to gain access to the shell. After providing the password, your prompt will change to a greater-than sign (>).

      Note: On a fresh connection, the MongoDB shell will connect to the test database by default. You can safely use this database to experiment with MongoDB and the MongoDB shell.

      Alternatively, you could switch to another database to run all of the example commands given in this tutorial. To switch to another database, run the use command followed by the name of your database:

      • use database_name

      To understand how full-text search can be applied to documents in MongoDB, you’ll need a collection of documents you can filter against. This guide will use a collection of sample documents that include names and descriptions of several different types of coffee drinks. These documents will have the same format as the following example document describing a Cuban coffee drink:

      Example Cafecito document

      {
          "name": "Cafecito",
          "description": "A sweet and rich Cuban hot coffee made by topping an espresso shot with a thick sugar cream foam."
      }
      

      This document contains two fields: the name of the coffee drink and a longer description which provides some background information about the drink and its ingredients.

      Run the following insertMany() method in the MongoDB shell to create a collection named recipes and, at the same time, insert five sample documents into it:

      • db.recipes.insertMany([
      • {"name": "Cafecito", "description": "A sweet and rich Cuban hot coffee made by topping an espresso shot with a thick sugar cream foam."},
      • {"name": "New Orleans Coffee", "description": "Cafe Noir from New Orleans is a spiced, nutty coffee made with chicory."},
      • {"name": "Affogato", "description": "An Italian sweet dessert coffee made with fresh-brewed espresso and vanilla ice cream."},
      • {"name": "Maple Latte", "description": "A wintertime classic made with espresso and steamed milk and sweetened with some maple syrup."},
      • {"name": "Pumpkin Spice Latte", "description": "It wouldn't be autumn without pumpkin spice lattes made with espresso, steamed milk, cinnamon spices, and pumpkin puree."}
      • ])

      This method will return a list of object identifiers assigned to the newly inserted objects:

      Output

      { "acknowledged" : true, "insertedIds" : [ ObjectId("61895d2787f246b334ece911"), ObjectId("61895d2787f246b334ece912"), ObjectId("61895d2787f246b334ece913"), ObjectId("61895d2787f246b334ece914"), ObjectId("61895d2787f246b334ece915") ] }

      You can verify that the documents were properly inserted by running the find() method on the recipes collection with no arguments. This will retrieve every document in the collection:

      • db.recipes.find()

      Output

      { "_id" : ObjectId("61895d2787f246b334ece911"), "name" : "Cafecito", "description" : "A sweet and rich Cuban hot coffee made by topping an espresso shot with a thick sugar cream foam." } . . .

      With the sample data in place, you’re ready to start learning how to use MongoDB’s full-text search features.

      Step 2 — Creating a Text Index

      To start using MongoDB’s full-text search capabilities, you must create a text index on a collection. Indexes are special data structures that store only a small subset of data from each document in a collection separately from the documents themselves. There are several types of indexes users can create in MongoDB, all of which help the database optimize search performance when querying the collection.

      A text index, however, is a special type of index used to further facilitate searching fields containing text data. When a user creates a text index, MongoDB will automatically drop any language-specific stop words from searches. This means that MongoDB will ignore the most common words for the given language (in English, words like “a”, “an”, “the”, or “this”).

      MongoDB will also implement a form of suffix-stemming in searches. This involves MongoDB identifying the root part of the search term and treating other grammar forms of that root (created by adding common suffixes like “-ing”, “-ed”, or perhaps “-er”) as equivalent to the root for the purposes of the search.

      Thanks to these and other features, MongoDB can more flexibly support queries written in natural language and provide better results.
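
      Both behaviors can be illustrated outside the database with a few lines of plain JavaScript. The following is only a sketch: the STOP_WORDS list and the naiveStem and toIndexTerms functions are hypothetical names, and the suffix stripping is far cruder than the language-aware rules MongoDB applies. It does show why different forms of a word can end up matching each other:

```javascript
// A tiny, hypothetical stop word list; MongoDB's per-language lists are longer.
const STOP_WORDS = new Set(["a", "an", "the", "this", "with", "and"]);

// Naive suffix stripping for illustration only (not MongoDB's stemming algorithm).
function naiveStem(word) {
  for (const suffix of ["ing", "ed", "es", "s", "e"]) {
    // Only strip a suffix if at least three characters remain afterwards.
    if (word.length >= suffix.length + 3 && word.endsWith(suffix)) {
      return word.slice(0, -suffix.length);
    }
  }
  return word;
}

// Lowercase the text, split it into words, drop stop words, and stem the rest.
function toIndexTerms(text) {
  const words = text.toLowerCase().match(/[a-z]+/g) || [];
  return words.filter((w) => !STOP_WORDS.has(w)).map(naiveStem);
}

console.log(toIndexTerms("The spiced coffee with cinnamon spices"));
// prints [ 'spic', 'coffe', 'cinnamon', 'spic' ]
```

      In this sketch, “spiced” and “spices” both reduce to the same term, while “the” and “with” disappear entirely, so a search for one form of the word would find the other.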

      Note: This tutorial focuses on English text, but MongoDB supports multiple languages when using full-text search and text indexes. To learn more about what languages MongoDB supports, refer to the official documentation on supported languages.

      You can only create one text index for any given MongoDB collection, but the index can be created using more than one field. In our example collection, there is useful text stored in both the name and description fields of each document. It could be useful to create a text index for both fields.

      Run the following createIndex() method, which will create a text index for the two fields:

      • db.recipes.createIndex({ "name": "text", "description": "text" });

      For each of the two fields, name and description, the index type is set to text, telling MongoDB to create a text index tailored for full-text search based on these fields. The output will confirm the index creation:

      Output

      { "createdCollectionAutomatically" : false, "numIndexesBefore" : 1, "numIndexesAfter" : 2, "ok" : 1 }

      Now that you’ve created the index, you can use it to issue full-text search queries to the database. In the next step, you’ll learn how to execute queries containing both single and multiple words.

      Step 3 — Searching for One or More Individual Words

      Perhaps the most common search problem is to look up documents containing one or more individual words.

      Typically, users expect the search engine to be flexible in determining where the given search terms should appear. As an example, if you were to use any popular web search engine and type in “coffee sweet spicy”, you likely are not expecting results that will contain those three words in that exact order. It’s more likely that you’d expect a list of web pages containing the words “coffee”, “sweet”, and “spicy” but not necessarily immediately near each other.

      That’s also how MongoDB approaches typical search queries when using text indexes. This step outlines how MongoDB interprets search queries with a few examples.

      To begin, say you want to search for coffee drinks with spices in their recipe, so you search for the word spiced alone using the following command:

      • db.recipes.find({ $text: { $search: "spiced" } });

      Notice that the syntax when using full-text search is slightly different from regular queries. Individual field names — like name or description — don’t appear in the filter document. Instead, the query uses the $text operator, telling MongoDB that this query intends to use the text index you created previously. You don’t need to be any more specific than that because, as you may recall, a collection may only have a single text index. Inside the embedded document for this filter is the $search operator taking the search query as its value. In this example, the query is a single word: spiced.

      After running this command, MongoDB produces the following list of documents:

      Output

      { "_id" : ObjectId("61895d2787f246b334ece915"), "name" : "Pumpkin Spice Latte", "description" : "It wouldn't be autumn without pumpkin spice lattes made with espresso, steamed milk, cinnamon spices, and pumpkin puree." }
      { "_id" : ObjectId("61895d2787f246b334ece912"), "name" : "New Orleans Coffee", "description" : "Cafe Noir from New Orleans is a spiced, nutty coffee made with chicory." }

      There are two documents in the result set, both of which contain words resembling the search query. While the New Orleans Coffee document does have the word spiced in the description, the Pumpkin Spice Latte document doesn’t.

      Regardless, it was still returned by this query thanks to MongoDB’s use of stemming. MongoDB stripped the search term spiced down to its root, spice, and compared that root against the stemmed words stored in the index. Because of this, the words spice and spices in the Pumpkin Spice Latte document matched the search query successfully, even though you didn’t search for either of those words specifically.

      Now, suppose you’re particularly fond of espresso drinks. Try looking up documents with a two-word query, spiced espresso, to look for a spicy, espresso-based coffee.

      • db.recipes.find({ $text: { $search: "spiced espresso" } });

      The list of results this time is longer than before:

      Output

      { "_id" : ObjectId("61895d2787f246b334ece914"), "name" : "Maple Latte", "description" : "A wintertime classic made with espresso and steamed milk and sweetened with some maple syrup." }
      { "_id" : ObjectId("61895d2787f246b334ece913"), "name" : "Affogato", "description" : "An Italian sweet dessert coffee made with fresh-brewed espresso and vanilla ice cream." }
      { "_id" : ObjectId("61895d2787f246b334ece911"), "name" : "Cafecito", "description" : "A sweet and rich Cuban hot coffee made by topping an espresso shot with a thick sugar cream foam." }
      { "_id" : ObjectId("61895d2787f246b334ece915"), "name" : "Pumpkin Spice Latte", "description" : "It wouldn't be autumn without pumpkin spice lattes made with espresso, steamed milk, cinnamon spices, and pumpkin puree." }
      { "_id" : ObjectId("61895d2787f246b334ece912"), "name" : "New Orleans Coffee", "description" : "Cafe Noir from New Orleans is a spiced, nutty coffee made with chicory." }

      When using multiple words in a search query, MongoDB performs a logical OR operation, so a document only has to match one part of the expression to be included in the result set. The results contain documents containing both spiced and espresso or either term alone. Notice that words do not necessarily need to appear near each other as long as they appear in the document somewhere.
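
      The OR behavior can be mimicked in plain JavaScript, without a database. This is a minimal sketch that uses exact word matching with no stemming; the wordsOf and matchesAny helpers are hypothetical names, and the descriptions are copied from three of the sample documents:

```javascript
// Three of the sample documents from Step 1.
const docs = [
  { name: "New Orleans Coffee", description: "Cafe Noir from New Orleans is a spiced, nutty coffee made with chicory." },
  { name: "Affogato", description: "An Italian sweet dessert coffee made with fresh-brewed espresso and vanilla ice cream." },
  { name: "Maple Latte", description: "A wintertime classic made with espresso and steamed milk and sweetened with some maple syrup." },
];

// Break a piece of text into a set of lowercase words.
function wordsOf(text) {
  return new Set(text.toLowerCase().match(/[a-z]+/g) || []);
}

// Logical OR: the document matches if it contains at least one query word.
function matchesAny(doc, query) {
  const docWords = wordsOf(doc.name + " " + doc.description);
  return [...wordsOf(query)].some((term) => docWords.has(term));
}

const hits = docs.filter((doc) => matchesAny(doc, "chicory vanilla"));
console.log(hits.map((doc) => doc.name));
// prints [ 'New Orleans Coffee', 'Affogato' ]
```

      Neither matching document contains both words: one mentions chicory and the other vanilla, which is enough under OR semantics. The Maple Latte document contains neither word, so it is left out.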

      Note: If you try to execute any full-text search query on a collection for which there is no text index defined, MongoDB will return an error message instead:

      Error message

      Error: error: { "ok" : 0, "errmsg" : "text index required for $text query", "code" : 27, "codeName" : "IndexNotFound" }

      In this step, you learned how to use one or multiple words as a text search query, how MongoDB joins multiple words with a logical OR operation, and how MongoDB performs stemming. Next, you’ll use a complete phrase in a text search query and begin using exclusions to narrow down your search results further.

      Step 4 — Searching for Full Phrases and Using Exclusions

      Looking up individual words might return too many results, or the results may not be precise enough. In this step, you’ll use phrase search and exclusions to control search results more precisely.

      Suppose you have a sweet tooth, it’s hot outside, and coffee topped with ice cream sounds like a nice treat. Try finding an ice cream coffee using the basic search query as outlined previously:

      • db.recipes.find({ $text: { $search: "ice cream" } });

      The database will return two coffee recipes:

      Output

      { "_id" : ObjectId("61895d2787f246b334ece913"), "name" : "Affogato", "description" : "An Italian sweet dessert coffee made with fresh-brewed espresso and vanilla ice cream." }
      { "_id" : ObjectId("61895d2787f246b334ece911"), "name" : "Cafecito", "description" : "A sweet and rich Cuban hot coffee made by topping an espresso shot with a thick sugar cream foam." }

      While the Affogato document matches your expectations, Cafecito isn’t made with ice cream. The search engine, using the logical OR operation, accepted the second result just because the word cream appears in the description.

      To tell MongoDB that you are looking for ice cream as a full phrase and not two separate words, use the following query:

      • db.recipes.find({ $text: { $search: "\"ice cream\"" } });

      Notice the backslashes preceding each of the double quotes surrounding the phrase: \"ice cream\". The search query you’re executing is "ice cream", with double quotes denoting a phrase that should be matched exactly. The backslashes (\) escape the inner double quotes so that they are treated as part of the $search value rather than as the end of the string.
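
      Because filter documents in the shell are JavaScript values, the effect of the escaping can be checked in any JavaScript environment. The sketch below only builds the filter documents and does not query a database; the variable names are illustrative:

```javascript
// Inside a double-quoted string, \" produces a literal double quote character.
const escaped = { $text: { $search: "\"ice cream\"" } };

// A single-quoted string can hold double quotes without any escaping.
const singleQuoted = { $text: { $search: '"ice cream"' } };

console.log(escaped.$text.$search); // prints "ice cream", including the quotes
console.log(escaped.$text.$search === singleQuoted.$text.$search); // prints true
```

      Both spellings produce the same search string, so which one to use is a matter of preference.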

      This time, MongoDB returns a single result:

      Output

      { "_id" : ObjectId("61895d2787f246b334ece913"), "name" : "Affogato", "description" : "An Italian sweet dessert coffee made with fresh-brewed espresso and vanilla ice cream." }

      This document matches the search term exactly, and neither cream nor ice alone would be enough to count as a match.

      Another useful full-text search feature is the exclusion modifier. To illustrate how this works, first run the following query to get a list of all the coffee drinks in the collection based on espresso:

      • db.recipes.find({ $text: { $search: "espresso" } });

      This query returns four documents:

      Output

      { "_id" : ObjectId("61895d2787f246b334ece914"), "name" : "Maple Latte", "description" : "A wintertime classic made with espresso and steamed milk and sweetened with some maple syrup." }
      { "_id" : ObjectId("61895d2787f246b334ece913"), "name" : "Affogato", "description" : "An Italian sweet dessert coffee made with fresh-brewed espresso and vanilla ice cream." }
      { "_id" : ObjectId("61895d2787f246b334ece915"), "name" : "Pumpkin Spice Latte", "description" : "It wouldn't be autumn without pumpkin spice lattes made with espresso, steamed milk, cinnamon spices, and pumpkin puree." }
      { "_id" : ObjectId("61895d2787f246b334ece911"), "name" : "Cafecito", "description" : "A sweet and rich Cuban hot coffee made by topping an espresso shot with a thick sugar cream foam." }

      Notice that two of these drinks are served with milk, but suppose you want a milk-free drink. This is a case where exclusions can come in handy. In a single query, you can join words that you want to appear in the results with those that you want to be excluded by prepending the word or phrase you want to exclude with a minus sign (-).

      As an example, say you run the following query to look up espresso coffees that do not contain milk:

      • db.recipes.find({ $text: { $search: "espresso -milk" } });

      With this query, two documents will be excluded from the previously returned results:

      Output

      { "_id" : ObjectId("61895d2787f246b334ece913"), "name" : "Affogato", "description" : "An Italian sweet dessert coffee made with fresh-brewed espresso and vanilla ice cream." }
      { "_id" : ObjectId("61895d2787f246b334ece911"), "name" : "Cafecito", "description" : "A sweet and rich Cuban hot coffee made by topping an espresso shot with a thick sugar cream foam." }

      You can also exclude full phrases. To search for coffees without ice cream, you could include -"ice cream" in your search query. Again, you’d need to escape the double quotes with backslashes, like this:

      • db.recipes.find({ $text: { $search: "espresso -\"ice cream\"" } });

      Output

      { "_id" : ObjectId("61d48c31a285f8250c8dd5e6"), "name" : "Maple Latte", "description" : "A wintertime classic made with espresso and steamed milk and sweetened with some maple syrup." }
      { "_id" : ObjectId("61d48c31a285f8250c8dd5e7"), "name" : "Pumpkin Spice Latte", "description" : "It wouldn't be autumn without pumpkin spice lattes made with espresso, steamed milk, cinnamon spices, and pumpkin puree." }
      { "_id" : ObjectId("61d48c31a285f8250c8dd5e3"), "name" : "Cafecito", "description" : "A sweet and rich Cuban hot coffee made by topping an espresso shot with a thick sugar cream foam." }
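
      To see how a search string breaks down into plain terms, phrases, and exclusions, consider the following plain JavaScript sketch. The parseSearch and matchesSearch functions are hypothetical helpers written for this illustration, not part of MongoDB. The sketch uses crude substring matching and assumes that, when a phrase is present, the phrase is what a document must contain:

```javascript
// Split a $search-style string into plain terms, quoted phrases, and exclusions.
function parseSearch(search) {
  const parsed = { terms: [], phrases: [], excluded: [] };
  // One token = optional minus sign, then either a "quoted phrase" or a bare word.
  const tokenPattern = /(-?)(?:"([^"]+)"|([^\s"]+))/g;
  let match;
  while ((match = tokenPattern.exec(search)) !== null) {
    const [, minus, phrase, word] = match;
    const value = (phrase || word).toLowerCase();
    if (minus === "-") parsed.excluded.push(value);
    else if (phrase) parsed.phrases.push(value);
    else parsed.terms.push(value);
  }
  return parsed;
}

// Crude substring matching; real text search works on stemmed index terms instead.
function matchesSearch(text, search) {
  const { terms, phrases, excluded } = parseSearch(search);
  const has = (value) => text.toLowerCase().includes(value);
  if (excluded.some(has)) return false; // any excluded item rules the document out
  if (phrases.length > 0) return phrases.every(has); // phrases must be present
  return terms.some(has); // plain terms are joined with OR
}

// Descriptions copied from two of the sample documents.
const affogato = "An Italian sweet dessert coffee made with fresh-brewed espresso and vanilla ice cream.";
const cafecito = "A sweet and rich Cuban hot coffee made by topping an espresso shot with a thick sugar cream foam.";

console.log(parseSearch('espresso -"ice cream"'));
console.log(matchesSearch(affogato, 'espresso -"ice cream"')); // prints false
console.log(matchesSearch(cafecito, 'espresso -"ice cream"')); // prints true
```

      As in the query output, the Affogato description is ruled out by the excluded phrase, while the Cafecito description passes because it mentions espresso and contains “cream” but not “ice cream.”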

      Now that you’ve learned how to filter documents based on a phrase consisting of multiple words and how to exclude certain words and phrases from search results, you can acquaint yourself with MongoDB’s full-text search scoring.

      Step 5 — Scoring the Results and Sorting By Score

      When a query, especially a complex one, returns multiple results, some documents are likely to be a better match than others. For example, when you look for spiced espresso drinks, those that are both spiced and espresso-based are more fitting than those without spices or not using espresso as the base.

      Full-text search engines typically assign a relevance score to the search results, indicating how well they match the search query. MongoDB also does this, but the search relevance is not visible by default.

      Search once again for spiced espresso, but this time have MongoDB also return each result’s search relevance score. To do this, you could add a projection after the query filter document:

      • db.recipes.find(
      • { $text: { $search: "spiced espresso" } },
      • { score: { $meta: "textScore" } }
      • )

      The projection { score: { $meta: "textScore" } } uses the $meta operator, a special kind of projection that returns specific metadata from returned documents. This example returns the documents’ textScore metadata, a built-in feature of MongoDB’s full-text search engine that contains the search relevance score.

      After executing the query, the returned documents will include a new field named score, as specified in the projection document:

      Output

      { "_id" : ObjectId("61895d2787f246b334ece913"), "name" : "Affogato", "description" : "An Italian sweet dessert coffee made with fresh-brewed espresso and vanilla ice cream.", "score" : 0.5454545454545454 }
      { "_id" : ObjectId("61895d2787f246b334ece911"), "name" : "Cafecito", "description" : "A sweet and rich Cuban hot coffee made by topping an espresso shot with a thick sugar cream foam.", "score" : 0.5384615384615384 }
      { "_id" : ObjectId("61895d2787f246b334ece914"), "name" : "Maple Latte", "description" : "A wintertime classic made with espresso and steamed milk and sweetened with some maple syrup.", "score" : 0.55 }
      { "_id" : ObjectId("61895d2787f246b334ece912"), "name" : "New Orleans Coffee", "description" : "Cafe Noir from New Orleans is a spiced, nutty coffee made with chicory.", "score" : 0.5454545454545454 }
      { "_id" : ObjectId("61895d2787f246b334ece915"), "name" : "Pumpkin Spice Latte", "description" : "It wouldn't be autumn without pumpkin spice lattes made with espresso, steamed milk, cinnamon spices, and pumpkin puree.", "score" : 2.0705128205128203 }

      Notice how much higher the score is for Pumpkin Spice Latte, the only coffee drink that contains both the words spiced and espresso. According to MongoDB’s relevance score, it’s the most relevant document for that query. However, by default, the results are not returned in order of relevance.

      To change that, you could add a sort() clause to the query, like this:

      • db.recipes.find(
      • { $text: { $search: "spiced espresso" } },
      • { score: { $meta: "textScore" } }
      • ).sort(
      • { score: { $meta: "textScore" } }
      • );

      The syntax for the sorting document is the same as that of the projection. Now, the list of documents is the same, but their order is different:

      Output

      { "_id" : ObjectId("61895d2787f246b334ece915"), "name" : "Pumpkin Spice Latte", "description" : "It wouldn't be autumn without pumpkin spice lattes made with espresso, steamed milk, cinnamon spices, and pumpkin puree.", "score" : 2.0705128205128203 }
      { "_id" : ObjectId("61895d2787f246b334ece914"), "name" : "Maple Latte", "description" : "A wintertime classic made with espresso and steamed milk and sweetened with some maple syrup.", "score" : 0.55 }
      { "_id" : ObjectId("61895d2787f246b334ece913"), "name" : "Affogato", "description" : "An Italian sweet dessert coffee made with fresh-brewed espresso and vanilla ice cream.", "score" : 0.5454545454545454 }
      { "_id" : ObjectId("61895d2787f246b334ece912"), "name" : "New Orleans Coffee", "description" : "Cafe Noir from New Orleans is a spiced, nutty coffee made with chicory.", "score" : 0.5454545454545454 }
      { "_id" : ObjectId("61895d2787f246b334ece911"), "name" : "Cafecito", "description" : "A sweet and rich Cuban hot coffee made by topping an espresso shot with a thick sugar cream foam.", "score" : 0.5384615384615384 }

      The Pumpkin Spice Latte document appears as the first result since it has the highest relevance score.

      Sorting results according to their relevance score can be helpful. This is especially true with queries containing multiple words, where the most fitting documents will usually contain multiple search terms while the less relevant documents might contain only one.
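
      Conceptually, sorting by relevance is an ordinary descending sort on the score value. The following plain JavaScript sketch applies that ordering to the names and scores copied from the output in this step. It runs in application code on already-fetched results and is shown only to illustrate the ordering; letting the database sort, as in the query above, is usually the simpler option:

```javascript
// Names and scores copied from the unsorted query output earlier in this step.
const results = [
  { name: "Affogato", score: 0.5454545454545454 },
  { name: "Cafecito", score: 0.5384615384615384 },
  { name: "Maple Latte", score: 0.55 },
  { name: "New Orleans Coffee", score: 0.5454545454545454 },
  { name: "Pumpkin Spice Latte", score: 2.0705128205128203 },
];

// Copy the array, then sort from the highest score to the lowest.
const byRelevance = [...results].sort((a, b) => b.score - a.score);

console.log(byRelevance.map((result) => result.name));
```

      The document matching both search terms rises to the top, and the documents that match only one term follow with scores that are close to each other.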

      Conclusion

      By following this tutorial, you’ve acquainted yourself with MongoDB’s full-text search features. You created a text index and wrote text search queries using single and multiple words, full phrases, and exclusions. You’ve also assessed the relevance scores for returned documents and sorted the search results to show the most relevant results first. While MongoDB’s full-text search features may not be as robust as those of some dedicated search engines, they are capable enough for many use cases.

      Note that there are more search query modifiers, such as case and diacritic sensitivity and support for multiple languages within a single text index. These can be used in more robust scenarios to support text search applications. For more information on MongoDB’s full-text search features and how they can be used, we encourage you to check out the official MongoDB documentation.



      How To Perform CRUD operations in MongoDB


      The author selected the Open Internet/Free Speech Fund to receive a donation as part of the Write for DOnations program.

      Introduction

      MongoDB is a persistent document-oriented database used to store and process data in the form of documents. As with other database management systems, MongoDB allows you to manage and interact with data through four fundamental types of data operations:

      • Create operations, which involve writing data to the database
      • Read operations, which query a database to retrieve data from it
      • Update operations, which change data that already exists in a database
      • Delete operations, which permanently remove data from a database

      These four operations are jointly referred to as CRUD operations.

      This tutorial outlines how to create new MongoDB documents and later retrieve them to read their data. It also explains how to update the data within documents, as well as how to delete documents when they are no longer needed.

      Prerequisites

      To follow this tutorial, you will need:

      Note: The linked tutorials on how to configure your server, install MongoDB, and secure the MongoDB installation refer to Ubuntu 20.04. This tutorial concentrates on MongoDB itself, not the underlying operating system. It will generally work with any MongoDB installation regardless of the operating system as long as authentication has been enabled.

      Step 1 — Connecting to the MongoDB Server

      This guide involves using the MongoDB shell to interact with MongoDB. In order to follow along and practice CRUD operations in MongoDB, you must first connect to a MongoDB database by opening up the MongoDB shell.

      If your MongoDB instance is running on a remote server, SSH into that server from your local machine:

      Then connect to your MongoDB installation by opening up the MongoDB shell. Be sure to connect as a MongoDB user with privileges to write and read data. If you followed the prerequisite MongoDB security tutorial, you can connect as the administrative user you created in Step 1 of that guide:

      • mongo -u AdminSammy -p --authenticationDatabase admin

      After providing the user’s password, your terminal prompt will change to a greater-than sign (>). This means the shell is now ready to accept commands for the MongoDB server it’s connected to.

      Note: On a fresh connection, the MongoDB shell will connect to the test database by default. You can safely use this database to experiment with MongoDB and the MongoDB shell.

      Alternatively, you could switch to another database to run all of the example commands given in this tutorial. To switch to another database, run the use command followed by the name of your database:

      • use database_name

      Now that you have connected to the MongoDB server using a MongoDB shell, you can move on to creating new documents.

      Step 2 — Creating Documents

      In order to have data that you can practice reading, updating, and deleting in the later steps of this guide, this step focuses on how to create data documents in MongoDB.

      Imagine that you’re using MongoDB to build and manage a directory of famous historical monuments from around the world. This directory will store information like each monument’s name, country, city, and geographical location.

      The documents in this directory will follow a format similar to this example, which represents The Pyramids of Giza:

      The Pyramids of Giza

      {
          "name": "The Pyramids of Giza",
          "city": "Giza",
          "country": "Egypt",
          "gps": {
              "lat": 29.976480,
              "lng": 31.131302
          }
      }
      

      This document, like all MongoDB documents, is written in BSON. BSON is a binary form of JSON, a human-readable data format. All data in BSON or JSON documents are represented as field-and-value pairs that take the form of field: value.

      This document consists of four fields. First is the name of the monument, followed by the city and the country. All three of these fields contain strings. The last field, called gps, is a nested document which details the monument’s GPS location. This location is made up of a pair of latitude and longitude coordinates, represented by the lat and lng fields respectively, each of which holds a floating point value.
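
      Since the MongoDB shell accepts JavaScript, the same structure can be explored as an ordinary JavaScript object. The following sketch, which needs no database connection, shows how the top-level fields and the nested gps document are accessed:

```javascript
// The example document written as a plain JavaScript object.
const monument = {
  name: "The Pyramids of Giza",
  city: "Giza",
  country: "Egypt",
  gps: {
    lat: 29.976480,
    lng: 31.131302,
  },
};

console.log(Object.keys(monument)); // the four top-level field names
console.log(typeof monument.name); // prints string
console.log(typeof monument.gps.lat); // prints number
console.log(monument.gps.lat, monument.gps.lng); // nested fields use dot access
```

      The nested document is reached through its parent field, so the latitude is read as monument.gps.lat rather than as a top-level field.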

      Note: You can learn more about how MongoDB documents are structured in our conceptual article An Introduction to Document-Oriented Databases.

      Insert this document into a new collection called monuments using the insertOne method. As its name implies, insertOne is used to create individual documents, as opposed to creating multiple documents at once.

      In the MongoDB shell, run the following operation:

      • db.monuments.insertOne(
      • {
      • "name": "The Pyramids of Giza",
      • "city": "Giza",
      • "country": "Egypt",
      • "gps": {
      • "lat": 29.976480,
      • "lng": 31.131302
      • }
      • }
      • )

      Notice that you haven’t explicitly created the monuments collection before executing this insertOne method. MongoDB allows you to run commands on non-existent collections freely, and the missing collections only get created when the first object is inserted. Executing this example insertOne() method will not only insert the document into the collection but also create the collection automatically.

      MongoDB will execute the insertOne method and insert the requested document representing the Pyramids of Giza. The operation’s output will inform you that it executed successfully, and will also provide the ObjectId which it generated automatically for the new document:

      Output

      { "acknowledged" : true, "insertedId" : ObjectId("6105752352e6d1ebb7072647") }

      In MongoDB, each document within a collection must have a unique _id field which acts as a primary key. You can include the _id field and provide it with a value of your own choosing, as long as you ensure each document’s _id field will be unique. However, if a new document omits the _id field, MongoDB will automatically generate an object identifier (in the form of an ObjectId object) as the value for the _id field.
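
      The rule can be pictured with a small in-memory stand-in written in plain JavaScript. Everything in this sketch is hypothetical: the Map plays the role of a collection, the local insertOne function only imitates the shape of the real method’s result, and a simple counter stands in for the ObjectId that MongoDB would generate:

```javascript
// Hypothetical in-memory stand-in for a collection, keyed by _id.
const collection = new Map();
let nextId = 1;

function insertOne(doc) {
  // Keep the caller's _id if one was given; otherwise generate a value.
  // (MongoDB generates an ObjectId here; a counter is used for illustration.)
  const _id = "_id" in doc ? doc._id : `generated-${nextId++}`;
  if (collection.has(_id)) {
    throw new Error(`duplicate _id: ${_id}`);
  }
  collection.set(_id, { ...doc, _id });
  return { acknowledged: true, insertedId: _id };
}

console.log(insertOne({ name: "The Pyramids of Giza" })); // _id is generated
console.log(insertOne({ _id: "acropolis", name: "Acropolis" })); // _id is kept

try {
  insertOne({ _id: "acropolis", name: "Another Acropolis" });
} catch (error) {
  console.log(error.message); // prints duplicate _id: acropolis
}
```

      The third call fails because the chosen _id is already taken, which mirrors the requirement that every _id within a collection be unique.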

      You can verify that the document was inserted by checking the object count in the monuments collection:

      • db.monuments.count()

      Since you’ve only inserted one document into this collection, the count method will return 1:

      Output

      1

      Inserting documents one by one like this would quickly become tedious if you wanted to create multiple documents. MongoDB provides the insertMany method which you can use to insert multiple documents in a single operation.

      Run the following example command, which uses the insertMany method to insert six additional famous monuments into the monuments collection:

      • db.monuments.insertMany([
      • {"name": "The Valley of the Kings", "city": "Luxor", "country": "Egypt", "gps": { "lat": 25.746424, "lng": 32.605309 }},
      • {"name": "Arc de Triomphe", "city": "Paris", "country": "France", "gps": { "lat": 48.873756, "lng": 2.294946 }},
      • {"name": "The Eiffel Tower", "city": "Paris", "country": "France", "gps": { "lat": 48.858093, "lng": 2.294694 }},
      • {"name": "Acropolis", "city": "Athens", "country": "Greece", "gps": { "lat": 37.970833, "lng": 23.726110 }},
      • {"name": "The Great Wall of China", "city": "Huairou", "country": "China", "gps": { "lat": 40.431908, "lng": 116.570374 }},
      • {"name": "The Statue of Liberty", "city": "New York", "country": "USA", "gps": { "lat": 40.689247, "lng": -74.044502 }}
      • ])

      Notice the square brackets ([ and ]) surrounding the six documents. These brackets signify an array of documents. Within square brackets, multiple objects can appear one after another, delimited by commas. In cases where the MongoDB method requires more than one object, you can provide a list of objects in the form of an array like this one.

      MongoDB will respond with several object identifiers, one for each of the newly inserted objects:

      Output

      { "acknowledged" : true, "insertedIds" : [ ObjectId("6105770952e6d1ebb7072648"), ObjectId("6105770952e6d1ebb7072649"), ObjectId("6105770952e6d1ebb707264a"), ObjectId("6105770952e6d1ebb707264b"), ObjectId("6105770952e6d1ebb707264c"), ObjectId("6105770952e6d1ebb707264d") ] }

      You can verify that the documents were inserted by checking the object count in the monuments collection:

      • db.monuments.count()

      After adding these six new documents, the expected output of this command is 7:

      Output

      7

      With that, you have used two separate insertion methods to create a number of documents representing several famous monuments. Next, you will read the data you just inserted with MongoDB’s find() method.

      Step 3 — Reading Documents

      Now that your collection has some documents stored within it, you can query your database to retrieve these documents and read their data. This step first outlines how to query all of the documents in a given collection, and then describes how to use filters to narrow down the list of retrieved documents.

      After completing the previous step, you now have seven documents describing famous monuments inserted into the monuments collection. You can retrieve all seven documents with a single operation using the find() method:

      • db.monuments.find()

      This method, when used without any arguments, doesn’t apply any filtering and asks MongoDB to return all objects available in the specified collection, monuments. MongoDB will return the following output:

      Output

      { "_id" : ObjectId("6105752352e6d1ebb7072647"), "name" : "The Pyramids of Giza", "city" : "Giza", "country" : "Egypt", "gps" : { "lat" : 29.97648, "lng" : 31.131302 } }
      { "_id" : ObjectId("6105770952e6d1ebb7072648"), "name" : "The Valley of the Kings", "city" : "Luxor", "country" : "Egypt", "gps" : { "lat" : 25.746424, "lng" : 32.605309 } }
      { "_id" : ObjectId("6105770952e6d1ebb7072649"), "name" : "Arc de Triomphe", "city" : "Paris", "country" : "France", "gps" : { "lat" : 48.873756, "lng" : 2.294946 } }
      { "_id" : ObjectId("6105770952e6d1ebb707264a"), "name" : "The Eiffel Tower", "city" : "Paris", "country" : "France", "gps" : { "lat" : 48.858093, "lng" : 2.294694 } }
      { "_id" : ObjectId("6105770952e6d1ebb707264b"), "name" : "Acropolis", "city" : "Athens", "country" : "Greece", "gps" : { "lat" : 37.970833, "lng" : 23.72611 } }
      { "_id" : ObjectId("6105770952e6d1ebb707264c"), "name" : "The Great Wall of China", "city" : "Huairou", "country" : "China", "gps" : { "lat" : 40.431908, "lng" : 116.570374 } }
      { "_id" : ObjectId("6105770952e6d1ebb707264d"), "name" : "The Statue of Liberty", "city" : "New York", "country" : "USA", "gps" : { "lat" : 40.689247, "lng" : -74.044502 } }

      The MongoDB shell prints out all seven documents one by one and in full. Notice that each of these objects has an _id property which you didn’t define. As mentioned previously, the _id fields serve as their respective documents’ primary keys, and were created automatically when you ran the insertOne and insertMany methods in the previous step.

      The default output from the MongoDB shell is compact, with each document’s fields and values printed in a single line. This can become difficult to read with objects containing multiple fields or nested documents, in particular.

      To make the find() method’s output more readable, you can use its pretty printing feature, like this:

      • db.monuments.find().pretty()

      This time, the MongoDB shell will print the documents on multiple lines, each with indentation:

      Output

      {
          "_id" : ObjectId("6105752352e6d1ebb7072647"),
          "name" : "The Pyramids of Giza",
          "city" : "Giza",
          "country" : "Egypt",
          "gps" : {
              "lat" : 29.97648,
              "lng" : 31.131302
          }
      }
      {
          "_id" : ObjectId("6105770952e6d1ebb7072648"),
          "name" : "The Valley of the Kings",
          "city" : "Luxor",
          "country" : "Egypt",
          "gps" : {
              "lat" : 25.746424,
              "lng" : 32.605309
          }
      }
      . . .

      Notice that in the two previous examples, the find() method was executed without any arguments. In both cases, it returned every object from the collection. You can apply filters to a query to narrow down the results.

      Recall from the previous examples that MongoDB automatically assigned The Valley of the Kings an object identifier with the value of ObjectId("6105770952e6d1ebb7072648"). The object identifier is not just the hexadecimal string inside the ObjectId(""), but the whole ObjectId object — a special datatype used in MongoDB to store object identifiers.
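
      One reason the ObjectId is more than an opaque string is that its 12 bytes carry structure: the first 4 bytes store the creation time as seconds since the Unix epoch. The following is a minimal Python sketch of reading that timestamp from the 24-character hexadecimal form. The function name objectid_created_at is an assumption made for illustration and is not part of MongoDB or any driver:

```python
from datetime import datetime, timezone


def objectid_created_at(hex_id: str) -> datetime:
    """Return the creation time encoded in a 24-character ObjectId hex string.

    Hypothetical helper for illustration; not part of MongoDB or its drivers.
    """
    if len(hex_id) != 24:
        raise ValueError("an ObjectId is 12 bytes, written as 24 hex characters")
    # The first 4 bytes (8 hex characters) are seconds since the Unix epoch.
    seconds = int(hex_id[:8], 16)
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


# The identifier assigned to The Valley of the Kings in this tutorial's output.
created = objectid_created_at("6105770952e6d1ebb7072648")
print(created.isoformat())
```

      The remaining bytes are a random value and a counter, which is why identifiers generated in the same batch differ only in their last characters.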

      The following find() method returns a single object by accepting a query filter document as an argument. Query filter documents follow the same structure as the documents you insert into a collection, consisting of fields and values, but they’re instead used to filter query results.

      The query filter document used in this example includes the _id field, with The Valley of the Kings’ object identifier as the value. To run this query on your own database, be sure to replace the highlighted object identifier with that of one of the documents stored in your own monuments collection:

      • db.monuments.find({"_id": ObjectId("6105770952e6d1ebb7072648")}).pretty()

      The query filter document in this example uses the equality condition, meaning the query will return any documents that have a field and value pair matching the one specified in the document. Essentially, this example tells the find() method to only return the documents whose _id value is equal to ObjectId("6105770952e6d1ebb7072648").

      After executing this method, MongoDB will return a single object matching the requested object identifier:

      Output

      {
          "_id" : ObjectId("6105770952e6d1ebb7072648"),
          "name" : "The Valley of the Kings",
          "city" : "Luxor",
          "country" : "Egypt",
          "gps" : {
              "lat" : 25.746424,
              "lng" : 32.605309
          }
      }

      You can use an equality condition on any other field from the document as well. To illustrate, try searching for monuments in France:

      • db.monuments.find({"country": "France"}).pretty()

      This method will return two monuments:

      Output

      {
          "_id" : ObjectId("6105770952e6d1ebb7072649"),
          "name" : "Arc de Triomphe",
          "city" : "Paris",
          "country" : "France",
          "gps" : {
              "lat" : 48.873756,
              "lng" : 2.294946
          }
      }
      {
          "_id" : ObjectId("6105770952e6d1ebb707264a"),
          "name" : "The Eiffel Tower",
          "city" : "Paris",
          "country" : "France",
          "gps" : {
              "lat" : 48.858093,
              "lng" : 2.294694
          }
      }

      Query filter documents are quite powerful and flexible, and they allow you to apply complex filters to collection documents.
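
      The equality matching used throughout this step can be modeled in a few lines of plain Python. This is only a simplified sketch that runs without a database: the matches and find functions are hypothetical stand-ins, and they ignore MongoDB-specific rules such as array and null matching:

```python
from typing import Optional


def matches(document: dict, query_filter: dict) -> bool:
    """Return True when every field in the filter equals the document's value."""
    # A sentinel distinguishes "field missing" from "field present but None".
    missing = object()
    return all(
        document.get(field, missing) == expected
        for field, expected in query_filter.items()
    )


def find(documents: list, query_filter: Optional[dict] = None) -> list:
    """Return the documents that match; an empty filter matches everything."""
    return [doc for doc in documents if matches(doc, query_filter or {})]


monuments = [
    {"name": "Arc de Triomphe", "city": "Paris", "country": "France"},
    {"name": "The Eiffel Tower", "city": "Paris", "country": "France"},
    {"name": "Acropolis", "city": "Athens", "country": "Greece"},
]

french = find(monuments, {"country": "France"})
print([doc["name"] for doc in french])
```

      Because every field in the filter must match, adding more fields to a filter document narrows the result set further.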

      Step 4 — Updating Documents

      It’s common for documents within a document-oriented database like MongoDB to change over time. Sometimes, their structures must evolve along with the changing requirements of an application, or the data itself might change. This step focuses on how to update existing documents by changing field values in individual documents as well as adding a new field to every document in a collection.

      Similar to the insertOne() and insertMany() methods, MongoDB provides methods that allow you to update either a single document or multiple documents at once. An important difference with these update methods is that, when creating new documents, you only need to pass the document data as method arguments. To update an existing document in the collection, you must also pass an argument that specifies which document you want to update.

      To allow users to do this, MongoDB uses the same query filter document mechanism in update methods as the one you used in the previous step to find and retrieve documents. Any query filter document that can be used to retrieve documents can also be used to specify documents to update.

      Try changing the name of Arc de Triomphe to the full name of Arc de Triomphe de l'Étoile. To do so, use the updateOne() method which updates a single document:

      • db.monuments.updateOne(
      • { "name": "Arc de Triomphe" },
      • {
      • $set: { "name": "Arc de Triomphe de l'Étoile" }
      • }
      • )

      The first argument of the updateOne method is the query filter document with a single equality condition, as covered in the previous step. In this example, { "name": "Arc de Triomphe" } finds documents with name key holding the value of Arc de Triomphe. Any valid query filter document can be used here.

      The second argument is the update document, specifying what changes should be applied during the update. The update document consists of update operators as keys, and parameters for each operator as values. In this example, the update operator used is $set. It is responsible for setting document fields to new values and requires a JSON object with new field values. Here, $set: { "name": "Arc de Triomphe de l'Étoile" } tells MongoDB to set the value of the name field to Arc de Triomphe de l'Étoile.

      The method will return a result telling you that one object was found by the query filter document, and also one object was successfully updated.

      Output

      { "acknowledged" : true, "matchedCount" : 1, "modifiedCount" : 1 }

      Note: If the document query filter is not precise enough to select a single document, updateOne() will update only the first document returned from multiple results.
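
      To make the effect of the $set operator concrete, here is a small in-memory Python sketch. It is an illustration under stated assumptions rather than MongoDB’s implementation: the apply_set function is hypothetical, and it handles only top-level fields:

```python
import copy


def apply_set(document: dict, update: dict) -> dict:
    """Return a copy of the document with the "$set" changes applied.

    Hypothetical helper; it models top-level fields only.
    """
    updated = copy.deepcopy(document)
    for field, value in update.get("$set", {}).items():
        # Existing fields are overwritten; missing fields are created.
        updated[field] = value
    return updated


arc = {"name": "Arc de Triomphe", "city": "Paris", "country": "France"}
renamed = apply_set(arc, {"$set": {"name": "Arc de Triomphe de l'Étoile"}})
print(renamed["name"])
```

      Fields that the update document does not mention, such as city and country here, are left untouched.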

      To check whether the update worked, try retrieving all the monuments related to France:

      • db.monuments.find({"country": "France"}).pretty()

      This time, the method returns Arc de Triomphe but with its full name, which was changed by the update operation:

      Output

      {
          "_id" : ObjectId("6105770952e6d1ebb7072649"),
          "name" : "Arc de Triomphe de l'Étoile",
          "city" : "Paris",
          "country" : "France",
          "gps" : {
              "lat" : 48.873756,
              "lng" : 2.294946
          }
      }
      . . .

      To modify more than one document, you can instead use the updateMany() method.

      As an example, say you notice there is no information about who created the entry and you’d like to credit the author who added each monument to the database. To do this, you’ll add a new editor field to each document in the monuments collection.

      The following example includes an empty query filter document. Because an empty query filter document matches every document in the collection, the updateMany() method will affect each of them. The update document adds a new editor field to each document, and assigns it a value of Sammy:

      • db.monuments.updateMany(
      • { },
      • {
      • $set: { "editor": "Sammy" }
      • }
      • )

      This method will return the following output:

      Output

      { "acknowledged" : true, "matchedCount" : 7, "modifiedCount" : 7 }

      This output informs you that seven documents were matched and seven were also modified.

      Confirm that the changes were applied:

      • db.monuments.find().pretty()

      Output

      {
          "_id" : ObjectId("6105752352e6d1ebb7072647"),
          "name" : "The Pyramids of Giza",
          "city" : "Giza",
          "country" : "Egypt",
          "gps" : {
              "lat" : 29.97648,
              "lng" : 31.131302
          },
          "editor" : "Sammy"
      }
      {
          "_id" : ObjectId("6105770952e6d1ebb7072648"),
          "name" : "The Valley of the Kings",
          "city" : "Luxor",
          "country" : "Egypt",
          "gps" : {
              "lat" : 25.746424,
              "lng" : 32.605309
          },
          "editor" : "Sammy"
      }
      . . .

      All the returned documents now have a new field called editor set to Sammy. By providing a non-existing field name to the $set update operator, the update operation will create missing fields in all matched documents and properly set the new value.
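
      The difference between updating the first match and updating every match, along with the matchedCount and modifiedCount values reported in the output, can be sketched in plain Python. The update_documents function and the visited field below are hypothetical and exist only for this illustration, which models top-level equality filters and $set alone:

```python
_MISSING = object()  # marks a field that is absent from a document


def update_documents(documents, query_filter, update, many=False):
    """Apply "$set" in place and return counts named like the shell output."""
    changes = update.get("$set", {})
    matched = modified = 0
    for doc in documents:
        if any(doc.get(k, _MISSING) != v for k, v in query_filter.items()):
            continue  # at least one filter field differs, so skip this document
        matched += 1
        # A document counts as modified only if some value actually changes.
        if any(doc.get(k, _MISSING) != v for k, v in changes.items()):
            doc.update(changes)
            modified += 1
        if not many:
            break  # updateOne-style behavior: stop after the first match
    return {"matchedCount": matched, "modifiedCount": modified}


monuments = [
    {"name": "Arc de Triomphe", "country": "France"},
    {"name": "The Eiffel Tower", "country": "France"},
    {"name": "Acropolis", "country": "Greece"},
]

first_only = update_documents(
    monuments, {"country": "France"}, {"$set": {"visited": True}}
)
everyone = update_documents(monuments, {}, {"$set": {"editor": "Sammy"}}, many=True)
print(first_only, everyone)
```

      Running the same $set a second time would still match every document but modify none of them, which is why the two counts in the output can differ.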

      Although you’ll likely use $set most often, many other update operators are available in MongoDB, allowing you to make complex alterations to your documents’ data and structure. You can learn more about these update operators in MongoDB’s official documentation on the subject.

      Step 5 — Deleting Documents

      There are times when data in the database becomes obsolete and needs to be deleted. As with Mongo’s update and insertion operations, there is a deleteOne() method, which removes only the first document matched by the query filter document, and deleteMany(), which deletes multiple objects at once.

      To practice using these methods, begin by trying to remove the Arc de Triomphe de l'Étoile monument you modified previously:

      • db.monuments.deleteOne(
      • { "name": "Arc de Triomphe de l'Étoile" }
      • )

      Notice that this method includes a query filter document like the previous update and retrieval examples. As before, you can use any valid query to specify what documents will be deleted.

      MongoDB will return the following result:

      Output

      { "acknowledged" : true, "deletedCount" : 1 }

      Here, the result tells you how many documents were deleted in the process.

      Check whether the document has indeed been removed from the collection by querying for monuments in France:

      • db.monuments.find({"country": "France"}).pretty()

      This time the method returns only a single monument, The Eiffel Tower, since you removed the Arc de Triomphe de l'Étoile:

      Output

      {
          "_id" : ObjectId("6105770952e6d1ebb707264a"),
          "name" : "The Eiffel Tower",
          "city" : "Paris",
          "country" : "France",
          "gps" : {
              "lat" : 48.858093,
              "lng" : 2.294694
          },
          "editor" : "Sammy"
      }

      To illustrate removing multiple documents at once, remove all the monument documents for which Sammy was the editor. This will empty the collection, as you’ve previously designated Sammy as the editor for every monument:

      • db.monuments.deleteMany(
      • { "editor": "Sammy" }
      • )

      This time, MongoDB lets you know that this method removed six documents:

      Output

      { "acknowledged" : true, "deletedCount" : 6 }

      You can verify that the monuments collection is now empty by counting the number of documents within it:

      • db.monuments.count()

      Output

      0

      Since you’ve just removed all documents from the collection, this command returns the expected output of 0.
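
      The two deletion behaviors from this step can be modeled the same way in plain Python. This is a hedged sketch, not MongoDB’s implementation: delete_documents is a hypothetical function that supports only top-level equality filters, and the sample list is a shortened version of the tutorial’s data:

```python
def delete_documents(documents, query_filter, many=False):
    """Remove matching documents in place and return a deletedCount summary."""
    deleted = 0
    kept = []
    for doc in documents:
        is_match = all(k in doc and doc[k] == v for k, v in query_filter.items())
        # Without many=True, only the first matching document is removed.
        if is_match and (many or deleted == 0):
            deleted += 1
        else:
            kept.append(doc)
    documents[:] = kept
    return {"deletedCount": deleted}


monuments = [
    {"name": "Arc de Triomphe de l'Étoile", "editor": "Sammy"},
    {"name": "The Eiffel Tower", "editor": "Sammy"},
    {"name": "Acropolis", "editor": "Sammy"},
]

one = delete_documents(monuments, {"name": "Arc de Triomphe de l'Étoile"})
rest = delete_documents(monuments, {"editor": "Sammy"}, many=True)
print(one, rest, len(monuments))
```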

      Conclusion

      By reading this article, you became familiar with the concept of CRUD operations — Create, Read, Update and Delete — the four essential components of data management. You can now insert new documents into a MongoDB database, modify existing ones, retrieve documents already present in a collection, and also delete documents as needed.

      Be aware, though, that this tutorial covered only one fundamental way of query filtering. MongoDB offers a robust query system that allows you to precisely select documents of interest against complex criteria. To learn more about creating more complex queries, we encourage you to check out the official MongoDB documentation on the subject.




      How To Perform CRUD Operations in MongoDB Using PyMongo on Ubuntu 20.04


      The author selected the Free and Open Source Fund to receive a donation as part of the Write for DOnations program.

      Introduction

      MongoDB is a general-purpose, document-oriented, NoSQL database program that uses JSON-like documents to store data. Unlike tabular relations used in relational databases, JSON-like documents allow for flexible and dynamic schemas while maintaining simplicity. In general, NoSQL databases have the ability to scale horizontally, making them suitable for big data and real-time applications.

      A database driver or connector is a program that connects an application to a database program. To perform CRUD operations in MongoDB using Python, a driver is required to establish the communication channel. PyMongo is the recommended driver for working with MongoDB from Python.

      In this guide, you will write a Python script that creates, retrieves, updates, and deletes data in a locally installed MongoDB server on Ubuntu 20.04. In the end, you will acquire relevant skills to understand the underlying concepts in how data moves across MongoDB and a Python application.

      Prerequisites

      Before you move forward with this guide, you will need the following:

      Step 1 — Setting Up PyMongo

      In this step, you will install PyMongo, the recommended driver for MongoDB from Python. As a collection of tools for working with MongoDB, PyMongo facilitates database requests using syntax and an interface native to Python.

      To enable PyMongo, open your Ubuntu terminal and install from the Python Package Index. It is recommended to install PyMongo within a virtual environment in order to isolate your Python project. Refer to this guide if you missed how to set up a virtual environment in the prerequisites.

      • pip3 install pymongo

      pip3 refers to the Python3 version of the popular pip package installer for Python. Note that within the Python 3 virtual environment you can use the command pip instead of pip3.

      Now, open the Python interpreter with the command below. The interpreter is a virtual machine that operates like a Unix shell, where you can execute Python code interactively.

      • python3

      You are in the interpreter when you get an output similar to what’s below:

      Output

      Python 3.8.5 (default, Jan 27 2021, 15:41:15) [GCC 9.3.0] on linux Type "help", "copyright", "credits" or "license" for more information.

      With a successful output, import pymongo in the Python interpreter:

      • import pymongo

      Using the import statement, you can access the pymongo module and its code in your terminal. The import statement will run without raising exceptions.

      On the next line, import getpass.

      • from getpass import getpass

      getpass is a module for managing password inputs. The module prompts you for a password without showing an input, and adds a security mechanism to prevent displaying passwords as plaintext.

      Next, create a connection to your MongoDB instance with MongoClient. Declare a variable client to hold the MongoClient instance, passing host, username, password, and authMechanism as arguments:

      • client = pymongo.MongoClient('localhost', username="username", password=getpass('Password: '), authMechanism='SCRAM-SHA-256')

      To connect to a MongoDB instance with authentication enabled, this example passes four arguments to MongoClient:

      • host - the hostname of the server on which MongoDB is installed. Since Mongo is local in this context, use localhost.
      • username and password - authorization credentials created after enabling authentication in MongoDB.
      • authMechanism - SCRAM-SHA-256 is the default authentication mechanism supported by a cluster configured for authentication with MongoDB 4.0 or later.

      Once you’ve established the client connection, you can now interact with your MongoDB instance.
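
      MongoClient also accepts a single connection string in place of separate keyword arguments. The following is a minimal sketch of assembling such a string with the standard library, percent-encoding the credentials so that characters like @ or : don’t break the URI. The build_mongo_uri helper, the sammy username, and the sample password are assumptions made for illustration, and 27017 is MongoDB’s default port:

```python
from urllib.parse import quote_plus


def build_mongo_uri(username, password, host="localhost", port=27017,
                    auth_mechanism="SCRAM-SHA-256"):
    """Build a MongoDB connection string with percent-encoded credentials.

    Hypothetical helper for illustration; not part of PyMongo.
    """
    user = quote_plus(username)
    secret = quote_plus(password)
    return f"mongodb://{user}:{secret}@{host}:{port}/?authMechanism={auth_mechanism}"


# Made-up credentials; printed here only because this is a demonstration.
uri = build_mongo_uri("sammy", "p@ss:word/1")
print(uri)
# With a server available, the string could be passed as pymongo.MongoClient(uri).
```

      Prompting for the password with getpass, as in the example above, remains preferable to writing it into a script.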

      Step 2 — Testing Databases and Collections

      In this step, you will get familiar with NoSQL concepts such as collections and documents as applied to MongoDB.

      MongoDB supports managing multiple independent databases within a MongoClient instance. You can access or create a database using attribute style on a MongoClient instance. Declare a variable db and assign the new database as an attribute of client:

      • db = client.workplace

      In this context, the workplace database keeps track of employee records you will add such as the employee’s name and role.

      Next, create a collection. Like tables in relational databases, collections store a group of documents in MongoDB. In your Python interpreter, create an employees collection as an attribute of db and assign it to a variable of the same name:

      • employees = db.employees

      Note: In MongoDB, databases and collections are created lazily. This means that neither the workplace database nor the employees collection actually exists on the server until the first document is inserted.

      Now that you’ve reviewed collections, let’s look at how MongoDB represents documents, the basic structure for representing data.

      Step 3 — Performing CRUD Operations

      In this step, you will perform CRUD operations to manipulate data in MongoDB. Create, retrieve, update, and delete (CRUD) are the four basic operations that applications perform on persistent storage.

      To represent data in Python as JSON-like documents, dictionaries are used. Create a sample employee record with name and role attributes:

      • employee = {
      • "name": "Sammy",
      • "role": "Developer"
      • }

      As you can see, Python dictionaries are very similar in syntax to JSON documents. PyMongo encodes Python dictionaries as BSON, the binary JSON-like format that MongoDB uses to store and transfer documents.
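
      To see how closely a Python dictionary mirrors a JSON document, you can serialize the same employee record with the standard library’s json module. This sketch is only a comparison of the two notations and does not involve PyMongo or a database:

```python
import json

employee = {
    "name": "Sammy",
    "role": "Developer",
}

# Serialize the dictionary to JSON text, then parse it back into a dictionary.
as_json = json.dumps(employee)
round_trip = json.loads(as_json)

print(as_json)
print(round_trip == employee)
```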

      At this point, insert the employee record into the employees collection:

      • employees.insert_one(employee)

      Calling the insert_one() method on the employees collection with the employee record created earlier inserts that record. A successful insertion returns a result object like the one below:

      Output

      <pymongo.results.InsertOneResult object at 0x7f8c5e3ed1c0>

      Now, verify that you’ve successfully inserted the employee record and, with it, created the collection. Make a query to find the employee you just created:

      • employees.find_one({"name": "Sammy"})

      Calling the find_one() method on the employees collection with a name query returns a single matching document. This method is useful when you expect only one matching document, or when you are only interested in the first match.

      The output should look similar to this:

      Output

      {'_id': ObjectId('606ae5b2358ddf640da46894'), 'name': 'Sammy', 'role': 'Developer'}

      Note: When a document is inserted, a unique key _id is automatically added to the document if it does not already contain an _id key.

      If the need arises to modify existing documents, use the update_one() method. The update_one() method requires two arguments, query and update:

      • query - {"name": "Sammy"} - PyMongo will use this query parameter to find documents with elements that match.
      • update - { "$set": {"role": "Technical Writer"} } - The update parameter implements the $set operator, which replaces the value of a field with the specified value.

      Call the update_one() method on the employees collection:

      • employees.update_one({"name": "Sammy"}, { "$set": {"role": "Technical Writer"} })

      A successful update will return an output similar to this:

      Output

      <pymongo.results.UpdateResult object at 0x7f8c5e3eb940>

      To delete a single document, employ the delete_one() method. delete_one() requires a query parameter which specifies the document to delete. Execute the delete_one() method as an attribute of the employees collection with the name Sammy as a query parameter.

      • employees.delete_one({"name": "Sammy"})

      This will delete the only entry you have in your employees collection.

      Output

      <pymongo.results.DeleteResult object at 0x7f8c5e3c8280>

      Using the find_one() method again confirms that you’ve successfully deleted Sammy’s employee record: the method returns None, so nothing prints to the console.

      • employees.find_one({"name": "Sammy"})

      insert_one(), find_one(), update_one(), and delete_one() are great ways of getting started with performing CRUD operations in MongoDB with PyMongo.
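
      The result objects printed in this step show only a memory address, but PyMongo’s result classes expose attributes such as acknowledged, inserted_id, matched_count, modified_count, and deleted_count. As a rough sketch, a helper can gather whichever of these a result provides. The summarize_result function is hypothetical, the sketch assumes acknowledged writes, and the SimpleNamespace object is a stand-in so that the example runs without a database:

```python
from types import SimpleNamespace


def summarize_result(result) -> dict:
    """Collect whichever known result attributes the object exposes."""
    fields = ("acknowledged", "inserted_id", "matched_count",
              "modified_count", "deleted_count")
    return {name: getattr(result, name) for name in fields if hasattr(result, name)}


# Stand-in for a pymongo UpdateResult, used only so the sketch runs offline.
fake_update = SimpleNamespace(acknowledged=True, matched_count=1, modified_count=1)
print(summarize_result(fake_update))
```

      In the interpreter session from this guide, the same helper could be called on the value returned by employees.update_one() or employees.delete_one().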

      Conclusion

      In this guide, you have explored how to set up and configure PyMongo, the database driver, to connect Python code to MongoDB, as well as how to create, retrieve, update, and delete documents. Although this guide focuses on introductory concepts, PyMongo offers more powerful and flexible ways of working with MongoDB. For instance, you can perform bulk inserts, query for more than one document, add indexes to collections, and much more.

      To learn more about MongoDB management, see How To Back Up, Restore, and Migrate a MongoDB Database on Ubuntu 20.04 and How To Import and Export a MongoDB Database on Ubuntu 20.04.


