
Monday, July 3, 2017

Python MongoDB

What is Mongo?

Mongo is an open-source, non-relational (document-oriented) database written in C++. It is a good fit for backend services that need to store information quickly and with little processing, which makes it a common choice for mobile and social network backends.


PyMongo is the Python driver for working with Mongo databases. It is easy to learn and straightforward to use. Let’s start with the basics; the first thing you need is a database and a collection.
All of the examples use the default “test” database and a collection called “People”. If you do not know how to create a collection, you can check the Mongo documentation.

Connecting to the db client

After that, you need to install PyMongo on your system. You can do it simply by using pip or adding it to your requirements.txt. In this case, as it is an example, it will be installed via pip with the following command:
python -m pip install pymongo
Then, in your project, you have to create a client to connect to the database. To create it, you must know the IP address and port of the Mongo server. By default mongod listens on port 27017 and, in this case, it runs on the same computer as the Python project, so the example uses localhost as the address. Here is a really simple example script:
from pymongo import MongoClient

HOST = "localhost"
PORT = "27017"

# Connect to the local mongod and select the "test" database
db = MongoClient("mongodb://" + HOST + ":" + PORT).test
In this script, a global “db” variable is defined as the test database on the local computer. Also, if your database requires a user and password, you can pass them in the URI, right after the scheme and before the host: mongodb://user:password@host:port
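In a Mongo URI the credentials sit between the scheme and the host, and reserved characters in them (such as "@", ":" or "/") have to be percent-escaped. Here is a minimal sketch of a helper that builds such a URI; the function name build_mongo_uri and the sample user and password are made up for illustration:

```python
from urllib.parse import quote_plus


def build_mongo_uri(host, port, user=None, password=None):
    """Return a mongodb:// URI, adding escaped credentials when given."""
    credentials = ""
    if user is not None and password is not None:
        # Reserved characters such as "@", ":" and "/" must be percent-escaped
        credentials = quote_plus(user) + ":" + quote_plus(password) + "@"
    return "mongodb://" + credentials + host + ":" + str(port)


print(build_mongo_uri("localhost", 27017))
print(build_mongo_uri("localhost", 27017, "app_user", "p@ss:word"))
```

The resulting string can be passed to MongoClient in place of the hand-built URI.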

Populating the database

After the client is defined, you need some data in the database to try out the API. You can insert a document with the insert_one method of a collection. For example, to add a new person to the People collection you can run the following command:
db.People.insert_one({
    "name": {
        "first_name": "Alice",
        "last_name": "Smith"
    },
    "address": {
        "street": "5th Avenue",
        "building": "269",
        "coord": {"type": "Point", "coordinates": [-56.137, -34.901]}
    }
})
If you want to insert multiple documents at once, a more efficient way is to use bulk operations. You can initialize bulk objects using the collection methods initialize_unordered_bulk_op or initialize_ordered_bulk_op. On these objects, you can queue insert, update or delete operations without changing the database, and then execute them to apply all the changes at once. Here is an example:
bulk = db.People.initialize_unordered_bulk_op()
bulk.insert({
    # Document to be inserted
})
bulk.execute()
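Note that the initialize_*_bulk_op builders were deprecated in later PyMongo 3.x releases and are no longer available in PyMongo 4. If you are on a recent version, a simple way to insert many documents is the insert_many collection method. The sketch below assumes you pass in a collection such as db.People; the helper name insert_people is only illustrative:

```python
def insert_people(collection, people):
    """Insert several documents with a single call and return their ids."""
    # ordered=False lets the server keep going past a failing document,
    # similar in spirit to an unordered bulk operation
    result = collection.insert_many(people, ordered=False)
    return result.inserted_ids
```

You would call it as insert_people(db.People, [person_a, person_b]), where each element is a document like the one shown for insert_one.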


Creating queries

In PyMongo, you can define queries the same way as you do in Mongo. You can use the find method from a collection to create simple queries or the aggregate method for more complex ones. Here you have a query that returns the documents whose last names are “Smith”:
cursor = db.People.find({"name.last_name": "Smith"})
Then you can iterate over the returned documents by using:
for document in cursor:
    # Manipulation of the documents
    print(document)
Each document is returned as a Python dictionary, so you can read its fields using get or the bracket operator.
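Since the documents behave like regular dictionaries, nested fields are read one level at a time. Here is a small sketch using a hand-written sample document shaped like the People examples (the helper full_name is hypothetical):

```python
def full_name(document):
    """Join first and last name, tolerating missing fields."""
    name = document.get("name", {})  # get returns the default when the key is absent
    first = name.get("first_name", "")
    last = name.get("last_name", "")
    return (first + " " + last).strip()


sample = {"name": {"first_name": "Alice", "last_name": "Smith"}}
print(sample["name"]["last_name"])  # brackets raise KeyError if the key is missing
print(full_name(sample))
```

Using get is handy when some documents in the collection may lack a field, while brackets make a missing field fail loudly.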

Updating and deleting data

In order to update, delete or replace data you should use one of the five collection methods for database modification. Those methods are the following:
  1. update_one
  2. update_many
  3. replace_one
  4. delete_one
  5. delete_many
All of them receive as their first parameter a query for the documents that will be modified. This query can use the same operators as the find method. As the method names suggest, the ones ending in _one modify the first document found for that query, and the ones ending in _many change all of them. There is no replace_many method; replace_one is the only replace method.
Then, the first three methods receive as their second parameter the update or replacement document. The update methods require update operators such as $set, while replace_one can receive any document.
Here you can see an example for the local database:
result = db.People.update_many({"name.last_name": "Smith"},
                               {
                                   "$set": {
                                       "name.last_name": "Johnson"
                                   }
                               })
As you can see here, every person whose last name is Smith will be updated to have their last name be Johnson instead.
Also, the update and replace methods can receive an optional parameter named upsert. When this value is True and no document matches the query, a new document is inserted. By default this value is False.
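The modification methods return a result object that tells you what happened. The following sketch wraps an update and an upsert in two helpers; the names rename_last_name and save_person are made up for this example, and both expect a collection such as db.People:

```python
def rename_last_name(collection, old_last_name, new_last_name):
    """Change a last name on every matching document; return how many changed."""
    result = collection.update_many(
        {"name.last_name": old_last_name},
        {"$set": {"name.last_name": new_last_name}},
    )
    return result.modified_count


def save_person(collection, person):
    """Replace the person with the same name, or insert it if none exists."""
    query = {
        "name.first_name": person["name"]["first_name"],
        "name.last_name": person["name"]["last_name"],
    }
    result = collection.replace_one(query, person, upsert=True)
    # upserted_id is None when an existing document was replaced
    return result.upserted_id
```

For example, rename_last_name(db.People, "Smith", "Johnson") returns the number of documents that were actually modified.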

Creating and using indexes

Indexes are used to improve the speed of queries, or to enable special kinds of queries, like geospatial queries. Creating indexes is really simple. You have to use the create_index collection method, which receives a list of (attribute, index type) pairs describing the index you want to create.
In each pair, the first value is the name of the attribute and the second is the kind of index that will be created. The index type can be any of the ones listed here:
  1. ASCENDING
  2. DESCENDING
  3. GEO2D (“2d” - 2-dimensional geospatial index)
  4. GEOSPHERE (“2dsphere” - spherical geospatial index)
  5. HASHED
  6. TEXT
Here you have an example that creates a spherical index and then makes a geospatial query:
from pymongo import GEOSPHERE

db.People.create_index([
    ("address.coord", GEOSPHERE)
])

longitude = -56.134
latitude = -34.9

distance = 304.8  # 1000 ft in meters

cursor = db.People.aggregate([{
    "$geoNear": {
        "near": {"type": "Point", "coordinates": [longitude, latitude]},
        "spherical": True,
        "distanceField": "distance",
        "maxDistance": distance
    }
}])
The first command creates the geospatial index on the address.coord attribute; the aggregation then finds the people that are within 1000 ft of the given position.
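If you need this kind of query in several places, the pipeline can be built by a small function. This is only a sketch: the name near_pipeline and the feet-based parameter are assumptions made for the example, not part of PyMongo.

```python
FEET_TO_METERS = 0.3048


def near_pipeline(longitude, latitude, max_feet):
    """Build a $geoNear pipeline limited to max_feet around a point."""
    return [{
        "$geoNear": {
            # GeoJSON points list longitude first, then latitude
            "near": {"type": "Point", "coordinates": [longitude, latitude]},
            "spherical": True,
            "distanceField": "distance",
            # With a GeoJSON point, maxDistance is expressed in meters
            "maxDistance": max_feet * FEET_TO_METERS,
        }
    }]


pipeline = near_pipeline(-56.134, -34.9, 1000)
print(pipeline[0]["$geoNear"]["maxDistance"])
```

Passing the result to db.People.aggregate(pipeline) returns the nearby people, each with the computed value stored in its "distance" field.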
Congratulations! Now you know how to manipulate Mongo databases with Python. If you want to see a sample project you can check ours here.