Showing posts with label Transaction. Show all posts
Showing posts with label Transaction. Show all posts

Thursday, July 6, 2017

Django Isolated Transactions

Introduction

At some point in your life, you're going to need to use serializable transactions, it happens to everyone. Keep reading to learn how to do it using Django's awesome ORM.




Before starting let's make sure we all understand the following concepts:

Transactions - Database Engine

(*1) A transaction is a sequence of operations performed as a single logical unit of work. A logical unit of work must exhibit four properties, called the atomicity, consistency, isolation, and durability (ACID) properties, to qualify as a transaction.

Atomicity

A transaction must be an atomic unit of work; either all of its data modifications are performed, or none of them is performed.

Consistency

When completed, a transaction must leave all data in a consistent state. In a relational database, all rules must be applied to the transaction's modifications to maintain all data integrity. All internal data structures, such as B-tree indexes or doubly-linked lists, must be correct at the end of the transaction.

Isolation

Modifications made by concurrent transactions must be isolated from the modifications made by any other concurrent transactions. A transaction either recognizes data in the state it was in before another concurrent transaction modified it, or it recognizes the data after the second transaction has completed, but it does not recognize an intermediate state. This is referred to as serializability because it results in the ability to reload the starting data and replay a series of transactions to end up with the data in the same state it was in after the original transactions were performed.

Durability

After a transaction has completed, its effects are permanently in place in the system. The modifications persist even in the event of a system failure.

TL;DR: We intend to guarantee four things; that all the changes in the transactions are made or none of them; that all the database structures remain consistent with its definition; that we define how the database will handle concurrent transactions over the same data; once the changes are committed, they'll stay there despite a system failure.

Let's now define a common scenario where we could use this.

Scenario

Imagine that you work at Innuy and you were told by our awesome CTO that we need to make some changes to the website. Apparently, the popularity of Innuy is growing so much that we'll start selling shirts with the company logo. For that, we'll need to add a view on our website that allows users to purchase shirts and the necessary backend for it to work.

As we are focusing on database transactions, I'll only show backend code, I'll let you imagine the front end.

Analysis

We'll be selling a finite item, so we know we have a limited stock. Please think about how the flow (simplified) of the app would be:

1 - User accesses the store
2 - User selects shirt and size
3 - User confirms the purchase
4 - User adds payment information
5 - Order is created - In this step we'll reduce the available stock of the selected shirt in the selected size by 1, as we know it's it has already been sold.

As the attentive reader might have realized we just created ourselves a problem there. What would happen if two different clients make a purchase at exactly the same time when the stock of the shirt is only one?

The answer is, if you weren't careful enough with the implementation, you probably will be sending an apologize email and funds return.

If you didn't realize the issue please check the following diagram: You will quickly notice that at the end of this flow, the stock for the shirt is minus one. This happens if we don't have any control over different database connections reading or modifying the same data.



As we are awesome developers we'll build a solution that avoids this behavior.

Our tools

Django provides, out of the box, a few tools to solve this situation:

transaction.atomic() (*2)

Grants us the ability to write code blocks where database operations are atomic. This means that Django guarantees us that one of two things will happen: or all operations are executed, or none of them are. One important thing to notice about transaction.atomic is that the auto-commit is disabled because the commit will be called after finishing the whole block.

select_for_update() (*3)

Returns a queryset that will lock rows until the end of the transaction, generating a SELECT ... FOR UPDATE SQL statement on supported databases. Must be called from within a transaction.atomic block. Check out the docs for more information about the extra options this method supports.

An important point to consider when using transactions in Django when developing the test cases for our app instead of extending the Django tests from TestCase, you should extend from TransactionTestCase. (*4)

MySQL

For this example, we'll be using MySQL so let's look into how transactions are handled.

First of all, we must select the transaction isolation level. As Django does not provide an interface for settings the isolation level for a specific transaction, we'll have to set it up on the database (if this is not an option for you, you can still achieve this using raw SQL, but that's not the goal of this post).

MySQL provides a few options to set the database isolation level (how the database will behave when a SELECT ... FOR UPDATE statement is executed). In this case, we'll use the isolation level SERIALIZABLE which will block read, write operations after a transaction has called the SELECT FOR UPDATE statement. (*5)(*6)

Implementation

This representation of the models will make easy for us to represent the defined scenario:


from django.db import models

class Item(models.Model):
    """
    Ultra simple item model just intended for demo purposes
    """
    CATEGORIES = (
        ('books', 'Books'),
        ('apparel', 'Apparel'),
        ('electronics', 'Electronics')
    )
    image = models.ImageField()
    description = models.TextField()
    stock = models.IntegerField(default=0)
    price = models.PositiveIntegerField(default=0)
    category = models.CharField(max_length=12, default='apparel', choices=CATEGORIES)


class ItemOrder(models.Model):
    """
    Ultra simple order model just intended for demo purposes
    """
    ORDER_STATUS = (
        ('under_review', 'Under review'),
        ('packaged', 'Packaged'),
        ('shipped', 'Shipped'),
        ('delivered', 'Delivered')
    )
    item = models.OneToOneField(Item)
    amount = models.PositiveIntegerField(default=1)
    creation_date = models.DateTimeField(auto_created=True)
    status = models.CharField(max_length=12, default='under_review', choices=ORDER_STATUS)

We have an Item model, representing any product we sell. The stock indicates how many of the item we have physically, while the other fields are only for orientation purpose in this example. The ItemOrder will represent a confirmed purchase on an Item. As you can see it contains a state field that will allow Innuy's staff to know what to do with the order; a reference to the sold item, and other common fields (we don't really need for the example).

Now that our models are ready, let's create a method for creating a new purchase order. We'll add it as a static method to the ItemOrder model.


from django.db import transaction

@staticmethod
def create_order(item_id, how_many):
    """
    Creates an order
    :param item_id: The id of the item to purchase
    :param how_many: The amount of items
    :return: The created order or None
    """
    order = None
    try:
        # start atomic block
        with transaction.atomic():
            # select for update will lock de item so we can work with it 
            item = Item.objects.select_for_update().get(id=item_id)
            if item.stock - how_many >= 0:
                order = ItemOrder.objects.create(item=item, amount=how_many)
                item.stock -= how_many
                item.save()
                logger.info('Order created for item %s, amount %s.' % (item_id, how_many))
    except Item.DoesNotExist:
        logger.exception('Item with id %s does not exists' % item_id)
    except Exception as e:
        logger.exception('Unexpected error: e' % e)
    return order

As you see the create_order method does the trick. It locks the Item until we make every necessary operation. When using transaction.atomic it's important to remember that the try/except block is used outside the atomic block, if an exception is triggered inside the atomic block, the transaction will be rolled back, and the exception will be raised again to be caught by our outer try/except. 

Now, let's create a view to serve this service:


import json
from django.http import JsonResponse
from .models import ItemOrder # provided the models file is in the same package.

def create_order(request):
    """
    Creates an order after the payment has been processed
    :param request: The django request object
    """
    item_id = request.body.get('item_id')
    how_many = request.body.get('how_many')
    order = ItemOrder.create_order(item_id, how_many)
    result = {'order_status': 'Under review'} if order else {'order_status': 'Failed'}
    result = json.dumps(result)
    return JsonResponse(result)
Now we are all set up! We have developed the necessary models and logic to provide an order creation service that guarantees us the advantages of ACID transactions.

You have acquired a new skill, use it wisely :)

References


*1 - https://technet.microsoft.com/en-us/library/ms190612(v=sql.105).aspx

*2 - https://docs.djangoproject.com/en/1.11/topics/db/transactions/#s-controlling-transactions-explicitly

*3 - https://docs.djangoproject.com/en/1.11/ref/models/querysets/#s-select-for-update

*4 - https://docs.djangoproject.com/en/1.11/topics/db/transactions/#s-use-in-tests

*5 - Isolation level MySQL: https://dev.mysql.com/doc/refman/5.7/en/innodb-transaction-isolation-levels.html

*6 - Set isolation level MySQL:https://dev.mysql.com/doc/refman/5.7/en/innodb-transaction-isolation-levels.html