We shall start this blog post by understanding what causes a database concurrency issue, the difference between how Transactional Databases and NoSQL databases handle such issues, and an example on how to implement optimistic concurrency in an Azure Cosmos DB when submitting documents from code.
Database concurrency issues occur when multiple users/processes try to access and write a record in the database, resulting in multiple versions of the data having conflicting values. Typically, to update a record the process would involve: reading the record, updating the record in memory, and writing it back to the database. If, for any reason, multiple processes have read the same record, modified it in memory and try to write it back, the record that was modified might not be the latest and thus cause a versioning issue with the data (Figure 1).
Traditional relational databases (such as Microsoft SQL) handle OCC with the use of transactions. One can implement a transaction either through the use of applicable programming languages such as Python, C#, Java, etc., or by implementing the code as a Transactional Programming Language (T-SQL), which is executed by the database through the use of a stored-procedures (Figure 2) and/or triggers (Figure 3).
In contrast, NoSQL databases implement their own version of Triggers and Stored Procedures. In the case of Azure Cosmos DB, Stored Procedures and Triggers are written using the Cosmos DB JavaScript Server-side SDK.
On the other hand, to handle OCC from code, Cosmos DB implements a series of options that upon saving of the document check the _etag value in the document, relaying back an error if the current _etag and the _etag found in the Database are not the same. This mechanism allows the developer to then handle any concurrency issues accordingly.
The following is an example of how to edit a document in a Cosmos DB. Assume we are storing a simple document that represents an image Thumbnail.
Data
Firstly, let’s look at the data in the database. The JSON data used in this instance shall be a thumbnail submission as indicated in Figure 5. Once the document is saved in Cosmos DB, the platform shall add additional properties to our JSON, as indicated in Figure 6
One should notice that in the document saved in Cosmos DB, we now have the _etag property, which Cosmos DB will use to track any changes done to the data.
Code
Secondly, we’ll look at the code to insert/update our document from an Azure Function. In this sample, we are initializing a Cosmos Client and reading the body of the HTTP Request. We then get the document from the database and update the last updated property to the current date and time. Finally, we upload or insert the record accordingly.
The above is fine if we are sure that the API with the same data can’t be triggered more than once at the same time. However, in a distributed environment this can’t be ensured. To implement OCC, we need to add: ItemRequestOptions
to our UpsertItemAsync
method.
If there is a case where the data is not consistent, the CosmosException will have an HTTP status code of 412 PreconditionFailed. Once this is given, it is up to the developer to handle that record. In some instances one would retry the update, but in other instances one could just throw back the error to the invoker and leave the invoker to handle it.
Conclusion
In reality, Concurrency Control may not be the first thing to come to mind when updating the same record from multiple locations. However, it becomes an issue if not handled. Whilst NoSql databases are not traditionally used for transactional queries, if needs be one can see that implementing OCC does not have too much of an effect on the architecture of the solution, as all the mechanisms are already in place within the Cosmos DB API.
Subscribe to our RSS feed