ORMs and Locking (Part 2)

by coatta 2/3/2009 9:15:00 PM

Back in January I wrote about some of my initial experiences with ORMs and deadlocks. By explicitly marking each transaction as either read or write, we were able to eliminate a large number of the deadlocks that cropped up in our initial testing. In fact, that tactic was good enough to take the system live. Even with people accessing the system on a regular basis, we did not see any deadlocks. This lulled me into a false sense of security that I wouldn't have to think about deadlocks any more. In hindsight I can't imagine why I thought this -- self-delusion rears its ugly head again.

Anyhow, after a while we got around to stress testing the system, pushing one or two orders of magnitude more data through it than was typical. Once again the deadlocks started cropping up. To make this part of the story short, it finally occurred to me that all the usual sorts of deadlocks I had grown to love when programming directly with locks were just as possible in an ORM-based system with the locking done by the database: locks acquired in the wrong order, read locks acquired when write locks should have been, and so on.
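
For example, the classic lock-ordering deadlock maps straight onto ORM code. The sketch below is mine rather than code from our system: it reuses the UnitOfWork and PersonFactory helpers described in the original post below, the ids 1 and 2 are hypothetical, and the two methods stand in for two concurrent threads that touch the same pair of rows in opposite orders.

    // Illustration only: Person, PersonFactory, and UnitOfWork are the classes
    // from the original post below; the ids 1 and 2 are hypothetical.
    void UpdateOneThenTwo()                            // runs on thread A
    {
        using (new UnitOfWork())                       // a writable unit of work
        {
            Person first = PersonFactory.FindById(1);  // touches person 1 first...
            first.Salary += 10;
            Person second = PersonFactory.FindById(2); // ...then person 2
            second.Salary += 10;
        }
    }

    void UpdateTwoThenOne()                            // runs on thread B: same rows, opposite order
    {
        using (new UnitOfWork())
        {
            Person second = PersonFactory.FindById(2);
            second.Salary += 10;
            Person first = PersonFactory.FindById(1);
            first.Salary += 10;
        }
    }
    // Run concurrently, each transaction ends up holding a lock on one row while
    // waiting for a lock on the other (at FindById time with update locks, or at
    // flush time with exclusive locks). SQL Server detects the cycle and aborts
    // one of the two transactions.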

But even though the issues are fundamentally the same, the experience is quite different. There are a number of reasons for this:

  1. The database has much nicer tools for dealing with deadlocks. In the first place, it doesn't simply hang: it detects that a deadlock exists and aborts one of the transactions. Second, it has very good tools for identifying the exact cause of the deadlock (at least SQL Server does; I assume other DBs do too) -- that is, it will identify the resources in conflict, the cycle of resources held and requested that created the deadlock, and the statements associated with the resource requests.
  2. Working against this somewhat, the connection between your code and the underlying SQL is indirect. The ORM is responsible for generating the SQL statements that the DB executes, and it's not always obvious which SQL statements are associated with particular operations in the code. Not only does one have to have a good grasp of the underlying mapping of objects to the DB, it's also necessary to understand how the ORM moves data back and forth between the objects in memory and the DB.
  3. An extension of this last point is that the ORM can make it hard to resolve some locking issues. For example, the particular version of NHibernate that we are using always takes read locks when it is pulling in collection data. This is done implicitly as part of loading an object, and there isn't a way to force it to use an update lock, which is necessary if elements of the collection are written to later in the transaction (see the sketch after this list for the entity-level lock modes NHibernate does expose). Similarly, if you're working directly with locks, a fairly common technique for decreasing contention is to release locks before the end of a transaction when that doesn't compromise serializability -- a technique that may not be available to you with your ORM.
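
On that last point, NHibernate does at least expose explicit lock modes for individual entities, even if (in the version we were using) they can't be applied to the implicit collection load. The snippet below is a minimal sketch rather than code from our application: it assumes an open ISession named session and the mapped Person entity from the post below, and on SQL Server LockMode.Upgrade generally translates into an UPDLOCK hint.

    // Sketch, not application code: assumes an open ISession named "session"
    // and the mapped Person entity from the post below.

    // Ask for an update lock when loading, instead of the default read lock.
    Person p = session.Get<Person>(personId, LockMode.Upgrade);

    // Or escalate the lock on an entity that is already loaded in the session.
    session.Lock(p, LockMode.Upgrade);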

Overall, I find the experience with the ORM to be positive on balance. It's certainly made it easier for us to program without thinking explicitly about locking most of the time. But as with any layer of abstraction, it tends to complicate things when things get complicated :-)

ORMs and Locking

by coatta 1/11/2009 11:13:00 PM

An interesting deadlock situation came up in our application recently. We use NHibernate for our ORM and our database is SQL Server. Within the application we've constructed some utility classes for NHibernate so that there is a standard pattern for finding, accessing, and modifying persistent objects. A key part of this is our UnitOfWork class, which wraps up an NHibernate session and transaction. All access to persistent objects occurs in the context of a UnitOfWork. When a UnitOfWork is completed, it either commits its changes back to the DB or discards them.

Initially, the typical pattern for carrying out some changes to a persistent object (or set of objects) looked like:

    using (new UnitOfWork())
    {
       // FindById() loads the Person through the NHibernate session held by the
       // current UnitOfWork; by default this takes a shared read lock.
       Person p = PersonFactory.FindById(personId);
       p.Salary += 10;   // flushed and committed when the UnitOfWork is disposed
    }

By default, the UnitOfWork commits at the end, so that the change to the Salary property would end up being saved to the DB by this code. Much to our chagrin, we found ourselves getting quite a few deadlocks when using this code.

The trouble with this code arises because, by default, NHibernate acquires a read lock when getting the data from the DB as part of the FindById() call. If multiple threads execute this code concurrently, they can each acquire a read lock on the same object, since read locks are allowed to be shared. When the transactions commit, each read lock needs to be upgraded to a write lock (because we are changing the object), but neither transaction can upgrade while the other still holds its shared lock. Each ends up waiting on the other -- a deadlock.
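
Concretely, two threads running that block for the same personId are enough to trigger it. The following is only a sketch of the scenario -- the threading scaffolding is mine, not our application code, and personId is whatever id both threads happen to share:

    // Two threads run the original block against the same row.
    ThreadStart work = () =>
    {
        using (new UnitOfWork())
        {
            Person p = PersonFactory.FindById(personId);  // each thread takes a shared read lock
            p.Salary += 10;                               // each needs a write lock at commit
        }
    };

    var t1 = new Thread(work);
    var t2 = new Thread(work);
    t1.Start(); t2.Start();
    t1.Join();  t2.Join();
    // Neither transaction can upgrade its read lock while the other still holds
    // its shared lock; SQL Server picks one as the deadlock victim and aborts it.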

To simplify life for the programmer, we decided to add a parameter to the UnitOfWork which would indicate whether the code was read-only or not. That would eliminate the need to determine on a case-by-case basis what locks should be used for each object. Calls such as FindById() would observe the value of the read-only flag of the current session and modify their locking behaviour accordingly. So, for a UnitOfWork which was not read-only, FindById() would acquire an update lock -- which SQL Server uses to mark a lock that may need to be upgraded to a write lock later in the transaction.
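
A rough sketch of the shape of this idea (not our actual code -- the IsReadOnly property, the static Current accessor, and SessionFactoryHolder are hypothetical names used only for illustration, and the abort path and error handling are omitted) looks something like this:

    using System;
    using NHibernate;

    // Sketch only -- not the production class described above.
    public class UnitOfWork : IDisposable
    {
        [ThreadStatic] private static UnitOfWork current;

        private readonly ISession session;
        private readonly ITransaction transaction;

        public UnitOfWork() : this(false) { }

        public UnitOfWork(bool readOnly)
        {
            IsReadOnly = readOnly;
            session = SessionFactoryHolder.Factory.OpenSession();  // hypothetical ISessionFactory holder
            transaction = session.BeginTransaction();
            current = this;
        }

        public bool IsReadOnly { get; private set; }
        public ISession Session { get { return session; } }
        public static UnitOfWork Current { get { return current; } }

        public void Dispose()
        {
            transaction.Commit();   // commit by default, as described above
            session.Dispose();
            current = null;
        }
    }

    public static class PersonFactory
    {
        public static Person FindById(int id)
        {
            UnitOfWork uow = UnitOfWork.Current;
            // Read-only work keeps the ordinary shared read behaviour; anything that
            // might write asks for an update lock up front, so no upgrade is needed
            // at commit time.
            LockMode mode = uow.IsReadOnly ? LockMode.Read : LockMode.Upgrade;
            return uow.Session.Get<Person>(id, mode);
        }
    }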

Overall this was a very effective strategy. It was a relatively simple programming model, and it eliminated virtually all of the deadlocks that we were seeing in the application. In part 2 of this posting, I'll talk about how we eventually needed to go beyond this simple model due to more complex interactions between concurrent threads.


