The Imminent Release of SQL Server 2016

When I first started working with SQL Server, version 7.0 was the cutting edge. That version was released in late 1998, and the project I was working on at the time was still in the planning stages. The powers that be decided to move to 7.0 rather than 6.5, and it was only some time later, when I had to support some 6.5 instances, that I came to appreciate the paradigm shift that the new version represented. I think we can all agree that we benefit from that brave new world to this day.

Since then, we have seen six new releases of SQL Server (2000, 2005, 2008, 2008 R2, 2012 and 2014), and we now find ourselves on the cusp of the release of SQL Server 2016. Over the past couple of releases we have seen an increase in the transparency of the development process through the release of public beta versions, and 2016 has taken it to a whole new level. There have been, if my count is correct, eleven public preview versions released as of this writing, up through Release Candidate 1. That is absolutely incredible, and I think Microsoft deserves a ton of credit for opening up the process to this degree.

The new and updated feature set in SQL Server 2016 is equally impressive. I have downloaded, installed, and (to varying degrees) used every single one of those preview versions, and yet I feel that I’ve barely skimmed the surface of what I need to eventually learn. I have studied pretty much every new feature in the product, but I just haven’t been able to dig in nearly as deep as I would like. There’s just too much.

So, I guess that all of this is just my way of saying that I’m really quite excited about the imminent release of SQL Server 2016. It is packed with new features, and I am sure that it will be some time until I get the opportunity to use some of these new bits of functionality in real systems (and until they mature enough to be ready for such use). Still, this is an exciting time for me, more so than for the past few releases.

Professional Development Notes

I have been doing a brain dump on Hekaton for the past few weeks, but I need a bit of a break before diving into (and possibly drowning in) the internals.

I have attended a few technology-related professional development courses over the past few years, and I wanted to record some of the better pieces of advice here, primarily for my own reference, because I very much need this advice as much as anyone, if not more.

Because I’m such an expert on such matters (not!), I have decided to leave these suggestions in bullet form. It may also have something to do with the fact that I really don’t have anything to add.

  • Interview question: What is one of the worst things about working here?
  • Act like a consultant (even if you aren’t)
  • Work on advanced projects – just enough to be uncomfortable
  • Get some exercise / meditate
  • Non-computer-related hobby
  • Make your accomplishments visible (far beyond the team that already knows)
  • Have strong opinions, weakly held (be willing to change)
  • Be a jack-of-all-trades
  • Be willing to step up and work extra hours when the need arises
  • Avoid negativity – in yourself and others
  • Learn new, related technologies
  • Know where you want to go
  • Use objective information (not emotion) to project the lifetime of a technology
  • If your current job doesn’t include what you love to do, expand your job and take on extra responsibilities
  • Once per year take time to assess where you were a year ago and where you want to be in one year
  • Iterative, continual learning is the best approach
  • Recognize that short-term thinking is usually driven by artificially dictated project deadlines
  • On the other hand, too much long-term thinking can cause the project to become mired and never ship
  • Learn when to ignore company politics and just do your job
  • Choose your battles; is this the right hill to die on?
  • Listen before responding; let them give their story
  • Most people need criticism; most people who deliver criticism do it poorly (get over it)
  • Don’t try to do too much; do fewer things and do them well
  • Low performers realize they can do 20% less and survive; high performers do 20% more and stay ahead
  • Continually interview, even if you don’t intend to leave (it helps you evaluate the industry and your own skills)

Hekaton Part 7: Error Handling

One in a series of posts about Microsoft’s In-Memory OLTP Functionality

In the previous post in this series, I discussed three validation errors that are unique to Hekaton and their common causes:

  • Error 41302 – Generated when two concurrent sessions update the same record.
  • Error 41305 – Generated when a concurrent update causes a repeatable read validation failure.
  • Error 41325 – Generated when a concurrent insert causes a serializable validation failure.

There is also error 41301 that I will address later in this series.

Applications that access databases containing In-Memory tables need to be aware of these errors and be able to respond gracefully when they occur.

There is one error condition that has been around for a long time in SQL Server and that closely parallels these three errors: the deadlock (error 1205). Deadlocks, however, are often not accounted for in error handling because they tend to be rare and can largely be avoided by good coding practices (specifically, accessing tables and other resources in a consistent order whenever possible). The Hekaton errors, on the other hand, will be more commonplace in practice, and I consider them to be “normal” events, especially as the throughput on the system becomes heavy.

The appropriate response, in most situations, is to retry the operation when one of these errors is raised. Microsoft has an example of handling these errors in stored procedures, but applications also need to address them.

Here is a stub of a data access class that demonstrates how this can be done.

using System;
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace HekatonErrorTrapping
{
    class Dal : IDisposable
    {
        private SqlConnection _connection;

        public Dal()
        {
            SqlConnectionStringBuilder sb = new SqlConnectionStringBuilder();
            sb.DataSource = @".";
            sb.InitialCatalog = "Hekaton";
            sb.IntegratedSecurity = true;

            _connection = new SqlConnection(sb.ToString());
            _connection.Open();
        }

        public SqlTransaction StartTransaction()
        {
            return _connection.BeginTransaction();
        }

        public void UpdateProductUnitPrice(int productId, decimal newPrice, SqlTransaction transaction)
        {
            string sql = @"update dbo.Product with (snapshot) set UnitPrice = @NewPrice where ProductID = @ProductId;";
            SqlParameter idParameter = new SqlParameter("@ProductId", productId);
            SqlParameter priceParameter = new SqlParameter("@NewPrice", newPrice);

            using (SqlCommand command = new SqlCommand(sql, _connection, transaction))
            {
                command.Parameters.Add(idParameter);
                command.Parameters.Add(priceParameter);

                int retryCount = 5;
                bool isComplete = false;
                while (isComplete == false)
                {
                    try
                    {
                        command.ExecuteNonQuery();
                        isComplete = true;
                    }
                    catch (SqlException ex)
                    {
                        switch (ex.Number)
                        {
                            // 41302: Updated a record that has been updated since the txn started.
                            case 41302:

                            // 41305: Repeatable read validation failure.
                            case 41305:

                            // 41325: Serializable validation failure.
                            case 41325:

                            // 41301: Commit dependency on a transaction that aborted.
                            case 41301:

                            // 1205: Deadlock (not applicable to in-memory tables, but same principles apply).
                            case 1205:
                                retryCount--;
                                if (retryCount <= 0)
                                {
                                    throw;
                                }

                                // Take a very brief pause -- mostly needed in cases of very high write contention.
                                System.Threading.Thread.Sleep(1);
                                break;

                            default:
                                throw;
                        }
                    }
                }
            }
        }

        public void Dispose()
        {
            Dispose(true);
            GC.SuppressFinalize(this);
        }

        private void Dispose(bool disposing)
        {
            if (disposing)
            {
                if (_connection != null)
                {
                    _connection.Dispose();
                    _connection = null;
                }
            }
        }
    }
}

And here is an example of how to call this code.

using (Dal dal = new Dal())
{
    using (SqlTransaction transaction = dal.StartTransaction())
    {
        dal.UpdateProductUnitPrice(productId, newPrice, transaction);
        transaction.Commit();
    }
}

Now if one of these errors is generated during the update, the system will pause for one millisecond (mostly needed in cases of heavy write contention; this pause may need to be longer on some systems) and retry the operation, making up to five attempts in total. Of course, these values should be configurable and not hard-coded; a sketch of one way to do that follows.
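As a minimal sketch of that configurability, the retry values could be read from the application’s configuration file instead. The appSettings key names here are hypothetical; use whatever configuration scheme your application already employs.

using System.Configuration; // requires a reference to System.Configuration.dll

static class RetrySettings
{
    // Hypothetical appSettings keys, falling back to the hard-coded defaults from the stub above.
    public static int RetryCount
    {
        get
        {
            int value;
            return int.TryParse(ConfigurationManager.AppSettings["HekatonRetryCount"], out value) ? value : 5;
        }
    }

    public static int RetryPauseMilliseconds
    {
        get
        {
            int value;
            return int.TryParse(ConfigurationManager.AppSettings["HekatonRetryPauseMs"], out value) ? value : 1;
        }
    }
}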

There is one HUGE caveat to this, however. When one of these errors is thrown, the entire transaction is rolled back. In this simple example, the transaction is just a single update statement, but if the transaction spans a dozen statements, they all get rolled back.

This means that, at the very least, the application needs to be aware that the full transaction will go away, and it should wrap the transaction appropriately. Better yet, the application will know how to replay the full transaction up to the point of failure and will gracefully retry the operation a given number of times before giving up, as sketched below.
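Here is a rough sketch of what transaction-level retry might look like, building on the Dal stub above. The TransactionRetry class and its ExecuteWithRetry method are my own illustrative inventions, not part of any Microsoft API, and this design assumes the per-statement retry loop inside UpdateProductUnitPrice has been removed in favor of retrying the whole unit of work.

using System;
using System.Data.SqlClient;

namespace HekatonErrorTrapping
{
    static class TransactionRetry
    {
        // Hypothetical helper: runs the supplied work inside a single transaction and,
        // if a Hekaton validation error (or a deadlock) aborts it, replays the entire
        // transaction from the beginning.
        public static void ExecuteWithRetry(Action<Dal, SqlTransaction> work, int retryCount = 5)
        {
            while (true)
            {
                try
                {
                    using (Dal dal = new Dal())
                    using (SqlTransaction transaction = dal.StartTransaction())
                    {
                        // Replay every statement in the unit of work; any previous
                        // attempt was rolled back in its entirety.
                        work(dal, transaction);
                        transaction.Commit();
                        return;
                    }
                }
                catch (SqlException ex) when (ex.Number == 41302 || ex.Number == 41305 ||
                                              ex.Number == 41325 || ex.Number == 41301 ||
                                              ex.Number == 1205)
                {
                    retryCount--;
                    if (retryCount <= 0)
                    {
                        throw;
                    }

                    // Take a very brief pause before replaying, as in the single-statement example.
                    System.Threading.Thread.Sleep(1);
                }
            }
        }
    }
}

Calling it looks much like the earlier example, except that the entire body of the transaction is handed over as a delegate:

TransactionRetry.ExecuteWithRetry((dal, transaction) =>
{
    dal.UpdateProductUnitPrice(productId, newPrice, transaction);
    // ...any other statements that belong to this unit of work...
});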

Hekaton Part 6: MVCC Assumptions and Validation

One in a series of posts about Microsoft’s In-Memory OLTP Functionality

I originally intended to make part 6 in this series about error handling, but quickly realized that I needed to lay a bit of a foundation on multi-version concurrency control (MVCC), and particularly some of the assumptions made by MVCC and how those assumptions are subsequently confirmed.

Historically, SQL Server has used a pessimistic form of concurrency. My session sees every other connection as the enemy. Every other connection on the system is presumed to want access to my resources, and so SQL employs a locking mechanism to prevent anyone else from stomping over my work.

Now, locking and isolation levels are a complex topic in their own right, but a simple example will suffice for the purposes of this discussion. Suppose I update the Product table for ID 23, and before I commit my transaction, you come along and try to update the same row. Your connection will be blocked from proceeding until I commit or roll back, at which point you will be allowed to make the update. Under most isolation levels, you won’t even be allowed to read the row I’ve updated so long as it is uncommitted.

Blocking is a good thing in that it preserves isolation, but a bad thing when everybody is always waiting on someone else.

In SQL Server 2005, Microsoft introduced a quasi-optimistic concurrency model called snapshot isolation that addressed some of these problems. With snapshot enabled, you are able to read the row that I’ve updated (you will see the values in the row as they existed prior to any of my changes). However, you still won’t be able to update it; you will be blocked just as always.

Snapshot isolation is frequently summarized as “readers don’t block writers, and writers don’t block readers.”

Hekaton takes us a huge step forward toward optimistic concurrency. It makes a couple of assumptions that remove that “every other session is the enemy” paradigm. One assumption is that write conflicts are rare, so SQL always proceeds with data modifications under the assumption that no one else is going to change the same row.

A second assumption is that competing transactions are going to be committed rather than rolled back.

Now, sometimes these assumptions simply aren’t valid, so SQL Server includes validation checks to make sure that things don’t go awry.

We’ll get into internals in a later part of this series to better explain how SQL does all this magic, but for now suffice it to say that SQL maintains multiple versions of the same row, along with a pair of time stamps indicating when each version is valid. So when we update ProductID 23, SQL will actually create a new row in the table and mark the original row as valid through, say, 1:17 pm, while the new row only becomes valid at 1:17 pm. When a user starts a new transaction, SQL notes the time that the transaction starts, and during the course of the transaction it will always return the version of the row that was valid and committed at that time.
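To make that concrete (the times here are hypothetical), the version chain for ProductID 23 after such an update would look roughly like this:

UnitPrice = 23.00    valid from 11:02 am until 1:17 pm
UnitPrice = 30.00    valid from 1:17 pm onward

A transaction that started at 1:15 pm will see the 23.00 version for its entire lifetime, while a transaction that starts at 1:20 pm will see 30.00.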

Let’s demonstrate. We’ll open two tabs in Management Studio, each connected to our Hekaton database. In the first tab, we’ll make sure that the price on product 23 is a known value.

update dbo.Product
set UnitPrice = 23.00
where ProductID = 23;

Now in tab 1, we’ll start a transaction and modify the price.

begin transaction;

update dbo.Product with (snapshot)
set UnitPrice = 30.00
where ProductID = 23;

In tab 2, let’s examine the value.

select ProductID, UnitPrice
from dbo.Product
where ProductID = 23;

This confirms that connection 2 sees a UnitPrice of 23.00, which is the committed value of the row at the time the transaction started. Continuing in tab 2, start a new transaction:

begin transaction;

Switch back to tab 1 and commit.

commit;

Re-execute the select in connection 2, and you will still get 23.00. Again, this was the committed value of the row at the time the transaction began. But if we close the transaction and do it again:

commit;

begin transaction;

select ProductID, UnitPrice
from dbo.Product with (snapshot)
where ProductID = 23;

commit;

This time we get a unit price of 30.00, reflecting the committed price when the new transaction begins.

So that is MVCC in a nutshell.

What about the validations? If one of the MVCC assumptions doesn’t hold up, SQL Server will detect this circumstance through one of several validation methods. In some cases these validations happen as the transaction progresses, and in other cases the validation happens when the transaction commits.

In connection 1, make sure that there are no open transactions, then run the following:

begin transaction;

update dbo.Product with (snapshot)
set UnitPrice = 30.00
where ProductID = 23;

And in connection 2 we will try a simultaneous update.

begin transaction;

update dbo.Product with (snapshot)
set UnitPrice = 35.00
where ProductID = 23;

This immediately fails with an error:

Msg 41302, Level 16, State 110, Line 18
The current transaction attempted to update a record that has been updated since this transaction started. The transaction was aborted.
Msg 3998, Level 16, State 1, Line 16
Uncommittable transaction is detected at the end of the batch. The transaction is rolled back.
The statement has been terminated.

I find the wording of the 41302 error a bit odd because it’s not completely correct, but the upshot is that two simultaneous update operations were in progress. The assumption that there would be no write-write conflict has failed, so the error is generated. Note that SQL Server does not block the progress of the transaction, as it would have in previous versions; instead, it immediately raises an error.

In addition to the 41302 error, we might also get 41305 and 41325 errors. These relate to violations of the repeatable read and serializable isolation level requirements. Make sure that neither connection has an open transaction, then run this in connection 1:

begin transaction;

select ProductID, UnitPrice
from dbo.Product with (repeatableread)
where ProductID = 23;

Then back in connection 2 change the price:

begin transaction;

update dbo.Product with (snapshot)
set UnitPrice = 38.00
where ProductID = 23;

commit;

Now try to commit transaction 1:

commit;

At commit time, SQL Server validates the repeatable read requirement, and the following error is generated:

Msg 41305, Level 16, State 0, Line 25
The current transaction failed to commit due to a repeatable read validation failure.

Finally, we can create the conditions for a 41325 error. Commit any open transactions, then in connection 1 run:

delete dbo.Product where ProductID = 1000;

begin transaction;

select ProductID, UnitPrice
from dbo.Product with (serializable)
where ProductID = 1000;

And in connection 2 run:

insert dbo.Product (ProductID, Description, UnitPrice)
values (1000, 'Product 1000', 1000.00);

Back in connection 1:

commit;

Msg 41325, Level 16, State 0, Line 35
The current transaction failed to commit due to a serializable validation failure.

Whew, that was a lot just to talk about the validations that take place (and there is actually one more validation that happens at commit time, but we’ll address that separately). With that background, we can now talk about error handling in the next post.