Hekaton Part 5: Native Procedures

One in a series of posts about Microsoft’s In-Memory OLTP Functionality

Now we come to one of the Hekaton components that is really quite cool but that I haven’t found a practical use for. Not even once.

Natively compiled stored procedures.

The cool part is that they can be really, really fast. They can be orders of magnitude faster than traditional interpreted procedures. The not so cool part is that they support only a very, very limited subset of T-SQL.

Let’s demonstrate. First, we’ll create a new table and add a couple of rows.

if exists (select * from sys.tables where name = 'Product')
	drop table dbo.Product;
create table dbo.Product
(
	ProductID int not null,
	Description nvarchar(500) not null,
	UnitPrice money not null,

	primary key nonclustered hash (ProductID) with (bucket_count = 1048576),
	index idx_Product__UnitPrice nonclustered (UnitPrice)
) with (memory_optimized = on, durability = schema_and_data);

-- Populate a few rows into the table

insert dbo.Product (ProductID, Description, UnitPrice)
values (1, 'Gizmo', 34.00),
	(2, 'Whatzit', 16.00);

Now can create a natively compiled stored procedure to update the table. This isn’t going to demonstrate speed, but it’s just to show the general idea (and the general awkwardness of the syntax).

if exists (select * from sys.objects where type = 'P' and name = 'usp_UpdateProduct')
	drop procedure usp_UpdateProduct
go
create procedure dbo.usp_UpdateProduct (@ProductID int, @NewUnitPrice money)
with native_compilation, schemabinding, execute as owner
as
begin atomic with (transaction isolation level = snapshot, language = 'us_english')
	update dbo.Product
	set UnitPrice = @NewUnitPrice
	where ProductID = @ProductID;
end
go

We can demonstrate that the procedure really works by running something like.

select ProductID, Description, UnitPrice from dbo.Product where ProductID = 1;
exec dbo.usp_UpdateProduct @ProductID = 1, @NewUnitPrice = 42.00;
select ProductID, Description, UnitPrice from dbo.Product where ProductID = 1;

Now let’s write a procedure to add a bunch of new rows. First, we will create and populate a tally table.

create table Tally
(
    Number int not null,
    constraint PK_Tally primary key nonclustered
        (Number)
) with (memory_optimized = on, durability = schema_and_data);

insert Tally (Number)
select top 1000000 row_number() over (order by v1.number)
from master.dbo.spt_values v1 cross join master.dbo.spt_values v2;

And then try this for a stored procedure:

if exists (select * from sys.objects where type = 'P' and name = 'usp_CreateNewRecords')
	drop procedure usp_CreateNewRecords
go
create procedure dbo.usp_CreateNewRecords (@RecordCount int)
with native_compilation, schemabinding, execute as owner
as
begin atomic with (transaction isolation level = snapshot, language = 'us_english')
	declare @firstId int;
	select @firstId = isnull(max(ProductID), 0) from dbo.Product;

	insert	dbo.Product (ProductID, Description, UnitPrice)
	select	t.Number + @firstId ProductID,
			'Product ' + cast(t.Number + @firstId as nvarchar(500)) Description,
			cast(t.Number + @firstId as money) UnitPrice
	from	dbo.Tally t
	where	t.Number <= @RecordCount;
end
go

Then we’ll run the procedure to add 1,000,000 records:

exec dbo.usp_CreateNewRecords @RecordCount = 1000000;

This goes pretty fast on my machine, consistently adding the 1M records in 5 to 6 seconds.

I wrote the procedure the way I did because I’ve always been taught (any experience has validated) that set-based logic is almost always superior to row-by-row processing. Hekaton definitely changes that paradigm. Let’s rewrite that procedure to do one-off inserts.

if exists (select * from sys.objects where type = 'P' and name = 'usp_CreateNewRecords')
	drop procedure usp_CreateNewRecords
go
create procedure dbo.usp_CreateNewRecords (@RecordCount int)
with native_compilation, schemabinding, execute as owner
as
begin atomic with (transaction isolation level = snapshot, language = 'us_english')
	declare @firstId int;
	select @firstId = isnull(max(ProductID), 0) from dbo.Product;

	declare @counter int = 1;
	while @counter <= @RecordCount
	begin
		insert dbo.Product (ProductID, Description, UnitPrice)
		values (@firstId + @counter,
			'Product ' + cast(@firstId + @counter as nvarchar(500)),
			cast(@firstId + @counter as money));
		select @counter = @counter + 1;
	end
end
go

Now that insert runs in 3 to 4 seconds. Not a dramatic difference, but intriguing to me because it seems to be a game-changer.

Finally, the limitations. This is why I’ve not found natively compiled procedures to be terribly compelling, because I am constantly running up against some limitation. The Microsoft master In-Memory Limitations page lists them. Scroll down to the Natively Compiled Stored Procedures section, which occupies the majority of the page, and you’ll see the long list. I won’t repeat them all here, but the big ones for me are lack of support for CTEs and subqueries, CASE statements, temp tables, non-BIN2 limits, lots of unsupported join types, no IN or OR or LIKE, no DISTINCT and limitations on aggregations, no ROW_NUMBER(), and of course no references to disk-based tables.

Whew! Even my abbreviated list is pretty long!

Hekaton Part 4: Isolation Levels

One in a series of posts about Microsoft’s In-Memory OLTP Functionality

One thing to keep in mind about querying Hekaton tables is that the isolation level you use is quite important. Queries must be run at the snapshot, repeatable read or serializable isolation level. In the previous post, we executed a select query. Let’s add an explicit isolation level.

set transaction isolation level read committed;

select EmployeeID, FirstName, LastName, Salary
from dbo.Employee
order by LastName, FirstName;

And this query runs just fine. We execute at the read committed level using snapshot isolation.

But change that “read committed” to just about anything else (read uncommitted, repeatable read) and the statement will produce an error.

Msg 10794, Level 16, State 80, Line 3
The transaction isolation level 'READ UNCOMMITTED' is not supported with memory optimized tables.

Msg 41333, Level 16, State 1, Line 3
The following transactions must access memory optimized tables and natively compiled stored procedures under snapshot isolation: RepeatableRead transactions, Serializable transactions, and transactions that access tables that are not memory optimized in RepeatableRead or Serializable isolation.

Msg 41333, Level 16, State 1, Line 3
Memory optimized tables and natively compiled stored procedures cannot be accessed or created when the session TRANSACTION ISOLATION LEVEL is set to SNAPSHOT.

Things also start to break down when you use an explicit transaction (instead of the autocommit transaction above).

set transaction isolation level read committed;

begin transaction;

select EmployeeID, FirstName, LastName, Salary
from dbo.Employee
order by LastName, FirstName;

commit;

Msg 41368, Level 16, State 0, Line 5
Accessing memory optimized tables using the READ COMMITTED isolation level is supported only for autocommit transactions. It is not supported for explicit or implicit transactions. Provide a supported isolation level for the memory optimized table using a table hint, such as WITH (SNAPSHOT).

The error message give the solution: use the with (snapshot) table hint.

set transaction isolation level read committed;

begin transaction;

select EmployeeID, FirstName, LastName, Salary
from dbo.Employee with (snapshot)
order by LastName, FirstName;

commit;

There is a database-level setting to allow omitting the table hint.

alter database Hekaton
set memory_optimized_elevate_to_snapshot = on;

And now the table hint can be left out of the query.

set transaction isolation level read committed;

begin transaction;

select EmployeeID, FirstName, LastName, Salary
from dbo.Employee
order by LastName, FirstName;

commit;

Things get a bit more interesting when queries access both Hekaton tables and disk-based tables. Suppose we have an on-disk Customer table and we want to note which Employees have certain relationships to a given Customer.

create table CustomerEmployeeRelationship
(
	RelationshipId int not null identity(1,1) primary key clustered,
	CustomerId int not null,
	EmployeeId int not null,
	RelationshipTypeId int not null
);

Side note: Trying create a foreign key back to the Employee table won’t work, another limitation of Hekaton tables.

alter table dbo.CustomerEmployeeRelationship
add constraint fk_CustomerEmployeeRelationship_EmployeeId
foreign key (EmployeeId) references dbo.Employee (EmployeeID);


Msg 10794, Level 16, State 10, Line 57
The feature 'FOREIGN KEY' is not supported with memory optimized tables.
Msg 1750, Level 16, State 0, Line 57
Could not create constraint or index. See previous errors.

Let’s find customer-employee relationships of a certain type:

select rel.CustomerId, rel.EmployeeId
from dbo.Employee emp
inner join dbo.CustomerEmployeeRelationship rel
on emp.EmployeeID = rel.EmployeeId
where rel.RelationshipTypeId = 7;

That works fine and well, but some combinations of isolation levels don’t mix and match.

set transaction isolation level repeatable read;

select rel.CustomerId, rel.EmployeeId
from dbo.Employee emp with (repeatableread)
inner join dbo.CustomerEmployeeRelationship rel
on emp.EmployeeID = rel.EmployeeId
where rel.RelationshipTypeId = 7;


Msg 41333, Level 16, State 1, Line 74
The following transactions must access memory optimized tables and natively compiled stored procedures under snapshot isolation: RepeatableRead transactions, Serializable transactions, and transactions that access tables that are not memory optimized in RepeatableRead or Serializable isolation.

Hekaton Part 3: Querying In-Memory Tables

One in a series of posts about Microsoft’s In-Memory OLTP Functionality

So now that we’ve created our Hekaton table, we should be able to query and modify it just like any other table, right?

Actually, yes. OK, so there are a few limitations, but by and large this is the part of Hekaton that is the most fully supported part of the product.

insert dbo.Employee (EmployeeID, FirstName, LastName, Salary)
values
(1, 'Marie', 'Salazar', 106779.90),
(2, 'Jason', 'Brown', 85553.93),
(3, 'Felicia', 'Normandin', 52278.50),
(4, 'Peter', 'Sabatino', 55018.27),
(5, 'Gregory', 'Mannan', 66715.94);

Selecting from the data works well.

select EmployeeID, FirstName, LastName, Salary
from dbo.Employee
order by LastName, FirstName;

The data can be modified just like normal.

update dbo.Employee
set Salary = 57219.00
where EmployeeID = 4;

The table can be joined to other tables, whether another Hekaton table or a good old-fashioned disk table, so long as the other table in in the same database. If you try to cross over to a different database, you get this error:

Msg 41317, Level 16, State 1, Line 20
A user transaction that accesses memory optimized tables or natively compiled procedures cannot access more than one user database or databases model and msdb, and it cannot write to master.

Some constructs aren’t supported. This MERGE statement is valid for a disk-based table.

merge dbo.Employee emp
using (values (5, 'Gregory', 'Mannan', 69384.58), (6, 'Michelle', 'Irvin', 80221.66) )
as src (EmployeeID, FirstName, LastName, Salary)
on emp.EmployeeID = src.EmployeeID
when matched then
	update set FirstName = src.FirstName,
		LastName = src.LastName,
		Salary = src.Salary
when not matched by target then
	insert (EmployeeID, FirstName, LastName, Salary)
	values (src.EmployeeID, src.FirstName, src.LastName, src.Salary);

But for a Hekaton table we get an error:

Msg 10794, Level 16, State 100, Line 25
The operation 'MERGE' is not supported with memory optimized tables.

(However, using a Hekaton table in a MERGE statement but not as the target is OK.)

truncate table dbo.Employee;

Msg 10794, Level 16, State 92, Line 37
The statement 'TRUNCATE TABLE' is not supported with memory optimized tables.

So yeah, a few limitations on querying, but most things work just fine.

Hekaton Part 2: Table Limitations

One in a series of posts about Microsoft’s In-Memory OLTP Functionality

One the big bummers about the first version of Hekaton is the limitations, which are legion. Microsoft has the complete list, but here are the big ones as far as I’m concerned.

Once you’ve created a table structure, it can’t be modified. No new columns, no new indexes. If you need to change the schema, you need to copy out the data somewhere else, drop and recreate the table with the modified structure, then copy the data back. This is tedious at best for small data sets, but a possible killer for larger tables.

A table cannot have more than 8 indexes. It need to have at least one index, a primary key, but otherwise none of the indexes can be unique. Also, you cannot update the primary key columns. Oh, and if a index is sorted ascending, SQL can only use that index in ascending mode. If queries need descending order, you’ll have to create a second index on the same keys, but in the reverse order.

There is a hard limit of 8,060 bytes per row, enforced at table create time. Disk-based tables had a similar limitation at one point, but I sure have liked things since that went away.

As a natural extension of the 8,060 byte limit, tables cannot have LOB columns.

No foreign keys. No check constraints.

Any string columns in an index must be in a BIN2 collation. This is a big one, in my experience. We’ve come to expect certain behavior from certain collations, and not having this behavior has been a deal-breaker in a number of circumstances when I’ve considered using an in-memory table.

A note about indexes that I didn’t cover before is that are two types of indexes. One is the hash index, and you would create one by simply declaring a nonclustered hash index, and include a bucket count based on how many items you expect the index will ultimately hold. Hash indexes are more useful if the query specifies an equality.

index idx_Product__UnitPrice nonclustered hash (UnitPrice)
with (bucket_count = 1048576)

Range indexes are useful for finding ranges of data, such as when the query uses an inequality or a range of values. To create a range index, omit the HASH keyword.

index idx_Product__UnitPrice nonclustered (UnitPrice)