Some of the best learning comes from people's questions; the more questions are asked, the more everyone learns. In a recent session with a customer who had implemented SQL Server 2014 In-Memory OLTP in their systems, they reported poor performance even though they were using In-Memory OLTP. I was surprised to hear that and asked for more details. What I noticed was that the customer had set a very low hash bucket count in their in-memory table definition.
If you read the article – SQL SERVER – Introduction to SQL Server 2014 In-Memory OLTP – you can see how to specify a hash bucket count when defining the table. I have always recommended that customers set the bucket count to roughly double the number of unique key values, both to avoid hash collisions and to leave room for future growth. Though this is a rule-of-thumb recommendation, I have always pointed out that hash collisions (two or more values landing in the same hash bucket) are a performance deterrent as well.
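As an aside, SQL Server rounds the requested BUCKET_COUNT up to the next power of two, so doubling an expected 500,000 unique keys to 1,000,000 effectively gives you 1,048,576 buckets. Here is a quick back-of-the-envelope sketch of that arithmetic (the key count is just an illustrative assumption):

DECLARE @expected_unique_keys BIGINT = 500000;                     -- illustrative assumption
DECLARE @requested_buckets    BIGINT = @expected_unique_keys * 2;  -- rule of thumb: double it

-- SQL Server rounds the requested BUCKET_COUNT up to the next power of two
SELECT @requested_buckets AS requested_bucket_count,
       POWER(CAST(2 AS BIGINT), CEILING(LOG(@requested_buckets, 2))) AS effective_bucket_count;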
Until now, no one had challenged me on exactly how much of a performance damper collisions can be, or asked whether there was a script to demonstrate it. This time, I decided to build a demonstration for the customer to drive the concept home, and I would like to share that script with you. The steps are:
- Create a database for testing
- Create table1 with a large hash bucket count
- Insert 1M rows into table1 and track the time taken
- Create table2 with a much smaller hash bucket count, causing hash collisions
- Insert 1M rows into table2 and track the time taken
- Clean up
Let us first create the database.
USE master;
GO

-- Create a new database with a memory-optimized filegroup
CREATE DATABASE InMem_HashCollision
ON PRIMARY
(
    NAME = N'HashCollisions',
    FILENAME = N'd:\Data\HashCollisions.mdf',
    SIZE = 64MB,
    MAXSIZE = UNLIMITED,
    FILEGROWTH = 16MB
),
FILEGROUP [XTPFileGroup] CONTAINS MEMORY_OPTIMIZED_DATA
(
    NAME = N'HashCollisionInMemContainer',
    FILENAME = N'd:\Data\HashCollisionInMemContainer',
    MAXSIZE = UNLIMITED
)
LOG ON
(
    NAME = N'HashCollisions_Log',
    FILENAME = N'd:\Data\HashCollisions.ldf',
    SIZE = 16MB,
    MAXSIZE = 2048MB,
    FILEGROWTH = 8MB
)
COLLATE Latin1_General_100_BIN2;
GO
Let us now create our in-memory table and a natively compiled stored procedure that will be used to insert values into it.
USE InMem_HashCollision;
GO

-- Create a simple memory-optimized (XTP) table with a bucket count of 1048576
CREATE TABLE dbo.InMem_Data1
(
    Col1 INT NOT NULL PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 1048576),
    Col2 INT NOT NULL,
    Col3 INT NOT NULL
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);
GO

-- Create a natively compiled stored procedure to insert data
CREATE PROCEDURE dbo.InsertInMem_Data1 (@Rows INT)
WITH NATIVE_COMPILATION, SCHEMABINDING, EXECUTE AS OWNER
AS
BEGIN ATOMIC WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = 'us_english')
    DECLARE @counter INT = 0;
    WHILE @counter < @Rows
    BEGIN
        INSERT INTO dbo.InMem_Data1 (Col1, Col2, Col3)
        VALUES (@counter, @counter, @counter);
        SET @counter += 1;
    END
END;
GO
Let us insert 1M rows by calling the stored procedure and check how long it takes.
-- Execute the stored procedure to insert 1,000,000 rows and time it
DECLARE @starttime DATETIME = CURRENT_TIMESTAMP,
        @endtime   DATETIME;

EXECUTE dbo.InsertInMem_Data1 1000000;

SET @endtime = CURRENT_TIMESTAMP;
PRINT CONCAT('The query executed in ', DATEDIFF(millisecond, @starttime, @endtime), ' milliseconds.');
Here is the resultset from the query.
As you can see, the operation completed in about 1 second. Next, we will repeat the process with a table that has a much smaller bucket count and insert the same 1M rows.
-- Create the same table with a much smaller bucket count of 1024
CREATE TABLE dbo.InMem_Data2
(
    Col1 INT NOT NULL PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 1024),
    Col2 INT NOT NULL,
    Col3 INT NOT NULL
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);
GO

-- Create a natively compiled stored procedure to insert data
CREATE PROCEDURE dbo.InsertInMem_Data2 (@Rows INT)
WITH NATIVE_COMPILATION, SCHEMABINDING, EXECUTE AS OWNER
AS
BEGIN ATOMIC WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = 'us_english')
    DECLARE @counter INT = 0;
    WHILE @counter < @Rows
    BEGIN
        INSERT INTO dbo.InMem_Data2 (Col1, Col2, Col3)
        VALUES (@counter, @counter, @counter);
        SET @counter += 1;
    END
END;
GO

-- Execute the stored procedure to insert 1,000,000 rows and time it
DECLARE @starttime DATETIME = CURRENT_TIMESTAMP,
        @endtime   DATETIME;

EXECUTE dbo.InsertInMem_Data2 1000000;

SET @endtime = CURRENT_TIMESTAMP;
PRINT CONCAT('The query executed in ', DATEDIFF(millisecond, @starttime, @endtime), ' milliseconds.');
Here is the resultset from the query.
As you can see, the second run took close to 47 seconds, far longer than the first. That is the effect of hash collisions on INSERT, and a similar overhead applies when retrieving data too, because every lookup has to walk the long chain of rows sharing a bucket.
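Rather than inferring the collisions from timings alone, we can observe them directly. Here is a small sketch using sys.dm_db_xtp_hash_index_stats (run it in the InMem_HashCollision database, since the DMV is scoped to the current database); a high avg_chain_length combined with few empty buckets is the classic signature of an undersized bucket count:

-- Compare bucket usage and chain lengths for the two demo tables
SELECT OBJECT_NAME(hs.object_id) AS table_name,
       hs.total_bucket_count,
       hs.empty_bucket_count,
       hs.avg_chain_length,
       hs.max_chain_length
FROM sys.dm_db_xtp_hash_index_stats AS hs
WHERE hs.object_id IN (OBJECT_ID('dbo.InMem_Data1'), OBJECT_ID('dbo.InMem_Data2'));

With 1M rows squeezed into 1,024 buckets, InMem_Data2 should report an average chain length of roughly 1,000,000 / 1,024 ≈ 977 rows per bucket, while InMem_Data1 stays close to 1.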
The takeaway is that settings like BUCKET_COUNT can have a serious adverse effect on performance when sized incorrectly. I would highly recommend keeping an eye on these settings before implementing a feature simply for the sake of using something new inside SQL Server.
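If you discover an undersized bucket count after the fact, keep in mind that on SQL Server 2014 the only remedy is to drop and recreate the table. From SQL Server 2016 onwards, the bucket count can be changed in place; here is a sketch of that syntax, where PK_InMem_Data2 is an assumed index name (the demo's primary key index is system-named, so look up the actual name in sys.indexes first):

-- SQL Server 2016 and later only; not available on SQL Server 2014
-- PK_InMem_Data2 is an assumed name: check sys.indexes for the real
-- system-generated primary key index name before running this
ALTER TABLE dbo.InMem_Data2
    ALTER INDEX PK_InMem_Data2
    REBUILD WITH (BUCKET_COUNT = 2097152);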
Let us now run the cleanup script:
-- Cleanup
USE master;
GO

-- DATABASEPROPERTYEX returns the internal database version (a positive
-- number) when the database exists, and NULL otherwise, so test for > 0
IF DATABASEPROPERTYEX('InMem_HashCollision', 'Version') > 0
BEGIN
    ALTER DATABASE InMem_HashCollision SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
    DROP DATABASE InMem_HashCollision;
END
Reference: Pinal Dave (https://blog.sqlauthority.com)
1 Comment
It appears as if the cleanup stage requires a greater than zero comparison:
IF DATABASEPROPERTYEX('InMem_HashCollision', 'Version') > 0
In SQL Server 2017 (RTM-CU15), this test returns 869 for every database tested:
select DATABASEPROPERTYEX('DatabaseName', 'Version');
In SQL Server 2016 (SP1-CU10), this command returns 852 for every database I tested, and in SQL Server 2012 (SP4), this command returns 706 for every database I tested.
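Since 'Version' is an internal number that varies from build to build, an arguably clearer existence test (a sketch, equivalent in effect) is to check DB_ID() for NULL:

-- Alternative cleanup guard that avoids the internal version number
IF DB_ID('InMem_HashCollision') IS NOT NULL
BEGIN
    ALTER DATABASE InMem_HashCollision SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
    DROP DATABASE InMem_HashCollision;
END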