I really like my work as every single day I get to see different SQL Server and new interesting problems to solve when I work with Comprehensive Database Performance Health Check. Recently, while working with a client we noticed that they have a huge index with a first key column as uniqueidentifier datatype. In this blog post, we will discuss Using NEWID vs NEWSEQUENTIALID for Performance.
As soon as I noticed that the column is of datatype uniqueidentifier, the very first my reaction was to remove it and replace it with some other column. However, that was not possible due to heavy dependance of this datatype in the application and third party vendor app. This limited us to keep on using the datatype uniqueidentifier as the column.
After careful testing by comparing uniqueidentifier and INT as a datatype, we realized that by keeping the column as uniqueidentifier we were loosing over 200% of the performance. However, we had no option to change the datatype. I noticed that my client was using the default value of NEWID to generate new rows. As soon as I learned that I realized that I had one more trick, I can use to improve the performance of the query.
We replaced NEWID with NEWSEQUENTIALID as the default value and started our test again. In our test with the various different workloads, we identified a huge performance improvement and we were able to keep our data type as it is. The reason for this behavior is that NEWID generates absolutely random uniqueidentifier value whereas NEWSEQUENTIALID generates incrementing uniqueidentifier value.
As NEWID has to generate a random number it is not possible to guess which number will be the next number but in the case of NEWSEQUENTIALID, often it is easy to guess the next number just like INTEGER as an auto-increment digit. In terms of security, you should use NEWID. Now let us see a following script which generates 100 random number for us and see the pattern of both the numbers.
CREATE TABLE myTable (NewIDCol UNIQUEIDENTIFIER DEFAULT NEWID(), NewSeqIDCol UNIQUEIDENTIFIER DEFAULT NEWSEQUENTIALID()); GO INSERT INTO myTable (NewIDCol, NewSeqIDCol) VALUES (DEFAULT, DEFAULT); GO 100 SELECT * FROM myTable; GO DROP TABLE myTable GO
Here is the output of the above query.
You can clearly see the difference between both – NEWID and NEWSEQUENTIALID. If you do not need a true random number I suggest you use the NEWSEQUENTIALID as it is faster to generate and also fills up a complete page due to fill factor. We will further test various aspects of the same in the upcoming blog posts.
Reference: Pinal Dave (https://blog.sqlauthority.com)
Well, that is a truth with limitation; generated GUIDs are only sequential (as in ever increasing) until you restart Windows – after which the GUID generated next may even be smaller than the ones previously generated…
I’ve read that the algorithm for generating the NEWSEQUENTIALID is based on the NIC, which makes me wonder if it would maintain its increment property when the database moves between servers, such as can happen with many HA mechanisms, i.e. availability groups. If not, performance might mysteriously plummet after a failover.
The unique identifier is especially hurting if you have it as a primary key, as that is your clustered index, which means with every insert you have to physically reorder rows to insert it in the right place -> after some time you’ll have 99% space taken up with “free” space due to frequent moving of rows, not to mention the performance impact…
We are transforming the application to Azure SQL and our DB Transactional tables have NEWID() as uniqueidentifier field. Is it good idea to have uniqueidentifier as a foreign key(keeping unique constraint on Parent table). We may need for the createduserid/modifieduserid/requestid. Is there any alternative to uniqueidentifier which would have better performance and still holds unique values as we migrate across dbs and servers.
I am not a fan of uniqueidentifier fields as I have often seen performance issues.