SQL SERVER – Do You Know Your Data’s Classification? – Notes from the Field #050

[Notes from Pinal]: Data is a very simple word, yet it is very powerful. There is a famous saying – Know Your Data. I have quite often found that developers do not know their data; they are often confused about it and not sure how to answer questions about it. My friend Tim Radney is an amazing person who usually has an answer to each of those questions which look simple but are complicated in reality.

Linchpin People are database coaches and wellness experts for a data driven world. In this 50th episode of the Notes from the Field series, database expert Tim Radney (partner at Linchpin People) explains data classification in very simple words. Please follow Tim on his Twitter handle at @tradney.


Do you know your data’s classification?

As data professionals, we have many responsibilities. We could be responsible for backing up and restoring data, writing reports, building queries, writing stored procedures, tuning workloads, or any of a vast number of other tasks.

If you are in the business of granting access to the data either through reports/queries or provisioning login access, you should be aware of the type of data you are granting access to. Your company likely has policies in place that should guide how access to certain classifications of data should be handled.

Some of the more common types of data that your company would have stricter controls over would be related to PII, PCI, SOX, GLBA, or HIPAA. You should know what the guidelines are within your company for access to this data and help make sure that those standards are being upheld.

These data assets may require additional auditing of who has access to view, print, export, etc. When the data is viewed, it may require watermarks or headers/footers on any reports.

Your organization may require certain types of data, such as PCI data, to be audited so that you know who is viewing particular elements of that data.

Worse, your organization may need to be doing these things but is not. If that is the case, you should be asking why and helping to implement a data governance plan.
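If you need a starting point for that kind of auditing, SQL Server Audit is one built-in option for tracking who reads sensitive data. Below is a minimal sketch, not a definitive implementation: the database name (SalesDB), table name (dbo.CardData), and audit file path are assumptions for illustration, and database audit specifications are not available in every edition of SQL Server.

USE master;
GO
-- Server-level audit writing to a file target (assumed path)
CREATE SERVER AUDIT PCI_Audit
    TO FILE (FILEPATH = 'D:\Audit\');
ALTER SERVER AUDIT PCI_Audit WITH (STATE = ON);
GO
USE SalesDB; -- assumed database holding the sensitive data
GO
-- Capture every SELECT against the assumed card data table
CREATE DATABASE AUDIT SPECIFICATION PCI_CardData_Audit
FOR SERVER AUDIT PCI_Audit
    ADD (SELECT ON OBJECT::dbo.CardData BY PUBLIC)
WITH (STATE = ON);
GO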

If you want me to take a look at your server and its settings, or if your server is facing any issue we can Fix Your SQL Server.

Note: Tim has also written an excellent book on SQL Backup and Recovery, a must have for everyone.

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL SERVER – Row Offset in SQL Server For Different Version

A very common question is how to find the row offset in SQL Server. Here are a few examples based on different versions of SQL Server. I have included SQL Server 2000 as well, even though it is an almost 15-year-old product and I encounter it less and less on production servers.

I have used the AdventureWorks database for these examples.

USE AdventureWorks2014
GO
--------------------------------------------------
-- SQL Server 2012/2014
--------------------------------------------------
DECLARE @RowsPerPage INT = 10, @PageNumber INT = 6
SELECT SalesOrderDetailID, SalesOrderID, ProductID
FROM Sales.SalesOrderDetail
ORDER BY SalesOrderDetailID
OFFSET (@PageNumber-1)*@RowsPerPage ROWS
FETCH NEXT @RowsPerPage ROWS ONLY
GO
--------------------------------------------------
-- SQL Server 2008 / R2
-- SQL Server 2005
--------------------------------------------------
-- Note: inline initialization in DECLARE requires SQL Server 2008 or later;
-- on SQL Server 2005, declare and assign separately as below.
DECLARE @RowsPerPage INT, @PageNumber INT
SET @RowsPerPage = 10
SET @PageNumber = 6
SELECT SalesOrderDetailID, SalesOrderID, ProductID
FROM (
SELECT SalesOrderDetailID, SalesOrderID, ProductID,
ROW_NUMBER() OVER (ORDER BY SalesOrderDetailID) AS RowNum
FROM Sales.SalesOrderDetail ) AS SOD
WHERE SOD.RowNum BETWEEN ((@PageNumber-1)*@RowsPerPage)+1
AND @RowsPerPage*(@PageNumber)
GO
--------------------------------------------------
-- SQL Server 2000
-- Note: SQL Server 2000 itself does not accept a variable in TOP;
-- there you would use SET ROWCOUNT or dynamic SQL instead.
-- The double TOP with reversed ORDER BY below is the 2000-era
-- pattern, written so that it runs on SQL Server 2005 and later.
--------------------------------------------------
DECLARE @RowsPerPage INT
DECLARE @PageNumber INT
SET @RowsPerPage = 10
SET @PageNumber = 6
SELECT SalesOrderDetailID, SalesOrderID, ProductID
FROM
(
SELECT TOP (@RowsPerPage)
SalesOrderDetailID, SalesOrderID, ProductID
FROM
(
SELECT TOP ((@PageNumber)*@RowsPerPage)
SalesOrderDetailID, SalesOrderID, ProductID
FROM Sales.SalesOrderDetail
ORDER BY SalesOrderDetailID
) AS SOD
ORDER BY SalesOrderDetailID DESC
) AS SOD2
ORDER BY SalesOrderDetailID ASC
GO

The results of the above queries are identical to each other.

This blog post is based on an earlier blog post; a few errors from that post have been corrected here.

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL SERVER – Performance Dashboard – Historic Information

There are a lot of games that I play with my daughter in my spare time. Some of them are just for fun, and a lot of them are fun filled with some learning. As she started to grow up, it was important for me to weave learning into her day-to-day activities. So as soon as she was born, one of the biggest decisions I took was to travel and show her various places in India and even abroad. The idea is to visit these places for fun but also for their historic importance. Learning about different cultures, people, places, food habits, etc., she gets to learn a lot in the process. I personally feel it is up to each parent to find their own sweet spot of making learning fun for their kids while balancing everything else they might do. It is a long process, and a lot of planning goes into making them realize that what they see in the books is what they are visiting. I have huge admiration for the culture and history each country brings, and we can always learn a lot from our ancestors.

Talking about history brings us to the next set of reports from the Performance Dashboard. They are grouped under Historical Information and Miscellaneous Information, and logically organized around Waits and Expensive Queries. We are looking at expensive queries based on:

  1. CPU
  2. Logical Reads
  3. Logical Writes
  4. Duration
  5. Physical Reads
  6. CLR Time

Apart from this, the historical information also gives the Waits and IO Statistics across each of the databases. We can see the same in the image below.

Historical Waits

This report has two sections, and both come from the same dataset. The top section is a bar chart and the bottom section shows the details for each wait.

The dataset for this report comes from the msdb database.

EXEC msdb.MS_PerfDashboard.usp_DmOsWaitStats

Executing this gives a table of all the wait types and the amount of wait time on each of them. Since I had just started the server when I executed this report, we can see it has been idle for most of the time, which is quite possible. Having said that, on a busy system these numbers can be completely different.

I have blogged about Wait Stats extensively on this blog as a 28-part series – if you are interested, feel free to read it starting from SQL SERVER – 28 Links for Learning SQL Wait Stats from Beginning.
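If you do not have the dashboard handy, the underlying DMV can be queried directly. Here is a minimal sketch against sys.dm_os_wait_stats; the short list of filtered-out benign waits is illustrative only, not exhaustive.

-- Top waits since the last restart (or since the wait stats were cleared)
SELECT TOP (10)
       wait_type,
       waiting_tasks_count,
       wait_time_ms,
       signal_wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type NOT IN (N'SLEEP_TASK', N'LAZYWRITER_SLEEP',
                        N'XE_TIMER_EVENT', N'BROKER_TASK_STOP') -- illustrative benign waits
ORDER BY wait_time_ms DESC;
GO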

Historical IO Report

This report is an aggregation of IO operations for each database since the server was started. Since this is IO, it compares the databases relative to each other by reads and writes. To get this information, the report calls the following SP in the msdb database.

EXEC msdb.MS_PerfDashboard.usp_DBFileIO

As we can see in the figure below, this is a great indicator that on my server AdventureWorks2012 has taken the maximum amount of reads while TempDB has taken the maximum writes.

The patterns that emerge from this tell stories that are normally difficult to get. Based on this analysis we can assume that there are a lot of reports or operations being run against AdventureWorks2012 that work their way into TempDB tables on our server. I am sure you will be able to do a similar analysis on your own installation too.
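If you want to approximate this per-database picture without the dashboard, the same kind of numbers can be pulled from the file IO statistics SQL Server keeps per database file. A minimal sketch using sys.dm_io_virtual_file_stats:

-- Cumulative reads and writes per database since the server started
SELECT DB_NAME(vfs.database_id)      AS database_name,
       SUM(vfs.num_of_reads)         AS total_reads,
       SUM(vfs.num_of_writes)        AS total_writes,
       SUM(vfs.num_of_bytes_read)    AS bytes_read,
       SUM(vfs.num_of_bytes_written) AS bytes_written
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
GROUP BY vfs.database_id
ORDER BY total_reads DESC;
GO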

The second section of the report shows the details of the top 20 objects for each database based on IO. For simplicity's sake, I have expanded the AdventureWorks2012 database for reference here.

Expensive Queries section

The next logical section we can see in the report is around expensive queries. Each report shows the same data but logically sorted based on CPU, IO, reads, writes, etc. If we run Profiler, we can find the below query hitting our msdb database.

EXEC sp_executesql @stmt=N'exec msdb.MS_PerfDashboard.usp_QueryStatsTopN1 @OrderBy_Criteria',@params=N'@OrderBy_Criteria NVarChar(max)',@OrderBy_Criteria=N'CPU'

As mentioned above, the same query gets fired with a different sort order for each report. The top section shows a bar chart of the top CPU-consuming queries.

And as usual, the second section outlines the details. In each case the details of CPU, IO, reads, etc. are also shown. This is marked in the figure below.

On expanding each of the columns, we can also check the Total, Max, Min, and Avg for CPU, Duration, Physical Reads, Logical Writes, and so on.
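The totals and averages here are the kind of numbers you can also pull straight from sys.dm_exec_query_stats if the dashboard is not installed. A minimal sketch sorted by CPU (the column list is illustrative):

-- Top 10 statements by total CPU, with an average per execution
SELECT TOP (10)
       qs.execution_count,
       qs.total_worker_time AS total_cpu_microsec,
       qs.total_worker_time / qs.execution_count AS avg_cpu_microsec,
       qs.total_elapsed_time,
       qs.total_logical_reads,
       qs.total_logical_writes,
       SUBSTRING(st.text, (qs.statement_start_offset / 2) + 1,
                 ((CASE qs.statement_end_offset
                       WHEN -1 THEN DATALENGTH(st.text)
                       ELSE qs.statement_end_offset
                   END - qs.statement_start_offset) / 2) + 1) AS statement_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY qs.total_worker_time DESC;
GO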

Miscellaneous Reports

This contains a bunch of reports that act as an addendum to the already available reports. The one that is additional and worth a mention is the XEvent Sessions report.

Extended Events (XEvents) are a deeper topic to understand. You can get a primer from SQL SERVER – Introduction to Extended Events – Finding Long Running Queries. Fundamentally, the report shows the XEvent sessions actively running on the server at this moment in time. As you might have guessed by now, the report calls an SP from the msdb database to fill in the contents.

EXEC msdb.MS_PerfDashboard.usp_XEventSessions

At the end of the day, these stored procedures join various DMVs to get the output. In this case the report uses sys.dm_xe_sessions, sys.dm_xe_session_targets, and sys.dm_xe_session_event_actions to get its values. As we have said a number of times before, the power of DMVs is often underestimated when working with SQL Server.
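If you would rather query the DMVs yourself than run the report, a minimal sketch joining two of those DMVs looks like this:

-- Extended Events sessions currently running, with their targets
SELECT s.name AS session_name,
       s.create_time,
       t.target_name,
       t.execution_count
FROM sys.dm_xe_sessions AS s
INNER JOIN sys.dm_xe_session_targets AS t
    ON t.event_session_address = s.address;
GO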

Another very important link is about Active Traces. To demonstrate the power of this report, I have started a Profiler trace on the SQL instance. Trace ID 1 is the default trace, which runs with every default installation of SQL Server, and many standard reports depend on it.

Trace ID 2 in the report below has some warnings:

  • Rowset trace: client/GUI traces can negatively impact server performance.
  • This trace is configured to capture one or more frequently occurring events or events which typically produce a large amount of trace data; tracing these events may negatively impact system performance. The expensive events are highlighted in yellow in the table below.

As we can see below, this report displays a warning for any trace that is configured to capture events which occur frequently or which typically produce a large amount of trace data. To prevent potential performance problems on the system, it is advisable to avoid capturing these events unless strictly required, and generally only on a non-production server when you are troubleshooting a particular issue. If we click on the (+) symbols, we can see the events captured by the trace, with the expensive events highlighted.

Here is the list of events which are “expensive” and would be highlighted in Yellow automatically.

Lock:Released
Lock:Acquired
SQL:StmtStarting
SQL:StmtCompleted
SP:StmtStarting
SP:StmtCompleted
Scan:Started
Scan:Stopped
TransactionLog
Showplan Text (Unencoded)
Showplan Text
Showplan All
Showplan Statistics Profile
Audit Statement Permission Event
Audit Schema Object Access Event
Showplan XML
Showplan XML Statistics Profile
Audit Database Object Access Event

Ideally, we should use server-side traces. I have seen a common misconception about client-side and server-side traces. People think that if they run the Profiler UI on the server itself, it is called a server-side trace, which is WRONG. A trace using profiler.exe is called a rowset-based trace or client-side trace, and it consumes more resources than tracing directly to a file. I would recommend my readers use a server-side trace writing directly to a fast disk to minimize the performance impact. We can use the Profiler user interface to configure the events and columns we want to capture and save those settings to a script by choosing Export – Script Trace Definition under the File menu.
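A quick way to see which traces on an instance are rowset (client-side) traces versus file-based server-side traces is to query sys.traces; a minimal sketch:

-- is_rowset = 1 indicates a Profiler-style rowset trace; path is populated for file traces
SELECT id,
       status,
       path,
       is_rowset,
       is_default,
       start_time
FROM sys.traces;
GO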

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL SERVER – Database Taking Long Time to Create – Notes from the Field #047

[Notes from Pinal]: There are a few things which confuse us when we encounter them the first time; however, there are also a few concepts which confuse us even though we face them multiple times. One such subject is a database taking a long time to create. Think about it: if you see a database which takes a long time to create, your natural reaction will be to wonder how long various operations will take with the same database. I asked Tim if he could write further on this topic and help people understand this complex subject in simple words.

Linchpin People are database coaches and wellness experts for a data driven world. In this 47th episode of the Notes from the Field series, database expert Tim Radney (partner at Linchpin People) explains this in very simple words. Please follow Tim on his Twitter handle at @tradney.


I recently came across a support question on a forum where the user was very concerned about the health of their environment because it was taking over 10 minutes to create a 150GB database. When I was reading the post from the user, two things immediately came to mind: first, that they could be creating the data file on a slow IO subsystem, and second, that they probably did not have Instant File Initialization turned on.

When data and log files are created in SQL Server, they are initialized to overwrite any existing data on disk. This is done by filling the files with zeros, a process commonly referred to as zeroing out the file.

In SQL Server there is a setting that allows for instant initialization of data files. This does have a security consideration that must be understood; however, for most organizations it is not an issue. Basically, by not overwriting the disk space with zeros, it could potentially allow an unauthorized user to read data previously written to that part of the disk.

Instant file initialization is only available if the SQL Server (MSSQLSERVER) service account has been granted SE_MANAGE_VOLUME_NAME. Members of the Windows Administrator group have this right and can grant it to other users by adding them to the Perform Volume Maintenance Tasks security policy. For more information about assigning user rights, see the Windows documentation. (Reference)

In the case of the user on the forum, they were not using Instant File Initialization and decided to enable it. The user then created another database of the same size, and it was created in seconds, versus the same operation taking over 10 minutes before.
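If you want to verify the behavior on your own server, one well-known approach is to temporarily enable trace flags 3004 and 3605, create a throwaway database, and check the error log: when instant file initialization is working, only the log file shows up as being zeroed. A minimal sketch follows; the file paths and database name are assumptions, so adjust them for your environment.

DBCC TRACEON (3004, 3605, -1); -- log zero-initialization activity to the error log
GO
CREATE DATABASE IFI_Test
ON PRIMARY (NAME = IFI_Test_Data, FILENAME = 'D:\Data\IFI_Test.mdf', SIZE = 1GB)
LOG ON (NAME = IFI_Test_Log, FILENAME = 'D:\Data\IFI_Test.ldf', SIZE = 100MB);
GO
EXEC sys.xp_readerrorlog 0, 1, N'Zeroing'; -- look for the zeroing messages
GO
DBCC TRACEOFF (3004, 3605, -1);
DROP DATABASE IFI_Test;
GO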

I created a blog post and video a couple of years ago walking through the process of enabling this feature. You can view it here.

If you want me to take a look at your server and its settings, or if your server is facing any issue we can Fix Your SQL Server.

Note: Tim has also written an excellent book on SQL Backup and Recovery, a must have for everyone.

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL SERVER – Location of Natively Compiled Stored Procedure and Naming Convention

Yesterday I wrote about SQL SERVER – Beginning In-Memory OLTP with Sample Example. One of the questions I received right after I published the blog post was why we call the stored procedure a natively compiled stored procedure when the entire code is written in T-SQL. Indeed a very good question. The answer is very simple: we call it a natively compiled stored procedure because, as soon as we execute the CREATE statement, the compiler converts the interpreted T-SQL, query plans, and expressions into native code.

You can execute the following query in your SSMS and find out the location of the natively compiled stored procedure.

SELECT name,
description
FROM   sys.dm_os_loaded_modules
WHERE description = 'XTP Native DLL'
GO

To see this DMV in action execute the code from this blog post on your SQL Server.

-- Create database
CREATE DATABASE InMemory
ON PRIMARY(NAME = InMemoryData,
FILENAME = 'd:\data\InMemoryData.mdf', size=200MB),
-- Memory Optimized Data
FILEGROUP [InMem_FG] CONTAINS MEMORY_OPTIMIZED_DATA(
NAME = [InMemory_InMem_dir],
FILENAME = 'd:\data\InMemory_InMem_dir')
LOG ON (name = [InMem_demo_log], Filename='d:\data\InMemory.ldf', size=100MB)
GO

-- Create table
USE InMemory
GO
-- Create a Memory Optimized Table
CREATE TABLE DummyTable_Mem (ID INT NOT NULL,
Name VARCHAR(100) NOT NULL
CONSTRAINT ID_Clust_DummyTable_Mem PRIMARY KEY NONCLUSTERED HASH (ID) WITH (BUCKET_COUNT=1000000))
WITH (MEMORY_OPTIMIZED=ON)
GO

-- Create stored procedure
-- Inserting same 100,000 rows using InMemory Table
CREATE PROCEDURE ImMemory_Insert_test
WITH NATIVE_COMPILATION, SCHEMABINDING,EXECUTE AS OWNER
AS
BEGIN
ATOMIC WITH (TRANSACTION ISOLATION LEVEL=SNAPSHOT, LANGUAGE='english')
DECLARE @counter AS INT = 1
DECLARE @start DATETIME
SELECT
@start = GETDATE()
WHILE (@counter <= 100000)
BEGIN
INSERT INTO
dbo.DummyTable_Mem VALUES(@counter, 'SQLAuthority')
SET @counter = @counter + 1
END
SELECT
DATEDIFF(SECOND, @start, GETDATE() ) [InMemory_Insert in sec]
END
GO

Now let us execute our script as described.

Now we can see in the result that there are two different DLL files. In the image above I have explained the various parts of the DLL file name.

As per the image, our database id is 11, and if we check, it is the same as the database we created a few seconds ago. Similarly, the object id embedded in the name can be mapped back to the object as well.
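If you would like to cross-check those numbers yourself, the database id and object id embedded in the DLL file name can be compared against the values returned by DB_ID() and OBJECT_ID(). This is just a minimal sketch based on the objects created in this example:

USE InMemory
GO
-- Values to compare against the ids embedded in the native DLL file names
SELECT DB_ID() AS database_id,
       OBJECT_ID(N'dbo.ImMemory_Insert_test') AS procedure_object_id,
       OBJECT_ID(N'dbo.DummyTable_Mem') AS table_object_id
GO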

If we open up the folder where these objects were created, we will see two sets of files: one for the stored procedure and one for the table.

My friend Balmukund explains this concept very well on his blog over here.

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL SERVER – Beginning In-Memory OLTP with Sample Example

In-Memory OLTP is a wonderful new feature introduced in SQL Server 2014. My friend Balmukund Lakhani has written an amazing A-Z of In-Memory series on his blog. All serious learners should study it for a deep understanding of the subject. I will try to cover a few of the concepts in simpler words, and you may often find me referring to Balmukund's site on this subject.

Why do we need In-Memory?

Here is the paragraph from Balmukund’s blog (published with approval):

Looking at the market trends of tumbling cost of RAM (USD/MB) and performance implication of reading data from memory vs disk, its evident that people would love to keep the data in memory. With this evolution in hardware industry, softwares have to be evolved and modified so that they can take advantage and scale as much as possible. On the other hand, businesses also don’t want to compromise the durability of data – restart would clear RAM, but data should be back in the same state as it was before the failure. To meet hardware trends and durability requirements, SQL Server 2014 has introduced In-Memory OLTP which would solve them in a unique manner.

Before we start on the subject, let us see a few of the reasons why you would want to go for high-performance, memory-optimized OLTP operations.

  • It naturally integrates with SQL Server relational database
  • It supports Full ACID properties
  • It helps with non-blocking multi-version optimistic concurrency control, in other words, no locks or latches

Well, let us start with a working example. In this example, we will learn a few things – please pay attention to the details.

  1. We will create a database with a file group which will contain memory optimized data
  2. We will create a table with the MEMORY_OPTIMIZED setting enabled
  3. We will create a stored procedure which is natively compiled

The procedure for our test is very simple. We will create two stored procedures: 1) a regular stored procedure and 2) a natively compiled stored procedure. We will compare the performance of both SPs and see which one performs better.

Let’s Start!

Step 1: Create a database with a filegroup that contains MEMORY_OPTIMIZED_DATA

CREATE DATABASE InMemory
ON PRIMARY(NAME = InMemoryData,
FILENAME = 'd:\data\InMemoryData.mdf', size=200MB),
-- Memory Optimized Data
FILEGROUP [InMem_FG] CONTAINS MEMORY_OPTIMIZED_DATA(
NAME = [InMemory_InMem_dir],
FILENAME = 'd:\data\InMemory_InMem_dir')
LOG ON (name = [InMem_demo_log], Filename='d:\data\InMemory.ldf', size=100MB)
GO

Step 2: Create two different tables 1) Regular table and 2) Memory Optimized table

USE InMemory
GO
-- Create a Simple Table
CREATE TABLE DummyTable (ID INT NOT NULL PRIMARY KEY,
Name VARCHAR(100) NOT NULL)
GO
-- Create a Memory Optimized Table
CREATE TABLE DummyTable_Mem (ID INT NOT NULL,
Name VARCHAR(100) NOT NULL
CONSTRAINT ID_Clust_DummyTable_Mem PRIMARY KEY NONCLUSTERED HASH (ID) WITH (BUCKET_COUNT=1000000))
WITH (MEMORY_OPTIMIZED=ON)
GO

Step 3: Create two stored procedures 1) Regular SP and 2) Natively Compiled SP

Stored Procedure – Simple Insert
-- Simple table to insert 100,000 Rows
CREATE PROCEDURE Simple_Insert_test
AS
BEGIN
SET NOCOUNT ON
DECLARE
@counter AS INT = 1
DECLARE @start DATETIME
SELECT
@start = GETDATE()
WHILE (@counter <= 100000)
BEGIN
INSERT INTO
DummyTable VALUES(@counter, 'SQLAuthority')
SET @counter = @counter + 1
END
SELECT
DATEDIFF(SECOND, @start, GETDATE() ) [Simple_Insert in sec]
END
GO

Stored Procedure – InMemory Insert
-- Inserting same 100,000 rows using InMemory Table
CREATE PROCEDURE ImMemory_Insert_test
WITH NATIVE_COMPILATION, SCHEMABINDING,EXECUTE AS OWNER
AS
BEGIN
ATOMIC WITH (TRANSACTION ISOLATION LEVEL=SNAPSHOT, LANGUAGE='english')
DECLARE @counter AS INT = 1
DECLARE @start DATETIME
SELECT
@start = GETDATE()
WHILE (@counter <= 100000)
BEGIN
INSERT INTO
dbo.DummyTable_Mem VALUES(@counter, 'SQLAuthority')
SET @counter = @counter + 1
END
SELECT
DATEDIFF(SECOND, @start, GETDATE() ) [InMemory_Insert in sec]
END
GO

Step 4: Compare the performance of two SPs

Both of the stored procedures measure and print the time taken to execute them. Let us execute them and compare the timings.

-- Running the test for Insert
EXEC Simple_Insert_test
GO
EXEC ImMemory_Insert_test
GO

Here is the time taken by Simple Insert: 12 seconds

Here is the time taken by InMemory Insert: Nearly 0 seconds (less than 1 second)

Step 5: Clean up!

-- Clean up
USE MASTER
GO
DROP DATABASE InMemory
GO

Analysis of Result

It is very clear that In-Memory OLTP improves the performance of the query and stored procedure. To implement In-Memory OLTP there are a few steps the user has to follow with regard to filegroup and table creation. However, the end result is much better in the case of the In-Memory OLTP setup.

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL SERVER – Video Introduction to Delayed Durability – SQL in Sixty Seconds #074

Earlier I wrote a blog post based on my latest Pluralsight course on learning SQL Server 2014. SQL Server 2014 has introduced many new features, and one of the performance-related ones is Delayed Transaction Durability. This concept is indeed very interesting. To explain this feature in detail, we first need to understand what Full Transaction Durability is. The current default in SQL Server is Full Transaction Durability. A common question I often receive is why we would move away from full durability to delayed durability. The answer is – prioritizing performance over durability.

I hope my earlier blog post clearly explained how delayed durability works when executing a query. If not, I suggest you watch the following quick video where I explain this concept in extremely simple words.

You can download the code used in this video from Simple Example of Delayed Durability.
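For reference, here is a minimal sketch of the syntax involved; the database and table names are placeholders only. Delayed durability is first allowed (or forced) at the database level, and can then be requested for an individual transaction at commit time.

-- Allow delayed durable commits in the database (FORCED would apply it to all commits)
ALTER DATABASE DelayedDurabilityDemo
SET DELAYED_DURABILITY = ALLOWED
GO
-- Opt in for a particular transaction when committing
BEGIN TRANSACTION
INSERT INTO dbo.DemoTable (ID, Name) VALUES (1, 'SQLAuthority')
COMMIT TRANSACTION WITH (DELAYED_DURABILITY = ON)
GO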

Action Item

Here are the blog posts I have previously written on this subject. You can read them over here:

You can subscribe to my YouTube Channel for frequent updates.

Reference: Pinal Dave (http://blog.sqlauthority.com)