SQL SERVER – Practical Tips to Reduce SQL Server Database Table Size – Experts Opinion

I am one of those fortunate people who has friends everywhere in the world. Syrovatchenko Sergey is one of my good friends who is an expert on SQL Server and works at Devart. When we met a few days ago at one of the events, we ended up discussing practical tips to reduce database table size. Upon returning home, I found Sergey had sent me an amazing article on this subject. It is a beautifully written article and I enjoyed it very much. Let us read what Sergey has to say about this subject in his own words.

In this article, I'd like to elaborate on the subject of boosting productivity while working with database tables. Chances are you already know some of this from the many resources on the database development process.


However, the topic becomes a front-burner issue when data grows continuously: tables become so large that performance suffers.

It happens due to an ill-designed database schema that was not originally designed for handling large volumes of data.

To avoid the performance loss in the context of continuous data growth, you should stick to certain rules when designing a database schema.

Rule # 1 — Minimum Redundancy of Data Types

The fundamental unit of SQL Server data storage is the page. The disk space intended for a data file in a database is logically divided into pages numbered contiguously from 0 to n. In SQL Server, the page size is 8 KB. This means SQL Server databases have 128 pages per megabyte.

Disk I/O operations are performed at the page level. That is, SQL Server reads or writes whole data pages. The more compact the data types used, the fewer pages are required to store the data, and as a result, fewer I/O operations are needed.

The buffer pool significantly improves I/O throughput. Its primary purpose is to reduce database file I/O and improve the response time for data retrieval.

Thus, when compact data types are used, the buffer pool can hold more data in the same number of pages. As a result, you do not waste memory and you reduce the number of logical operations.

Consider the following example — a table that stores working days of employees.

CREATE TABLE dbo.WorkOut1 (
      EmployeeID BIGINT
    , WorkShiftCD NVARCHAR(10)
    , WorkHours DECIMAL(24,2)
)

Are the selected data types correct? The most probable answer is no. It is unlikely that an enterprise has (2^63-1) employees. Therefore, the BIGINT is an unsuitable data type in this case.

We can remove this redundancy and estimate the query execution time.

CREATE TABLE dbo.WorkOut2 (
      EmployeeID INT
    , WorkShiftCD VARCHAR(10)
    , WorkHours DECIMAL(8,2)
)

The following execution plan demonstrates the cost difference which depends on a row size and expected number of rows.


The less data you need to retrieve, the faster the query will run.

(3492294 row(s) affected)
SQL Server Execution Times:
CPU time = 1919 ms, elapsed time = 33606 ms.
(3492294 row(s) affected)
SQL Server Execution Times:
CPU time = 1420 ms, elapsed time = 29694 ms.

As you can see, using non-redundant data types is a keystone of good query performance, and it also reduces the size of problem tables. By the way, you can execute the following query to measure a table's size:

SELECT
      table_name = SCHEMA_NAME(o.[schema_id]) + '.' + o.name
    , data_size_mb = CAST(do.pages * 8. / 1024 AS DECIMAL(8,4))
FROM sys.objects o
JOIN (
    SELECT
          p.[object_id]
        , total_rows = SUM(p.[rows])
        , total_pages = SUM(a.total_pages)
        , usedpages = SUM(a.used_pages)
        , pages = SUM(
            CASE
                WHEN it.internal_type IN (202, 204, 207, 211, 212, 213, 214, 215, 216, 221, 222) THEN 0
                WHEN a.[type] != 1 AND p.index_id < 2 THEN a.used_pages
                WHEN p.index_id < 2 THEN a.data_pages
                ELSE 0
            END)
    FROM sys.partitions p
    JOIN sys.allocation_units a ON p.[partition_id] = a.container_id
    LEFT JOIN sys.internal_tables it ON p.[object_id] = it.[object_id]
    GROUP BY p.[object_id]
) do ON o.[object_id] = do.[object_id]
WHERE o.[type] = 'U'

For the above-considered tables, the query returns the following results:

table_name           data_size_mb
——————– ——————————-
dbo.WorkOut1         167.2578
dbo.WorkOut2         97.1250

Rule # 2 — Use Database Normalization and Avoid Data Duplication

Recently, I analyzed the database of a free web service that formats T-SQL code. The server part is quite simple and consists of a single table:

CREATE TABLE dbo.format_history (
      session_id BIGINT
    , format_date DATETIME
    , format_options XML
)

Every time SQL code was formatted, the following parameters were saved to the database: the current session ID, the server time, and the settings that were applied while formatting the user's SQL code.

This data was subsequently used to determine the most popular formatting styles, which were planned to be added to the default formatting styles of SQL Complete.

However, as the service grew in popularity, the number of rows increased significantly, and processing the profiles became slow. The settings had the following XML structure:



<PropertyValue Name="Select_SelectList_IndentColumnList">true</PropertyValue>
<PropertyValue Name="Select_SelectList_SingleLineColumns">false</PropertyValue>
<PropertyValue Name="Select_SelectList_StackColumns">true</PropertyValue>
<PropertyValue Name="Select_SelectList_StackColumnsMode">1</PropertyValue>
<PropertyValue Name="Select_Into_LineBreakBeforeInto">true</PropertyValue>
...
<PropertyValue Name="UnionExceptIntersect_LineBreakBeforeUnion">true</PropertyValue>
<PropertyValue Name="UnionExceptIntersect_LineBreakAfterUnion">true</PropertyValue>
<PropertyValue Name="UnionExceptIntersect_IndentKeyword">true</PropertyValue>
<PropertyValue Name="UnionExceptIntersect_IndentSubquery">false</PropertyValue>
...




There are 450 formatting options in total. Each row takes up 33 KB in the table, and the daily data growth exceeds 100 MB. As an obvious outcome, the database has been expanding day by day, making data analysis ever more complicated.

Surprisingly, the solution turned out to be quite simple: all unique profiles were placed into a separate table, where a hash was defined for every set of options. As of SQL Server 2008, you can use the sys.fn_repl_hash_binary function for this.

So the DB schema has been normalized:

CREATE TABLE dbo.format_profile (
      format_hash BINARY(16) PRIMARY KEY
    , format_profile XML NOT NULL
)

CREATE TABLE dbo.format_history (
      session_id BIGINT
    , format_date SMALLDATETIME
    , format_hash BINARY(16) NOT NULL
    , CONSTRAINT PK_format_history PRIMARY KEY CLUSTERED (session_id, format_date)
)
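
A sketch of how the unique profiles might have been back-filled from the old database (the approach is assumed; ROW_NUMBER is used only to pick one XML document per hash, since XML columns cannot be compared directly):

INSERT INTO SQLF_v2.dbo.format_profile (format_hash, format_profile)
SELECT t.format_hash, t.format_options
FROM (
    SELECT
          format_hash = sys.fn_repl_hash_binary(CAST(fh.format_options AS VARBINARY(MAX)))
        , fh.format_options
        , rn = ROW_NUMBER() OVER (
              PARTITION BY sys.fn_repl_hash_binary(CAST(fh.format_options AS VARBINARY(MAX)))
              ORDER BY (SELECT NULL))
    FROM SQLF.dbo.format_history fh
) t
WHERE t.rn = 1;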

And if the previous query looked like this:

SELECT fh.session_id, fh.format_date, fh.format_options
FROM SQLF.dbo.format_history fh

The new schema required the JOIN usage to retrieve the same data:

SELECT fh.session_id, fh.format_date, fp.format_profile
FROM SQLF_v2.dbo.format_history fh
JOIN SQLF_v2.dbo.format_profile fp ON fh.format_hash = fp.format_hash

If we compare the execution time for two queries, we can hardly see the advantages of the schema changes.

(3090 row(s) affected)
SQL Server Execution Times:
CPU time = 203 ms, elapsed time = 4698 ms.
(3090 row(s) affected)
SQL Server Execution Times:
CPU time = 125 ms, elapsed time = 4479 ms.

But in this case, the goal was to decrease the time needed for analysis. Before, we had to write an intricate query to get the list of popular formatting profiles:

;WITH cte AS (
    SELECT
          fh.format_options
        , hsh = sys.fn_repl_hash_binary(CAST(fh.format_options AS VARBINARY(MAX)))
        , rn = ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
    FROM SQLF.dbo.format_history fh
)
SELECT c2.format_options, c1.cnt
FROM (
    SELECT TOP (10) hsh, rn = MIN(rn), cnt = COUNT(1)
    FROM cte
    GROUP BY hsh
    ORDER BY cnt DESC
) c1
JOIN cte c2 ON c1.rn = c2.rn

Now due to the data normalization, we managed to simplify the query:

SELECT fp.format_profile
     , t.cnt
FROM (
    SELECT TOP (10)
          fh.format_hash
        , cnt = COUNT(1)
    FROM SQLF_v2.dbo.format_history fh
    GROUP BY fh.format_hash
    ORDER BY cnt DESC
) t
JOIN SQLF_v2.dbo.format_profile fp ON t.format_hash = fp.format_hash

As well as to decrease the query execution time:

(10 row(s) affected)
SQL Server Execution Times:
CPU time = 2684 ms, elapsed time = 2774 ms.
(10 row(s) affected)
SQL Server Execution Times:
CPU time = 15 ms, elapsed time = 379 ms.

In addition, the database size has decreased:

database_name    row_size_mb
—————- —————
SQLF             123.50
SQLF_v2          7.88

To retrieve a file size, the following query can be used:

SELECT
      database_name = DB_NAME(database_id)
    , row_size_mb = CAST(SUM(CASE WHEN type_desc = 'ROWS' THEN size END) * 8. / 1024 AS DECIMAL(8,2))
FROM sys.master_files
WHERE database_id IN (DB_ID('SQLF'), DB_ID('SQLF_v2'))
GROUP BY database_id

Hope I managed to demonstrate the importance of data normalization.

Rule # 3 — Be careful while selecting indexed columns.

An index is an on-disk structure associated with a table or view that speeds up the retrieval of rows. Indexes are stored on pages, so the fewer pages required to store an index, the faster the search is. It is extremely important to be careful while selecting clustered index columns, because all the clustered index columns are included in every non-clustered index. Because of this, the database size can increase dramatically.
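
As a quick illustration (the tables and indexes here are hypothetical), compare a wide clustered key with a narrow one; every non-clustered index silently carries the clustered key in each of its rows:

-- 16-byte clustered key: copied into every row of every non-clustered index on the table
CREATE TABLE dbo.OrderWide (
    OrderGUID UNIQUEIDENTIFIER NOT NULL PRIMARY KEY CLUSTERED,
    CustomerName NVARCHAR(100) NOT NULL
);
CREATE NONCLUSTERED INDEX IX_OrderWide_CustomerName ON dbo.OrderWide (CustomerName);

-- 4-byte clustered key: the same non-clustered index carries far less overhead per row
CREATE TABLE dbo.OrderNarrow (
    OrderID INT NOT NULL PRIMARY KEY CLUSTERED,
    CustomerName NVARCHAR(100) NOT NULL
);
CREATE NONCLUSTERED INDEX IX_OrderNarrow_CustomerName ON dbo.OrderNarrow (CustomerName);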

Rule # 4 — Use Consolidated Tables.

You do not need to execute a complex query on a large table. Instead, you can execute a simple query on a small table.

For instance, we have the following consolidation query

SELECT *   -- the original select list was truncated; * is used here for illustration
FROM dbo.WorkOutFactor
WHERE Value > 0

If the table data does not need to change often, we can create a separate, consolidated table (a sketch of how it might be maintained appears after the timings below):

SELECT *
FROM dbo.WorkOutFactorCache

Data retrieval from such a consolidated table is much faster:

(185916 row(s) affected)
SQL Server Execution Times:
CPU time = 3448 ms, elapsed time = 3116 ms.
(185916 row(s) affected)
SQL Server Execution Times:
CPU time = 1410 ms, elapsed time = 1202 ms.
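
A minimal sketch of how such a consolidated (cache) table might be built and refreshed; the grouping column and aggregate are assumed, since the original select list is not shown above:

-- Rebuild the cache table from the base table; schedule this as often as the data actually changes.
IF OBJECT_ID('dbo.WorkOutFactorCache') IS NOT NULL
    DROP TABLE dbo.WorkOutFactorCache;

SELECT EmployeeID                 -- assumed grouping column
     , Value = SUM(Value)         -- assumed aggregate
INTO dbo.WorkOutFactorCache
FROM dbo.WorkOutFactor
WHERE Value > 0
GROUP BY EmployeeID;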

Rule # 5 — Every rule has an exception

I have shown a couple of examples that demonstrate how eliminating redundant data types shortens query execution time. But it does not always work out that way.

For instance, the BIT data type has a peculiarity: SQL Server optimizes the storage of groups of such columns on disk. If a table contains 8 or fewer BIT columns, they are stored in the page as 1 byte; 16 BIT columns are stored as 2 bytes, and so on. The good news is that the table takes up little space, which reduces disk I/O.

The bad news is that an implicit decoding will take place while retrieving data,  and the process is very demanding in terms of CPU resources.

Here is an example. Assume we have 3 identical tables containing information about employees' work schedules (31 day columns plus 2 PK columns). The only difference between the tables is the data type used for the consolidated values (1 – presence, 0 – absence), as sketched below.
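
A minimal sketch of what the three tables might look like (the 31 day columns are abbreviated and the two PK columns are assumed):

CREATE TABLE dbo.E31_INT (
    EmployeeID INT NOT NULL,
    WorkMonth DATE NOT NULL,
    Day01 INT NULL, Day02 INT NULL, /* ... Day03 through Day30 ... */ Day31 INT NULL,
    CONSTRAINT PK_E31_INT PRIMARY KEY (EmployeeID, WorkMonth)
);
CREATE TABLE dbo.E32_TINYINT (
    EmployeeID INT NOT NULL,
    WorkMonth DATE NOT NULL,
    Day01 TINYINT NULL, Day02 TINYINT NULL, /* ... */ Day31 TINYINT NULL,
    CONSTRAINT PK_E32_TINYINT PRIMARY KEY (EmployeeID, WorkMonth)
);
CREATE TABLE dbo.E33_BIT (
    EmployeeID INT NOT NULL,
    WorkMonth DATE NOT NULL,
    Day01 BIT NULL, Day02 BIT NULL, /* ... */ Day31 BIT NULL,
    CONSTRAINT PK_E33_BIT PRIMARY KEY (EmployeeID, WorkMonth)
);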


When using less redundant data types, the table size decreases considerably (especially the last table)

table_name           data_size_mb
——————– ————–
dbo.E31_INT          150.2578
dbo.E32_TINYINT      50.4141
dbo.E33_BIT          24.1953

However, there is no significant speed gain from using the BIT type

(1000000 row(s) affected)
Table ‘E31_INT’. Scan count 1, logical reads 19296, physical reads 1, read-ahead reads 19260, …
SQL Server Execution Times:
CPU time = 1607 ms,  elapsed time = 19962 ms.
(1000000 row(s) affected)
Table ‘E32_TINYINT’. Scan count 1, logical reads 6471, physical reads 1, read-ahead reads 6477, …
SQL Server Execution Times:
CPU time = 1029 ms,  elapsed time = 16533 ms.
(1000000 row(s) affected)
Table ‘E33_BIT’. Scan count 1, logical reads 3109, physical reads 1, read-ahead reads 3096, …
SQL Server Execution Times:
CPU time = 1820 ms,  elapsed time = 17121 ms.

But the execution plan will show the opposite.


So the negative effect of the decoding does not appear if a table contains fewer than 8 BIT columns. It is worth noting that the BIT data type is hardly ever used in SQL Server metadata; more often the BINARY data type is used, although it requires manual manipulation to obtain specific values.
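
For illustration, here is the kind of manual manipulation that packing flags into an integer or BINARY value requires (the variable and flag layout are hypothetical):

-- 31 daily presence flags packed into a single INT value
DECLARE @DaysBitmap INT = 0x0000001F;   -- days 1 through 5 marked as present

-- read the flag for day 3 (1 = presence, 0 = absence)
SELECT Day03 = CASE WHEN @DaysBitmap & POWER(2, 3 - 1) > 0 THEN 1 ELSE 0 END;

-- set the flag for day 7
SET @DaysBitmap = @DaysBitmap | POWER(2, 7 - 1);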

Rule # 6 — Delete data that is no longer required.

SQL Server supports a performance optimization mechanism called read-ahead. This mechanism anticipates the data and index pages needed to fulfill a query execution plan and brings those pages into the buffer cache before they are actually used by the query.

So if the table contains a lot of needless data, it may lead to unnecessary disk I/O. Besides, getting rid of needless data allows you to reduce the number of logical operations while reading from the Buffer Pool.
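
A small sketch of the kind of cleanup this implies, reusing the format_history table from above (the retention rule is assumed; deleting in batches keeps the transaction log and blocking under control):

-- Remove formatting history older than two years, 10,000 rows at a time
WHILE 1 = 1
BEGIN
    DELETE TOP (10000)
    FROM dbo.format_history
    WHERE format_date < DATEADD(YEAR, -2, GETDATE());

    IF @@ROWCOUNT = 0 BREAK;
END;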

In conclusion, my advice is to be extremely careful while selecting data types for columns and try predicting future data loads.

Reference: Pinal Dave (http://blog.sqlauthority.com)

Interview Question of the Week #047 – Logic To Find Weekdays Between Two Dates

I often enjoy interview questions where the candidate has to write SQL scripts. Here is another such example.

Question: How to find weekdays between two dates?

Answer: Here is a function which can be used to find weekdays between two dates.

CREATE FUNCTION dbo.getDayCount (@startdate DATE, @enddate DATE)
RETURNS INT
AS
BEGIN
    DECLARE @count INT = 0;
    WHILE @startdate <= @enddate
    BEGIN
        IF DATEPART(dw, @startdate) > 1 AND DATEPART(dw, @startdate) < 7
            SET @count = @count + 1;
        SET @startdate = DATEADD(DAY, 1, @startdate);
    END
    RETURN @count;
END
GO
-- Execute Function
SELECT dbo.getDayCount('10/10/2015', '11/10/2015')

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL SERVER 2016 – Comparing Execution Plans

New tools bring new capabilities, and sometimes these are ingrained so deep into the product that we can easily miss them. SQL Server 2016 is no different. After installing the CTP, I was pleasantly surprised to see this addition to help DBAs who are doing performance tuning and query tuning activities.

So what am I talking about? Think of a scenario wherein you are doing some sort of testing in your QA environment. You are troubleshooting a production problem and find that the very same scenario works very well in your test environment. Though you took the data from the production database to the test QA server less than a month back, you are still baffled by the question: why this difference?

The next thing everyone wants to do is to compare the execution plans between the production and test environments for the same query. Some execution plans on a production server can run to many pages and are difficult to compare visually.

SQL Server 2016 – Plan comparison

If you have the execution plan from your production environment, then go ahead and open it in SQL Server 2016 Management Studio and right click to get the following option:


You can see an interesting addition: the "Compare Showplan" option. Select it, and then select the plan that was generated from your test environment. This will bring both execution plans into a side-by-side view as shown below. Look at the purple color on the nodes.


This indicates that the nodes are the same but differ in some properties. Now it is up to us to right-click and bring up the Properties tab.


Here we can go ahead and compare each of the node properties to find where the differences between the plans are. This is an awesome capability added in the new version.

I am sure you will find this feature of great use in query tuning exercises in your environments. Having said that, one interesting thing I have seen is that if you take a live "Actual Execution Plan" and then try to compare it with a saved .sqlplan file, it will come up with an error as shown below:

Do let me know if you would use this feature, and if you have used it already, what are some of the enhancements you wish this comparison capability had? I would like to learn from you for sure.

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL SERVER – Installing and Getting Started With Semantic Search

After I wrote about the basics in the blog SQL SERVER – What is Semantics Search Inside SQL Server?, I had multiple requests to help people understand and get started with Semantic Search. So I thought I would lay out the building blocks of working with the same. Where do we start? Well, here is something for you:

Similar to earlier versions of SQL Server, to use semantic search you must select "Full-Text and Semantic Extractions for Search" on the "Features to Install" page during SQL Server setup. They are installed together as one feature.

You can run the following command to determine if full-text and semantic search are installed. A return value of 1 indicates that these components are installed; a return value of 0 indicates that they are not installed.
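
A minimal check along these lines should work:

SELECT SERVERPROPERTY('IsFullTextInstalled') AS IsFullTextInstalled;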


Language Statistical Database

In addition to installing the full-text and semantic search feature, statistical semantic search requires a language statistical database. This installs separately – I will outline the steps below.

This database contains the statistical language models required by semantic search. A single semantic language statistics database contains the language models for all the languages that are supported for semantic indexing.

The language used for the full-text index on a column determines the statistical language model used for semantic indexing. When you create a semantic index on a column, the index creation fails if you specify a language for which the language model is not available.

Note: Fewer languages are supported for semantic indexing than for full-text indexing. As a result, there may be columns that support full-text indexing, but that do not support semantic indexing.

Installing the Language Statistical Database

  • Locate the Windows installer package named SemanticLanguageDatabase.msi on the SQL Server installation media.
  • Run the SemanticLanguageDatabase.msi Windows installer package to extract the database and log file.
  • Move the extracted database file and log file to a suitable location in the file system. If you leave the files in their default location, it will not be possible to extract another copy of the database for another instance of SQL Server.
  • Note: When the semantic language statistics database is extracted, restricted permissions are assigned to the database file and log file in the default location in the file system. As a result, you may not have permission to attach the database if you leave it in the default location. If an error is raised when you try to attach the database, move the files, or check and fix file system permissions as appropriate.
  • Attach the database using the following command:

CREATE DATABASE semanticsdb
    ON (FILENAME = 'C:\Microsoft Semantic Language Database\semanticsdb.mdf')
    LOG ON (FILENAME = 'C:\Microsoft Semantic Language Database\semanticsdb_log.ldf')
    FOR ATTACH;

Registering the Language Statistical Database

After attaching the Language Statistical Database, you must register it so SQL Server knows to use it as the Language Statistical Database:

EXEC sp_fulltext_semantic_register_language_statistics_db @dbname = N'semanticsdb';

Verify the Language Statistical Database is installed and Registered

The following query returns a single row of information if the semantic language statistics database is installed and registered for the instance.

SELECT DB_NAME(database_id) DBName, * FROM sys.fulltext_semantic_language_statistics_database;


In subsequent blogs, I will take a couple of examples to illustrate how these features can be used in our applications.

Reference : Pinal Dave (http://blog.SQLAuthority.com)

SQL SERVER – Availability Group and the Six Critical Steps for DBAs to Follow – Notes from the Field #104


Mike’s, “now what am I supposed to do?”, face.

[Note from Pinal]: In this episode of the Notes from the Field series, database expert Mike Lawell explains Availability Groups and the six critical steps for DBAs to follow. A few weeks ago, I asked a question at a conference: how many people know about availability groups? Only a handful of people raised their hands. I was expecting this, as many have heard about AlwaysOn and Availability Groups but not everyone has implemented them. Most of the time, there are no clear guidelines for doing so. In this blog post Mike tells an amazing story about the six critical steps every DBA must know about Availability Groups. He makes this simple concept very easy. Read the experience of Mike in his own words.

Your boss has asked you to implement SQL Server AlwaysOn Availability Groups in your environment. Exciting as it may be, you're an accidental DBA with little to no experience implementing any high availability solution, let alone a fairly new one like availability groups.

The first thing you say is, fantastic! Then, hmmm, fantastic. You realize this is above your head, but it is a great opportunity to learn.

This happened to one of my clients, where the DBA barely had SQL Server Database Mirroring experience. The product had a required uptime of 99.999% (yeah, right) that they were having problems maintaining. They wanted to improve their availability and business continuity.

The DBA had a lot of anxiety with implementing AGs in this environment without experiencing considerable downtime.

Fortunately, they were able to engage our services to lead them through the process. We migrated from mirroring to AGs, with approximately a minute of downtime, onto a new set of four servers in a multi-subnet environment.

1.      Research

There are many environments where AGs will simply not work (or have a hidden disaster waiting). For instance, you should not use availability groups in SQL Server 2012 or 2014 if you're doing cross-database transactions or distributed transactions. If you have a failover during a distributed transaction, it is possible that your database will enter suspect mode (by the way, this changes in SQL Server 2016).

How many servers and replicas will you need? Which databases do you want in which AG? What might the licensing costs be? What should/can your quorum mode be? Do your applications even support AGs?

2.      Plan

Develop the high level plan to the best of your knowledge. Don’t worry about being wrong, you will miss something. Take the time to think of every application or service, etc. that will connect to the AG. Document all of the jobs, alerts, mail configuration, operators, everything you can think of (find a script to do it).

3.      Test the AG Build

Now that you have some idea of what is possible and you think you know what needs to be done, build an AG as a proof of concept in an isolated environment to test your theory of how it should work. Make sure you use the same drive letters as production. You will learn that you have missed important configurations, steps, etc. Tear the AG down and rebuild it until you are familiar with the steps, and write down every step you take. Do this until you're confident you have all of the steps assembled and you haven't missed anything. You will miss something, because most database environments are complex, and many DBAs don't know all of the services connecting to their databases.
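
For the proof-of-concept build itself, here is a minimal T-SQL sketch of the kind of script you might rehearse (server names, endpoints, and the database name are hypothetical; a real build also needs mirroring endpoints, a WSFC, and the databases restored on the secondary):

-- Hypothetical two-replica AG created on the primary; for the isolated test environment only.
CREATE AVAILABILITY GROUP TestAG
WITH (AUTOMATED_BACKUP_PREFERENCE = SECONDARY)
FOR DATABASE TestDB
REPLICA ON
    N'SQLNODE1' WITH (
        ENDPOINT_URL = N'TCP://SQLNODE1.test.local:5022',
        AVAILABILITY_MODE = SYNCHRONOUS_COMMIT,
        FAILOVER_MODE = AUTOMATIC),
    N'SQLNODE2' WITH (
        ENDPOINT_URL = N'TCP://SQLNODE2.test.local:5022',
        AVAILABILITY_MODE = SYNCHRONOUS_COMMIT,
        FAILOVER_MODE = AUTOMATIC);

-- On the secondary replica, the rehearsal continues with:
-- ALTER AVAILABILITY GROUP TestAG JOIN;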

4.      Script the Implementation

This part really helps in minimizing downtime. You don't want to be clicking through SSMS trying to get all of the configurations correct in production and risk additional downtime at failover, or worse.

I built the AGs ahead of time in the production environment and added a test database so I could test the failover between machines before go live.

The scripts were PowerShell scripts pulling the configuration from a csv file. I learned how to script PowerShell from the internet. Make sure your configuration file is tested. I created a spreadsheet with the important data (generated from a script run against production) about database names, data file locations, and more. Don't type the configurations if you can avoid it; this is where you get into trouble, especially in complex environments. I have some examples of the PowerShell AG and SQLCMD Log Shipping scripts I used on my blog at SQLServerAssociates.com

5.      Test the Scripts

This is the most important step as even the best plan must be tested. It must work repeatedly in your development environment without errors. Don’t compromise with this step. If you want to avoid extended downtime because you received an error that you’ll have to research to fix, test your scripts.

6.      Know your Environment

You must be able to talk confidently about AGs and defend your configuration to management. You will need to document your environment for management, operations, and other DBAs (including yourself). Use Visio diagrams for visuals.


Migrating your environment to AGs can be a great learning experience, but don't be afraid to engage a consultant to help you do it right. The justification is usually very easy when talking to management, once you equate downtime to lost revenue.

If you want to get started with SQL Server with the help of experts, read more over at Fix Your SQL Server.

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL SERVER – What Resource Wait Are We Seeing?

I have been a big supporter of the Activity Monitor capability inside SQL Server Management Studio. And I have written a number of articles before in this blog. Please take a moment to look at:

SQL SERVER – Activity Monitor to Identify Blocking – Find Expensive Queries

One of the readers wrote to me on this blog and asked me something simple and I was surprised I had not touched on it before. His mail was:

Hi Pinal,

I have seen your blog around Activity Monitor and as a budding DBA, I am still learning the tricks of the trade. I use many scripts from this site and they have helped me a number of times in my career.

In Activity monitor, I notice a large number of PAGELATCH waits. If I look at processes, I sometimes see a large number of processes waiting for PAGELATCH_SH and PAGELATCH_UP on resources 2:1:3 and sometimes on 2:1:1. This shows contention is placed on SGAM and PFS pages.

Do you have a script written somewhere to find what type of disk resource is under contention? After a number of web searches, I can see this is tempdb contention because the database id is 2.

I am sure you can help me somehow.


Blog reader

I personally didn't see much of a problem in his request, and I was not aware of such a script existing on this blog. So I went ahead and wrote one. This is a query based on sys.dm_exec_requests, sys.dm_exec_sessions and sys.dm_exec_query_plan.

SELECT es.session_id, DB_NAME(er.database_id) AS [database_name],
OBJECT_NAME(qp.objectid, qp.dbid) AS [object_name], -- NULL if Ad-Hoc or Prepared statements
    (SELECT CASE
                WHEN pageid = 1 OR pageid % 8088 = 0 THEN 'Is_PFS_Page'
                WHEN pageid = 2 OR pageid % 511232 = 0 THEN 'Is_GAM_Page'
                WHEN pageid = 3 OR (pageid - 1) % 511232 = 0 THEN 'Is_SGAM_Page'
                ELSE 'Is Not PFS, GAM or SGAM page' END
     FROM (SELECT CASE WHEN er.[wait_type] LIKE 'PAGE%LATCH%' AND er.[wait_resource] LIKE '%:%'
                       THEN CAST(RIGHT(er.[wait_resource], LEN(er.[wait_resource]) - CHARINDEX(':', er.[wait_resource], LEN(er.[wait_resource]) - CHARINDEX(':', REVERSE(er.[wait_resource])))) AS INT)
                       ELSE NULL END AS pageid) AS latch_pageid
    ) AS wait_resource_type,
    er.wait_time AS wait_time_ms,
    (SELECT qt.TEXT AS [text()]
     FROM sys.dm_exec_sql_text(er.sql_handle) AS qt
     FOR XML PATH(''), TYPE) AS [running_batch],
    (SELECT SUBSTRING(qt2.TEXT,
        (CASE WHEN er.statement_start_offset = 0 THEN 0 ELSE er.statement_start_offset/2 END),
        (CASE WHEN er.statement_end_offset = -1 THEN DATALENGTH(qt2.TEXT) ELSE er.statement_end_offset/2 END - (CASE WHEN er.statement_start_offset = 0 THEN 0 ELSE er.statement_start_offset/2 END))) AS [text()]
     FROM sys.dm_exec_sql_text(er.sql_handle) AS qt2
     FOR XML PATH(''), TYPE) AS [running_statement]
FROM sys.dm_exec_requests er
LEFT OUTER JOIN sys.dm_exec_sessions es ON er.session_id = es.session_id
CROSS APPLY sys.dm_exec_query_plan (er.plan_handle) qp
WHERE er.session_id <> @@SPID AND es.is_user_process = 1
ORDER BY er.total_elapsed_time DESC, er.logical_reads DESC, [database_name], session_id

The output will help you analyze the typical wait resource type, and if you are having tempdb contention it should show up here. As a matter of fact, if you have other contention on disk, that will show up too.

Do let me know if this script was helpful and I will be more than happy to extend the same. Or if you extended the script, please make sure to share the same via comments.

Reference : Pinal Dave (http://blog.SQLAuthority.com)

SQL SERVER – How to Create Linked Server to PostgreSQL?

The database world is expanding, and I have been fortunate enough to learn and share my experiences around databases like MySQL and PostgreSQL apart from working with SQL Server. I always try to expand my horizons and try various database products. In case you don't know, I have a course on Pluralsight about PostgreSQL.


It is always interesting to integrate various products and make them work seamlessly. Due to my knowledge of both database products, I have been asked one question very frequently.

How can I create linked servers in SQL connecting with Postgres?

Though this question looks simple and easy, I thought of writing a note to show you the actual steps to achieve it. This blog shows a simple example of creating a linked server. There is a provider called PGOLEDB which can be used for this purpose.

Create a Database and Table in PostgreSQL

  1. Create the database:
     CREATE DATABASE sqlauthority;   -- database name assumed from the linked server script below
  2. Once the database is created, change the connection, create a table and insert some data:
     CREATE TABLE MyTable (
         ID integer NOT NULL,
         Name varchar(128) NOT NULL
     );
     insert into MyTable values (1, 'Pinal Dave');
  3. Verify that we have data in the table:
     Select * from MyTable;


    Create a Linked Server in SQL Server

    1. Go to http://www.pgoledb.com and choose “Download” from menu bar.
    2. Go to the page and choose “PGNP OLEDB Providers for Postgres, Greenplum and Redshift” as shown below.
    3. Install it and then we should see the provider in SSMS
    4. Modify and run the below script in SQL Server. You need to change the server name, port, etc.
      -- Change parameters for the provider: AllowInProcess = true / DynamicParameters = true
      EXEC MASTER.dbo.sp_MSset_oledb_prop N'PGNP'  ,N'AllowInProcess',1
      EXEC MASTER.dbo.sp_MSset_oledb_prop N'PGNP'  ,N'DynamicParameters',1
      DECLARE @name NVARCHAR(4000);
      DECLARE @provider NVARCHAR(4000);
      DECLARE @servername NVARCHAR(4000);
      DECLARE @port NVARCHAR(4000);
      DECLARE @db_name NVARCHAR(4000)
      -- destination postgres database
      SET @name = N'SQLAuth_PG';
      SET @provider = N'PGNP';
      SET @servername = N'localhost';
      SET @port = 'PORT=5432;'
      SET @db_name = N'sqlauthority';
      -- create linked server
      EXEC MASTER.dbo.sp_addlinkedserver @server = @name
      ,@srvproduct = N'PGNP'
      ,@provider = N'PGNP'
      ,@datasrc = @servername
      ,@provstr = @port
      ,@catalog = @db_name
      -- username and password for postgres
      EXEC MASTER.dbo.sp_addlinkedsrvlogin @rmtsrvname = @name
      ,@useself = N'False'
      ,@locallogin = NULL
      ,@rmtuser = N'postgres'
      ,@rmtpassword = 'sa'
      -- set up Extended properties of the Linked Server
      EXEC MASTER.dbo.sp_serveroption @server = @name
      ,@optname = 'data access'
      ,@optvalue = 'true'
      EXEC MASTER.dbo.sp_serveroption @server = @name
      ,@optname = 'use remote collation'
      ,@optvalue = 'true'
      EXEC MASTER.dbo.sp_serveroption @server = @name
      ,@optname = 'rpc'
      ,@optvalue = 'true'
      EXEC MASTER.dbo.sp_serveroption @server = @name
      ,@optname = 'rpc out'
      ,@optvalue = 'true'
    5. Once completed, we should be able to see the linked server created as shown below.

    Test the Linked Server

    Here are the test queries which I have used based on earlier script.

    SELECT [id],[name]
    FROM   [SQLAuth_PG].[sqlauthority].[public].[mytable]
    SELECT *
    FROM OPENQUERY(SQLAuth_PG, 'select id, name from mytable');


    Have you ever created any other 3rd party linked server in your environments? What methods have you used? Will you be kind enough to share the same via comments?

    Reference : Pinal Dave (http://blog.SQLAuthority.com)

SQL SERVER – New features in SQL Server 2016 Setup Wizard

In this world of change, the SQL Server product is also changing. One thing that is very clear is that Microsoft is listening to feedback from customers and enhancing the setup experience as well. Some classic feedback items can be seen implemented in the setup of the next version, SQL Server 2016. In this blog post, I will try to bring out some of the enhancements that have been incorporated into the Setup UI. Some are critical and worth making a note of.

Here are a few things I noticed in the SQL Server 2016 setup user interface.

SQL Setup allows the user to add the Database Engine service SID to the "Perform Volume Maintenance Task" privilege under the security policy. This enables instant file initialization.


Here is the new parameter added in command line for SQL Setup.
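
/SQLSVCINSTANTFILEINIT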


This may lead to information disclosure as it could allow deleted content to be accessed by an unauthorized principal.

TempDB Configuration

In SQL Server 2016, we have the option to specify the TempDB file layout during setup itself. Most companies have a "post deployment script", and one of its steps is to optimize TempDB files. It is a tab called "TempDB" under "Database Engine Configuration".


Since UI has it, there is also a parameter in command line as below:

Parameter               Meaning
SQLTEMPDBFILECOUNT      The number of Database Engine TempDB data files.
SQLTEMPDBFILESIZE       Initial size of each Database Engine TempDB data file in MB.
SQLTEMPDBFILEGROWTH     Automatic growth increment of each Database Engine TempDB data file in MB.
SQLTEMPDBLOGFILESIZE    Initial size of the Database Engine TempDB log file in MB.
SQLTEMPDBLOGFILEGROWTH  Automatic growth increment of the Database Engine TempDB log file in MB.
SQLTEMPDBDIR            Directories for the Database Engine TempDB data files.
SQLTEMPDBLOGDIR         Directory for the Database Engine TempDB log file.
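
As a rough illustration (the values, paths, and instance details are made up, and several switches required for a real unattended install are omitted), the TempDB parameters might appear on a command line like this:

Setup.exe /Q /ACTION=Install /FEATURES=SQLEngine /INSTANCENAME=MSSQLSERVER /IACCEPTSQLSERVERLICENSETERMS /SQLTEMPDBFILECOUNT=4 /SQLTEMPDBFILESIZE=1024 /SQLTEMPDBFILEGROWTH=512 /SQLTEMPDBLOGFILESIZE=512 /SQLTEMPDBLOGFILEGROWTH=512 /SQLTEMPDBDIR="T:\TempDB" /SQLTEMPDBLOGDIR="L:\TempDBLog"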

Hopefully, we will see more customer feedback embedded in the product by default.

Reference : Pinal Dave (http://blog.SQLAuthority.com)

Interview Question of the Week #046 – How @@DATEFIRST and SET DATEFIRST Are Related?

Question: How @@DATEFIRST and SET DATEFIRST are related?

Answer: The master database's syslanguages table has a DateFirst column that defines the first day of the week for a particular language. With US English as the default language, SQL Server sets DATEFIRST to 7 (Sunday) by default. We can set any day as the first day of the week using SET DATEFIRST:
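
SET DATEFIRST 5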


This will set Friday as the first day of the week.
@@DATEFIRST returns the current value, for the session, of SET DATEFIRST.

SET DATEFIRST 1
SELECT @@DATEFIRST
----This will return result as 1 (Monday)
SET LANGUAGE us_english
SELECT @@DATEFIRST
----This will return result as 7 (Sunday)

In this way @@DATEFIRST and SET DATEFIRST are related. When I learned about this feature I was very glad, as our company has started to serve global clients and a simple feature like this helps a lot to avoid confusion.

Reference : Pinal Dave (http://blog.SQLAuthority.com)

SQL SERVER – Inserting into ColumnSet with SPARSE Columns

There are a number of blogs around using SPARSE columns here. Here are a few of them for reference; they can be used as a great learning resource.

SQL SERVER – 2008 – Introduction to SPARSE Columns

SQL SERVER – 2008 – Introduction to SPARSE Columns – Part 2

SQL SERVER – Errors and Limitations of Working with SPARSE Columns in SQL Server

Though such capabilities exist inside the product, I hardly see people using this feature effectively. Recently, a friend from a product-based company called me to check if I had written on this topic. I explained that there is plenty of documentation and it is great to see them use it. He said they had a table with close to a thousand columns, as they were using it like a generic table, and wanted to know if there is an easy way to update or insert into this SPARSE table directly via the ColumnSet.

The short answer is, YES. It is quite possible to insert using the ColumnSet column directly. I sent him an example and thought it would be of great use for folks here too.

Let us create our table for the experiment:

USE tempdb;
IF EXISTS (SELECT * FROM sys.objects WHERE name = 'DocStore_With_ColumnSet')
    DROP TABLE DocStore_With_ColumnSet;
GO
-- note the use of the SPARSE keyword and the XML column_set
CREATE TABLE DocStore_With_ColumnSet
(
    DocID INT PRIMARY KEY,              -- key column used by the inserts below; type assumed
    Title VARCHAR(200) NOT NULL,        -- type assumed
    ProductionSpecification VARCHAR(20) SPARSE NULL,
    ProductionLocation SMALLINT SPARSE NULL,
    MarketingSurveyGroup VARCHAR(20) SPARSE NULL,
    MarketingProgramID INT SPARSE NULL,
    Special_XML_Column XML COLUMN_SET FOR ALL_SPARSE_COLUMNS
);
GO

We can insert into different values columns as shown below:

INSERT DocStore_With_ColumnSet (DocID, Title, ProductionSpecification, ProductionLocation)
VALUES (1, 'Tire Specs 1', 'ABC321', 27)
INSERT DocStore_With_ColumnSet (DocID, Title, MarketingSurveyGroup)
VALUES (2, 'Survey 1234', 'Female 25 - 35')
-- Note that the Special_XML_Column column is correctly populated based on the columns that are populated
SELECT * FROM DocStore_With_ColumnSet

Can we add to the ColumnSet directly? Here is what we can do:

-- you can add data directly by specifying the xml
INSERT DocStore_With_ColumnSet (DocID, Title, Special_XML_Column)
VALUES (3, 'Specification 2', '<ProductionSpecification>AXZ7R242</ProductionSpecification><ProductionLocation>18</ProductionLocation>')

We can see the success of the insert statement. Let us next try to check if the values are inserted into our ColumnSet:

SELECT * FROM DocStore_With_ColumnSet

If this was not enough, we can use the same technique to update the ColumnSet values. A typical example is shown below:

-- now let us update some of the data
UPDATE DocStore_With_ColumnSet
SET Special_XML_Column = '<ProductionSpecification>ZZ456Z</ProductionSpecification><ProductionLocation>18</ProductionLocation>'
WHERE DocID = 3;

You can see again by querying that the values have been updated successfully. This was a simple example of how we can use column sets to insert data into a sparse table rather than qualifying each column. Let us now clean up this table:

-- Cleanup
DROP TABLE DocStore_With_ColumnSet

As I conclude, I would like to know how many of you have used SPARSE columns in your applications, and what were some of the scenarios you were able to lighten up using this feature?

Reference: Pinal Dave (http://blog.sqlauthority.com)