SQL SERVER – Guest Post by Sandip Pani – SQL Server Statistics Name and Index Creation

Sometimes something very small or a common error which we observe in daily life teaches us new things. SQL Server Expert Sandip Pani (winner of Joes 2 Pros Contests) has come across similar experience. Sandip has written a guest post on an error he faced in his daily work. Sandip is working for QSI Healthcare as an Associate Technical Specialist and have more than 5 years of total experience. He blogs at SQLcommitted.com and contribute in various forums. His social media hands are LinkedIn, Facebook and Twitter.


Once I faced following error when I was working on performance tuning project and attempt to create an Index.

Mug 1913, Level 16, State 1, Line 1
The operation failed because an index or statistics with name ‘Ix_Table1_1′ already exists on table ‘Table1′.

The immediate reaction to the error was that I might have created that index earlier and when I researched it further I found the same as the index was indeed created two times. This totally makes sense. This can happen due to many reasons for example if the user is careless and executes the same code two times as well, when he attempts to create index without checking if there was index already on the object. However when I paid attention to the details of the error, I realize that error message also talks about statistics along with the index. I got curious if the same would happen if I attempt to create indexes with the same name as statistics already created. There are a few other questions also prompted in my mind. I decided to do a small demonstration of the subject and build following demonstration script.

The goal of my experiment is to find out the relation between statistics and the index. Statistics is one of the important input parameter for the optimizer during query optimization process. If the query is nontrivial then only optimizer uses statistics to perform a cost based optimization to select a plan. For accuracy and further learning I suggest to read MSDN.

Now let’s find out the relationship between index and statistics. We will do the experiment in two parts. i) Creating Index ii) Creating Statistics

We will be using the following T-SQL script for our example.

IF (OBJECT_ID('Table1') IS NOT NULL)
DROP TABLE Table1
GO
CREATE TABLE Table1
(Col1 INT NOT NULL,
Col2 VARCHAR(20) NOT NULL)
GO

We will be using following two queries to check if there are any index or statistics on our sample table Table1.

-- Details of Index
SELECT OBJECT_NAME(OBJECT_ID) AS TableName, Name AS IndexName, type_desc
FROM sys.indexes
WHERE OBJECT_NAME(OBJECT_ID) = 'table1'
GO
-- Details of Statistics
SELECT OBJECT_NAME(OBJECT_ID) TableName, Name AS StatisticsName
FROM sys.stats
WHERE OBJECT_NAME(OBJECT_ID) = 'table1'
GO

When I ran above two scripts on the table right after it was created it did not give us any result which was expected.

Now let us begin our test.

1) Create an index on the table

Create following index on the table.

CREATE NONCLUSTERED INDEX Ix_Table1_1 ON Table1(Col1)
GO

Now let us use above two scripts and see their results.

We can see that when we created index at the same time it created statistics also with the same name.

Before continuing to next set of demo – drop the table using following script and re-create the table using a script provided at the beginning of the table.

DROP TABLE table1
GO

2) Create a statistic on the table

Create following statistics on the table.

CREATE STATISTICS Ix_table1_1 ON Table1 (Col1)
GO

Now let us use above two scripts and see their results.

We can see that when we created statistics Index is not created. The behavior of this experiment is different from the earlier experiment.

Clean up the table setup using the following script:

DROP TABLE table1
GO

Above two experiments teach us very valuable lesson that when we create indexes, SQL Server generates the index and statistics (with the same name as the index name) together. Now due to the reason if we have already had statistics with the same name but not the index, it is quite possible that we will face the error to create the index even though there is no index with the same name.

A Quick Check

To validate that if we create statistics first and then index after that with the same name, it will throw an error let us run following script in SSMS. Make sure to drop the table and clean up our sample table at the end of the experiment.

-- Create sample table
CREATE TABLE TestTable
(Col1 INT NOT NULL,
Col2 VARCHAR(20) NOT NULL)
GO
-- Create Statistics
CREATE STATISTICS IX_TestTable_1 ON TestTable (Col1)
GO
-- Create Index
CREATE NONCLUSTERED INDEX IX_TestTable_1 ON TestTable(Col1)
GO
-- Check error
/*Msg 1913, Level 16, State 1, Line 2
The operation failed because an index or statistics with name 'IX_TestTable_1' already exists on table 'TestTable'.
*/
-- Clean up
DROP TABLE TestTable
GO

While creating index it will throw the following error as statistics with the same name is already created.

In simple words – when we create index the name of the index should be different from any of the existing indexes and statistics.

Reference: Pinal Dave (http://blog.SQLAuthority.com)

About these ads

SQL SERVER – Observation of Top with Index and Order of Resultset

SQL Server has lots of things to learn and share. It is amazing to see how people evaluate and understand different techniques and styles differently when implementing. There are three different instances where I have come across a situation where I felt that proper understanding is important and something which looks evil may not be evil as suggested. The real reason may be absolutely different but we may blame something totally different for the incorrect results.

Scenario 1: Database Tuning Advisor and Incorrect Results

One of my friends called me he was stressed out. He just ran a few of Database Tuning Advisor on his system and followed up a few of the suggestions by DTA and suddenly his applications and reporting server started to show incorrect results. He got stressed out and reverted every single change he had done so far. Though he was getting better performance his results were not as per expectation. Finally after quick research we figured out that it was not DTA’s advice but something else which was creating this scenario.

Scenario 2: No Index for Correct Results

One of the place where I went for consultation for performance tuning. I realize that there was absolutely no indexes created. When I discussed this with them, they told me creating indexes is giving incorrect results to them. This was very funny to me. I have heard everything before but this was kind of first time for me. However, with carefully investigating I got the idea what they were suggesting and later on solution was reached.

Scenario 3: Top and Changing Results

This one is my own experience when I was beginning my career. Just like every beginner I was not sure about every single command of SQL Server as well I had no idea how the engine behaves internally. I was under the misconception that TOP changes how the results are displayed. I was very much scared of to use TOP in my queries but after a while I understood that it was not TOP which was misbehaving but my knowledge is incomplete. I eventually figured out how to use TOP clause and get expected results.

The Real Issue

The real issue in the above three cases is not using ORDER BY clause for the query. While writing query I have faced situations where one result can be achieved by multiple ways. It is our responsibility that we select the result which is deterministic and consistent. We should opt for the way which maintains database integrity as well, gives optimal performance.

In all the above three scenario the issue was developer was not writing script which was consistent with his understanding. The SQL Server Engine was behaving how it should behave but lack of understanding was building different expectation from the query.

Reproduction of Scenarios

Following T-SQL script is doing the same. First let us create a table with several records.

USE TempDB
GO
CREATE TABLE TestTable (ID INT, Name VARCHAR(10), Title VARCHAR(20))
GO
INSERT INTO TestTable (ID, Name, Title)
SELECT 1, 'Mark', 'Manager'
UNION ALL
SELECT 2, 'John', 'Sales Manager'
UNION ALL
SELECT 3, 'Matt', 'CEO'
UNION ALL
SELECT 4, 'Roger', 'COO'
UNION ALL
SELECT 5, 'Mike', 'Finance Manager'
UNION ALL
SELECT 6, 'Jeff', 'Manager'
UNION ALL
SELECT 7, 'Mason', 'HR Executive'
GO

Let us select records and observe the resultset.

SELECT TOP 3 Name
FROM TestTable
GO

Now let us create index over one of the columns and retrieve records. You will find the result is inconsistent than before or different than what we have received. (Note that I did not use word INCORRECT as it is not incorrect result, but different for sure)

CREATE NONCLUSTERED INDEX [IX_Name]
ON TestTable
(Name ASC)
GO
SELECT TOP 3 Name
FROM TestTable
GO

Now let us use ORDER BY clause and you will see that result is as per expected.

SELECT TOP 3 Name
FROM TestTable
ORDER BY ID
GO

Let us Clean up.

DROP TABLE TestTable
GO

In our sample reproduction I have selected one single column and single column index. In above scenario we had much difficult scenarios and complex cases. The major point of this blog post is to make sure that developers understand the importance of the ORDER BY clause.

Summary

When we are beginning our career it is quite easy get confused and not understanding the impact of code which we are writing. Quite often we build prejudice towards part of technology which should not be the case as we progress in our career. In this example we are trying to drive home the point that when you use TOP and if you are expecting your results in certain order all the time uses ORDER BY along with it.

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL SERVER – ColumnStore Index – Batch Mode vs Row Mode

What do you do when you are in a hurry and hear someone say things which you do not agree or is wrong? Well, let me tell you what I do or what I recently did.

I was walking by and heard someone mentioning “Columnstore Index are really great as they are using Batch Mode which makes them seriously fast.” While I was passing by and I heard this statement my first reaction was I thought Columnstore Index can use both – Batch Mode and Row Mode. I stopped by even though I was in a hurry and asked the person if he meant that Columnstore indexes are seriously fast because they use Batch Mode all the time or Batch Mode is one of the reasons for Columnstore Index to be faster. He responded that Columnstore Indexes can run only in Batch Mode.

However, I do not like to confront anybody without hearing their complete story. Honestly, I like to do information sharing and avoid confronting as much as possible. There are always ways to communicate the same positively. Well, this is what I did, I quickly pull up my earlier article on Columnstore Index and copied the script to SQL Server Management Studio. I created two versions of the script. 1) Very Large Table 2) Reasonably Small Table. I a query which uses columnstore index on both of the versions. I found very interesting result of the my tests.

I saved my tests and sent it to the person who mentioned about that Columnstore Indexes are using Batch Mode only. He immediately acknowledged that indeed he was incorrect in saying that Columnstore Index uses only Batch Mode. What really caught my attention is that he also thanked me for sending him detail email instead of just having argument where he and I both were standing in the corridor and neither have no way to prove any theory.

Here is the screenshots of the both the scenarios.

1) Columnstore Index using Batch Mode

2) Columnstore Index using Row Mode

Here is the logic behind when Columnstore Index uses Batch Mode and when it uses Row Mode. A batch typically represents about 1000 rows of data. Batch mode processing also uses algorithms that are optimized for the multicore CPUs and increased memory throughput.  Batch mode processing spreads metadata access costs and overhead over all the rows in a batch.  Batch mode processing operates on compressed data when possible leading superior performance.

Here is one last point – Columnstore Index can use Batch Mode or Row Mode but Batch Mode processing is only available in Columnstore Index. I hope this statement truly sums up the whole concept.

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL SERVER – Finding Size of a Columnstore Index Using DMVs

Columnstore Index is one of my favorite enhancement in SQL Server 2012. A columnstore index stores each column in a separate set of disk pages, rather than storing multiple rows per page as data traditionally has been stored. In case of the row store indexes multiple pages will contain multiple rows of the columns spanning across multiple pages. Whereas in case of column store indexes multiple pages will contain (multiple) single columns.  Columnstore Indexes are compressed by default and occupies much lesser space than regular row store index by default.

One of the very common question I often see is need of the list of columnstore index along with their size and corresponding table name. I quickly re-wrote a script using DMVs sys.indexes and sys.dm_db_partition_stats. This script gives the size of the columnstore index on disk only. I am sure there will be advanced script to retrieve details related to components associated with the columnstore index. However, I believe following script is sufficient to start getting an idea of columnstore index size. 

SELECT OBJECT_SCHEMA_NAME(i.OBJECT_ID) SchemaName,
OBJECT_NAME(i.OBJECT_ID ) TableName,
i.name IndexName,
SUM(s.used_page_count) / 128.0 IndexSizeinMB
FROM sys.indexes AS i
INNER JOIN sys.dm_db_partition_stats AS S
ON i.OBJECT_ID = S.OBJECT_ID AND I.index_id = S.index_id
WHERE  i.type_desc = 'NONCLUSTERED COLUMNSTORE'
GROUP BY i.OBJECT_ID, i.name

Here is my introductory article written on SQL Server Fundamentals of Columnstore Index. Create a sample columnstore index based on the script described in the earlier article. It will give the following results.

Please feel free to suggest improvement to script so I can further modify it to accommodate enhancements.

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL SERVER – Renaming Index – Index Naming Conventions

If you are regular reader of this blog, you must be aware of that there are two kinds of blog posts 1) I share what I learn recently 2) I share what I learn and request your participation. Today’s blog post is where I need your opinion to make this blog post a good reference for future.

Background Story

Recently I came across system where users have changed the name of the few of the table to match their new standard naming convention. The name of the table should be self explanatory and they should have explain their purpose without either opening it or reading documentations. Well, not every time this is possible but again this should be the goal of any database modeler. Well, I no way encourage the name of the tables to be too long like ‘ContainsDetailsofNewInvoices’. May be the name of the table should be ‘Invoices’ and table should contain a column with New/Processed bit filed to indicate if the invoice is processed or not (if necessary). Coming back to original story, the database had several tables of which the name were changed.

Story Continues…

To continue the story let me take simple example. There was a table with the name  ‘ReceivedInvoices’, it was changed to new name as ‘TblInvoices’. As per their new naming standard they had to prefix every talbe with the words ‘Tbl’ and prefix every view with the letters ‘Vw’. Personally I do not see any need of the prefix but again, that issue is not here to discuss.  Now after changing the name of the table they faced very interesting situation. They had few indexes on the table which had name of the table. Let us take an example.

Old Name of Table: ReceivedInvoice
Old Name of Index: Index_ReceivedInvoice1

Here is the new names

New Name of Table: TblInvoices
New Name of Index: ???

Well, their dilemma was what should be the new naming convention of the Indexes. Here is a quick proposal of the Index naming convention. Do let me know your opinion.

If Index is Primary Clustered Index: PK_TableName
If Index is  Non-clustered Index: IX_TableName_ColumnName1_ColumnName2…
If Index is Unique Non-clustered Index: UX_TableName_ColumnName1_ColumnName2…
If Index is Columnstore Non-clustered Index: CL_TableName

Here ColumnName is the column on which index is created. As there can be only one Primary Key Index and Columnstore Index per table, they do not require ColumnName in the name of the index. The purpose of this new naming convention is to increase readability. When any user come across this index, without opening their properties or definition, user can will know the details of the index.

T-SQL script to Rename Indexes

Here is quick T-SQL script to rename Indexes

EXEC sp_rename N'SchemaName.TableName.IndexName', N'New_IndexName', N'INDEX';
GO

Your Contribute Please

Well, the organization has already defined above four guidelines, personally I follow very similar guidelines too. I have seen many variations like adding prefixes CL for Clustered Index and NCL for Non-clustered Index. I have often seen many not using UX prefix for Unique Index but rather use generic IX prefix only.

Now do you think if they have missed anything in the coding standard. Is NCI and CI prefixed required to additionally describe the index names. I have once received suggestion to even add fill factor in the index name – which I do not recommend at all.

What do you think should be ideal name of the index, so it explains all the most important properties? Additionally, you are welcome to vote if you believe changing the name of index is just waste of time and energy. 

Note: The purpose of the blog post is to encourage all to participate with their ideas. I will write follow up blog posts in future compiling all the suggestions.

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL SERVER – Columnstore Index and sys.dm_db_index_usage_stats

As you know I have been writing on Columnstore Index for quite a while. Recently my friend Vinod Kumar wrote about  SQL Server 2012: ColumnStore Characteristics. A fantastic read on the subject if you have yet not caught up on that subject. After the blog post I called him and asked what should I write next on this subject. He suggested that I should write on DMV script which I have prepared related to Columnstore when I was writing our SQL Server Questions and Answers book. When we were writing this book SQL Server 2012 CTP versions were available. I had written few scripts related to SQL Server columnstore Index. I like Vinod’s idea and I decided to write about DMV, which we did not cover in the book as SQL Server 2012 was not released yet. We did not want to talk about the product which was not yet released.

The first script which I had written was with DMV - sys.column_store_index_stats. This DMV was displaying the statistics of the columnstore indexes. When I attempted to run it on SQL Server 2012 RTM it gave me error suggesting that this DMV does not exists.

Here is the script which I ran:

SELECT *
FROM sys.column_store_index_stats;

It generated following error:

Msg 208, Level 16, State 1, Line 1
Invalid object name ‘column_store_index_stats’.

I was pretty confident that this DMV was available when I had written the scripts. The next reaction was to type ‘sys.’ only in SSMS and wait for intelisense to popup DMV list. I scrolled down and noticed that above said DMV did not exists there as well.

Now this is not bug or missing feature. This was indeed something can happen because the version which I was practicing was early CTP version. If you go to the page of the DMV here, it clearly stats notice on the top of the page.

This documentation is for preview only, and is subject to change in later releases.

Now this was not alarming but my next thought was if this DMV is not there where can I find the information which this DMV was providing. Well, while I was thinking about this, I noticed that my another friend Balmukund Lakhani was online on personal messenger. Well, Balmukund is “Know All” kid. I have yet to find situation where I have not got my answers from him. I immediately pinged him and asked the question regarding where can I find information of ‘column_store_index_stats’. His answer was very abrupt but enlightening for sure. Here is our conversation:

Pinal: Where can I find information of column_store_index_stats?
Balmukund: Assume you have never worked with CTP before and now try to find the information which you are trying to find.

Honestly  it was fantastic response from him. I was confused as I have played extensively with CTP versions of SQL Server 2012. Now his response give me big hint. I should have not looked for DMV but rather should have focused on what I wanted to do. I wanted to retrieve the statistics related to the index. In SQL Server 2008/R2, I was able to retrieve the statistics of the index from the DMV - sys.dm_db_index_usage_stats. I used the same DMV on SQL Server 2012 and it did retrieved the necessary information for me.

Here is the updated script which gave me all the necessary information I was looking for. Matter of the fact, if I have used my earlier SQL Server 2008 R2 script this would have just worked fine.

SELECT DB_NAME(Database_ID) DBName,
SCHEMA_NAME(schema_id) AS SchemaName,
OBJECT_NAME(ius.OBJECT_ID) ObjName,
i.type_desc, i.name,
user_seeks, user_scans,
user_lookups, user_updates,*
FROM sys.dm_db_index_usage_stats ius
INNER JOIN sys.indexes i
ON i.index_id = ius.index_id
AND ius.OBJECT_ID = i.OBJECT_ID
INNER JOIN sys.tables t ON t.OBJECT_ID = i.OBJECT_ID
GO

Let us see the resultset of above query.

You will notice that column Type_desc describes the type of the index. You can additionally write WHERE condition on the column and only retrieve only selected type of Index.

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL SERVER – Identify Columnstore Index Usage from Execution Plan

I think there was a time when lots of questions were coming via either email or blog comments. Nowadays, the trend seems to change. Most of the question I receive is through social media. Here is the latest question I received through Twitter. The best or worst part of Twitter is that it allows only 140 characters, so I’ve noticed that a question is easy to ask on Twitter, but an answer is difficult to provide using this social network.

The question I received at http://twitter.com/pinaldave is as follows:

How do I know if columnstore index is used by query through execution plan?

Well, my reaction was simple – “From Operators“. See, I managed to answer this in two words only. However, I was not satisfied with my own answer. The questioner did not follow up after this question but I personally did not like my answer. I felt that if he was talking about execution plan, he is indeed aware of operators. If he was aware of operators, my answer was no use to him. Suddenly, I thought I was arrogant (or at least not clear). I sent him a direct message that I would write a blog post tomorrow to explain how to figure out if columnstore index is used or not via execution plan.

If you are interested in Columnstore Index, read following related posts on my blog  for additional details:

To demonstrate this scenario, I have created the following script. I am using AdventureWorks (note: AdventureWorks Installation 60 Seconds Tutorial) for this sample database. Here are the steps which are to be followed:

  • Create a sample table
  • Insert some data
  • Create clustered index on it
  • Create nonclustered Columnstore Index on it
  • Enable execution plan in SSMS
  • Run two SELECT statement together with using clustered indexand columnstore index (use hint if needed)

Let us create environment and populate tables.

CREATE DATABASE CLAdventureWorks
GO
USE CLAdventureWorks
GO
-- Create New Table
CREATE TABLE [dbo].[MySalesOrderDetail](
[SalesOrderID] [int] NOT NULL,
[SalesOrderDetailID] [int] NOT NULL,
[CarrierTrackingNumber] [nvarchar](25) NULL,
[OrderQty] [smallint] NOT NULL,
[ProductID] [int] NOT NULL,
[SpecialOfferID] [int] NOT NULL,
[UnitPrice] [money] NOT NULL,
[UnitPriceDiscount] [money] NOT NULL,
[LineTotal] [numeric](38, 6) NOT NULL,
[rowguid] [uniqueidentifier] NOT NULL,
[ModifiedDate] [datetime] NOT NULL
)
ON [PRIMARY]
GO
-- Create clustered index
CREATE CLUSTERED INDEX [CL_MySalesOrderDetail] ON [dbo].[MySalesOrderDetail]
( [SalesOrderDetailID])
GO
-- Create Sample Data Table
-- WARNING: This Query may run upto 2-10 minutes based on your systems resources
INSERT INTO [dbo].[MySalesOrderDetail]
SELECT S1.*
FROM AdventureWorks.Sales.SalesOrderDetail S1
GO 10
-- Create ColumnStore Index
CREATE NONCLUSTERED COLUMNSTORE INDEX [IX_MySalesOrderDetail_ColumnStore]
ON [MySalesOrderDetail]
(UnitPrice, OrderQty, ProductID)
GO

Now enable the actual execution plan in SSMS.

Run following two SELECT statement together.

-- Select Table with Clustered Index (Not Columnstore)
SELECT ProductID, SUM(UnitPrice) SumUnitPrice, AVG(UnitPrice) AvgUnitPrice,
SUM(OrderQty) SumOrderQty, AVG(OrderQty) AvgOrderQty
FROM [dbo].[MySalesOrderDetail] WITH(INDEX([CL_MySalesOrderDetail]))
GROUP BY ProductID
ORDER BY ProductID
GO
-- Select Table with Columnstore Index
SELECT ProductID, SUM(UnitPrice) SumUnitPrice, AVG(UnitPrice) AvgUnitPrice,
SUM(OrderQty) SumOrderQty, AVG(OrderQty) AvgOrderQty
FROM [dbo].[MySalesOrderDetail]
GROUP BY ProductID
ORDER BY ProductID
GO

Both will return the same result set. Click on Execution Plan Window.

You will see that there is different execution plan operator for regular clustered index and non-clustered columnstore index.

When you hover your mouse on both operators, they open operator tip which also clearly indicates if it is a regular clustered index and non-clustered columnstore index.

In case of the columnstore index, you will notice that the execution mode is also Batch instead of Row execution mode.

I guess now this completes the answer for the question asked to me on Twitter. Let us quickly clean up.

-- Cleanup
DROP INDEX [IX_MySalesOrderDetail_ColumnStore] ON [dbo].[MySalesOrderDetail]
GO
TRUNCATE TABLE dbo.MySalesOrderDetail
GO
DROP TABLE dbo.MySalesOrderDetail
GO

Feel free to ask me any question on social media – twitter or facebook.

Reference: Pinal Dave (http://blog.sqlauthority.com)