SQL SERVER – Cardinality Estimation and Performance – SQL in Sixty Seconds #072

Yesterday I wrote a blog post based on my latest Pluralsight course on learning SQL Server 2014. I discussed the newly introduced cardinality estimation in SQL Server 2014 and how it improves query performance. The cardinality estimation logic is responsible for the quality of query plans and is a major factor in the performance of any query. This logic had not been updated for quite a while, but in SQL Server 2014 it has been redesigned. The new logic incorporates assumptions and algorithms suited to both OLTP and data warehousing workloads.

I hope my earlier blog post clearly explained how the new cardinality estimation logic improves performance. If not, I suggest you watch the following quick video where I explain this concept in extremely simple words.

You can download the code used in this course from Simple Demo of New Cardinality Estimation Features of SQL Server 2014.

Action Item

Here are the blog posts I have previously written. You can read them over here:

You can subscribe to my YouTube Channel for frequent updates.

Reference: Pinal Dave (http://blog.sqlauthority.com)


SQL SERVER – Introduction to SQL Server 2014 In-Memory OLTP

In SQL Server 2014, Microsoft has introduced a new database engine component called In-Memory OLTP, also known as project “Hekaton”, which is fully integrated into the SQL Server Database Engine. It is optimized for OLTP workloads accessing memory-resident data. In-Memory OLTP lets us create memory-optimized tables, which in turn offer a significant performance improvement for typical OLTP workloads.

The main objective of a memory-optimized table is to allow highly transactional tables to live entirely in memory and remain there without losing a single record. The most significant part is that it still supports the majority of Transact-SQL statements. Transact-SQL stored procedures can be compiled to machine code for further performance improvements on memory-optimized tables. The engine is designed for high concurrency and minimal blocking: In-Memory OLTP alleviates locking issues by using a new type of multi-version optimistic concurrency control, and it substantially reduces waiting for log writes by generating far less log data and needing fewer log writes.

Points to remember

  • Memory-optimized tables refer to tables using the new data structures and keywords added as part of In-Memory OLTP.
  • Disk-based tables refer to the normal tables we have created in SQL Server since its inception. These tables use fixed-size 8 KB pages that are read from and written to disk as a unit.
  • Natively compiled stored procedures refer to a new object type supported by the In-Memory OLTP engine, which compiles them into machine code and can further improve data access performance for memory-optimized tables. Natively compiled stored procedures can only reference memory-optimized tables; they cannot reference any disk-based table. (A minimal example follows the table-creation sample later in this article.)
  • Interpreted Transact-SQL stored procedures refer to what SQL Server has always used.
  • Cross-container transactions refer to transactions that reference both memory-optimized tables and disk-based tables.
  • Interop refers to interpreted Transact-SQL that references memory-optimized tables.

Using In-Memory OLTP

The In-Memory OLTP engine has been available as part of SQL Server 2014 since the June 2013 CTPs. Installation of In-Memory OLTP is part of the SQL Server setup application. The In-Memory OLTP components can only be installed with a 64-bit edition of SQL Server 2014; they are not available in 32-bit editions.
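
If you are not sure whether the instance you are connected to supports In-Memory OLTP, a quick check (a minimal sketch using the SERVERPROPERTY values available in SQL Server 2014) is:

-- Returns 1 if the instance supports In-Memory OLTP (XTP), 0 otherwise
SELECT SERVERPROPERTY('IsXTPSupported') AS IsInMemoryOLTPSupported,
SERVERPROPERTY('Edition') AS Edition,
SERVERPROPERTY('ProductVersion') AS ProductVersion;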

Creating Databases

Any database that will store memory-optimized tables must have a MEMORY_OPTIMIZED_DATA filegroup. This filegroup is specifically designed to store the checkpoint files needed by SQL Server to recover the memory-optimized tables. Although the syntax for creating the filegroup is almost the same as for creating a regular filestream filegroup, it must also specify the option CONTAINS MEMORY_OPTIMIZED_DATA. Here is an example of a CREATE DATABASE statement for a database that can support memory-optimized tables:

CREATE DATABASE InMemoryDB
ON PRIMARY
(NAME = [InMemoryDB_data],
FILENAME = 'D:\data\InMemoryDB_data.mdf', SIZE = 500MB),
FILEGROUP [InMemoryDB_mod_fg] CONTAINS MEMORY_OPTIMIZED_DATA
(NAME = [InMemoryDB_mod_dir1],
FILENAME = 'S:\data\InMemoryDB_mod_dir1'),
(NAME = [InMemoryDB_mod_dir2],
FILENAME = 'R:\data\InMemoryDB_mod_dir2')
LOG ON
(NAME = [InMemoryDB_log],
FILENAME = 'L:\log\InMemoryDB_log.ldf', SIZE = 500MB)
COLLATE Latin1_General_100_BIN2;

The example code above places the data file, the two memory-optimized containers and the log file on different drives (D:, S:, R: and L:), so if you would like to run this code, change the drive and folder locations to suit your environment. Also notice that a Windows (non-SQL) binary collation was specified: a BIN2 collation is currently the only collation supported for character columns that participate in indexes on memory-optimized tables.

It is also possible to add a MEMORY_OPTIMIZED_DATA filegroup to an existing database; use the commands below to achieve the same.

ALTER DATABASE AdventureWorks2012 ADD FILEGROUP hekaton_mod CONTAINS MEMORY_OPTIMIZED_DATA;
GO
ALTER DATABASE AdventureWorks2012 ADD FILE (NAME='hekaton_mod', FILENAME='S:\data\hekaton_mod') TO FILEGROUP hekaton_mod;
GO

Creating Tables

There is no major syntactical difference between creating a disk-based table and a memory-optimized table, but there are a few restrictions and a few new essential extensions. Essentially, any memory-optimized table must include the MEMORY_OPTIMIZED = ON clause, as shown in the CREATE TABLE example below.

DURABILITY clause (SCHEMA_AND_DATA or SCHEMA_ONLY)

A memory-optimized table should always be defined with a DURABILITY value, which can be either SCHEMA_AND_DATA or SCHEMA_ONLY, the former being the default. A memory-optimized table defined with DURABILITY = SCHEMA_ONLY will not persist its data to disk, which means data durability is compromised (only the schema survives a restart), whereas DURABILITY = SCHEMA_AND_DATA ensures the data is persisted along with the schema.
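
For example, a minimal sketch of a non-durable staging table (the table and column names here are just for illustration) would look like the following; after a restart the table definition survives but it comes back empty:

CREATE TABLE dbo.SessionCache_Staging
(
SessionId INT NOT NULL PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 100000),
Payload VARCHAR(100) NULL
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_ONLY);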

Indexing Memory Optimized Table

A memory-optimized table must always have at least one index, and any table created with DURABILITY = SCHEMA_AND_DATA must also have a primary key; this can be satisfied by declaring a PRIMARY KEY constraint at the time the table is created. The following example shows a PRIMARY KEY index created as a HASH index, for which a bucket count must also be specified.

CREATE TABLE Mem_Table
(
[Name] VARCHAR(32) NOT NULL PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 100000),
[City] VARCHAR(32) NULL,
[State_Province] VARCHAR(32) NULL,
[LastModified] DATETIME NOT NULL
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);

As you can see in the query example above, we used the clause MEMORY_OPTIMIZED = ON to ensure the table is created as a memory-optimized table rather than a normal table, and we used DURABILITY = SCHEMA_AND_DATA, which means it will persist data along with the metadata. You can also see that this table declares a PRIMARY KEY up front, which is mandatory for a durable memory-optimized table.
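
Since natively compiled stored procedures were mentioned in the points above, here is a minimal sketch of one that inserts into the Mem_Table created above (the procedure name is just for illustration, and the InMemoryDB database from earlier is assumed). Note the NATIVE_COMPILATION, SCHEMABINDING and EXECUTE AS options and the ATOMIC block, all of which are required:

CREATE PROCEDURE dbo.usp_InsertMemTable
@Name VARCHAR(32),
@City VARCHAR(32)
WITH NATIVE_COMPILATION, SCHEMABINDING, EXECUTE AS OWNER
AS
BEGIN ATOMIC WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = N'us_english')
-- Insert one row into the memory-optimized table defined above
INSERT INTO dbo.Mem_Table ([Name], [City], [State_Province], [LastModified])
VALUES (@Name, @City, NULL, GETDATE());
END;
GO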

We will talk more about HASH indexes and BUCKET_COUNT in later articles on this topic, which will focus on row and index storage for memory-optimized tables. So stay tuned for that as well.

Now that we have covered the basics of memory-optimized tables and the key things to remember while using them, let's explore some examples to understand the performance gains memory-optimized tables can deliver.

I will be using the database I created earlier in this article, InMemoryDB, in the demo exercise below.

USE InMemoryDB
GO
-- Creating a disk based table
CREATE TABLE dbo.Disktable
(
Id INT IDENTITY,
Name CHAR(40)
)
GO
CREATE NONCLUSTERED INDEX IX_ID ON dbo.Disktable (Id)
GO
-- Creating a memory optimized table with similar structure and DURABILITY = SCHEMA_AND_DATA
CREATE TABLE dbo.Memorytable_durable
(
Id INT NOT NULL PRIMARY KEY NONCLUSTERED Hash WITH (bucket_count =1000000),
Name CHAR(40)
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA)
GO
-- Creating an another memory optimized table with similar structure but DURABILITY = SCHEMA_Only
CREATE TABLE dbo.Memorytable_nondurable
(
Id INT NOT NULL PRIMARY KEY NONCLUSTERED Hash WITH (bucket_count =1000000),
Name CHAR(40)
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_ONLY)
GO
-- Now insert 100000 records in dbo.Disktable and observe the Time Taken
DECLARE @i_t bigint
SET @i_t =1
WHILE @i_t<= 100000
BEGIN
INSERT INTO
dbo.Disktable(Name) VALUES('sachin' + CONVERT(VARCHAR,@i_t))
SET @i_t+=1
END
GO
-- Do the same inserts for Memory table dbo.Memorytable_durable and observe the Time Taken
DECLARE @i_t BIGINT
SET @i_t =1
WHILE @i_t<= 100000
BEGIN
INSERT INTO
dbo.Memorytable_durable VALUES(@i_t, 'sachin' + CONVERT(VARCHAR,@i_t))
SET @i_t+=1
END
GO
-- Now finally do the same inserts for Memory table dbo.Memorytable_nondurable and observe the Time Taken
DECLARE @i_t BIGINT
SET @i_t =1
WHILE @i_t<= 100000
BEGIN
INSERT INTO
dbo.Memorytable_nondurable VALUES(@i_t, 'sachin' + CONVERT(VARCHAR,@i_t))
SET @i_t+=1
END

The above three inserts took 1.20 minutes, 54 seconds, and 2 seconds respectively to insert 100,000 records on my machine with 8 GB of RAM.
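
If you want to capture these timings yourself rather than eyeballing the Messages tab, a minimal sketch (the variable names are just for illustration) is to record the start time before a loop and compute the elapsed milliseconds afterwards, or simply turn on SET STATISTICS TIME:

-- Option 1: let SQL Server report CPU and elapsed time per statement
SET STATISTICS TIME ON;
-- Option 2: time a whole loop yourself
DECLARE @start DATETIME2 = SYSDATETIME();
-- ... run one of the insert loops above here ...
SELECT DATEDIFF(MILLISECOND, @start, SYSDATETIME()) AS ElapsedMs;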

This proves the point that memory-optimized tables can definitely help businesses achieve better performance for their highly transactional tables, and a memory-optimized table with DURABILITY = SCHEMA_ONLY is even faster because it does not persist its data to disk at all. Koenig Solutions is one of the few organizations offering IT training on SQL Server 2014 and all its updates.

Now I leave the decision on using memory-optimized tables to you. I hope you liked this article and that it helped you understand the fundamentals of In-Memory OLTP.

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL SERVER – Activity Monitor to Identify Blocking – Find Expensive Queries

Recently, when I was busy working on my next course for Pluralsight, one of my friends pinged me for some help over messenger. The conversation took a different spin, and here is a gist of what we talked about; I felt it was worth a share.

Me: Hi buddy, what’s up?

Jim: Dave, I have a unique problem and thought you could help me.

Me: Sure, let me know how I can be of help.

Jim: I am going on a quick consulting assignment tomorrow and could use a helping hand.

Me: Hmmm, sure. Let me know what the problem is.

Jim: This customer has a lot of deadlock and blocking problems. I wanted to know if you have any easy steps to find blocking queries on a live system?

Me: Hold on, let me check.

Jim: Sure.

Me: Aha! Did you check my script available on my blog?

Jim: Awesome, Pinal, I think you saved my day. And I apologize for not doing a quick search over your blog first. But Pinal, I have one doubt.

Me: On the script?

Jim: No Pinal, the script was written 4 years back and is still a gold mine that will help me tomorrow. I am just curious: have you ever explored any alternatives? I am sure your exploration wouldn’t have stopped there.

Me: Well Jim, there are a number of other scripts, but one hidden gem I don’t think I have ever written about on this blog is Activity Monitor.

Jim: Oh, is it? Can you show me what you mean?

We got on a screen-sharing session and I showed him a couple of neat tricks available in Activity Monitor. Let me take a moment to pen down what I showed Jim yesterday.

The quickest way to launch Activity Monitor is to use the shortcut (CTRL+ALT+A) inside Management Studio, or feel free to read this blog on where to find Activity Monitor.

Activity monitor to identify Blocking

Activity Monitor has a number of tabs, and the tab in question here is the “Processes” tab. If you have a blocking scenario, Activity Monitor can be a really quick solution.

In the image below, we can see that SPID 51 is the head blocker and SPID 55 is waiting for locks to be released by 51. We can also find the wait time, the resource wait type and more from this screen.

As a DBA, if you want to kill SPID 51, you can quickly do that from Activity Monitor using the right-click menu:
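
If you prefer to see the same blocking information from a query window, a minimal sketch using the standard DMVs (the column aliases are just for illustration) is below; it returns the blocked session, the head blocker and the wait details, and KILL can then be issued against the blocker if you really must:

SELECT r.session_id AS blocked_spid,
r.blocking_session_id AS blocking_spid,
r.wait_type,
r.wait_time AS wait_time_ms,
r.wait_resource
FROM sys.dm_exec_requests AS r
WHERE r.blocking_session_id <> 0;
-- Only if you have decided the head blocker must go (use with care):
-- KILL 51;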

When I showed this to Jim, he was super excited and was in awe of what we can do with just SQL Server Management Studio. So Jim challenged me to show him one more tip using Activity Monitor. Since he was going on a performance troubleshooting exercise, I thought of sharing one more cool tip using Activity Monitor.

Activity monitor to find execution plans of expensive queries

I asked Jim how he would find out whether there are performance problems for a given query. The obvious answer from an expert was to use an execution plan. The problem with that approach is that when the query was executed with parameters, pulling it out of Profiler or other mechanisms based on reads, writes or CPU usage means the parameters have to be substituted and the query re-executed before we can view the execution plan.

Interestingly, Activity Monitor has a tab called “Recent Expensive Queries”. These are queries ranked by reads, CPU, writes and so on. Once you get into this screen, right-click the query in question and select “Show Execution Plan”.
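
The same idea can be scripted if you prefer a query window. Here is a minimal sketch against the plan-cache DMVs (ordering by total logical reads is just one choice; CPU or duration work equally well); the query_plan column opens the graphical plan directly in SSMS:

SELECT TOP (10)
qs.total_logical_reads,
qs.total_worker_time AS total_cpu_microseconds,
qs.execution_count,
SUBSTRING(st.text, (qs.statement_start_offset / 2) + 1,
((CASE qs.statement_end_offset WHEN -1 THEN DATALENGTH(st.text)
ELSE qs.statement_end_offset END - qs.statement_start_offset) / 2) + 1) AS statement_text,
qp.query_plan
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS qp
ORDER BY qs.total_logical_reads DESC;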

By the end of the screen-sharing exercise and about 10 minutes of quick talk, Jim was super excited and said he would ping me once the assignment was over to tell me how he used some of the tips I shared with him.

Activity Monitor is an awesome hidden gem inside SQL Server Management Studio that many fail to use for quick monitoring. I hope this blog post inspired some of you to use it in your environments in the future. Do let me know if you have ever used Activity Monitor in these ways.

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL SERVER – In-Memory Databases and Caching Wars

The In-Memory Database and In-Memory Caching market is heating up and we (the users) are the real winners. Exciting!!!

  • SAP, the giant application vendor but a database newbie, is pushing its SAP HANA, a standalone in-memory database (in an appliance box), to the market.
  • Last year IBM released a set of acceleration technologies, code-named BLU (a development code name that stood for “big data, lightning fast, ultra-easy”), adding in-memory capabilities to DB2 and better integration of solidDB caching engine with DB2.
  • This April (2014) Microsoft released its SQL Server 2014 with the new In-Memory OLTP engine embedded inside. Adding an In-Memory data engine right inside the most used database engine, the SQL Server, opens up many great possibilities for us, the engineers, building new applications and software solutions.
  • This month Oracle CEO Larry Ellison unveiled the Oracle In-Memory database engine, to be available next month. This is a direct challenge to IBM, Microsoft and SAP in the burgeoning in-memory database market, with technologies that can deliver dramatic performance boosts for high-scale online transactions and complex data processing.

The new in-memory database engines enable users to build new apps with a new scale of speed and processing capabilities. Hurrah!!

But the new in-memory database engines come with a price: a long list of data limitations, syntax limitations, use-case limitations, migration complications and so on.

Another fast-growing approach is adding a smart data caching layer, where SafePeak (www.SafePeak.com) appears to be the visionary leader. SafePeak’s automated caching solution for SQL Server based applications enables easy deployment of a smart caching layer between the application and the database. SafePeak performs automated, dynamic in-memory caching of SQL query and stored procedure results, immediately covering all queries of the application. The key features that make it an interesting solution to explore when performance is discussed are:

1) Automation and ease of use;
2) No code changes;
3) Cache awareness for any transactions and data changes;
4) Wide range of caching options and cache eviction strategies.

The SQL Analytics engine embedded in SafePeak is an excellent bonus, working all the time and collecting information about every query. It’s a bit like the well-known SQL Profiler, but without the performance drop and with conversion of the raw not-really-readable data into advanced analytics.

When it comes to applicability to existing production applications, SafePeak’s strategy is to be easy to use for already-in-production operational read-write (but read-intensive) applications, and to be optimized for virtual and cloud environments. For in-production systems, this gives SafePeak an interesting advantage over the in-memory technologies of database giants.

 

Any way you look at it, these in-memory developments are exciting for us.
Comparing the main in-memory options for SQL Server applications, a question I think is worth exploring is this:

When would SQL Server 2014 In-Memory OLTP fit your needs better than SafePeak’s Caching, and vice versa?

The answer depends, of course, on the user's needs and on the state of the application and database (greenfield, production, or commercial off-the-shelf application).

SafePeak caching focuses on READs, especially REPETITIVE READs, so it fits classic multi-user applications well.

If your need is mostly about speeding up WRITEs or long-running COMPLEX PROCESSING queries, then having a full table (or tables) in memory (as in SQL Server 2014 In-Memory OLTP) sounds like the better option.

However, how applicability depends on the state of the application/database is very interesting too. I think it splits into three use cases:

1) In “greenfield” applications still in the development cycle
SQL Server 2014 In-Memory OLTP is a great option to explore. Knowing how the 2014 in-memory tables and stored procedures work, and their various limitations, gives development managers the tools to plan and develop the new application around the available features. In such cases a few important tables (and the procedures accessing them) can become in-memory; the rest remain as usual.

2) In live production homegrown applications needing to speed up performance without a major redesign
SQL Server 2014 In-Memory OLTP may be a complex option to apply, due to its many current limitations and the requirement to redesign and reprogram parts of your application. And then there is the migration process… For this reason, SafePeak Cache may be a better choice.

3) In commercial 3rd-party / off-the-shelf applications
Unfortunately, SQL Server 2014 In-Memory OLTP requires new code and design changes, so in most cases the user cannot apply the new in-memory feature. SafePeak works without any need to touch the application, so SafePeak wins here.

Here’s a quick comparison of advantages and disadvantages of Microsoft’s In-Memory OLTP option with SafePeak’s caching:

1) SQL 2014 works very well for write-intensive processes and large data processing. SafePeak’s main gain is on repetitive reads (queries and procedures).

2) SQL 2014 has your full table in memory, so every query will be fast, even the first time it is called.
SafePeak has a module called Cache Warm-up, which lets you pre-fetch data into the cache.

3) SQL 2014 In-Memory OLTP is fully embedded inside the SQL Server, so no extra layer is needed.

SafePeak’s advantages:

1) SafePeak #1 advantage is time to deployment. No code changes, takes only a few hours to deploy and configure;

2) SafePeak immediately covers 100% of the application, not only the tables designed to be in-memory;

3) No SQL syntax limitations. All tables, all existing code, any SQL syntax, any stored procedure can be cached: most of it automatically, the rest with easy configuration;

Bottom line, only a customer can say which technology will best fit his or her situation. But the in-memory competition will definitely continue to benefit the industry and, mostly, us: the DBAs, programmers and application managers!

Interested in trialing SQL Server 2014 In-Memory OLTP?

Download SQL Server 2014 from the Microsoft site.
And see my post “Introduction and Quick-start guide to SQL Server 2014 In-Memory OLTP”.

Interested in trialing SafePeak?

SafePeak is available for free trial: www.SafePeak.com/Download
The guys there love helping users with free training and setup assistance, allowing users to get results quicker and better – and allowing the company to learn real customers’ needs and challenges (if you ask me, the right way to build a product and a company).

What In-Memory category fits you? Let us know. 

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL SERVER – Keeping MSDB System Database Lean and Fit – Notes from the Field #033

[Notes from Pinal]: I have been working with SQL Server for almost 10 years. The most ignored databases on any database server are the system databases; we hardly pay attention to them. Just for the sake of it, query your master database and see how many user-created tables already exist there. I am sure that unless you have strict policies in place, there will be a handful of tables there. Similarly, ask a DBA about the exact importance of the MSDB database and you will find that many DBAs do not know the answer. I often talk about this with my friend Tim, and I asked him if he could help us out with tricks to keep the MSDB system database at its optimal settings.

Linchpin People are database coaches and wellness experts for a data-driven world. In this 33rd episode of the Notes from the Field series, database expert Tim Radney (partner at Linchpin People) explains in very simple words how to keep the MSDB system database at optimal settings.


When doing a review of a client’s database server, one of the checks I perform is to see how much backup history is being kept within MSDB. SQL Server does a fantastic job of logging all the backups and restores we perform, and depending on how your routine is configured, you are probably only keeping enough backup files to meet your SLAs. What most DBAs or admins don’t think about is all the backup and restore history being retained within MSDB.

The tables within msdb that hold this history include the following (a quick way to check how much history they currently hold is shown after the list):

  • restorefile
  • restorefilegroup
  • restorehistory
  • backupfile
  • backupfilegroup
  • backupmediafamily
  • backupmediaset
  • backupset
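
Here is a minimal sketch for checking how much history these tables are holding and how far back it goes (the column aliases are just for illustration):

USE msdb
GO
SELECT COUNT(*) AS backup_history_rows,
MIN(backup_start_date) AS oldest_backup_history
FROM dbo.backupset;
SELECT COUNT(*) AS restore_history_rows,
MIN(restore_date) AS oldest_restore_history
FROM dbo.restorehistory;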

Microsoft SQL Server has a system stored procedure that helps us maintain this history; however, it is not scheduled to run by default. We have to create a process to run the stored procedure on a routine basis. You will need to know how much history is enough for your organization; I have seen this range from 14 days to 90 days.

The stored procedure is sp_delete_backuphistory within the MSDB database.

This stored procedure takes a single date parameter, which is the cutoff date for the history to keep. Any history older than the date provided will be deleted.

Over the years I have found MSDB ranging from several megabytes to nearly 20 GB. I recall one client who had a database server with hundreds of databases being log shipped to a secondary with 15-minute log backups, and the system had been in place for many years. MSDB had grown to almost 20 GB. The client complained that if they ever tried to restore a database through the GUI, it would lock up SSMS.

A simple script to purge this history older than 30 days is below.

USE msdb
GO
DECLARE @CutOffDate DATETIME
SET @CutOffDate = CONVERT(VARCHAR(10), DATEADD(dd, -30, GETDATE()), 101)
EXEC sp_delete_backuphistory @CutOffDate
GO
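
To run the purge on a routine basis, one option (a minimal sketch, assuming SQL Server Agent is running; the job name, schedule and 30-day retention are just for illustration) is to wrap the call in a SQL Server Agent job that runs daily:

USE msdb
GO
EXEC dbo.sp_add_job @job_name = N'Purge MSDB backup history';
EXEC dbo.sp_add_jobstep @job_name = N'Purge MSDB backup history',
@step_name = N'Delete history older than 30 days',
@subsystem = N'TSQL',
@database_name = N'msdb',
@command = N'DECLARE @CutOffDate DATETIME;
SET @CutOffDate = CONVERT(VARCHAR(10), DATEADD(dd, -30, GETDATE()), 101);
EXEC sp_delete_backuphistory @CutOffDate;';
EXEC dbo.sp_add_jobschedule @job_name = N'Purge MSDB backup history',
@name = N'Daily at 2 AM',
@freq_type = 4, -- daily
@freq_interval = 1, -- every day
@active_start_time = 20000; -- 02:00:00 (HHMMSS)
EXEC dbo.sp_add_jobserver @job_name = N'Purge MSDB backup history';
GO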

If you want me to take a look at your server and its settings, or if your server is facing any issue we can Fix Your SQL Server.

Note: Tim has also written an excellent book on SQL Backup and Recovery, a must have for everyone.

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL SERVER – SQL Server High Availability Options – Notes from the Field #032

[Notes from Pinal]: When it comes to High Availability or Disaster Recovery, I often see people getting confused. There are so many options available that when users have to select the most optimal solution for their organization, they are often confused. Most people even know the salient features of the various options, but when they have to pick one single option, they are often not sure which one to use. I like to ask my dear friend Tim all these kinds of complicated questions. He has a skill for making a complex subject very simple and easy to understand.

Linchpin People are database coaches and wellness experts for a data-driven world. In this 32nd episode of the Notes from the Field series, database expert Tim Radney (partner at Linchpin People) explains in very simple words the best High Availability options for your SQL Server.


Working with SQL Server, a common challenge we face is providing the maximum possible uptime. To meet these demands we have to design a solution that provides High Availability (HA). Depending on your edition, Microsoft SQL Server provides you with several options: database mirroring, log shipping, failover clustering, availability groups or replication.

Each possible solution comes with pros and cons. No single solution fits all scenarios, so understanding which solution meets which need is important. As with anything IT-related, you need to fully understand your requirements before trying to solve the problem. When it comes to building an HA solution, you need to understand which risk your organization needs to mitigate the most.

I have found that most are concerned about hardware failures and OS failures. Other common concerns are data corruption or storage issues. You can mitigate data corruption or storage concerns by keeping a second copy of the databases, which can be accomplished with database mirroring, log shipping, replication or availability groups with a secondary replica. Failover clustering and virtualization with shared storage do not provide redundancy of the data.

I recently created a chart outlining some pros and cons of each of the technologies that I posted on my blog.

I like to use this chart to help illustrate how each technology provides a certain number of benefits. Each of these solutions carries with it some level of cost and complexity. As database professionals we should all be familiar with these technologies so we can make the best possible choice for our organization.

If you want me to take a look at your server and its settings, or if your server is facing any issue we can Fix Your SQL Server.

Note: Tim has also written an excellent book on SQL Backup and Recovery, a must have for everyone.

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL SERVER – Iridium I/O – SQL Server Deduplication that Shrinks Databases and Improves Performance

Database performance is a common problem for SQL Server DBAs. It seems like we spend more time on performance than just about anything else. In many cases, we use scripts or tools that point out performance bottlenecks, but we don’t have any way to fix them. For example, what do you do when you need to speed up a query that is already tuned as well as possible? Or what do you do when you aren’t allowed to make changes to a database supporting a purchased application?

Iridium I/O for SQL Server was originally built at Confio Software (makers of Ignite) because DBAs kept asking for a way to actually fix performance instead of just pointing out performance problems. The technology is certified by Microsoft and was so promising that it was spun out into a separate company that is now run by the Confio founder/CEO and technology management team.

Iridium uses deduplication technology to both shrink databases and boost I/O performance. It is intriguing to see it work. It will deduplicate a live database while it is running transactions; you can watch the database get smaller while user queries are running.

Iridium is a simple tool to use. After installing the software, you click an “Analyze” button which will spend a minute or two on each database and estimate both your storage and performance savings.  Next, you click an “Activate” button to turn on Iridium I/O for your selected databases.  You don’t need to reboot the operating system or restart the database during any part of the process.

As part of my test, I also wanted to see if there would be an impact on my databases when Iridium was removed.  The ‘revert’ process (bringing the files back to their SQL Server native format) was executed by a simple click of a button, and completed while the databases were available for normal processing.

I was impressed and enjoyed playing with the software, and I encourage all of you to try it out. Here is the link to the website to download Iridium for free.

Reference: Pinal Dave (http://blog.sqlauthority.com)