SQL SERVER – How to Remove All Characters From a String Using T-SQL?

In the recent past I have seen the best interaction and learning happens when we all get into a puzzle and challenge mode. And that just gets the creative juice in our minds and some really innovative solutions are given. So this blog is written like a nice trivia solution and I would love to hear from you on potential alternate solutions to this as you read the simplified solution.

There are many situations where a developer needs to do a lot of string manipulation using T-SQL. There are many things which ideally should be done via SQLCLR.

One of my blog reader posted a comment recently

I want to get only integer part in string. for example, string contain ab123ce234fe means i want only integer part like 123234. how can i get?

I gave him below suggestion.

@loop INT
@str VARCHAR(8000)
SELECT @str = 'ab123ce234fe'
SET @loop = 0
WHILE @loop < 26
@str = REPLACE(@str, CHAR(65 + @loop), '')
SET @loop = @loop + 1

Do you think there is a better way to achieve that? Please write it in comments section. It is going to be a great learning opportunity for all.

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL SERVER – Puzzle – Inside Working of Datatype smalldatetime

In the recent past, I have been bringing few puzzles around using date and time functions. All these have brought some awesome feedbacks and you all have been sharing some awesome answers. Today’s question comes from an unusual place wherein one of a developer said he was getting unexpected results with working with datetime. So here is how the conversation went:

Dev: Hi Pinal.

Pinal: Yes, buddy. How can I be of help?

Dev: I have a problem working with datetime datatype.

Pinal: Oh is it? Tell me more about it.

Dev: I am trying to CAST a string coming from a front end app to SQL Server.

Pinal: Ok, where is the problem?

Dev: Every now and then, it gives different results than what I try to send via the wire.

Pinal: I am not able to understand. Can you send me an example to understand this better?

Dev: Sure, I will send via email.

After I received the actual code and email, I resumed the conversation.

Pinal: Hey.

Dev: Yes, Sir. Did you find the reason?

Pinal: I found the reason and I am going to blog about it tomorrow. It is a simplified version of the query you sent me. So you are about to find the answers there. The hint is: you are doing a convert to smalldatetime

Dev: Well, thanks. I think I will do my homework and try to answer it in your blog too.

This concept was interesting and I planned to share this experience with you folks like a small puzzle. So what are we talking?

Look at the below query and guess what would be the output:

CAST('2015-01-01 12:45:29.755' AS SMALLDATETIME),
CAST('2015-01-01 12:45:35.755' AS SMALLDATETIME)

This is a simple conversion but the two values are different. Why is that? Do you know the reason? Use the comments section to explain this. I may send surprise gift to 2 of the winners.

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL SERVER – DevOps for the DBA – Notes from the Field #091

[Note from Pinal]: This is a 91st episode of Notes from the Field series. Divas is a new buzz word with lots of substance behind it. Not everyone understands what it means and not everyone actually related it to their daily job. Lots of people think that it is alright to not explore and learn about this as after a few days or months it will be meaningless. However, the reality seems very different about DevOps as it is one of the concepts which is just rediscovered. I recently asked my friend Stuart about what it means by DevOps for DBA and he had a informative answer to this question.

In this episode of the Notes from the Field series database expert Stuart Ainsworth explains about DevOps for the DBA. Let us read the story in his own words.

 SQL SERVER   DevOps for the DBA   Notes from the Field #091

Not every problem that a DBA encounters is a technical one; sometimes, it’s a procedural one.  I’ve recently become enamored with the philosophy of DevOps, and have tried to find ways to transform those concepts and ideas into practical, actionable advice for my clients.  It’s still a work in progress, but I’ll try to summarize my thoughts today (and, they could change tomorrow).

A Brief Summary of DevOps

DevOps is a collaborative approach to software development and delivery that is influenced by Agile development methods and Lean IT principles; the idea is todevops SQL SERVER   DevOps for the DBA   Notes from the Field #091 decrease time-to-market by looking at (primarily enterprise) software development as a life cycle, and finding ways to automate and increase efficiency.  The general framework is often described as consisting of the Three Ways (a phrase fully explored in The Phoenix Project):

  1. The First Way: Systems Thinking. A complete software development lifecycle encompasses all the functionality involved in both development and operations (including QA and acceptance testing); the entire system is intertwined in order for the business to bring value to customers.  Development responsibility isn’t complete when software passes QA, and Operational responsibilities don’t simply begin during the install.  Both groups should be involved to some degree throughout the process.
  2. The Second Way: Amplify Feedback Loops. No system is complete without feedback, and DevOps approaches focus on increasing the immediacy of that feedback.  Bugs need to be simple to report and acknowledge; suggestions should be taken seriously.
  3. The Third Way: Culture of Continual Experimentation and Learning. Success is found when you take a risk, and DevOps is about creating an environment where risk is encouraged and rewarded.  This is often problematic for Operations, who are usually responsible for maintaining the stability of an environment, but that’s a key to this principle: Ops MUST be involved in finding a way for Dev to take risks while continuing to ensure the stability of the environment.

From Philosophy to Practice

All that’s well and good, but what does it mean for the average DBA? How do we go from thinking about processes, procedures, and philosophy to real meaningful change?  Here’s three specific practices that operational DBA’s should implement as soon as possible:

  1. Respect the role of development. Consider the following scenario: some performance issue has occurred, in operations, and the DBA has stepped in to make a fix (say an index change, or something more drastic like a stored proc rewrites).  The implemented fix is now different than the source of truth for development (their source control), and that’s bad.  It hampers the development effort, but sometimes emergency interventions are necessary to keep the business afloat; what do you do?First and foremost, DBA’s need to limit the changes that they make to strictly emergency fixes; emergency fixes need to be reported to develop as soon as possible.   Not all performance changes need to be emergency fixes; those should be reported to dev, but unless the business is grinding to a halt, the DBA’s should not change the code in operations.  This practice illustrates both the First and the Second Way above.
  1. Request simple methods to supply feedback. If developers are responsible for writing code, they have to be receptive to feedback from operations.  I’ve worked in shops where only the developers and QA had access to create bugs; this kills the feedback loop.  Operational DBA’s must find ways to report issues to development as soon as possible; any electronic tracking measure should be simple to identify who reported an issue, when they reported it, and what was decided to do about.  If the feedback is associated with an emergency change, the DBA should include the code that they changed, so that source control can be edited to include the current production version.  The key takeaway is that EVERY change to an operational system must be reported to the developers.
  2. Strive to provide operational support to development efforts. The best development environment is one that is identical to operations; that’s not always feasible or practical to implement.   DBA’s need to be part of the development process by finding ways to make the dev environment as similar to production as possible; virtualization technology can be a great tool for that, whether it’s virtualizing servers (e.g., VMware, Hyper-V, VirtualBox) or databases (e.g., Delphix).  Short of that, you can just make sure that your developers have a good representation of the current production statistics.


Software development is hard; systems administration is hard.  Working together can make it easier, but you have to focus on practical changes to your software culture to make things happen.

If you want to get started with SQL Server with the help of experts, read more over at Fix Your SQL Server.

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL SERVER – Error: Msg 245 – Conversion failed when converting the varchar value ‘Inactive’ to data type int

Using CASE statements is something I see a lot of developers use. They use it in complex environments and I have been fortunate to troubleshoot and look at code blocks that run for pages using the standard CASE statements. This blog post is inspired by one such errors using CASE statements. This is one of the common error which is seen by many developers while using case/when combination. To understand the reason for such error, here is one of the simplified repro of the error:

INSERT INTO @TempTable VALUES (1,1,200)
INSERT INTO @TempTable VALUES (2,2,200)
INSERT INTO @TempTable VALUES (3,1,99)
INSERT INTO @TempTable VALUES (4,1,10)
THEN 'Inactive'
ELSE score
END 'Score'
FROM @TempTable

If we execute above in management studio, we would get below error

id          Score
----------- -----------
1           200
Msg 245, Level 16, State 1, Line 8
Conversion failed when converting the varchar value 'Inactive' to data type int.

Always remember the golden rule of case/when: “All branches – same type”. It is very important that all return values have the same data type. If we don’t do this, we would end up in conversion problems, like the one shown above. In our sample, score column is integer whereas value “Inactive’ is string. Due to datatype precedence rule, Data Type Precedence SQL Server is trying to implicitly convert Inactive to Integer data type. SQL Server will always return the same data type regardless of which branch of the case/when expression is actually returned. For CASE expressions all expressions should be of the same type or implicitly convertible types.

Here is another simple example which would fail randomly.

SELECT CASE WHEN RAND() < 0.5 THEN 200 ELSE 'SQlAuthority' END AS SomeColumnName;

Based on output of RAND() it would fail with error

Msg 245, Level 16, State 1, Line 1
Conversion failed when converting the varchar value 'SQlAuthority' to data type int.

Here is the fix for the original problem.

INSERT INTO @TempTable VALUES (1,1,200)
INSERT INTO @TempTable VALUES (2,2,200)
INSERT INTO @TempTable VALUES (3,1,99)
INSERT INTO @TempTable VALUES (4,1,10)
THEN 'Inactive'
END 'Score'
FROM @TempTable

Here is the output:

case when 01 SQL SERVER   Error: Msg 245   Conversion failed when converting the varchar value Inactive to data type int

Have you been following this practice while coding? What are your workarounds for these sort of case statements? Let me know over comments.

Reference : Pinal Dave (http://blog.SQLAuthority.com)

SQL SERVER – Free SQL Complete Add-in For SSMS

This article covers main features of ApexSQL Complete, free SQL complete add-in. You can download the most recent version, and check its features while reading the article.

ApexSQL Complete is a free Visual Studio and SQL Server Management Studio add-in that will auto-complete SQL statements and speed up coding with built-in snippet library, help managing opened and closed query windows, track all executed queries and more. In this article, we will explain how to use some of the core features.

Auto-complete using the hint list

Auto complete feature provides intelligent code completion through the hint list, allowing you to find an appropriate hint and to fill in SQL keywords, as well as database schema, object, parameter, and variable names. As an example, if you start typing “CRE”, the hint list appears offering the CREATE keyword:

addin1 SQL SERVER   Free SQL Complete Add in For SSMS

The hint list offers context sensitive suggestions based on the current cursor position in the query. This means that if you type the USE keyword, the hint list offers database list:

addin2 SQL SERVER   Free SQL Complete Add in For SSMS

In addition, you can drill down to the column level and choose the specific column directly from the hint list:

addin3 SQL SERVER   Free SQL Complete Add in For SSMS

If you need additional information about any object from the hint list, move the mouse pointer over the object and hold it for a second, the Object script window appears with complete script of the object including extended properties shown in the Description tab:

addin4 SQL SERVER   Free SQL Complete Add in For SSMS

To view the script for encrypted objects, select the Decrypt encrypted objects option in the add-in options:

addin5 SQL SERVER   Free SQL Complete Add in For SSMS

The Auto complete feature will not just help you to find the appropriate hint, but will also complete entire fragment of SQL code as INSERT statement, or JOIN statement:

addin6 SQL SERVER   Free SQL Complete Add in For SSMS

To auto-complete INSERT statement, check the Insert full INSERT statement option in the add-in options:

addin7 SQL SERVER   Free SQL Complete Add in For SSMS

Full INSERT STATEMENT will be auto-completed as follows:

addin8 SQL SERVER   Free SQL Complete Add in For SSMS

Snippets library and using snippets

The Snippets feature allows you to insert custom-made T-SQL code snippets, and often used SQL statements into SQL scripts. SQL coding with snippets can be achieved in two ways: using snippets from the built-in library (which are also editable) or creating new snippets.

New snippets can be created from the Snippet tab in the add-in options, or directly from the query window.

Snippet library

Built in T-SQL snippet library contains over 200 snippets:

addin9 SQL SERVER   Free SQL Complete Add in For SSMS

In order to use any of the predefined snippets, just pick it from the hint list, and it will be automatically deployed to the query.

Create a new SQL snippet

Clicking the Add command in the above image allows you to create a new snippet:

addin10 SQL SERVER   Free SQL Complete Add in For SSMS

Another way to create a snippet is to select the New snippet command from the query window context menu:

addin11 SQL SERVER   Free SQL Complete Add in For SSMS

This will open the same form as when you are creating a new snippet from the Snippets tab under the add-in option, but will automatically add SQL code from the query editor to the Code section:

addin12 SQL SERVER   Free SQL Complete Add in For SSMS

To create a new snippet from a part of SQL code in the query window, select the specific part of code and initiate the New snippet command from the right-click context menu.

Tab navigation

The Tab navigation feature allows you keep the track of all opened and recently closed tabs, from the moment of enabling the feature. It also can help in saving the current tab session, recovering tabs from the session prior to unexpected host application closing. To enable the Tab navigation feature, go to the add-in options, and enable it under the Tab navigation tab:

addin13 SQL SERVER   Free SQL Complete Add in For SSMS

In the same tab, you can enable restoring workspace on startup, or restoring session prior to unexpected host application closing. Once this is enabled, all opened and closed tabs will be kept so you can always navigate to any of them and continue working. The Tab navigation pane will be shown below the query window:

addin14 SQL SERVER   Free SQL Complete Add in For SSMS

For each of the logged query you will have details such as tab caption, creation date and time, and you can instantly preview the content of the query in the right side of the Tab navigation form. In the top of the form, you have a Search field, so you can easily find specific content of the query.

In the add-in options, you can set the timeframe for keeping the closed tabs in the Tab Navigation form. All queries older than a setting of the Delete tabs older than option, under the Tab navigation option, will be automatically removed.

Executed queries

The Executed queries feature keep all executed queries saved, so you can easily browse through the list of executed queries and get back to any of them, at any time.

To enable the Executed queries, check the Log executed queries option in the add-in options, under the Executed queries tab:

addin15 SQL SERVER   Free SQL Complete Add in For SSMS

All queries executed from the moment of enabling the feature will be logged, and you can access them from the Executed queries form, which you can initiate from the add-in menu:

addin16 SQL SERVER   Free SQL Complete Add in For SSMS

In the Executed queries form, you can find details about each query executed including date of execution, status (did it failed or run successfully), which user executed particular query, related database and duration of the execution.

You can search through the content of queries for a specific information, and filter the list so it shows only queries executed in the specified time range:

addin17 SQL SERVER   Free SQL Complete Add in For SSMS

Double-clicking any of the queries from the list will open it in the query editor, so you can continue working if there is a need.

Copy result as

The Copy result as feature allows you to copy query execution results, and save it in one of the following formats: CSV, HTML, or XML. As an example, a result of a simple SELECT statement can be copied by right clicking directly in the results grid:

addin18 SQL SERVER   Free SQL Complete Add in For SSMS

Reference : Pinal Dave (http://blog.SQLAuthority.com)

SQL SERVER – Inserting into Sparse Column Sets and Errors Associated With It

I have written tons of blogs in the past around using sparse columns. The best part and lesser known part of sparse columns is that we can write to the column sets directly too. In this blog let me take a tour of how this can be achieved.

When sparse column values are inserted or updated by using the XML column set, the values that are inserted into the underlying sparse columns are implicitly converted from the xml data type. When inserting data explicitly into individual columns of a column set, you MUST use an explicit column list.

-- Create a table with sparse columns and a column set
c2 CHAR (20),
c3 DATETIME sparse,
--Try to insert a row without specifying a column list
INSERT INTO colset VALUES (3, 'abc', GETDATE())

As mentioned before, we will encounter the following error because the column names are not mentioned explicitly.

Msg 213, Level 16, State 1, Line 9

Column name or number of supplied values does not match table definition.

The correct version would be as shown below:

--Try to insert a row by specifying a column list - successful
INSERT INTO colset (c1,c2, c3) VALUES (3, 'abc', GETDATE())

Step 2, let us try to insert data into the column set itself, you have to specify XML syntax to reference the columns.

-- Insert two rows by specifying a column set- successful. Note we are specifying C2 and CS – the column set. use the table created above
INSERT colSet (c2, cs)
VALUES (3, '<c1>7</c1><c3>01/01/15</c3>');
INSERT colSet (c2, cs)
VALUES (4, '<c1>7</c1><c3>Feb 10 2015 7:05PM</c3>');

Be careful not to insert into a column set both directly via an explicit column name, and via a column set XML syntax.
--Create table
(c1 INT,
c2 CHAR (20) SPARSE,
c3 DATETIME sparse,
--Attempt to insert into table both via column name and column set
--This will fail because C2 is reference twice: explicitly on a column list and in the column set
INSERT colSet2 (c2, cset)
VALUES ('SQLAuthority', '<c2>SQLAuthority</c2><c3>01/01/08</c3>')

We will be encountering the following error in this case.

Msg 360, Level 16, State 1, Line 7

The target column list of an INSERT, UPDATE, or MERGE statement cannot contain both a sparse column and the column set that contains the sparse column. Rewrite the statement to include either the sparse column or the column set, but not both.

Let us go ahead and clean up the tables we used.

-- Clean up

Though I have written quite a few blogs on sparse columns. Do let me know if you are using any of these in your environments? What are those scenarios? Drop a comment and let us learn together.

Reference : Pinal Dave (http://blog.SQLAuthority.com)

Interview Question of the Week #029 – Difference Between CHARINDEX vs PATINDEX

Question: What is the difference between CHARINDEX vs PATINDEX?

Answer: We can use either CHARINDEX or PATINDEX to search in a TEXT field in SQL SERVER. The CHARINDEX and PATINDEX functions return the starting position of a pattern you specify.

regular expressions regex Interview Question of the Week #029   Difference Between CHARINDEX vs PATINDEX

Both functions take two arguments. With PATINDEX, you must include percent signs before and after the pattern, unless you are looking for the pattern as the first (omit the first %) or last (omit the last %) characters in a column. For CHARINDEX, the pattern cannot include wildcard characters. The second argument is a character expression, usually a column name, in which Adaptive Server searches for the specified pattern.

Example of CHARINDEX:
USE AdventureWorks2014;
SELECT CHARINDEX('important', DocumentSummary),
FROM Production.Document
WHERE Title = 'Repair and Service Guidelines';

Examples of PATINDEX:
USE AdventureWorks2014;
SELECT PATINDEX('%imp_rtant%', DocumentSummary),
FROM Production.Document
WHERE Title = 'Repair and Service Guidelines';

PATINDEX is CHARINDEX + WildCard Search. Use either of them is depending your business need.

Reference : Pinal Dave (http://blog.SQLAuthority.com)

SQL SERVER – How to Find If Queries are Run in Parallel?

Technology innovations over years have made personal computing and the infrastructure inside our datacenters even more powerful. Gone are the days when our laptops used to come with single processors and single cores. I wonder even if they ever sell such configurations in the market.

Talking about multi-cores on our desktops and servers, these days software like SQL Server just use them to the max. When working with SQL Server, there are a number of settings that influence using parallelism. Check blog “SQL SERVER – MAXDOP Settings to Limit Query to Run on Specific CPU”, “SQL SERVER – CXPACKET – Parallelism – Usual Solution – Wait Type” and many other posts on this topic.

Having said that, I have seen people struggle to identify parallel queries in their environments. So here is the first shot at this requirement.

FROM sys.dm_exec_cached_plans cp
CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) AS p
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS q
WHERE cp.cacheobjtype = 'Compiled Plan' AND p.query_plan.value('declare namespace
p="http://schemas.microsoft.com/sqlserver/2004/07/showplan"; max(//p:RelOp/@Parallel)'
, 'float') > 0

Queries that run in Parallel can be found with the above query. Remember, if a query runs in parallel it is a query that SQL Server thinks is expensive enough to run in parallel. MAX_DOP and the cost_threshold_for_parallelism drive the behavior. MAX_DOP should be configured to match the number of physical processors in the server if required.

The next step is to understand what to do when you find them? When you find them look for ways to make them run more efficiently if they are run often and their performance during business hours is critical. Check indexing in DTA for recommendations, simplify the query, remove ORDER BYs, GROUP BYs, if they aren’t necessary – these are some steps to help you guided.

Another way to find parallelism is to get queries where the amount of time spent by the workers are more than the query execution time. You can also use the below method to get the same too:

sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_sql_text(qs.plan_handle) AS q
WHERE qs.total_worker_time > qs.total_elapsed_time

I hope these two scripts will be of use and you have something similar in your environments. Please share me scenario’s where you saw parallelism perform slower and how did you find them? Do let me know via comments.

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL SERVER – FIX – Property IsLocked is Not Available for Login

I play and experiment with SQL Server to the max. Sometimes I am not sure what I did and I land into something undesired. Then the reason to troubleshoot starts and I get to learn something new. Sometimes these learnings are based on error messages that you folks send me. Recently I faced below error while going to properties of an SA account in SQL Server Management Studio.

locked 01 SQL SERVER   FIX – Property IsLocked is Not Available for Login

Here is the text of the error message.

Property IsLocked is not available for Login ‘[sa]’. This property may not exist for this object, or may not be retrievable due to insufficient access rights. (Microsoft.SqlServer.Smo)

It was clear that there is something wrong with SQL Login but interestingly it was NOT happening with all other SQL Login accounts. I have put profiler to find out what is happening in the background causing this error. I nailed it down to the problem that below query is returning NULL instead of 0 or 1.


locked 02 SQL SERVER   FIX – Property IsLocked is Not Available for Login

Now, it was interesting to find why its returning NULL. Here are some possible reasons of such behavior. Later I realized that it’s happening with only those SQL Logins which are having “Enforced Password policy” enabled. Here is the query to check such accounts.

SELECT      name
FROM  sys.sql_logins
WHERE is_policy_checked = 1

Quick solution for this would be to disable the password policy check for such account using T-SQL below.

USE [master]

A few things to check if you run into such issue on a regular basis and don’t want to disable password policy.

  1. Check if SQL Service account (the account running SQL Service in services.msc) is locked out. To verify that try logging in to some machine using service account.
  2. Check if Password policy validation is failing due to some issue with domain controller connectivity.

If it’s the first issue, then you need to contact the domain administrator to unlock service account. For the second issue, you can check security ring buffers.

SELECT  DATEADD(ms, dorf.[timestamp] - dosi.ms_ticks, GETDATE()) AS Date_Time
,CAST(record AS XML).value('(//SPID)[1]', 'bigint') AS SPID
,CAST(record AS XML).value('(//ErrorCode)[1]', 'varchar(255)') AS Error_Code
,CAST(record AS XML).value('(//CallingAPIName)[1]', 'varchar(255)') AS CallingAPIName
,CAST(record AS XML).value('(//APIName)[1]', 'varchar(255)') AS WindowsAPIName
FROM sys.dm_os_ring_buffers dorf
CROSS JOIN sys.dm_os_sys_info dosi
WHERE dorf.ring_buffer_type = 'RING_BUFFER_SECURITY_ERROR'

In my case it was a first issue, so unlocking account fixed the issue for me. Have you ever faced such an issue? What was the solution which worked for you? Let me know via your comments.

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL SERVER – What are T-SQL Median? – Notes from the Field #090

Kathi SQL SERVER   What are T SQL Median?   Notes from the Field #090[Note from Pinal]: In this episode of the Notes from the Field series database expert Kathi Kellenberger explains about T-SQL Median. Kathi is an amazing instructor, she was the SQL author I have read in my early career. The reason, I love SQL Server because her writing has instigated love for this technology in me. Today she brings a relatively interesting topic for database experts. Read the experience of  Kathi in her own words.

SQL Server has a wealth of useful T-SQL functions that let you do just about anything you need to do. One function that is sorely missing is a MEDIAN function. A median is the middle value in a set of values. For example, 3 is the median of the set 1, 3, 5. If the set is even, the median is calculated or “interpolated” by averaging the two middle values. The median of this set 1, 3, 5, 7 is 4.

You will see median values used frequently even if you do not work with statistics yourself. For example, you could see a news story listing the median household income of a city by neighborhood, the median salary for developers by language, or the median age of a population.

As I mentioned, there is no MEDIAN function in T-SQL. If, however, you are fortunate enough to be running SQL Server 2012 or 2014, there is an easy workaround. In 2012, Microsoft introduced a new function called PERCENTILE_CONT, one of the T-SQL window functions. Given a percent rank and a set of values, PERCENTILE_CONT will return the value ranked at that percent. If there is not an exact value found at the rank, PERCENTILE_CONT will interpolate the answer instead. If you supply 0.5, meaning 50%, PERCENTILE_CONT will return the median value. Here is an example that calculates the median sales by month for each year:

WITH Totals AS (
SELECT YEAR(OrderDate) AS OrderYear,
MONTH(OrderDate) AS OrderMonth,
SUM(TotalDue) AS TotalSales
FROM Sales.SalesOrderHeader
SELECT OrderYear, OrderMonth, TotalSales,
OVER(PARTITION BY Totals.OrderYear) AS MedianSales
FROM Totals
ORDER BY OrderYear, TotalSales;

Before we take a look at the results, let’s review the query. The query has a common table expression (CTE) that calculates the sales for each month and year. In the outer query, the PERCENTILE_CONT function is used with an argument of 0.5. PERCENTILE_CONT is a T-SQL window function, but the syntax for it and a similar function, PERCENTILE_DISC, is different than the other window functions. The function name is followed by the WITHIN GROUP clause. This is where the ORDER BY expression goes, not in the OVER clause. The ORDER BY expression is used by the function to line up the values so that the ranks can be compared to the percentile rank value provided as an argument. The OVER clause supports PARTITION BY, and in this case we would like to get the median within each year, so the PARTITION BY expression is OrderYear.

In 2005, there were orders placed between July and December, six months. Six is even, so the function calculates the average of two middle months as shown in Figure 1.

notes 90 1 SQL SERVER   What are T SQL Median?   Notes from the Field #090

Figure 1: The Median Sales for 2005

Every year in the data has an even number of months except for 2008. In 2008, there are seven months with sales, so the function can find an exact median value. Figure 2 shows the 2008 sales:

notes 90 2 SQL SERVER   What are T SQL Median?   Notes from the Field #090

Figure 2: The Median Sales for 2008

You may be wondering what the difference is between PERCENTILE_CONT and PERCENTILE_DISC. PERCENTILE_DISC always returns an exact value found in the set, never a calculation. For calculating the median, PERCENTILE_CONT is the function to use.

notes 82 3 SQL SERVER   What are T SQL Median?   Notes from the Field #090If you would like to learn more about T-SQL window functions, be sure to check out my latest book Expert T-SQL Window Functions in SQL Server or my Pluralsight course T-SQL Window Functions.

If you want to get started with SQL Server with the help of experts, read more over at Fix Your SQL Server.

Reference: Pinal Dave (http://blog.sqlauthority.com)