SQL SERVER – DISTINCT Keyword Usage and Common Discussion

December 20, 2007

Jr. DBA asked me a day ago, how to apply DISTINCT keyword to only first column of SELECT. When asked for additional information about question, he showed me following query.

SELECT Roles, FirstName, LastName FROM UserNames
He wanted to apply DISTINCT to only Roles and not across FirstName and LastName. When he finished I realize that it is not possible and there is logical error in thinking query like that. I helped him with what he needed however, after he left I realize that answer to his original question was “NO”.

Distinct can not be applied to only few columns it is always applied to whole column. In following exam DISTINCT is applied across Roles, FirstName and LastName.
SELECT DISTINCT Roles, FirstName, LastName FROM UserNames

Same result as above query can be achieved by using GROUP BY as well.
SELECT Roles, FirstName, LastName FROM UserNames GROUP BY Roles, FirstName, LastName

Before I end discussion I would like to add that DISTINCT can be used with aggregated functions and can be applied on only one field in that case. Following example demonstrates that scenario.
SELECT MAX(DISTNCT ID), Roles, FirstName, LastName FROM UserNames, LastName

This subject is very interesting and I decided to search online if any other writers has explained this in depth. I found following good article written by Jeff Smith. He has explained this example in depth so I have skipped explaining similar example in depth.

Reference : Pinal Dave (https://blog.sqlauthority.com)

SQL Scripts

SQL SERVER – Cumulative Update Package 5 for SQL Server 2005 Service Pack 2

SQL SERVER – Enabling Clustered and Non-Clustered Indexes – Interesting Fact

29 Comments. Leave new

Roman
December 22, 2007 7:06 am
In the following sentence you made a mistake: “Distinct can not be applied to only few columns it is always applied to whole column.” You must have meant the “whole row”, not column. And while at it, you may want to add a punctuation sign in front of “it is…”. :)
Reply
Dash
March 26, 2008 11:40 am
my concept is not clear after reading your discussion
Reply
Steven
September 3, 2008 2:37 pm
It seems to me that M$ SQL Server cannot “GROUP BY” one column of a result set, nor can it select only “DISTINCT” values of one column of the result set.
MySQL can easily do this with the following code:
“SELECT DISTINCT Roles, FirstName, LastName FROM UserNames”
Reply
- rlj
  December 14, 2013 12:45 am
  Which FirstName, LastName does it pick for the result, if there are mulitiple rows with same Roles but different names?
  Reply
parthiban
October 6, 2008 12:38 pm
h i
using stored produce i want to remove the duplicate value in the single colums
Reply
Naeem
April 8, 2009 5:04 am
Hi,
I have a similar problem but cant fix it can you please help me, problem is:
I have a table with following columns: basketid, lookup_value, date_time.
I only want to return distinct basketid and lookup value rows and also want to include date_time row as well. but it is not distinct. any solutions please. Group by and distinct does not solve my problem.
Thank you
Reply
Imran Mohammed
April 8, 2009 10:02 am
@Naeem
One way to do that is use GROUP BY, But there is a serious problem with GROUP BY, which I hate saying but true, You have to include all the columns in GROUP BY clause what you have mentioned in SELECT clause.
Anyways your best choice would be to use OVER PARTITION BY in your select query, this is a new feature in sql server 2005.
OVER PARTITION BY will serves your purpose. This is just like GROUP BY but here you have freedom to exclude columns that you do not want to include in GROUP BY clause.
GROUP BY is nothing but grouping or distinct (for that match of columns) in a way.
Find simple example of OVER PARTITION BY here,
~ IM.
Reply
Kinara
May 19, 2009 3:41 am
I’m surprised at your statement about ” logical error in thinking query like that… ”
I have a situation where I’m asked for a sample of 100 records from a 10,000 record table. The table has the following columns ( ID, PayerID, Category, Status, CreateDate )
In my Sample file I need to have a 100 records with all the columns but each record should have a DISTINCT PayerID.
Explain to me please on how would you solve this?
Thanx
Kinara
Reply
Imran Mohammed
May 19, 2009 9:34 am
@Kinara,
Need more details.
Do you have duplicate records for the same PayerID with different combinations of ID< Category, Status and CreateDate. If there exusts multiple records for same PayerID, I dont think, you can achieve your tasks by a simple select statement.
Please provide some sample data.
Input and what kind of output you are expecting.
~ IM.
Reply
shaf
April 21, 2010 12:29 pm
How to select only one category of dupliacte records and delete that excluding one …
for example consider in my DB table i have 200 records
in that i have more that one person duplicate records, now
i want to delete one user duplicate records.
that is if person one has entered his data 5 times with same details, i have to check that and have to delete 4 record of that person and keep one record… i may have more than one person records in that way in same table
Reply
shaf
April 21, 2010 1:35 pm
and one more thing here is in this table their is no autonumber or primary key,
i have to select only one person record and delet his duplicate entries … that person records at a time
Reply
Tejas Shah
April 21, 2010 4:07 pm
Hi Shaf,
If you are using SQL 2005 and above, you can use following:
;with cte as(
SELECT Row_NUMBER() OVER(PARTITION BY Person ORDER BY Person) AS RowID,
*
from table
)
select *
from cte
where RowID > 1
It will list out all records that you need to delete. If you are sure that you can delete this. use this query:
;with cte as(
SELECT Row_NUMBER() OVER(PARTITION BY Person ORDER BY Person) AS RowID,
*
from table
)
DELETE
from cte
where RowID > 1
For more information:
Thanks,
Tejas
Reply
Edolmen Sviluppo Siti Web
June 1, 2010 2:48 pm
I’m actually working on a sample of 100 records from a 10,000 record table. The table has the following columns ( ID, PayerID, Category, Status, CreateDate )
In my file I need to have a 100 records with all the columns but each record should have a DISTINCT PayerID.
How would I do that ?
Reply
Prithviraj MK
August 10, 2010 6:12 pm
Hi,
I would like to know wether the distinct can be applied to each column in a table or not… if so can you please
explain..
e.g.
I have a table containing a firstname, lastname, salary, designation, dob, doj,
—-
I would like to apply distinct keyword for all the column in a table.
Thanks in Advance,
Prithviraj MK
Reply
AndresP
January 18, 2011 11:49 pm
I have
select distinct a,b,c,database from
(
select a,b,c,’A’ as database from tableA
union
select a,b,c,’B’ as database from tableB
) as Everything
with tableA and tableB having lots of data in common, and I expect this query to give all the rows with different value of a.
Where’s the logical error? Is there any way to have this simple query done in SQLServer?
Reply
Shirish Kulkarni
August 5, 2011 2:11 pm
As distinct clause affects performance of the application. What is the better option to DISTINCT clause. In T- SQL only.
Reply
- madhivanan
  August 5, 2011 3:35 pm
  Can you post the code you have used? If you used group by clause, you dont need a distinct clause
  Reply
Christof Coetzee
November 18, 2011 4:41 pm
Hawing worked extensively with MySQL and only about 2 years on Sql Server I have encountered a lot of headaches with Sql Server’s group-by, distinct and in general to get UNIQUE ROWS on joins.
The fact that you have to include all columns in a group-by is just silly and I think Sql Server is perhaps not as intelligent as MySql in its query plans.
In Sql Server the way to get recordsets for pagination is also unnecessarily complex TOP X and order-by, OVER PARTITION etc is ridiculous, in MySQL its as simple as LIMIT offset count.
Reply
Veeresh
November 28, 2011 4:30 pm
shal i use..Distinct(Columname) like this
Reply
Chinmoy
May 18, 2012 4:21 pm
I need result by using DISTINCT for one column only, not whole row then how to proceed.
Reply
- madhivanan
  May 23, 2012 3:49 pm
  You need to either exclude other columns, or use aggregate functions like min or mx on other columns
  Reply
prasanna
May 21, 2012 12:08 am
If GROUP BY can do same work as DISTINCT then why DISTINCT is needed.
Reply
- madhivanan
  May 23, 2012 3:34 pm
  In some cases they do the same thing and not all the cases. They exists for different purposes
  Reply
Abhishek Majumder
September 29, 2012 8:47 pm
I have 4 result looks like
ram 10 1001
ram 10 1002
shyam 30 5431
shyam 30 5567
now i have to select only 2nd and 4th row
how to do it?
Reply
- madhivanan
  November 15, 2012 4:56 pm
  select col1, col2, max(col3) as col3 from table
  group by col1,col2
  Reply