Jr. DBA asked me a day ago, how to apply DISTINCT keyword to only first column of SELECT. When asked for additional information about question, he showed me following query.
SELECT Roles, FirstName, LastName
FROM UserNames
He wanted to apply DISTINCT to only Roles and not across FirstName and LastName. When he finished I realize that it is not possible and there is logical error in thinking query like that. I helped him with what he needed however, after he left I realize that answer to his original question was “NO”.
Distinct can not be applied to only few columns it is always applied to whole column. In following exam DISTINCT is applied across Roles, FirstName and LastName.
SELECT DISTINCT Roles, FirstName, LastName
FROM UserNames
Same result as above query can be achieved by using GROUP BY as well.
SELECT Roles, FirstName, LastName
FROM UserNames
GROUP BY Roles, FirstName, LastName
Before I end discussion I would like to add that DISTINCT can be used with aggregated functions and can be applied on only one field in that case. Following example demonstrates that scenario.
SELECT MAX(DISTNCT ID), Roles, FirstName, LastName
FROM UserNames, LastName
This subject is very interesting and I decided to search online if any other writers has explained this in depth. I found following good article written by Jeff Smith. He has explained this example in depth so I have skipped explaining similar example in depth.
Reference : Pinal Dave (https://blog.sqlauthority.com)
29 Comments. Leave new
In the following sentence you made a mistake: “Distinct can not be applied to only few columns it is always applied to whole column.” You must have meant the “whole row”, not column. And while at it, you may want to add a punctuation sign in front of “it is…”. :)
my concept is not clear after reading your discussion
It seems to me that M$ SQL Server cannot “GROUP BY” one column of a result set, nor can it select only “DISTINCT” values of one column of the result set.
MySQL can easily do this with the following code:
“SELECT DISTINCT Roles, FirstName, LastName FROM UserNames”
Which FirstName, LastName does it pick for the result, if there are mulitiple rows with same Roles but different names?
h i
using stored produce i want to remove the duplicate value in the single colums
Hi,
I have a similar problem but cant fix it can you please help me, problem is:
I have a table with following columns: basketid, lookup_value, date_time.
I only want to return distinct basketid and lookup value rows and also want to include date_time row as well. but it is not distinct. any solutions please. Group by and distinct does not solve my problem.
Thank you
@Naeem
One way to do that is use GROUP BY, But there is a serious problem with GROUP BY, which I hate saying but true, You have to include all the columns in GROUP BY clause what you have mentioned in SELECT clause.
Anyways your best choice would be to use OVER PARTITION BY in your select query, this is a new feature in sql server 2005.
OVER PARTITION BY will serves your purpose. This is just like GROUP BY but here you have freedom to exclude columns that you do not want to include in GROUP BY clause.
GROUP BY is nothing but grouping or distinct (for that match of columns) in a way.
Find simple example of OVER PARTITION BY here,
~ IM.
I’m surprised at your statement about ” logical error in thinking query like that… ”
I have a situation where I’m asked for a sample of 100 records from a 10,000 record table. The table has the following columns ( ID, PayerID, Category, Status, CreateDate )
In my Sample file I need to have a 100 records with all the columns but each record should have a DISTINCT PayerID.
Explain to me please on how would you solve this?
Thanx
Kinara
@Kinara,
Need more details.
Do you have duplicate records for the same PayerID with different combinations of ID< Category, Status and CreateDate. If there exusts multiple records for same PayerID, I dont think, you can achieve your tasks by a simple select statement.
Please provide some sample data.
Input and what kind of output you are expecting.
~ IM.
How to select only one category of dupliacte records and delete that excluding one …
for example consider in my DB table i have 200 records
in that i have more that one person duplicate records, now
i want to delete one user duplicate records.
that is if person one has entered his data 5 times with same details, i have to check that and have to delete 4 record of that person and keep one record… i may have more than one person records in that way in same table
and one more thing here is in this table their is no autonumber or primary key,
i have to select only one person record and delet his duplicate entries … that person records at a time
Hi Shaf,
If you are using SQL 2005 and above, you can use following:
;with cte as(
SELECT Row_NUMBER() OVER(PARTITION BY Person ORDER BY Person) AS RowID,
*
from table
)
select *
from cte
where RowID > 1
It will list out all records that you need to delete. If you are sure that you can delete this. use this query:
;with cte as(
SELECT Row_NUMBER() OVER(PARTITION BY Person ORDER BY Person) AS RowID,
*
from table
)
DELETE
from cte
where RowID > 1
For more information:
Thanks,
Tejas
I’m actually working on a sample of 100 records from a 10,000 record table. The table has the following columns ( ID, PayerID, Category, Status, CreateDate )
In my file I need to have a 100 records with all the columns but each record should have a DISTINCT PayerID.
How would I do that ?
Hi,
I would like to know wether the distinct can be applied to each column in a table or not… if so can you please
explain..
e.g.
I have a table containing a firstname, lastname, salary, designation, dob, doj,
—-
I would like to apply distinct keyword for all the column in a table.
Thanks in Advance,
Prithviraj MK
I have
select distinct a,b,c,database from
(
select a,b,c,’A’ as database from tableA
union
select a,b,c,’B’ as database from tableB
) as Everything
with tableA and tableB having lots of data in common, and I expect this query to give all the rows with different value of a.
Where’s the logical error? Is there any way to have this simple query done in SQLServer?
As distinct clause affects performance of the application. What is the better option to DISTINCT clause. In T- SQL only.
Can you post the code you have used? If you used group by clause, you dont need a distinct clause
Hawing worked extensively with MySQL and only about 2 years on Sql Server I have encountered a lot of headaches with Sql Server’s group-by, distinct and in general to get UNIQUE ROWS on joins.
The fact that you have to include all columns in a group-by is just silly and I think Sql Server is perhaps not as intelligent as MySql in its query plans.
In Sql Server the way to get recordsets for pagination is also unnecessarily complex TOP X and order-by, OVER PARTITION etc is ridiculous, in MySQL its as simple as LIMIT offset count.
shal i use..Distinct(Columname) like this
I need result by using DISTINCT for one column only, not whole row then how to proceed.
You need to either exclude other columns, or use aggregate functions like min or mx on other columns
If GROUP BY can do same work as DISTINCT then why DISTINCT is needed.
In some cases they do the same thing and not all the cases. They exists for different purposes
I have 4 result looks like
ram 10 1001
ram 10 1002
shyam 30 5431
shyam 30 5567
now i have to select only 2nd and 4th row
how to do it?
select col1, col2, max(col3) as col3 from table
group by col1,col2