This question is asked many times to me. What is difference between DISTINCT and GROUP BY?
A DISTINCT and GROUP BY usually generate the same query plan, so performance should be the same across both query constructs. GROUP BY should be used to apply aggregate operators to each group. If all you need is to remove duplicates then use DISTINCT. If you are using sub-queries execution plan for that query varies so in that case you need to check the execution plan before making decision of which is faster.
Example of DISTINCT:
SELECT DISTINCT Employee, Rank
FROM Employees
Example of GROUP BY:
SELECT Employee, Rank
FROM Employees
GROUP BY Employee, Rank
Example of GROUP BY with aggregate function:
SELECT Employee, Rank, COUNT(*) EmployeeCount
FROM Employees
GROUP BY Employee, Rank
Reference : Pinal Dave (http://blog.SQLAuthority.com)




Very simple and clear explanation, very easy to understand
hay can u tell that how can i found average of colume valu in crystal reprot……not use cross tab only by queryes…. of many
hey…
i have a performance issue with me.
in one query and corresponding sub queries they have used so many distinct clauses . so suggest me the best replacement
to the distinct to improve the performance.
I have view reteriving records from multiple tables using join.. I want DISTINCT records but for speific field in Jet SQL you can do it by DISTINCT ( Employee.EmployeeId), …How in SQL server… thanks
Hi Pinal
I m visit this site for long period really i felt this site is very nice and explanation is very good.
SELECT DISTINCT Purchase.SessionId, UserInfo.UserId, UserInfo.UserName, Purchase.grandTotal, Purchase.PurchaseDate, Purchase.Status
FROM UserInfo INNER JOIN Purchase ON UserInfo.UserId=Purchase.Uid;
but this doesn’t generate correct result ……………… plz reply whats problem with that
Hello Pinal:
There is a table t_test with columns
[id, name, code, description].
I can write a query using MySQL as
SELECT
*
FROM t_test
GROUP BY Code
What would be equivalent sql server query for this.
Thanks.
hh, I can’t really say what the query should be without knowing what your expected result is.
Having a sample of a few typical data rows from the UserInfo and Purchase tables might also be neccessary.
However, “SELECT DISTINCT” performs a hash match on the selected fields, and omits any records producing a duplicate hash. In other words, the query you used will return any record where the returned column values, taken together, are different from any other record in the result set.
In general, when using “distinct”, more columns selected = more rows returned. If you need to select all of these columns into the result set, but are only concerned that a few of them be unique, then you need a different construct. Two possibilities are:
1. Use “GROUP BY” to eliminate duplicates on the desired columns, and then apply aggregate functions (MIN, MAX, AVG, etc) to the remaining columns.
2. Use subqueries or self-joins to limit the records if you need selected columns to be unique and don’t want to aggregate the other columns.
I have an issue with both DISTINCT and GROUP BY.
The issue is that using either one, the results comes back ordered as if using order by.
I need the unique results in the order they are in the database. How can I do this? Thanks
i have a question regarding search criteria
Select lastname,firstname from TblResourceProfile
where @crit like in(lastname,firstname)
dont know where to put the wild card ‘%’
i
thanks by the way….;)
Select lastname,firstname from TblResourceProfile
where @crit like in(lastname,firstname)
dont know where to put the wild card ‘%’
can this be possible?
There is a third approach if you have a master and detail table – WHERE EXISTS. For example, if you have an OrderHeader table and an OrderItem table that have a one to many relationship on OrderHeaderId, you can decide unique orders in OrderItem one of 2 ways
SELECT DISTINCT orderheaderid from orderitem
SELECT orderheaderid FROM orderheader oh WHERE EXISTS (SELECT 1 FROM OrderItem WHERE OrderHeaderId=oh.OrderHeaderId)
If these queries are run together, the “DISTINCT” query takes about 2/3 total query time, with WHERE EXISTS taken up a 1/3.
(Ahh but why dont you just select from OrderHeader in this case. Well that would not omit orders that have no item records – if thats how you represent cancelled orders)
In production code, always try to write SQL queries that produce unique results WITHOUT DISTINCT clauses. If performance is what you are after and you have one to many table relationships, use WHERE EXISTS.
We have published an article of SQL Performance Do’s and Dont’s if thats of help to anyone.
Steve Hatchard
Director
Mattched IT Ltd
Is there a way to execute a DISTINCT query on a single field while the SELECT statement contains more fields. For example this statement:
SELECT DISTINCT EmployeeID, EmployeeName, ContractDate, CurrentTitlle FROM EmployeeTable
I am only interested to fetch DISTINCT reocrds based on the EmployeeID. Please note that more than one occurence of each employee’s record exists in this table. Therefore getting DISTINCT on all fields is not possible.
If DISTINCT cannot solve this problem, what is other best approach in achieve this?
Tariq Changgez
Your best answer is to have two tables – one for unique employee data EmployeeId, EmployeeName, CurrentTitle etc.. and one for multiple row data – EmployeeId, ContractDate. It is usually best practice to “normalise” data like this.
This off topic and not really an issue for DISTINCT.
Steve Hatchard
Director
Mattched IT Ltd
When request is very simple, Distinct can be used else always use Group By. really faster.
hi pinal,
i’m struck with an issue related to distinct clause.
i’ve fired a query in which i’ve used distict clause. As per that i’m getting distinct rows but they are repeatative in nature.
e.g. if i’m getting 3 distinct rows from my query; i’m getting 1st row many hundreds of time, followed by 2nd and 3rd row in the same manner. How to suppress these repeated rows and how to only three distinct rows as a output?
Thanx in advance.
i like pinal dave blogs
[...] SQL SERVER – Difference between DISTINCT and GROUP BY – Distinct vs Group By [...]
Hi Pinal,
I have a situation, I want to count the amount of time a similar name occurs in a table.
E.g.
Table Buses,
Starmart
Starmart
Starmart
Starmart
Starmart
Damai
Damai
Damai
Damai
Damai
Damai
Damai
Damai
Senibudaya
Senibudaya
Senibudaya
Senibudaya
Senibudaya
Senibudaya
Senibudaya
Senibudaya
How do i actually type out a sql script to group and return the count for each name.
This is what i used so far
select CurrentOperator, COUNT(*) CurrentOperator from EZBusDB.dbo.Buses
GROUP BY CurrentOperator,
I am getting this error:
Msg 102, Level 15, State 1, Line 2
Incorrect syntax near ‘,’.
How i go about it?
Thank you!
“select CurrentOperator, COUNT(*) CurrentOperator from EZBusDB.dbo.Buses
GROUP BY CurrentOperator”
Delete Comma(,) At the End of Your Command
@shwaria
Remove comma (,) at the end of second line, to reduce confusion, you can use different column name.
select CurrentOperator, COUNT(*) Count_CurrentOperator from EZBusDB.dbo.Buses
GROUP BY CurrentOperator
~IM
I read your posts for quite a long time and should tell that your posts always prove to be of a high value and quality for readers.
Thank you Ted
hi thanks so much for between dif but when and why used this arti……..
Hye!
I’m trying a simple query with DISTINCT keyword as:
SELECT DISTINCT SecCode FROM Sections ORDER BY SecCode
in SQL Server 2005, but it is not giving me any result.
Can anyone help me?
Thanks!
Dear
Could u plz suggest me a technique where i can implement transactional replication technique with ONE – MANY Tables relationships……….. i want to replicate the different remote databases with same name n having same articles and structures in centralized single database with same single articles………………..
select DENSE_RANK() Over (Order by col1, col2 ) as Ranky, col1, col2 from table where…
group By Col1, col2 — gives distinct
Thanks Pinal,
These posts are very useful.
I found them after attending your presentation in teched delhi.
i try to get distinct values for name and get all other fields with out where condition
ex. table
id name level department companyid
1 ram 1 1 1
2 ram 2 2 2
3 kumar 3 3 3
4 Raja 4 4 4
i expect value is
1 ram 1 1 1
3 kumar 3 3 3
4 Raja 4 4 4
(or)
1 ram 1 1 1
2 ram 2 2 2
4 Raja 4 4 4
Thanks Pinal.
Very useful posts.
Hello I am trying to make a selection in 3 different tables using distinct, but I keep getting duplicate values, my tables are Alumni, History, and Groups, I get the id and name of students enrolled in the group 230, my query is
SELECT DISTINCT H. Go, A. LastName, A. Name FROM History H, Alumni A, Groups I WHERE Group = 230 AND H. H.id = I.id AND ORDER BY I.key = A.key H.id ” )
appear two students with the same value:
9876 M John
9876 M John
I hope you can help me as no way out, and I’m starting to use SQL.
@Lizbeth Morales
The query didn’t seem to come through correctly.
Please repost it, along with the COLUMNs in each TABLE used. It is hard to determine what is going on without them.