SQL SERVER – Difference between DISTINCT and GROUP BY – Distinct vs Group By

March 29, 2007

SQL, SQL Performance, SQL Server, SQL Tips and Tricks

This question is asked many times to me. What is difference between DISTINCT and GROUP BY?

A DISTINCT and GROUP BY usually generate the same query plan, so performance should be the same across both query constructs. GROUP BY should be used to apply aggregate operators to each group. If all you need is to remove duplicates then use DISTINCT. If you are using sub-queries execution plan for that query varies so in that case you need to check the execution plan before making decision of which is faster.

Example of DISTINCT:
SELECT DISTINCT Employee, Rank FROM Employees

Example of GROUP BY:
SELECT Employee, Rank FROM Employees GROUP BY Employee, Rank

Example of GROUP BY with aggregate function:
SELECT Employee, Rank, COUNT(*) EmployeeCount FROM Employees GROUP BY Employee, Rank

Reference : Pinal Dave (https://blog.sqlauthority.com)

SQL Function, SQL Joins, SQL Scripts

SQL SERVER – Fix : Error 8101 An explicit value for the identity column in table can only be specified when a column list is used and IDENTITY_INSERT is ON

SQL SERVER – Index Seek Vs. Index Scan (Table Scan)

97 Comments. Leave new

Jignesh Shah
May 5, 2010 4:11 pm
Hi,
I have the following data in the table.
product price qty
====================
A 10 5
A 20 2
A 30 1
B 20 3
B 10 7
The result that i expect from SQL query
basically the total = sum (price x qty) but quantity should be the lowest among same product
for product A is
LINETOTAL
50
for product B is
SUMTOTAL
70
Could you please help me how to write the SQL query.
Thanks,
Jignesh
Reply
- Brian Tkatch
  May 6, 2010 5:20 pm
  @Jignesh
  The lowest or the highest?
  WITH
  Data(product, price, qty)
  AS
  (
  SELECT ‘A’, 10, 5 UNION ALL
  SELECT ‘A’, 20, 2 UNION ALL
  SELECT ‘A’, 30, 1 UNION ALL
  SELECT ‘B’, 20, 3 UNION ALL
  SELECT ‘B’, 10, 7
  ),
  Data_RN
  AS
  (
  SELECT
  product,
  price,
  qty,
  ROW_NUMBER() OVER(PARTITION BY Product ORDER BY qty DESC) RN
  FROM
  Data
  )
  SELECT
  product,
  price * qty
  FROM
  Data_RN
  WHERE
  RN = 1;
  Reply
Jignesh Shah
May 7, 2010 10:33 am
Thanks brian thanks a lot ! :-)
Sorry it was for lowest quantity…
Reply
nahid
May 9, 2010 1:09 pm
Hi
I have a problem with join and group by
when I use group by in sub query the result will be done with 7 rows
SELECT classNameId , max( viewCount) as viewcount
FROM tagsasset
group by classNameId
classNameId viewcount
10011 0
10020 0
10040 0
10056 31
10066 10
10067 5409
10111 2
but when join it the result has 63 rows
because I put 2 condition
when it was do 2 conditon for equal result show all of the row have this result
SELECT TA.title,TA.viewcount,TA.classnameid,c.value
from tagsasset TA inner join classname_ c
on c.classnameid=TA.classnameid
inner join
(SELECT classNameId , max( viewCount) as viewcount
FROM tagsasset
group by classNameId )groupet on
TA.viewcount=groupet.viewcount
and
groupet.classnameid=TA.classnameid
view classnameid
0 10040
0 10040
0 10040
0 10040
0 10040
0 10040
0 10040
0 10040
0 10040
0 10040
0 10040
0 10040
0 10040
0 10040
0 10040
31 10056
10 10066
5409 10067
2 10111
2 10111
I have duplicate result
plz help me
Reply
- Brian Tkatch
  May 31, 2010 6:38 pm
  @nahid
  What should the result be?
  Reply
lokesh
May 29, 2010 6:44 pm
i have results like the following
Marea,Garea,SGarea
bck1,gitem1,sgitem1
bck1,gitem1,sgitem2
bck1,gitem1,sgitem3
bck2,gitem2,sgitem11
bck2,gitem2,sgitem12
bck2,gitem2,sgitem13
i require quey which should give result like below
(note:i wanted to show in a gridview like below)
bck1 gitem1 sgitem1
sgitem2
sgitem3
bck2 gitem2 sgitem11
sgitem12
sgitem13
can any one give me some idea.how i can make it possible.
from
lokesh
Reply
Ram
July 8, 2010 7:18 am
DECLARE @tmp TABLE(SrNo INT)
INSERT INTO @tmp VALUES(3)
INSERT INTO @tmp VALUES(3)
INSERT INTO @tmp VALUES(4)
INSERT INTO @tmp VALUES(5)
INSERT INTO @tmp VALUES(5)
INSERT INTO @tmp VALUES(5)
INSERT INTO @tmp VALUES(9)
INSERT INTO @tmp VALUES(8)
INSERT INTO @tmp VALUES(9)
INSERT INTO @tmp VALUES(9)
INSERT INTO @tmp VALUES(9)
INSERT INTO @tmp VALUES(9)
SELECT SrNo,COUNT(*) AS Total
FROM @tmp group by SrNo
Reply
Ram
July 8, 2010 7:19 am
Hey Sally,
U can use above query which gives the result that u expected
Thanks
Ram
Reply
Santhu
August 13, 2010 8:52 pm
Hi All,
I have a db with fields Part number,supplier number and country name
I would like to get a table with one unique record for partnumber and the country1 name country 2 name,–etc..
example this is what I have
partno suno country
001 123 US
001 234 IN
001 567 AU
001 897 CA
002 235 IN
002 566 SR
etc…
I need a table like this
part num country1 country 2 country 3 country 4
001 US IN AU CA
002 IN SR
etc…
Maximum suppliers will be usually from 1-10 ( Assumption)
Could anyone please help me out with the sql query to get this result.
.we may have use do while loop,but no clear idea
please hep me out..
Thanks,
Santhu
Reply
- Kirtan
  February 8, 2017 11:42 am
  I think group_concat() function is the best solution to resolve your confusion,
  E.g.
  select
  First_column ,
  dbo.group_concat(Second_column) as “List of all countries”
  from Table_name
  group by First_column;
  Reply
  - Pinal Dave
    February 14, 2017 5:00 am
    I agree with Kirtan.
jazzy
August 24, 2010 7:46 am
I guess something is missing in Article. Distinct creates a temp table and remove duplicates. Group by essentially does the same things, but it has additional overhead of sorting the records. By default group by is ASC sort. so Group by is slower in cases where the key grouping column is not indexed. If the query columns are already in clustered indexed, it may be same operation.
I would go with query plan and based on instincts will prefer Distinct over Group by
thanks
Reply
KUMAR
August 24, 2010 12:48 pm
Hi
why we should not use distinct and group by within the same sql statement or query??? any performance issue can u explain pls????????
Reply
Manikandan
November 17, 2010 12:30 pm
Very nice…
Reply
raj
December 13, 2010 5:37 pm
Hi,
i want to find distinct top.
top 3 for all UserId’s.
for ex:
the table is as below:
Id UserId Date
1 10 10/10/2010
1 10 10/10/2010
1 5 10/10/2010
1 10 10/10/2010
1 5 10/10/2010
1 7 10/10/2010
1 10 10/10/2010
1 5 10/10/2010
1 10 10/10/2010
1 5 10/10/2010
In the above table userid 10, 5 & 7 has 5, 4 & 1 rows respectively.
Now i expect 3 or below 3 from each user.
For Ex:
Id UserId Date
1 10 10/10/2010
1 10 10/10/2010
1 10 10/10/2010
1 5 10/10/2010
1 5 10/10/2010
1 5 10/10/2010
1 7 10/10/2010
how to solve this.
regards
rajendran
Reply
- madhivanan
  December 15, 2010 5:25 pm
  Make sure to read this
  Reply
Vicky Raj Sharma
March 29, 2011 1:11 pm
SELECT ROW_NUMBER() OVER (ORDER BY inserted_date DESC)
AS Row, distinct product_type, parent_id,product_id,brand_id,device_id,p.subcategory_id,upc_code,sku,product_name,p.description,price,product_image,
isDevice,dimension_length,dimension_length_unit,dimension_width,dimension_width_unit,
dimension_height,dimension_height_unit,dimension_weight,dimension_weight_unit from tblProductDetails p join tblSubCategory s
on p.subcategory_id=s.subcategory_id
where parent_id=@cat_id and p.subcategory_id=@sub_id
My distinct keyword is having error.
Reply
- madhivanan
  April 1, 2011 11:32 am
  You cannot use distinct that way. What do you want to get? Post some sample data with expected result
  Reply
pradeep Tiwari
April 26, 2011 1:08 pm
I guess something is missing in Article. Distinct creates a temp table and remove duplicates. Group by essentially does the same things, but it has additional overhead of sorting the records. By default group by is ASC sort. so Group by is slower in cases where the key grouping column is not indexed. If the query columns are already in clustered indexed, it may be same operation.
I would go with query plan and based on instincts will prefer Distinct over Group by
Reply
harsh shah
May 2, 2011 6:50 pm
this is my query and
select company_name ,department_name from Department_Master where Status = ‘N’ order by company_name, group by company_name
It shows me this kind of the error
Msg 156, Level 15, State 1, Line 1
Incorrect syntax near the keyword ‘group’.
What should i do to use the group by clouse in sql server 2008
Thanks in advance
Reply
- madhivanan
  May 4, 2011 8:25 pm
  It should be
  select company_name ,department_name from Department_Master where Status = ‘N’ group by company_name, order by company_name
  Reply
Blueprint RSA
May 31, 2011 9:33 am
Thanks for posting this very informational blog. Now I know the difference between DISTINCT and GROUP BY. I hope to achieve something from what I learn today.
Reply
Varun
June 15, 2011 6:15 pm
Hi Sally,
You can do the following to get the results inon estatement.
SELECT Col1,COUNT(*) AS Total FROM TMP
GROUP BY Col1
ORDER BY Col1
Reply
Deepak Gakhar
September 1, 2011 6:13 pm
I need your help ragarding sql query.
Val_1, Val_2, Val_3,Val_4
a b c 1
a b c 2
a b c 4
m n o 1
m n o 2
and output should be
val_1, Val_2,val_3, val_4
a b c 1,2,4
m n o 1,2
Pls help if anyone know…
Reply
Muhammad Umer
December 1, 2011 10:21 pm
Dear all i need a help from you people. I am using vb6 & ms sql server2000. I have two tables.
1- Order (master table) having data
ORDERID DATE EMPNAME
1 1-2-1999 JOHN
2 1-1-2010 JOHN
3 1-12-2011 FOAD
2-Orderdetail (child table) having data
PRID PRNAME SP QTY TOTAL ORDERID
11 CD 5 10 50 1
11 CD 5 5 25 1
12 DVD 10 5 50 2
i wan to get data like this
Empname Prname Qty SP total
JOHN CD 15 5 75
DVD 5 10 50
Regards
Reply
dolly
April 2, 2012 7:17 pm
hi pinal,
i find your blog very useful.
i have been working for a airticketing service i have a table with 2 cols id and wsdate(datetime datatype) where wsdate contains date and time which got some thousands of records on different dates,i want to display distinct date col and other one is count of rows from the table on that particular date.please try to help me out
Reply
JideshDavid
June 3, 2012 12:18 am
Thank u sir, I forgot Distinct Clause, to remove duplicate values…. At the right time your answer helps me…alot..
Reply