SQL SERVER – Difference between DISTINCT and GROUP BY – Distinct vs Group By

This question is asked many times to me. What is difference between DISTINCT and GROUP BY?

A DISTINCT and GROUP BY usually generate the same query plan, so performance should be the same across both query constructs. GROUP BY should be used to apply aggregate operators to each group. If all you need is to remove duplicates then use DISTINCT. If you are using sub-queries execution plan for that query varies so in that case you need to check the execution plan before making decision of which is faster.

Example of DISTINCT:
SELECT DISTINCT Employee, Rank
FROM Employees

Example of GROUP BY:
SELECT Employee, Rank
FROM Employees
GROUP BY Employee, Rank

Example of GROUP BY with aggregate function:
SELECT Employee, Rank, COUNT(*) EmployeeCount
FROM Employees
GROUP BY Employee, Rank

Reference : Pinal Dave (https://blog.sqlauthority.com)

SQL Function, SQL Joins, SQL Scripts
Previous Post
SQL SERVER – Fix : Error 8101 An explicit value for the identity column in table can only be specified when a column list is used and IDENTITY_INSERT is ON
Next Post
SQL SERVER – Index Seek Vs. Index Scan (Table Scan)

Related Posts

97 Comments. Leave new

  • Jignesh Shah
    May 5, 2010 4:11 pm

    Hi,

    I have the following data in the table.

    product price qty
    ====================
    A 10 5
    A 20 2
    A 30 1
    B 20 3
    B 10 7

    The result that i expect from SQL query
    basically the total = sum (price x qty) but quantity should be the lowest among same product

    for product A is
    LINETOTAL
    50

    for product B is
    SUMTOTAL
    70

    Could you please help me how to write the SQL query.

    Thanks,
    Jignesh

    Reply
    • Brian Tkatch
      May 6, 2010 5:20 pm

      @Jignesh

      The lowest or the highest?

      WITH
      Data(product, price, qty)
      AS
      (
      SELECT ‘A’, 10, 5 UNION ALL
      SELECT ‘A’, 20, 2 UNION ALL
      SELECT ‘A’, 30, 1 UNION ALL
      SELECT ‘B’, 20, 3 UNION ALL
      SELECT ‘B’, 10, 7
      ),
      Data_RN
      AS
      (
      SELECT
      product,
      price,
      qty,
      ROW_NUMBER() OVER(PARTITION BY Product ORDER BY qty DESC) RN
      FROM
      Data
      )
      SELECT
      product,
      price * qty
      FROM
      Data_RN
      WHERE
      RN = 1;

      Reply
  • Jignesh Shah
    May 7, 2010 10:33 am

    Thanks brian thanks a lot ! :-)

    Sorry it was for lowest quantity…

    Reply
  • Hi
    I have a problem with join and group by
    when I use group by in sub query the result will be done with 7 rows

    SELECT classNameId , max( viewCount) as viewcount
    FROM tagsasset
    group by classNameId

    classNameId viewcount
    10011 0
    10020 0
    10040 0
    10056 31
    10066 10
    10067 5409
    10111 2

    but when join it the result has 63 rows
    because I put 2 condition
    when it was do 2 conditon for equal result show all of the row have this result

    SELECT TA.title,TA.viewcount,TA.classnameid,c.value
    from tagsasset TA inner join classname_ c
    on c.classnameid=TA.classnameid
    inner join
    (SELECT classNameId , max( viewCount) as viewcount
    FROM tagsasset
    group by classNameId )groupet on
    TA.viewcount=groupet.viewcount
    and
    groupet.classnameid=TA.classnameid
    view classnameid
    0 10040
    0 10040
    0 10040
    0 10040
    0 10040
    0 10040
    0 10040
    0 10040
    0 10040
    0 10040
    0 10040
    0 10040
    0 10040
    0 10040
    0 10040
    31 10056
    10 10066
    5409 10067
    2 10111
    2 10111

    I have duplicate result
    plz help me

    Reply
  • i have results like the following

    Marea,Garea,SGarea
    bck1,gitem1,sgitem1
    bck1,gitem1,sgitem2
    bck1,gitem1,sgitem3
    bck2,gitem2,sgitem11
    bck2,gitem2,sgitem12
    bck2,gitem2,sgitem13

    i require quey which should give result like below
    (note:i wanted to show in a gridview like below)

    bck1 gitem1 sgitem1
    sgitem2
    sgitem3
    bck2 gitem2 sgitem11
    sgitem12
    sgitem13

    can any one give me some idea.how i can make it possible.

    from
    lokesh

    Reply
  • DECLARE @tmp TABLE(SrNo INT)

    INSERT INTO @tmp VALUES(3)
    INSERT INTO @tmp VALUES(3)
    INSERT INTO @tmp VALUES(4)
    INSERT INTO @tmp VALUES(5)
    INSERT INTO @tmp VALUES(5)
    INSERT INTO @tmp VALUES(5)
    INSERT INTO @tmp VALUES(9)
    INSERT INTO @tmp VALUES(8)
    INSERT INTO @tmp VALUES(9)
    INSERT INTO @tmp VALUES(9)
    INSERT INTO @tmp VALUES(9)
    INSERT INTO @tmp VALUES(9)

    SELECT SrNo,COUNT(*) AS Total
    FROM @tmp group by SrNo

    Reply
  • Hey Sally,

    U can use above query which gives the result that u expected

    Thanks
    Ram

    Reply
  • Hi All,
    I have a db with fields Part number,supplier number and country name

    I would like to get a table with one unique record for partnumber and the country1 name country 2 name,–etc..
    example this is what I have

    partno suno country
    001 123 US
    001 234 IN
    001 567 AU
    001 897 CA
    002 235 IN
    002 566 SR

    etc…

    I need a table like this

    part num country1 country 2 country 3 country 4
    001 US IN AU CA
    002 IN SR

    etc…

    Maximum suppliers will be usually from 1-10 ( Assumption)

    Could anyone please help me out with the sql query to get this result.

    .we may have use do while loop,but no clear idea

    please hep me out..

    Thanks,
    Santhu

    Reply
    • I think group_concat() function is the best solution to resolve your confusion,

      E.g.

      select
      First_column ,
      dbo.group_concat(Second_column) as “List of all countries”
      from Table_name
      group by First_column;

      Reply
  • I guess something is missing in Article. Distinct creates a temp table and remove duplicates. Group by essentially does the same things, but it has additional overhead of sorting the records. By default group by is ASC sort. so Group by is slower in cases where the key grouping column is not indexed. If the query columns are already in clustered indexed, it may be same operation.
    I would go with query plan and based on instincts will prefer Distinct over Group by

    thanks

    Reply
  • Hi

    why we should not use distinct and group by within the same sql statement or query??? any performance issue can u explain pls????????

    Reply
  • Very nice…

    Reply
  • Hi,

    i want to find distinct top.

    top 3 for all UserId’s.

    for ex:

    the table is as below:

    Id UserId Date
    1 10 10/10/2010
    1 10 10/10/2010
    1 5 10/10/2010
    1 10 10/10/2010
    1 5 10/10/2010
    1 7 10/10/2010
    1 10 10/10/2010
    1 5 10/10/2010
    1 10 10/10/2010
    1 5 10/10/2010

    In the above table userid 10, 5 & 7 has 5, 4 & 1 rows respectively.

    Now i expect 3 or below 3 from each user.

    For Ex:

    Id UserId Date
    1 10 10/10/2010
    1 10 10/10/2010
    1 10 10/10/2010
    1 5 10/10/2010
    1 5 10/10/2010
    1 5 10/10/2010
    1 7 10/10/2010

    how to solve this.

    regards
    rajendran

    Reply
  • Vicky Raj Sharma
    March 29, 2011 1:11 pm

    SELECT ROW_NUMBER() OVER (ORDER BY inserted_date DESC)
    AS Row, distinct product_type, parent_id,product_id,brand_id,device_id,p.subcategory_id,upc_code,sku,product_name,p.description,price,product_image,
    isDevice,dimension_length,dimension_length_unit,dimension_width,dimension_width_unit,
    dimension_height,dimension_height_unit,dimension_weight,dimension_weight_unit from tblProductDetails p join tblSubCategory s
    on p.subcategory_id=s.subcategory_id
    where parent_id=@cat_id and p.subcategory_id=@sub_id

    My distinct keyword is having error.

    Reply
    • You cannot use distinct that way. What do you want to get? Post some sample data with expected result

      Reply
  • pradeep Tiwari
    April 26, 2011 1:08 pm

    I guess something is missing in Article. Distinct creates a temp table and remove duplicates. Group by essentially does the same things, but it has additional overhead of sorting the records. By default group by is ASC sort. so Group by is slower in cases where the key grouping column is not indexed. If the query columns are already in clustered indexed, it may be same operation.
    I would go with query plan and based on instincts will prefer Distinct over Group by

    Reply
  • this is my query and

    select company_name ,department_name from Department_Master where Status = ‘N’ order by company_name, group by company_name

    It shows me this kind of the error

    Msg 156, Level 15, State 1, Line 1
    Incorrect syntax near the keyword ‘group’.

    What should i do to use the group by clouse in sql server 2008

    Thanks in advance

    Reply
    • It should be

      select company_name ,department_name from Department_Master where Status = ‘N’ group by company_name, order by company_name

      Reply
  • Blueprint RSA
    May 31, 2011 9:33 am

    Thanks for posting this very informational blog. Now I know the difference between DISTINCT and GROUP BY. I hope to achieve something from what I learn today.

    Reply
  • Hi Sally,

    You can do the following to get the results inon estatement.

    SELECT Col1,COUNT(*) AS Total FROM TMP
    GROUP BY Col1
    ORDER BY Col1

    Reply
  • I need your help ragarding sql query.

    Val_1, Val_2, Val_3,Val_4
    a b c 1
    a b c 2
    a b c 4
    m n o 1
    m n o 2

    and output should be

    val_1, Val_2,val_3, val_4
    a b c 1,2,4
    m n o 1,2

    Pls help if anyone know…

    Reply
  • Dear all i need a help from you people. I am using vb6 & ms sql server2000. I have two tables.

    1- Order (master table) having data

    ORDERID DATE EMPNAME
    1 1-2-1999 JOHN
    2 1-1-2010 JOHN
    3 1-12-2011 FOAD

    2-Orderdetail (child table) having data

    PRID PRNAME SP QTY TOTAL ORDERID
    11 CD 5 10 50 1
    11 CD 5 5 25 1
    12 DVD 10 5 50 2

    i wan to get data like this

    Empname Prname Qty SP total
    JOHN CD 15 5 75
    DVD 5 10 50

    Regards

    Reply
  • hi pinal,
    i find your blog very useful.
    i have been working for a airticketing service i have a table with 2 cols id and wsdate(datetime datatype) where wsdate contains date and time which got some thousands of records on different dates,i want to display distinct date col and other one is count of rows from the table on that particular date.please try to help me out

    Reply
  • Thank u sir, I forgot Distinct Clause, to remove duplicate values…. At the right time your answer helps me…alot..

    Reply

Leave a Reply