SQL SERVER – Difference between DISTINCT and GROUP BY – Distinct vs Group By

This question is asked many times to me. What is difference between DISTINCT and GROUP BY?

A DISTINCT and GROUP BY usually generate the same query plan, so performance should be the same across both query constructs. GROUP BY should be used to apply aggregate operators to each group. If all you need is to remove duplicates then use DISTINCT. If you are using sub-queries execution plan for that query varies so in that case you need to check the execution plan before making decision of which is faster.

Example of DISTINCT:
SELECT DISTINCT Employee, Rank
FROM Employees

Example of GROUP BY:
SELECT Employee, Rank
FROM Employees
GROUP BY Employee, Rank

Example of GROUP BY with aggregate function:
SELECT Employee, Rank, COUNT(*) EmployeeCount
FROM Employees
GROUP BY Employee, Rank

Reference : Pinal Dave (https://blog.sqlauthority.com)

, ,
Previous Post
SQL SERVER – Fix : Error 8101 An explicit value for the identity column in table can only be specified when a column list is used and IDENTITY_INSERT is ON
Next Post
SQL SERVER – Index Seek Vs. Index Scan (Table Scan)

Related Posts

96 Comments. Leave new

  • Samba Shiva Reddy
    April 28, 2010 4:56 pm

    CREATE table tmp (SrNo INT)
    INSERT INTO tmp VALUES(3)
    INSERT INTO tmp VALUES(3)
    INSERT INTO tmp VALUES(4)
    INSERT INTO tmp VALUES(5)
    INSERT INTO tmp VALUES(5)
    INSERT INTO tmp VALUES(5)
    INSERT INTO tmp VALUES(9)
    INSERT INTO tmp VALUES(8)
    INSERT INTO tmp VALUES(9)
    INSERT INTO tmp VALUES(9)
    INSERT INTO tmp VALUES(9)
    INSERT INTO tmp VALUES(9)

    select top 1 srno
    from tmp
    group by srno
    order by count(srno)
    desc

    Reply
  • thanks Samba Shiva Reddy for your help

    Reply
  • does anyone help me
    ……..how can I Make OLAP in SQl server 2005

    Reply
  • how can i Create Local Cubes with Microsoft SQL Server 2005

    Reply
  • HI, i am new to OLAP in Sql Server 2005, bascially i have read through lots of article on this OLAP and basically i get to know how it works and function theorically, but i could not find an article to show me step by step on how to create the cube and how to query the data from the cube. Would very much appreciated if somebody out there can explain to me in details how to i create cube from sql server 2005 as i totally no idea at all. Though i can view the sample cude in sql server 2005, but i dunno how to create it from my relational database. I need it very urgent as needed by my job functionality. Hope to get some explanation from you guys asap. Just assume i am totally new and please guide me step by step to create a normal and simple cube and query from the cube. I just need to create simple prototype. Thanks alot yea guys :)

    Reply
  • I’ll use DISTINCT as it is the true meaning of what I want to achieve… I had just heard in the past that GROUP BY was a lot more performant, but like you say it should execute the same query plan – and I can always check that too!

    So thanks, I am now not as “anti-DISTINCT” as I was before I read your post. ;-)

    Cheers.

    Reply
  • Jignesh Shah
    May 5, 2010 4:11 pm

    Hi,

    I have the following data in the table.

    product price qty
    ====================
    A 10 5
    A 20 2
    A 30 1
    B 20 3
    B 10 7

    The result that i expect from SQL query
    basically the total = sum (price x qty) but quantity should be the lowest among same product

    for product A is
    LINETOTAL
    50

    for product B is
    SUMTOTAL
    70

    Could you please help me how to write the SQL query.

    Thanks,
    Jignesh

    Reply
    • Brian Tkatch
      May 6, 2010 5:20 pm

      @Jignesh

      The lowest or the highest?

      WITH
      Data(product, price, qty)
      AS
      (
      SELECT ‘A’, 10, 5 UNION ALL
      SELECT ‘A’, 20, 2 UNION ALL
      SELECT ‘A’, 30, 1 UNION ALL
      SELECT ‘B’, 20, 3 UNION ALL
      SELECT ‘B’, 10, 7
      ),
      Data_RN
      AS
      (
      SELECT
      product,
      price,
      qty,
      ROW_NUMBER() OVER(PARTITION BY Product ORDER BY qty DESC) RN
      FROM
      Data
      )
      SELECT
      product,
      price * qty
      FROM
      Data_RN
      WHERE
      RN = 1;

      Reply
  • Jignesh Shah
    May 7, 2010 10:33 am

    Thanks brian thanks a lot ! :-)

    Sorry it was for lowest quantity…

    Reply
  • Hi
    I have a problem with join and group by
    when I use group by in sub query the result will be done with 7 rows

    SELECT classNameId , max( viewCount) as viewcount
    FROM tagsasset
    group by classNameId

    classNameId viewcount
    10011 0
    10020 0
    10040 0
    10056 31
    10066 10
    10067 5409
    10111 2

    but when join it the result has 63 rows
    because I put 2 condition
    when it was do 2 conditon for equal result show all of the row have this result

    SELECT TA.title,TA.viewcount,TA.classnameid,c.value
    from tagsasset TA inner join classname_ c
    on c.classnameid=TA.classnameid
    inner join
    (SELECT classNameId , max( viewCount) as viewcount
    FROM tagsasset
    group by classNameId )groupet on
    TA.viewcount=groupet.viewcount
    and
    groupet.classnameid=TA.classnameid
    view classnameid
    0 10040
    0 10040
    0 10040
    0 10040
    0 10040
    0 10040
    0 10040
    0 10040
    0 10040
    0 10040
    0 10040
    0 10040
    0 10040
    0 10040
    0 10040
    31 10056
    10 10066
    5409 10067
    2 10111
    2 10111

    I have duplicate result
    plz help me

    Reply
  • i have results like the following

    Marea,Garea,SGarea
    bck1,gitem1,sgitem1
    bck1,gitem1,sgitem2
    bck1,gitem1,sgitem3
    bck2,gitem2,sgitem11
    bck2,gitem2,sgitem12
    bck2,gitem2,sgitem13

    i require quey which should give result like below
    (note:i wanted to show in a gridview like below)

    bck1 gitem1 sgitem1
    sgitem2
    sgitem3
    bck2 gitem2 sgitem11
    sgitem12
    sgitem13

    can any one give me some idea.how i can make it possible.

    from
    lokesh

    Reply
  • DECLARE @tmp TABLE(SrNo INT)

    INSERT INTO @tmp VALUES(3)
    INSERT INTO @tmp VALUES(3)
    INSERT INTO @tmp VALUES(4)
    INSERT INTO @tmp VALUES(5)
    INSERT INTO @tmp VALUES(5)
    INSERT INTO @tmp VALUES(5)
    INSERT INTO @tmp VALUES(9)
    INSERT INTO @tmp VALUES(8)
    INSERT INTO @tmp VALUES(9)
    INSERT INTO @tmp VALUES(9)
    INSERT INTO @tmp VALUES(9)
    INSERT INTO @tmp VALUES(9)

    SELECT SrNo,COUNT(*) AS Total
    FROM @tmp group by SrNo

    Reply
  • Hey Sally,

    U can use above query which gives the result that u expected

    Thanks
    Ram

    Reply
  • Hi All,
    I have a db with fields Part number,supplier number and country name

    I would like to get a table with one unique record for partnumber and the country1 name country 2 name,–etc..
    example this is what I have

    partno suno country
    001 123 US
    001 234 IN
    001 567 AU
    001 897 CA
    002 235 IN
    002 566 SR

    etc…

    I need a table like this

    part num country1 country 2 country 3 country 4
    001 US IN AU CA
    002 IN SR

    etc…

    Maximum suppliers will be usually from 1-10 ( Assumption)

    Could anyone please help me out with the sql query to get this result.

    .we may have use do while loop,but no clear idea

    please hep me out..

    Thanks,
    Santhu

    Reply
    • I think group_concat() function is the best solution to resolve your confusion,

      E.g.

      select
      First_column ,
      dbo.group_concat(Second_column) as “List of all countries”
      from Table_name
      group by First_column;

      Reply
  • I guess something is missing in Article. Distinct creates a temp table and remove duplicates. Group by essentially does the same things, but it has additional overhead of sorting the records. By default group by is ASC sort. so Group by is slower in cases where the key grouping column is not indexed. If the query columns are already in clustered indexed, it may be same operation.
    I would go with query plan and based on instincts will prefer Distinct over Group by

    thanks

    Reply
  • Hi

    why we should not use distinct and group by within the same sql statement or query??? any performance issue can u explain pls????????

    Reply
  • Very nice…

    Reply
  • Hi,

    i want to find distinct top.

    top 3 for all UserId’s.

    for ex:

    the table is as below:

    Id UserId Date
    1 10 10/10/2010
    1 10 10/10/2010
    1 5 10/10/2010
    1 10 10/10/2010
    1 5 10/10/2010
    1 7 10/10/2010
    1 10 10/10/2010
    1 5 10/10/2010
    1 10 10/10/2010
    1 5 10/10/2010

    In the above table userid 10, 5 & 7 has 5, 4 & 1 rows respectively.

    Now i expect 3 or below 3 from each user.

    For Ex:

    Id UserId Date
    1 10 10/10/2010
    1 10 10/10/2010
    1 10 10/10/2010
    1 5 10/10/2010
    1 5 10/10/2010
    1 5 10/10/2010
    1 7 10/10/2010

    how to solve this.

    regards
    rajendran

    Reply
  • Vicky Raj Sharma
    March 29, 2011 1:11 pm

    SELECT ROW_NUMBER() OVER (ORDER BY inserted_date DESC)
    AS Row, distinct product_type, parent_id,product_id,brand_id,device_id,p.subcategory_id,upc_code,sku,product_name,p.description,price,product_image,
    isDevice,dimension_length,dimension_length_unit,dimension_width,dimension_width_unit,
    dimension_height,dimension_height_unit,dimension_weight,dimension_weight_unit from tblProductDetails p join tblSubCategory s
    on p.subcategory_id=s.subcategory_id
    where parent_id=@cat_id and p.subcategory_id=@sub_id

    My distinct keyword is having error.

    Reply
    • You cannot use distinct that way. What do you want to get? Post some sample data with expected result

      Reply
  • pradeep Tiwari
    April 26, 2011 1:08 pm

    I guess something is missing in Article. Distinct creates a temp table and remove duplicates. Group by essentially does the same things, but it has additional overhead of sorting the records. By default group by is ASC sort. so Group by is slower in cases where the key grouping column is not indexed. If the query columns are already in clustered indexed, it may be same operation.
    I would go with query plan and based on instincts will prefer Distinct over Group by

    Reply
  • this is my query and

    select company_name ,department_name from Department_Master where Status = ‘N’ order by company_name, group by company_name

    It shows me this kind of the error

    Msg 156, Level 15, State 1, Line 1
    Incorrect syntax near the keyword ‘group’.

    What should i do to use the group by clouse in sql server 2008

    Thanks in advance

    Reply
    • It should be

      select company_name ,department_name from Department_Master where Status = ‘N’ group by company_name, order by company_name

      Reply

Leave a Reply

Menu