Interview Question of the Week #034 – What is the Difference Between Distinct and Group By

Question: What is the difference between DISTINCT and GROUP BY?

Answer:

A DISTINCT and GROUP BY usually generate the same query plan, so performance should be the same across both query constructs. GROUP BY should be used to apply aggregate operators to each group. If all you need is to remove duplicates, then use DISTINCT. If you are using sub-queries execution plan for that query varies so in that case you need to check the execution plan before making decision of which is faster.

Example of DISTINCT:

SELECT DISTINCT Employee, Rank
FROM Employees

Example of GROUP BY:

SELECT Employee, Rank
FROM Employees
GROUP BY Employee, Rank

Example of GROUP BY with aggregate function:

SELECT Employee, Rank, COUNT(*) EmployeeCount
FROM Employees
GROUP BY Employee, Rank

Reference: Pinal Dave (https://blog.sqlauthority.com)

Previous Post
SQL SERVER – Remove All Characters From a String Using T-SQL
Next Post
SQL SERVER – SPID is KILLED/ROLLBACK state. What to Do Next?

Related Posts

No results found.

4 Comments. Leave new

  • no .group by is faster! better in the long run. if need to work with subqueries, select distinct kill the CPU. group by already filter out duplicated row.

    Reply
  • can’t see why I just add a comment and when I submit that.

    I am not going to type long again.

    select distinct is a row by row operation, just like RBAR operation. with subqueries, it run each subqueries, which is worse already, and after that , drop that as select distinct see some ROW is duplicated.

    we just under the same problem and we fix it by using group by.

    other worse thing we saw is the function, RAM eating.

    when compare with group by, group by do not have the additional SORT operator.

    Reply

Leave a Reply

Menu