Question: What is the difference between DISTINCT and GROUP BY?
Answer:
A DISTINCT and GROUP BY usually generate the same query plan, so performance should be the same across both query constructs. GROUP BY should be used to apply aggregate operators to each group. If all you need is to remove duplicates, then use DISTINCT. If you are using sub-queries execution plan for that query varies so in that case you need to check the execution plan before making decision of which is faster.
Example of DISTINCT:
SELECT DISTINCT Employee, Rank FROM Employees
Example of GROUP BY:
SELECT Employee, Rank FROM Employees GROUP BY Employee, Rank
Example of GROUP BY with aggregate function:
SELECT Employee, Rank, COUNT(*) EmployeeCount FROM Employees GROUP BY Employee, Rank
Reference: Pinal Dave (https://blog.sqlauthority.com)
4 Comments. Leave new
no .group by is faster! better in the long run. if need to work with subqueries, select distinct kill the CPU. group by already filter out duplicated row.
dba100 – please provide example.
can’t see why I just add a comment and when I submit that.
I am not going to type long again.
select distinct is a row by row operation, just like RBAR operation. with subqueries, it run each subqueries, which is worse already, and after that , drop that as select distinct see some ROW is duplicated.
we just under the same problem and we fix it by using group by.
other worse thing we saw is the function, RAM eating.
when compare with group by, group by do not have the additional SORT operator.
dba100 – Thanks for sharing your scenario.