Question: What is the Difference Between DISTINCT and GROUP BY?
Answer: A DISTINCT and GROUP BY usually generate the same query plan, so performance should be the same across both query constructs. GROUP BY should be used to apply aggregate operators to each group. If all you need is to remove duplicates, then use DISTINCT. If you are using sub-queries execution plan for that query varies so in that case you need to check the execution plan before making decision of which is faster.
USE AdventureWorks2014 GO -- Example of DISTINCT: SELECT DISTINCT JobTitle, Gender FROM [HumanResources].[Employee] GO
USE AdventureWorks2014 GO -- Example of GROUP BY: SELECT JobTitle, Gender FROM [HumanResources].[Employee] GROUP BY JobTitle, Gender GO
USE AdventureWorks2014 GO -- Example of GROUP BY with aggregate function: SELECT JobTitle, Gender, COUNT(*) EmployeeCount FROM [HumanResources].[Employee] GROUP BY JobTitle, Gender GO
Reference: Pinal Dave (https://blog.sqlauthority.com)
6 Comments. Leave new
Thanks Pinal, really simple to understand!
Great. Thanks for the comment Alexander.
Also we can apply HAVING clause with GROUP BY to filter the aggregate results
Correct Manish.
I always warn other developers not to use DISTINCT as a Band-Aid. If your query is returning seven copies of every record, don’t just slap a DISTINCT on there to make them go away. There’s a mistake in your query – find it and fix it. Most likely there’s a one-to-many relationship you didn’t take into consideration and it’s inflating your result set. It could be throwing off something else as well.
I always warn other developers not to use DISTINCT as a Band-Aid. If your query is returning seven copies of every record, don’t just slap a DISTINCT on there to make them go away. There’s a mistake in your query – find it and fix it. Most likely there’s a one-to-many relationship you didn’t take into consideration and it’s inflating your result set. It could be throwing off something else as well.