Interview Question of the Week #020 – What is the Difference Between DISTINCT and GROUP BY?

Question: What is the Difference Between DISTINCT and GROUP BY?

Answer: A DISTINCT and GROUP BY usually generate the same query plan, so performance should be the same across both query constructs. GROUP BY should be used to apply aggregate operators to each group. If all you need is to remove duplicates, then use DISTINCT. If you are using sub-queries execution plan for that query varies so in that case you need to check the execution plan before making decision of which is faster.

USE AdventureWorks2014
GO
-- Example of DISTINCT:
SELECT DISTINCT JobTitle, Gender
FROM [HumanResources].[Employee]
GO


Interview Question of the Week #020 - What is the Difference Between DISTINCT and GROUP BY? interview-20-1

USE AdventureWorks2014
GO
-- Example of GROUP BY:
SELECT JobTitle, Gender
FROM [HumanResources].[Employee]
GROUP BY JobTitle, Gender
GO


Interview Question of the Week #020 - What is the Difference Between DISTINCT and GROUP BY? interview-20-2

USE AdventureWorks2014
GO
-- Example of GROUP BY with aggregate function:
SELECT JobTitle, Gender, COUNT(*) EmployeeCount
FROM [HumanResources].[Employee]
GROUP BY JobTitle, Gender
GO

Interview Question of the Week #020 - What is the Difference Between DISTINCT and GROUP BY? interview-20-3

Reference: Pinal Dave (https://blog.sqlauthority.com)

SQL Scripts
Previous Post
SQL SERVER – Service Pack Error – The User Data directory in the registry is not valid. Verify DefaultData key under the instance hive points to a valid directory
Next Post
SQL Authority News – SQL Server 2014 Service Pack 1 Available for Download

Related Posts

6 Comments. Leave new

  • Alexander Nana Owusu
    May 17, 2015 12:22 pm

    Thanks Pinal, really simple to understand!

    Reply
  • Manish Sharma
    May 19, 2015 1:15 pm

    Also we can apply HAVING clause with GROUP BY to filter the aggregate results

    Reply
  • I always warn other developers not to use DISTINCT as a Band-Aid. If your query is returning seven copies of every record, don’t just slap a DISTINCT on there to make them go away. There’s a mistake in your query – find it and fix it. Most likely there’s a one-to-many relationship you didn’t take into consideration and it’s inflating your result set. It could be throwing off something else as well.

    Reply
  • I always warn other developers not to use DISTINCT as a Band-Aid. If your query is returning seven copies of every record, don’t just slap a DISTINCT on there to make them go away. There’s a mistake in your query – find it and fix it. Most likely there’s a one-to-many relationship you didn’t take into consideration and it’s inflating your result set. It could be throwing off something else as well.

    Reply

Leave a Reply