SQL SERVER – DISTINCT Keyword Usage and Common Discussion

Jr. DBA asked me a day ago, how to apply DISTINCT keyword to only first column of SELECT. When asked for additional information about question, he showed me following query.

SELECT Roles, FirstName, LastName
FROM UserNames

He wanted to apply DISTINCT to only Roles and not across FirstName and LastName. When he finished I realize that it is not possible and there is logical error in thinking query like that. I helped him with what he needed however, after he left I realize that answer to his original question was “NO”.

Distinct can not be applied to only few columns it is always applied to whole column. In following exam DISTINCT is applied across Roles, FirstName and LastName.

SELECT DISTINCT
Roles, FirstName, LastName
FROM UserNames

Same result as above query can be achieved by using GROUP BY as well.

SELECT
Roles, FirstName, LastName
FROM UserNames
GROUP BY Roles, FirstName, LastName

Before I end discussion I would like to add that DISTINCT can be used with aggregated functions and can be applied on only one field in that case. Following example demonstrates that scenario.

SELECT MAX
(DISTNCT ID), Roles, FirstName, LastName
FROM UserNames, LastName

This subject is very interesting and I decided to search online if any other writers has explained this in depth. I found following good article written by Jeff Smith. He has explained this example in depth so I have skipped explaining similar example in depth.

Reference : Pinal Dave (https://blog.sqlauthority.com)

SQL Scripts
Previous Post
SQL SERVER – Cumulative Update Package 5 for SQL Server 2005 Service Pack 2
Next Post
SQL SERVER – Enabling Clustered and Non-Clustered Indexes – Interesting Fact

Related Posts

29 Comments. Leave new

  • In the following sentence you made a mistake: “Distinct can not be applied to only few columns it is always applied to whole column.” You must have meant the “whole row”, not column. And while at it, you may want to add a punctuation sign in front of “it is…”. :)

    Reply
  • my concept is not clear after reading your discussion

    Reply
  • It seems to me that M$ SQL Server cannot “GROUP BY” one column of a result set, nor can it select only “DISTINCT” values of one column of the result set.

    MySQL can easily do this with the following code:
    “SELECT DISTINCT Roles, FirstName, LastName FROM UserNames”

    Reply
    • Which FirstName, LastName does it pick for the result, if there are mulitiple rows with same Roles but different names?

      Reply
  • h i
    using stored produce i want to remove the duplicate value in the single colums

    Reply
  • Hi,

    I have a similar problem but cant fix it can you please help me, problem is:

    I have a table with following columns: basketid, lookup_value, date_time.

    I only want to return distinct basketid and lookup value rows and also want to include date_time row as well. but it is not distinct. any solutions please. Group by and distinct does not solve my problem.

    Thank you

    Reply
  • Imran Mohammed
    April 8, 2009 10:02 am

    @Naeem

    One way to do that is use GROUP BY, But there is a serious problem with GROUP BY, which I hate saying but true, You have to include all the columns in GROUP BY clause what you have mentioned in SELECT clause.

    Anyways your best choice would be to use OVER PARTITION BY in your select query, this is a new feature in sql server 2005.

    OVER PARTITION BY will serves your purpose. This is just like GROUP BY but here you have freedom to exclude columns that you do not want to include in GROUP BY clause.

    GROUP BY is nothing but grouping or distinct (for that match of columns) in a way.

    Find simple example of OVER PARTITION BY here,

    ~ IM.

    Reply
  • I’m surprised at your statement about ” logical error in thinking query like that… ”

    I have a situation where I’m asked for a sample of 100 records from a 10,000 record table. The table has the following columns ( ID, PayerID, Category, Status, CreateDate )

    In my Sample file I need to have a 100 records with all the columns but each record should have a DISTINCT PayerID.

    Explain to me please on how would you solve this?

    Thanx

    Kinara

    Reply
  • Imran Mohammed
    May 19, 2009 9:34 am

    @Kinara,

    Need more details.

    Do you have duplicate records for the same PayerID with different combinations of ID< Category, Status and CreateDate. If there exusts multiple records for same PayerID, I dont think, you can achieve your tasks by a simple select statement.

    Please provide some sample data.

    Input and what kind of output you are expecting.

    ~ IM.

    Reply
  • How to select only one category of dupliacte records and delete that excluding one …
    for example consider in my DB table i have 200 records
    in that i have more that one person duplicate records, now
    i want to delete one user duplicate records.
    that is if person one has entered his data 5 times with same details, i have to check that and have to delete 4 record of that person and keep one record… i may have more than one person records in that way in same table

    Reply
  • and one more thing here is in this table their is no autonumber or primary key,

    i have to select only one person record and delet his duplicate entries … that person records at a time

    Reply
  • Hi Shaf,

    If you are using SQL 2005 and above, you can use following:

    ;with cte as(
    SELECT Row_NUMBER() OVER(PARTITION BY Person ORDER BY Person) AS RowID,
    *
    from table
    )
    select *
    from cte
    where RowID > 1

    It will list out all records that you need to delete. If you are sure that you can delete this. use this query:

    ;with cte as(
    SELECT Row_NUMBER() OVER(PARTITION BY Person ORDER BY Person) AS RowID,
    *
    from table
    )
    DELETE
    from cte
    where RowID > 1

    For more information:

    Thanks,
    Tejas

    Reply
  • Edolmen Sviluppo Siti Web
    June 1, 2010 2:48 pm

    I’m actually working on a sample of 100 records from a 10,000 record table. The table has the following columns ( ID, PayerID, Category, Status, CreateDate )

    In my file I need to have a 100 records with all the columns but each record should have a DISTINCT PayerID.
    How would I do that ?

    Reply
  • Hi,

    I would like to know wether the distinct can be applied to each column in a table or not… if so can you please
    explain..

    e.g.

    I have a table containing a firstname, lastname, salary, designation, dob, doj,

    —-
    I would like to apply distinct keyword for all the column in a table.

    Thanks in Advance,
    Prithviraj MK

    Reply
  • I have

    select distinct a,b,c,database from
    (
    select a,b,c,’A’ as database from tableA
    union
    select a,b,c,’B’ as database from tableB
    ) as Everything

    with tableA and tableB having lots of data in common, and I expect this query to give all the rows with different value of a.

    Where’s the logical error? Is there any way to have this simple query done in SQLServer?

    Reply
  • Shirish Kulkarni
    August 5, 2011 2:11 pm

    As distinct clause affects performance of the application. What is the better option to DISTINCT clause. In T- SQL only.

    Reply
    • Can you post the code you have used? If you used group by clause, you dont need a distinct clause

      Reply
  • Christof Coetzee
    November 18, 2011 4:41 pm

    Hawing worked extensively with MySQL and only about 2 years on Sql Server I have encountered a lot of headaches with Sql Server’s group-by, distinct and in general to get UNIQUE ROWS on joins.
    The fact that you have to include all columns in a group-by is just silly and I think Sql Server is perhaps not as intelligent as MySql in its query plans.

    In Sql Server the way to get recordsets for pagination is also unnecessarily complex TOP X and order-by, OVER PARTITION etc is ridiculous, in MySQL its as simple as LIMIT offset count.

    Reply
  • shal i use..Distinct(Columname) like this

    Reply
  • I need result by using DISTINCT for one column only, not whole row then how to proceed.

    Reply
    • You need to either exclude other columns, or use aggregate functions like min or mx on other columns

      Reply
  • If GROUP BY can do same work as DISTINCT then why DISTINCT is needed.

    Reply
    • In some cases they do the same thing and not all the cases. They exists for different purposes

      Reply
  • Abhishek Majumder
    September 29, 2012 8:47 pm

    I have 4 result looks like
    ram 10 1001
    ram 10 1002
    shyam 30 5431
    shyam 30 5567

    now i have to select only 2nd and 4th row
    how to do it?

    Reply

Leave a Reply