SQL SERVER – Query to Get Unique Distinct Data Based on Condition – Eliminate Duplicate Data from Resultset

Seems like in T-SQL world the issue with Duplicate Records never an old topic. Today lets  quickly go over another question which made it to my mailbox 3 times this week.

Question: How do I display only unique records from my table?

The natural reaction will be to suggest DISTINCT or GROUP BY. However, not all the questions can be solved by DISTINCT or GROUP BY. Let us see the following example, where a user wanted only latest records to be displayed. Let us see the example to understand further.

What the user wanted was not to display every duplicate records but wanted to display the record which had highest Price value. I like this kind of simple puzzle. This kind of puzzles has multiple solutions and I like to work on them. However, in this particular case, I was able to quickly come to resolve as I used ROW_NUMBER function in the solution.

Create a sample table.

USE TempDB
GO
CREATE TABLE Table1 (ID INT, Product VARCHAR(100), Price INT, Color VARCHAR(100))
GO
INSERT INTO Table1
SELECT 1, 'Toy', 100, 'Black'
UNION ALL
SELECT 2, 'Pen', 100, 'Black'
UNION ALL
SELECT 3, 'Pencil', 100, 'Blue'
UNION ALL
SELECT 4, 'Pencil', 100, 'Red'
UNION ALL
SELECT 5, 'Pencil', 200, 'Yellow'
UNION ALL
SELECT 6, 'Cup', 300, 'Orange'
UNION ALL
SELECT 7, 'Cup', 400, 'Brown'
GO
SELECT Product, Price, Color
FROM Table1
GO

Select the data without duplicate records as well select the data with maximum price.

SELECT Product, Price, Color
FROM Table1
GO
SELECT Product, Price, Color
FROM
(SELECT Table1.*,
ROW_NUMBER() OVER (PARTITION BY Product ORDER BY Price DESC) AS RN
FROM Table1) AS t
WHERE RN = 1
ORDER BY ID
GO

Clean up act.

DROP TABLE Table1
GO

Please feel free to share your solution as a comment. I will share the solutions on blog with due credit.

Reference: Pinal Dave (http://blog.sqlauthority.com)

About these ads

34 thoughts on “SQL SERVER – Query to Get Unique Distinct Data Based on Condition – Eliminate Duplicate Data from Resultset

  1. Couple of dirty solutions. Several assumptions were made based on sample data

    –1
    SELECT * FROM Table1 O
    WHERE NOT EXISTS(SELECT 1 FROM Table1 I WHERE I.Product = O.Product AND I.Price > O.Price)

    –2
    SELECT O.* FROM Table1 O
    INNER JOIN
    (SELECT I.Product, MAX(Price) As Price FROM Table1 I
    GROUP BY I.Product)I
    ON I.Product = O.Product AND I.Price = O.Price

    –3

    SELECT * FROM Table1 O
    WHERE Price = (SELECT MAX(Price) FROM Table1 I WHERE I.Product = O.Product)

    Regards
    Roji Thomas
    http://sqlindian.com

  2. SELECT DISTINCT Product,
    (SELECT TOP (1) Price FROM Table1
    WHERE (t1.Product = Product) ORDER BY Price DESC) AS Price,
    (SELECT TOP (1) Color FROM Table1
    WHERE (t1.Product = Product) ORDER BY Price DESC) AS Color
    FROM Table1 AS t1

  3. SELECT table2.Product, table2.price, MAX(Table1.Color) AS Color
    FROM
    (SELECT Product, MAX(Price) AS price
    FROM Table1 AS t
    GROUP BY Product) AS table2
    INNER JOIN
    Table1 ON Table1.Product = table2.Product AND Table1.Price = table2.price
    GROUP BY table2.Product, table2.price

  4. Minor Changes to Original Solution:

    This will bring highest priced latest record.

    SELECT Product, Price, Color
    FROM
    (SELECT Table1.*,
    RANK() OVER (PARTITION BY Product ORDER BY Price DESC,ID DESC) AS RN
    FROM Table1) AS t
    WHERE RN = 1
    ORDER BY ID

  5. WITH Test(Product,Price) AS
    ( SELECT Product,MAX(Price)
    FROM Table1
    GROUP BY Product
    )SELECT A.*,B.Color FROM Test A JOIN Table1 B ON A.Product =B.Product AND A.Price =B.Price

  6. For Better performance and readability

    with cte(RN,Product,Price,color)
    as
    (
    SELECT
    RN=row_number() over (PARTITION BY Product order by price desc ) ,
    Product, Price, Color
    FROM Table1
    )
    select * from cte where RN=1

    In case we are into older version of sql and to make compatible with other RDBMS sysstems

    select T.Product,T.Price,T.color
    from
    Table1 T inner join (select Product,max(Price) as Price from Table1 group by Product) VT
    on t.Product=VT.Product and t.Price=VT.Price

    Thanks
    Nelson
    [Web Addressed moved to URL field]

  7. SELECT p.Product, p.Price, p.Color
    FROM Table1 p
    INNER JOIN (
    SELECT product, MAX(price) as Price
    FROM Table1
    GROUP by Product ) q on p.product = q.Product and p.price = q.Price
    ORDER BY ID

    This is the same as other posts, but may not always yield the same results as the original solution. If you add another row to the original table, you would up with duplicate products if the max price of a product is available in more than one color.

    INSERT INTO Table1
    SELECT 8, ‘Pencil’, 200, ‘Brown’

    Results
    Product Price Color
    Toy 100 Black
    Pen 100 Black
    Pencil 200 Yellow
    Cup 400 Brown
    Pencil 200 Brown

    The original solution returns the same 4 rows as it did before.

  8. Select distinct B.product , B.Price, B.Color
    from table_1 B
    join
    (Select product, max(price) max_price from table_1 group by product) A
    on A.product = B.product and A.max_price = B.Price

  9. select Product,max(Price) from table1 group by Product

    when i run this query it also returns the same result how can i know where to use and what the difference between ur query and this query ………………….

    • Dear Venkatesh,

      We need color also… If our requirement is to fetch The Product and Maximum Price only mean Your Query is Enough . But our requirement is to fetch the color also, So only we need such type of Different Query .

      According to the requirement the query may vary .

  10. SELECT Product, Price, Color
    FROM Table_test b
    where exists(Select 1 from
    (
    Select product, MAX(price)price
    from Table_test
    group by product)a where a.Product = b.Product and a.price = b.Price)

  11. – Query Cost: 36%
    ;WITH CTE AS (
    SELECT Product, MAX(Price) Price
    FROM TABLE1
    GROUP BY Product
    )

    SELECT CTE.Product, CTE.Price, Table1.Color
    FROM CTE
    JOIN Table1 ON CTE.Product = Table1.Product AND CTE.Price = Table1.Price

    GO

    – Query Cost: 64%
    SELECT Product, Price, Color
    FROM
    (SELECT Table1.*,
    ROW_NUMBER() OVER (PARTITION BY Product ORDER BY Price DESC) AS RN
    FROM Table1) AS t
    WHERE RN = 1
    ORDER BY ID

  12. To return a de-duped value to an outer table

    Select outer.column, inner.column
    From dbo.outer
    cross apply (
    Select top 1 column
    From table
    Where outer.id = table.id
    Order by sortcolumn
    ) inner

  13. In MySQL, this is very strait-forward:

    select * from t
    (select * from Table1 order by product, price DESC) as t
    group by product

    it will preserve the order of the sub-query, and take the first record when grouping. Seems like MSSQL is aways more complicated than necessary.

  14. thanks for updating me with this. Now suppose if I want to display the Product and Color, where all colors corresponding to the Product are dispalyed together separated with comma.

    I am attempting to do something like this and I am not finding the prper solution. Pl help me with it…

  15. coolpal9 :

    Again, being a MySQL guy, my first response was group_concat. So, I took that and searched for group_concat equivalent in MS SQL, and found this:

    http://stackoverflow.com/questions/451415/simulating-group-concat-mysql-function-in-ms-sql-server-2005

    Hope that link helps you do what you want to do. Looks to be what you are asking for.

    (also, my post above is incorrect – it should be just SELECT * FROM ( … – without the ‘t’ after the from )

  16. hello, sir
    I am using row_number 2 times in my sp so, it is right or not ? I want to display records as per distinct url and i have already used row_number for paging in my stored procedure. so, how can i display the records with unique url. please help asap….
    thanks

  17. Pingback: SQL SERVER – Weekly Series – Memory Lane – #039 | Journey to SQL Authority with Pinal Dave

  18. WITH Dup_LatLong AS
    (
    SELECT row_number() OVER (PARTITION BY
    LATITUDE, LONGITUDE ORDER BY UniqueValue) AS RowNumber, UniqueValue, convert(Varchar(50),LATITUDE)+convert(Varchar(50),LONGITUDE) Check_Cond
    FROM YourTable)
    Select UniqueValue,LATITUDE, LONGITUDE, convert(Varchar(50),LATITUDE)+convert(Varchar(50),LONGITUDE)
    from YourTable
    where convert(Varchar(50),LATITUDE)+convert(Varchar(50),LONGITUDE) in (Select Check_Cond from Dup_LatLong where RowNumber > 1)
    order by 7

  19. in above example i want result like

    Product Price Color
    1 Toy 100 Black
    2 Pen 100 Black
    3 Pencil 100 Blue, Red,Yellow
    4 Cup 300 Orange, Brown

  20. Sir I have a problem,I have two tables,but I am not able to retrive the unique records from them.Its a hospital mgt application.I am using FULL OUTER JOIN grouping by datename(month,v.date)…..but output is wrong….kindly suggest me…plz.Thanks in Advance.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s