Seems like in T-SQL world the issue with Duplicate Records never an old topic. Today lets quickly go over another question which made it to my mailbox 3 times this week.
Question: How do I display only unique records from my table?
The natural reaction will be to suggest DISTINCT or GROUP BY. However, not all the questions can be solved by DISTINCT or GROUP BY. Let us see the following example, where a user wanted only latest records to be displayed. Let us see the example to understand further.
What the user wanted was not to display every duplicate records but wanted to display the record which had highest Price value. I like this kind of simple puzzle. This kind of puzzles has multiple solutions and I like to work on them. However, in this particular case, I was able to quickly come to resolve as I used ROW_NUMBER function in the solution.
Create a sample table.
USE TempDB
GO
CREATE TABLE Table1 (ID INT, Product VARCHAR(100), Price INT, Color VARCHAR(100))
GO
INSERT INTO Table1
SELECT 1, 'Toy', 100, 'Black'
UNION ALL
SELECT 2, 'Pen', 100, 'Black'
UNION ALL
SELECT 3, 'Pencil', 100, 'Blue'
UNION ALL
SELECT 4, 'Pencil', 100, 'Red'
UNION ALL
SELECT 5, 'Pencil', 200, 'Yellow'
UNION ALL
SELECT 6, 'Cup', 300, 'Orange'
UNION ALL
SELECT 7, 'Cup', 400, 'Brown'
GO
SELECT Product, Price, Color
FROM Table1
GO
Select the data without duplicate records as well select the data with maximum price.
SELECT Product, Price, Color
FROM Table1
GO
SELECT Product, Price, Color
FROM
(SELECT Table1.*,
ROW_NUMBER() OVER (PARTITION BY Product ORDER BY Price DESC) AS RN
FROM Table1) AS t
WHERE RN = 1
ORDER BY ID
GO
Clean up act.
DROP TABLE Table1
GO
Please feel free to share your solution as a comment. I will share the solutions on blog with due credit.
Reference: Pinal Dave (https://blog.sqlauthority.com)
37 Comments. Leave new
Couple of dirty solutions. Several assumptions were made based on sample data
–1
SELECT * FROM Table1 O
WHERE NOT EXISTS(SELECT 1 FROM Table1 I WHERE I.Product = O.Product AND I.Price > O.Price)
–2
SELECT O.* FROM Table1 O
INNER JOIN
(SELECT I.Product, MAX(Price) As Price FROM Table1 I
GROUP BY I.Product)I
ON I.Product = O.Product AND I.Price = O.Price
–3
SELECT * FROM Table1 O
WHERE Price = (SELECT MAX(Price) FROM Table1 I WHERE I.Product = O.Product)
Regards
Roji Thomas
Excellent
SELECT DISTINCT Product,
(SELECT TOP (1) Price FROM Table1
WHERE (t1.Product = Product) ORDER BY Price DESC) AS Price,
(SELECT TOP (1) Color FROM Table1
WHERE (t1.Product = Product) ORDER BY Price DESC) AS Color
FROM Table1 AS t1
SELECT table2.Product, table2.price, MAX(Table1.Color) AS Color
FROM
(SELECT Product, MAX(Price) AS price
FROM Table1 AS t
GROUP BY Product) AS table2
INNER JOIN
Table1 ON Table1.Product = table2.Product AND Table1.Price = table2.price
GROUP BY table2.Product, table2.price
Minor Changes to Original Solution:
This will bring highest priced latest record.
SELECT Product, Price, Color
FROM
(SELECT Table1.*,
RANK() OVER (PARTITION BY Product ORDER BY Price DESC,ID DESC) AS RN
FROM Table1) AS t
WHERE RN = 1
ORDER BY ID
Just a observation: Spelling mistake in the post title
Eleminate — It should be Eliminate :)
WITH Test(Product,Price) AS
( SELECT Product,MAX(Price)
FROM Table1
GROUP BY Product
)SELECT A.*,B.Color FROM Test A JOIN Table1 B ON A.Product =B.Product AND A.Price =B.Price
For Better performance and readability
with cte(RN,Product,Price,color)
as
(
SELECT
RN=row_number() over (PARTITION BY Product order by price desc ) ,
Product, Price, Color
FROM Table1
)
select * from cte where RN=1
In case we are into older version of sql and to make compatible with other RDBMS sysstems
select T.Product,T.Price,T.color
from
Table1 T inner join (select Product,max(Price) as Price from Table1 group by Product) VT
on t.Product=VT.Product and t.Price=VT.Price
Thanks
Nelson
[Web Addressed moved to URL field]
SELECT p.Product, p.Price, p.Color
FROM Table1 p
INNER JOIN (
SELECT product, MAX(price) as Price
FROM Table1
GROUP by Product ) q on p.product = q.Product and p.price = q.Price
ORDER BY ID
This is the same as other posts, but may not always yield the same results as the original solution. If you add another row to the original table, you would up with duplicate products if the max price of a product is available in more than one color.
INSERT INTO Table1
SELECT 8, ‘Pencil’, 200, ‘Brown’
Results
Product Price Color
Toy 100 Black
Pen 100 Black
Pencil 200 Yellow
Cup 400 Brown
Pencil 200 Brown
The original solution returns the same 4 rows as it did before.
Select distinct B.product , B.Price, B.Color
from table_1 B
join
(Select product, max(price) max_price from table_1 group by product) A
on A.product = B.product and A.max_price = B.Price
select Product,max(Price) from table1 group by Product
when i run this query it also returns the same result how can i know where to use and what the difference between ur query and this query ………………….
Dear Venkatesh,
We need color also… If our requirement is to fetch The Product and Maximum Price only mean Your Query is Enough . But our requirement is to fetch the color also, So only we need such type of Different Query .
According to the requirement the query may vary .
SELECT Product, Price, Color
FROM Table_test b
where exists(Select 1 from
(
Select product, MAX(price)price
from Table_test
group by product)a where a.Product = b.Product and a.price = b.Price)
— Query Cost: 36%
;WITH CTE AS (
SELECT Product, MAX(Price) Price
FROM TABLE1
GROUP BY Product
)
SELECT CTE.Product, CTE.Price, Table1.Color
FROM CTE
JOIN Table1 ON CTE.Product = Table1.Product AND CTE.Price = Table1.Price
GO
— Query Cost: 64%
SELECT Product, Price, Color
FROM
(SELECT Table1.*,
ROW_NUMBER() OVER (PARTITION BY Product ORDER BY Price DESC) AS RN
FROM Table1) AS t
WHERE RN = 1
ORDER BY ID
with NQ as
(SELECT Product,Price,Color ,ROW_NUMBER() Over(Partition by Product Order by Price desc ) as RowNum
FROM [TempProduct])
SELECT Product,Price,Color from NQ
where RowNum = 1
To return a de-duped value to an outer table
Select outer.column, inner.column
From dbo.outer
cross apply (
Select top 1 column
From table
Where outer.id = table.id
Order by sortcolumn
) inner
In MySQL, this is very strait-forward:
select * from t
(select * from Table1 order by product, price DESC) as t
group by product
it will preserve the order of the sub-query, and take the first record when grouping. Seems like MSSQL is aways more complicated than necessary.
thanks for updating me with this. Now suppose if I want to display the Product and Color, where all colors corresponding to the Product are dispalyed together separated with comma.
I am attempting to do something like this and I am not finding the prper solution. Pl help me with it…
…and so the question remains – which is fastest / best? :)
coolpal9 :
Again, being a MySQL guy, my first response was group_concat. So, I took that and searched for group_concat equivalent in MS SQL, and found this:
Hope that link helps you do what you want to do. Looks to be what you are asking for.
(also, my post above is incorrect – it should be just SELECT * FROM ( … – without the ‘t’ after the from )
It helped me.
Thank u so much……
you reduced my R & D work.
hello, sir
I am using row_number 2 times in my sp so, it is right or not ? I want to display records as per distinct url and i have already used row_number for paging in my stored procedure. so, how can i display the records with unique url. please help asap….
thanks