Following code is useful to delete duplicate records. The table must have identity column, which will be used to identify the duplicate records. Table in example is has ID as Identity Column and Columns which have duplicate data are DuplicateColumn1, DuplicateColumn2 and DuplicateColumn3.
DELETE
FROM MyTable
WHERE ID NOT IN
(
SELECT MAX(ID)
FROM MyTable
GROUP BY DuplicateColumn1, DuplicateColumn2, DuplicateColumn2)
Reference : Pinal Dave (http://blog.SQLAuthority.com)




if there is no key in the table then what ?
I have on table named [Duplicate] {ID int,FNAME varchar(10),MNAME varchar(10)}
Here there is no key and here are duplicate rows. so hoca can i delete this duplicate rows.
Check Data
ID FNAME LNAME
1 AAA CCC
2 BBB DDD
1 AAA CCC
2 BBB DDD
1 AAA CCC
2 BBB DDD
3 BCB DGD
Remove duplicate rows and keep the data in to the table like this using single query.
ID FNAME LNAME
1 AAA CCC
2 BBB DDD
3 BCB DGD
PLease if possible help because i faced this question in many interviews.
Reply me
Good Question. I have been asking this question in interview many times to new candidates. Answer is:
1) Add Identity Col and perform the operation displayed in this blog and drop Identity Col.
2) User Cursor and save your values in temp table. If you receive the same row which you put in temp table delete it.
3) You can use while loop as well do the same as cursor.
Pinal Dave
(http://www.sqlauthority.com)
Wouldn’t it be a little more simple to use Group By or Distinct to get the unique records in that situation?
SELECT ID, FNAME, LNAME
FROM Duplicate
GROUP BY ID, FNAME, LNAME
would give you the unique rows of Duplicate
Rough Steps of one possibility to fix the table:
1) Save the results of the group by query to a temp table
2) Delete rows from the original table
3) Fill original table with rows of temp table
That you way you don’t have to use a cursor or change the design of table Duplicate. However, in the real world, you’d want to add a unique constraint to the ID column of table Duplicate to make sure it doesn’t get any more duplicates by ID.
David,
Thanks! That is good idea.
Regards,
Pinal Dave
(http://www.SQLAuthority.com)
Whoops, I almost forgot…
Thanks for the post about deleting duplicates! Your code was easy find with a Google search. It’s exactly what I was needing. It’s simple and elegant! I sometimes find it difficult to think of how to do things like that, so it was nice to find your post about it!
While searching for this solution I found many others which use cursors and complicated SQL statements but in the end they do the same thing as your simple & powerful solution. Thank you for posting it online!
My pleasure.
Kind Regards,
Pinal Dave
(http://www.SQLAuthority.com)
Seriously thanks,
Keep plopping my head why i didnt come up with simplicity.
– note
Plopping is a merely profound and uncatastrophic way of activating the tiny brain members that are usually on vacation.
– note 2
nevermind the note
Pinal,
Once again you are the man! I looked at other sites to answer this question and I found this SQL statement to be the best and easiest. Now I know to just come straight here.
If you are looking for deleting duplicates of more than 2 records, then the following code can be used…
set nocount on
drop table tbl1
create table tbl1
(
col1 int
)
insert into tbl1 values(1)
insert into tbl1 values(1)
insert into tbl1 values(1)
insert into tbl1 values(2)
insert into tbl1 values(2)
insert into tbl1 values(2)
insert into tbl1 values(2)
insert into tbl1 values(3)
insert into tbl1 values(3)
set rowcount 1
select ’start’
while @@rowcount > 0 delete a from tbl1 a where (select count(*) from tbl1 b where a.col1 = b.col1)>1
set rowcount 0
select * from tbl1
set nocount off
If you are looking for deleting multiple records (duplicate with more than 2 records like 3,4,5 duplicate records, etc) and without key column then the following method can be adopted.
Lets say your table is tab1.
SELECT DISTINCT * INTO tab2 FROM tab1 — distinct of all columns
DROP TABLE tab1
EXEC sp_rename ‘tab2′,’tab1′
Note:
Take a back up of your original table before using dropping for your reference. You may need to grant permission of tab2 as given for tab1
Simple is good! You Rock!
(I found so many answers elsewhere and it’s incredible how 6 lines of code and one push of a button can do with your code)
Ravi,
your solution seems to be excelent but what about triggers and indexes defined on that table tab1.
will they remain intact??
Please repsond ….
Amit
Amit,
Answer to your question is that will not remain intact. You may have to recreate all of them as well as all the constraints.
It is recommended to use the method explained in the original post to delete duplicate records.
Regards,
Pinal Dave(http://www.SQLAuthority.com)
Well to answer Rahul’s question and modify Ravi’s and Pinal Dave query…
Here is a single query which will answer all your questions but this still needs Primary key or identity column to be added to the table though.
DELETE FROM MyTable
WHERE EXISTS (
SELECT * FROM MyTable AS b
WHERE
b.[col1] = MyTable.[col1]
AND b.[col2] = MyTable.[col2]
AND b.[col3] = MyTable.[col3]
GROUP BY
b.[col1], b.[col2], b.[col3]
HAVING
MyTable.[ID] > MIN(b.[ID])
)
If you want most recent records to be present in your DB and delete multiple Old duplicate records, You can make use of
DELETE FROM MyTable
WHERE EXISTS (
SELECT * FROM MyTable AS b
WHERE
b.[col1] = MyTable.[col1]
AND b.[col2] = MyTable.[col2]
AND b.[col3] = MyTable.[col3]
GROUP BY
b.[col1], b.[col2], b.[col3]
HAVING
MyTable.[ID] MIN(b.[ID])
)
Hope this answers all your questions.
If you dont have Primary key or Identity Column then you can always create it and run this query and delete the Primary Key or identity column.
“there are always better things to do to make lives easier.”
enjoy your day
Vamshi
IN SQL SERVER 2005, This can be easily achieved without crating unique identifier by using CTE and ROW_NUMBER (), the modified query for sql server 2005 goes here
***********************************************
WITH T1 AS (SELECT ROW_NUMBER ( ) OVER ( PARTITION BY ID, FNAME, LNAME ORDER BY ID ) AS RNUM FROM DUPLICATE )
DELETE FROM T1 WHERE RNUM > 1
***********************************************
To get a detail grasp on ROW_NUMBER () OVER () … Refer MSDN http://msdn2.microsoft.com/en-us/library/ms186734.aspx for.
I have by mistake duplicated all my records.
Through export/import wizard I selected my database (e.g pension) and copied tables to the desination, later I came to know that source and destination where the same.
Therefore, in all the table all rows are duplicated.
Can someone solve the problem.
[...] 11th, 2007 by pinaldave In my previous article SQL SERVER – Delete Duplicate Records – Rows, we have seen how we can delete all the duplicate records in one simple query. In this article we [...]
Thanks Ravi, your information helped. I modified the query in the following manner and it worked fine:
Select distinct * into temp from tab1;
Delete tab1;
Insert tab1 Select * from temp;
drop table temp;
how can we delete duplicate rows if table doesn’t have any identity column and requirement is you don’t have to insert one column as identity and you don’t have to use cursor and you don’t have to use temp table. so query should be a single
Please reply
Dipak,
What in application makes you have this many requirement?
I would do what I have suggested in initial post or add identity column and drop it afterwards.
Regards,
Pinal Dave (SQLAuthority.com)
this question i have asked in one of my interview, and requirement is like delete record should be in single query its ok if you can write subquery, but without use of temp table, without use cursor, and without use of insert identity column. still i am waiting for the answer,
Please read the comments above there are solutions to your question. This blog readers have already answered that.
I’m trying to keep only the recent SaleDate in the following table. How can I do that?
ID SaleDate SaleAmount
40 6/23/2003 242
40 12/28/2001 212
40 6/13/1994 111
41 11/30/2001 233
41 10/25/1996 15
41 4/21/1994 132
42 6/17/2005 2765
42 5/14/1994 147
43 5/9/1994 145
Thanks much
Tayeb below query should work for you
DELETE FROM #tempTab WHERE
EXISTS
(SELECT ID FROM #tempTab AS b WHERE #tempTab.ID = b.ID
GROUP BY b.ID
HAVING #tempTab.SaleDate < MAX(b.SaleDate))
Depak, to answer your question here is a solution in SQL Server 2005
CREATE TABLE #Table1 (col1 int, col2 int)
INSERT INTO #Table1 VALUES (1, 1000)
INSERT INTO #Table1 VALUES (2, 2000)
INSERT INTO #Table1 VALUES (2, 2000)
INSERT INTO #Table1 VALUES (3, 3000)
INSERT INTO #Table1 VALUES (3, 3000)
INSERT INTO #Table1 VALUES (4, 4000)
SELECT * FROM #Table1;
WITH T1 AS (SELECT (ROW_NUMBER() OVER (PARTITION BY col1 ORDER BY col1)) AS RNum FROM #Table1)
DELETE FROM T1 WHERE RNum IN (SELECT a.RNum FROM T1 AS a, T1 AS b WHERE a.RNum > b.RNum GROUP BY a.RNum)
SELECT * FROM #Table1
DROP TABLE #Table1
ROW_NUMBER() will solve your purpose.
Vamshi,
Excellent post. If you prefer, you can write post with little more explanation. I will be happy to post on my blog with your name.
Regards,
Pinal Dave (http://www.SQLAuthority.com)
Hi
I had problem with deleting dupilicate rows. i have used your code. work fine. thanks a lot.
Geetha
Awesome man. I have used you delete duplicate records query. And it works great.
Thanks
Unfortunately, this query doesn’t seem to work when attempting to delete a row that has duplicate id’s but different data.
Can’t believe how hard it is to find something simple that will work.
Why don’e we do a self union to the table.
select * from mytable union select * from mytable,
thsi query returns the exact records, and i guess we can use this data to build a new table or replacing the previous one.
Thoughts on this please
I am trying to design an query which functions like ICETOOL of mainframe it also requires this kind of duplicate deletions…
ICETOOL in my application (sorts, merges and separates) duplicates and uniques from 4 input file into two output files(one containing unique ids and other containing its duplicate ids with different data) and discards the duplicates which occurs more than twice…
i have the 4 files as tables now i need to get two output tables
help will be much appreciated………
Hi
Sir can u please let me how to pass aliase name in Stored
procedure when I am using table as Parameter in A SP
How would your solution work if I had a single table called Customers that looks like this:
ID LastName FirstName
1 Smith John Adam
2 Smith John A.
3 Jones Jane Mary
4 Jones Jane M.
No other fields are in this table. Just what you see. The ID field is unique for each record. In this example, both records for Smith are the same person. The same goes for Jones. I want to view in a table the resulting unique records (i.e. only one record for John Smith, and one record for Jane Jones).
I am a complete newbie and would appreciate guidance.
Thanks!
If you have Key already, you have seen many solutions in this post before, you need to group by on the repeating/columns that can be duplicated and delete them. The answered solution is if you dont have a unique key in the table.
well to add more to that, once you do self union and insert into temp table, you can delete every thing form main table and insert every thing from temp table in to the original/main table. I think this should help..
-Aditya
SELECT id,name,mNumber,sNumber,partOfSpeech,type FROM keyword where id IN (Select max(id) from keyword GROUP BY name)
The above is if you want to just select a column without duplicates in it.
create table T1 (C1 int, C2 varchar(3))
–Fill it with duplicate Records
declare @as Table (c1 int,c2 varchar(3))
insert into @as select Distinct * from T1
Truncate table T1
Alter Table T1 Disable Trigger All
insert into T1 select * from @As
Alter Table T1 Enable Trigger All
Thanks Mr. Dave for the codes.
I’ve searched inside out in the internet but still couldn’t find a solution for my problem.
Your code manage to solved it, your truly a master in SQL. Thanks again
Can you show how to make this a select Into, so that one can save the deleted records in another table
DELETE
FROM MyTable
WHERE ID NOT IN
(SELECT MAX(ID)
FROM MyTable
GROUP BY DuplicatevalueColumn1, DuplicateValueColumn2, DuplicateValueColumn2
I thing this one alos workout
====================
set rowcount 1
delete test
from test a
where(select count(*) from test b where b.age = a.age and b.name = a.name)
> 1
while @@rowcount>0
delete test
from test a
where(select count(*) from test b where b.age = a.age and b.name = a.name)
>1
set rowcount 0
//Here n is maximum allowable number of a record
create or replace procedure t_dili1_2
(n in int,q in out sys_refcursor)
as
begin
open q for
select a,b
from dil1 where (a,b) in(
select a,b from (select a,b,count(*) from t group by a,b having count(*)<n));
end t_dili1_2;
/
variable z refcursor
exec t_dili1_2(4,:z)
print z
Plz Reffer to my (above) comment (PL/SQL Block)..
Here dil1 is name of TABLE.
a,b are columns with repeated records.
I did is Using REFCURSOR.
Q: Display all the records of a table t (where there are several duplicated & reduplicated records present) with restricted number (n) of distinct records. (Use REFCURSOR)
Solution:
create or replace procedure t_dili1_2
(n in int,q in out sys_refcursor)
as
begin
open q for
select a,b
from t where (a,b) in(
select a,b from (select a,b,count(*) from t group by a,b having count(*)<n));
end t_dilip1_2;
/
variable z refcursor
exec t_dili1_2(4,:z)
print z
Pinal,
You are truly a SQL guru and I love your website and your articles.
The statement above that deletes duplicate records worked perfectly for me.
Thanks.
Hello All,
I have a situation where I have two tables A & B. A has the actual data, table B maintains a sequence of data that is in table A. I have a process that polls table A every 5 seconds. Whenever a record is updated in table A with a TRANSID n, table B is updated with the same TRANSID. This said, what I would like to achieve is, when I poll table A, I should get all the records that were updated in the past 5 seconds. The catch is, same record can be updated more than once and end up updated with a greater TRANSID, table B is updated with the same TRANSID, but if same record is updated, one of their columns REPORT_ID, would be the same in table A(REPORT_ID not in table B). I should be able to get the most recent of the duplicate records and other records that were updated. Any help is greatly appreciated since writing SQL queries is not one of my fortes.
YOu rock man!
how to find second highest salary of the employee table
please explain
dear Pinaldave
u r genius
u have solved my problems on deleting duplicate records
dear Pinal Dave
U r genius , u have solved my
problems on deleting records
hi
how to find 3rd maximum salary, like that how delete duplicate values in a table
Thanks
Praveen
sir,
i am very much imperssed with u r ans. but if no of columes will be more than 100 then what is the procedure
please repaly to this one to my mail id if possible
Hi,
Among Joins and Subquery, which one is better approach?
Can u pls tell me this.
Thanks
Ramesh
very helpful for me to delete duplicate records
Hi,
I need to delete all the duplicate records with MAPID being the duplicated FIELD where ADDRESSCOUNT = 0 being the other defining criteria. My table looks like;
ID MAPID ADDRESSCOUNT
111 54560 4
132 54560 0
198 23429 1
240 29584 1
248 29584 0
Any help appreciated.
Using MS SQL Server 2000.
Regards,
Jim
how i find recent row updated i would like to use this row in trigger after update in table
hi Mr. Dave,
This question was asked to me in an interview and I was unable to answer.
Now I got the solution.
ThankU Very Much
Hi jim(54),
this should work for you.
DELETE
FROM duptest
WHERE MAPID IN
(select MAPID
from duptest
group by mapid
having count(mapid) > 1) and AddressCount = 0
cheers,
anand.
very nice logic…u r great
–For finding second highest salary
select max(salary) from emp where salary<max(salary)
How to recover deleted records from any perticular table
in case of no id column in table one can delete duplicate rows as below
create view abc as select *,row_number() OVER (PARTITION BY dupcol1,dupcol2,… ORDER BY dupcol1,dupcol2,…) as rnum from
table
delete from abc where rnum > 1
drop view abc
Hi All,
I would like to share one suggestion that is :
should’t We conclude each topic with one best answer (if we can).
Hi,
Thank you so much for sharing your knowledge.
Great Work.
hi,
realy this is super i have not found any where very nice.
cheers,
Satish
Hello Sir,
I recently joined your site, and found it really very helpful.
How about using ‘ROWID’ to delete the duplicate rows.
Please check this query.
DELETE
FROM MyTable
WHERE ROWID NOT IN
(SELECT MIN(ROWID)
FROM MyTable
GROUP BY DUPL_COL1,DUPL_COL2,DUPL_COL3)…
–All col names
Now, my question is that if I have more than two duplicate records I want to keep 2 of them and to remove rest.
How can I do it?
Please help out.
Thank you.
Hi
its a great thing to share knowledge
thanks for your help
ssatish kumar
Hi ATIN(59),
For finding only second highest salary – -
select * from
(select * from employee orderby salary desc
where rownum>=2)
minus
select * from
(select * from employee orderby salary desc
where rownum>=1) ;
And to get Only Top 2 Highest salary –
select * from
(select * from employee orderby salary desc
where rownum>=2);
Thanks & Regards
Angadi Doddappa
Thanks Pinal Dave….
U are doing a great Job…
All the Best all Of u…
Jai Hind…
Hi,
This page looks really cool, hope I will get answer for my question, I have a table with 35 columns and have duplicate rows based on 6 columns. So how do I remove duplicates and keep the original rows in the table, keep in mind table has around 500,000 rows.
hiii
i want to select the second highest salary from an Employee table!
how to do that!
Thanks!
Hi, please help me…. I couldn’t find answer for this anywhere…
I have a table with a column having data type varchar and it has data with numbers and alphabets in it. I want to replace the data in the column with ‘ ‘ where there are alphabets.
(REPLACE((column),’S',”)) — I know this would replace s with ‘ ‘, but how to replace all the alphabets with ‘ ‘.
Please answer, I urgently need the answer.
Thanks to all in advance….
Hellooo
I have a table for example EMployees, it has a PID which is auto number, and other 2 fields one EmployeeID and other is Name.
what i want is to delete duplicated data in Employee ID and Name for sure, what to do plz….
Thanks
DELETE
FROM EMployees
WHERE Employee ID IN
(select Employee ID
from EMployees
group by EMployees
having count(EMployees) > 1)
just try this……;
can any budy explain trigger concept…
this question i have asked in one of my interview, and requirement is like delete record should be in single query its ok if you can write subquery, but without use of temp table, without use cursor, and without use of insert identity column. still i am waiting for the answer,
Pls give solution as in sql server 2000
Hi
I am having a problem in generating reports from two SQL tables.
eg.
table1.
loc1 10
loc2 10
loc3 40
Table2
loc1 xx
loc1 yy
loc1 zz
loc3 pp
I want to print.
Table1 + table2
loc1 10 loc1 xx
null loc1 yy
null loc1 zz
loc2 10 null
loc3 40 loc3 pp
Any way to generate this using joins or any other.Pl. advise
pankaj
U are doing a great Job…
All the Best all Of u…
i like this site
thanks
Great solution, however….i have a table that has…………..
1] Duplicate Id’s
2] Unique id’s
Now on the basis of your query i would be able to delete the duplicate id’s(and records corresponding to that is)..but that would delete all subsequent unique id’s as well.
So basically i wanto find all duplicates, choose one, delete the rest and keep the unique records as well.
Could you or anyone please help me find a solution to this…
Thanks!
Marvellous men!!!!!!!!1
Hi ,
Pls help in this issue .
create table test1
( A int , B varchar(2) , C varchar(20) , D int , E varchar(20))
insert into test1 values (1,’A',’aaa’ , 111, ‘adf’)
insert into test1 values (2,’B',’bbb’ , 222, ‘adsf’)
insert into test1 values (3,’C',’ccc’ , 333, ‘cfe’)
insert into test1 values (1,’A',’aaa’ , 444, ‘vgd’)
insert into test1 values (4,’D',’ddd’ , 555, ‘qef’)
Now for the above table Columns A , B and C (combined) form the unique identifiers for a given record.
I would like a query to retrieve the unique records(identified by columns A , B and C in the table )
i.e
2,’B',’bbb’ , 222, ‘adsf’
3,’C',’ccc’ , 333, ‘cfe’
4,’D',’ddd’ , 555, ‘qef’
Constraint being that i cannot add any composite / identity for the three columns (A,B ,C)
Any help is highly appreciated……
its really nice
regds
faisal qureshi
My solution is:
select * from DUPLICATE
union
select * from DUPLICATE
This query will filter out all duplicate records from DUPLICATE table
Tarun
71.
Hi, please help me…. I couldn’t find answer for this anywhere…
I have a table with a column having data type varchar and it has data with numbers and alphabets in it. I want to replace the data in the column with ‘ ‘ where there are alphabets.
(REPLACE((column),’S’,”)) — I know this would replace s with ‘ ‘, but how to replace all the alphabets with ‘ ‘.
Please answer, I urgently need the answer.
The Answer is as follow :
You have to use Stuff function in place of Replace.
e.g.
select stuff(column,1,len(column),”) from Table_name
A.K.Rastogi
This soluton worked perfectly in my situation! I had read many other suggestions online but most of them were overly complicated and usually required creating additional tables. Thanks for posting this!
Hi Pradeep,
Today I saw your question. It’s late, but if you haven’t find any solution below may help…
____________________________________________________________________________________________
SELECT * FROM test1 t1, (SELECT A, B, C FROM test1 GROUP BY A,B,C HAVING COUNT(1)>1) t2
WHERE t1.A t2.A AND t1.B t2.B AND t1.C t2.C
____________________________________________________________________________________________
Regards
Saroop
____________________________________________________________________________________________
SELECT * FROM test1 t1, (SELECT A, B, C FROM test1 GROUP BY A,B,C HAVING COUNT(1)>1) t2
WHERE t1.A <> t2.A AND t1.B <> t2.B AND t1.C <> t2.C
____________________________________________________________________________________________
thanks satish
tahnks ravi
cool !!!!!!!
hi dave,
i m really imprssd with ur blog. n u knw d bst part is u rply to each possible post.
hoping to get best out dis blog
thanks
and all d best
deepti
Hi all,
i have doupt ..plz clarify that.
create table a(empid int)
create table b(empid1 int)
insert into a values(1)
insert into b values(1)
insert into a values(2)
insert into b values(3)
output:
empid
———–
1
2
empid1
———–
1
3
with out using temporary table to del the comman record from separate two table.
Wow this is nice and simple!
Hi All,
Those who have requested for
How to get 1st, 2nd, 3rd, 4th, nth topmost salary from an Employee table
The following solution is for getting 6th highest salary from Employee table ,
SELECT TOP 1 salary FROM (SELECT DISTINCT TOP 6 salary FROM employee
ORDER BY salary DESC) a ORDER BY salary
You can change and use it for getting nth highest salary from Employee table as follows
SELECT TOP 1 salary FROM (SELECT DISTINCT TOP n salary FROM employee
ORDER BY salary DESC) a ORDER BY salary
where n > 1 (n is always greater than one)
I need to be able to select only the record with the max(datetime) of the same guid. Is this possible? I only want the record with the date of 6:53:02 and not 5:07:32 and I want to keep the record with 5:07:19 (different guid)
b16569a1-e3dc-4f03-b2a0-bf3e2677c466 rptProviders_Provider2Location 4/13/2008 5:07:19 AM 4/11/2008 6:46:29 PM
36b6419a-a3a5-49ad-b84d-942c89c4754c rptProviders_Provider2Location 4/13/2008 5:07:32 AM 4/11/2008 6:46:29 PM
36b6419a-a3a5-49ad-b84d-942c89c4754c rptProviders_Provider2Location 4/13/2008 6:53:02 AM 4/11/2008 6:46:29 PM
Here is the query I have so far:
SELECT P2L.ProviderId, dbo.AUDIT.TableName, dbo.AUDIT.OccurredAt, P2L.xLastUpdated
FROM dbo.rptProviders_Provider2Location AS P2L WITH (NOLOCK) INNER JOIN
dbo.rptProviders_Provider2LocationCategory AS P2LC WITH (NOLOCK) ON P2L.xId = P2LC.Provider2LocationId INNER JOIN
dbo.AUDIT ON P2L.xId = dbo.AUDIT.RowId
I’ve got a slightly different problem. I’m creating a stored procedure which will retrieve several different columns from a hospital table (month, year, patient name, intervention hours, and date changed) Those aren’t the real column names; the idea is that records are kept when a patient’s behavior requires staff intervention. The records are entered monthly, but frequently–in a given time range–the reports show patients’ names more than once, because people update the intervention time numbers (stuff is loaded in wrong, then corrected, or loaded in as 0 to meet a deadline, then updated, etc). I need to get rid of the duplications, meaning that when names are shown more than once for a given month and year, I only want one of the duplicated intervention records to show up. I want to filter it by “date changed,” meaning the most recent entry for each patient.
So when I run an “interventions” report for specified time frame at a given hospital, I want to see the intervention hours broken down by patient name (but I only want to see the most recently entered record for each patient who had intervention activity during the time parameters of the report).
How can I modify my SELECT statement to do this?
how to solve this problem?
ID(identity column) name
——————————————
1 david
2 john
3 ram
4 paul
5 aish
6 raj
7 mani
8 madu
9 cotter
now i deleted the 4′th row
4 paul
ok
now how can i display the remaining names as given below
ID(identity column) name
——————————————
1 david
2 john
3 ram
4 aish
5 raj
6 mani
7 madu
8 cotter
any body can help me sir?
saranpselvam@gmail.com
Is it possible to delete a single column from a table in SQL 2000.
Thanks
This works in 2000. WARNING: According to MS, SET ROWCOUNT will not work with INSERT, DELETE, and UPDATE in later versions.
– Create table with a number of values between zero and nine
select a+b+c as val
into dbo.rmtemp
from (select 0 a union all select 1 union all select 2 union all select 3) a
, (select 0 b union all select 1 union all select 2 union all select 3) b
, (select 0 c union all select 1 union all select 2 union all select 3) c
– Show what you’ve got
select val,count(*) row_count from dbo.rmtemp group by val
– Limit processing to a single row
set rowcount 1
– While you’ve got duplicates, delete a row
while (select top 1 val from dbo.rmtemp group by val having count(*) > 1) is not null
delete from dbo.rmtemp where val in (select top 1 val from dbo.rmtemp group by val having count(*) > 1);
– Remove single row processing limit
set rowcount 0
– Confirm that only uniques remain
select val,count(*) row_count from dbo.rmtemp group by val
– Clean up
drop table dbo.rmtemp
– Comment
I knew there had to be a better way.
[...] pinaldave I am proud on readers of this blog. One of the reader asked asked question on article SQL SERVER – Delete Duplicate Records – Rows and another reader followed up with nice quick answer. Let us read them both [...]
hi pinal dave ,
i have a question for u , i want to retrive the rows from a table by avoiding duplicate rows with all coloumns .i.e; duplicate row sholud be retrieved only once . any help reg…
Its really a nice
Hi Pinal,
Thanks for your tips.
I have a question. We have a table that have many contacts that are duplicated more than once. And we need a query that excludes all duplicated people who has already registered is there a more efficient way of doing it than this?
SELECT CD.email
, CD.contactid
FROM contact CD
WHERE (CD.opt_out = 0)
AND (CD.email ‘NoEmail’)
AND (CD.email IS NOT NULL)
AND (CD.email ”)
AND (CD.registered = 0)
AND Lower (CD.email) + ‘||’ + Lower (cd.firstname) + ‘||’ + Lower (CD.lastname) NOT IN
(SELECT Lower (CD1.email) + ‘||’ + Lower (CD1.firstname) + ‘||’ + Lower (CD1.lastname)
FROM contact CD1
WHERE (CD.registered = 1)
)
GROUP BY CD.contactid
, CD.email
Thank you,
EstebanD
Dear Sir,
i a have a table with id as one column,name as another column , in name there r diffrent names is there is possible to display all the alphabets in the name column.it is very urgent.where i can see mu solution if else post it to my id.
I am having a query inside the view which is returning more than one value. I need all the values to be returned in a separate row while running the view. Can ypu provide me a solution?
Dear sir
I am faceing a problem in sql query.
I have two table.
1st table is maste table, In master table total 180 records are stored.
our requirement total 180 rows – 20 rows are stored in 2nd table, then display 160 mark as ‘N’ bcos stored in master table and other records are mark as ‘Y’ bcos its stored in 2nd table.
Table 1st
AppID FunCode SubFuncode FunName SubFunName Mark
Dis 1 30 Main 0 N
Depos 2 30 &Main 0 N
Lox 3 30 &Main 0 N
LCredit 4 30 &Main 0 N
In 2nd table same records but only 20 records are stored.
AppID FunCode SubFuncode FunName SubFunName Mark
Dis 1 30 &Main 0 Y
Depos 2 30 &Main 0 Y
In query we try to fetch all Y marked row and N marked with Union query.
SELECT TBLAPPLICATIONS.Name, TBLGROUPFUNCTIONS.AppID, TBLGROUPFUNCTIONS.FunctionCode, TBLALLFUNCTIONS.DefaultSubFunctionName,
TBLGROUPFUNCTIONS.SubFunctionCode, ‘Y’ AS AssignedFunction
FROM TBLGROUPFUNCTIONS INNER JOIN
TBLALLFUNCTIONS ON TBLGROUPFUNCTIONS.AppID = TBLALLFUNCTIONS.AppID AND
TBLGROUPFUNCTIONS.FunctionCode = TBLALLFUNCTIONS.FunctionCode AND
TBLGROUPFUNCTIONS.SubFunctionCode = TBLALLFUNCTIONS.SubFunctionCode INNER JOIN
TBLAPPLICATIONS ON TBLGROUPFUNCTIONS.AppID = TBLAPPLICATIONS.AppID
WHERE TBLGROUPFUNCTIONS.GroupID = 12
UNION ALL
SELECT TBLAPPLICATIONS.Name, TBLALLFUNCTIONS.AppID, TBLALLFUNCTIONS.FunctionCode,
TBLALLFUNCTIONS.DefaultSubFunctionName, TBLALLFUNCTIONS.SubFunctionCode, ‘N’ AS AssignedFunction
FROM TBLALLFUNCTIONS INNER JOIN TBLAPPLICATIONS ON TBLALLFUNCTIONS.AppID = TBLAPPLICATIONS.AppID
WHERE
(TBLALLFUNCTIONS.AppID NOT IN (SELECT AppID FROM TBLGROUPFUNCTIONS WHERE GroupID = 12) OR
TBLALLFUNCTIONS.FunctionCode NOT IN (SELECT FunctionCode FROM TBLGROUPFUNCTIONS WHERE GroupID = 12) OR
TBLALLFUNCTIONS.SubFunctionCode NOT IN (SELECT SubFunctionCode FROM TBLGROUPFUNCTIONS WHERE GroupID = 12))
Order BY aPPID,FunctionCode,sUBfUNCTIONcODE
From this query record is not show correctley.
Please please immediate reply me.
I am waiting your reply please suggest me.
Asit Sinha
Thanx Mhlove:
Your solution was succesfull for me, for delete duplicate data in secondary column but the same ID.
Regards.
Hi,
I have an issue with and sql query i’m trying to get right and I seem not to find the answer.
I have a table with 4 colums: id1,id2,value1,value2.
I need to select all rows with a unique pair of (id1,id2) and use a 3rd column in the select for something like : sum(value1*value2) for duplicate rows.
So basicaly, show 1 row for each pair and next get a sum of all the values for that pair (sum the duplicate rows on value1,value 2).
I only been able to select the unique pairs until now using a group by id1,id2 but i’m having troble with the sums of the duplicate rows for each pair.
I’d appreciate your help
Hi Bogdon,
It would have been easy to provide solution if you would have provided proper Input and desired output. Anyways see it the below works for you based on my understanding
CREATE TABLE #T1 (ID1 INT, ID2 INT, Value1 INT, Value2 INT)
INSERT INTO #T1 VALUES (1,1,5,10)
INSERT INTO #T1 VALUES (1,1,6,12)
INSERT INTO #T1 VALUES (1,1,50,100)
INSERT INTO #T1 VALUES (1,2,7,14)
INSERT INTO #T1 VALUES (1,2,8,16)
INSERT INTO #T1 VALUES (2,1,2,5)
INSERT INTO #T1 VALUES (2,1,3,6)
INSERT INTO #T1 VALUES (2,3,10,20)
INSERT INTO #T1 VALUES (2,3,25,50)
SELECT * FROM #T1;
SELECT ID1, ID2, SUM(Value1*Value2) FROM #T1 GROUP BY ID1, ID2;
WITH T1 AS (SELECT (ROW_NUMBER() OVER (PARTITION BY ID1 ORDER BY ID1)) AS RNum, * FROM #T1 )
DELETE FROM T1 WHERE Rnum IN (SELECT a.RNum AS S FROM T1 AS a, T1 AS b WHERE a.RNum > b.RNum GROUP BY a.RNum)
SELECT * FROM #T1
DROP TABLE #T1
Post the input and desired output if the above is not what you want.
Vamshi
Sorry WITH statement is not yours…
CREATE TABLE #T1 (ID1 INT, ID2 INT, Value1 INT, Value2 INT)
INSERT INTO #T1 VALUES (1,1,5,10)
INSERT INTO #T1 VALUES (1,1,6,12)
INSERT INTO #T1 VALUES (1,1,50,100)
INSERT INTO #T1 VALUES (1,2,7,14)
INSERT INTO #T1 VALUES (1,2,8,16)
INSERT INTO #T1 VALUES (2,1,2,5)
INSERT INTO #T1 VALUES (2,1,3,6)
INSERT INTO #T1 VALUES (2,3,10,20)
INSERT INTO #T1 VALUES (2,3,25,50)
SELECT * FROM #T1;
SELECT ID1, ID2, SUM(Value1*Value2) FROM #T1 GROUP BY ID1, ID2;
I dont know why you said GROUP BY didnt worked for you if the above is what you are expecting
Hi Haneef and Preveen
Hope this would help you in getting all those employees with Nth Highest salary
DECLARE @n INT
SELECT @n = 2 — Change N value to get Nth highest value
CREATE TABLE #T1 (ID1 INT, [Name] NVARCHAR(50), Salary INT)
INSERT INTO #T1 VALUES (1, ‘Vamshi’, 1000)
INSERT INTO #T1 VALUES (2, ‘xxxxx’, 2000)
INSERT INTO #T1 VALUES (3, ‘yyyyy’, 3000)
INSERT INTO #T1 VALUES (4, ‘zzzzz’, 4000)
INSERT INTO #T1 VALUES (5, ’sssss’, 5000)
INSERT INTO #T1 VALUES (6, ‘ccccc’, 6000)
INSERT INTO #T1 VALUES (7, ‘ppppp’, 2000)
INSERT INTO #T1 VALUES (8, ‘aaaaa’, 4000)
INSERT INTO #T1 VALUES (9, ‘bbbbb’, 5000)
SELECT a.ID1, a.[Name], a.Salary
FROM #T1 AS a
WHERE (@n-1) = (
SELECT COUNT(DISTINCT(b.Salary))
FROM #T1 AS b
WHERE b.Salary > a.Salary)
DROP TABLE #T1
Vamshi
This is amazing Pinal.
It resolved my problem.
Cheers.
Hi,
I need to Get a primary key in a particular table in MS-SQL SERVER 2000. How Can i get. Is there any query is available???
Thanks in advance
Mohan.V
Pinal, great information on your site!
I think Senthilnathan in comment #16 has the right solution that can be used for deleting duplicates and for finding the nth row in a group. The key is ROW_NUMBER() function with a Partition.
– EXAMPLE TO GET THE 5th highest salary of the “Manager” group.
– CREATE Common Table Expression CTE
With MyCTE AS (
SELECT
ROW_NUMBER() OVER (PARTITION EmployeeType
ORDER BY EmployeeType, Salary) AS OrderedGroupId,
EmployeeId,
EmployeeType,
Salary
FROM SalaryHistory)
SELECT EmployeeId, — Query results of CTE
Salary
FROM MyCTE
WHERE EmployeeType = “Manager”
AND OrderedGroupId = 5
This could also be a DELETE statement deleting duplicate items in a PARTITION (DELETE FROM MyCTE WHERE OrderedGroupId > 1)
@Mohan,
1. If you want to see the primary on a specific table then Execute this stored procedure,
Sp_help table_name — you have to give table name
It will display all the table properties, like columns, keys, indexes…
In constraints section it will show you all the keys details, on which column primary key is made.
2. If you want to see all the primary keys on all tables, then use this query,
SELECT A.CONSTRAINT_NAME ‘CONSTRAINT NAME’ , A.COLUMN_NAME ‘COLUMN NAME’, A.TABLE_NAME’TABLE NAME’, A.ORDINAL_POSITION ‘POSITION OF COLUMN’
FROM INFORMATION_SCHEMA.TABLE_CONSTRAINTS B, INFORMATION_SCHEMA.KEY_COLUMN_USAGE A
WHERE B.CONSTRAINT_NAME =A.CONSTRAINT_NAME AND CONSTRAINT_TYPE = ‘PRIMARY KEY’
ORDER BY A.TABLE_NAME, ORDINAL_POSITION
– Run this script as is don’t change anything
This will give you primary key name, on which column it is made and which table it is made and also position of the column.
Sometimes, we create composite primary keys, meaning primary key on more than one column on one table in that case when you run the above query you will see table name more than once, dont get confuse, see the “position of the column” ( in the output) and then you will come to know that this table has a composite primary key.
** A table can have only one Primary key.
Rahul ask how if there is no key in the table to delete the duplicate data,
Please use this sql to do it, @ID, @Lname and @fname is depend on our field in the table:
DECLARE @ID varchar(150), @lname varchar(150), @fname varchar(150), @MyCount int, @Counting Int
DECLARE authors_cursor CURSOR FOR
SELECT id, FNAME, LNAME, COUNT(*) AS myCount
FROM dbo.Table1
GROUP BY id, FNAME, LNAME HAVING (COUNT(*) > 1)
open authors_cursor
FETCH NEXT FROM authors_cursor
INTO @ID, @fname, @LName, @MYCount
WHILE @@FETCH_STATUS = 0 BEGIN
DECLARE aut2 CURSOR local scroll dynamic OPTIMISTIC FOR
SELECT *
FROM table1
WHERE id = @id AND lname = @lname AND fname = @fname
OPEN aut2
SET @counting = 1
FETCH next FROM aut2
begin
WHILE @Counting < @MyCount BEGIN
print @counting
DELETE FROM table1 WHERE CURRENT OF aut2
SET @Counting = @Counting + 1
FETCH next FROM aut2
END
–end
end
CLOSE aut2
DEALLOCATE aut2
FETCH NEXT FROM authors_cursor
INTO @ID, @Lname, @FName, @MYCount
END
CLOSE authors_cursor
DEALLOCATE authors_cursor
Answer for Rasmi, please study this one:
declare @MyStr varchar(150), @pjg int
DECLARE authors_cursor CURSOR FOR
SELECT LNAME
FROM dbo.Table1
open authors_cursor
FETCH NEXT FROM authors_cursor
INTO @mystr
WHILE @@FETCH_STATUS = 0 BEGIN
set @pjg=len(@mystr)-1
while @pjg>=0
begin
print substring(@mystr,len(@mystr)-@pjg,1)
if isnumeric(substring(@mystr,len(@mystr)-@pjg,1))=0
set @mystr=replace (@mystr,substring(@mystr,len(@mystr)-@pjg,1),”)
set @pjg=@pjg-1
end
print @myStr
update table1 set lname=@mystr where current of authors_cursor
FETCH NEXT FROM authors_cursor
INTO @mystr
end
print @myStr
CLOSE authors_cursor
DEALLOCATE authors_cursor
In most cases, two duplicate records in a table may not be the same. For example, a customer may enter his name or address, or phone number in different ways, but only from his email, we know this is the same person.
My question is, if a table contains duplicate records like this, and I only want to keep latest record of the dups. I mean, based on customer’s email, and the date of registration, can we have a way to delete the duplicate records? for example:
table_customers contains “email”, “name”, and “date”:
ab@c.com John Smith 24/8/2006
ab@c.com John S. 12/9/2007
…
I only wants to keep the latest record if two records have the same email, no matter if the names are the same or not.
Thank you for your help.
Mike
For Mike may be you can use this Query:
DELETE table_customers
FROM (SELECT Email, MAX(Date) AS Date
FROM table_customers
GROUP BY Email
HAVING Email = ‘ab@c.com’)) DERIVEDTBL
WHERE table_customers.Email = DERIVEDTBL.Email AND table_customers.Date DERIVEDTBL.Date
Hi
I have a question. Below is the Employee table
EmpID EmpName Salary
———– —————————– ———-
1 Aashish 15000
1 Aashish 15000
3 Gunjan 25000
3 Gunjan 25000
5 Atul 35000
6 Animesh 20000
Now I have to keep only one record either of the two and remove the duplicate record. Condition is There should not be any use of temp tables, no primary keys, no identity columns, no cursors. Everything has to be handeled in a query . And Database is MS SQL Server 2000
For Aashish Mangal, You Can Use Query Analyzer Like This:
ALTER TABLE [Employee] ADD [MySpecialIdx] [int] IDENTITY (1, 1) NOT NULL
Go
DELETE [Employee]
FROM (SELECT EmpID, EmpName, Salary, MAX(MySpecialIdx) AS mySpecialIdx
FROM [Employee]
GROUP BY EmpID, EmpName, Salary)) DERIVEDTBL
WHERE table_customers.EmpID = DERIVEDTBL.EmpID AND [Employee].EmpName = DERIVEDTBL.EmpName AND AND
[Employee].EmpName = DERIVEDTBL.EmpName AND Employee.myspecialIdx DerivedTbl.myspecialidx
Go
ALTER TABLE [Employee] DROP COLUMN MySpecialIdx
Go
Or You Can Use In View and Execute Step By Step
The Next for Aashish Mangal, we can use this query to delete the duplicate record:
WHILE (SELECT TOP 1 COUNT(id) AS Expr1
FROM dbo.Table1
GROUP BY id, FNAME, LNAME
HAVING (COUNT(id) > 1)) > 1
BEGIN
set rowcount 1
DELETE table1
FROM (SELECT TOP 1 id, FNAME, LNAME
FROM dbo.Table1
GROUP BY id, FNAME, LNAME
HAVING (COUNT(id) > 1)) DERIVEDTBL
WHERE table1.id = derivedtbl.id AND table1.fname = derivedtbl.fname AND table1.lname = derivedtbl.lname
END
==
If this query will be saved, the first time you must add any table to query than remove the query text and replaced with the above query text, then you can saved this query with your desired name.
Sorry, in the fact, that query only can be executed and can not be saved.
Hi Pinal,
I had the same query i tried using Rank() function.
It worked for me.
Attaching code here:
WITH DeleteDuplicates
AS
(
SELECT RANK() OVER (PARTITION BY DUPFIELD ORDER BY PKFIELD ) AS RANK, * FROM TABLENAME
)
DELETE FROM DeleteDuplicates WHERE RANK > 1
Wish add more value to the topic.
Thanks,
Mit_2807
hi sir i did like this to delete duplicate rows…
sir i accomplished this using rank functions… and CTE
1)create table emp (id int,name varchar(20))
2) i inserted 1,’rakesh ‘ 3 times
and 2,’sagar’ 2 times
deleting duplicate rows…
with cte
as
(
select id,name,rank() over(order by id) r,row_number() over(order by id) rn from emp
)
delete from cte
where r rn
will this query degrade the performance or not ……
hi sir
could you please give script for sending SMTP mail
USING SQLSERVER….
Dear All,
i have a table called Player_details in that i have a data’ like below without any key i want to delete one record and another in data how to do this.
100 chennaiRoyals 9841998470
101 DareDevils 9984725487
101 DareDevils 9984725487
Both are same as like.
can u pls help me
To Find the 2nd Highest Salary:
2nd highest salary
1)SELECT max(salary) FROM Employee
WHERE salary < (SELECT max(salary) FROM employee)
Purushot
your query works great !!! finally I found an elegant way to solve the problem without exploiting new sql commands which in SQL 2005 don’t exist yet.
Dear All,
i have a table called Player_details in that i have a data’ like below without any key i want to delete one record and another in data how to do this.
100 chennaiRoyals 9841998470
101 DareDevils 9984725487
101 DareDevils 9984725487
Both are same as like.
can u pls help me
Dear Purushot
Suppose that table Player_details contains fields idx, Name and ID, and the table contains records as follows:
100 chennaiRoyals 9841998470
101 DareDevils 9984725487
101 DareDevils 9984725487
103 DareDevilx 9984725487
103 DareDevilx 9984725487
104 DareDevily 9984725487
104 DareDevily 9984725487
104 DareDevily 9984725487
The we want delete the duplicate record, those are the records with idx=101 and idx=103 and idx=104, so that just 1 record will be kept, we can follow these step:
1. Please create this query and save as PlayerDetails_ForDeletingDuplicate
SELECT *
FROM dbo.Player_details a
WHERE ((SELECT COUNT(*)
FROM dbo.Player_details b
WHERE a.idx = b.idx AND a.name = b.name AND a.id = b.id) > 1)
2. Please run this query, if we found any record viewed, then go to next step
3. Edit this query, change the line:
select *
with
delete dbo.Player_details
4. In the top this query add this line
set rowcount 1
5. Execute this query several times until no more records affected
6. close this query without save
Dear All,
I have a table which can accept duplicate records, but it should not accept duplicate within 20 seconds.I want to delete duplicate row which has been created back to back in 20 seconds. Is there any way to delete these records? Please let me know if you need more info about this.
Thanks in advance.
Cheers,
Saravanan
Dear All,
I forgot to tell you one thing. I got a column which stores created date in that table.
Cheers,
Saravanan
i like 2 learn more about the sql queries ,so please send me any updated information about the sql.
Good coding you guys have displayed. I have just added a while loop to eliminate all duplicates.
–create fruit table and populate it with duplicate fruits
CREATE TABLE #tblFruit
(
ID int identity,
Fruit varchar(7)
)
INSERT INTO #tblFruit
VALUES(’Banana’)
INSERT INTO #tblFruit
VALUES(’Banana’)
INSERT INTO #tblFruit
VALUES(’Pear’)
INSERT INTO #tblFruit
VALUES(’Orange’)
INSERT INTO #tblFruit
VALUES(’Orange’)
INSERT INTO #tblFruit
VALUES(’Apple’)
INSERT INTO #tblFruit
VALUES(’Banana’)
INSERT INTO #tblFruit
VALUES(’Apple’)
–display all the fruits you have, and count how many duplicates you have to see if your code works
SELECT * FROM #tblFruit
SELECT COUNT(FRUIT) FROM #tblFruit
SELECT COUNT(DISTINCT(FRUIT)) FROM #tblFruit
–loop through your table and delete any duplicates, display your new table w/o duplicates, count if you still have duplicates or not, delete your temp table
WHILE EXISTS
(
SELECT MIN(ID)
FROM #tblFruit
GROUP BY FRUIT
HAVING COUNT(FRUIT) > 1
)
BEGIN
DELETE
FROM #tblFruit
WHERE ID IN ( SELECT MIN(ID) FROM #tblFruit
GROUP BY Fruit
HAVING COUNT(Fruit) > 1
)
SELECT * FROM #tblFruit
SELECT COUNT(FRUIT) FROM #tblFruit
SELECT COUNT(DISTINCT(FRUIT)) FROM #tblFruit
DROP TABLE #tblFruit
END
Thanks Vamshi…It helped me a lot
Pinal,
Thanks for the excellent web site. Just wanted to point out a small typo.
You have group by DuplicateValueColumn2 twice. It should be DuplicateValueColumn3. I know most people would catch it. Just thought you should update the web site solution with correct code. Thanks.
Ash
Hi ;
I have two queries which is below. I want to get the records which are excess in Region1.table don’t want display common records also
1). SELECT * FROM
Region1.table
WHERE POST_KNTNR = ‘6731′
AND SLSKB_INIT = ‘NL’
AND VAKD = ‘DKK’
AND year(MDTGL_TMSTM) = 2008
2). SELECT * FROM
Region2.table
WHERE POST_KNTNR = ‘6731′
AND SLSKB_INIT = ‘NL’
AND VAKD = ‘DKK’
AND year(MDTGL_TMSTM) = 2008
Hi Ravichandra,
Use the following code. Replace COLUMN_LIST with column names. Let me know if you have any issues.
SELECT COLUMN_LIST FROM (SELECT DISTINCT ‘U’ AS SETNAME,
* FROM REGION1.TABLE WHERE POST_KNTNR = ‘6731′
AND SLSKB_INIT = ‘NL’
AND VAKD = ‘DKK’
AND YEAR (MDTGL_TMSTM) = 2008
UNION ALL
SELECT DISTINCT NULL, * FROM
REGION2.TABLE
WHERE POST_KNTNR = ‘6731′
AND SLSKB_INIT = ‘NL’
AND VAKD = ‘DKK’
AND YEAR (MDTGL_TMSTM) = 2008) A
GROUP BY COLUMN_LIST
HAVING COUNT (*) = 1 AND MAX (SETNAME) = ‘U’
Hi,
A table contains PRIMARY KEY CLUSTERED and it contains 20 partitions. How to delete a FIRST (1st) Partition.
Thanks
GAG
hi sir
i m very impressed with your article ?
it is great sir
To delete the DuplicateRows when the ID column is UniqueIdentifier
DELETE
FROM MyTable where Id not in(
SELECT Top 1 b.Id
FROM MyTable as b
WHERE b.Col1= MyTable.Col1
AND b.Col2= MyTable.Col2
AND b.Col3= MyTable.Col3
)
Hi,
Do the following steps:
1: Create a temp table with the same column names as the main table
2: Copy the duplicate records by using count>1 in the where clause
3: Delete the duplicate records in the main table using the count>1
4: Insert the records from the temp table to the original table
5: Clear the temp table
If in the sql server table there is no primary key in the table and the data is huge and we want to delete the duplicate records from that data by using the query so tell me which is the feasible way for deleting duplicate records.
Please help with the following:
SELECT
date,
name,
desc,
‘Code’ as cdDesc
FROM activity
WHERE
date = ‘2009-01-05′ and
site = ‘80′ and
exists
(select code
from activity
where code in (’55544′,’33333′, ‘66666′)
GROUP BY code HAVING COUNT(*) > 2)
UNION ALL
SELECT
date,
name,
desc,
‘Code2′ as cdDesc
FROM activity
WHERE
date = ‘2009-01-05′ and
site = ‘80′ and
exists
(select code
from activity
where code in (’2222′,’3333′, ‘66666′, ‘88888′)
GROUP BY code HAVING COUNT (*) > 3)
The issue is that some code(s) exist in multiple rows and I only need the whole group to display. I used “AND’ but that gave me no results. ANY HELP would be appreciated.
CREATE TABLE #temp
(SAP_FL varchar(40),
EQUIPMENT_NO varchar(18),
SHORT_DESCR varchar(100),
COMPL_DATE datetime,
NOTIFICATION_NO varchar(12),
ACTIVITY_CODE VARCHAR(10),
ACTIVITY_NAME VARCHAR(100),
TOTAL_COUNT float,
CONST_TYPE varchar(100),
UNIT varchar(50),
PriorityID int,
)
————————————————————-
insert into #temp values ( ‘1023-A615400-443401-00901-00401′ ,50001564 , ‘Ultrasonic stack #30 ‘ , ‘11/14/2005′ , ‘500722482′ , ‘REPL’ , ‘Replaced’ , 70628 , 80159257 , ‘EA’ , 2 )
insert into #temp values ( ‘1023-A614400-443401-00901-00501′ ,50001564 , ‘Ultrasonic stack #30 ‘ , ‘4/15/2005′ , ‘500357639′ , ‘REPL’ , ‘Replaced’ , 70404 , 80159257 , ‘EA’ , 3 )
insert into #temp values ( ‘1023-A615400-443401-00901-00201′ ,50001564 , ‘Ultrasonic stack #30 ‘ , ‘3/21/2005′ , ‘500329094′ , ‘REPL’ , ‘Repaired’ , 70321 , 80159257 , ‘EA’ , 2 )
insert into #temp values ( ‘1023-F306400-411401-00801′ ,’50001721′, ‘Module- Cutting Fluff Ovrnite/wing 295mm’, ‘10/4/2008′ ,’502737991′,’REPR’, ‘Repaired’, 218242 ,’80296971′ ,’EA’, 3)
insert into #temp values ( ‘1023-F306400-411401-00801′ ,’50001721′, ‘Module- Cutting Fluff Ovrnite/wing 295mm’, ‘8/29/2008′ ,’502675451′,’REPR’, ‘Repaired’, 218242 ,’80296971′ ,’EA’, 3)
insert into #temp values ( ‘1023-F306400-411401-00801′ ,’50001721′, ‘Module- Cutting Fluff Ovrnite/wing 295mm’, ‘8/29/2008′ ,’502620150′,’REPR’, ‘Repaired’, 218242 ,’80296971′ ,’EA’, 3)
insert into #temp values ( ‘1023-F306400-411401-00801′ ,’50001721′, ‘Module- Cutting Fluff Ovrnite/wing 295mm’, ‘6/18/2008′ ,’502495333′,’REPR’, ‘Repaired’, 208032 ,’80296971′ ,’EA’, 3)
insert into #temp values ( ‘1023-F306400-411401-00801′ ,’50001721′, ‘Module- Cutting Fluff Ovrnite/wing 295mm’, ‘9/7/2007′ ,’501944796′,’REPR’, ‘Repaired’, 197999 ,’80296971′ ,’EA’, 3)
insert into #temp values ( ‘1023-F306400-411401-00801′ ,’50001721′, ‘Module- Cutting Fluff Ovrnite/wing 295mm’, ‘9/23/2005′ ,’500648350′,’REPR’, ‘Repaired’, 159277 ,’80296971′ ,’EA’, 3)
insert into #temp values ( ‘1023-F306400-411401-00801′ ,’50001721′, ‘Module- Cutting Fluff Ovrnite/wing 295mm’, ‘12/17/2004′,’500182407′,’REPR’, ‘Repaired’, 118935 ,’80296971′ ,’EA’, 3)
—————————————————-
select
t1.SAP_FL as SAP_FL_ST ,
t2.SAP_FL as SAP_FL_ED,
t1.EQUIPMENT_NO as EQUIPMENT_NO_ST,
t2.EQUIPMENT_NO as EQUIPMENT_NO_ED,
t1.COMPL_DATE as COMPL_DATE_ST ,
t2.COMPL_DATE as COMPL_DATE_ED,
t1.SHORT_DESCR as SHORT_DESCR_ST,
t2.SHORT_DESCR as SHORT_DESCR_ED,
t1.NOTIFICATION_NO as NOTIFICATION_NO_ST,
t2.NOTIFICATION_NO as NOTIFICATION_NO_ED,
t1.ACTIVITY_CODE as ACTIVITY_CODE_ST,
t2.ACTIVITY_CODE as ACTIVITY_CODE_ED ,
t1.ACTIVITY_NAME as ACTIVITY_NAME_ST ,
t2.ACTIVITY_NAME as ACTIVITY_CODE_ED,
t1.TOTAL_COUNT as TOTAL_COUNT_ST ,
t2.TOTAL_COUNT as TOTAL_COUNT_ED ,
t1.TOTAL_COUNT -
t2.TOTAL_COUNT as TOTAL_COUNT_DIFFERENCE,
t1.CONST_TYPE as CONST_TYPE_ST ,
t2.CONST_TYPE as CONST_TYPE_ED ,
t1.UNIT
from (select SAP_FL as ‘SAP_FL’,
EQUIPMENT_NO as ‘EQUIPMENT_NO’,
SHORT_DESCR as ‘SHORT_DESCR’,
COMPL_DATE as ‘COMPL_DATE’,
NOTIFICATION_NO as ‘NOTIFICATION_NO’,
ACTIVITY_CODE as ‘ACTIVITY_CODE’,
ACTIVITY_NAME as ‘ACTIVITY_NAME’,
CONST_TYPE as ‘CONST_TYPE’,
UNIT as ‘UNIT’
,max(TOTAL_COUNT) as ‘TOTAL_COUNT’
from #temp
– where EQUIPMENT_NO = 50001721
group by SAP_FL ,
EQUIPMENT_NO ,
COMPL_DATE ,
NOTIFICATION_NO ,
ACTIVITY_CODE ,
ACTIVITY_NAME ,
SHORT_DESCR ,
CONST_TYPE ,
UNIT )t1
INNER JOIN #temp t2
ON t1.EQUIPMENT_NO=t2.EQUIPMENT_NO
AND t1.ACTIVITY_CODE=t2.ACTIVITY_CODE
AND t1.COMPL_DATE>t2.COMPL_DATE
=====================================
OUTPUT–if we considered equi no 50001721
==================================
SAP_FL_ST SAP_FL_ED EQUIPMENT_NO_ST EQUIPMENT_NO_ED COMPL_DATE_ST COMPL_DATE_ED NOTIFICATION_NO_ST NOTIFICATION_NO_ED ACTIVITY_CODE_ST ACTIVITY_CODE_ED ACTIVITY_NAME_ST ACTIVITY_NAME_ED TOTAL_COUNT_ED TOTAL_COUNT_ST TOTAL_COUNT_DIFFERENCE SHORT_DESCR_ST SHORT_DESCR_ED CONST_TYPE_ST CONST_TYPE_ED UNIT
1023-F306400-411401-00801 1023-F306400-411401-00801 50001721 50001721 10/4/2008 8/29/2008 502737991 502675451 REPR REPR Repaired Repaired 218242 218242 0 Module, Cutting Fluff Ovrnite/wing 295mm Module, Cutting Fluff Ovrnite/wing 295mm 80296971 80296971 EA
1023-F306400-411401-00801 1023-F306400-411401-00801 50001721 50001721 10/4/2008 8/29/2008 502737991 502620150 REPR REPR Repaired Repaired 218242 218242 0 Module, Cutting Fluff Ovrnite/wing 295mm Module, Cutting Fluff Ovrnite/wing 295mm 80296971 80296971 EA
1023-F306400-411401-00801 1023-F306400-411401-00801 50001721 50001721 8/29/2008 8/29/2008 502675451 502620150 REPR REPR Repaired Repaired 218242 218242 0 Module, Cutting Fluff Ovrnite/wing 295mm Module, Cutting Fluff Ovrnite/wing 295mm 80296971 80296971 EA
1023-F306400-411401-00801 1023-F306400-411401-00801 50001721 50001721 10/4/2008 6/18/2008 502737991 502495333 REPR REPR Repaired Repaired 218242 208032 10210 Module, Cutting Fluff Ovrnite/wing 295mm Module, Cutting Fluff Ovrnite/wing 295mm 80296971 80296971 EA
1023-F306400-411401-00801 1023-F306400-411401-00801 50001721 50001721 8/29/2008 6/18/2008 502620150 502495333 REPR REPR Repaired Repaired 218242 208032 10210 Module, Cutting Fluff Ovrnite/wing 295mm Module, Cutting Fluff Ovrnite/wing 295mm 80296971 80296971 EA
1023-F306400-411401-00801 1023-F306400-411401-00801 50001721 50001721 8/29/2008 6/18/2008 502675451 502495333 REPR REPR Repaired Repaired 218242 208032 10210 Module, Cutting Fluff Ovrnite/wing 295mm Module, Cutting Fluff Ovrnite/wing 295mm 80296971 80296971 EA
1023-F306400-411401-00801 1023-F306400-411401-00801 50001721 50001721 10/4/2008 9/7/2007 502737991 501944796 REPR REPR Repaired Repaired 218242 197999 20243 Module, Cutting Fluff Ovrnite/wing 295mm Module, Cutting Fluff Ovrnite/wing 295mm 80296971 80296971 EA
1023-F306400-411401-00801 1023-F306400-411401-00801 50001721 50001721 6/18/2008 9/7/2007 502495333 501944796 REPR REPR Repaired Repaired 208032 197999 10033 Module, Cutting Fluff Ovrnite/wing 295mm Module, Cutting Fluff Ovrnite/wing 295mm 80296971 80296971 EA
1023-F306400-411401-00801 1023-F306400-411401-00801 50001721 50001721 8/29/2008 9/7/2007 502620150 501944796 REPR REPR Repaired Repaired 218242 197999 20243 Module, Cutting Fluff Ovrnite/wing 295mm Module, Cutting Fluff Ovrnite/wing 295mm 80296971 80296971 EA
1023-F306400-411401-00801 1023-F306400-411401-00801 50001721 50001721 8/29/2008 9/7/2007 502675451 501944796 REPR REPR Repaired Repaired 218242 197999 20243 Module, Cutting Fluff Ovrnite/wing 295mm Module, Cutting Fluff Ovrnite/wing 295mm 80296971 80296971 EA
1023-F306400-411401-00801 1023-F306400-411401-00801 50001721 50001721 10/4/2008 9/23/2005 502737991 500648350 REPR REPR Repaired Repaired 218242 159277 58965 Module, Cutting Fluff Ovrnite/wing 295mm Module, Cutting Fluff Ovrnite/wing 295mm 80296971 80296971 EA
1023-F306400-411401-00801 1023-F306400-411401-00801 50001721 50001721 6/18/2008 9/23/2005 502495333 500648350 REPR REPR Repaired Repaired 208032 159277 48755 Module, Cutting Fluff Ovrnite/wing 295mm Module, Cutting Fluff Ovrnite/wing 295mm 80296971 80296971 EA
1023-F306400-411401-00801 1023-F306400-411401-00801 50001721 50001721 8/29/2008 9/23/2005 502620150 500648350 REPR REPR Repaired Repaired 218242 159277 58965 Module, Cutting Fluff Ovrnite/wing 295mm Module, Cutting Fluff Ovrnite/wing 295mm 80296971 80296971 EA
1023-F306400-411401-00801 1023-F306400-411401-00801 50001721 50001721 8/29/2008 9/23/2005 502675451 500648350 REPR REPR Repaired Repaired 218242 159277 58965 Module, Cutting Fluff Ovrnite/wing 295mm Module, Cutting Fluff Ovrnite/wing 295mm 80296971 80296971 EA
1023-F306400-411401-00801 1023-F306400-411401-00801 50001721 50001721 9/7/2007 9/23/2005 501944796 500648350 REPR REPR Repaired Repaired 197999 159277 38722 Module, Cutting Fluff Ovrnite/wing 295mm Module, Cutting Fluff Ovrnite/wing 295mm 80296971 80296971 EA
1023-F306400-411401-00801 1023-F306400-411401-00801 50001721 50001721 10/4/2008 12/17/2004 502737991 500182407 REPR REPR Repaired Repaired 218242 118935 99307 Module, Cutting Fluff Ovrnite/wing 295mm Module, Cutting Fluff Ovrnite/wing 295mm 80296971 80296971 EA
1023-F306400-411401-00801 1023-F306400-411401-00801 50001721 50001721 6/18/2008 12/17/2004 502495333 500182407 REPR REPR Repaired Repaired 208032 118935 89097 Module, Cutting Fluff Ovrnite/wing 295mm Module, Cutting Fluff Ovrnite/wing 295mm 80296971 80296971 EA
1023-F306400-411401-00801 1023-F306400-411401-00801 50001721 50001721 8/29/2008 12/17/2004 502620150 500182407 REPR REPR Repaired Repaired 218242 118935 99307 Module, Cutting Fluff Ovrnite/wing 295mm Module, Cutting Fluff Ovrnite/wing 295mm 80296971 80296971 EA
1023-F306400-411401-00801 1023-F306400-411401-00801 50001721 50001721 8/29/2008 12/17/2004 502675451 500182407 REPR REPR Repaired Repaired 218242 118935 99307 Module, Cutting Fluff Ovrnite/wing 295mm Module, Cutting Fluff Ovrnite/wing 295mm 80296971 80296971 EA
1023-F306400-411401-00801 1023-F306400-411401-00801 50001721 50001721 9/23/2005 12/17/2004 500648350 500182407 REPR REPR Repaired Repaired 159277 118935 40342 Module, Cutting Fluff Ovrnite/wing 295mm Module, Cutting Fluff Ovrnite/wing 295mm 80296971 80296971 EA
1023-F306400-411401-00801 1023-F306400-411401-00801 50001721 50001721 9/7/2007 12/17/2004 501944796 500182407 REPR REPR Repaired Repaired 197999 118935 79064 Module, Cutting Fluff Ovrnite/wing 295mm Module, Cutting Fluff Ovrnite/wing 295mm 80296971 80296971 EA
1023-A614400-443401-00901-00501 1023-A615400-443401-00901-00201 50001564 50001564 3/21/2005 4/15/2005 500357639 500329094 REPL REPL Replaced Repaired 70404 70321 83 Ultrasonic stack #30 Ultrasonic stack #30 80159257 80159257 EA
1023-A615400-443401-00901-00401 1023-A614400-443401-00901-00501 50001564 50001564 4/15/2005 11/14/2005 500722482 500357639 REPL REPL Replaced Replaced 70628 70404 224 Ultrasonic stack #30 Ultrasonic stack #30 80159257 80159257 EA
1023-A615400-443401-00901-00401 1023-A615400-443401-00901-00201 50001564 50001564 3/21/2005 11/14/2005 500722482 500329094 REPL REPL Replaced Repaired 70628 70321 307 Ultrasonic stack #30 Ultrasonic stack #30 80159257 80159257 EA
here in output it puts tries to take start date as max date and calcuates difference based on max(date)- all remaining dates
but i want in this sequence
1023-F306400-411401-00801 50001721 Module, Cutting Fluff Ovrnite/wing 295mm 10/4/2008 502737991 REPR Repaired 218242 80296971 EA 3
1023-F306400-411401-00801 50001721 Module, Cutting Fluff Ovrnite/wing 295mm 8/29/2008 502675451 REPR Repaired 218242 80296971 EA 3
1023-F306400-411401-00801 50001721 Module, Cutting Fluff Ovrnite/wing 295mm 8/29/2008 502620150 REPR Repaired 218242 80296971 EA 3
1023-F306400-411401-00801 50001721 Module, Cutting Fluff Ovrnite/wing 295mm 6/18/2008 502495333 REPR Repaired 208032 80296971 EA 3
1023-F306400-411401-00801 50001721 Module, Cutting Fluff Ovrnite/wing 295mm 9/7/2007 501944796 REPR Repaired 197999 80296971 EA 3
1023-F306400-411401-00801 50001721 Module, Cutting Fluff Ovrnite/wing 295mm 9/23/2005 500648350 REPR Repaired 159277 80296971 EA 3
1023-F306400-411401-00801 50001721 Module, Cutting Fluff Ovrnite/wing 295mm 12/17/2004 500182407 REPR Repaired 118935 80296971 EA 3
1023-A614400-443401-00901-00501 1023-A615400-443401-00901-00201 50001564 50001564 3/21/2005 4/15/2005 500357639 500329094 REPL REPL Replaced Repaired 70404 70321 83 Ultrasonic stack #30 Ultrasonic stack #30 80159257 80159257 EA
1023-A615400-443401-00901-00401 1023-A614400-443401-00901-00501 50001564 50001564 4/15/2005 11/14/2005 500722482 500357639 REPL REPL Replaced Replaced 70628 70404 224 Ultrasonic stack #30 Ultrasonic stack #30 80159257 80159257 EA
—————————————————————————
can anybody has the solution for this?
@qute/cute
Dude, Your query is incomplete, I spent nearly an hour working on this, But I could not understand what you want in the select statement.
You say you are trying to calculate difference, where is the difference, you have 4 columns you call them Total_Count, you are not even using any function to calculate difference, query is not clear at all and also the sample data… I mean atleast give me the select query properly, data ( insert statements )you provided are not correct comparing to the out put what you have given in your post.
Please post your complete questions.
Regards,
IM.
@dyamond
When ever you use, Group BY, you need to have a aggregate function in your select list. The subquery you are trying to use has group by but no Aggregate function, also you are saying having Count(*) > 2 at one place and having Count(*) > 3 at another place, but where exactly is this count(*), are you writing this any place.
SELECT
[date],
[name],
[desc],
‘Code’ as cdDesc
FROM activity
WHERE date = ‘2009-01-05′
and site = ‘80′
and exists
(
select code , count(*) Counts
from activity
where code in (’55544′,’33333′, ‘66666′)
GROUP BY code HAVING COUNT(*) > 2
)
UNION ALL
SELECT
[date],
[name],
[desc],
‘Code2′ as cdDesc
FROM activity
WHERE date = ‘2009-01-05′
and site = ‘80′
and exists
(
select code, count(*) Counts
from activity
where code in (’2222′,’3333′, ‘66666′, ‘88888′)
GROUP BY code HAVING COUNT (*) > 3
)
Try using the code like above.
I did not test the script functionality, since I do not have sample dat, If script still does not work then please post table structure script and some sample data and expected output.
So that we can help you out.
Regards,
IM.
/*
Input Param:
@vchStartDate [in yyyymmdd format]
@intDuration [specifying the no. of days for end date calculation]
This code block will calculate the end date based on the supplied start date and duration.
Finally it will return the dates and the day starting from start date to end date both inclusive.
Ist Resultset- Returns All Dates in the range starting from start date to end date.
IInd Resultset- Returns All Dates in the range starting from start date to end date excluding saturdays and sundays.
IIIrd Resultset- Returns All Dates in the range starting from start date to end date excluding weekdays.
*/
DECLARE @vchStartDate VARCHAR(10)
DECLARE @intDuration INT
DECLARE @LclvchEndDate VARCHAR(10)
/* Set input values */
SET @vchStartDate = ‘20090101′
SET @intDuration = 50
/* Calculate End Date */
SET @LclvchEndDate = CONVERT(VARCHAR(10),DATEADD(dd,@intDuration, CAST(@vchStartDate AS DATETIME)),112);
WITH mycte AS
(
SELECT CAST(@vchStartDate AS DATETIME) DateValue
UNION ALL
SELECT DateValue + 1
FROM mycte
WHERE DateValue + 1 <= @LclvchEndDate
)
SELECT DateValue
INTO #tbl_data
FROM mycte
OPTION (MAXRECURSION 0)
/* Ist Resultset- Returns All Dates in the range starting from start date to end date. */
SELECT CONVERT(VARCHAR(10),DateValue,121) Date,
[Day] =
CASE
WHEN DATEPART(dw,DateValue)= 1 THEN ‘SUNDAY’
WHEN DATEPART(dw,DateValue)= 2 THEN ‘MONDAY’
WHEN DATEPART(dw,DateValue)= 3 THEN ‘TUESDAY’
WHEN DATEPART(dw,DateValue)= 4 THEN ‘WEDNESDAY’
WHEN DATEPART(dw,DateValue)= 5 THEN ‘THURSDAY’
WHEN DATEPART(dw,DateValue)= 6 THEN ‘FRIDAY’
WHEN DATEPART(dw,DateValue)= 7 THEN ‘SATURDAY’
END
FROM #tbl_data
/* IInd Resultset- Returns All Dates in the range starting from start date to end date excluding saturdays and sundays. */
SELECT CONVERT(VARCHAR(10),DateValue,121) Date,
[Day] =
CASE
WHEN DATEPART(dw,DateValue)= 2 THEN ‘MONDAY’
WHEN DATEPART(dw,DateValue)= 3 THEN ‘TUESDAY’
WHEN DATEPART(dw,DateValue)= 4 THEN ‘WEDNESDAY’
WHEN DATEPART(dw,DateValue)= 5 THEN ‘THURSDAY’
WHEN DATEPART(dw,DateValue)= 6 THEN ‘FRIDAY’
END
FROM #tbl_data
WHERE DATEPART(dw,DateValue) NOT IN (1,7)
/* IIIrd Resultset- Returns All Dates in the range starting from start date to end date excluding weekdays. */
SELECT CONVERT(VARCHAR(10),DateValue,121) Date,
[Day] =
CASE
WHEN DATEPART(dw,DateValue)= 1 THEN ‘SUNDAY’
WHEN DATEPART(dw,DateValue)= 7 THEN ‘SATURDAY’
END
FROM #tbl_data
WHERE DATEPART(dw,DateValue) IN (1,7)
DROP TABLE #tbl_data
Hi pinal,
your quary is good ut it has some limitation like identity col.
i have a better idea to do delete duplicate record using CTE.
like:
CREATE TABLE [dbo].[testing3](
[id] [int] NULL,
[name] [varchar](50)
) ON [PRIMARY]
with dup as(select *,row_number() over(partition by id order by id)as TID from testing3)
delete from dup where tid >1
Hello sir,
You r genious, I love your articles.
thanks
Hi Pinal,
I want to learn sql server.I am new in this field.
Thanks
Execute these queries for learning how to remove duplicate record.
CREATE TABLE dbo.Test1 (
[ID] [int] ,
[FirstName] [varchar](25),
[LastName] [varchar](25)
) ON [PRIMARY]
INSERT INTO Test1 VALUES(1, ‘Bob’,'Smith’)
INSERT INTO Test1 VALUES(2, ‘Dave’,'Jones’)
INSERT INTO Test1 VALUES(3, ‘Karen’,'White’)
INSERT INTO Test1 VALUES(1, ‘Bob’,'Smith’)
INSERT INTO Test1 VALUES(4, ‘Bobby’,'Smita’)
select identity(int,1,1) as SlNo,* into #temp from Test1
DELETE
FROM #temp
WHERE SlNo NOT IN
(
SELECT MAX(SlNo)
FROM #temp
GROUP BY ID,FirstName,lastname
)
drop table test1
select * into test1 from #temp
alter table test1 drop column SlNo
select * from test1 order by id
[...] SQL SERVER – Delete Duplicate Records – Rows [...]
–Try this to delete multiple records
;with t1 as
(
select id,Row_number() over (partition by a, b, c order by a, b, c) as rnum
from mytable with(nolock)
)
delete from t1 where rnum>1
[...] Same example converted in SQL Server 2005 to work with Database AdventureWorks. USE AdventureWorks; GO SELECT TOP 1 Rate FROM ( SELECT DISTINCT TOP 4 Rate FROM HumanResources.EmployeePayHistory ORDER BY Rate DESC) A ORDER BY Rate GO Reference : Pinal Dave (http://blog.SQLAuthority.com), Pravin Phatangare [...]
hi guys
i have a problem ??? what if the entire row is duplicate ??
with i dentity column also same .
i am transfering data from a table having duplicate rows to a table having a primary key
pls help me out
gaurav,
Praney’s sql in the comments will help you out. I have the same situation and that solved the problem.
;with t1 as
(
select id,Row_number() over (partition by a, b, c order by a, b, c) as rnum
from mytable with(nolock)
)
delete from t1 where rnum>1
Hi Gaurav,
I have duplicate records in a table,till recently i added a
datetime field in my table called timestamp.Now when
duplicates go into this table i will be able to see the current
date in the timestamp
Firstname Lastname memberno timestamp
Jack Hill 11111 2009-12-11
Jack Hill 11111 null
Simon Philip 222222 2009-12-11
Simon Philip 222222 2009-12-12
Sam Wheat 33333 nulll
Sam Wheat 33333 null
In this table i want to fetch a duplicate set which has
one record with a valid timestamp and other timestamp
as null – something like
Firstname Lastname memberno timestamp
Jack Hill 11111 2009-12-11
Jack Hill 11111 null
select max(timestamp) from member group by firstname,lastname having count(*)>1
How do i modify my display duplicate statement to have the resultset which includes duplicates with and without
a valid timestamp
Hi Pinal
With reference to the above question
Can anyone help me out with a query which can display only duplicates with or witout a timestamp, they should
not include duplicates members all having timestamps
Hi Luke,
You can use ROW_Number() to identify Duplicate Rows
Here You need to do as:
;with CTE AS(
select Row_NUmber() (OVER PARTITION BY FirstName, LastName, MemberNo Order BY TimeStamp ASC) AS RowID
FROM table
)
select * from cte where rowID=1
this will give you Records with NULL. If you want data with Dates then you need to change “ORDER BY” to “Order BY TimeStamp DESC”
For further details, you can read it on my blog: http://tejasnshah.wordpress.com/2008/12/31/find-duplicate-rows-with-row_number-sql-server-2005/
Thanks,
Tejas
Hi Pinal,
Your articles have always helped me. Just curious if I could rationalize data in the example below using a variant of this example (I have an option to use SSIS Fuzzy lookup, but its way off my application scope. So it is ruled out.)
Cheers,
Krish
HI PINAL,
I HAVE READ YOUR ARTICLES AND RESPONSES FOR THE QUERIES AND I FOUND IT GOOD.
I TOO HAVE ONE QUERY.
THERE IS TABLE1 AND TABLE2
TABLE 1 CONTAINS 100 RECORDS.
TABLE 2 CONTAINS 150 RECORDS.
TABLE 2 CONTAINS SOME RECORDS WHICH ARE ALREADY PRESENT IN TABLE 1.
NOW WHEN I MERGE BOTH IT ADDS ALL THE RECORDS GIVING TOTAL NO. TO 250.
FROM THESE I WANT TO REMOVE THE DUPLICATE DATA WHICH I GOT FROM TABLE 2 AND ARE ALREADY PRESET IN TABLE 1.
I have tried above methods but did not yield required results.
Be Helpful…
I got the solution of above problem
FRIEND in a NEED
Another Question.
i am joining TABLE1 (older) And TABLE2 (newer).
i found some records which are common in both table.
i want to remove found common data from TABLE2(newer)
How can i perform that operation?
GENERAL QUESTION
How can we perform the Update,Delete operation on query of join
–common records in two tables using inner join
select *
from TABLE1 as e
inner join TABLE2 as c
on e.respondentid = c.respondentid
Vikas, will UNION work for you?
SELECT … FROM Table_1
UNION
SELECT … FROM Table_2
Vikas, for the second question:
DELETE FROM Table2 WHERE EXISTS(SELECT * FROM Table1 WHERE Table1.respondentid = Table2.respondentid)
thnx buddy tkatch.
i have table TABLE1 with data inside.
i want to add one column name as “Serial Number”.
inside this i want to give Automatic Serial No.
Say there are 5000 records then the Serial Number Column should contain1,2,3,4…..5000 at last record.
Be Beedful
i got the solution of above query.
“alter table table1 add ID INT IDENTITY(1,1)”
But this inserts the column at the end.
i want this column to be inserted at the start.
i.e. First Column should be of identity Column
i have a table TABLE1 of Five Column.
I want to insert one more column in this table after First Column.
How can i insert the New Column from SQL Editor in between the Columns Present not at the end?
Hi Pinal,
I HAVE READ YOUR ARTICLES AND RESPONSES FOR THE QUERIES AND I FOUND IT GOOD.
Actully i m new in SQL SERVER2005.
I stucked in a query, if find the solution:-
Table1 Table2
col1|col2 col1|col2
10 100 10 100
11 150 15 200
15 200 13 210
21 210 23 100
Output should be:
col1|col2
10 100
11 150
21 210
13 210
23 100
Hi Pinal,
i have one more query:
Delete the duplicates rows from a table if there is not any primary key.
Please find the solution and Thanks in advance.
To aLL,
I have one table TABLE! , which contains one column named “EMPLOYEE NAME”.
There are various values inside it.
i want to check every field of EMPLOYEE NAME and find out if there is any Numeric Value inside.
EG.
EMPLOYEE NAME
Billgates
George
A1dam –Here 1 is present in between
Gem9ini –Here 9 is present in between
i want to identify this types of name which contains the numeric value in the field.
using the instructions that you posted about adding an id column:
“Good Question. I have been asking this question in interview many times to new candidates. Answer is:
1) Add Identity Col and perform the operation displayed in this blog and drop Identity Col.
2) User Cursor and save your values in temp table. If you receive the same row which you put in temp table delete it.
3) You can use while loop as well do the same as cursor”
i get the error message from sql telling me it cannot insert the value NULL into column “id” column does not allow nulls (i have got this after trying to set the primary key on the id field)
@Domenic
I don’t think, I understood your question completely,
I think you asked, why do you get this error:
error message from sql telling me it cannot insert the value NULL into column “id” column does not allow nulls (i have got this after trying to set the primary key on the id field)
Try this,
– Step 1: Create a table
CREATE TABLE example1 ( id INT NOT NULL , ename VARCHAR(10))
– Step 2 : Insert some data in this table, few duplicate records
INSERT INTO example1 VALUES( 1, ‘imran’)
INSERT INTO example1 VALUES( 1, ‘imran’)
INSERT INTO example1 VALUES( 2, ‘imran’)
INSERT INTO example1 VALUES( 2, ‘imran’)
INSERT INTO example1 VALUES( 3, ‘imran’)
– Step 3: lets add Identity Column
ALTER TABLE example1
ADD Iden INT IDENTITY
– Step 4: Lets remove duplicate records.
DELETE
FROM example1
WHERE Iden NOT IN
(
SELECT MAX(Iden)
FROM example1
GROUP BY id, ename)
– Check if duplicate records still exists ?
SELECT * FROM example1
– Step 6: drop Identity column, which we added earlier.
ALTER TABLE example1
DROP COLUMN Iden
– Step 7: Lets make id column Primary key.
ALTER TABLE example1
ADD CONSTRAINT PK_Example1 PRIMARY KEY (id)
This works well with no issues.
if you replace step 1: with below script
– step1 (a) : Create a table
CREATE TABLE example1 ( id INT , ename VARCHAR(10))
if you repeat from step 2 – Step 7, step7 will fail, because id column in example1 table allows Null as you can see, I did not create that column with NOT NULL constraint, that is why you cannot make that column as primary key.
Does this answer your question ? If not please be clear what you actually want to ask ?
~ IM.
hello
thanks for this script
Table1 Table2
ID Sim_No Sim_No Plate
1 2203080 2203080 AD2345
2 2203081
3 2203082
How I delete The Sim No From Table1(Same Record In Table2 ) ( If I Take any No From Table1 & put In To Table 2)
@Jerry
Whatever routine is used to INSERT the data INTO Table2, can also DELETE from Table1.
Or, for a general solution (which may not be very quick) DELETE any records in Table1 that EXIST in Table2:
DELETE FROM Table1 WHERE EXISTS (SELECT * FROM Table2 WHERE Table2.Sim_No = Table1.Sim_No)
ItemTable
ItemNo BatchNo Qty
11 101 100
11 102 500
12 101 100
12 101 500
13 105 1000
11 101 600
Now I want like this
ItemNo BatchNo Qty
11 101 700
11 102 500
12 101 600
13 105 1000
how to write query for this
the above formate has Sample record in our project it have 35000 record
@tamilselvan
Do you want a regular GROUP BY?
SELECT ItemNo, BatchNo, SUM(Qty)
GROUP BY ItemNo, BatchNo;
create table example1 (ItemNo int ,BatchNo int ,Qty int)
insert into example1 values (11 ,101, 100)
insert into example1 values (11, 102, 500)
insert into example1 values (12, 101, 100)
insert into example1 values (12, 101, 500)
insert into example1 values (13, 105, 1000)
insert into example1 values (11, 101, 600)
select ItemNo
,BatchNo
,Sum(Qty) Qty
from example1
group by ItemNo
,BatchNo
I have a few records that are duplicates, all I need to do is mark them as duplicates.
select field01 from table01
field01
AA
AB
AC
AB
AA
AD
All that I require is
select field01, mark01 from table01 to return
field01 ; mark01
AA; 1
AB; 1
AC; 1
AB; 2
AA; 2
AD; 1
@gpshiburaj
ROW_NUMBER() does that:
WITH
Data(field01)
AS
(
SELECT ‘AA’ UNION ALL
SELECT ‘AB’ UNION ALL
SELECT ‘AC’ UNION ALL
SELECT ‘AB’ UNION ALL
SELECT ‘AA’ UNION ALL
SELECT ‘AD’
)
SELECT
field01,
ROW_NUMBER() OVER(PARTITION BY field01 ORDER BY field01)
FROM
Data;
Thanks Brian,
I was thinking that we would have to use a cursor for this, using row_number is easier.
Hello Pinal,
Is there any procedure to restore deleted records from particular table in database?
Thank you
Amit Jain
Hello sir,
You r genious, I love your articles.
thanks
Hey Guys thats all fine . U can store the values in the Temp table remove the duplicate records and empty the original table . Then move all the unique records to the original table . But its a lengthy process suppose i am having 2 million records in a table and from that table if i want to ren=move the duplicate records then it would really affect the performance . Sorry Since i am not having so much of exp and if i had hurted someone . Pinal Sir do let me know if i am wrong