Following code is useful to delete duplicate records. The table must have identity column, which will be used to identify the duplicate records. Table in example is has ID as Identity Column and Columns which have duplicate data are DuplicateColumn1, DuplicateColumn2 and DuplicateColumn3.
DELETE
FROM MyTable
WHERE ID NOT IN
(
SELECT MAX(ID)
FROM MyTable
GROUP BY DuplicateColumn1, DuplicateColumn2, DuplicateColumn3)
Watch the view to see the above concept in action:
[youtube=http://www.youtube.com/watch?v=ioDJ0xVOHDY]
Reference : Pinal Dave (https://blog.sqlauthority.com)
450 Comments. Leave new
different source files has around 5 lakhs- 10 lakhs records with duplicates and final table will be having 5 billion records . How you will remove the duplicate records before inserting it into final table.
Import data into staging tables and from there remove duplicates and them to final table
create table as select distinct * from ;
Great post! Thanks!
I believe the best way to delete (multiple) dupes from a table ( that has a primary key) would be as follows:
Delete from table
where ID not in
(Select MIN(id)
from table
group by duplicateCol1, duplicateCol2……duplicateColn)
Like this answer
DELETE
FROM TestTable
WHERE TestTable.%%physloc%%
NOT IN (SELECT MIN(b.%%physloc%%)
FROM TestTable b
GROUP BY b.column1, b.Column2)
WHILE (SELECT COUNT(*) FROM TBLEMP WHERE EMPNO IN (SELECT empno from tblemp group by empno having count(empno)>1))>1
DELETE top(1) FROM TBLEMP WHERE EMPNO IN (SELECT empno from tblemp group by empno having count(empno)>1)
or
DELETE duplicates FROM
(SELECT ROW_NUMBER() OVER (PARTITION BY firstname, lastname, EMPNO, salary,dept ORDER BY empno) cnt
FROM tblEmp) duplicates
WHERE duplicates.Cnt > 1
thanks
Perfect :-) Thanks for the hint :-)
Hi,
I need to remove duplicates from a very big query. I already tried select distinct..but has not worked:
SELECT distinct Emp.Emp AS ‘Employee ID#’ , Emp.SSNSIN AS ‘Government ID #’, Emp.LastName, Emp.FirstName, Emp.FirstHireDT, Emp.HireDt AS ‘Original Hire Date’, Eml.EmpSts AS ‘Employment Status’, Eml.WrkSts AS ‘Work Status’, Eml.PayGrp AS ‘Payroll Group’, Eml.CstCde AS ‘Cost Code’, Eml.EnrTyp AS ‘Enrollment Type’, Eml.Shf AS ‘Shift’, Eml.Uno AS ‘Union’, Eml.Sup AS ‘ Supervisor #’, Eml.SupName AS ‘Supervisor Name’, Eml.Pos AS ‘Position’, Pos.PosTitle, Eml.SubRegCZ AS ‘Sub Region’, Eml.Ent AS ‘Opco’, Eml.Job AS ‘Job Title’, Eml.Cmp AS ‘Entities’, Eml.Loc AS ‘Location’, Eml.Dpt AS ‘Department’, Eml.Div AS ‘Functional Area’, Eml.Org AS ‘Sub-Functional Area’, Eml.Unt AS ‘Manager’, Eml.SalRat AS ‘Salary Rate’, Eml.PayPer AS ‘Pay Period’, Eml.AnnlSal AS ‘Annual Salary’, PayRng.PayGrd, Eml.ExmptSts, Eml.FTE, (select Max(NextReviewDt) from EmpRvw where EmpRvw.Emp=Emp.Emp) AS ‘Review_Date’, Eml.FileNo, Emp.BirthDt, Emp.HireDt, Emp.TerminationDt, Emp.Gender, Emp.MaritalStatus, (Case When BirthDt<= GetDate() Then FLOOR(datediff(day,BirthDt,getdate())/365.25) Else '0' End) AS Age, Emp.SSNSINCty, Emp.FirstHireDt, Emp.AdjustedHireDt, Eml.EnrTyp, Dpt.DptName, Div.DivName, Org.OrgName, SubRegCZ.SubRegName, Emp.VendorCZ, Emp.CustomerCZ, Emp.Rce, Emp.EMailWork, Emp.CellPhone, Emp.PhoneWork, EmpAdr.Address, EmpAdr.Address2, EmpAdr.Address3, EmpAdr.City, EmpAdr.StateProv, EmpAdr.ZipPostal, EmpAdr.PhoneHome, EmpAdr.EMailHome, (Case When HireDt<= GetDate() Then Round(floor(DateDiff(day,FirstHireDt,GetDate())/365.25)+((floor(DateDiff(month,FirstHireDt,getDate()))-(floor(DateDiff(day,FirstHireDt,GetDate())/365.25)*12))/12),2) Else 0 End) AS Years_of_Service, Eml.TimeSupCZ AS 'Supervisor Name' FROM Emp INNER JOIN Eml ON (Emp.Emp = Eml.Emp AND Eml.EmlEfdDt IS NULL AND Emp.EmpEfdDt IS NULL) LEFT JOIN Job ON (Eml.Job = Job.Job AND Job.JobEfdDt IS NULL) LEFT JOIN PayRng ON (Eml.PayRng = PayRng.PayRng AND PayRng.PayRngEfdDt IS NULL) LEFT JOIN Pos ON (Eml.Pos = Pos.Pos AND Pos.PosEfdDt IS NULL) LEFT JOIN Org ON Org.Org = Eml.Org LEFT JOIN Dpt ON Dpt.Dpt = Eml.Dpt LEFT JOIN Div ON Div.Div = Eml.Div LEFT JOIN EmpAdr ON EmpAdr.Emp = Emp.Emp LEFT JOIN SubRegCZ ON SubRegCZ.SubReg = Eml.SubRegCZ WHERE (Eml.EmlEfdDt IS NULL AND Emp.EmpEfdDt IS NULL AND Eml.EmlEfdDt IS NULL AND Job.JobEfdDt IS NULL AND PayRng.PayRngEfdDt IS NULL AND Pos.PosEfdDt IS NULL)
can someone please help me~! It is very urgent …
Always very helpful , Thanks Pinal Dave!!
CREATE TABLE TABLE1 (ID INT,Name char(10));
Insert Into Table1 (ID,Name) Values(1,’Dipak’)
Insert Into Table1 (ID,Name) Values(2,’Jyoti’)
Insert Into Table1 (ID,Name) Values(1,’Dipak’)
Insert Into Table1 (ID,Name) Values(3,’Veera’)
Insert Into Table1 (ID,Name) Values(2,’Jyoti’)
GO
DELETE DrowNo FROM
(
SELECT ROW_NUMBER() OVER (PARTITION BY Id, name ORDER BY Id) cnt
FROM Table1
) DrowNo
WHERE DrowNo.Cnt > 1
Go
Select * From Table1
Go
DROP TABLE Table1
GO
excellent work
Hello,
When am excuting above command am getting below error.
#1093 – You can’t specify target table ‘mysite’ for update in FROM clause
Please help
Can you post the exact query you used?
Hello Sir,
Please Advice how to insert unique records from one dB to Another.
Blowroom 2 April 2014
Blowroom 2 April 2014
Blowroom 2 January 2014
Blowroom 2 January 2014
Blowroom 2 March 2014
Blowroom 2 March 2014
that is a table duplicate month how to delete plz told anybody
my mail id [removed]
Here is simple magic to remove Duplicates
select * into NewTable from ExistingTable
union
select * from ExistingTable;
its work thanks again
Kishor – I am glad that it helped you.
by using common table expression we can delete duplicate records….
Yes you can
It throws an error like “You can’t specify target table ‘time_table_detail’ for update in FROM clause”
No need to use having Claus here??