In my previous article SQL SERVER – Delete Duplicate Records – Rows, we have seen how we can delete all the duplicate records in one simple query. In this article we will see how to find count of all the duplicate records in the table. Following query demonstrates usage of GROUP BY, HAVING, ORDER BY in one query and returns the results with duplicate column and its count in descending order.
SELECT YourColumn, COUNT(*) TotalCount
FROM YourTable
GROUPÂ BY YourColumn
HAVING COUNT(*)Â > 1
ORDERÂ BY COUNT(*) DESC
Watch the view to see the above concept in action:
[youtube=http://www.youtube.com/watch?v=ioDJ0xVOHDY]Reference : Pinal Dave (https://blog.sqlauthority.com) , SQL SERVER – Delete Duplicate Records – Rows
187 Comments. Leave new
SELECT EBD.sFirstName + ‘ ‘ + EBD.sMiddleName + ‘ ‘ + EBD.sLastName AS EmpName,
EBD.sFatherName ,
EOD.dJoinigDate,
–‘Present’,’Weekly Off’,’Holiday’,’Half Day’
ISNULL( (SELECT SUM(CASE WHEN EA.D1=’Present’ THEN 1 ELSE CASE WHEN EA.D1=’Weekly Off’ THEN 2 ELSE CASE WHEN EA.D1=’Holiday’ THEN 3
ELSE CASE WHEN EA.D1=’Half Day’ THEN 1 ELSE 0 END)
FROM tbl_empAttendenceLog EA Where EA.iEmployeeID=EBD.iEmployeeID and EA.sMonth =’January’ Group by EA.iEmployeeID ),0) as Oneday
FROM tblEmployeeBasicDetail EBD INNER JOIN
tblEmpOfficialDetails EOD ON EBD.iEmployeeID = EOD.iEmployeeID INNER JOIN
tblClientBranch CB ON EOD.iBranchID = CB.iBranchID
please give a query to find duplicate email ids
in this format:
for eg:
[email removed]
this means it has repeated 4 times. i want both mail id and number
Wow, lots of questions on this. Just goes to show that you can lead a horse to water, but you can’t make him drink.
hi,
i have data that contains item, subinventory, locator and quantity like:
saw warehouse box1 100
saw warehouse box2 100
saw warehouse box3 100
saw warehouse box4 100
saw workshop table1 20
saw workshop table2 20
screw warehouse box1 200
screw warehouse box2 200
i want to divide the quantity by locator count. I need the result like:
saw warehouse box1 25
saw warehouse box2 25
saw warehouse box3 25
saw warehouse box4 25
saw workshop table1 10
saw workshop table2 10
screw warehouse box1 100
screw warehouse box2 100
can u help me how the sql do it to all data above in a line statement ?
Thx.
select item, subinventory, locator, quantity/(count(*) over (partition by item)) from table
Most of the time, is not the problem to find a duplicate record (in a good designed database, you should NOT have a duplications).
Sometimes, you need to find out a given column is duplicated for a given other column: if you have different values for the same value
(mostly a foreign key).
In this case, you count a nested select:
SELECT ID, COUNT(field) AS duplicate
FROM (SELECT DISTINCT ID, field
FROM Table) s
GROUP BY ID
HAVING (COUNT(field) > 1)
sir, i want to delete all employee details except for two employee no in the database. I have tried this one but it delete all the data..
Delete from Employee_Profile where EmpNo NOT IN
(select EmpNo from Boss_Employee_Profile
where EmpNo =’21097′ , ‘21097A’)
Good day everyone. I’m hoping that someone can help me with my issue. I have duplicate values that I am searching for and while I can easily search for and find those, I can’t figure out how to validate the first record while invalidating the others. So lets say I have the following fields:
Q_ID, Q_Status_CD, Q_MatchKey, Q_Fulfilled
I want to find all records with more than one matching Matchkey so I can use:
SELECT Q_ID, Q_Status_CD, Q_MatchKey, Q_Fulfilled_DT
FROM MyTable
WHERE Q_Fulfilled_DT Is NULL
Having Count(Q_Matchkey) > 1
This works fine for showing all duplicates. Where I am struggling is that I need to update the Q_FULFILLED_DT field on the FIRST entry with a date as well as set the Q_STATUS_CD = 1 (valid), then on all other records set the Q_STATUS_CD = 2 (in-valid). I found a work around to do this by first setting all duplicates to invalid, then finding the duplicates and by using MIN(Q_ID) to isolate the first record and then I reset the Status and Fulfilled date fields on that individual record.
This solution would be fine, except this is part of a weekly SQL job and it resets all duplicates each week and therefore places a duplicate order for that first record each week by changing the fulfilled date to the current week. I want to only update the first record when there isn’t one with a Fulfilled date already in the duplicate set. I’m not sure if this makes sense.
Hi,
The output of my query contains duplicate records (i.e. column1 value is similar but other column’s value different) as well as single record having column2 value blanks(-). now I want to select only one record out of duplicate records for which column2 values are present (eliminating records for which column2 is blanks(-) from duplicate records) and all other single records having column2 values blanks.
My query’s Present Output
column1 column2 column3
abc xyz pqr
abc – klm
ijk – uvw
plo – ujn
yhk ttg wea
yhk – erf
Expected Output
column1 column2 column3
abc xyz pqr
ijk – uvw
plo – ujn
yhk ttg wea
Can someone help me. Thanks in advance.
Try this
select column1,column2,column3 from
(
select *,row_number() over (partition by column1 order by column1) as sno from table
) as t
where sno=1
madhivanan,
Your solution worked perfectly for me with a minor tweak to meet my desired outcome. Thank you!
Hi Pinal,
Can you help me by providing the solution to my above query. It is some what urgent. Please…
Hi i have to required following solution for total no. of records
id Created date
12343 05/15/2012
12345 05/15/2012
12121 05/16/2012
12369 05/19/2012
12371 05/19/2012
12343 05/22/2012
12888 05/22/2012
12411 05/25/2012
12121 06/06/2012
12125 06/06/2012
Total Record Create Date
2 05/15/2012
1 05/16/2012
2 05/19/2012
1 05/22/2012 (Repeat ID 12343 so not count that id)
1 05/25/2012
1 06/06/2012 (Repeat ID 121121 so not count that id)
Mr. Pinal,
Request you to take some time from your busy schedule to provide the guidelines (Solution) for my query above.
Thanks
I have already replied to your question. Please check
I’ll just leave this here
SELECT Field1, Field2, MIN(Field3) AS Min, MAX(Field3) AS Max, COUNT(*) AS TotalCount
FROM (SELECT Field1, Field2, Field3
FROM Tablename
WHERE [Criteria here]
GROUP BY Field1, Field2, Field3) AS FilteredDetail
GROUP BY Field1, Field2
HAVING COUNT(*) > 1
ORDER BY COUNT(*) DESC
thanks sir
Hi Guys,
I have a list of product widgets, and we get the same product from different factories. I want to know how many are duped accross which facttories
ProductName, FactoryName
Smith Widget, Factory1
Smith Widget, Factory2
Smith Widget, Factory3
Dave Widget, Factory1
Dave Widget, Factory4
Claire Widget, Factory1
Claire Widget, Factory2
Claire Widget, Factory3
Claire Widget, Factory4
A) What I want is a SQL statement that gives a list of those Widgets that are suppled from Factory1 and Factory3
So for Factory 1 and Factory 3
Product,
Smith Widget,
Claire Widget
B) What would be amazing is
Name, Duplicates
“Factory 1 & Factory 2”, 2 Dups
“Factory 2 & Factory 3”, 2 Dups
“Factory 1 & Factory 3”, 2 Dups
“Factory 1 & Factory 4”, 1 Dups
“Factory 2 & Factory 4”, 0 Dups
“Factory 3 & Factory 4”, 1 Dups
This solution would be fine, except this is part of a weekly SQL job and it resets all duplicates each week and therefore places a duplicate order for that first record each week by changing the fulfilled date to the current week. I want to only update the first record when there isn’t one with a Fulfilled date already in the duplicate set. I’m not sure if this makes sense.
thank you. !!!
Grazie
I want to find duplicates record on the basis of fields FirstName, MiddleName, LastName from my table. I have 40 Lakhs of records in table. So any optimise query to show result in less time in SQL Server 2008 and system Specification for Such large amount of data for searching purpose.
Hi I am having a small requirement i.e I am having an Employee and Address tables where Employee table holds ID,Name and Address holds ID(foreign key) and Address. My data is as follows
Employee Address
1 Dorababu 1 XYZ
2 Vivekh 1 XYZ
3 Kaashyap 2 ABC
I would like to display all the employees who is having same address, I tried using joins but I am getting duplicates can some one help me
Hi – I’m not sure if you are still trying to accomplish this. However, you are getting duplicates because Dorababu joins to two rows in Address. This is because his primary key on Employee links to both the first and second row of Address (“1”).
If that is a typo on row 2 of Employee then this will work (if it is not a typo please give more specific instructions regarding the sample data and expected output):
IF OBJECT_ID(‘tempdb..#Employee’) IS NOT NULL DROP TABLE #Employee
IF OBJECT_ID(‘tempdb..#Address’) IS NOT NULL DROP TABLE #Address
CREATE TABLE #Employee
(
ID VARCHAR(25) NOT NULL,
Name VARCHAR(25) NOT NULL
)
INSERT INTO #Employee(ID, Name) VALUES
(‘1’, ‘Dorababu’),
(‘2’, ‘Vivekh’),
(‘3’, ‘Kaashyap’)
CREATE TABLE #Address
(
F_ID VARCHAR(25) NOT NULL,
Address VARCHAR(50) NOT NULL
)
INSERT INTO #Address(F_ID, Address) VALUES
(‘1’, ‘XYZ’),
(‘2’, ‘XYZ’),
(‘3’, ‘ABC’)
–The actual query starts here
SELECT e2.Name, s.Address
FROM
(
SELECT COUNT(*) AS Dups, a.Address
FROM #Employee e JOIN #Address a ON e.ID = a.F_ID
GROUP BY a.Address
HAVING COUNT(*) > 1
)s JOIN #Address a2 ON s.Address = a2.Address
JOIN #Employee e2 ON a2.F_ID = e2.ID
–The query ends here
DROP TABLE #Address
DROP TABLE #Employee
I decided to try a different method. This one achieves the same results with less joins:
SELECT e.Name, a.Address
FROM #Address a JOIN #Employee e ON a.F_ID = e.ID
WHERE a.Address IN
(SELECT a2.Address FROM #Address a2 GROUP BY a2.Address HAVING COUNT(a2.Address) > 1)
Thank you Pinal Dave