SQL SERVER – Count Duplicate Records – Rows

July 11, 2007

In my previous article SQL SERVER – Delete Duplicate Records – Rows, we have seen how we can delete all the duplicate records in one simple query. In this article we will see how to find count of all the duplicate records in the table. Following query demonstrates usage of GROUP BY, HAVING, ORDER BY in one query and returns the results with duplicate column and its count in descending order.

SELECT YourColumn, COUNT(*) TotalCount FROM YourTable GROUP BY YourColumn HAVING COUNT(*) > 1 ORDER BY COUNT(*) DESC

Watch the view to see the above concept in action:

[youtube=http://www.youtube.com/watch?v=ioDJ0xVOHDY]

Reference : Pinal Dave (https://blog.sqlauthority.com) , SQL SERVER – Delete Duplicate Records – Rows

Duplicate Records, SQL Scripts

SQL SERVER – 2005 – List All Stored Procedure Modified in Last N Days

SQL SERVER – 2008 – Scheduled Launch at an Event in Los Angeles on Feb. 27, 2008

187 Comments. Leave new

prithvi singh
February 2, 2012 6:16 pm
SELECT EBD.sFirstName + ‘ ‘ + EBD.sMiddleName + ‘ ‘ + EBD.sLastName AS EmpName,
EBD.sFatherName ,
EOD.dJoinigDate,
–‘Present’,’Weekly Off’,’Holiday’,’Half Day’
ISNULL( (SELECT SUM(CASE WHEN EA.D1=’Present’ THEN 1 ELSE CASE WHEN EA.D1=’Weekly Off’ THEN 2 ELSE CASE WHEN EA.D1=’Holiday’ THEN 3
ELSE CASE WHEN EA.D1=’Half Day’ THEN 1 ELSE 0 END)
FROM tbl_empAttendenceLog EA Where EA.iEmployeeID=EBD.iEmployeeID and EA.sMonth =’January’ Group by EA.iEmployeeID ),0) as Oneday
FROM tblEmployeeBasicDetail EBD INNER JOIN
tblEmpOfficialDetails EOD ON EBD.iEmployeeID = EOD.iEmployeeID INNER JOIN
tblClientBranch CB ON EOD.iBranchID = CB.iBranchID
Reply
salim
February 11, 2012 7:39 pm
please give a query to find duplicate email ids
in this format:
for eg:
[email removed]
this means it has repeated 4 times. i want both mail id and number
Reply
Tony
March 16, 2012 8:40 pm
Wow, lots of questions on this. Just goes to show that you can lead a horse to water, but you can’t make him drink.
Reply
Sandi Keren
March 20, 2012 8:57 am
hi,
i have data that contains item, subinventory, locator and quantity like:
saw warehouse box1 100
saw warehouse box2 100
saw warehouse box3 100
saw warehouse box4 100
saw workshop table1 20
saw workshop table2 20
screw warehouse box1 200
screw warehouse box2 200
i want to divide the quantity by locator count. I need the result like:
saw warehouse box1 25
saw warehouse box2 25
saw warehouse box3 25
saw warehouse box4 25
saw workshop table1 10
saw workshop table2 10
screw warehouse box1 100
screw warehouse box2 100
can u help me how the sql do it to all data above in a line statement ?
Thx.
Reply
- madhivanan
  March 20, 2012 1:24 pm
  select item, subinventory, locator, quantity/(count(*) over (partition by item)) from table
  Reply
Antonio Reynaldos
March 26, 2012 2:23 pm
Most of the time, is not the problem to find a duplicate record (in a good designed database, you should NOT have a duplications).
Sometimes, you need to find out a given column is duplicated for a given other column: if you have different values for the same value
(mostly a foreign key).
In this case, you count a nested select:
SELECT ID, COUNT(field) AS duplicate
FROM (SELECT DISTINCT ID, field
FROM Table) s
GROUP BY ID
HAVING (COUNT(field) > 1)
Reply
Ms Sida
May 15, 2012 12:48 pm
sir, i want to delete all employee details except for two employee no in the database. I have tried this one but it delete all the data..
Delete from Employee_Profile where EmpNo NOT IN
(select EmpNo from Boss_Employee_Profile
where EmpNo =’21097′ , ‘21097A’)
Reply
K. Maynard
May 31, 2012 10:04 pm
Good day everyone. I’m hoping that someone can help me with my issue. I have duplicate values that I am searching for and while I can easily search for and find those, I can’t figure out how to validate the first record while invalidating the others. So lets say I have the following fields:
Q_ID, Q_Status_CD, Q_MatchKey, Q_Fulfilled
I want to find all records with more than one matching Matchkey so I can use:
SELECT Q_ID, Q_Status_CD, Q_MatchKey, Q_Fulfilled_DT
FROM MyTable
WHERE Q_Fulfilled_DT Is NULL
Having Count(Q_Matchkey) > 1
This works fine for showing all duplicates. Where I am struggling is that I need to update the Q_FULFILLED_DT field on the FIRST entry with a date as well as set the Q_STATUS_CD = 1 (valid), then on all other records set the Q_STATUS_CD = 2 (in-valid). I found a work around to do this by first setting all duplicates to invalid, then finding the duplicates and by using MIN(Q_ID) to isolate the first record and then I reset the Status and Fulfilled date fields on that individual record.
This solution would be fine, except this is part of a weekly SQL job and it resets all duplicates each week and therefore places a duplicate order for that first record each week by changing the fulfilled date to the current week. I want to only update the first record when there isn’t one with a Fulfilled date already in the duplicate set. I’m not sure if this makes sense.
Reply
Prakash Paranjape
June 1, 2012 1:00 pm
Hi,
The output of my query contains duplicate records (i.e. column1 value is similar but other column’s value different) as well as single record having column2 value blanks(-). now I want to select only one record out of duplicate records for which column2 values are present (eliminating records for which column2 is blanks(-) from duplicate records) and all other single records having column2 values blanks.
My query’s Present Output
column1 column2 column3
abc xyz pqr
abc – klm
ijk – uvw
plo – ujn
yhk ttg wea
yhk – erf
Expected Output
column1 column2 column3
abc xyz pqr
ijk – uvw
plo – ujn
yhk ttg wea
Can someone help me. Thanks in advance.
Reply
- madhivanan
  June 18, 2012 6:15 pm
  Try this
  select column1,column2,column3 from
  (
  select *,row_number() over (partition by column1 order by column1) as sno from table
  ) as t
  where sno=1
  Reply
  - Gautac18
    September 20, 2012 7:52 pm
    madhivanan,
    Your solution worked perfectly for me with a minor tweak to meet my desired outcome. Thank you!
Prakash Paranjape
June 4, 2012 11:01 am
Hi Pinal,
Can you help me by providing the solution to my above query. It is some what urgent. Please…
Reply
mehul
June 12, 2012 8:23 am
Hi i have to required following solution for total no. of records
id Created date
12343 05/15/2012
12345 05/15/2012
12121 05/16/2012
12369 05/19/2012
12371 05/19/2012
12343 05/22/2012
12888 05/22/2012
12411 05/25/2012
12121 06/06/2012
12125 06/06/2012
Total Record Create Date
2 05/15/2012
1 05/16/2012
2 05/19/2012
1 05/22/2012 (Repeat ID 12343 so not count that id)
1 05/25/2012
1 06/06/2012 (Repeat ID 121121 so not count that id)
Reply
Prakash Paranjape
June 14, 2012 3:54 pm
Mr. Pinal,
Request you to take some time from your busy schedule to provide the guidelines (Solution) for my query above.
Thanks
Reply
- madhivanan
  June 18, 2012 6:16 pm
  I have already replied to your question. Please check
  Reply
eric gerds
August 15, 2012 11:01 pm
I’ll just leave this here
SELECT Field1, Field2, MIN(Field3) AS Min, MAX(Field3) AS Max, COUNT(*) AS TotalCount
FROM (SELECT Field1, Field2, Field3
FROM Tablename
WHERE [Criteria here]
GROUP BY Field1, Field2, Field3) AS FilteredDetail
GROUP BY Field1, Field2
HAVING COUNT(*) > 1
ORDER BY COUNT(*) DESC
Reply
ajay
August 29, 2012 3:26 pm
thanks sir
Reply
Mark
September 6, 2012 8:58 pm
Hi Guys,
I have a list of product widgets, and we get the same product from different factories. I want to know how many are duped accross which facttories
ProductName, FactoryName
Smith Widget, Factory1
Smith Widget, Factory2
Smith Widget, Factory3
Dave Widget, Factory1
Dave Widget, Factory4
Claire Widget, Factory1
Claire Widget, Factory2
Claire Widget, Factory3
Claire Widget, Factory4
A) What I want is a SQL statement that gives a list of those Widgets that are suppled from Factory1 and Factory3
So for Factory 1 and Factory 3
Product,
Smith Widget,
Claire Widget
B) What would be amazing is
Name, Duplicates
“Factory 1 & Factory 2”, 2 Dups
“Factory 2 & Factory 3”, 2 Dups
“Factory 1 & Factory 3”, 2 Dups
“Factory 1 & Factory 4”, 1 Dups
“Factory 2 & Factory 4”, 0 Dups
“Factory 3 & Factory 4”, 1 Dups
Reply
Jible
October 12, 2012 9:42 pm
This solution would be fine, except this is part of a weekly SQL job and it resets all duplicates each week and therefore places a duplicate order for that first record each week by changing the fulfilled date to the current week. I want to only update the first record when there isn’t one with a Fulfilled date already in the duplicate set. I’m not sure if this makes sense.
Reply
jacksonemmerich
October 19, 2012 5:04 am
thank you. !!!
Reply
Antonio
November 23, 2012 10:40 pm
Grazie
Reply
Pranay
December 20, 2012 12:05 am
I want to find duplicates record on the basis of fields FirstName, MiddleName, LastName from my table. I have 40 Lakhs of records in table. So any optimise query to show result in less time in SQL Server 2008 and system Specification for Such large amount of data for searching purpose.
Reply
dorababu743
January 16, 2013 2:47 pm
Hi I am having a small requirement i.e I am having an Employee and Address tables where Employee table holds ID,Name and Address holds ID(foreign key) and Address. My data is as follows
Employee Address
1 Dorababu 1 XYZ
2 Vivekh 1 XYZ
3 Kaashyap 2 ABC
I would like to display all the employees who is having same address, I tried using joins but I am getting duplicates can some one help me
Reply
- Mason
  July 31, 2013 8:51 am
  Hi – I’m not sure if you are still trying to accomplish this. However, you are getting duplicates because Dorababu joins to two rows in Address. This is because his primary key on Employee links to both the first and second row of Address (“1”).
  Reply
- Mason
  July 31, 2013 9:08 am
  If that is a typo on row 2 of Employee then this will work (if it is not a typo please give more specific instructions regarding the sample data and expected output):
  IF OBJECT_ID(‘tempdb..#Employee’) IS NOT NULL DROP TABLE #Employee
  IF OBJECT_ID(‘tempdb..#Address’) IS NOT NULL DROP TABLE #Address
  CREATE TABLE #Employee
  (
  ID VARCHAR(25) NOT NULL,
  Name VARCHAR(25) NOT NULL
  )
  INSERT INTO #Employee(ID, Name) VALUES
  (‘1’, ‘Dorababu’),
  (‘2’, ‘Vivekh’),
  (‘3’, ‘Kaashyap’)
  CREATE TABLE #Address
  (
  F_ID VARCHAR(25) NOT NULL,
  Address VARCHAR(50) NOT NULL
  )
  INSERT INTO #Address(F_ID, Address) VALUES
  (‘1’, ‘XYZ’),
  (‘2’, ‘XYZ’),
  (‘3’, ‘ABC’)
  –The actual query starts here
  SELECT e2.Name, s.Address
  FROM
  (
  SELECT COUNT(*) AS Dups, a.Address
  FROM #Employee e JOIN #Address a ON e.ID = a.F_ID
  GROUP BY a.Address
  HAVING COUNT(*) > 1
  )s JOIN #Address a2 ON s.Address = a2.Address
  JOIN #Employee e2 ON a2.F_ID = e2.ID
  –The query ends here
  DROP TABLE #Address
  DROP TABLE #Employee
  Reply
- Mason
  July 31, 2013 9:25 am
  I decided to try a different method. This one achieves the same results with less joins:
  SELECT e.Name, a.Address
  FROM #Address a JOIN #Employee e ON a.F_ID = e.ID
  WHERE a.Address IN
  (SELECT a2.Address FROM #Address a2 GROUP BY a2.Address HAVING COUNT(a2.Address) > 1)
  Reply
Jewel
January 21, 2013 3:56 pm
Thank you Pinal Dave
Reply