SQL SERVER – Count Duplicate Records – Rows

In my previous article SQL SERVER – Delete Duplicate Records – Rows, we have seen how we can delete all the duplicate records in one simple query. In this article we will see how to find count of all the duplicate records in the table. Following query demonstrates usage of GROUP BY, HAVING, ORDER BY in one query and returns the results with duplicate column and its count in descending order.

SELECT YourColumn, COUNT(*) TotalCount
FROM YourTable
GROUP BY YourColumn
HAVING COUNT(*) > 1
ORDER BY COUNT(*) DESC

Watch the view to see the above concept in action:

[youtube=http://www.youtube.com/watch?v=ioDJ0xVOHDY]

Reference : Pinal Dave (https://blog.sqlauthority.com) , SQL SERVER – Delete Duplicate Records – Rows

Duplicate Records, SQL Scripts
Previous Post
SQL SERVER – 2005 – List All Stored Procedure Modified in Last N Days
Next Post
SQL SERVER – 2008 – Scheduled Launch at an Event in Los Angeles on Feb. 27, 2008

Related Posts

187 Comments. Leave new

  • SELECT EBD.sFirstName + ‘ ‘ + EBD.sMiddleName + ‘ ‘ + EBD.sLastName AS EmpName,
    EBD.sFatherName ,
    EOD.dJoinigDate,
    –‘Present’,’Weekly Off’,’Holiday’,’Half Day’
    ISNULL( (SELECT SUM(CASE WHEN EA.D1=’Present’ THEN 1 ELSE CASE WHEN EA.D1=’Weekly Off’ THEN 2 ELSE CASE WHEN EA.D1=’Holiday’ THEN 3
    ELSE CASE WHEN EA.D1=’Half Day’ THEN 1 ELSE 0 END)
    FROM tbl_empAttendenceLog EA Where EA.iEmployeeID=EBD.iEmployeeID and EA.sMonth =’January’ Group by EA.iEmployeeID ),0) as Oneday

    FROM tblEmployeeBasicDetail EBD INNER JOIN
    tblEmpOfficialDetails EOD ON EBD.iEmployeeID = EOD.iEmployeeID INNER JOIN
    tblClientBranch CB ON EOD.iBranchID = CB.iBranchID

    Reply
  • please give a query to find duplicate email ids
    in this format:
    for eg:
    [email removed]
    this means it has repeated 4 times. i want both mail id and number

    Reply
  • Wow, lots of questions on this. Just goes to show that you can lead a horse to water, but you can’t make him drink.

    Reply
  • hi,
    i have data that contains item, subinventory, locator and quantity like:

    saw warehouse box1 100
    saw warehouse box2 100
    saw warehouse box3 100
    saw warehouse box4 100
    saw workshop table1 20
    saw workshop table2 20
    screw warehouse box1 200
    screw warehouse box2 200

    i want to divide the quantity by locator count. I need the result like:

    saw warehouse box1 25
    saw warehouse box2 25
    saw warehouse box3 25
    saw warehouse box4 25
    saw workshop table1 10
    saw workshop table2 10
    screw warehouse box1 100
    screw warehouse box2 100

    can u help me how the sql do it to all data above in a line statement ?
    Thx.

    Reply
  • Antonio Reynaldos
    March 26, 2012 2:23 pm

    Most of the time, is not the problem to find a duplicate record (in a good designed database, you should NOT have a duplications).

    Sometimes, you need to find out a given column is duplicated for a given other column: if you have different values for the same value
    (mostly a foreign key).

    In this case, you count a nested select:

    SELECT ID, COUNT(field) AS duplicate
    FROM (SELECT DISTINCT ID, field
    FROM Table) s
    GROUP BY ID
    HAVING (COUNT(field) > 1)

    Reply
  • sir, i want to delete all employee details except for two employee no in the database. I have tried this one but it delete all the data..

    Delete from Employee_Profile where EmpNo NOT IN
    (select EmpNo from Boss_Employee_Profile
    where EmpNo =’21097′ , ‘21097A’)

    Reply
  • Good day everyone. I’m hoping that someone can help me with my issue. I have duplicate values that I am searching for and while I can easily search for and find those, I can’t figure out how to validate the first record while invalidating the others. So lets say I have the following fields:

    Q_ID, Q_Status_CD, Q_MatchKey, Q_Fulfilled

    I want to find all records with more than one matching Matchkey so I can use:

    SELECT Q_ID, Q_Status_CD, Q_MatchKey, Q_Fulfilled_DT
    FROM MyTable
    WHERE Q_Fulfilled_DT Is NULL
    Having Count(Q_Matchkey) > 1

    This works fine for showing all duplicates. Where I am struggling is that I need to update the Q_FULFILLED_DT field on the FIRST entry with a date as well as set the Q_STATUS_CD = 1 (valid), then on all other records set the Q_STATUS_CD = 2 (in-valid). I found a work around to do this by first setting all duplicates to invalid, then finding the duplicates and by using MIN(Q_ID) to isolate the first record and then I reset the Status and Fulfilled date fields on that individual record.

    This solution would be fine, except this is part of a weekly SQL job and it resets all duplicates each week and therefore places a duplicate order for that first record each week by changing the fulfilled date to the current week. I want to only update the first record when there isn’t one with a Fulfilled date already in the duplicate set. I’m not sure if this makes sense.

    Reply
  • Prakash Paranjape
    June 1, 2012 1:00 pm

    Hi,
    The output of my query contains duplicate records (i.e. column1 value is similar but other column’s value different) as well as single record having column2 value blanks(-). now I want to select only one record out of duplicate records for which column2 values are present (eliminating records for which column2 is blanks(-) from duplicate records) and all other single records having column2 values blanks.

    My query’s Present Output
    column1 column2 column3
    abc xyz pqr
    abc – klm
    ijk – uvw
    plo – ujn
    yhk ttg wea
    yhk – erf

    Expected Output
    column1 column2 column3
    abc xyz pqr
    ijk – uvw
    plo – ujn
    yhk ttg wea

    Can someone help me. Thanks in advance.

    Reply
    • Try this

      select column1,column2,column3 from
      (
      select *,row_number() over (partition by column1 order by column1) as sno from table
      ) as t
      where sno=1

      Reply
      • madhivanan,

        Your solution worked perfectly for me with a minor tweak to meet my desired outcome. Thank you!

  • Prakash Paranjape
    June 4, 2012 11:01 am

    Hi Pinal,

    Can you help me by providing the solution to my above query. It is some what urgent. Please…

    Reply
  • Hi i have to required following solution for total no. of records

    id Created date

    12343 05/15/2012
    12345 05/15/2012
    12121 05/16/2012
    12369 05/19/2012
    12371 05/19/2012
    12343 05/22/2012
    12888 05/22/2012
    12411 05/25/2012
    12121 06/06/2012
    12125 06/06/2012

    Total Record Create Date
    2 05/15/2012
    1 05/16/2012
    2 05/19/2012
    1 05/22/2012 (Repeat ID 12343 so not count that id)
    1 05/25/2012
    1 06/06/2012 (Repeat ID 121121 so not count that id)

    Reply
  • Prakash Paranjape
    June 14, 2012 3:54 pm

    Mr. Pinal,

    Request you to take some time from your busy schedule to provide the guidelines (Solution) for my query above.

    Thanks

    Reply
  • I’ll just leave this here

    SELECT Field1, Field2, MIN(Field3) AS Min, MAX(Field3) AS Max, COUNT(*) AS TotalCount
    FROM (SELECT Field1, Field2, Field3
    FROM Tablename
    WHERE [Criteria here]
    GROUP BY Field1, Field2, Field3) AS FilteredDetail
    GROUP BY Field1, Field2
    HAVING COUNT(*) > 1
    ORDER BY COUNT(*) DESC

    Reply
  • thanks sir

    Reply
  • Hi Guys,

    I have a list of product widgets, and we get the same product from different factories. I want to know how many are duped accross which facttories

    ProductName, FactoryName
    Smith Widget, Factory1
    Smith Widget, Factory2
    Smith Widget, Factory3
    Dave Widget, Factory1
    Dave Widget, Factory4
    Claire Widget, Factory1
    Claire Widget, Factory2
    Claire Widget, Factory3
    Claire Widget, Factory4

    A) What I want is a SQL statement that gives a list of those Widgets that are suppled from Factory1 and Factory3

    So for Factory 1 and Factory 3

    Product,
    Smith Widget,
    Claire Widget

    B) What would be amazing is

    Name, Duplicates
    “Factory 1 & Factory 2”, 2 Dups
    “Factory 2 & Factory 3”, 2 Dups
    “Factory 1 & Factory 3”, 2 Dups
    “Factory 1 & Factory 4”, 1 Dups
    “Factory 2 & Factory 4”, 0 Dups
    “Factory 3 & Factory 4”, 1 Dups

    Reply
  • This solution would be fine, except this is part of a weekly SQL job and it resets all duplicates each week and therefore places a duplicate order for that first record each week by changing the fulfilled date to the current week. I want to only update the first record when there isn’t one with a Fulfilled date already in the duplicate set. I’m not sure if this makes sense.

    Reply
  • jacksonemmerich
    October 19, 2012 5:04 am

    thank you. !!!

    Reply
  • Grazie

    Reply
  • I want to find duplicates record on the basis of fields FirstName, MiddleName, LastName from my table. I have 40 Lakhs of records in table. So any optimise query to show result in less time in SQL Server 2008 and system Specification for Such large amount of data for searching purpose.

    Reply
  • Hi I am having a small requirement i.e I am having an Employee and Address tables where Employee table holds ID,Name and Address holds ID(foreign key) and Address. My data is as follows

    Employee Address

    1 Dorababu 1 XYZ
    2 Vivekh 1 XYZ
    3 Kaashyap 2 ABC

    I would like to display all the employees who is having same address, I tried using joins but I am getting duplicates can some one help me

    Reply
    • Hi – I’m not sure if you are still trying to accomplish this. However, you are getting duplicates because Dorababu joins to two rows in Address. This is because his primary key on Employee links to both the first and second row of Address (“1”).

      Reply
    • If that is a typo on row 2 of Employee then this will work (if it is not a typo please give more specific instructions regarding the sample data and expected output):

      IF OBJECT_ID(‘tempdb..#Employee’) IS NOT NULL DROP TABLE #Employee
      IF OBJECT_ID(‘tempdb..#Address’) IS NOT NULL DROP TABLE #Address

      CREATE TABLE #Employee
      (
      ID VARCHAR(25) NOT NULL,
      Name VARCHAR(25) NOT NULL
      )

      INSERT INTO #Employee(ID, Name) VALUES
      (‘1’, ‘Dorababu’),
      (‘2’, ‘Vivekh’),
      (‘3’, ‘Kaashyap’)

      CREATE TABLE #Address
      (
      F_ID VARCHAR(25) NOT NULL,
      Address VARCHAR(50) NOT NULL
      )

      INSERT INTO #Address(F_ID, Address) VALUES
      (‘1’, ‘XYZ’),
      (‘2’, ‘XYZ’),
      (‘3’, ‘ABC’)

      –The actual query starts here
      SELECT e2.Name, s.Address
      FROM
      (
      SELECT COUNT(*) AS Dups, a.Address
      FROM #Employee e JOIN #Address a ON e.ID = a.F_ID
      GROUP BY a.Address
      HAVING COUNT(*) > 1
      )s JOIN #Address a2 ON s.Address = a2.Address
      JOIN #Employee e2 ON a2.F_ID = e2.ID
      –The query ends here

      DROP TABLE #Address
      DROP TABLE #Employee

      Reply
    • I decided to try a different method. This one achieves the same results with less joins:

      SELECT e.Name, a.Address
      FROM #Address a JOIN #Employee e ON a.F_ID = e.ID
      WHERE a.Address IN
      (SELECT a2.Address FROM #Address a2 GROUP BY a2.Address HAVING COUNT(a2.Address) > 1)

      Reply
  • Thank you Pinal Dave

    Reply

Leave a Reply