INF: Removing Duplicate Rows from an Existing Table in SQL (70956)



The information in this article applies to:

  • Microsoft SQL Server 4.2x
  • Microsoft SQL Server 6.0
  • Microsoft SQL Server 6.5
  • Microsoft SQL Server 7.0
  • Microsoft SQL Server 7.0 Service Pack 1

This article was previously published under Q70956

SUMMARY

The following script can be used to remove duplicate rows from a SQL Server table:
   SELECT DISTINCT *
      INTO duplicate_table
      FROM original_table
      GROUP BY key_value
      HAVING COUNT(key_value) > 1

   DELETE original_table
      WHERE key_value
      IN (SELECT key_value
             FROM duplicate_table)

   INSERT original_table
      SELECT *
         FROM duplicate_table

     DROP TABLE duplicate_table
				

When executed, this script moves one instance of any duplicate row in the original table to a duplicate table. It then deletes all rows from the original table that also reside in the duplicate table. Next, the rows in the duplicate table are moved back into the original table. Finally, the duplicate table is dropped.

MORE INFORMATION

While this method is simple, it does require that you have enough space available in your database to temporarily build the duplicate table.

Modification Type:MinorLast Reviewed:2/14/2005
Keywords:kbProgramming KB70956