Thursday, September 22, 2011

Delete Duplicate rows from Oracle table

Duplicate data is not a rare incident - infact it's very easy to end up with duplicate data when you do daily database loads and lack constraint keys. This article shows you how to remove these duplicates.

The solution relies on the following two principles:
  1. Identical rows with different rowid's
  2. joining the table by itself
The SQL code to remove duplicates from any table is the following:

delete from
    T1 A
WHERE
    a.rowid >
ANY (
     SELECT
        B.rowid
     FROM
        T1 B
     WHERE
        A.c1 = B.c1 and
        A.c2 = B.c2 and
        A.c3 = B.c3
        )

So how does the script work? The columns c1-c3 identify the composite keys of the row which will make it a unique entry in the table. Thus the subquery will join the table with itself and returns identical rows with a different rowid from the parent query. The delete statement will remove the row with the larger rowid, i.e. the duplicate.
Post a Comment