Sunday, April 22, 2012

Delete Duplicates fast in MySQL

In an earlier article I explained how to delete duplicate rows in Oracle using the rowid pseudocolumn. In MySQL there is no pseudocolumn, and in order to use the rowid equivalent one has to implement session variables which can get a bit messy and slower when you have large tables.

As such one has to use some imagination and an old technique to remove duplicates. The process is as follows:

  1. Create a similar table to the original one which contains the duplicate values.
  2. Insert distinct values in the new table.
  3. Drop the old table.
  4. Rename the new table as the old one.
Pretty clear, no? In SQL we can implement the first two steps using the CREATE TABLE ... SELECT syntax. Therefore to remove duplicates from t1, you can execute the following code:
CREATE TABLE t2 
SELECT DISTINCT name, surname, country_id FROM t1;
DROP TABLE t1;
ALTER TABLE t2 RENAME t1;
Post a Comment