AdvancedintermediateCommonly used

How to delete duplicate rows in SQL

Find and remove duplicate records while keeping one copy.

1. Find duplicates firstAll databases
SELECT email, COUNT(*) AS occurrences
FROM users
GROUP BY email
HAVING COUNT(*) > 1;

Always find duplicates before deleting. Understand what you are removing.

Delete duplicates — keep lowest id (PostgreSQL)PostgreSQL
DELETE FROM users
WHERE id NOT IN (
  SELECT MIN(id)
  FROM users
  GROUP BY email
);

Keeps the row with the lowest id for each email. Deletes all others.

Delete duplicates using CTE (PostgreSQL)PostgreSQL
WITH ranked AS (
  SELECT
    id,
    ROW_NUMBER() OVER (
      PARTITION BY email
      ORDER BY id ASC
    ) AS rn
  FROM users
)
DELETE FROM users
WHERE id IN (
  SELECT id FROM ranked WHERE rn > 1
);

Assigns row numbers within each email group. Deletes all rows ranked 2 or higher.

When to use: Cleaner approach for complex deduplication logic.

Preview before deletingAll databases
WITH ranked AS (
  SELECT
    id, email, name,
    ROW_NUMBER() OVER (
      PARTITION BY email ORDER BY id
    ) AS rn
  FROM users
)
SELECT * FROM ranked WHERE rn > 1;
-- Replace SELECT * with DELETE when ready

Test with SELECT first. Replace SELECT * FROM ranked WHERE rn > 1 with the DELETE statement above once confirmed.

When to use: Always preview what you will delete before running DELETE.

Common Mistakes

  • Always SELECT first to preview what will be deleted
  • Wrap in a transaction so you can ROLLBACK if something goes wrong
← Back to SQL Reference