MySQL Reminder - A Delete Duplicates That Works Fast

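Note to self: deleting dupes from a table with a JSON column, keeping only the most recently updated row for each data->'$.text' value.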

This takes forever:

SET SQL_SAFE_UPDATES=0;
DELETE a
FROM
    tbl1.tbl_data AS a,
    tbl1.tbl_data AS b
WHERE
    json_unquote(a.data->'$.text') = json_unquote(b.data->'$.text')
AND
    a.updated < b.updated;
SET SQL_SAFE_UPDATES=1;



This gem works well:

SET SQL_SAFE_UPDATES=0;
DELETE tbl_data FROM tbl_data
INNER JOIN (
    -- newest timestamp for each text value that occurs more than once
    SELECT MAX(updated) AS lastupdated, data->'$.text' AS text
    FROM tbl_data
    GROUP BY text
    HAVING COUNT(*) > 1
) duplic ON duplic.text = data->'$.text'
-- delete every row older than the newest one in its group
WHERE updated < duplic.lastupdated;
SET SQL_SAFE_UPDATES=1;
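
Before running the delete for real, it can help to preview what it would remove. The following is only a sketch: it reuses the same join as the DELETE above but as a SELECT, so nothing is changed. The alias t and the output column name text_value are my own additions for readability and are not part of the original query.

```sql
-- Dry run: list the rows the DELETE would remove, without deleting anything.
-- "t" and "text_value" are illustrative names, not from the original query.
SELECT
    t.updated,
    t.data->>'$.text' AS text_value,   -- ->> returns the unquoted text
    duplic.lastupdated                 -- timestamp of the row that would be kept
FROM tbl_data AS t
INNER JOIN (
    SELECT MAX(updated) AS lastupdated, data->'$.text' AS text
    FROM tbl_data
    GROUP BY text
    HAVING COUNT(*) > 1
) AS duplic ON duplic.text = t.data->'$.text'
WHERE t.updated < duplic.lastupdated;
```

One thing to keep in mind: because the comparison is a strict "<", rows that share the same newest updated value within a group are all kept, so exact-timestamp ties are not removed by either query.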
