Finding Duplicate WordPress Post Meta

Have you ever accidentally saved the same post meta to a post in WordPress, and ended up with a duplicate?

wp> get_post_meta( 919, '_totally_duped' );
=> array(3) {
  [0]=>
  string(1) "1"
  [1]=>
  string(1) "1"
  [2]=>
  string(1) "1"
}

Of course, you could go through each post and check the meta, check for duplicates, and delete any of them but…

aint nobody got time for that GIF

I think I’ve figured out a beast of an SQL query that will give you the Meta IDs of any duplicate post meta so that you can delete them with delete_metadata_by_mid().

SELECT * FROM (SELECT meta_id FROM (SELECT meta_table_1.* FROM wp_postmeta meta_table_1, wp_postmeta meta_table_2 WHERE ( meta_table_1.post_id = meta_table_2.post_id AND meta_table_1.meta_value = meta_table_2.meta_value AND meta_table_1.meta_key = meta_table_2.meta_key ) ORDER BY meta_table_2.post_id) Table_All_Duplicates GROUP BY meta_id HAVING COUNT(*)>1) Unique_Dupes WHERE meta_id NOT IN (SELECT min(meta_id) AS min_meta_id FROM wp_postmeta GROUP BY post_id,meta_key,meta_value HAVING COUNT(post_id) > 1 AND COUNT(meta_key) > 1 AND COUNT(meta_value) > 1 AND COUNT(meta_key >= 1));

An example:

mysql> SELECT * FROM wp_postmeta WHERE post_id = 919 AND meta_key = '_totally_duped';
+---------+---------+----------------+------------+
| meta_id | post_id | meta_key       | meta_value |
+---------+---------+----------------+------------+
|    2123 |     919 | _totally_duped | 1          |
|    2124 |     919 | _totally_duped | 1          |
|    2125 |     919 | _totally_duped | 1          |
+---------+---------+----------------+------------+
3 rows in set (0.01 sec)

mysql> SELECT * FROM (SELECT meta_id FROM (SELECT meta_table_1.* FROM wp_postmeta meta_table_1, wp_postmeta meta_table_2 WHERE ( meta_table_1.post_id = meta_table_2.post_id AND meta_table_1.meta_value = meta_table_2.meta_value AND meta_table_1.meta_key = meta_table_2.meta_key ) ORDER BY meta_table_2.post_id) Table_All_Duplicates GROUP BY meta_id HAVING COUNT(*)>1) Unique_Dupes WHERE meta_id NOT IN (SELECT min(meta_id) AS min_meta_id FROM wp_postmeta GROUP BY post_id,meta_key,meta_value HAVING COUNT(post_id) > 1 AND COUNT(meta_key) > 1 AND COUNT(meta_value) > 1 AND COUNT(meta_key >= 1));
+---------+
| meta_id |
+---------+
|    2124 |
|    2125 |
+---------+
2 rows in set (0.03 sec)

I just want to note though, this is VERY slow. I did this because I’m currently working on trying to clean up a site that has about 24 million rows of postmeta, and I really hope that there’s some duplicates to delete. This one query has been running for about 12 hours now, and I have no idea when it will end.

So I guess I do have time for that after all ¯\_(ツ)_/¯

UPDATE:

I’ve discovered that it might be good to have more data for forensic investigation… so I’ve adjusted the query to also give me the post id, meta key, meta value length, and the original meta id:

mysql> SELECT post_id,group_concat(distinct meta_id) as meta_ids,min( distinct meta_id) as primary_meta_id,meta_key,char_length(meta_value) as meta_length,count( distinct meta_id ) * char_length(meta_value) as total_size FROM ( SELECT meta_table_1.* FROM wp_postmeta meta_table_1, wp_postmeta meta_table_2 WHERE (
    -> meta_table_1.post_id = meta_table_2.post_id AND meta_table_1.meta_value = meta_table_2.meta_value AND meta_table_1.meta_key = meta_table_2.meta_key ) ORDER BY meta_table_2.post_id ) Table_All_Duplicates GROUP BY post_id,meta_key HAVING COUNT(*)>1;
+---------+----------------+-----------------+------------------+-------------+------------+
| post_id | meta_ids       | primary_meta_id | meta_key         | meta_length | total_size |
+---------+----------------+-----------------+------------------+-------------+------------+
|     915 | 2176,2177,2178 |            2176 | _totally_duped_2 |           1 |          3 |
|     919 | 2123,2124,2125 |            2123 | _totally_duped   |           1 |          3 |
+---------+----------------+-----------------+------------------+-------------+------------+
2 rows in set (0.03 sec)

Leave a Reply