I have been playing with WordPress a short amount of time, though the first thing that’s getting to me is the mass of revisions that are stored in the DB. In most respects it seems to do a great job at garbage collecting, though I noticed that the wp_term_relationships table does not quite keep as tidy; tags are seemingly not preserved, though categories for revisions most certainly are. Ergo, after some delving into the DB and looking briefly over the codebase, I came up with this monster query:
SELECT r.* FROM wp_posts p
LEFT JOIN wp_term_relationships r ON r.object_id=p.id
LEFT JOIN wp_term_taxonomy t ON t.term_taxonomy_id=r.term_taxonomy_id
LEFT JOIN wp_terms ON t.term_id=wp_terms.term_id
WHERE post_type='revision' AND taxonomy IN ('category','post_tag');
Essentially, I’m just finding the rows in wp_term_relationships which are tied to revisions. I then can write a small bit of PHP which will go through this data and run SQL DELETE FROM statements for each of the found rows. You can potentially do this entirely in SQL, though you’ll need a temporary table and a series of statements due to the fact that in MySQL you cannot run a DELETE FROM with a WHERE referencing the table from which you are deleting. Of course, once these processes are finished, you then can easily enough run the following simple query: DELETE FROM wp_posts WHERE post_type='revision';