I recently worked on migrating a site to a different server and, for one reason or another, some of the images did not come over properly. While I could have just re-downloaded and re-imported all of the media, that would have taken quite a while since the media library was well over 100 GB. Instead, I opted to use WP-CLI to help find which images were missing:
/**
 * Iterate over attachments and check to see if they actually exist.
 *
 * @subcommand validate-attachments
 * @synopsis --output=<csv-filename> [--log-found]
 */
public function validate_attachments( $args, $assoc_args ) {
    $attachment_count = array_sum( (array) wp_count_posts( 'attachment' ) );

    // --log-found is a flag, so it arrives in $assoc_args rather than $args.
    $log_found = \WP_CLI\Utils\get_flag_value( $assoc_args, 'log-found', false );

    $output_file    = $assoc_args['output'];
    $posts_per_page = 500;
    $paged          = 1;
    $output         = array();

    $file_descriptor = fopen( $output_file, 'w' );
    if ( false === $file_descriptor ) {
        \WP_CLI::error( 'Could not open ' . $output_file . ' for writing.' );
    }

    $progress = \WP_CLI\Utils\make_progress_bar( 'Checking ' . number_format( $attachment_count ) . ' attachments', $attachment_count );

    do {
        $attachments = get_posts( array(
            'post_type'      => 'attachment',
            'posts_per_page' => $posts_per_page,
            'paged'          => $paged,
        ) );

        foreach ( $attachments as $attachment ) {
            $url     = $attachment->guid;
            $request = wp_remote_head( $url );

            if ( is_wp_error( $request ) ) {
                // The request itself failed (DNS, timeout, etc.), so there is no status code.
                $output[] = array( $url, 'error', $request->get_error_message() );
            } else {
                $code = (int) wp_remote_retrieve_response_code( $request );

                // Always log failures; log successes only when --log-found is set.
                if ( 200 !== $code || $log_found ) {
                    $output[] = array( $url, $code, wp_remote_retrieve_response_message( $request ) );
                }
            }

            $progress->tick();
        }

        // Pause between batches to go easy on the server.
        sleep( 1 );
        $paged++;
    } while ( count( $attachments ) );

    $progress->finish();
    \WP_CLI\Utils\write_csv( $file_descriptor, $output );
    fclose( $file_descriptor );
}
The benefit of this approach is that I can take the CSV, grab the URLs out of it, replace the domain name, and wget just what I need.
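The URL-rewriting step could be sketched roughly like this. It assumes the first CSV column holds the URL (as in the command above) and that the files are still reachable on the old server under the same paths. The helper names `swap_host()` and `urls_from_csv()`, and the hostname `old.example.com` used below, are made up for illustration and are not part of WordPress or WP-CLI.

```php
<?php
/**
 * Swap the host of a URL, keeping the scheme, path and query string.
 * Hypothetical helper; any port or fragment in the original URL is dropped.
 */
function swap_host( string $url, string $new_host ): string {
    $parts = parse_url( $url );
    if ( false === $parts || empty( $parts['host'] ) ) {
        return $url; // Leave anything unparseable untouched.
    }
    $scheme = isset( $parts['scheme'] ) ? $parts['scheme'] . '://' : '//';
    $path   = $parts['path'] ?? '';
    $query  = isset( $parts['query'] ) ? '?' . $parts['query'] : '';

    return $scheme . $new_host . $path . $query;
}

/**
 * Read a CSV whose first column is a URL and return those URLs
 * rewritten to point at $new_host. Hypothetical helper.
 */
function urls_from_csv( string $csv_file, string $new_host ): array {
    $urls   = array();
    $handle = fopen( $csv_file, 'r' );
    if ( false === $handle ) {
        return $urls;
    }
    while ( false !== ( $row = fgetcsv( $handle, 0, ',', '"', '\\' ) ) ) {
        if ( empty( $row[0] ) ) {
            continue; // Skip blank lines.
        }
        $urls[] = swap_host( $row[0], $new_host );
    }
    fclose( $handle );

    return $urls;
}
```

Writing the result to a file with something like `file_put_contents( 'urls.txt', implode( PHP_EOL, urls_from_csv( 'missing.csv', 'old.example.com' ) ) . PHP_EOL );` gives wget a list it can read with `wget -x -nH -i urls.txt`. Run from the site root, that should recreate the directory structure from each URL's path.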
It was also the first time I’ve used WP-CLI’s write_csv() function, which gave me a short pause since it’s not very well documented.
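As far as I can tell from the WP-CLI source, write_csv() takes an open file handle, an array of rows, and an optional array of headers, and writes each row with PHP's fputcsv(). The stand-in below, `write_csv_sketch()`, is my own approximation for illustration, not the real implementation. It skips the headers argument entirely and only shows the shape of the data the command above passes in.

```php
<?php
/**
 * Rough stand-in for what write_csv() appears to do without headers:
 * write each row of a list of arrays to an open file handle as one CSV line.
 * Hypothetical function, not part of WP-CLI.
 */
function write_csv_sketch( $fd, array $rows ): void {
    foreach ( $rows as $row ) {
        fputcsv( $fd, array_values( $row ), ',', '"', '\\' );
    }
}

/**
 * Convenience wrapper so the result can be inspected as a string.
 */
function rows_to_csv_string( array $rows ): string {
    $fd = fopen( 'php://memory', 'w+' );
    write_csv_sketch( $fd, $rows );
    rewind( $fd );
    $csv = stream_get_contents( $fd );
    fclose( $fd );

    return $csv;
}

// Each row mirrors what the command collects: URL, status code, status message.
echo rows_to_csv_string( array(
    array( 'https://example.com/a.jpg', 404, 'Not Found' ),
) );
```

The point is that each row is just a flat array of values, and the file handle has to be opened (and later closed) by the caller.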