Tag: csv

  • Bash Script: Calculate before/after averages

    Bash Script: Calculate before/after averages

    I’ve been doing some performance testing, and wanted a quick way to test how well or poorly changes affect a site. Normally I’d whip out the ol’ calculator app and do this manually. That got tiring after a while, so instead with the help of ChatGPT, I made this little bash script that will do the work for you:

    #!/bin/bash
    
    # Function to calculate the average of a list of numbers
    average() {
    	local sum=0
    	local count=0
    
    	for num in "$@"; do
    		sum=$(echo "$sum + $num" | bc -l)
    		count=$((count+1))
    	done
    
    	echo "$sum / $count" | bc -l
    }
    
    # Parse arguments
    for i in "$@"; do
    	case $i in
    		--before=*)
    		BEFORE="${i#*=}"
    		shift
    		;;
    		--after=*)
    		AFTER="${i#*=}"
    		shift
    		;;
    		*)
    		# unknown option
    		;;
    	esac
    done
    
    # Check if both BEFORE and AFTER parameters are provided
    if [ -z "$BEFORE" ] || [ -z "$AFTER" ]; then
    	echo "Error: Missing required parameters."
    	echo "Usage: $0 --before=<comma-separated-values> --after=<comma-separated-values>"
    	exit 1
    fi
    
    IFS=',' read -ra BEFORE_LIST <<< "$BEFORE"
    IFS=',' read -ra AFTER_LIST <<< "$AFTER"
    
    # Calculate average for before and after lists
    BEFORE_AVG=$(printf "%.2f\n" $(average "${BEFORE_LIST[@]}"))
    AFTER_AVG=$(printf "%.2f\n" $(average "${AFTER_LIST[@]}"))
    
    echo "Before average: $BEFORE_AVG"
    echo "After average: $AFTER_AVG"
    
    # Calculate average percent increase, decrease or no change for the list
    if [ "$BEFORE_AVG" != "0.00" ]; then
    	PERCENT_CHANGE=$(echo "(($AFTER_AVG - $BEFORE_AVG) / $BEFORE_AVG) * 100" | bc -l)
    	if [ "$(echo "$PERCENT_CHANGE > 0" | bc -l)" -eq 1 ]; then
    		printf "Average percent increased: %.2f%%\n" "$PERCENT_CHANGE"
    	elif [ "$(echo "$PERCENT_CHANGE < 0" | bc -l)" -eq 1 ]; then
    		printf "Average percent decreased: %.2f%%\n" "$PERCENT_CHANGE" | tr -d '-'
    	else
    		echo "No change in average."
    	fi
    else
    	echo "Percent change from before to after: undefined (division by zero)"
    fi
    Code language: Bash (bash)

    It runs like this:

    $ average.sh --before=13.07,9.75,16.14,7.71,10.32 --after=1.22,1.28,1.13,1.19,1.26
    Before average: 11.40
    After average: 1.22
    Average percent decreased: 89.30%
    

    In this instance, it was calculating seconds–but you need to remember that it only goes to two decimal places, so if you need something finer you’ll need to adjust the code or your inputs.

    Happy Slacking!

  • Quick Tip: Export WordPress SQL output via WP-CLI

    Quick Tip: Export WordPress SQL output via WP-CLI

    If for some reason you can’t run wp db query, but need to export SQL output to a CSV or other file, then have a look at this small WP-CLI command I whipped up that should allow this:

    /**
     * Runs a SQL query against the site database.
     *
     * ## OPTIONS
     *
     * 
     * : SQL Query to run.
     *
     * [--format=]
     * : Render output in a particular format.
     * ---
     * default: table
     * options:
     * - table
     * - csv
     * - json
     * - count
     * - yaml
     * ---
     *
     * [--dry-run=]
     * : Performa a dry run
     *
     * @subcommand sql
     */
    public function sql( $args, $assoc_args ) {
         global $wpdb;
     
         $sql     = $args[0];
         $format  = WP_CLI\Utils\get_flag_value( $assoc_args, 'format', 'table' );
         $dry_run = WP_CLI\Utils\get_flag_value( $assoc_args, 'dry-run', 'true' );
     
         // Just some precautions.
         if ( preg_match( '/[update|delete|drop|insert|create|alter|rename|truncate|replace]/i', $sql ) ) {
             WP_CLI::error( 'Please do not modify the database with this command.' );
         }
     
         if ( 'false' !== $dry_run ) {
             WP_CLI::log( WP_CLI::colorize( '%gDRY-RUN%n: <code>EXPLAIN</code> of the query is below: https://mariadb.com/kb/en/explain/' ) );
             $sql = 'EXPLAIN ' . $sql;
         }
     
         // Fetch results from database.
         $results = $wpdb->get_results( $sql, ARRAY_A ); // phpcs:ignore WordPress.DB
     
         // Output data.
         WP_CLI\Utils\format_items( $format, $results, array_keys( $results[0] ) );
    }Code language: PHP (php)

    I’d add an example here, but I don’t have any right now that I can share 😐 I’ll try to find one later (don’t hold your breath on me remembering to do that)

  • Converting CSV to SQL

    Converting CSV to SQL

    I was recently working on an issue that required me to dig through gigabytes of CSV files to look for patterns and data points. I had no intention of bringing all of this data in to Excel, because there’s not enough RAM on earth for that to work. Instead I thought it would be easier to dump the data into MariaDB (MySQL would be fine as well) and just write some simple SQL statements.

    There were a few things I tried, such as LOAD DATA INFILE and other SQL type things. Nothing really worked properly, so instead I decided to just convert the CSV files into SQL statements I could import.

    This is a bad idea! Don’t ever let this touch a live production database!

    I wrote this little PHP script to convert all of the CSV files in a directory to SQL:

    <?php
    $files    = glob( './*.{csv}', GLOB_BRACE );
    $db_table = 'csv_table';
    
    foreach ( $files as $file ) {
    	$row       = 0;
    	$handle    = fopen( $file, 'rb' );
    	$row_names = '';
    
    	if ( false !== $handle ) {
    		while ( false !== ( $data = fgetcsv( $handle ) ) ) {
    			$row++;
    			$values = '';
    
    			if ( 1 === $row ) {
    				// Do Header things.
    				$index  = 0;
    
    				echo sprintf( 'DROP TABLE IF EXISTS `%s`;' . PHP_EOL, $db_table );
    				echo sprintf( 'CREATE TABLE `%s` (' . PHP_EOL, $db_table );
    
    				foreach ( $data as $key => $value ) {
    					$index++;
    					$data[ $key ] = strtolower( preg_replace( '/[^A-Za-z0-9]/', '_', $value ) );
    
    					// Add a comma if we're not at the end of the array.
    					if ( count( $data ) !== $index ) {
    						echo sprintf( '	`%s` text NOT NULL,' . PHP_EOL, $data[ $key ] );
    					} else {
    						echo sprintf( '	`%s` text NOT NULL' . PHP_EOL, $data[ $key ] );
    					}
    				}
    
    				echo ') ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;' . PHP_EOL;
    				$row_names = '(`' . implode( '`,`', $data ) . '`)';
    				continue;
    			}
    
    			foreach ( $data as $key => $value ) {
    				$data[ $key ] = fake_sql_escape( $value );
    			}
    
    			$values = '(\'' . implode( '\',\'', $data ) . '\')';
    
    			echo sprintf( 'INSERT INTO `%s` %s VALUES %s;' . PHP_EOL, $db_table, $row_names, $values );
    		}
    		fclose( $handle );
    	}
    }
    
    function fake_sql_escape( $value ) {
    	// https://stackoverflow.com/questions/1162491/alternative-to-mysql-real-escape-string-without-connecting-to-db/1163091#1163091
    	// Totaly not safe for production!
    	$return = '';
    	$length = strlen( $value );
    
    	for ( $i = 0; $i < $length; ++$i ) {
    		$char = $value[ $i ];
    		$ord  = ord( $char );
    		if (
    			"'" !== $char &&
    			'"' !== $char &&
    			'\\' !== $char &&
    			$ord >= 32 &&
    			$ord <= 126
    		) {
    			$return .= $char;
    		} else {
    			$return .= '\\x' . dechex( $ord );
    		}
    	}
    	return $return;
    }Code language: PHP (php)

    Borrowing the example airtravel.csv from https://people.sc.fsu.edu/~jburkardt/data/csv/csv.html this is what the output looks like:

    DROP TABLE IF EXISTS `csv_table`;
    CREATE TABLE `csv_table` (
    	`month` text NOT NULL,
    	`1958` text NOT NULL,
    	`1959` text NOT NULL,
    	`1960` text NOT NULL
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
    INSERT INTO `csv_table` (`month`,`1958`,`1959`,`1960`,`type`) VALUES ('JAN','  340','  360','  417');
    INSERT INTO `csv_table` (`month`,`1958`,`1959`,`1960`,`type`) VALUES ('FEB','  318','  342','  391');
    INSERT INTO `csv_table` (`month`,`1958`,`1959`,`1960`,`type`) VALUES ('MAR','  362','  406','  419');
    INSERT INTO `csv_table` (`month`,`1958`,`1959`,`1960`,`type`) VALUES ('APR','  348','  396','  461');
    INSERT INTO `csv_table` (`month`,`1958`,`1959`,`1960`,`type`) VALUES ('MAY','  363','  420','  472');
    INSERT INTO `csv_table` (`month`,`1958`,`1959`,`1960`,`type`) VALUES ('JUN','  435','  472','  535');
    INSERT INTO `csv_table` (`month`,`1958`,`1959`,`1960`,`type`) VALUES ('JUL','  491','  548','  622');
    INSERT INTO `csv_table` (`month`,`1958`,`1959`,`1960`,`type`) VALUES ('AUG','  505','  559','  606');
    INSERT INTO `csv_table` (`month`,`1958`,`1959`,`1960`,`type`) VALUES ('SEP','  404','  463','  508');
    INSERT INTO `csv_table` (`month`,`1958`,`1959`,`1960`,`type`) VALUES ('OCT','  359','  407','  461');
    INSERT INTO `csv_table` (`month`,`1958`,`1959`,`1960`,`type`) VALUES ('NOV','  310','  362','  390');
    INSERT INTO `csv_table` (`month`,`1958`,`1959`,`1960`,`type`) VALUES ('DEC','  337','  405','  432');Code language: JavaScript (javascript)

    And if you’re too lazy to see what the input CSV looks like (I would be) here it is, in its LGPL glory:

    "Month", "1958", "1959", "1960"
    "JAN",  340,  360,  417
    "FEB",  318,  342,  391
    "MAR",  362,  406,  419
    "APR",  348,  396,  461
    "MAY",  363,  420,  472
    "JUN",  435,  472,  535
    "JUL",  491,  548,  622
    "AUG",  505,  559,  606
    "SEP",  404,  463,  508
    "OCT",  359,  407,  461
    "NOV",  310,  362,  390
    "DEC",  337,  405,  432Code language: JavaScript (javascript)

    There’s a lot of ways that this could be made better, but it works in a quick pinch. If you need something similar, feel free to take this as a starting point.

  • Validating WordPress attachments with WP-CLI

    Validating WordPress attachments with WP-CLI

    I recently worked on migrating a site to a different server and for one reason or another, some of the images did not come over properly. While I could have just re-downloaded and re-imported all of the media, it would have taken quite a while since the media library was well over 100Gb. Instead, I opted to use WP-CLI to help find what images were missing:

    (more…)