Tag: automation

  • Converting CSV to SQL

    Converting CSV to SQL

    I was recently working on an issue that required me to dig through gigabytes of CSV files to look for patterns and data points. I had no intention of bringing all of this data in to Excel, because there’s not enough RAM on earth for that to work. Instead I thought it would be easier to dump the data into MariaDB (MySQL would be fine as well) and just write some simple SQL statements.

    There were a few things I tried, such as LOAD DATA INFILE and other SQL type things. Nothing really worked properly, so instead I decided to just convert the CSV files into SQL statements I could import.

    This is a bad idea! Don’t ever let this touch a live production database!

    I wrote this little PHP script to convert all of the CSV files in a directory to SQL:

    <?php
    $files    = glob( './*.{csv}', GLOB_BRACE );
    $db_table = 'csv_table';
    
    foreach ( $files as $file ) {
    	$row       = 0;
    	$handle    = fopen( $file, 'rb' );
    	$row_names = '';
    
    	if ( false !== $handle ) {
    		while ( false !== ( $data = fgetcsv( $handle ) ) ) {
    			$row++;
    			$values = '';
    
    			if ( 1 === $row ) {
    				// Do Header things.
    				$index  = 0;
    
    				echo sprintf( 'DROP TABLE IF EXISTS `%s`;' . PHP_EOL, $db_table );
    				echo sprintf( 'CREATE TABLE `%s` (' . PHP_EOL, $db_table );
    
    				foreach ( $data as $key => $value ) {
    					$index++;
    					$data[ $key ] = strtolower( preg_replace( '/[^A-Za-z0-9]/', '_', $value ) );
    
    					// Add a comma if we're not at the end of the array.
    					if ( count( $data ) !== $index ) {
    						echo sprintf( '	`%s` text NOT NULL,' . PHP_EOL, $data[ $key ] );
    					} else {
    						echo sprintf( '	`%s` text NOT NULL' . PHP_EOL, $data[ $key ] );
    					}
    				}
    
    				echo ') ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;' . PHP_EOL;
    				$row_names = '(`' . implode( '`,`', $data ) . '`)';
    				continue;
    			}
    
    			foreach ( $data as $key => $value ) {
    				$data[ $key ] = fake_sql_escape( $value );
    			}
    
    			$values = '(\'' . implode( '\',\'', $data ) . '\')';
    
    			echo sprintf( 'INSERT INTO `%s` %s VALUES %s;' . PHP_EOL, $db_table, $row_names, $values );
    		}
    		fclose( $handle );
    	}
    }
    
    function fake_sql_escape( $value ) {
    	// https://stackoverflow.com/questions/1162491/alternative-to-mysql-real-escape-string-without-connecting-to-db/1163091#1163091
    	// Totaly not safe for production!
    	$return = '';
    	$length = strlen( $value );
    
    	for ( $i = 0; $i < $length; ++$i ) {
    		$char = $value[ $i ];
    		$ord  = ord( $char );
    		if (
    			"'" !== $char &&
    			'"' !== $char &&
    			'\\' !== $char &&
    			$ord >= 32 &&
    			$ord <= 126
    		) {
    			$return .= $char;
    		} else {
    			$return .= '\\x' . dechex( $ord );
    		}
    	}
    	return $return;
    }Code language: PHP (php)

    Borrowing the example airtravel.csv from https://people.sc.fsu.edu/~jburkardt/data/csv/csv.html this is what the output looks like:

    DROP TABLE IF EXISTS `csv_table`;
    CREATE TABLE `csv_table` (
    	`month` text NOT NULL,
    	`1958` text NOT NULL,
    	`1959` text NOT NULL,
    	`1960` text NOT NULL
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
    INSERT INTO `csv_table` (`month`,`1958`,`1959`,`1960`,`type`) VALUES ('JAN','  340','  360','  417');
    INSERT INTO `csv_table` (`month`,`1958`,`1959`,`1960`,`type`) VALUES ('FEB','  318','  342','  391');
    INSERT INTO `csv_table` (`month`,`1958`,`1959`,`1960`,`type`) VALUES ('MAR','  362','  406','  419');
    INSERT INTO `csv_table` (`month`,`1958`,`1959`,`1960`,`type`) VALUES ('APR','  348','  396','  461');
    INSERT INTO `csv_table` (`month`,`1958`,`1959`,`1960`,`type`) VALUES ('MAY','  363','  420','  472');
    INSERT INTO `csv_table` (`month`,`1958`,`1959`,`1960`,`type`) VALUES ('JUN','  435','  472','  535');
    INSERT INTO `csv_table` (`month`,`1958`,`1959`,`1960`,`type`) VALUES ('JUL','  491','  548','  622');
    INSERT INTO `csv_table` (`month`,`1958`,`1959`,`1960`,`type`) VALUES ('AUG','  505','  559','  606');
    INSERT INTO `csv_table` (`month`,`1958`,`1959`,`1960`,`type`) VALUES ('SEP','  404','  463','  508');
    INSERT INTO `csv_table` (`month`,`1958`,`1959`,`1960`,`type`) VALUES ('OCT','  359','  407','  461');
    INSERT INTO `csv_table` (`month`,`1958`,`1959`,`1960`,`type`) VALUES ('NOV','  310','  362','  390');
    INSERT INTO `csv_table` (`month`,`1958`,`1959`,`1960`,`type`) VALUES ('DEC','  337','  405','  432');Code language: JavaScript (javascript)

    And if you’re too lazy to see what the input CSV looks like (I would be) here it is, in its LGPL glory:

    "Month", "1958", "1959", "1960"
    "JAN",  340,  360,  417
    "FEB",  318,  342,  391
    "MAR",  362,  406,  419
    "APR",  348,  396,  461
    "MAY",  363,  420,  472
    "JUN",  435,  472,  535
    "JUL",  491,  548,  622
    "AUG",  505,  559,  606
    "SEP",  404,  463,  508
    "OCT",  359,  407,  461
    "NOV",  310,  362,  390
    "DEC",  337,  405,  432Code language: JavaScript (javascript)

    There’s a lot of ways that this could be made better, but it works in a quick pinch. If you need something similar, feel free to take this as a starting point.

  • Disabling WordPress Faux Cron

    Disabling WordPress Faux Cron

    The WordPress WP-Cron system is a decently okay faux cron system, but it has its problems, such as running on frontend requests and not running if no requests are coming through.

    WP-Cron works by: on every page load, a list of scheduled tasks is checked to see what needs to be run. Any tasks scheduled to be run will be run during that page load. WP-Cron does not run constantly as the system cron does; it is only triggered on page load. Scheduling errors could occur if you schedule a task for 2:00PM and no page loads occur until 5:00PM.

    From the WordPress Plugin Handbook

    These are problems because:

    • A heavy cron event can cause severe slowdown on random frontend requests, hurting page speeds.
    • Not running without requests can be bad for sites that are infrequently updated and heavily cached.

    The solution to this is to disable the built-in cron firing that’s done with pageviews, and use a system cron (or other service) to poll for cron events.

    Disabling the cron firing is done by adding this to the wp-config.php file:

    define( 'DISABLE_WP_CRON', true );Code language: JavaScript (javascript)

    For this site specifically, I use the “Cron Jobs” system of DreamHost to run this WP-CLI command every 10 minutes:

    wp cron event run --due-now --path=/path/to/derrick.blog/ --url=https://derrick.blog/Code language: JavaScript (javascript)

    This forces the cron to run and check for ready jobs every 10 minutes.  It’s possible that some cron events might run later than they “should” but in practice, I’ve seen this running more cron jobs than if I relied on page loads.

  • Quick Tip: Force Enable Auto-Updates in WordPress

    Quick Tip: Force Enable Auto-Updates in WordPress

    I know that auto-updates are a bit of a (#wpdrama) touchy subject, but I believe in them.

    In an mu-plugin I enable all auto-updates like so:

    <?php
    // Turn on auto-updates for everything
    if ( ! defined( 'IS_PRESSABLE' ) || ! IS_PRESSABLE ) {
    	add_filter( 'allow_major_auto_core_updates', '__return_true' );
    	add_filter( 'allow_minor_auto_core_updates', '__return_true' );
    }
    
    add_filter( 'auto_update_core', '__return_true' );
    add_filter( 'auto_update_plugin', '__return_true' );
    add_filter( 'auto_update_theme', '__return_true' );
    add_filter( 'auto_update_translation', '__return_true' );
    Code language: HTML, XML (xml)
  • Windows, SFTP, and the Registry

    Windows, SFTP, and the Registry

    One of the tasks that I have to do often at work is copying data to and from an SFTP directory.  Previously I had a constant domain and port that I was able to connect to, an I could save this in a WinSCP profile for ease of use.  Due to some recent architectural changes though, we’re now dynamically generating IPs and ports to connect to, which caused a bit of a headache.  Luckily though, we do get a really nice sftp://user@domain.example.com:1234 URI that gives us this information, and some terminal clients even allow you to click it (ConEmu).

    Unfortunately, I can’t just register WinSCP as the default handler for sftp URIs because I needed to provide other data, such as a private key and proxy information.

    To fix this,  I created a wrapper, poorly named scp.cmd that does all of this work:

    @echo off
    :: Set some necessary path variables.
    :: I would recommend WinSCP Portable, but that's just me.
    set WINSCP_PATH="C:\Path To\winscp.exe"
    set PRIVKEY_PPK="C:\Path To\Private Key.ppk"
    :: Run the actual SCP command.
    %WINSCP_PATH% %1 /privatekey=%PRIVKEY_PPK% /rawsettings ProxyMethod=2 ProxyHost="127.0.0.1" ProxyPort=8080
    :: Unset the variables now that we don't need them.
    set WINSCP_PATH=
    set PRIVKEY_PPK=Code language: PHP (php)

    From here I can create a Windows Registry file (or manually do it with regedit.exe but that’s crazy) to register the sftp URI handler and point it to my scp.cmd file:

    Windows Registry Editor Version 5.00
    
    [HKEY_CLASSES_ROOT\sftp\shell\open\command]
    @="\"C:\\Windows\\scp.cmd\" \"%1\""Code language: JavaScript (javascript)

    Now I can easily click on sftp links for work, I can paste them into the Windows Run dialog, or even open them via the command line with start.  This is a wonderful time saver!

  • Let’s Encrypt SSL on SABnzbd+

    Let’s Encrypt SSL on SABnzbd+

    Let’s Encrypt has been in public beta for some time now, so I thought it was time for me to test it out and see how it works.

    I’ve been working on some automation for Let’s Encrypt, WordPress Multisite, Domain Mapping, and Apache for a while, but I don’t have anything that I feel comfortable sharing yet.

    For now though, I was able to get Let’s Encrypt to work with SABnzbd+, which is a binary newsgroup downloader for things such as Linux ISOs.

    (more…)