Tag: json

  • Extract Transcript from Quill Meetings Files

    I use Quill Meetings for local on-device transcriptions of calls. It’s pretty great!

    The app definitely has some quirks and is missing some features that I’d prefer, like the ability to just export a text file of a call transcript. Sure, I can “copy” it and paste it into a file, but the pasted text is missing things like timestamps.

    So I built a quick script to extract transcripts from .qm files for me. .qm files are basically just JSON files:

    #!/opt/homebrew/bin/php
    <?php
    declare(strict_types=1);
    
    error_reporting( E_ALL );
    ini_set( 'display_errors', '1' );
    
    // Quill export dir is first argument, or current directory if not provided.
    $export_dir = isset( $argv[1] ) ? rtrim( $argv[1], '/' ) : getcwd();
    
    // Find every file that ends in .qm in the export directory.
    $files = glob( $export_dir . '/*.qm' );
    if ( ! $files ) {
    	echo "No .qm files found in the directory: $export_dir\n";
    	exit( 1 );
    }
    
    /**
     * Each QM file is just a JSON file with a .qm extension and the first line being "QMv2"
     * We need to read each file, remove the first line, and decode the JSON.
     */
    foreach( $files as $file ) {
    	if ( ! is_readable( $file ) ) {
    		echo "Cannot read file: $file\n";
    		continue;
    	}
    
    	// Read the file and remove the first line.
    	$content = file_get_contents( $file );
    	if ( false === $content ) {
    		echo "Failed to read file: $file\n";
    		continue;
    	}
    
    	// Remove the first line (QMv2).
    	$lines = explode( "\n", $content );
    	array_shift( $lines ); // Remove the first line.
    	$json_content = implode( "\n", $lines );
    
    	// Decode the JSON content.
    	$data = json_decode( $json_content, true );
    	if ( null === $data && json_last_error() !== JSON_ERROR_NONE ) {
    		echo "Invalid JSON in file: $file\n";
    		continue;
    	}
    
    	// Pretty print the JSON data.
    	$pretty_json = json_encode( $data, JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES );
    	if ( false === $pretty_json ) {
    		echo "Failed to encode JSON for file: $file\n";
    		continue;
    	}
    
    	$speakers = array();
    	$transcript = array();
    	$output_string = '';
    	$output_file = '';
    	foreach ( $data as $quill_objects => $quill_object ) {
    	    // Each Quill object is an array. We want to check if it has a 'type' of 'Meeting'.
    		if ( isset( $quill_object['type'] ) && $quill_object['type'] === 'Meeting' ) {
    			$output_file = $quill_object['data']['start'] . '-' . $quill_object['data']['end'] . ': ' . $quill_object['data']['title'] . '.txt';
    			// The "audio_transcript" is just a JSON string that we need to decode.
    			$audio_transcript = json_decode( $quill_object['data']['audio_transcript'], true );
    			$encoded_speakers = $quill_object['data']['speakers'] ?? [];
    			foreach( $encoded_speakers as $encoded_speaker ) {
    				$speakers[ $encoded_speaker['id'] ] = $encoded_speaker['name'] ?? 'Unknown Speaker ' . $encoded_speaker['id'];
    			}
    			if ( ! isset ( $audio_transcript['startTime'] ) ) {
    				echo "Invalid start time in audio transcript for file: $file\n";
    				continue;
    			}
    			$start_time = $audio_transcript['startTime'];
    			$end_time   = $audio_transcript['endTime'];
    			foreach( $audio_transcript['blocks'] as $block ) {
    				$time_block = ms_to_readable( $block['from'] - $start_time );
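    				if ( isset( $block['speaker_id'] ) ) {
    					// Fall back to a placeholder if the ID is not in the speakers list.
    					$speaker_block = $speakers[ $block['speaker_id'] ] ?? 'Unknown Speaker ' . $block['speaker_id'];
    				} else {
    					echo 'Unknown speaker found. Please manually mark all speakers in Quill before exporting.' . PHP_EOL;
    					exit( 1 );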
    				}
    				$output_string .= sprintf( "%s %s: %s\n", $time_block, $speaker_block, $block['text'] );
    			}
    		}
    	}
    
    	if ( ! empty( $output_string ) && ! empty( $output_file ) ) {
    		// Sanitize the filename.
    		$output_file = sanitize_filename( $output_file );
    		// Write the output string to the file.
    		if ( file_put_contents( $output_file, $output_string ) === false ) {
    			echo "Failed to write to file: $output_file\n";
    		} else {
    			echo "Exported to: $output_file\n";
    		}
    	} else {
    		echo "No valid Meeting data found in file: $file\n";
    	}
    }
    
    function ms_to_readable(int $ms): string {
    	// round to nearest second
    	$secs = (int) round($ms / 1000);
    	// gmdate formats seconds since 0; 'i:s' alone wraps back to 00:00
    	// after 59:59, so include hours once the call passes the one-hour mark
    	return '[' . gmdate($secs >= 3600 ? 'H:i:s' : 'i:s', $secs) . ']';
    }
    
    function sanitize_filename(string $filename): string {
    	// strip any path information
    	$fname = basename($filename);
    	// replace any run of characters that are NOT letters, digits, underscore, dot or hyphen with an underscore
    	$clean = preg_replace('/[^\w\.-]+/', '_', $fname);
    	// collapse multiple underscores
    	return preg_replace('/_+/', '_', $clean);
    }

    And when I say “I” wrote it, it was probably half AI 🙃

    This gives me a nice text file with timestamps.
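
    For a rough sense of the output format, here is a minimal sketch that runs a made-up payload through the same steps as the script: drop the QMv2 header line, decode the JSON, decode the nested audio_transcript string, and format each block. The field names mirror what the script reads. The meeting, the speakers, the text, and the millisecond values are invented sample data, the sample_transcript_lines() helper is hypothetical, and a real .qm file almost certainly contains more than this.

    ```php
    <?php
    // Hypothetical sample data; only the field names come from the script.
    $audio_transcript = array(
    	'startTime' => 1000,
    	'endTime'   => 91000,
    	'blocks'    => array(
    		array( 'from' => 6000, 'speaker_id' => 1, 'text' => 'Hi, can you hear me?' ),
    		array( 'from' => 73000, 'speaker_id' => 2, 'text' => 'Loud and clear.' ),
    	),
    );
    $meeting = array(
    	'type' => 'Meeting',
    	'data' => array(
    		'title'            => 'Weekly Sync',
    		'speakers'         => array(
    			array( 'id' => 1, 'name' => 'Alice' ),
    			array( 'id' => 2, 'name' => 'Bob' ),
    		),
    		// The transcript is a JSON string inside the JSON, hence the nested encode.
    		'audio_transcript' => json_encode( $audio_transcript ),
    	),
    );
    $qm_contents = "QMv2\n" . json_encode( array( $meeting ) );

    function sample_transcript_lines( string $qm_contents ): array {
    	// Drop the "QMv2" header line; everything after the first newline is JSON.
    	$json  = substr( $qm_contents, strpos( $qm_contents, "\n" ) + 1 );
    	$data  = json_decode( $json, true );
    	$lines = array();
    	foreach ( $data as $object ) {
    		if ( ( $object['type'] ?? '' ) !== 'Meeting' ) {
    			continue;
    		}
    		// Map speaker id => name.
    		$names      = array_column( $object['data']['speakers'], 'name', 'id' );
    		$transcript = json_decode( $object['data']['audio_transcript'], true );
    		foreach ( $transcript['blocks'] as $block ) {
    			// Offset from the start of the call, rounded to whole seconds.
    			$secs    = (int) round( ( $block['from'] - $transcript['startTime'] ) / 1000 );
    			$lines[] = sprintf( '[%s] %s: %s', gmdate( 'i:s', $secs ), $names[ $block['speaker_id'] ], $block['text'] );
    		}
    	}
    	return $lines;
    }

    echo implode( "\n", sample_transcript_lines( $qm_contents ) ) . "\n";
    ```

    Run as-is, this prints:

    [00:05] Alice: Hi, can you hear me?
    [01:12] Bob: Loud and clear.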

    So, yeah. Whatever.

  • Wisps, a WordPress Plugin

    Last year I had a need for an editable JSON file that was retrievable via HTTP. Of course there’s a million ways that I could do this, but the easiest I thought of would be to have it inside of WordPress, since all of the people that needed access to edit the file already had edit access to a specific site. So I built a plugin.

    Doing this inside WordPress already brings a lot of benefits with little to no effort:

    1. User Management
    2. Revision History
    3. oEmbed Support
    4. Permalinks
    5. Syntax Highlighting Code Editor
    6. Self-Hosted Data

    Possibly more benefits as well, depending on the setup, such as caching.

    I’ve tweaked the plugin some, and I’m almost ready to submit it to the WordPress.org Plugin Repository. I just need to do the hard part of figuring out artwork. Ugh.

    Introducing Wisps:

    Wisps are embeddable and sharable code snippets for WordPress.

    With Wisps, you can host code snippets much like Gist, Pastebin, or other code-sharing sites. Using the built-in WordPress code editor, you can write snippets to post and share. This has the benefit of WordPress revisions, auto-drafts, etc., to keep a record of how code changes.

    Wisps can be downloaded by appending /download/ to the permalink, or viewed raw by adding /view/ or /raw/. There is full oEmbed support so you can just paste in a link to a wisp in the editor and it will be fully embedded.
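
    As a small illustration of those URLs, here is a sketch of a helper that builds them from a permalink. The wisp_endpoint_url() function and the example.com permalink are hypothetical and not part of the plugin, and the sketch assumes pretty permalinks with no query string.

    ```php
    <?php
    /**
     * Hypothetical helper (not part of the plugin): build a Wisp endpoint URL
     * from its permalink. Assumes pretty permalinks without a query string.
     */
    function wisp_endpoint_url( string $permalink, string $endpoint ): string {
    	// The three endpoints described in the post.
    	$endpoints = array( 'download', 'view', 'raw' );
    	if ( ! in_array( $endpoint, $endpoints, true ) ) {
    		throw new InvalidArgumentException( "Unknown Wisp endpoint: $endpoint" );
    	}
    	// Normalize the trailing slash, then append the endpoint.
    	return rtrim( $permalink, '/' ) . '/' . $endpoint . '/';
    }

    // Made-up permalink, for illustration only.
    echo wisp_endpoint_url( 'https://example.com/wisps/site-config/', 'raw' ) . "\n";
    ```

    The raw URL is the one to hand to anything that needs to read the snippet over HTTP, such as a script fetching an editable JSON file.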

    PrismJS is used for syntax highlighting for oEmbeds.

    You can add Wisp support to your theme either by modifying the custom post type page-wisp.php template, which will continue to display Wisps in the loop securely, or you can use add_theme_support( 'wisps' ) to tell the plugin to not automatically escape the output. You can then do what you like, such as potentially adding frontend support for syntax highlighting.
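
    The second option might look something like this in a theme's functions.php. This is a sketch only: after_setup_theme is the usual WordPress hook for add_theme_support() calls, and the function_exists() guard is my own addition so the snippet does nothing when run outside WordPress.

    ```php
    <?php
    // In the theme's functions.php. The 'wisps' feature name comes from the plugin;
    // hooking into 'after_setup_theme' is the usual WordPress convention.
    if ( function_exists( 'add_action' ) ) {
    	add_action( 'after_setup_theme', function () {
    		// Opt out of the plugin's automatic escaping of Wisp output.
    		add_theme_support( 'wisps' );
    	} );
    }
    ```

    Once the theme opts in, escaping the Wisp content safely becomes the theme's job.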

    Here’s what the oEmbed data looks like:

    (Yeah, I totally stole the design from Gists, because I’m not talented 😬)

    View the example Wisp

    View it raw

    Download it

    Currently available on GitHub

    Hopefully one day available on the WordPress.org Plugin Repository 🙂

    If you give it a try and have any suggestions or issues, drop me a line here or on GitHub!