Extracting audio from CD images

I always wondered what the point was in cue/bin CD images compared to the ISO format, but never bothered to look it up.

A couple of days ago I was looking for music from an old PlayStation game (which I own but couldn't be bothered to track down). I knew the music was on the disc as CD audio since I'd copied it to tape a dozen years or more ago, so I found and downloaded an image of the game from the web.

The thought then occurred -- how do I rip the music from this image? I usually use cdparanoia to get music off CDs, but could I point cdparanoia to a mounted CD image? The answer is no -- I would only be able to mount the data partition. You don't mount an audio CD before ripping music from it; cdparanoia just accesses the drive directly. Would I have to burn the image to a CD, then rip it back onto the machine? That's ridiculous.

Meanwhile the download finished and I saw that it was bin/cue rather than ISO. This is when I looked it up. It seems obvious now, but an ISO is just the ISO-9660 filesystem -- the data part. As such, an ISO file can't include CD audio. The bin/cue format is needed in order to include the data on the separate tracks of the CD. Converting the bin/cue to ISO would have lost me all the music.

So for future reference, extracting the audio (and the ISO at the same time) from bin/cue is as easy as this:

bchunk -w wild9.bin wild9.cue wild9

The -w there makes it write any audio as WAV files so I can then convert to FLAC. I assume it'd write them as raw PCM otherwise, but I've no desire to check.
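
Before running bchunk it can be handy to see which tracks in the cue sheet are data and which are audio. Here's a rough sketch of that, assuming the usual "TRACK nn TYPE" lines in the cue file; the function name cue_tracks is just something I made up for illustration.

```shell
# List each track in a cue sheet along with its type (e.g. MODE2/2352 or AUDIO).
# cue_tracks is a hypothetical helper name, not part of bchunk.
# Usage: cue_tracks wild9.cue
cue_tracks() {
    awk '{
        sub(/\r$/, "")                 # tolerate Windows line endings
        if (toupper($1) == "TRACK") {  # cue sheets use lines like: TRACK 02 AUDIO
            print $2, $3
        }
    }' "$1"
}
```

Any track listed as AUDIO is one that a plain ISO conversion would have dropped.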


Archiving mailing list messages but not replies to my own posts

Sometimes I sign up to a mailing list and want to see replies to my own threads but not have my inbox overtaken by everything else.

I can set up a filter in Gmail to label the mailing list posts and archive them, obviously, but then replies to my own threads (or threads I've participated in) are not obvious.

So let's make a more sophisticated filter, to match all messages with the relevant List-ID header except those which mention my messages in their References header.

After a few test queries it seems that a search query of the form References:*@segnus matches messages replying to things I sent from my laptop. The negative version -References:*@segnus appears to match the opposite. So it's not hard to add more negatives to avoid matching messages from my other machines and to build up a final query:

List:django-users.googlegroups.com -References:*@segnus -References:*@t900 -References:*@perihelion

This then goes in the "includes the words" box of the filter, and is set to apply a label and skip the inbox.
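
Since the query is just the list ID followed by one negative term per machine, it can be generated rather than typed. This is only a sketch: gmail_list_query is a made-up helper name, and it assumes each hostname appears after the @ in my Message-ID headers as described above.

```shell
# Build the filter query: match the list, exclude anything referencing my hosts.
# gmail_list_query is a hypothetical helper, not anything Gmail provides.
# Usage: gmail_list_query LIST_ID HOST...
gmail_list_query() {
    query="List:$1"
    shift
    for host in "$@"; do
        query="$query -References:*@$host"
    done
    printf '%s\n' "$query"
}
```

Running gmail_list_query django-users.googlegroups.com segnus t900 perihelion prints the same query as above, ready to paste into the filter.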

It's possible that other people send messages with Message-ID headers like mine and so I'd get some extra messages in my inbox, but I can live with that.

This doesn't seem to work quite correctly on existing messages. I'm guessing this is because Gmail sees the messages in the thread before any of them referenced me -- my original message or the existing conversation before I turned up -- and those messages match the filter, leading Gmail to lump the rest of the conversation in with that positive match and archive the lot. Fingers crossed it'll work with fresh incoming messages, though -- I'll update this post when I confirm.


Viewing HTML in mutt

I use mutt to read my email, and every now and then I get sent a message in HTML format with no plain text alternative. I don't like to load these files in a browser, since it'd go ahead and fetch any images, run scripts and so on with potential privacy risks. In other words, a message from a dubious source might phone home and confirm my email address or track me or whatever, just from opening their HTML in my browser.

So generally I just mind-parse the HTML. In more obfuscated cases (like the garbage output as newsletters by various websites) I manually pipe the message through lynx or similar.

Then when I reply to the message I have to pipe it through again if I want to quote something other than the HTML code.

I got fed up of this and looked for a solution. It consists of two parts -- changing up the entries in my mailcap file so that filtering the HTML to plain text is preferred to opening up a browser; and telling mutt to automatically filter text/html files using the rules it finds in the mailcap file. I've added some redundancy to the mailcap entries so that it works both on my main machines (where I prefer pandoc since Markdown is nice to read, then I prefer lynx to either w3m or html2text, since lynx displays the links as references at the bottom) and on my phone (where only lynx is available).

In ~/.mailcap:

text/html; pandoc -f html -t markdown; copiousoutput; description=HTML Text; test=type pandoc >/dev/null
text/html; lynx -stdin -dump -force_html -width 70; copiousoutput; description=HTML Text; test=type lynx >/dev/null
text/html; w3m -dump -T text/html -cols 70; copiousoutput; description=HTML Text; test=type w3m >/dev/null
text/html; html2text -width 70; copiousoutput; description=HTML Text; test=type html2text >/dev/null

In ~/.mutt/muttrc:

auto_view text/html

Now HTML is automatically piped through one of those programs to turn it into plain text, and when I reply the quoted text is the plain text version rather than the raw HTML.
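
To check which of those converters a given machine will end up using, the same "first one whose test passes" logic can be mimicked in the shell. A small sketch, where first_available is a name I've invented and command -v stands in for the type tests in the mailcap entries:

```shell
# Print the first command from the argument list that is installed,
# roughly mirroring how the first mailcap entry with a passing test= wins.
# first_available is a hypothetical helper name.
first_available() {
    for cmd in "$@"; do
        if command -v "$cmd" >/dev/null 2>&1; then
            printf '%s\n' "$cmd"
            return 0
        fi
    done
    return 1
}
```

Calling first_available pandoc lynx w3m html2text should name the program that will be used on that machine, or print nothing if none of them are installed.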


Different keys to push and pull to git repository

I was getting fed up of typing in the password for my home server when pulling changes from a git repository up to the live server. But I'm not comfortable with putting my private key on the remote since other people have root. What to do?

I made two new SSH key pairs. One with a passphrase and one without. I told gitosis, which handles permissions for the git repositories on my home server, to accept my main passphraseless key with read/write access, the passphrased one also with read/write access, and the new passphraseless one with read-only access. I then uploaded the private keys for the passphrased key and the new passphraseless key to the remote host.

So even though they have one of my passphraseless private keys, it's only good for read access to the repositories -- data which they already have anyway.

To tell SSH it has multiple keys you edit the config file and add an IdentityFile line for each key. But when connecting to the remote SSH server only the first acceptable key is tried. So if the passphraseless key is first everything will be fine when doing a read operation but gitosis will give a no permission error message when doing a write operation, and the other key won't be tried. If the key with the passphrase is first, the passphrase is asked for no matter whether it's a read or write operation.

So here's the solution: use git's pushurl option to pretend that we're pulling from and pushing to different hosts, set up SSH to use a different key for each of those hosts, then point both hostnames at the same real host. Here's the configuration to illustrate.

Configuration in repository/.git/config:

[remote "origin"]
    fetch = +refs/heads/*:refs/remotes/origin/*
    url = gitosis@example.com:repository.git
    pushurl = gitosis@rw.example.com:repository.git

Configuration in ~/.ssh/config:

Host example.com
    IdentityFile ~/.ssh/id_rsa.ro

Host rw.example.com
    HostName example.com
    IdentityFile ~/.ssh/id_rsa.rw

So the dummy hostname rw.example.com triggers SSH to use the passphrased private key at the correct hostname. A passphrase prompt appears when pushing but not when pulling.
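
The pushurl line doesn't have to be added by hand; git can write it itself. A minimal sketch, assuming the remote already exists. The wrapper name set_push_alias is made up, but git remote set-url --push is a real git command.

```shell
# Point pushes for a remote at the dummy read/write host alias,
# leaving the fetch URL untouched.
# set_push_alias is a hypothetical wrapper name.
# Usage (inside the repository):
#   set_push_alias origin gitosis@rw.example.com:repository.git
set_push_alias() {
    git remote set-url --push "$1" "$2"
}
```

Afterwards git remote -v should show the plain hostname for fetch and the rw. alias for push.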


Automatically switch main group when logging in to a host

I have accounts on a couple of hosts where I don't have root powers, and on some of them I am working alongside others. We're in the same group, but our primary groups are named after ourselves. Sometimes we need to edit files the other person has made and it can be a pain if one of us forgets to set the permissions such that the other user can edit the file.

Instead of having to remember to run chgrp semsorgrid newfile and then chmod g+w newfile every time we make a new file, we can change our primary group after logging in with newgrp semsorgrid and set the file creation mask to give the group maximum privileges with umask 002 (or umask 007 to also shut everyone else out). But we still have to remember to do that when logging in.

I tried putting those commands in our shell configuration files (my .zshrc and his .bashrc) but then when logging in new shells are spawned recursively. The solution is to check which group we're in and conditionally run newgrp like this:

umask 007
if [ "$(groups | awk '{print $1}')" != "semsorgrid" ]; then
    exec newgrp semsorgrid
fi

This replaces the shell which was originally spawned with a new one with the primary group properly set, then when that shell initializes it skips the newgrp command since the primary group is already "semsorgrid".
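
A variation on the same check, as a sketch: id -gn prints the name of the current primary (effective) group directly, so there's no need to parse the output of groups. The function name in_primary_group is my own invention.

```shell
# Succeed if the current primary (effective) group is the one named in $1.
# in_primary_group is a hypothetical helper name.
in_primary_group() {
    [ "$(id -gn)" = "$1" ]
}

# In .zshrc or .bashrc this could stand in for the groups/awk test:
#   in_primary_group semsorgrid || exec newgrp semsorgrid
```

The guard matters just as much here: without it every new shell would exec newgrp again and the login would loop.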


Open online PHP reference from vim in a new split vim window

Following the last post I went a bit further. I haven't decided which solution out of that and the following I like best yet.

This time I wanted the documentation to open in a new window within vim. That means grabbing the documentation HTML, cutting out what I don't want to see, rendering to plain text and putting the result in a new window.

Since the documentation is given as XHTML, as long as it's valid the safest way to chop out the bits I don't want is by parsing it as XML. So since PHP is my quick hacking language of choice I cooked up the following script and saved it as ~/bin/phpman-text.

#!/usr/bin/env php
<?php

if (!isset($_SERVER["argv"][1])) {
    fwrite(STDERR, "No keyword given\n");
    exit(1);
}

$xmlstring = @file_get_contents("http://php.net/" . urlencode($_SERVER["argv"][1]));
if ($xmlstring === false) {
    fwrite(STDERR, "Failed to fetch doc page\n");
    exit(1);
}

// remove default namespace
$xmlstring = preg_replace('%\bxmlns="[^"]*"%', "", $xmlstring);

$xml = @simplexml_load_string($xmlstring);
if ($xml === false) {
    fwrite(STDERR, "Failed to parse doc page\n");
    exit(1);
}

// get content div (array_pop needs a real variable, not a function result)
$matches = $xml->xpath("//div[@id='content']");
$content = $matches ? array_pop($matches) : null;

if (is_null($content)) {
    fwrite(STDERR, "Couldn't find div with ID 'content'\n");
    exit(1);
}

// remove nav bars
foreach ($content->xpath("./div") as $nav) {
    if (in_array("manualnavbar", explode(" ", (string) $nav["class"]))) {
        simplexml_remove_node($nav);
    }
}

// get new XML
$newxml = $content->asXML();

// run lynx
$lynx = proc_open("lynx -dump -stdin", array(array("pipe", "r"), array("pipe", "w"), array("pipe", "w")), $pipes);
if (!is_resource($lynx)) {
    fwrite(STDERR, "Couldn't run lynx\n");
    exit(1);
}

// poke new XML to lynx's stdin, then close it so lynx sees end of input
fwrite($pipes[0], $newxml);
fclose($pipes[0]);

// get lynx's stdout
echo stream_get_contents($pipes[1]);
fclose($pipes[1]);
$erroroutput = stream_get_contents($pipes[2]);
fclose($pipes[2]);

// close lynx
$returnval = proc_close($lynx);

// check return value
if ($returnval != 0) {
    fwrite(STDERR, "lynx exited with code $returnval -- error output follows.\n$erroroutput");
    exit(1);
}

// remove a SimpleXML node from its document by going via DOM
function simplexml_remove_node($node) {
    $domnode = dom_import_simplexml($node);
    $domnode->parentNode->removeChild($domnode);
}


Then, with help from the vim wiki, I came up with this, to be added to my .vimrc:

function! OpenPhpFunction (keyword)
 exe "12new"
 exe "silent read !phpman-text ".substitute(a:keyword, "_", "-", "g")
 exe "set buftype=nofile bufhidden=delete filetype=php readonly"
 exe "1"
endfunction
autocmd FileType php map <buffer> K :call OpenPhpFunction('<C-r><C-w>')<CR>

This mostly works. It loads the PHP documentation into a temporary new buffer and sets the filetype to PHP so that bits of text in <?php ?> tags are highlighted as PHP code. There are a few bad things -- some of the blocks of PHP code in the manual don't have the end tag and so highlighting continues beyond the end of the code, some of the manual pages (probably due to dodgy comments) aren't valid XML and so won't go through the script, and lynx's output isn't completely ideal.
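
When a page won't load it helps to know exactly which URL the script is being asked for. The K mapping swaps underscores for hyphens before calling phpman-text, and that step is easy to reproduce by hand. A sketch only, where phpman_url is a made-up name and the URL pattern is the one used in the script above:

```shell
# Print the php.net URL for a keyword, turning underscores into hyphens
# the same way the K mapping does before it calls phpman-text.
# phpman_url is a hypothetical helper name.
phpman_url() {
    printf 'http://php.net/%s\n' "$(printf '%s' "$1" | tr '_' '-')"
}
```

The printed URL can then be opened in a browser to see whether the page itself is the problem.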


Open online PHP reference from vim

Pressing K when over a keyword in vim usually opens the keyword's manpage. In Python it invokes pydoc instead. I wanted it to open PHP's online documentation when I press it over a function name in PHP. Easy.

I wrote a tiny shell script in my ~/bin directory first, phpman:

#!/bin/sh
sensible-browser "http://php.net/$*"

I then added to my .vimrc the line

autocmd FileType php set keywordprg=phpman

Done. And using sensible-browser has the added bonus that it'll run whatever my preferred graphical browser is when I am in X or a textmode browser when I'm not.