Joe Maller.com

Fixing Mail import crashes on Leopard

After restoring from a Time Machine backup, Apple Mail would crash every time I tried to re-import my email archive. The problem seems to be affecting a lot of people who have mail that pre-dates OS X. The same crash also happens after migrating to a new machine.

Quick fix

Copy the following command and paste it into your Terminal. Press return and wait a few minutes. Once the command finishes, Mail should be able to import your messages.

grep -lZr 'Content-disposition: attachment' ~/Library/Mail/Mailboxes/ | xargs -0 ruby -i -pe 'gsub(/(Content-type:[^;]*;\s*name=)"(.*)"/){$1+(if !$2.nil? then $2.dump.gsub(/\\\\/, "\\") end)}'

The problem

The problem is specific to older, MacRoman encoded email messages with attachments whose filename includes non-ascii characters. For whatever reason, that freakshow edge-case combination will crash Apple Mail every time it tries to import those files. If by some stroke of luck you already have these messages in your email archive (I did), they can still crash Mail when trying to rebuild the containing mailbox.

The solution is sort of simple, just rename the attachment in the .emlx file. The harder part is finding which file to edit.

We’ll use grep to recursively search for a common token in these older files, then pass the filenames to a Ruby snippet which will filter the line containing the bad characters. Here is a section of a suspect message’s header:

--B_3113094650_866203
Content-type: application/octet-stream; name="SPWH 4.01ü.sit";
x-mac-creator="53495421";
x-mac-type="53495435"
Content-disposition: attachment
Content-transfer-encoding: base64

The first bolded line contains the attachment name which is causing the problem, who knows what that umlaut was originally. The second bolded line is what we’re telling grep to locate, I couldn’t get anything to dependably match the oddball characters. “Content-disposition” appears to be an older attachment syntax and didn’t appear in any of my messages from after 2001.

While the Ruby script could be run from find’s exec command, it wouldn’t be particularly efficient. Calling the script from find would pass every .emlx file in the Mailboxes directory through Ruby’s gsub, which is almost wholly unnecessary (and much slower). Only 10 of my 57,007 messages needed fixing, 99.98% of them were fine.

Is this safe?

Since the most common way to encounter this bug involves Time Machine restores or migration to a new machine, most users should already be backed up. If something should go wrong, just restore the backup’s Library/Mail folder over the messed up one. You can also copy or zip the Library/Mail folder to another drive to be even safer.

That said, I ran dozens of iterations of this solution against copies of my personal Mail archive without issue. And besides, if you’re reading this, you might not have any of your old mail, so how much worse could it really get?

If you’re still worried, copy your ~/Library/Mail/Mailboxes folder to another drive, run the script, then compare directories with something like Apple’s FileMerge. That will show you exactly which files have changed and what was changed inside them.

Renaming the attachment should be perfectly safe since the full encoded file contents are stored in the message. All the attachment name does is specify the filename, in a sense, it’s totally arbitrary.

I first submitted this bug with Apple back in May, if you have a developer account with Apple, please file a dupe for radar: 5912997


AppleScript source links from TextMate

AppleScript’s URL Protocol Support allows the full source code of an AppleScript to be shared through an encoded URL. Here’s a short little TextMate Command for converting AppleScripts to encoded URLs.

Create a new command in TextMate’s AppleScript bundle with the following code:

#!/usr/bin/env ruby -KA -rcgi
src = CGI.escape(STDIN.read)
src = src.gsub('+', '%20')
src = 'applescript://com.apple.scripteditor?action=new&script=' + src
`echo -n "#{src}" | pbcopy`
print "The URL encoded AppleScript was copied to the clipboard"

There’s no reason this idea can’t be used for all sorts of stuff. So why not make it easier, it’s not AppleScript, but here’s that code in Script Editor for easier copying.

This is exceptionally fast, much faster than Apple’s provided encoding script. As an extreme example, converting the 558 lines of my iPhoto Date Reset script took just under a minute using Apple’s script. The little Ruby script in TextMate does it instantly. (running the AppleScript in Script Editor with the Event Log Window open took nearly 7 minutes.)

Update: I committed this command to the TextMate Bundles repository and it’s now included in the default set of Bundles shipping with TextMate. (And slightly improved by other TextMate bundle developers)