Joe Maller.com

Mbox files and Mail.app in 10.4

One of the big under-the-hood changes to Mail.app in 10.4 is that messages are no longer stored in mbox files. This allows Spotlight to index individual messages without first having to parse the contents of an entire mailbox. Despite being unused, the old mbox files are often still on the drive, which means that almost everyone’s mail now takes up nearly twice as much space as it did in 10.3 (my mail folder went from 1.4 to 2.8 gigs). If installing Tiger devoured a lot of hard drive space, that might account for a significant portion of where it went.

After an Archive & Install upgrade, my ~/Library/Mail directory still has folders labeled *.mbox, but each of those folders now contains a “Messages” directory which holds thousands of numbered *.emlx files. Those mostly appear to be plain text files, each containing one message. There is a small glob of XML plist data attached to the end of each file, as well as an integer at the top of the file. That integer is the message’s character/byte count from the end of the integer to the beginning of the XML data.

In theory, a fairly simple shell script could glom everything together into a standard mbox. I’m not sure how processor-intensive that would be, but the steps to reassemble the data would be trivial. At the very least, Apple’s decision to move away from the mbox format can be easily reversed with no data loss.
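The reassembly steps could be sketched roughly as follows. This is a Node.js illustration rather than the shell script mentioned above, and it rests on assumptions: the count is taken to start right after the newline that ends the leading integer, the function names parseEmlx and toMboxEntry are made up for this example, and the sender and date in the "From " separator line are placeholders.

```javascript
// Split the bytes of one message file into its message and trailing plist.
// Assumes the layout described above: "<count>\n", then <count> bytes of
// message, then the XML plist. Not checked against any Apple documentation.
function parseEmlx(buf) {
  const newline = buf.indexOf(0x0a); // end of the leading integer line
  if (newline < 0) throw new Error('missing length line');
  const count = parseInt(buf.toString('latin1', 0, newline).trim(), 10);
  if (Number.isNaN(count)) throw new Error('length line is not an integer');
  const start = newline + 1;
  return {
    message: buf.subarray(start, start + count),
    plist: buf.subarray(start + count).toString('utf8'),
  };
}

// Format one message as an mbox entry: a "From " separator line, the message
// with any line starting with "From " quoted as ">From ", and a blank line.
// The default separator's sender and date are placeholders (assumptions).
function toMboxEntry(message, fromLine = 'From MAILER-DAEMON Thu Jan  1 00:00:00 1970') {
  // latin1 maps each byte to one character, so no bytes are lost here.
  const text = message.toString('latin1').replace(/^From /gm, '>From ');
  const body = text.endsWith('\n') ? text : text + '\n';
  return fromLine + '\n' + body + '\n';
}
```

Concatenating the toMboxEntry() result for every file in a Messages directory, and writing the result with the 'latin1' encoding, would give something mutt or pine should be able to read, though I have not tried this against real Mail.app data.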

Not much has been written about this, but I found this MacOS X Hints mbox thread which confirms what I’m seeing:

I used to be able to use mutt or pine to view the mbox mailboxes in ~/Library/Mail/<account>/<box>/mbox . In 10.4 these are still present, but appear not to be updated any more. The up to date emails are in ~/Library/Mail/<account>/<box>/Messages/*.emlx which I believe is required for spotlight to be able to index messages – it only indexes file-based entities, not subportions of files.

Because Carbon Copy Cloner doesn’t work with 10.4 yet, I can’t comfortably back up my drive and experiment with deleting the old mboxes. It seems like it should be safe to remove all mbox files and their associated files, since nothing outside the Messages directories has been modified since I upgraded to 10.4. If anyone has more information, please leave a comment.

(While reading a little background on the mbox format, I found the original RFC for email as a text file. The W3C also has an HTML version of RFC822, partially converted by (sir) Tim Berners-Lee. It’s fun to encounter raw history like that.)

Update: I posted a simple command to delete unused mbox files.


getSelection() Workaround for Safari 1.3, 2.0 and Firefox 1.0.3

update 2: These workarounds also work with Safari 2.0 in Mac OS X v10.4.
update: There’s a simpler fix, jump to the bottom.

Yesterday morning I noticed a change to the JavaScript/DOM getSelection() behavior in the new Safari 1.3 (in 10.3.9) and the most recent version of Firefox 1.0.3.

I’ve been using this method for years to pull selected text from web pages for several of my bookmarklets. The one I use most frequently generates a link from whatever text is selected. If nothing is selected, it grabs the document’s title. The change in getSelection() broke that bookmarklet: no selected text was recognized.

After a bit of research, I found Mozilla’s Safely accessing content DOM from chrome page, which describes the security fixes behind the modification and details other problems the changes have caused. Based on Mozilla bug 290777 and this post by Buzz Anderson, both browsers seem to have problems with the change. Despite those bugs, I managed to find the simple workaround described below.

What Safari and Firefox now seem to be doing is returning a DOM selection object from getSelection() instead of a simple string. The result is that the value returned by getSelection() appears to be a string, but few of the string manipulation functions work on it without additional considerations.
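To make the symptom concrete, here is a minimal sketch using a hypothetical stand-in object (fakeSelection is invented for illustration and is not either browser’s real selection object). It only shows how an object can display as text while lacking string properties such as length; it does not model how Safari 1.3 or Firefox 1.0.3 convert their objects internally.

```javascript
// Hypothetical stand-in for an object that displays as text but is not a string.
const fakeSelection = {
  isCollapsed: false,
  toString() { return 'selected text'; },
};

const shown = String(fakeSelection); // what something like alert() would display
const len = fakeSelection.length;    // no length property on the object itself
const kind = typeof fakeSelection;   // 'object', not 'string'
```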

The following examples are all intended to be tested as bookmarklets, drag them to your bookmarks bar for testing:

  • getSelection() test 1
    javascript:d=window.getSelection();
    alert(d);

    Works as expected in Safari 1.2.4, Safari 1.3 and Firefox 1.0.3, popping an alert containing the selected text. Trying to measure that returned string fails in Safari 1.3 and Firefox 1.0.3 but works in Safari 1.2.4:

  • getSelection() test 2
    javascript:d=window.getSelection();alert(d.length);

    The older version of Safari returns a character count for the string of selected text. Firefox and Safari 1.3 return “undefined”. There are quite a few other problems:

  • getSelection() test 3
    javascript:d=window.getSelection();
    alert(d.toString());

    Works in Firefox and Safari 1.2.4 but not in Safari 1.3.

  • getSelection() test 4
    javascript:d=window.getSelection();
    alert(d.toString().length);

    Getting the length after toString works in Safari 1.2.4 and Firefox.

Further inconsistencies between Safari 1.3 and Firefox 1.0.3:

  • getSelection() test 5
    javascript:d=window.getSelection();alert(d.type);

    Returns “Range” in Safari with a selection, returns “Caret” or “None” with nothing selected. Fails with “undefined” in Firefox. (I think the Firefox 1.0.3 DHTML regression bug might be preventing it from working in Firefox but I didn’t try any of the recent nightly builds.)

  • getSelection() test 6
    javascript:d=window.getSelection();
    alert(d.getRangeAt(0));

    Fails silently in Safari, returns selected text in Firefox. Safari dumps this into the Console log:
    [5956] :TypeError - Value undefined (result of expression d.getRangeAt) is not object.

  • getSelection() test 7
    javascript:d=window.getSelection();
    alert(d.typeDetail);

    Fails with “undefined” in both.

There is some good news:

  • getSelection() test 8
    javascript:d=window.getSelection();
    alert(d.isCollapsed);

    Works in both Firefox and Safari 1.3; fails in Safari 1.2.4 as “undefined”. This means (finally!) there is a workaround for my problem.

I was using the length property to determine whether a selection was empty or not, then fetching the title of the window if that value was 0. Knowing that length no longer works in Firefox and Safari, isCollapsed can be used as a conditional switch.

  • getSelection() Workaround
    javascript:d=window.getSelection();
    d=(d.isCollapsed||d.length==0)?document.title:d;
    alert(d);

    That will return any selected text or the document title if there is no selection. Tested successfully in Safari 1.2.4, Safari 1.3, Firefox 1.0.3 and presumably Safari 2.0 as well.

Line breaks were added to visible code examples because my style-sheet choked on long lines and I can’t redo the CSS right now…

Update: After working through all of the above, I realized there is a far simpler solution: +''. The Safari problem seems to be that string methods do not work on the object returned from getSelection(). Forcing the result into a string by concatenating it with an empty string fixes all of my bookmarklets. Concat() fails because it’s a method of string; use the "+" joining operator and an empty string '' instead.
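As a rough sketch of that fix outside of bookmarklet form: the helper below coerces the result with + '' and falls back to the document title when nothing is selected. The function name selectedTextOrTitle and its win parameter are made up for this example; passing the window in just makes it easy to try with a stand-in object.

```javascript
// Return the selected text, or the page title if nothing is selected.
// `win` is expected to look like a browser window: getSelection() and document.title.
function selectedTextOrTitle(win) {
  var text = win.getSelection() + ''; // force the result into a plain string
  return text.length > 0 ? text : win.document.title;
}

// Equivalent bookmarklet form:
// javascript:d=window.getSelection()+'';d=d.length?d:document.title;alert(d);
```

Once the value is a real string, length and the other string methods should behave normally, so the isCollapsed check is no longer needed.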


