Fixing a Palm duplicate disaster

I recently came across an absolute disaster of a Palm Desktop data file while helping someone set up a new iPhone. It had 13,572 contacts, mostly duplicates. Judging from the number of obvious duplicate entries, my guess was that the actual number would be somewhere around 2,500 (it was).

Here is the process I used to automatically remove a lot of those duplicates and import the remainder into the Mac’s Address Book.

The first step is to get out of Palm Desktop as soon as possible. Select all contacts and export to a group VCard. This one was 3.4 MB.

Most of this will happen in Terminal, but a quick stop in BBEdit or TextWrangler will save a few steps later on. (TextMate tends to choke on big, non-UTF files.) The Palm export file is encoded in MacRoman. It’s 2008; pretty much any text that isn’t Unicode should be. I used TextWrangler to convert the encoding to UTF-8 with no BOM (byte order mark).

VCards require Windows style CRLF line endings. While we could deal with those in Sed, we might as well just switch the file to Unix style LF endings in TextWrangler too. The TextWrangler bottom bar should switch from this:

MacRoman CRLF

To this:

utf8 LF

Now comes the magic.

While this could be done as an impossible-to-read one-line sed command, it’s easier to digest and debug as separate command files.

Here are the steps:

  1. Use Sed to join each individual VCard into a single line using a token to replace line feeds, output to intermediate file
  2. Sort and Uniq the result to remove obvious duplicates.
  3. Replace the tokens with line feeds

Below are the two sed command files I used. I ran these individually but they could easily be piped together into a one-line command.


vcard_oneline.sed:

# define the range we'll be working with
/BEGIN:VCARD/,/END:VCARD/ {

# define the loopback
:loop

# add the next line to the pattern buffer
N

# if pattern is not found, loopback and add more lines
/\nEND:VCARD$/! b loop

# replace newlines in multi-line pattern (the token is tab, %%%, tab)
s/\n/	%%%	/g
}

Run that like this:

sed -f vcard_oneline.sed palm_dump.vcf > vcards_oneline.txt

Then run that file through sort and uniq:

sort vcards_oneline.txt | uniq > vcards_clean.txt


vcard_restore.sed:

# replace tokens with DOS style CRLF line endings
# (^M is a literal carriage return, and the token is tab, %%%, tab)
s/	%%%	/^M\
/g

# add the <CR> before the LF at the end of the line
s/$/^M/

Run that with something like this:

sed -f vcard_restore.sed vcards_clean.txt > vcards_clean.vcf

After that last step, you should be able to drag the vcards_clean.vcf file into Address Book to import your vcards.
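For anyone who would rather not juggle intermediate files, the same join, sort, uniq and restore idea can be sketched in a few lines of Python. This is not the process I used, just a rough equivalent; the function name `dedupe_vcards` is made up for illustration, and it assumes every card ends with a line reading exactly END:VCARD.

```python
def dedupe_vcards(text: str) -> str:
    """Return the unique vCards in `text`, sorted, with CRLF line endings."""
    cards = []
    current = []
    for line in text.splitlines():
        if not line.strip() and not current:
            continue  # skip blank lines between cards
        current.append(line)
        if line.strip() == "END:VCARD":
            # one whole card collected; store it as a single string
            cards.append("\n".join(current))
            current = []
    # a set drops exact duplicates; sorting mirrors the sort | uniq step
    unique = sorted(set(cards))
    # restore the CRLF line endings that vCards expect
    return "".join(card.replace("\n", "\r\n") + "\r\n" for card in unique)
```

Like the sed version, this only catches cards that are identical character for character. Near-duplicates (a changed phone number, a trailing space) will survive.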

Suggestions for improvement are always welcomed.


In Vim, type the tab character as control-v-i (hold control while pressing v, then i) and the line break (the ^M carriage return) as control-v-enter.

iconv could be used to convert from MacRoman to UTF-8. TextWrangler just seemed easier at the time.
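The conversion can also be scripted. Here is a minimal Python sketch of the two things TextWrangler did for me: re-encoding MacRoman as UTF-8 without a BOM and switching to LF line endings. The function name `macroman_to_utf8` is just an illustration.

```python
def macroman_to_utf8(raw: bytes) -> bytes:
    """Decode MacRoman bytes and return UTF-8 (no BOM) with LF line endings."""
    # "mac_roman" is the name of Python's built-in MacRoman codec
    text = raw.decode("mac_roman")
    # normalize CRLF and bare CR to LF
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Python's "utf-8" codec does not write a byte order mark
    return text.encode("utf-8")
```

Read the export with open(path, "rb") and write the result back out in binary mode so nothing else touches the line endings.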

Palm Desktop appears to dump group VCards in input order, so duplicate entries were not grouped together. Running the output through sort visually reveals a ton of duplicates and makes it possible to use uniq to remove consecutive duplicates.

I had to quit and re-open Address Book once or twice before it would import the files.

Tabbed clipboard to HTML Table

I was looking for a quick way to get a structured table from some data I had in Numbers. Unfortunately Numbers isn’t scriptable and doesn’t seem to offer plain HTML export. After a little poking around, I just ended up writing a script to do what I wanted.

This little AppleScript will convert any text in the clipboard into a simple, unstyled HTML table.

Just save it into your Scripts folder and call it after copying some data to the clipboard. The table replaces the clipboard contents, ready to paste.

set oldDelims to AppleScript's text item delimiters
set AppleScript's text item delimiters to return
set TRs to every text item of (the clipboard as text)
set AppleScript's text item delimiters to tab
set theTable to "<table>" & return
repeat with TR in TRs
    copy theTable & "<tr>" & return to theTable
    repeat with TD in text items of TR
        copy theTable & "<td>" & TD & "</td>" & return to theTable
    end repeat
    copy theTable & "</tr>" & return to theTable
end repeat
copy theTable & "</table>" to theTable
set AppleScript's text item delimiters to oldDelims
set the clipboard to theTable
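The same row and cell logic is easy to port if the data is coming from somewhere other than the clipboard. Below is a rough Python sketch, not a translation of the script above: the name `tsv_to_html_table` is made up, it accepts any line ending rather than only returns, and it escapes HTML special characters, which the AppleScript does not do.

```python
import html


def tsv_to_html_table(text: str) -> str:
    """Turn tab-separated rows of text into a plain, unstyled HTML table."""
    out = ["<table>"]
    for row in text.splitlines():
        out.append("<tr>")
        for cell in row.split("\t"):
            # escape &, < and > so cell text cannot break the markup
            out.append("<td>" + html.escape(cell) + "</td>")
        out.append("</tr>")
    out.append("</table>")
    return "\n".join(out)
```

Calling tsv_to_html_table("a\tb") gives a one-row table with two cells.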

MWSF 2008 pre-thoughts

Just so I can go on the record, here are my thoughts before the Macworld 2008 keynote.

Multiple RegisterResource directives broken in Leopard

This is sort of a follow-up on the old mod_rendezvous article I wrote for O’Reilly. While cleaning up my virtual hosts I discovered a bug in 10.5’s handling of multiple RegisterResource directives in mod_bonjour. This is expanded from a bug report I submitted to Apple (rdar://problem/5628484).

I keep functional mirrors of all my development sites in separate Apache Virtual Hosts. Each one then gets its own port, which allows me to check them on local networks, in Parallels and, if I want, remotely via IP address.

To advertise two local vhosts over Bonjour, something like this in httpd.conf should work:

<IfModule bonjour_module>
RegisterResource "Site 1" / 9001
RegisterResource "Site 2" / 9002
</IfModule>

After restarting Apache (sudo apachectl graceful), local copies of Safari should see the two sites, “Site 1” and “Site 2” in Bonjour bookmark listings. In 10.4, they show up. In 10.5, only “Site 2” shows up. No matter how many directives are included, only the last one will be visible.

I’d love to be wrong about this, but it seems that something broke this function in Leopard.

A faster way of checking Bonjour entries is to open a terminal window and run the following command:

mdns -B _http._tcp

That will show a live updating list of current multicast (Bonjour) entries. Under 10.4, I get the following after adding the above directives to httpd.conf:

16:10:14.517  Add     0 local.     _http._tcp.     Site 1
16:10:14.667  Add     0 local.     _http._tcp.     Site 2

However with 10.5, I get this:

16:12:52.597  Add     1 local.     _http._tcp.     Site 2
16:12:52.598  Add     1 local.     _http._tcp.     Site 2
16:12:52.598  Add     0 local.     _http._tcp.     Site 2
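When comparing machines it can help to boil that output down to the set of advertised names. The Python sketch below does that; the function name `advertised_names` is made up, and the column layout (time, Add, flags, domain, service type, name) is assumed from the listings above rather than from any documentation.

```python
def advertised_names(browse_output: str) -> set:
    """Collect the instance names from the 'Add' lines of browse output."""
    names = set()
    for line in browse_output.splitlines():
        # split off the first five columns; the rest of the line is the name,
        # which may itself contain spaces ("Site 1")
        fields = line.split(None, 5)
        if len(fields) == 6 and fields[1] == "Add":
            names.add(fields[5].strip())
    return names
```

Fed the 10.4 listing it returns both site names; fed the 10.5 listing it returns only "Site 2".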

What I’d really love to do is figure out how to register and respond to multiple Bonjour names. That way I could have each vhost be a named host and each staged site accessible at a URL like site1.local and site2.local. So far I haven’t had any luck getting that working.

Rotating sub-pixel text rendering

John Gruber of Daring Fireball (thanks for the link!) knows a lot about font rendering. However, in a recent post discussing screen rotation and sub-pixel text rendering he let this slip:

“I tested it on my Cinema Display with the screen rotated 90°, and, to my eyes, sub-pixel anti-aliasing still looked good.”

That is just preposterous. Aside from his observation being completely wrong, he also revealed a bug in OS X: the current system doesn’t recognize rotated pixel orientations. Sub-pixel rendering on rotated screens should probably be disabled automatically. (rdar://problem/5627732)

Here are two screenshots of my browser’s address bar as displayed on my Cinema Display, which clearly show the difference. The top image is Leopard’s default sub-pixel rendering. The second image is the same bar photographed with my display rotated 90°; the photo was then rotated back in Photoshop for better comparison.

Comparison of rotated sub-pixel type

The text was apparently calculated against the presumed horizontal LCD primary orientation. But because the pixels were rotated, several of the letterform stems (verticals) are drawing as full-pixel-width colored lines. The first “h” is especially glaring: its stem and stroke are drawn as a pair of dark red and light blue lines.

Sub-pixel rendering takes advantage of a known horizontal alignment of the three color primaries that make up each physical pixel. The algorithm seems to render text at 3x the horizontal resolution, ignoring the color information and treating each third-pixel as a valid light source to use for drawing letterforms. That 3x width is then striped with red, green and blue to match the screen’s component primary ordering. (That is an educated guess.)
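To make that guess concrete, here is a toy Python sketch of the striping step. It is only my reading of the idea, not Apple's implementation: `subpixel_row` is a made-up name, it assumes dark text on a white background with stripes in R, G, B order, and it skips the filtering a real renderer would need to tame color fringes.

```python
def subpixel_row(coverage):
    """Pack a 3x-wide row of ink coverage values into RGB pixels.

    Each value is 0.0 (background) to 1.0 (solid ink). Three consecutive
    values drive the red, green and blue stripes of one physical pixel.
    """
    if len(coverage) % 3 != 0:
        raise ValueError("row width must be a multiple of 3")
    pixels = []
    for i in range(0, len(coverage), 3):
        # ink darkens a stripe: full coverage turns that primary off
        r, g, b = (round(255 * (1.0 - c)) for c in coverage[i:i + 3])
        pixels.append((r, g, b))
    return pixels
```

A stem one third of a pixel wide, sitting on the green stripe, comes out as a single magenta pixel: subpixel_row([0, 1, 0]) gives [(255, 0, 255)]. On a correctly oriented panel the eye reads that as a thin dark line. Rotate the panel and the colored pixel is all that is left.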

As Steve Weller stated in the post John linked, the human eye has “pathetic color-resolution”. This fact is exploited all over the place in video, with many formats sampling color only once for every four luminance pixels.
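As a rough illustration of "one color sample for every four luminance pixels", the Python sketch below averages each 2x2 block of a color plane into a single value. The name `subsample_chroma` is made up, and the 2x2 averaging is an assumption on my part; real video formats differ in where and how they take the sample.

```python
def subsample_chroma(plane):
    """Average each 2x2 block of a chroma plane (a list of equal-length rows)."""
    out = []
    for y in range(0, len(plane) - 1, 2):
        row = []
        for x in range(0, len(plane[y]) - 1, 2):
            # four neighboring pixels share one color sample
            total = (plane[y][x] + plane[y][x + 1]
                     + plane[y + 1][x] + plane[y + 1][x + 1])
            row.append(total / 4)
        out.append(row)
    return out
```

The luminance plane stays at full resolution, so edges and shapes stay sharp while the color data shrinks to a quarter of its original size.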

Several things are at play here:

  1. Human vision is the binocular product of horizontally arranged eyes.
  2. Most written human language uses letterforms which are vertically oriented and horizontally distinguished, especially Latin-derived languages.
  3. Most human languages read horizontally.
  4. Human vision tends to be less color sensitive for motion, or when scanning information (like reading).

It all just kind of worked out perfectly. Digital color reproduction combined our horizontal predisposition with our soft and slow perception of color, and then arranged color primaries horizontally. Text also reads horizontally, and since the viewer is rapidly moving their eyes, we perceive shape and contrast before color. Additionally, Latinate languages evolved letterforms which utilize horizontal variations against a largely regular vertical syncopation. Presto: sub-pixel rendering just seems fantastically obvious.

Regarding John’s closing supposition,

“I’m not sure the iPhone’s rotating display is reason enough to rule out sub-pixel rendering.”

Based on everything leading up to sub-pixel rendering in the first place, most of the benefits would be lost if the underlying pixel grid was vertically oriented. The sensitivity of computer text falls across the horizontal axis. Adding resolution to the vertical axis isn’t worth the effort.

Sub-pixel rendering is ultimately a transitional technology anyway, a half-step that improves the now while waiting for a better and inevitable future to arrive. Once we start seeing iPhone-level pixel densities all over the place, sub-pixel rendering will begin its transition to technology footnote.

Digital displays will someday reach a point where every physical pixel is capable of producing every color of visible light. (And someone will doubtlessly push into near infrared and ultra-violet, claiming increased realism and fidelity). Future displays will also be operating at a density where anti-aliasing may not be necessary at all.

I still think Apple’s decision to use standard anti-aliasing for the Leopard menu bar was a mistake. Unless they’ve got some spiffy high-pixel-density Cinema Displays ready for Macworld and enable system-wide resolution independence in 10.5.x, switching to standard anti-aliased text rendering in the menu bar was a change that should have been postponed. The necessary hardware pool just isn’t here yet and the result is an interface that looks markedly worse than it did under previous releases.

Calculate Sizes CPU usage bug in Leopard’s Finder

Lately I’ve noticed the Finder on my MacBook Pro has been running both CPU cores at 40-80% for no apparent reason. From what I’ve been able to tell, there is a bug related to having the same window open in two different spaces with Calculate Folder Sizes enabled. I filed a bug on this (5609348) but Apple is already aware of this issue.

Workaround: Closing all the Finder windows (⌘-w) seems to bring the Finder’s CPU usage back to zero.

The following steps will recreate the problem every time for me:

  1. Log in to the guest account
  2. ⌘-up arrow twice (navigate up from guest user’s home folder)
  3. ⌘-2 (list view)
  4. Open System Prefs
  5. Enable Spaces with default options.
  6. Open Activity Monitor (via Spotlight), search for “Finder” to clean up display
  7. Arrange windows so the CPU value is visible behind the window
  8. Switch to another space
  9. Create a new window: ⌘-n, ⌘-up arrow twice, ⌘-2
  10. Switch back to first space
  11. Click Finder window to be sure it’s selected (probably unnecessary)
  12. ⌘-J (Show View Options)
  13. Check “Calculate all sizes”

CPU usage should now increase. On my MBP I see about 40% across both cores.

  14. Uncheck “Calculate all sizes”

CPU usage increases to as much as 80%.

A few notes:

The window should have a lot of files underneath it. If the Calculate all sizes command finishes too quickly it won’t show the problem. I opened windows to the top level of the hard drive because there weren’t enough files in the default guest account home folder.

This behavior did not happen with both windows in the same space.

New You Control: Desktops beta

Last year I wrote about Mac Virtual Desktops, focusing mostly on You Control: Desktops. I ended up buying a license, and last week they released version 1.3 beta 2.

Some of the good stuff I noticed right away:

  • Customizable color for the highlighted desktop in the menu bar. The previous beta used a hard-coded red outline which was ghastly.

  • Behavior of the cursor on edge-screen flipping. It can now be set to mimic Compiz/Beryl, where the cursor starts at the opposite edge of the next screen. So if you drag a window off the left side, it appears on the next desktop on the right side. With a transition like Cube or Pan, I find this to be spatially very intuitive. The cursor can also be set to remain in place. For transitions like Slide, it feels like you’re holding a window while the current desktop just gets out of the way beneath it. Not sure which one I prefer, but the options are a huge improvement.
  • Overall, the speed of transitions and switching feels much faster, especially on the more graphics-intensive effects.

Feedback I’m sending to the developers:

  • Fade, Swirl, Twist and Zoom still use an additive composite, which means they pretty much always blow out to white during the transition and then pop to the next screen. It’s uncharacteristically ugly.
  • Cursor repositioning seems to wait for some mouse movement before redrawing dragged windows. If the mouse is kept perfectly still, the dragged window will remain at the position it was pre-transition, then pop into place when the mouse is moved again.

    It would be fantastic if the dragged windows could be composited before the transition or in the transition buffer; I think it would be perceptible and would intuitively support the various repositioning options. If you’re doing that, you might as well update the menu-bar icon’s status pre-transition too. It’s disorienting to see it show up wrong and then update; the menu bar should reflect the new state before the transition finishes.

  • Hardware support for extra mouse buttons would be fantastic.
  • Hot key assignment is way too clumsy.

While I go through phases of using and not using virtual desktops, if you want multiple desktops on your Mac right now and can’t wait for Spaces in Leopard (which doesn’t do as much), YC:Desktops is the way to go.
