Joe Maller.com

EXIF and the Unix Strings command

I got an email over the weekend pointing out a bug in my iPhoto Date Reset if an original image contained a single-quote in its name. Most all of my iPhoto images were imported from the camera, so I hadn’t seen this before, but I’m pretty sure I’ve already gotten it fixed.

While fixing that, I did a little revising of the EXIF sniffing script. I was using a one-line Perl snippet to scrape the date out of the first kilobyte of the file. Here’s the command broken across several lines

 perl -e 'open(IMG, q:[ABSOLUTE PATH}:);
 read(IMG, $exif, 1024); 
 $exif =~ s/\n/ /g; 
 $exif =~ s/.*([0-9]{4}(?::[0-9]{2}){2} [0-9]{2}(?::[0-9]{2}){2}).*$/$1/g;
 print STDOUT $exif;'

That worked, but perl one-liners usually need to be enclosed in single-quotes, since AppleScript was filling in the path, single-quotes in the name broke the script. I’m not that fluent in Perl, so there’re probably better ways of doing that.

But then I stumbled across the Unix Strings command. This basically does most of what I was doing. It scrapes a binary file (meaning non-text) and extracts anything that seems to be a string. The output from JPEGs often contains a bunch of gibberish, but right above the gibberish is every unencoded string from the EXIF header.

Using strings, sed for the pattern and head to trim, that somewhat convoluted perl script became this trim little shell script:

 strings [ABSOLUTE PATH] | sed -E -n '/([0-9]{4}(:[0-9]{2}){2} [0-9]{2}(:[0-9]{2}){2})/p' | head -n1

They’re both essentially instant on my computer so I’m not going to bother building a test to figure out which is actually faster.