<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Joe Maller &#187; html</title>
	<atom:link href="http://joemaller.com/tag/html/feed/" rel="self" type="application/rss+xml" />
	<link>http://joemaller.com</link>
	<description>.com</description>
	<lastBuildDate>Fri, 27 Jan 2012 06:04:17 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3</generator>
		<item>
		<title>Fixing a quarter million misnested HTML tags</title>
		<link>http://joemaller.com/1567/fixing-a-quarter-million-misnested-html-tags/</link>
		<comments>http://joemaller.com/1567/fixing-a-quarter-million-misnested-html-tags/#comments</comments>
		<pubDate>Tue, 22 Dec 2009 04:01:42 +0000</pubDate>
		<dc:creator>Joe</dc:creator>
				<category><![CDATA[misc.]]></category>
		<category><![CDATA[Web Development]]></category>
		<category><![CDATA[html]]></category>
		<category><![CDATA[regex]]></category>
		<category><![CDATA[Regular Expressions]]></category>

		<guid isPermaLink="false">http://joemaller.com/?p=1567</guid>
		<description><![CDATA[These things just seem to find me. This time it was a very large database dump for a media site that was plagued with misnested HTML tags. Seriously. Just shy of 250,000 misnested pairs. Here&#8217;s the pattern I came up with to fix it: Find: &#60;(([^ &#62;]+)(?:[^&#62;]*))&#62;(.*)&#60;(([^ &#62;]+)(?:[^&#62;]*))&#62;(.*)&#60;/\2&#62;(.*)&#60;/\5&#62; Replace with: &#60;$1&#62;$3&#60;$4&#62;$6&#60;/$5&#62;$7&#60;/$2&#62; or, depending on your [...]]]></description>
			<content:encoded><![CDATA[<p>These things just seem to find me. This time it was a very large database dump for a media site that was plagued with misnested HTML tags. Seriously. Just shy of 250,000 misnested pairs.</p>
<p>Here&#8217;s the pattern I came up with to fix it:</p>
<p>Find:</p>
<pre><code>&lt;(([^ &gt;]+)(?:[^&gt;]*))&gt;(.*)&lt;(([^ &gt;]+)(?:[^&gt;]*))&gt;(.*)&lt;/\2&gt;(.*)&lt;/\5&gt;</code></pre>
<p>Replace with:<br />
<code>&lt;$1&gt;$3&lt;$4&gt;$6&lt;/$5&gt;$7&lt;/$2&gt;</code><br />
or, depending on your regex engine, your replace string might look like this:<br />
<code>&lt;\1&gt;\3&lt;\4&gt;\6&lt;/\5&gt;\7&lt;/\2&gt;</code></p>
<p>That handles all of the following cases:</p>
<pre><code>&lt;b&gt;&lt;i&gt;text&lt;/b&gt;&lt;/i&gt;
&lt;b&gt;text&lt;i&gt;text&lt;/b&gt;text&lt;/i&gt;
&lt;b&gt;&lt;a href="#" target="_new"&gt;link&lt;/b&gt;text&lt;/a&gt;
&lt;a href="#"&gt;&lt;h2&gt;text&lt;/a&gt;&lt;/h2&gt;</code></pre>
<p>Running the final substitution was ridiculously fast. <a href="http://xkcd.com/208/">Regular Expressions are magic</a>.</p>
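<p>For anyone who wants to try the pattern outside a text editor, here is a minimal sketch in Python. The function name <code>fix_misnested</code> is made up for illustration, and the sketch assumes each misnested pair sits on a single line, since <code>.</code> does not match newlines by default. Python's <code>re</code> module takes the backslash form of the replace string.</p>

```python
import re

# Find pattern from the post: two opening tags whose closing tags
# appear in the wrong order.
MISNESTED = re.compile(
    r'<(([^ >]+)(?:[^>]*))>(.*)<(([^ >]+)(?:[^>]*))>(.*)</\2>(.*)</\5>'
)

# Replace string from the post, in backslash form (\1 rather than $1).
REPLACEMENT = r'<\1>\3<\4>\6</\5>\7</\2>'


def fix_misnested(html):
    """Swap the closing tags of misnested pairs (hypothetical helper).

    '.' does not cross newlines here, so a match never spans lines.
    """
    return MISNESTED.sub(REPLACEMENT, html)


if __name__ == "__main__":
    print(fix_misnested('<b><i>text</b></i>'))  # <b><i>text</i></b>
```

<p>Correctly nested input such as <code>&lt;b&gt;&lt;i&gt;text&lt;/i&gt;&lt;/b&gt;</code> comes back unchanged. This is still a regex working on simple cases, not an HTML parser.</p>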
]]></content:encoded>
			<wfw:commentRss>http://joemaller.com/1567/fixing-a-quarter-million-misnested-html-tags/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Tabbed clipboard to HTML Table</title>
		<link>http://joemaller.com/887/tabbed-clipboard-to-html-table/</link>
		<comments>http://joemaller.com/887/tabbed-clipboard-to-html-table/#comments</comments>
		<pubDate>Wed, 13 Feb 2008 21:00:23 +0000</pubDate>
		<dc:creator>Joe</dc:creator>
				<category><![CDATA[Mac OS X]]></category>
		<category><![CDATA[Web Development]]></category>
		<category><![CDATA[AppleScript]]></category>
		<category><![CDATA[html]]></category>
		<category><![CDATA[numbers]]></category>

		<guid isPermaLink="false">http://joemaller.com/2008/02/13/tabbed-clipboard-to-html-table/</guid>
		<description><![CDATA[I was looking for a quick way to get a structured table from some data I had in Numbers. Unfortunately Numbers isn&#8217;t scriptable and doesn&#8217;t seem to offer plain HTML export. After a little poking around, I just ended up writing a script to do what I wanted. This little AppleScript will convert any text [...]]]></description>
			<content:encoded><![CDATA[<p>I was looking for a quick way to get a structured table from some data I had in <a href="http://www.apple.com/iwork/numbers/">Numbers</a>. Unfortunately Numbers isn&#8217;t scriptable and doesn&#8217;t seem to offer plain HTML export. After a little poking around, I just ended up writing a script to do what I wanted.</p>
<p>This little AppleScript will convert any text on the clipboard into a simple, unstyled HTML table. <a href="applescript://com.apple.scripteditor?action=new&#038;script=%0Aset%20oldDelims%20to%20AppleScript%27s%20text%20item%20delimiters%0Aset%20AppleScript%27s%20text%20item%20delimiters%20to%20return%0Aset%20TRs%20to%20every%20text%20item%20of%20%28the%20clipboard%20as%20text%29%0A%0Aset%20AppleScript%27s%20text%20item%20delimiters%20to%20tab%0Aset%20theTable%20to%20%22%3Ctable%3E%22%20%26%20return%0A%0Arepeat%20with%20TR%20in%20TRs%0A%09copy%20theTable%20%26%20%22%3Ctr%3E%22%20%26%20return%20to%20theTable%0A%09repeat%20with%20TD%20in%20text%20items%20of%20TR%0A%09%09copy%20theTable%20%26%20%22%3Ctd%3E%22%20%26%20TD%20%26%20%22%3C%2Ftd%3E%22%20%26%20return%20to%20theTable%0A%09end%20repeat%0A%09copy%20theTable%20%26%20%22%3C%2Ftr%3E%22%20%26%20return%20to%20theTable%0Aend%20repeat%0A%0Acopy%20theTable%20%26%20%22%3C%2Ftable%3E%22%20to%20theTable%0A%0Aset%20AppleScript%27s%20text%20item%20delimiters%20to%20oldDelims%0A%0Aset%20the%20clipboard%20to%20theTable%0A%0A">View the script in Script Editor</a></p>
<p>Just save it into your Scripts folder and run it after copying some tab-delimited data to the clipboard. The script replaces the clipboard contents with a basic, unstyled HTML table, ready to paste.</p>
<style type="text/css">
    p.p1 {margin: 0.0px 0.0px 0.0px 32.7px; text-indent: -32.8px; font: 10.0px Verdana; min-height: 12.0px}
    p.p2 {margin: 0.0px 0.0px 0.0px 34.7px; text-indent: -34.7px; font: 10.0px Verdana; color: #0000ff}
    p.p3 {margin: 0.0px 0.0px 0.0px 32.7px; text-indent: -32.8px; font: 12.0px Helvetica; min-height: 14.0px}
    p.p4 {margin: 0.0px 0.0px 0.0px 34.7px; text-indent: -34.7px; font: 10.0px Verdana}
    p.p5 {margin: 0.0px 0.0px 0.0px 56.0px; text-indent: -56.1px; font: 10.0px Verdana}
    p.p6 {margin: 0.0px 0.0px 0.0px 56.0px; text-indent: -56.1px; font: 10.0px Verdana; color: #0000ff}
    p.p7 {margin: 0.0px 0.0px 0.0px 84.1px; text-indent: -84.1px; font: 10.0px Verdana}
    p.p8 {margin: 0.0px 0.0px 0.0px 34.7px; text-indent: -34.7px; font: 12.0px Helvetica; min-height: 14.0px}
    span.s1 {font: 12.0px Helvetica; color: #000000}
    span.s2 {color: #408000}
    span.s3 {color: #000000}
    span.s4 {color: #0000ff}
    span.s5 {font: 12.0px Helvetica}
    span.Apple-tab-span {white-space:pre}
  </style>
<p class="p1"></p>
<p class="p2"><b>set</b><span class="s1"> </span><span class="s2">oldDelims</span><span class="s1"> </span><b>to</b><span class="s1"> </span>AppleScript<span class="s3">&#39;s</span><span class="s1"> </span>text item delimiters</p>
<p class="p2"><b>set</b><span class="s1"> </span>AppleScript<span class="s3">&#39;s</span><span class="s1"> </span>text item delimiters<span class="s1"> </span><b>to</b><span class="s1"> </span>return</p>
<p class="p2"><b>set</b><span class="s1"> </span><span class="s2">TRs</span><span class="s1"> </span><b>to</b><span class="s1"> </span><b>every</b><span class="s1"> </span>text item<span class="s1"> </span><b>of</b><span class="s1"> </span><span class="s3">(</span>the clipboard<span class="s1"> </span>as<span class="s1"> </span>text<span class="s3">)</span></p>
<p class="p3"></p>
<p class="p2"><b>set</b><span class="s1"> </span>AppleScript<span class="s3">&#39;s</span><span class="s1"> </span>text item delimiters<span class="s1"> </span><b>to</b><span class="s1"> </span>tab</p>
<p class="p4"><span class="s4"><b>set</b></span><span class="s5"> </span><span class="s2">theTable</span><span class="s5"> </span><span class="s4"><b>to</b></span><span class="s5"> </span>&quot;&lt;table&gt;&quot;<span class="s5"> </span>&amp;<span class="s5"> </span><span class="s4">return</span></p>
<p class="p3"></p>
<p class="p2"><b>repeat</b><span class="s1"> </span><b>with</b><span class="s1"> </span><span class="s2">TR</span><span class="s1"> </span><b>in</b><span class="s1"> </span><span class="s2">TRs</span></p>
<p class="p5"><span class="s5"><span class="Apple-tab-span">	</span></span><span class="s4"><b>copy</b></span><span class="s5"> </span><span class="s2">theTable</span><span class="s5"> </span>&amp;<span class="s5"> </span>&quot;&lt;tr&gt;&quot;<span class="s5"> </span>&amp;<span class="s5"> </span><span class="s4">return</span><span class="s5"> </span><span class="s4"><b>to</b></span><span class="s5"> </span><span class="s2">theTable</span></p>
<p class="p6"><span class="s1"><span class="Apple-tab-span">	</span></span><b>repeat</b><span class="s1"> </span><b>with</b><span class="s1"> </span><span class="s2">TD</span><span class="s1"> </span><b>in</b><span class="s1"> </span>text items<span class="s1"> </span><b>of</b><span class="s1"> </span><span class="s2">TR</span></p>
<p class="p7"><span class="s5"><span class="Apple-tab-span">	</span><span class="Apple-tab-span">	</span></span><span class="s4"><b>copy</b></span><span class="s5"> </span><span class="s2">theTable</span><span class="s5"> </span>&amp;<span class="s5"> </span>&quot;&lt;td&gt;&quot;<span class="s5"> </span>&amp;<span class="s5"> </span><span class="s2">TD</span><span class="s5"> </span>&amp;<span class="s5"> </span>&quot;&lt;/td&gt;&quot;<span class="s5"> </span>&amp;<span class="s5"> </span><span class="s4">return</span><span class="s5"> </span><span class="s4"><b>to</b></span><span class="s5"> </span><span class="s2">theTable</span></p>
<p class="p6"><span class="s1"><span class="Apple-tab-span">	</span></span><b>end</b><span class="s1"> </span><b>repeat</b></p>
<p class="p5"><span class="s5"><span class="Apple-tab-span">	</span></span><span class="s4"><b>copy</b></span><span class="s5"> </span><span class="s2">theTable</span><span class="s5"> </span>&amp;<span class="s5"> </span>&quot;&lt;/tr&gt;&quot;<span class="s5"> </span>&amp;<span class="s5"> </span><span class="s4">return</span><span class="s5"> </span><span class="s4"><b>to</b></span><span class="s5"> </span><span class="s2">theTable</span></p>
<p class="p2"><b>end</b><span class="s1"> </span><b>repeat</b></p>
<p class="p8"></p>
<p class="p4"><span class="s4"><b>copy</b></span><span class="s5"> </span><span class="s2">theTable</span><span class="s5"> </span>&amp;<span class="s5"> </span>&quot;&lt;/table&gt;&quot;<span class="s5"> </span><span class="s4"><b>to</b></span><span class="s5"> </span><span class="s2">theTable</span></p>
<p class="p8"></p>
<p class="p2"><b>set</b><span class="s1"> </span>AppleScript<span class="s3">&#39;s</span><span class="s1"> </span>text item delimiters<span class="s1"> </span><b>to</b><span class="s1"> </span><span class="s2">oldDelims</span></p>
<p class="p8"></p>
<p class="p2">set the clipboard to<span class="s1"> </span><span class="s2">theTable</span></p>
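<p>The same row-and-cell logic can be sketched in Python for anyone not on a Mac. This is an illustration, not a port of the script above: the function name <code>tabbed_text_to_table</code> is hypothetical, it takes the text as an argument instead of reading the clipboard, and it escapes cell contents, which the AppleScript does not do.</p>

```python
from html import escape


def tabbed_text_to_table(text):
    """Build an unstyled HTML table from tab-separated lines of text."""
    lines = ["<table>"]
    # One table row per line of input; splitlines() accepts \r, \n and \r\n.
    for row in text.splitlines():
        lines.append("<tr>")
        # One cell per tab-separated field.
        for cell in row.split("\t"):
            # Escaping is an addition; the AppleScript inserts cells verbatim.
            lines.append("<td>" + escape(cell) + "</td>")
        lines.append("</tr>")
    lines.append("</table>")
    return "\n".join(lines)


if __name__ == "__main__":
    print(tabbed_text_to_table("Name\tQty\nApples\t3"))
```

<p>Getting text on and off the clipboard is left out on purpose, since that part depends on the platform.</p>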
]]></content:encoded>
			<wfw:commentRss>http://joemaller.com/887/tabbed-clipboard-to-html-table/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>


