<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	>

<channel>
	<title>Blogos &#187; SemanticBible</title>
	<atom:link href="http://semanticbible.com/blogos/category/semanticbible/feed/" rel="self" type="application/rss+xml" />
	<link>http://semanticbible.com/blogos</link>
	<description>God's Word &#124; our words &#124; meaning, communication, &#38; technology &#124; following Jesus, the Word made flesh</description>
	<pubDate>Tue, 16 Jun 2009 21:29:58 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.7</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>http://ref.ly for Bible References</title>
		<link>http://semanticbible.com/blogos/2009/06/16/httprefly-for-bible-references/</link>
		<comments>http://semanticbible.com/blogos/2009/06/16/httprefly-for-bible-references/#comments</comments>
		<pubDate>Tue, 16 Jun 2009 21:29:58 +0000</pubDate>
		<dc:creator>Sean</dc:creator>
		
		<category><![CDATA[Programming]]></category>

		<category><![CDATA[SemanticBible]]></category>

		<category><![CDATA[bibleref]]></category>

		<category><![CDATA[bible_reference]]></category>

		<category><![CDATA[reftagger]]></category>

		<guid isPermaLink="false">http://semanticbible.com/blogos/?p=943</guid>
		<description><![CDATA[My colleagues at Logos have launched http://ref.ly, a URL shortening service for Bible references: see this blog post. It provides the convenience of TinyURL (turning long unreadable URLs into something much more manageable), but unlike that service also provides readable, understandable content. Once you get past the prefix, you won&#8217;t have any trouble figuring out [...]]]></description>
			<content:encoded><![CDATA[<p>My colleagues at Logos have launched <a title="URL-shortening service for Bible references" href="http://ref.ly">http://ref.ly</a>, a URL shortening service for Bible references: see <a href="http://blog.logos.com/archives/2009/06/bible_references_on_twitter.html">this blog post</a>. It provides the convenience of <a href="http://tinyurl.com/">TinyURL</a> (turning long unreadable URLs into something much more manageable), but unlike that service also provides readable, understandable content. Once you get past the prefix, you won&#8217;t have any trouble figuring out what verse <a href="http://ref.ly/Mk4.9">http://ref.ly/Mk4.9</a> is referring to.</p>
<p>If you&#8217;re a <a href="http://en.wikipedia.org/wiki/Twitter">Twitter</a> person trying to shoehorn your message into 140-character tweets, you&#8217;ll like the fact that this gives you a brief and unambiguous way to both specify a Bible reference and link to the content behind it (the references resolve to the actual verse text at <a href="http://bible.logos.com">bible.logos.com</a>). Since <a href="http://semanticbible.com/blogos/2009/01/08/addressability-matters/">addressability matters</a>, this is a good thing.</p>
<p>But it has precisely the same utility even if you&#8217;re not a Twitterhead (i&#8217;m not):</p>
<ul>
<li>it clearly marks a string of characters as a Bible reference</li>
<li>it also normalizes the reference into a form that can be automatically processed</li>
</ul>
<p>While it&#8217;s not quite a microformat, it&#8217;s really only a small step away from things like <a title="Bibleref overview" href="http://semanticbible.com/bibleref/bibleref-overview.html">bibleref</a>. In particular, if lots of people start using ref.ly references, it will be possible to process that content and understand things like what verses are most popular.</p>
<p>What&#8217;s more, editors that recognize and automatically link URLs (like MS Outlook for HTML-based email, and MS Word) will now automatically make Bible links for you (like <a title="RefTagger" href="http://logos.com/reftagger">RefTagger</a> does for blog posts), as long as you&#8217;re willing to tack on &#8220;http://ref.ly/&#8221; and live with the slightly non-traditional format. You don&#8217;t need to know anything about how to make a hyperlink in HTML: just a little extra syntax (14 characters, to be precise) moves these references toward much greater usefulness.</p>
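<p>To make the &#8220;automatically processed&#8221; point concrete, here is a minimal sketch of tallying ref.ly references found in a collection of texts. The pattern is an assumption: it only covers references shaped like the Mk4.9 example above (an optional leading digit, a book abbreviation, a chapter, and an optional verse), and <code>tally_refly</code> is a made-up helper name, not part of any ref.ly or Logos API.</p>

```python
import re
from collections import Counter

# Hypothetical pattern: assumes references shaped like Mk4.9 or Ge1
# (optional leading digit, book letters, chapter, optional .verse).
REFLY = re.compile(r"http://ref\.ly/(\d?[A-Za-z]+)(\d+)(?:\.(\d+))?")

def tally_refly(texts):
    """Count how often each ref.ly reference appears across the texts."""
    counts = Counter()
    for text in texts:
        for book, chapter, verse in REFLY.findall(text):
            # Normalize to a readable key such as "Mk 4.9" or "Ge 1".
            ref = f"{book} {chapter}" + (f".{verse}" if verse else "")
            counts[ref] += 1
    return counts

posts = [
    "He who has ears: http://ref.ly/Mk4.9",
    "Reading http://ref.ly/Jn3.16 and http://ref.ly/Mk4.9 today",
]
print(tally_refly(posts).most_common())
```

<p>Because the prefix marks the string unambiguously, a single regular expression is enough here; recognizing free-form references like &#8220;Mark 4:9&#8221; in running text would take considerably more work.</p>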
]]></content:encoded>
			<wfw:commentRss>http://semanticbible.com/blogos/2009/06/16/httprefly-for-bible-references/feed/</wfw:commentRss>
		</item>
		<item>
		<title>The Most Important Verses? It Depends What You Mean</title>
		<link>http://semanticbible.com/blogos/2009/05/22/the-most-important-verses-it-depends-what-you-mean/</link>
		<comments>http://semanticbible.com/blogos/2009/05/22/the-most-important-verses-it-depends-what-you-mean/#comments</comments>
		<pubDate>Fri, 22 May 2009 18:17:57 +0000</pubDate>
		<dc:creator>Sean</dc:creator>
		
		<category><![CDATA[Programming]]></category>

		<category><![CDATA[SemanticBible]]></category>

		<category><![CDATA[analysis]]></category>

		<category><![CDATA[bible_reference]]></category>

		<category><![CDATA[machine_learning]]></category>

		<guid isPermaLink="false">http://semanticbible.com/blogos/?p=913</guid>
		<description><![CDATA[The title of this post is a deliberate take-off from a recent post at OpenBible.info entitled &#8220;What Are the Most Popular Verses in the Bible? It Depends Whom You Ask&#8221;. That post combines data from an earlier ESV analysis of search results, TopVerses.com, a BibleGateway (internal) study, and OpenBible data to present a list of [...]]]></description>
			<content:encoded><![CDATA[<p>The title of this post is a deliberate take-off from a recent post at <a href="http://www.openbible.info">OpenBible.info</a> entitled <a href="http://www.openbible.info/blog/2009/05/what-are-the-most-popular-verses-in-the-bible-it-depends-whom-you-ask/">&#8220;What Are the Most Popular Verses in the Bible? It Depends Whom You Ask&#8221;</a>. That post combines data from <a href="http://www.esv.org/blog/2005/12/what-are-the-most-popular-verses-in-the-bible/">an earlier ESV analysis of search results</a>, <a href="http://topverses.com">TopVerses.com</a>, <a href="http://www.biblegateway.com/blog/?p=125">a BibleGateway (internal) study</a>, and OpenBible data to present a list of 278 verses, each of which appears in at least one source&#8217;s &#8220;top 100&#8221; list. It&#8217;s interesting to see not only how much disparity there is (only 13% occur in at least three of the four lists), but also how uneven the distribution is. <a href="http://www.openbible.info/blog/2009/05/what-are-the-most-popular-verses-in-the-bible-it-depends-whom-you-ask/comment-page-1/#comment-31066">As one commenter points out</a>, it&#8217;s somewhat surprising that there are no verses from Revelation, and Old Testament narrative in particular is largely absent except for Genesis. John&#8217;s gospel has about as many popular verses as all the other gospels combined: there are only four verses from Mark (two of them from the often-questioned ending). Less surprisingly, perhaps, there are none from the shortest NT books (Philemon, Jude, 2-3 John). Altogether it&#8217;s an interesting study.</p>
<p>The larger question this raises for me is how we might come up with a comprehensive, global score for verses to indicate their importance for a variety of purposes. As the OpenBible post suggests, this depends not only on what the source of the data is, but also on what your purpose is and what you mean by &#8220;important&#8221; (which is certainly different from &#8220;popular&#8221;, though not completely unrelated).</p>
<p>One useful purpose is ranking verses to present them in response to searches: TopVerses.com is explicitly organized this way, as indicated in <a href="http://www.topverses.com/topv/wordpress/?p=13">this news article about the site</a>. They don&#8217;t go into much detail about how they gathered their data, though the scope (37M references scoured from the web) is impressive. But there&#8217;s a subtle disparity here: their data is based on counting mentions (citations) in published web pages, but their use case is prioritizing search results, and these may be out of sync. The fact that a given verse is frequently published on the web doesn&#8217;t necessarily mean it&#8217;s the one you want at the top of the list when you&#8217;re doing a word-based search, for example. The other three sources seem perhaps better matched to ranking search results, since they&#8217;re derived from searches themselves.</p>
<p>Another key hitch in these endeavors is how to handle range references, both in processing source data and (for search purposes) in handling queries. For example, many Bible dictionaries frequently reference ranges of verses, sometimes extensive, multi-chapter ones. If you&#8217;re going to count these, you need to think carefully about how you do the counting so you don&#8217;t introduce bias (or, better, you select the bias that&#8217;s best suited to your purposes).</p>
<p>For example, in the TopVerses.com ranking <cite class="bibleref">John 3.1</cite> <a href="http://www.topverses.com/?verse=32080">is #26</a>, despite the rather plain descriptive content with little obvious spiritual impact.</p>
<blockquote><p>Now there was a Pharisee, a man named Nicodemus who was a member of the Jewish ruling council. (<cite class="bibleref">John 3.1</cite>, NIV)</p></blockquote>
<p>While i can&#8217;t be sure, i strongly suspect this high rank is an unintended consequence of dis-aggregating ranges and whole chapter references from <cite class="bibleref" title="John 3">John 3</cite>. In fact, scanning top verses by chapter from John, the first verse in each chapter is very often the highest or second-highest ranked, and nearly always among the top ten. This probably says more about the counting methodology than the significance of those verses in particular. The Bible Gateway study focuses on ranges of no more than three verses to explicitly mitigate this problem.</p>
<h3>Other Measures of Importance</h3>
<p>Moving from popularity to importance, i can imagine several different factors that might be combined to produce a more general importance score:</p>
<ul>
<li><em>citation frequency</em> (based on some corpus). In the TopVerses.com approach, these are web pages, which provides a very large set of observations. A number of other digital text collections would also suit this purpose, and even allow segmentation by genre: for example, you get a very different ranking from the <a href="http://www.logos.com/products/details/1678">Anchor Yale Bible Dictionary</a> compared to <a href="http://www.logos.com/ebooks/details/eastons">Easton&#8217;s</a> (and neither has <cite class="bibleref">John 3.16</cite> at the top of the list). See below for more about this.</li>
<li><em>search frequency</em>, the basis for the other three sources in the OpenBible.info post. This could be refined further given data on follow-up activities. For example, depending on your application, verse searches whose results are then expanded into a chapter view or followed to the next verse might get a boost compared to those with no further action (this seems like a variant of &#8220;click through&#8221; rates used in search engine advertising).</li>
<li><em>content analysis</em> (context-independent): this could have several different flavors.
<ul>
<li>word count: though <cite class="bibleref">John 11:35</cite> gets mentioned more than you&#8217;d expect precisely <em>because</em> it&#8217;s the shortest verse in the (English) Bible, in general longer verses are more likely to be important. This could be refined further given a metric for <em>important</em> words (but now we&#8217;ve introduced a new problem: where does that data come from?), which could be used for weighting the counts.</li>
<li>We could do even better if, instead of counting words, we count <em>concepts</em> (and weight them). Assuming we think the concept of <span class="font-variant: small-caps;">HUMILITY</span> is important, we&#8217;d want verses expressing that concept to rank more highly, regardless of whether they used a more common word like &#8220;humility&#8221;, or a less common one like &#8220;lowly&#8221;. Converting words to concepts is a difficult challenge, however.</li>
<li>Connections to other data also affect importance. In some sense, every verse that reports words of Jesus is probably more important to a Christian than one whose importance is otherwise comparable, which is why we have the convention of printing Bibles with the words of Christ in red (a binary system for visualizing importance).</li>
<li>We might even consider negative factors: a lower rank for unfamiliar, hard-to-pronounce names, or &#8220;taboo&#8221; words.</li>
</ul>
</li>
</ul>
<p>Unlike TopVerses.com, i don&#8217;t see a particular need to provide a <em>unique</em> rank for each verse. If each verse has a score (to simplify the math, a decimal between 0 and 1 is a common approach), you can simply pick the top n verses that fit your purpose, and then order any ties canonically.</p>
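<p>As a rough illustration of the scoring idea above, the following sketch combines a few factors into a single score between 0 and 1, picks the top n, and breaks ties canonically. The factor names, weights, and sample values are all hypothetical placeholders chosen for the example, not measured data, and the book-order table is only a tiny subset.</p>

```python
# Hypothetical factor weights (they sum to 1); per-verse factor values
# are assumed to be already normalized to the range 0..1.
WEIGHTS = {"citation": 0.5, "search": 0.3, "content": 0.2}

# Canonical position of each book (a tiny illustrative subset).
BOOK_ORDER = {"Ge": 1, "Ps": 19, "Jn": 43, "Ro": 45}

def score(factors):
    """Weighted sum of factor values; a missing factor counts as 0."""
    return sum(WEIGHTS[name] * factors.get(name, 0.0) for name in WEIGHTS)

def top_n(verses, n):
    """Return the n highest-scoring references, ties in canonical order."""
    def key(item):
        (book, chapter, verse), factors = item
        # Negated score sorts high scores first; the rest is canon order.
        return (-score(factors), BOOK_ORDER[book], chapter, verse)
    return [ref for ref, _ in sorted(verses.items(), key=key)[:n]]

verses = {
    ("Ro", 8, 28): {"citation": 0.6, "search": 0.8, "content": 0.5},
    ("Jn", 3, 16): {"citation": 0.9, "search": 1.0, "content": 0.7},
    ("Ge", 1, 1): {"citation": 0.6, "search": 0.8, "content": 0.5},
}
print(top_n(verses, 2))
```

<p>In this made-up data the Genesis and Romans verses have identical scores, so the canonical ordering decides which of them makes the cut.</p>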
<h3>Comparing Dictionary Reference Citations</h3>
<p>I did a small experiment to compare the most frequent reference citations in seven Bible dictionaries that are incorporated in Logos&#8217;s software (so this is citation frequency, not search frequency). I extracted and counted all the references, and then aggregated the counts across all seven: the top 20 references are shown below, along with how many &#8220;votes&#8221; they received in the OpenBible.info list. In the case of whole chapter references (four of the top ten), i&#8217;ve indicated with yes/no whether <em>any</em> verse from that chapter occurs in the OpenBible list.</p>
<p>There&#8217;s relatively little overlap between the two lists: only seven of these are in the OpenBible list. Many of these make sense given the different purposes of reference works: for example, <cite class="bibleref">Is 61.1</cite> is a key messianic text. The high rank for <cite class="bibleref">2 Ki 15.29</cite> is initially puzzling, but probably results from being commonly cited in discussions of the conquests of Tiglath-Pileser and the Babylonian exile. Overall, this is probably much too small a sample to show the correspondences: i presume we&#8217;d find much more overlap in the top few hundred.</p>
<table border="0">
<tbody>
<tr>
<th>Reference</th>
<th>Aggregate Count</th>
<th>Count In<br />
OpenBible List</th>
</tr>
<tr>
<td>Jn 1:14</td>
<td>169.5</td>
<td><strong>1</strong></td>
</tr>
<tr>
<td>2 Ki 15:29</td>
<td>165.2</td>
<td>0</td>
</tr>
<tr>
<td>Is 61:1</td>
<td>159.8</td>
<td>0</td>
</tr>
<tr>
<td>Ac 1:13</td>
<td>151.7</td>
<td>0</td>
</tr>
<tr>
<td>Ge 1</td>
<td>150.0</td>
<td><strong>yes</strong></td>
</tr>
<tr>
<td>Ac 15</td>
<td>143.0</td>
<td>no</td>
</tr>
<tr>
<td>Ge 2:7</td>
<td>142.3</td>
<td>no</td>
</tr>
<tr>
<td>Ge 46:21</td>
<td>139.3</td>
<td>no</td>
</tr>
<tr>
<td>Jn 3:16</td>
<td>137.8</td>
<td><strong>4</strong></td>
</tr>
<tr>
<td>Ge 1:26</td>
<td>135.2</td>
<td><strong>3</strong></td>
</tr>
<tr>
<td>Is 7:14</td>
<td>134.3</td>
<td><strong>1</strong></td>
</tr>
<tr>
<td>Mt 28:19</td>
<td>130.2</td>
<td><strong>3</strong></td>
</tr>
<tr>
<td>Da 7:13</td>
<td>130.0</td>
<td>0</td>
</tr>
<tr>
<td>Ps 2:7</td>
<td>129.8</td>
<td>0</td>
</tr>
<tr>
<td>1 Pe 2:9</td>
<td>126.3</td>
<td>0</td>
</tr>
<tr>
<td>Ac 20:4</td>
<td>124.3</td>
<td>0</td>
</tr>
<tr>
<td>Lk 3:1</td>
<td>123.8</td>
<td>0</td>
</tr>
<tr>
<td>Mk 10:45</td>
<td>123.7</td>
<td>0</td>
</tr>
<tr>
<td>1 Sa 1:1</td>
<td>121.5</td>
<td>0</td>
</tr>
<tr>
<td>Ac 1:8</td>
<td>120.8</td>
<td><strong>3</strong></td>
</tr>
</tbody>
</table>
<p>Details:</p>
<ul>
<li>The dictionaries used were the <a href="http://www.logos.com/products/details/1678">Anchor Yale Bible Dictionary</a>, <a href="http://www.logos.com/products/details/1924">Baker Encyclopedia of the Bible</a>, <a href="http://www.logos.com/products/details/1874">Eerdman&#8217;s Dictionary of the Bible</a>, <a href="http://www.logos.com/products/details/2422">Eerdman&#8217;s Bible Dictionary</a>, <a href="http://www.logos.com/products/details/1569">International Standard Bible Encyclopedia (ISBE)</a>, New Bible Dictionary, and <a href="http://www.logos.com/ebooks/details/TYNBIBDCT">Tyndale Bible Dictionary</a>.</li>
<li>Like the OpenBible.info approach, i took range references with 3 or fewer verses and decomposed them into individual verses, splitting the counts (which is why the aggregate counts are floats rather than integers). Larger ranges were left atomic, which confuses the results further: for example, <cite class="bibleref">Ge 1:26</cite> probably ought to be even higher, since the high-ranking chapter reference <cite class="bibleref" title="Ge 1">Ge 1</cite> includes it.</li>
<li>Some references are undercounted because this method distinguishes BHS and LXX references, but i doubt this materially affects the results.</li>
</ul>
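<p>The range handling described in the details above can be sketched roughly as follows. This is an illustration under stated assumptions rather than the code actually used for the experiment: <code>add_reference</code> and <code>MAX_SPLIT</code> are hypothetical names, and ranges are assumed to stay within a single chapter.</p>

```python
from collections import Counter

MAX_SPLIT = 3  # ranges of this many verses or fewer are decomposed

def add_reference(counts, book, chapter, first, last=None):
    """Add one citation, splitting short ranges evenly across their verses.

    first and last are verse numbers within one chapter; longer ranges
    are kept atomic under a single range key such as "1-21".
    """
    last = first if last is None else last
    size = last - first + 1
    if size > MAX_SPLIT:
        counts[(book, chapter, f"{first}-{last}")] += 1.0
        return
    for verse in range(first, last + 1):
        # Each verse gets an equal share, hence fractional counts.
        counts[(book, chapter, verse)] += 1.0 / size

counts = Counter()
add_reference(counts, "Jn", 3, 16)      # single verse: full count
add_reference(counts, "Jn", 3, 16, 17)  # two-verse range: half each
add_reference(counts, "Jn", 3, 1, 21)   # long range: left atomic
print(counts)
```

<p>Splitting the count keeps a three-verse citation from weighing three times as much as a single-verse one, at the cost of the fractional totals seen in the table.</p>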
<h3>Conclusions</h3>
<p>None of this is meant as criticism of the particular sites mentioned above. I strongly believe that any user-oriented, empirically-based data set is better than nothing, and in most endeavors like this, &#8220;the best data is more data&#8221;. * But with more data comes more complexity, and i&#8217;ve only scratched the surface here in considering some of the different factors.</p>
<p>The key point is this: if we want to measure something, we need to be clear up front about exactly what it is, and also what purpose we hope it will serve. I never stop being amazed at how often &#8220;obvious&#8221; approaches to data problems produce surprising results.</p>
<hr /><p>* In my recollection, this quote is attributed to Bob Mercer, a leading researcher in statistical language processing who was part of the IBM research group in the 1990s. I haven&#8217;t been able to verify a real source, however.</p>
]]></content:encoded>
			<wfw:commentRss>http://semanticbible.com/blogos/2009/05/22/the-most-important-verses-it-depends-what-you-mean/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Search Engine Optimization for Blogs and Non-profits</title>
		<link>http://semanticbible.com/blogos/2009/01/31/search-engine-optimization-for-blogs-and-non-profits/</link>
		<comments>http://semanticbible.com/blogos/2009/01/31/search-engine-optimization-for-blogs-and-non-profits/#comments</comments>
		<pubDate>Sun, 01 Feb 2009 02:06:29 +0000</pubDate>
		<dc:creator>Sean</dc:creator>
		
		<category><![CDATA[SemanticBible]]></category>

		<guid isPermaLink="false">http://semanticbible.com/blogos/2009/01/31/search-engine-optimization-for-blogs-and-non-profits/</guid>
		<description><![CDATA[I listen to as many podcasts as i can, usually as a way to keep my mind engaged while my body is otherwise occupied with things like vacuuming, exercising, or taking long drives. I&#8217;m a glutton for ideas, so for me it&#8217;s a great way to spark creativity and explore new interests, usually in the [...]]]></description>
			<content:encoded><![CDATA[<p>I listen to as many podcasts as i can, usually as a way to keep my mind engaged while my body is otherwise occupied with things like vacuuming, exercising, or taking long drives. I&#8217;m a glutton for ideas, so for me it&#8217;s a great way to spark creativity and explore new interests, usually in the realm of new technology. Some of my favorite feeds:</p>
<ul>
<li><a href="http://itc.conversationsnetwork.org/">IT Conversations</a> is full of interesting content. Jon Udell&#8217;s <a href="http://itc.conversationsnetwork.org/series/innovators.html">Interviews with Innovators</a> is a regular favorite, as are <a href="http://itc.conversationsnetwork.org/series/etech.html">Emerging Technology</a>, <a href="http://itc.conversationsnetwork.org/series/technometria.html">Technometria</a>, and others.</li>
<li>Talis has <a href="http://www.talis.com/applications/resources/podcasts.shtml">several podcast &#8220;brands&#8221;</a> i haven&#8217;t fully sorted out, but they cover topics i&#8217;m interested in like digital libraries, the Semantic Web, and open education.</li>
</ul>
<p>A recent IT Conversations podcast was on <a href="http://itc.conversationsnetwork.org/shows/detail3887.html">Search Engine Marketing</a>, a discussion with Mike Moran and Bill Hunt, authors of the book <a href="http://www.amazon.com/Search-Engine-Marketing-Inc-Companys/dp/0136068685?ie=UTF8&#038;s=books&#038;qid=1218726990&#038;sr=1-1">Search Engine Marketing Inc</a>. A lot of their discussion focuses on companies whose web presence provides real revenue, and who therefore have a strong financial motivation to think hard about Search Engine Optimization (SEO). They&#8217;ve got some good advice: focus on content, check your description, write an article that solves a real problem (so others can link to it and build your web rank). But there are still plenty of us producing blogs (like <a href="http://semanticbible.com/blogos/">Blogos</a>) and open resource websites (like <a href="http://semanticbible.com/">SemanticBible</a>) whose motivation may be different: and SEO still matters for us.</p>
<p>If you&#8217;re reading this, in marketing terms, you&#8217;re a potential customer of my &#8220;brand&#8221;, and each web page or blog post i create involves, at one level, a marketing activity directed at you. I don&#8217;t get any revenue from my readers: my only half-hearted attempt at this is when i remember to put my Amazon Associates tag in a book recommendation (and to my knowledge, that&#8217;s never paid off). I don&#8217;t do ads either. I do, however, get something less tangible, but perhaps more important: blogging enhances my digital identity, including my reputation. If you&#8217;re in a high tech field, your on-line identity is becoming as important a representation of you to prospective employers as your resume. In my case, my unpaid activities of blogging, conference speaking, and web site development led pretty directly to my current work at Logos.</p>
<p>Of course, given the wide-open nature of web search, there are plenty of people who get to my blog for unrelated reasons. While i don&#8217;t want to repeat them here and perpetuate the problem, at one point a popular set of keywords leading people to Blogos had to do with my quoting some news story about home-manufactured, uh, pharm-a-sue-tickles. While it&#8217;s possible some of those misdirected searchers found some higher knowledge, most of them probably spent one second&#8217;s attention before clicking away. Moran and Hunt make a really good point here: these people are not &#8220;good customers&#8221;, and you&#8217;re not helping them or yourself by trying to attract them. Instead, they recommend you think carefully about what makes your site or blog distinctive: what are the target keywords you want to attract? Then determine a strategy for &#8220;owning&#8221; (to the extent possible) the search results for those keywords.</p>
<p>Example: with Google, i&#8217;m #1 among the 750k results for &#8220;semantic bible&#8221; (entered without quotes). I&#8217;m #3 for &#8220;hyperconcordance&#8221; (a modest achievement given there are only 5000 results). A two-year-old post is #4 for blogos (but not the home page?? i must be doing something wrong): since blogging has become more popular, so has the name (though <a href="http://www.semanticbible.com/blogos/2004/05/25.html">i was there first</a>). But i&#8217;m not even in the top 50 for &#8220;digital bible&#8221;, even though those are important keywords related to my content. Given all the competition in that space, it would take enormous effort to achieve a high ranking there. In this case, my efforts are probably better spent elsewhere.</p>
<p>There are plenty of free resources out there: Moran&#8217;s <a href="http://www.mikemoran.com/skinflintsearch/index.htm">Skinflint Search Marketing</a> is a good place to begin, and Google Analytics already provides far more capability than i know how to take advantage of. Which brings me back to the real challenge of doing SEO for non-profit sites: deciding how much effort is really worth it. But if nothing else, thinking about SEO gets you thinking about what your site is for in the first place, and that&#8217;s always a good thing to keep in focus.</p>
]]></content:encoded>
			<wfw:commentRss>http://semanticbible.com/blogos/2009/01/31/search-engine-optimization-for-blogs-and-non-profits/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Semantic Search in the Gospels</title>
		<link>http://semanticbible.com/blogos/2009/01/19/semantic-search-in-the-gospels/</link>
		<comments>http://semanticbible.com/blogos/2009/01/19/semantic-search-in-the-gospels/#comments</comments>
		<pubDate>Mon, 19 Jan 2009 18:43:32 +0000</pubDate>
		<dc:creator>Sean</dc:creator>
		
		<category><![CDATA[SemanticBible]]></category>

		<guid isPermaLink="false">http://semanticbible.com/blogos/2009/01/19/semantic-search-in-the-gospels/</guid>
		<description><![CDATA[Cognition uses &#8220;Semantic NLP, the Company&#8217;s patented linguistic meaning-based text processing technology&#8221; to process natural language text and make the information in it searchable by meaning rather than simply by word. They&#8217;ve recently released a demo based on the Gospels and associated notes from the NET Bible.
Dr. Kathleen Dahlgren, their founder and CTO, has been [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.cognition.com/">Cognition</a> uses &#8220;Semantic NLP, the Company&#8217;s patented linguistic meaning-based text processing technology&#8221; to process natural language text and make the information in it searchable by meaning rather than simply by word. They&#8217;ve recently released <a href="http://gospels.cognition.com/">a demo</a> based on the Gospels and associated notes from <a href="http://bible.org/">the NET Bible</a>.</p>
<p>Dr. Kathleen Dahlgren, their founder and CTO, has been working in the field of NLP for a long time, so this is not some newly-launched startup with more hype than substance. Their underlying technology represents an enormous investment in the linguistic data required for actually understanding language. Having worked in closely-related fields for most of <a href="http://semanticbible.com/blogos/2006/12/24/from-blogos-to-logos/">my pre-Logos career</a> (and having <a href="http://semanticbible.com/blogos/2006/07/18/topic-labels-and-semantic-bible-search/">thought quite a bit</a> about things like this for Bible study and search), i was very curious to take it for a spin and see how well it does. While they correctly claim that there&#8217;s a lot of figurative language in the Gospels, there&#8217;s also plenty of plain narrative description that ought to be understandable.</p>
<p>Not surprisingly, the examples on their demo page look reasonably good (that&#8217;s what you do when you put together a demo, after all). &#8220;Who double-crossed the Lamb of God?&#8221; is a clever way to show off their ability to recognize double-cross as a synonym for betray, and Lamb of God as an alternate designation for Jesus. I might quibble with &#8220;blessed are the pure in heart&#8221; (Matt 5:8) as a hit for &#8220;blessed are the innocent&#8221;, but it&#8217;s clearly on the right track.</p>
<p>But they also allow you to try your own queries, which is where you can really see whether this approach helps or not. Some queries i tried:</p>
<ul>
<li>&#8220;a valuable pearl&#8221; comes up empty. Just searching for &#8220;pearl&#8221; finds Matt 13:45-46, but not finding &#8220;a pearl of great value&#8221; as a valuable pearl seems like a definite lack of understanding. Just searching for &#8220;valuable&#8221; finds a great many hits (remember this includes the NET Bible notes as well as the text), but some of the senses it retrieves don&#8217;t seem like a good fit for &#8220;valuable&#8221;: for instance, &#8220;a <strong><span class="hilit">major</span></strong> category of meaning&#8221;, &#8220;an aorist <strong><span class="hilit">main</span></strong> verb&#8221;, &#8220;is <strong><span class="hilit">redundant</span></strong>&#8221; (?), &#8220;is not being <strong><span class="hilit">critical</span></strong> of&#8221;. I understand why some of these matched, but they don&#8217;t convince me that there&#8217;s deep understanding going on.</li>
<li>&#8220;good soil&#8221; also comes up empty, even though this phrase occurs verbatim in Luke 8.15.</li>
<li>&#8220;a herd of swine&#8221; gets in the neighborhood: it apparently bridges the gap between swine and pig, and finds Matt 8.31 (apparently getting to &#8220;drive&#8221; from &#8220;herd&#8221;?), and some other notes related to &#8220;herdsmen&#8221;. But surprisingly it misses Mark 5.11 which has &#8220;a herd of pigs&#8221;.</li>
<li>&#8220;Peter&#8217;s brother&#8221; first tries the interpretation of &#8220;brother&#8221; as &#8220;member of a religious order&#8221; (!), but there&#8217;s a nice interface where you can choose alternate senses. After selecting the &#8220;sibling&#8221; sense, it does better, though the results aren&#8217;t always appropriate (e.g. Matt 17.1).</li>
<li>You can try questions like &#8220;Where did Jesus live?&#8221;, though the responses look like it&#8217;s merely searching on individual content words, not the semantics of the proposition. &#8220;Where did Herod live?&#8221; brings back a few interesting results where &#8220;live&#8221; has been connected to &#8220;palace&#8221;, which then results in helpful information because his palace was in Jerusalem.</li>
</ul>
<p>Finding a use case for this particular demo comes down to finding an interesting intersection of several requirements: how many queries <em>are </em>there that</p>
<ul>
<li>you&#8217;d actually want to look for</li>
<li>you couldn&#8217;t easily find based on the words alone</li>
<li>don&#8217;t require synthesis or reasoning (that&#8217;s really asking too much of this technology)</li>
</ul>
<p>It was harder than i thought to come up with cases like this, and for most of them, the results still left something to be desired. But all critique aside, kudos to Cognition for being brave enough to put their technology out there and let the results speak for themselves. Real understanding of text is an extremely difficult task: it looks to me like Cognition has made substantial progress, though the problem is still far from solved.</p>
]]></content:encoded>
			<wfw:commentRss>http://semanticbible.com/blogos/2009/01/19/semantic-search-in-the-gospels/feed/</wfw:commentRss>
		</item>
		<item>
		<title>BibleTech 2009</title>
		<link>http://semanticbible.com/blogos/2008/10/24/bibletech-2009/</link>
		<comments>http://semanticbible.com/blogos/2008/10/24/bibletech-2009/#comments</comments>
		<pubDate>Fri, 24 Oct 2008 17:30:16 +0000</pubDate>
		<dc:creator>Sean</dc:creator>
		
		<category><![CDATA[Logos]]></category>

		<category><![CDATA[SemanticBible]]></category>

		<guid isPermaLink="false">http://semanticbible.com/blogos/2008/10/24/bibletech-2009/</guid>
		<description><![CDATA[Things have been silent at Blogos for several months now: i needed to take a break and focus more intensely on moving along some of our major data projects at Logos (like the Bible Knowledgebase).
But i&#8217;m ready to get back to a more regular blogging schedule, and nothing gets the creative juices flowing like the [...]]]></description>
			<content:encoded><![CDATA[<p>Things have been silent at Blogos for several months now: i needed to take a break and focus more intensely on moving along some of our major data projects at Logos (like the <a href="http://semanticbible.com/blogos/category/bible-knowledgebase/">Bible Knowledgebase</a>).</p>
<p>But i&#8217;m ready to get back to a more regular blogging schedule, and nothing gets the creative juices flowing like the prospect of another <a href="http://www.bibletechconference.com/">BibleTech conference</a>! The first BibleTech (this past January) was one of the highlights of my year: here&#8217;s <a href="http://www.bibletechconference.com/speakers.htm">a list of 2008 speakers</a>, including two presentations by me (you can find links to the slides <a href="http://semanticbible.com/blogos/2008/01/30/more-bibletech08-followup/">here</a>, and there&#8217;s an MP3 for the Zoomable Bible talk <a href="http://www.logos.com/media/bibletech/the_zoomable_bible.mp3">here</a>, though be warned that it&#8217;s 150MB and non-streaming). So i&#8217;m really looking forward to the next one, March 28-29 in Seattle.</p>
<p>The call for presentations has gone out, and so i face the dilemma of choosing among lots of different ideas and topics, and deciding what to propose. So many smart people attended the last conference that i&#8217;d love to just sit around and talk tech for several days straight, but i probably have to focus on just one or two topics.</p>
<p>So here&#8217;s your chance to give me some feedback (and for me to learn whether anybody&#8217;s still listening!). I&#8217;m planning to blog about some of my presentation ideas in subsequent posts, and i&#8217;d love to hear your comments about them. Does the topic make sense? Would <span style="font-style: italic">you </span>want to hear about it? Is it compelling, relevant, important, &#8220;cool&#8221;? Is it too obscure, too far out there, too geeky? What can i improve from last year (if you attended one of my talks)? It would really help me to have some feedback on these questions, especially from those who attended last year and therefore have a good feel for what the conference is all about (but i&#8217;ll take any comments i can get).</p>
<hr />
<p>If you&#8217;re on Facebook, please join the <a href="http://www.facebook.com/home.php#/group.php?gid=30972637876">BibleTech group</a>.</p>
<p>Maybe <em>you</em> should be presenting at BibleTech 2009 too! <a href="http://www.bibletechconference.com/participate.htm">The call for participation</a> is open until Nov 3, and describes what we&#8217;re looking for, so get those abstracts in. And if i happen to mention a topic that you&#8217;re interested in presenting on, let me know and then go for it! There&#8217;s no shortage of things <em>i&#8217;d</em> like to talk about &#8230;</p>
]]></content:encoded>
			<wfw:commentRss>http://semanticbible.com/blogos/2008/10/24/bibletech-2009/feed/</wfw:commentRss>
<enclosure url="http://www.logos.com/media/bibletech/the_zoomable_bible.mp3" length="159057105" type="audio/mpeg" />
		</item>
		<item>
		<title>Collective Intelligence Applied to Biblical Studies</title>
		<link>http://semanticbible.com/blogos/2008/05/12/collective-intelligence-applied-to-biblical-studies/</link>
		<comments>http://semanticbible.com/blogos/2008/05/12/collective-intelligence-applied-to-biblical-studies/#comments</comments>
		<pubDate>Mon, 12 May 2008 21:52:29 +0000</pubDate>
		<dc:creator>Sean</dc:creator>
		
		<category><![CDATA[Learning]]></category>

		<category><![CDATA[SemanticBible]]></category>

		<guid isPermaLink="false">http://semanticbible.com/blogos/2008/05/12/collective-intelligence-applied-to-biblical-studies/</guid>
		<description><![CDATA[Collective intelligence is a broad term covering many cases where intelligence or novel information result from the collaborative activities of many individuals. Recent and well-known examples include sites like

Wikipedia, where people work together to create encyclopedia-like content
del.icio.us: i label (or &#8216;tag&#8217;) web page content, and others can look at my tags, or lots of people&#8217;s [...]]]></description>
			<content:encoded><![CDATA[<p><a title="Wikipedia: Collective intelligence" href="http://en.wikipedia.org/wiki/Collective_intelligence">Collective intelligence</a> is a broad term covering many cases where intelligence or novel information result from the collaborative activities of many individuals. Recent and well-known examples include sites like</p>
<ul>
<li><a title="Wikipedia" href="http://en.wikipedia.org/">Wikipedia</a>, where people work together to create encyclopedia-like content</li>
<li><a href="http://del.icio.us/">del.icio.us</a>: i label (or &#8216;tag&#8217;) web page content, and others can look at my tags, or lots of people&#8217;s tags, to find things of interest.</li>
<li><a href="http://slashdot.org/">slashdot</a>, <a href="http://digg.com">digg</a>, <a href="http://reddit.com">reddit</a>, and similar sites that collect votes on the interest of web pages and then rank the pages by popularity</li>
</ul>
<p>Though perhaps more popular in the last few years, these kinds of approaches have been around for some time. Google&#8217;s dominance of web search, arguably the current &#8220;killer app&#8221; on the internet (along with email), comes from a kind of collective intelligence. Their PageRank algorithm uses the number of links to a page from other web sites to estimate how important the page is, and to assign its rank in the results you get back from a web search.</p>
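<p>To make the link-counting idea concrete, here is a minimal sketch of the textbook form of PageRank, computed by repeated redistribution of rank along links. It is an illustration only, not Google&#8217;s production system: the function name, the toy link graph, the damping factor of 0.85, and the fixed iteration count are all assumptions chosen for the example.</p>

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate page importance from a dict mapping each page to the pages it links to."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    # Start with every page equally important.
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page keeps a small baseline share of rank.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank on, split evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: three pages link to "c", and "c" links back to "a".
toy_web = {
    "a": ["c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}
ranks = pagerank(toy_web)
best_page = max(ranks, key=ranks.get)
```

<p>In this toy graph the page with the most incoming links ends up with the highest score, which is the sense in which many independent linking decisions add up to a collective judgment.</p>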
<p>The interesting question to me is how collective intelligence might be usefully applied to Biblical studies. There have been a few projects in this area, though i think it&#8217;s fair to say they haven&#8217;t yielded too much yet. I&#8217;ve written a few posts (<a href="http://semanticbible.com/blogos/2007/10/30/youversion-and-bible-20/">here</a>, and more than 2 years ago <a href="http://semanticbible.com/blogos/2006/01/10/bible-study-20/">here</a>) about applying &#8220;Web 2.0&#8221; ideas to Bible study. <a href="http://youversion.com/">YouVersion</a> is perhaps the most promising of that bunch, but it still doesn&#8217;t collect nearly enough intelligence to really be different (meaning that the scale is too small, not that the comments are too stupid <img src='http://semanticbible.com/blogos/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> ).</p>
<p>Another interesting set of data come from the ESV Bible Blog, where they analyzed their web searches to identify <a href="http://www.esv.org/blog/2005/12/what.are.the.most.popular.verses.in.the.bible">the most popular verses in the Bible</a>. This provides some well-grounded analysis of people&#8217;s actual behavior (which is always better than <em>guessing</em> what they do). But as such it&#8217;s still just data, not information or knowledge (more about that <a title="Blogos post: Data, Information, Knowledge, and Bible Study" href="http://semanticbible.com/blogos/2007/07/24/data-information-knowledge-and-bible-study/">in this rather conceptual post</a> about the difference between data, information, and knowledge). In other words, how do we <em>apply</em> this data to do something new and different when it comes to Bible study?</p>
<p>Here&#8217;s one example collective intelligence project i&#8217;ve pondered (though i haven&#8217;t yet found time to actually construct it): identifying parables in the Gospels. We have numerous sayings of Jesus throughout the Gospels that use stories, allegories, or other metaphorical language to make a point. Some of these are explicitly described in their context as parables: for example, <a class="bibleref" href="http://www.gnpcb.org/esv/search/?q=mark+4">Mark 4:2</a> tells us</p>
<blockquote><p>And he was teaching them many things in parables, and in his teaching he said to them &#8230;</p></blockquote>
<p>We conventionally refer to the story that follows in <a class="bibleref" href="http://www.gnpcb.org/esv/search/?q=mark+4.3-8">Mark 4:3-8</a> as &#8220;the Parable of the Soils&#8221; (or, perhaps less appropriately given the focus of the story, the Parable of the Sower). However, other stories with the same character aren&#8217;t explicitly called parables in the text, or are labeled as parables in one gospel but not another. In fact, the Greek word parabolē (from which our word parable is a straightforward transliteration) doesn&#8217;t occur in <cite title="John" class="bibleref">the Gospel of John</cite> at all, though several of the teachings recorded there have a style similar to that of parables from the Synoptic Gospels.</p>
<p>If you consult the various Bible reference works, many of which contain lists of the parables of Jesus, you find a great deal of disagreement as to which passages are and are not parables. Not surprisingly, this also reflects divergence of opinion as to what ought to be <em>considered</em> a parable: only those instances where the term parabolē is used? Those as well as parallel stories? Any kind of figurative language? Wilmington&#8217;s Book of Bible Lists lists 38 parables of Jesus (several of which occur in multiple Gospels); the Baker Encyclopedia of the Bible lists 40; Harper&#8217;s Bible Dictionary has only 26 (plus a few others found only in the Gospel of Thomas).</p>
<p>Here&#8217;s a good candidate for applying collective intelligence to a real issue in Biblical studies: what should we list as a parable? You could approach it like this:</p>
<ul>
<li>Identify the entire set of candidate passages that anybody anywhere has considered, or might consider, a parable (and maybe throw in a few others as a control group)</li>
<li>Create a web site where people could log in and simply vote up or down on each passage: Parable or Not?</li>
<li>Along with their votes, each participant should record their criteria for voting</li>
<li>Participants could also log in as <span style="font-style: italic">proxies</span> for existing reference lists or scholarly authorities and enter (as votes) what Wilmington, Dodd or Jeremias called a parable.</li>
</ul>
<p>I&#8217;d think at least 100 participants would be required to make this exercise in distributed Biblical scholarship meaningful, and some might turn their noses up at the thought of letting unwashed masses have an equal say with the scholars. But wouldn&#8217;t this be an interesting exercise? In particular, rather than &#8220;the list&#8221; of parables, it would give us the basis for a distribution of opinions: for example, 95% might agree that <a class="bibleref" href="http://www.gnpcb.org/esv/search/?q=mark+4.3-8">Mark 4:3-8</a> is a parable, while perhaps only 10% would label <a class="bibleref" href="http://www.gnpcb.org/esv/search/?q=John.15.1-17">Jesus&#8217; saying about the vine and vinedresser (John 15:1-17)</a> that way. And the criteria might provide some interesting clusters of votes. I&#8217;d love to add this kind of data to the <a title="Composite Gospel Index" href="http://www.semanticbible.com/cgi/cgi-overview.html">Composite Gospel</a>. In fact, that&#8217;s what started the idea: i sat down to label the parables, and quickly realized this wasn&#8217;t a straightforward task.</p>
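<p>As a rough sketch of how such votes could be turned into a distribution of opinions, the following tallies up/down votes per passage and reports the share of voters who called each one a parable. Everything here is hypothetical: the function name, the voter names, the proxy label, and the sample votes are invented for illustration and are not real survey data.</p>

```python
from collections import defaultdict


def parable_distribution(votes):
    """Return, for each passage, the fraction of voters who called it a parable."""
    yes = defaultdict(int)
    total = defaultdict(int)
    for _voter, passage, is_parable in votes:
        total[passage] += 1
        if is_parable:
            yes[passage] += 1
    return {passage: yes[passage] / total[passage] for passage in total}


# Invented sample votes: (voter, passage, voted "parable"?).
# A "proxy:" voter stands in for a published reference list.
sample_votes = [
    ("alice", "Mark 4:3-8", True),
    ("bob", "Mark 4:3-8", True),
    ("proxy:some-reference-list", "Mark 4:3-8", True),
    ("alice", "John 15:1-17", False),
    ("bob", "John 15:1-17", True),
    ("carol", "John 15:1-17", False),
    ("dave", "John 15:1-17", False),
]
distribution = parable_distribution(sample_votes)
```

<p>Recording each voter&#8217;s stated criteria alongside the vote would be a natural extension, since that is what would allow votes to be clustered by criterion.</p>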
<p><strong>Additional resource</strong>: The Horizon Project is one product of the New Media Consortium that &#8220;charts the landscape of emerging technologies for teaching, learning and creative expression&#8221;. In my view, seminary education as well as pastoral preaching and teaching belong among this target audience. The Horizon Project produces an annual report on what&#8217;s here now, coming soon in the mid-term, and on the far-term horizon (3-5 years). Collective intelligence is one of their far-term horizon technologies: you can read more about it in <a href="http://www.nmc.org/pdf/2008-Horizon-Report.pdf">the Horizon Report</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://semanticbible.com/blogos/2008/05/12/collective-intelligence-applied-to-biblical-studies/feed/</wfw:commentRss>
		</item>
		<item>
		<title>The Semantic Web as Data + Intelligence</title>
		<link>http://semanticbible.com/blogos/2008/04/24/the-semantic-web-as-data-intelligence/</link>
		<comments>http://semanticbible.com/blogos/2008/04/24/the-semantic-web-as-data-intelligence/#comments</comments>
		<pubDate>Thu, 24 Apr 2008 22:05:15 +0000</pubDate>
		<dc:creator>Sean</dc:creator>
		
		<category><![CDATA[Bible Knowledgebase]]></category>

		<category><![CDATA[SemanticBible]]></category>

		<guid isPermaLink="false">http://semanticbible.com/blogos/2008/04/24/the-semantic-web-as-data-intelligence/</guid>
		<description><![CDATA[Talking with Talis is rapidly becoming my favorite podcast source: Paul Miller has a lot of really interesting guests addressing topics at the intersection of libraries and the Semantic Web.
Today i listened to an interview with Dr. Jim Hendler, now at Rensselaer Polytechnic Institute, but previously at University of Maryland and  a key figure [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://talk.talis.com/">Talking with Talis</a> is rapidly becoming my favorite podcast source: Paul Miller has a lot of really interesting guests addressing topics at the intersection of libraries and the Semantic Web.</p>
<p>Today i listened to <a href="http://talk.talis.com/archives/2008/03/jim_hendler_tal.html">an interview with Dr. Jim Hendler</a>, now at Rensselaer Polytechnic Institute, but previously at the University of Maryland and a key figure in the establishment of OWL during his tenure at DARPA. My comments here are really just a rehash of some things he said much better, and with much more authority (given his history in the field) &#8212; but blame me, not him, for what i say below.</p>
<p>The concept of the Semantic Web brings together two different communities, along with their respective priorities and technologies. Many of the disagreements within what looks like a single community are just two sets of people talking about different things (but using similar terminology). The &#8220;semantic&#8221; part is mostly represented by the Artificial Intelligence community, with interests in careful ontology development, deep reasoning, theoretical correctness, and academic activities. The &#8220;web&#8221; community has been out there for more than a decade, building the World Wide Web with HTML and lots and lots of data, and is now looking for ways to make it more useful, connected, and extensible.</p>
<p>You can represent these two concerns as two axes on a graph, and many different endeavors tend strongly toward one side or the other, depending on whether they emphasize the &#8220;intelligence&#8221; dimension, or the &#8220;data&#8221; dimension.  Just a few examples on the data side (that could be multiplied many times over):</p>
<ul>
<li><a href="http://blogs.zdnet.com/semantic-web/?p=114">Yahoo plans to start indexing RDFa content</a> (i discussed this a bit in my post about <a href="http://semanticbible.com/blogos/2008/03/24/bibleref-and-rdfa/">Bibleref and RDFa</a>). As one of the major web players, this adds just a little more intelligence to a lot of data (potentially: users still have to create RDFa markup)</li>
<li><a href="http://freebase.com/">Freebase</a> is harvesting data from Wikipedia and other sources, and then adding a modest amount of structured relations.</li>
<li><a href="http://www.talis.com/">Talis</a> has their own set of data from a long history of library applications.</li>
</ul>
<p>On the &#8220;intelligence&#8221; side would be big ontology development efforts, and academics working on reasoning: Hendler also called out pharmaceutical companies as tending toward this dimension. Hendler&#8217;s own bet is that progress is more likely to come from data-side approaches than the hard-core intelligence side (and i think he&#8217;s right). He sees the combination of <a href="http://www.w3.org/2007/12/sparql-pressrelease">SPARQL</a> and persistent identifiers as two recent developments that are likely to move the field ahead: these are things i&#8217;m looking at closely as well in Bible Knowledgebase development (more on the second one to come soon).</p>
]]></content:encoded>
			<wfw:commentRss>http://semanticbible.com/blogos/2008/04/24/the-semantic-web-as-data-intelligence/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Bibleref and RDFa</title>
		<link>http://semanticbible.com/blogos/2008/03/24/bibleref-and-rdfa/</link>
		<comments>http://semanticbible.com/blogos/2008/03/24/bibleref-and-rdfa/#comments</comments>
		<pubDate>Mon, 24 Mar 2008 16:06:47 +0000</pubDate>
		<dc:creator>Sean</dc:creator>
		
		<category><![CDATA[SemanticBible]]></category>

		<guid isPermaLink="false">http://semanticbible.com/blogos/2008/03/24/bibleref-and-rdfa/</guid>
		<description><![CDATA[There&#8217;s a thoughtful post at OpenBible.info about Bibleref and an emerging W3C standard called RDFa which provides another possible approach to identifying references to Bible passages on the web. The OpenBible.info post provides a good example of how these two approaches might compare: the question is which approach makes more sense.
My interest in Bibleref is [...]]]></description>
			<content:encoded><![CDATA[<p>There&#8217;s <a href="http://www.openbible.info/blog/2008/03/yahoo-bibleref-and-rdfa/">a thoughtful post</a> at OpenBible.info about <a title="SemanticBible: Bibleref overview" href="http://semanticbible.com/bibleref/bibleref-overview.html">Bibleref</a> and an emerging W3C standard called <a href="http://www.w3.org/TR/xhtml-rdfa-primer/">RDFa</a> which provides another possible approach to identifying references to Bible passages on the web. The OpenBible.info post provides a good example of how these two approaches might compare: the question is which approach makes more sense.</p>
<p>My interest in Bibleref is only to achieve a practical goal: making it easier to distinguish and characterize citations of Biblical passages. So my pragmatic answer is, whatever approach gets us closer to that goal.</p>
<p>In the context of emerging Internet technologies and practices, Bibleref attempts to solve a specific, small problem that is nevertheless representative of a much wider set of challenges:</p>
<blockquote><p>How to &#8220;upgrade&#8221; the web as it exists today (mostly display-oriented prose) to more structured and meaningful data</p>
<ul>
<li>without breaking things that already work</li>
<li>without requiring too much effort from web page authors</li>
<li>in a way that will carry us forward into whatever the future Web turns out to be</li>
</ul>
</blockquote>
<p>One consequence of the unbridled growth of the World Wide Web is lots of conflicting standards. There&#8217;s a constant tension between those who think carefully about requirements and want a neat and tidy approach, versus those who just want to get something practical done. Both approaches have their merits, but it&#8217;s clear the Web as it exists today would never have developed if it hadn&#8217;t been possible for excited individuals to go write their own web pages with a simple text editor.</p>
<p>Microformats are more toward the &#8220;pave the cowpaths&#8221; end of this continuum, focusing on re-using existing HTML constructs in slightly more semantic fashion. RDFa requires somewhat more overhead in the use of namespaces and the RDF model (not too surprising given its W3C sponsorship). But, if you think Yahoo Search&#8217;s Amit Kumar is right (i do) that <a href="http://www.ysearchblog.com/archives/000527.html">search is the &#8220;killer app&#8221; for the Semantic Web</a>,  the issue of how all that data will get indexed and processed is as significant as how it will get authored in the first place. Since each microformat is a new special case, and searching for Bible references isn&#8217;t quite in the mainstream of the Internet economy, it&#8217;s easy to question whether a microformat-based Bibleref standard will ever achieve critical mass.</p>
<p>I&#8217;m rambling a bit: here&#8217;s a more coherent summary of where i see things today, slightly less than a year after <a href="http://semanticbible.com/blogos/2007/05/24/annotating-scripture-references-in-blog-posts-a-modest-proposal/">my first Bibleref proposal</a>:</p>
<ul>
<li>the &#8220;semantic HTML&#8221; form of the current Bibleref proposal makes sense, and would be easy enough for bloggers and other content publishers to use at web scale.</li>
<li>Logos&#8217; <a href="http://www.logos.com/reftagger">RefTagger</a> makes it even easier, by identifying the most typical kinds of references. However, its results are dynamic, not persistent in the actual markup of the web page, so there&#8217;s currently no way for search engines to benefit from this processing. I&#8217;m hoping that we&#8217;ll eventually be able to provide a tool that actually generates Bibleref-style markup in pages.</li>
<li>Using RDFa would <em>also</em> be a workable way to get more Bibleref markup onto the web (for those who understand it and are motivated enough). Though there may be practical reasons to promote one approach over another, from a technical standpoint they have the same result (identifying and normalizing references), and so one doesn&#8217;t have to exclude the other.</li>
<li>until there&#8217;s a way to <strong>actually search for Bibleref</strong>, none of this will make much difference, because content producers won&#8217;t have the payoff to motivate them to take the extra trouble. For that reason, i&#8217;m more inclined to bet on Yahoo, and if RDFa is how they plan to index and expose this content, that might make more sense (though i&#8217;d like to understand better how this will all work).</li>
</ul>
<p>(You&#8217;ll find some more thoughts in <a href="http://semanticbible.com/other/presentations/2008-bibleref/Bibleref.html">my BibleTech 2008 talk on Bibleref</a>.)</p>
]]></content:encoded>
			<wfw:commentRss>http://semanticbible.com/blogos/2008/03/24/bibleref-and-rdfa/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Using Word Tree Visualization for Checking Title Consistency</title>
		<link>http://semanticbible.com/blogos/2008/01/31/using-word-tree-visualization-for-consistency-checking-titles/</link>
		<comments>http://semanticbible.com/blogos/2008/01/31/using-word-tree-visualization-for-consistency-checking-titles/#comments</comments>
		<pubDate>Thu, 31 Jan 2008 19:33:06 +0000</pubDate>
		<dc:creator>Sean</dc:creator>
		
		<category><![CDATA[SemanticBible]]></category>

		<category><![CDATA[Visualization]]></category>

		<guid isPermaLink="false">http://semanticbible.com/blogos/2008/01/31/using-word-tree-visualization-for-consistency-checking-titles/</guid>
		<description><![CDATA[I&#8217;ve gotten a lot of positive comments on my Zoomable Bible talk from BibleTech:08. While the prototype i showed was little more than a conceptual toy, i think people liked it because

animated visualizations are just plain cool, but even more importantly,
visualizations (like zoomable user interfaces) provide a different view of the text than our linear [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve gotten a lot of positive comments on my <a title="Presentation: the Zoomable Bible" href="http://semanticbible.com/other/presentations/2008-zoomable/main.html">Zoomable Bible talk</a> from BibleTech:08. While the prototype i showed was little more than a conceptual toy, i think people liked it because</p>
<ol>
<li>animated visualizations are just plain cool, but even more importantly,</li>
<li>visualizations (like zoomable user interfaces) provide a different view of the text than our linear print legacy has previously encouraged.</li>
</ol>
<p>However, the real test of a visualization isn&#8217;t its coolness, but rather whether it helps you understand things that are otherwise difficult to grasp. I had a good example of that this morning, and walking through it might help others see the value of this tool.</p>
<p>I wrote a year ago about IBM&#8217;s <a title="Post: Visualizing Bible Data at Many Eyes" href="http://semanticbible.com/blogos/2007/01/25/visualizing-bible-data-at-many-eyes/">Many Eyes</a> site, which provides a host of easy-to-use visualization tools: you upload your data set, choose a visualization technique, and voila, you&#8217;ve got a sharable visualization! I&#8217;ve posted a few data sets and visualizations previously, like:</p>
<ul>
<li><a title="Visualization: Top 50 Bible Women" href="http://services.alphaworks.ibm.com/manyeyes/view/SGXXRFsOtha6CgneXfvNG2-">Top 50 Bible Women by frequency and dispersion (scatterplot)</a></li>
<li><a title="Visualization: Composite Gospel Index" href="http://services.alphaworks.ibm.com/manyeyes/view/SMGTJEsOtha6cC-5aqYKE2-">the Composite Gospel Index (treemap)</a></li>
</ul>
<p>(the entire collection of my data and visualizations is <a href="http://services.alphaworks.ibm.com/manyeyes/user/usyHEsOtha654-79NEIE2-#contributions">here</a>), and lots of others have posted interesting visualizations of Bible data as well. Of course, if you want fine control over the visualization, you&#8217;re probably not going to get it from these pre-packaged techniques. But it&#8217;s pretty impressive how much you can do with what&#8217;s there, and this is an easy way to learn about and sample different visualization techniques: if you&#8217;re a data-oriented person, i&#8217;d strongly encourage you to check it out.</p>
<p>One of their text-oriented visualization techniques is the <a href="http://services.alphaworks.ibm.com/manyeyes/page/Word_Tree.html">word tree</a>, which provides a kind of visual concordance for free text. <a href="http://services.alphaworks.ibm.com/manyeyes/view/SmzKjIsOtha6B-kHZCLjI2-">This example</a> of the KJV text of Genesis is a good illustration: type a word in the search box at the top and hit return, and you can see all the phrases that start with that word. You can also turn it around and find phrases ending with a word, and sort by frequency. <a href="http://jtauber.com/">James Tauber</a> has also used the word tree technique for <a title="Visualization: New Testament Greek Nominal Suffixes" href="http://services.alphaworks.ibm.com/manyeyes/view/SgoRsIsOtha66N-wO-cwI2-">visualizing NT Greek nominal suffixes</a>.</p>
<p>I found a new use for word trees today, in reviewing titles for the <a title="Composite Gospel Index" href="http://www.semanticbible.com/cgi/cgi-overview.html">Composite Gospel Index</a> (CGI). One motivation for creating the CGI a few years back was to make it easier to get an overview of the combined content of the four Gospels. Pericope titles are meant to help with this by effectively summarizing the content of a single story, and i deliberately tried to regularize their content. In particular, i wanted as many as made sense to start like &#8220;Jesus  &#8230;&#8221;, to try to show the commonality: &#8220;Jesus teaches about &#8230;&#8221;, &#8220;Jesus heals &#8230;&#8221;, &#8220;Jesus tells the parable of &#8230;&#8221;, etc.</p>
<p>Word trees are a perfect tool for data like this, because they make it easy to find phrases that start the same. Conversely, they tend to visually isolate phrases that start the same but then end differently. I&#8217;ve created a word tree for <a title="Visualization: Pericope Titles from the Composite Gospel Index" href="http://services.alphaworks.ibm.com/manyeyes/view/SmAgULsOtha6FLG_RYHoL2-">titles from CGI pericopes</a> (unfortunately, i haven&#8217;t figured out how to embed the visualization live here in my blog: WordPress keeps eating the script element). The input data to word trees are normally free text, but in my case each title is a complete unit: so i just wrapped each one in the special tokens +start+ and +end+, making the input data look like this (except that, as viewed raw on the site, it&#8217;s all wrapped and hence not so readable).</p>
<blockquote><p>+START+  Jesus is the Word  +END+<br />
+START+  God became a human being  +END+<br />
+START+  Jesus&#8217; ancestry back to Adam  +END+<br />
+START+  Jesus&#8217; ancestry from Abraham  +END+<br />
+START+  Luke&#8217;s purpose in writing  +END+<br />
+START+  The angel Gabriel promises the birth of John to Zechariah  +END+</p></blockquote>
<p>etc., for all 355 pericopes.</p>
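<p>Preparing input in this shape takes only a few lines. The sketch below wraps each title in the start and end tokens shown above; the function name and the three-title sample list are assumptions for illustration, and the actual script used for the CGI data is not shown in this post.</p>

```python
def wrap_titles(titles, start="+START+", end="+END+"):
    """Surround each title with boundary tokens so a word tree treats it as one unit."""
    return [f"{start}  {title}  {end}" for title in titles]


# A small sample of pericope titles quoted above.
titles = [
    "Jesus is the Word",
    "God became a human being",
    "Luke's purpose in writing",
]

# One wrapped title per line, ready to upload as a plain-text data set.
word_tree_input = "\n".join(wrap_titles(titles))
```

<p>The boundary tokens are what let a search like &#8220;+start+ jesus&#8221; match only titles that begin with Jesus, rather than every occurrence of the word.</p>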
<p>So if you enter &#8220;+start+ jesus&#8221; in the search box (or just click on Jesus in the default view), you&#8217;ll see the various titles that start with the word Jesus (255 of 355, or 72%: punctuation becomes a separate token, so a few starting with &#8220;Jesus&#8217; &#8230;&#8221; aren&#8217;t included). This works even better sorted by frequency: here you can clearly see the most frequent pericope title is &#8220;Jesus teaches &#8230;&#8221;, and clicking on &#8220;teaches&#8221; narrows the view further (which you pretty much have to do to see the details: results over 30 or 40 aren&#8217;t really visible). One advantage of this representation is that it gives you some help in knowing what to explore (in user interface terminology, an <a title="Wikipedia: Affordance" href="http://en.wikipedia.org/wiki/Affordance">affordance</a>). Though i can&#8217;t see all the details without zooming in, i can see a significant cluster of titles starting with &#8220;Jesus warns&#8221;, and if that&#8217;s interesting, i can click on &#8220;warns&#8221; to zoom in and see those 18 titles.</p>
<p>This last case also points out a benefit i hadn&#8217;t previously considered, which is consistency checking (finally getting to the main topic of this post). Looking at the frequency-sorted suffixes for &#8220;+start+ Jesus warns&#8221;, i see a large group under &#8220;against&#8221;, and a number under &#8220;about&#8221;, but also a single instance, &#8220;Jesus warns of coming judgment&#8221;. Because the third word is &#8220;of&#8221; rather than &#8220;about&#8221;, it stands apart from the other instances which really share the same concept. This could just as easily be re-worded &#8220;Jesus warns <em>about</em> coming judgment&#8221;, and made more consistent with other similar pericopes. Given my goal of consistency (in order to enable just these kinds of visualizations!), it&#8217;s really useful to identify cases like this, where a minor revision retains the meaning but also makes the data more consistent. The word tree visualization made it easy to enter &#8220;+start+ John&#8221; and find the one case where, instead of &#8220;John the Baptist&#8221;, i just put &#8220;John baptizes Jesus.&#8221;</p>
<p>What would be <em>really</em> great would be to turn this from a visualization into a navigation system, so once i&#8217;ve drilled down to &#8220;Jesus warns against &#8230;&#8221;, then i could select a title and actually view the pericope text. That&#8217;s beyond the scope of Many Eyes&#8217; toolkit, but something i expect to be working on in the future.</p>
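<p>The same consistency check can be approximated without the visualization. The sketch below groups titles by a shared opening phrase, counts the word that follows, and reports continuations that occur only once. The tokenizer (words and punctuation marks as separate tokens) is an assumption about how Many Eyes splits text, the function names are hypothetical, and all sample titles except &#8220;Jesus warns of coming judgment&#8221; are invented.</p>

```python
import re
from collections import Counter


def tokenize(title):
    """Split a title into words, with each punctuation mark as its own token."""
    return re.findall(r"\w+|[^\w\s]", title)


def rare_continuations(titles, prefix, max_count=1):
    """List the words that follow `prefix` in at most `max_count` titles."""
    prefix_tokens = [token.lower() for token in tokenize(prefix)]
    n = len(prefix_tokens)
    counts = Counter()
    for title in titles:
        tokens = [token.lower() for token in tokenize(title)]
        # Only count titles that start with the prefix and have a next word.
        if tokens[:n] == prefix_tokens and len(tokens) > n:
            counts[tokens[n]] += 1
    return sorted(word for word, count in counts.items() if max_count >= count)


# Mostly invented titles; only the last one is quoted from the post.
sample_titles = [
    "Jesus warns against hypocrisy",
    "Jesus warns against greed",
    "Jesus warns about false teachers",
    "Jesus warns about persecution",
    "Jesus warns of coming judgment",
]
outliers = rare_continuations(sample_titles, "Jesus warns")
```

<p>A one-off continuation is only a candidate for rewording, not proof of an inconsistency, so the output is best treated as a review list.</p>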
]]></content:encoded>
			<wfw:commentRss>http://semanticbible.com/blogos/2008/01/31/using-word-tree-visualization-for-consistency-checking-titles/feed/</wfw:commentRss>
		</item>
		<item>
		<title>More BibleTech:08 Followup</title>
		<link>http://semanticbible.com/blogos/2008/01/30/more-bibletech08-followup/</link>
		<comments>http://semanticbible.com/blogos/2008/01/30/more-bibletech08-followup/#comments</comments>
		<pubDate>Wed, 30 Jan 2008 19:20:00 +0000</pubDate>
		<dc:creator>Sean</dc:creator>
		
		<category><![CDATA[Logos]]></category>

		<category><![CDATA[SemanticBible]]></category>

		<category><![CDATA[Visualization]]></category>

		<guid isPermaLink="false">http://semanticbible.com/blogos/2008/01/30/more-bibletech08-followup/</guid>
		<description><![CDATA[Additional posts of presentations and blog reviews about BibleTech:08 have continued to trickle in: there are even some photos, like this one taken during my Zoomable Bible talk.
I&#8217;ve finally got the slides up from my talks.

The Zoomable Bible. Abstract: Information visualization is an established computer technique for providing rich, typically interactive, visual presentations of complex [...]]]></description>
			<content:encoded><![CDATA[<p>Additional posts of presentations and blog reviews about BibleTech:08 have continued to trickle in: there are even some photos, like <a title="Photo: Zoomable Bible screenshot" href="http://flickr.com/photos/30843569@N00/2219718290/">this one</a> taken during my Zoomable Bible talk.</p>
<p>I&#8217;ve finally got the slides up from my talks.</p>
<ul>
<li><a title="Presentation: the Zoomable Bible" href="http://www.semanticbible.org/other/presentations/2008-zoomable/ZoomableBible.html">The Zoomable Bible</a>. Abstract: Information visualization is an established computer technique for providing rich, typically interactive, visual presentations of complex multivariate data. While increased computing power has made information visualization more common, our interfaces for navigating and browsing the Bible are still largely linear adaptations of traditional print forms. New interface paradigms (like Apple&#8217;s iPhone and Microsoft&#8217;s SeaDragon technology) can present large amounts of information on a traditionally-sized computer display through the use of Zoomable User Interfaces (ZUIs). This presentation will overview existing tools, applications, and principles for ZUIs and other visualizations, and explore some novel interfaces that give higher-level views of Biblical content.</li>
<li><a title="Presentation: Bibleref" href="http://semanticbible.com/other/presentations/2008-bibleref/Bibleref.html">Bibleref: a Microformat for Bible References</a>. Abstract: Microformats are &#8220;a set of simple, open data formats built upon existing and widely adopted standards&#8221; (see http://microformats.org) that capture small but important bits of information on web pages. Bibleref is a proposed microformat for identifying Bible references that are embedded in blog posts and other web content. Broad use of bibleref would enable search engines, content aggregators, and other automated tools to correctly label the references so they&#8217;re more easily searchable. This presentation will explain why bibleref is needed, explore the technical specifics, and discuss how to promote broader adoption.</li>
</ul>
<p>They&#8217;re not fully linked into the navigation structure of <a href="http://semanticbible.com/">SemanticBible</a> yet, but the direct URLs linked above (which i gave in the talk) work fine. I&#8217;ll probably also tweak the content a bit (i really need some screenshots for the Zoomable Bible talk), but i wanted to get the official version out without more delay. There are lots of links embedded in the presentations, especially the resources at the end of the Zoomable Bible talk, so look for blue text.</p>
<p>If you&#8217;re curious, i created these with Dave Raggett&#8217;s <a href="http://www.w3.org/Talks/Tools/Slidy/">Slidy</a> program (see <a href="http://semanticbible.com/blogos/2006/11/25/using-slidy/">this previous post</a>). Editing (X)HTML content for these by hand is still a little clunky (though i&#8217;ve gotten better at it), and it would be nice to have a WYSIWYG interface (i did lots of edit -> save -> switch to browser -> reload -> view cycles: it&#8217;s quick, but still painful). But the big payoff for me is that the result (unlike PowerPoint) is really a first-class citizen of the web. For example, all the content gets indexed by the search engines, you can link into the presentations (each page has an ID), and not only can i talk about web markup, i can illustrate the point in the body of the presentation itself (view the source of the Bibleref talk for examples). Yes, you can publish PowerPoint on the web, but that&#8217;s its own special challenge, which is why nobody does it: they just post .ppt files, which are largely opaque to web tools. The newer version of Slidy also improves browser compatibility: these presentations mostly work fine under IE (though you don&#8217;t get the footer).</p>
]]></content:encoded>
			<wfw:commentRss>http://semanticbible.com/blogos/2008/01/30/more-bibletech08-followup/feed/</wfw:commentRss>
		</item>
	</channel>
</rss>
