There’s a thoughtful post at OpenBible.info about Bibleref and RDFa, an emerging W3C standard that provides another possible approach to identifying references to Bible passages on the web. The OpenBible.info post gives a good example of how the two approaches compare; the question is which one makes more sense.

My interest in Bibleref is only to achieve a practical goal: making it easier to distinguish and characterize citations of Biblical passages. So my pragmatic answer is: whichever approach gets us closer to that goal.

In the context of emerging Internet technologies and practices, Bibleref attempts to solve a specific, small problem that is nevertheless representative of a much wider set of challenges:

How to “upgrade” the web as it exists today (mostly display-oriented prose) to more structured and meaningful data

  • without breaking things that already work
  • without requiring too much effort from web page authors
  • in a way that will carry us forward into whatever the future Web turns out to be

One consequence of the unbridled growth of the World Wide Web is lots of conflicting standards. There’s a constant tension between those who think carefully about requirements and want a neat and tidy approach, versus those who just want to get something practical done. Both approaches have their merits, but it’s clear the Web as it exists today would never have developed if it hadn’t been possible for excited individuals to go write their own web pages with a simple text editor.

Microformats sit toward the “pave the cowpaths” end of this continuum, focusing on re-using existing HTML constructs in a slightly more semantic fashion. RDFa requires somewhat more overhead, in the form of namespaces and the RDF model (not too surprising given its W3C sponsorship). But if you think, as i do, that Yahoo Search’s Amit Kumar is right that search is the “killer app” for the Semantic Web, then the issue of how all that data will get indexed and processed is as significant as how it will get authored in the first place. Since each microformat is a new special case that indexers have to handle separately, and searching for Bible references isn’t quite in the mainstream of the Internet economy, it’s easy to question whether a microformat-based Bibleref standard will ever achieve critical mass.
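To make the indexing side concrete, here is a minimal sketch of what the "special case" handling might look like for an indexer. It assumes Bibleref markup takes the form `<cite class="bibleref" title="John 3:16">Jn 3:16</cite>`, with the normalized reference in the `title` attribute; treat that shape, and the names `BiblerefCollector` and `extract_biblerefs`, as assumptions for illustration rather than as the definitive proposal.

```python
from html.parser import HTMLParser


class BiblerefCollector(HTMLParser):
    """Collect (normalized reference, displayed text) pairs from HTML.

    Assumption: references are marked up as <cite class="bibleref" title="...">.
    """

    def __init__(self):
        super().__init__()
        self.refs = []       # finished (normalized, displayed) pairs
        self._title = None   # title of the bibleref cite we are inside, if any
        self._text = []      # text fragments seen inside that cite

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "cite" and "bibleref" in classes:
            self._title = attrs.get("title") or ""
            self._text = []

    def handle_data(self, data):
        if self._title is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "cite" and self._title is not None:
            text = "".join(self._text).strip()
            # Fall back to the displayed text when no title was supplied.
            self.refs.append((self._title or text, text))
            self._title = None


def extract_biblerefs(html):
    """Return every Bibleref-style citation found in an HTML string."""
    parser = BiblerefCollector()
    parser.feed(html)
    parser.close()
    return parser.refs
```

An ordinary `<cite>` without the `bibleref` class is ignored, which is the point of the microformat: the page still renders as before, and only a consumer that knows the convention sees the extra structure.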

I’m rambling a bit: here’s a more coherent summary of where i see things today, slightly less than a year after my first Bibleref proposal:

  • The “semantic HTML” form of the current Bibleref proposal makes sense, and would be easy enough for bloggers and other content publishers to use at web scale.
  • Logos’ RefTagger makes it even easier, by identifying the most typical kinds of references. However, its results are dynamic, not persistent in the actual markup of the web page, so there’s currently no way for search engines to benefit from this processing. I’m hoping that we’ll eventually be able to provide a tool that actually generates Bibleref-style markup in pages.
  • Using RDFa would also be a workable way to get more Bibleref markup onto the web (for those who understand it and are motivated enough). Though there may be practical reasons to promote one approach over another, from a technical standpoint they have the same result (identifying and normalizing references), and so one doesn’t have to exclude the other.
  • Until there’s a way to actually search for Bibleref markup, none of this will make much difference, because content producers won’t have the payoff to motivate them to take the extra trouble. For that reason, i’m more inclined to bet on Yahoo, and if RDFa is how they plan to index and expose this content, RDFa might be the more sensible choice (though i’d like to understand better how this will all work).
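As a rough illustration of the markup-generating tool mentioned in the RefTagger item above, the sketch below wraps simple references found in plain text in Bibleref-style markup. Everything here is an assumption for illustration: the `<cite class="bibleref" title="...">` shape, the tiny `BOOKS` table, the function name `add_bibleref_markup`, and the pattern itself, which only handles the `Book chapter:verse` and `Book chapter:verse-verse` forms. It is not how RefTagger works.

```python
import re
from html import escape

# Tiny illustrative table mapping abbreviations to full book names.
# A real tool would need all books and many more abbreviations.
BOOKS = {
    "Gen": "Genesis",
    "Ps": "Psalms",
    "Jn": "John",
    "John": "John",
    "Rom": "Romans",
    "Romans": "Romans",
}

# Longest names first so "Romans" is tried before "Rom".
_BOOK_ALTERNATION = "|".join(
    sorted((re.escape(name) for name in BOOKS), key=len, reverse=True)
)

# Matches e.g. "Jn 3:16", "Rom. 8:28-30"; anything fancier is left untouched.
REF_PATTERN = re.compile(
    r"\b(" + _BOOK_ALTERNATION + r")\.?\s+(\d+):(\d+)(?:-(\d+))?\b"
)


def add_bibleref_markup(text):
    """Wrap recognized references in <cite class="bibleref"> elements.

    Expects text that is already safe to place in an HTML page.
    """

    def wrap(match):
        book, chapter, verse, end_verse = match.groups()
        normalized = f"{BOOKS[book]} {chapter}:{verse}"
        if end_verse:
            normalized += f"-{end_verse}"
        # Keep the author's original wording as the displayed text.
        return (
            f'<cite class="bibleref" title="{escape(normalized)}">'
            f"{match.group(0)}</cite>"
        )

    return REF_PATTERN.sub(wrap, text)
```

Because the output is written into the page itself rather than added dynamically in the browser, the normalized reference stays in the markup where a search engine could pick it up.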

(You’ll find some more thoughts in my BibleTech 2008 talk on Bibleref.)