Saturday, August 09, 2003

Someone emailed me to ask about setting up a Hyper-concordance for a different Bible translation. I thought others might be interested in this as well. I've posted the Perl code previously in hopes it might be helpful to others. Here's what's involved:

  1. You need the electronic text. While plenty of older, less popular, or more obscure translations are freely available, the newer and most popular ones are not, and it's a violation of copyright to post non-trivial portions. Whether you agree with this or not (it's a more complex issue than you might first think), that's the law. In particular, i unfortunately know of no unrestricted source for the New Revised Standard Version, New International Version, New American Standard Bible, etc. Of course, if you've purchased a personal copy and only want a hyper-concordance for your own personal use, then read on, but note that i used the Revised Standard Version (RSV) because i wanted to share the results with others. My program assumes the Open Scripture Information Standard (OSIS) as the format, though with some programming work you could use different formats as well. The key requirements are being able to identify verses (with some reference scheme) and words (tokens).
  2. You need to know something about Perl programming (assuming you may have to change the code a little). I'm happy to answer simple questions about the code, though i don't want to become a Perl tutor. I made no attempt to handle character encodings other than ASCII, though it wouldn't be hard to make this UTF-8 capable.
  3. The stopwords list is specific to the RSV, and might need adjusting, either because you want to include/exclude other words, or because your version uses different words than the RSV (after all, if it didn't, what's the point?). I did this step by hand, though it was only a few hours' work.
  4. Creating the file that maps inflected forms back to their bases (e.g. knew => know) is also a manual process, and was driven specifically by the RSV text. I'll probably improve this a little to make it more reusable for other projects.
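
As a rough illustration of how the pieces above fit together (verse references, tokens, a stopword list, and a map from inflected forms to base forms), here is a minimal sketch. It is written in Python, not the original Perl, and it is not the program described in this post. The verses are supplied as plain (reference, text) pairs instead of being parsed from OSIS, the text is toy text not quoted from any translation, and the names `VERSES`, `STOPWORDS`, `BASE_FORMS`, `tokenize`, and `build_index` are all made up for the example.

```python
import re
from collections import defaultdict

# Hypothetical sample data: (reference, text) pairs standing in for verses
# that would normally come from the electronic text. Toy text only.
VERSES = [
    ("Book.1.1", "In the beginning was the word"),
    ("Book.1.2", "The world knew him not"),
    ("Book.1.3", "I know my own, and my own know me"),
]

# Hand-built lists, as in steps 3 and 4 (tiny illustrative versions).
STOPWORDS = {"in", "the", "was", "and", "him", "not", "i", "my", "me"}
BASE_FORMS = {"knew": "know"}  # inflected form => base form


def tokenize(text):
    """Split text into lowercase ASCII word tokens, dropping punctuation."""
    return re.findall(r"[a-z']+", text.lower())


def build_index(verses):
    """Map each base word to the list of references whose verse contains it."""
    index = defaultdict(list)
    for ref, text in verses:
        for token in tokenize(text):
            if token in STOPWORDS:
                continue  # stopwords get no concordance entry
            base = BASE_FORMS.get(token, token)
            if ref not in index[base]:  # list each verse once per word
                index[base].append(ref)
    return dict(index)


index = build_index(VERSES)
print(index["know"])
```

Here "knew" and "know" end up under the same entry, so looking up `know` lists both `Book.1.2` and `Book.1.3`. A different translation would mostly mean different contents for the two hand-built tables.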

I guess there are two possible responses:

  • "that's all there is to it?!?" (in which case go for it), or
  • "huh?!?" (in which case i'm afraid i can't be of much help)

Even though i'm not volunteering to do your favorite translation for you, do let me know if i can help enable you to do it for yourself.


4:06:37 PM
My dad sent me a link to the Dolphin Stress Test ...
3:47:20 PM

I can't remember if i found this before :-/ (don't have Google operating over my new site yet), but the Linked Word project from Bob Jones University bears some superficial similarity to the New Testament Hyper-concordance.

The key difference: their links lead to definitions of words. I'm really interested in words as links to navigate the word space of Scripture. I'm working on some ideas for how to incorporate Strong's numbers more fully into this approach.
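
To make the "words as links" idea a little more concrete, here is a minimal Python sketch (not the site's actual code): every non-stopword in a verse becomes a link to a page that would list the other verses using that word. The URL pattern `words/<word>.html`, the function name `link_words`, and the small stopword set are assumptions made for illustration only.

```python
import html
import re

# Illustrative stopword set; a real one would be tuned to the translation.
STOPWORDS = {"the", "and", "in", "was", "with"}

WORD = r"[A-Za-z']+"


def link_words(text, url_pattern="words/{word}.html"):
    """Return HTML for text with each non-stopword wrapped in a link."""
    pieces = []
    # The capturing group makes re.split keep the words as well as the
    # punctuation and whitespace between them.
    for piece in re.split("(" + WORD + ")", text):
        key = piece.lower()
        if re.fullmatch(WORD, piece) and key not in STOPWORDS:
            href = url_pattern.format(word=key)
            pieces.append(
                '<a href="{}">{}</a>'.format(html.escape(href), html.escape(piece))
            )
        else:
            # Stopwords, punctuation, and spaces pass through (escaped).
            pieces.append(html.escape(piece))
    return "".join(pieces)


print(link_words("In the beginning"))
```

Linking to a definition instead would only change where the link points; the difference described above is in what the target page contains.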


3:04:43 PM
I enjoy listening to audio books when i garden or do other manual labor that doesn't require my brain. A neat discovery today was SermonIndex.net, with a couple of thousand sermons from well-known (and lesser-known) preachers.
2:42:45 PM
Ben Edgington has a nice explanation of using the English Standard Version web service via PHP.
8:32:11 AM