God’s Word | our words
meaning, communication, & technology
following Jesus, the Word made flesh
March 18th, 2013

BibleTech talk: Automatically Learning Topical Content

I’ve posted the slides from my BibleTech 2013 talk. Here’s the abstract:

Continued work on the Logos Controlled Vocabulary (BibleTech 2010, “A Controlled Vocabulary for Biblical Studies”) has produced a unique collection of topic-aligned content across more than 50 different Bible dictionaries, encyclopedias, and topical indexes in both English and Spanish. This presentation will describe the information we’re learning automatically from this content, including:

  • determining concept importance
  • associating concepts with Bible references
  • extracting and associating names and descriptive terms for concepts
  • relating concepts to each other

You can see the other talks at the BibleTech website. I’ve had a number of positive comments on the talk, which is always gratifying. Slowly but surely, we’re climbing up the data stack …


3/21/2013 update: I’ve used the Slidy framework for presentations for several years because i like the way it puts the whole content out on the web in HTML. However, @JohnRGentry pointed out that my slides don’t work on the iPad because moving forward and backward requires keys. There’s a newer version of Slidy which does support swiping to move through slides, though my experience with it on both Safari and Chrome on iOS hasn’t been great: it’s not easy to register a swipe, and title text sometimes gets lost. I assume these are issues with JavaScript support on iOS, though i’m not really sure. I’ll try to update my slides to the newest version of Slidy, which will help a little, but i’ll also look for another framework that’s more tablet friendly.

April 25th, 2011

BibleTech Talk Slides: Using the Bible Knowledgebase For Information Integration

Finally got my slides posted from BibleTech:2011 on Using the Bible Knowledgebase for Information Integration. Since i listened to good advice and went a little more toward graphics than bullet points, they’re not completely self-explanatory (but that’s why you should have come, right?).

Audio will show up too at some point, probably at http://www.bibletechconference.com/speakers.

As i’ve told a few of my colleagues since: giving the talk helped convince me even more strongly that Biblical Events will be a really important database for Bible study. Looking forward to getting it all put together.

March 26th, 2011

BibleTech 2011

I had to miss the first day because of another commitment, but today i’m here at BibleTech:2011 and looking forward to a great day of talks. Hopefully mine will be one of them: here’s my abstract.

Using the Bible Knowledgebase for Information Integration

In 2009 I reported on the Bible Knowledgebase (BK), a machine-readable collection of semantically-organized data about people, places, and things in the Bible. This talk will describe how the BK now functions as an essential information resource for Logos, tying together information across the software. In addition, I’ll discuss the continued work on the data over the last two years, including:

  • building a database of Biblical Events
  • adding unnamed entities to the database
  • coordinating information about these entities with the Logos Controlled Vocabulary

I’ll also present prototypes for visualizing BK data to enhance discovery and exploration in the Biblical text.

I’ll be live-blogging a few talks during the day to give a quick-take on the subject for those who can’t be here. You can also follow on Twitter via #BibleTech.

March 30th, 2010

BibleTech:2010 Debrief

The BibleTech conference is an annual highlight for those of us who work at the intersection of Bible stuff and technology, and last week’s meeting in San Jose was no exception. This was the third BibleTech — i’ve been fortunate to have attended (and presented at) them all — and there’s always a great mix of new ideas, updates on ongoing projects, and lots of interesting people to talk to. (some other reviews: Rick Brannan, Mike Aubrey, Trey Gourley)

Some of the talks i liked best this year:

  • I was already interested in Pinax before hearing James Tauber’s talk on Using Django and Pinax for Collaborative Linguistics: now i’m itching to get started!
  • Stephen Smith had a nice analysis of the most frequently tweeted Bible passages (though the evidence of vast swaths of Scripture that get very little attention was perhaps a bit depressing).
  • Neil Rees showed Concordance Builder, a program that lets you use a Swahili concordance to bootstrap one for Welsh (or any other pair of languages) with no linguistic knowledge. Building on the Paratext tool, it leverages the verse indexes along with approximate string matching and statistical glossing (technical paper by J D Riding) to produce results that are about 90-95% correct out of the box. This can reduce concordance development to a matter of weeks rather than years.
  • There were several talks related to semantics in addition to mine: Randall Tan talked about more automated methods and fleshed them out relative to the higher-level structure of Galatians, and Andi Wu gave what looked like a really interesting presentation on semantic search based on syntax and cross-language correspondence (alas, i missed it).
  • Weston Ruter talked about APIs they’re developing at OpenScriptures.org (and brought in the Linked Data idea). Logos also unveiled their new API for Biblia.

I felt my talks went well and i got some good feedback. My slides are now posted (if you wrote down URLs at the conference, i didn’t get them quite right :-( but here they’re correct):

(As with some previous talks, i did my presentation with Slidy (previous post): i feel like it’s going a little more smoothly each time.)

January 25th, 2010

BibleTech:2010 Talk – The Logos Controlled Vocabulary

The program for BibleTech:2010 has been up for a couple of weeks now, and i’ve been delinquent in failing to point that out. We’ve got a full roster of really interesting talks that span the gamut from friendly warm technology to hard-core geekishness: Bible translation, social media, Biblical linguistics, mobile computing, preaching, publishing, tweeting, and more. And this year, it’s in San Jose, CA: i’m hoping that will open up attendance to some folks who have the misfortune to not live in the beautiful Pacific NW. The dates are March 26-27, 2010.

I’ll be giving two talks this year: here’s my abstract for the first one, on the Logos Controlled Vocabulary.


Dozens of books provide terminology from the field of Biblical studies, principally Bible dictionaries, encyclopedias, and other subject-oriented reference works. However, the terminology used varies between books, authors, and publishers, and doesn’t always include all the terms a user might employ to find information.

The Logos Controlled Vocabulary (LCV) organizes content from multiple Bible dictionaries to integrate information across the Logos library. As a controlled vocabulary, the LCV identifies, organizes, and systematizes a specific set of terms for indexing content, capturing inter-term relationships, and expressing term hierarchies. Like other kinds of metadata, this infrastructure then supports applications in search, discovery, and general knowledge management. The initial version of the LCV (shipping now with Logos 4) comprises some 11,100 terms, and continues to grow as more reference works are added. It also provides the backbone of http://topics.logos.com, a website for user contributions.

This talk will describe the building of the LCV, how we’re using it now, and how we plan to use and extend it in the future. This includes some interesting new capabilities for machine learning from existing prose content. For example:

  • what are the prototypical Bible references, names, or phrases used to discuss a topic?
  • can we learn anything about the importance of topics by looking at how much is written about them, how many dictionaries cover them, and other kinds of automated analysis?
  • what knowledge can be gleaned from the topology of terminology linkage (what links to what)?

Update: we’ve decided in general to retire the “Libronix” name for Logos technologies, so i’m trying to get on board by starting to call this the Logos Controlled Vocabulary.

March 30th, 2009

BibleTech:2009 Postlude

BibleTech:2009 is past now, and (just like last year) was a great opportunity both to hear new ideas about Bible and technology and to meet and talk with many others with common interests. The few scattered thoughts i jotted down as i was live-blogging talks certainly don’t do justice to the richness of many of the presentations: so don’t judge the quality of their talks by my quick-take notes.

I’ve got slides from my talk on the Bible Knowledgebase posted now on SemanticBible: the navigational structure above them isn’t in place yet, but you should be able to follow the link directly to get there. Once again, i’ve used Slidy for the presentation, and that process went a little more smoothly this time (which probably just means i’ve gotten better at it). View the source if you want to see how it works.

[Important note: if you were at my talk and wrote down the URL for the slides, i had it wrong. The correct URL is:

http://semanticbible.com/other/talks/2009/bibletech/BK.html

Yes, i know that Cool URIs don't change, which is why i wanted to make this one adjustment before publishing them, so i won't have to change it in the future.]

At some point there should be audio from the talk posted on the BibleTech site (probably on the BibleTech speakers page, which has links to talks from last year and audio where available). Future Blogos posts on the Bible Knowledgebase will go in my WordPress category of that name (RSS feed here), and will also be tagged with bk if you want to follow along.

March 27th, 2009

BibleTech09 is On!

I’ll be live-blogging selected sessions from BibleTech today and tomorrow. If you’re here (at the Seattle Hilton), come hear my talk this afternoon at 3!

November 4th, 2008

More BibleTech 2009 Topics

Like a presidential candidate, i’m down to the wire for deciding what BibleTech talks to propose (happily, unlike them, i haven’t been campaigning for my ideas for months now!). The two i posted about last week — Bible Knowledgebase and Libronix Controlled Vocabulary — are the strongest contenders. But here’s a grab bag of some additional topics i’ve thought about, for you to cheer for, sneer at, or go off and implement yourself (so i don’t have to!). Let me know what you think.

Web Search for Bible References

At BibleTech:2008 i gave a talk on Bibleref: a Microformat for Bible References, as one approach to the problem of how content providers (web site authors, bloggers, etc.) can identify Bible references in what they create. Reftagger provides a different, more automated approach to the same problem.

But as i pointed out last January, that’s really only part of the problem (in fact, the smallest part). Because for every one blogger who adopts bibleref markup or installs Reftagger, there will be 1000 more who’ve never heard of either one. In the earliest days of the web, you had to add keywords to your HTML to make it easy for search engines to find you: now, Google finds most plain text without any special work on your part. How do we accomplish the same thing for Bible references out on the web, so they can be reliably found regardless of whether the author took special care to identify them, despite differences in abbreviations or punctuation style, and being smart about verse ranges?
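To make the matching problem concrete, here is a minimal Python sketch of finding references in plain text. It is only an illustration, not how Reftagger or any search engine works: the `BOOKS` table, the `REF_RE` pattern, and `find_refs` are all hypothetical names, and the table covers just three books with a handful of abbreviations.

```python
import re

# Hypothetical, tiny abbreviation table: a real one would cover every book
# and many more spellings.
BOOKS = {
    "genesis": "Gen", "gen": "Gen", "gn": "Gen",
    "john": "John", "jn": "John", "jhn": "John",
    "romans": "Rom", "rom": "Rom", "ro": "Rom",
}

# Longest names first, so "romans" is tried before "rom" and "ro".
_book_alt = "|".join(sorted(BOOKS, key=len, reverse=True))

REF_RE = re.compile(
    rf"\b(?P<book>{_book_alt})\.?\s*"                # book name, optional period
    r"(?P<chapter>\d+)\s*[:.]\s*(?P<verse>\d+)"      # 3:16 or 3.16
    r"(?:\s*[-\u2013]\s*(?P<end>\d+))?",             # optional range: -18 or en dash
    re.IGNORECASE,
)

def find_refs(text):
    """Return (book, chapter, first_verse, last_verse) tuples found in text."""
    refs = []
    for m in REF_RE.finditer(text):
        first = int(m.group("verse"))
        last = int(m.group("end") or first)
        book = BOOKS[m.group("book").lower()]
        refs.append((book, int(m.group("chapter")), first, last))
    return refs
```

Even this toy version shows where the hard parts are: it would misread "1 John 3:16" as the Gospel of John, and it knows nothing about chapter-only references or ranges that cross chapters.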

Making Bible2.0 Work

For years now, multiple sites on the web have offered Bible texts (in case you didn’t notice, Logos launched a beta version of their own, bible.logos.com, recently). In the last few years, several sites have gone beyond that to a Web 2.0 style that i call “Bible 2.0”, by allowing users to contribute their own content through tags, external links, personal comments, etc. When Web 2.0 first became cool, several arose quickly and then withered (like xpound.org), while others are still out there. YouVersion is perhaps the best developed (Blogos post); Bibleserver is another.

While the concept is still fairly new, there’s enough out there to begin to evaluate:

  • what works well about these sites? what doesn’t work so well?
  • what’s missing to get these kinds of sites to have the same value as more popular Web 2.0 sites like del.icio.us, flickr, etc.?
  • what are some new ways in which Bible2.0 could support Bible study in small groups, informal web communities, etc.?

Audience Choice

If you follow Blogos, what would you suggest i talk about? Any ideas for a visualization you’d love to see, or some other algorithmic/data topic?

October 28th, 2008

BibleTech 2009 Topic: the Libronix Controlled Vocabulary

Next in my experiment to gather feedback on possible BibleTech 2009 topics: the Libronix Controlled Vocabulary. This is the second of my two major activities over the last year (the other was described in my previous post), and therefore a pretty strong contender for a BibleTech presentation.

Unlike the Bible Knowledgebase, which is about real-world entities in the Biblical text, the Libronix Controlled Vocabulary (LCV) organizes terminology from the field of Biblical studies, principally Bible dictionaries, encyclopedias, and other kinds of subject-oriented reference works. A controlled vocabulary identifies, organizes, and systematizes a specific set of terms for indexing content, capturing inter-term relationships, and expressing term hierarchies. Like other kinds of metadata, this infrastructure then supports applications in search, discovery, and general knowledge management. The initial version of the LCV was built by merging content from 7 of the most important Bible dictionaries in Libronix, and currently comprises some 11k terms: i expect it will eventually grow to 15k or perhaps more.
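For readers who haven't met controlled vocabularies before, here is a minimal Python sketch of the general idea: one preferred label per term, alternate labels that resolve to it, and a broader-term link for hierarchy. This is not the LCV's actual data model. The `Term` and `Vocabulary` classes are hypothetical, and so are the example labels in the usage note.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Term:
    preferred: str                                 # the single label used for indexing
    alternates: set = field(default_factory=set)   # other labels a user might type
    broader: Optional[str] = None                  # preferred label of the parent term
    related: set = field(default_factory=set)      # non-hierarchical links

class Vocabulary:
    def __init__(self):
        self.terms = {}    # preferred label -> Term
        self._index = {}   # lowercased label of any kind -> preferred label

    def add(self, term):
        self.terms[term.preferred] = term
        for label in {term.preferred, *term.alternates}:
            self._index[label.lower()] = term.preferred

    def lookup(self, label):
        """Resolve any known label to its Term, or None if unknown."""
        preferred = self._index.get(label.lower())
        return self.terms.get(preferred)

    def ancestors(self, label):
        """Follow broader-term links upward, stopping if a cycle appears."""
        chain = []
        term = self.lookup(label)
        while term and term.broader and term.broader not in chain:
            chain.append(term.broader)
            term = self.terms.get(term.broader)
        return chain
```

With made-up entries such as `Term("Doctrine")` and `Term("Heresy", alternates={"False Teaching"}, broader="Doctrine")`, a lookup of "false teaching" resolves to the Heresy term, and `ancestors("Heresy")` walks up to Doctrine.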

One interesting aspect of working in the specific domain of Biblical studies is that there is a core set of subjects that are common to many or most Bible dictionaries. This includes named individuals and places in the Bible, but also subjects like Heaven or Heresy. But while one dictionary has an article on Heresy (NBD [Libronix link], or Eastons [Libronix link]), another might have one entitled “Heresy and Orthodoxy in the NT” (Anchor [Libronix link]). These articles may have both common content and significant differences, stemming from their intended audiences (scholarly vs. popular), theological orientation, comprehensiveness, etc. The LCV provides a way to capture some of these similarities, as well as enabling some interesting new capabilities for machine learning from existing prose content. For example:

  • what are the prototypical Bible references, names, or phrases used to discuss a topic?
  • can we learn anything about the importance of topics by looking at how much is written about them, how many dictionaries cover them, and other kinds of automated analysis?
  • what knowledge can be gleaned from the topology of terminology linkage (what links to what)?
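
The second question above can be sketched in a few lines of Python. This is one naive scoring scheme, offered only as an illustration and not as the method actually used for the LCV: the `topic_importance` function and its formula (dictionary coverage multiplied by a log-damped word count) are assumptions.

```python
import math
from collections import defaultdict

def topic_importance(articles, total_dictionaries):
    """Score topics from (dictionary, topic, word_count) tuples.

    The score (an assumed formula) is the fraction of dictionaries covering
    the topic, times log(1 + total words written about it), so that one very
    long article does not swamp broad coverage.
    """
    covering = defaultdict(set)   # topic -> dictionaries with an article on it
    words = defaultdict(int)      # topic -> total words across all articles
    for dictionary, topic, word_count in articles:
        covering[topic].add(dictionary)
        words[topic] += word_count
    return {
        topic: (len(covering[topic]) / total_dictionaries) * math.log1p(words[topic])
        for topic in covering
    }
```

Given invented counts where three of three dictionaries cover Heaven and two of three cover Heresy, Heaven scores higher. Whether coverage or length should weigh more is exactly the kind of thing that would need checking against real data.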

I’m not sure i’ve provided enough information here to give a clear sense of what might be covered in such a talk, but i welcome any feedback from potential BibleTech attendees (or others) as to whether this sounds interesting, and which aspects of it you’d most like to learn about.

October 28th, 2008

BibleTech 2009 Topic: the Bible Knowledgebase

My most significant activity at Logos over the last year and a half has been building a database of people, places, and things i call the Bible Knowledgebase (BK). I’ve posted on numerous aspects of this project before (collected in this category), and thanks to lots of hard work by a number of individuals, we’re closing in on a relatively complete internal version. This won’t be released until the next major version of Logos software, so its public debut is still some ways off.

So one strong candidate for a BibleTech talk is a review of the BK, a machine-readable knowledge base of semantically-organized Bible data that is linked to Biblical texts to support search, navigation, and visualization. The thousands of entities in the BK (people, places, and things, along with their names) have a variety of attributes that are appropriate to their type: people have family relationships, places have geo-coordinates, etc. Relationships between entities support discovery and exploration.
Unlike knowledge expressed in prose (like Bible dictionaries), BK data provides reusable content that can serve a variety of purposes. It also provides an important integration framework for Libronix resources, in the general spirit of Tim Berners-Lee’s Linked Data idea.
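
To give a flavor of what typed entities plus relationships might look like, here is a minimal Python sketch. It is not the BK's real schema: the class names, the `fatherOf` predicate, and the identifiers are all hypothetical, and the Bethlehem coordinates in the usage note are approximate.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Entity:
    id: str
    names: list = field(default_factory=list)   # all names used for this entity

@dataclass
class Person(Entity):
    pass                                        # family ties live in the relations list

@dataclass
class Place(Entity):
    latitude: Optional[float] = None            # attributes depend on entity type
    longitude: Optional[float] = None

class Knowledgebase:
    def __init__(self):
        self.entities = {}    # id -> Entity
        self.relations = []   # (subject id, predicate, object id) triples

    def add(self, entity):
        self.entities[entity.id] = entity

    def relate(self, subject_id, predicate, object_id):
        self.relations.append((subject_id, predicate, object_id))

    def related(self, subject_id, predicate):
        """Return the entities linked from subject_id by the given predicate."""
        return [self.entities[o] for s, p, o in self.relations
                if s == subject_id and p == predicate]
```

For example, after adding `Person("abraham", ["Abraham", "Abram"])`, `Person("isaac", ["Isaac"])`, and `Place("bethlehem", ["Bethlehem"], 31.7, 35.2)`, a call to `relate("abraham", "fatherOf", "isaac")` lets `related("abraham", "fatherOf")` return the Isaac entity. Keeping relationships as plain triples is what makes the data reusable for graphs, timelines, or search, rather than locked into one presentation.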

Some other topics the talk might address:

  • visualizing and learning from the graph of relationships
  • BK as an information architecture for other Libronix resources
  • challenges in building and using BK
  • some specific tools that have proved useful in managing BK development
  • a possible future for community participation in BK extension

So now, the audience participation portion of our program:

  • would you be interested in hearing a talk like this at BibleTech 2009?
  • what aspects are most/least interesting to you?

I’d encourage you to post a comment with your responses.