BibleTech 2009 Topic: the Libronix Controlled Vocabulary

Next in my experiment to gather feedback on possible BibleTech 2009 topics: the Libronix Controlled Vocabulary. This is the second of my two major activities over the last year (the other was described in my previous post), and therefore a pretty strong contender for a BibleTech presentation.

Unlike the Bible Knowledgebase, which is about real-world entities in the Biblical text, the Libronix Controlled Vocabulary (LCV) organizes terminology from the field of Biblical studies, drawn principally from Bible dictionaries, encyclopedias, and other kinds of subject-oriented reference works. A controlled vocabulary identifies, organizes, and systematizes a specific set of terms for indexing content, capturing inter-term relationships, and expressing term hierarchies. Like other kinds of metadata, this infrastructure then supports applications in search, discovery, and general knowledge management. The initial version of the LCV was built by merging content from 7 of the most important Bible dictionaries in Libronix, and currently comprises some 11k terms: i expect it will eventually grow to 15k or perhaps more.
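To make the idea of a controlled vocabulary a little more concrete, here is a minimal Python sketch of how article headwords from several dictionaries might be merged under one preferred term. It is not the actual LCV design: the `Term` and `Vocabulary` classes, their field names, and the case-insensitive lookup are all assumptions made for illustration. The sample entries use the Heresy articles discussed in the next paragraph, and the "related" link to Orthodoxy is likewise just an illustrative guess.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class Term:
    """One controlled-vocabulary entry (hypothetical structure, not the real LCV)."""
    preferred: str                                      # single label used for indexing
    variants: Set[str] = field(default_factory=set)     # headwords as they appear in sources
    broader: Set[str] = field(default_factory=set)      # parent terms (hierarchy)
    related: Set[str] = field(default_factory=set)      # non-hierarchical links
    articles: Dict[str, str] = field(default_factory=dict)  # dictionary -> article headword


class Vocabulary:
    """Maps many source headwords onto one preferred term."""

    def __init__(self) -> None:
        self.terms: Dict[str, Term] = {}
        self._lookup: Dict[str, str] = {}  # lowercased label -> preferred label

    def add_article(self, preferred: str, dictionary: str, headword: str) -> Term:
        # Create the term on first sight, then record this dictionary's article.
        term = self.terms.setdefault(preferred, Term(preferred))
        term.variants.add(headword)
        term.articles[dictionary] = headword
        self._lookup[preferred.lower()] = preferred
        self._lookup[headword.lower()] = preferred
        return term

    def resolve(self, label: str) -> Optional[Term]:
        # Case-insensitive lookup by preferred label or any source headword.
        name = self._lookup.get(label.lower())
        return self.terms[name] if name is not None else None


lcv = Vocabulary()
lcv.add_article("Heresy", "NBD", "Heresy")
lcv.add_article("Heresy", "Easton's", "Heresy")
lcv.add_article("Heresy", "Anchor", "Heresy and Orthodoxy in the NT")
lcv.terms["Heresy"].related.add("Orthodoxy")  # illustrative link only

term = lcv.resolve("heresy and orthodoxy in the nt")
print(term.preferred, sorted(term.articles))
```

The point of the sketch is only that differently titled articles resolve to the same term, which is what makes cross-dictionary comparison possible.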

One interesting aspect of working in the specific domain of Biblical studies is that there is a core set of subjects that are common to many or most Bible dictionaries. This includes named individuals and places in the Bible, but also subjects like Heaven or Heresy. But while one dictionary has an article on Heresy (NBD [Libronix link], or Easton's [Libronix link]), another might have one entitled “Heresy and Orthodoxy in the NT” (Anchor [Libronix link]). These articles may share common content but also differ significantly, reflecting their intended audiences (scholarly vs. popular), theological orientation, comprehensiveness, etc. The LCV provides a way to capture some of these similarities, as well as enabling some interesting new capabilities for machine learning from existing prose content. For example:

  • what are the prototypical Bible references, names, or phrases used to discuss a topic?
  • can we learn anything about the importance of topics by looking at how much is written about them, how many dictionaries cover them, and other kinds of automated analysis?
  • what knowledge can be gleaned from the topology of terminology linkage (what links to what)?
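As a rough sketch of the second and third questions, the Python below computes two simple signals per term: how many dictionaries cover it and how much they write about it, and how many other terms link to it. All of the data is invented for illustration (the dictionary names, word counts, and links are not real LCV figures), and the function names are hypothetical.

```python
from collections import Counter

# Toy data, invented for illustration: term -> {dictionary: article length in words}
coverage = {
    "Heaven": {"DictA": 1200, "DictB": 300, "DictC": 2500},
    "Heresy": {"DictA": 400, "DictC": 1800},
    "Orthodoxy": {"DictC": 900},
}

# Toy cross-reference links, also invented: source term -> terms it links to
links = {
    "Heresy": ["Orthodoxy", "Heaven"],
    "Orthodoxy": ["Heresy", "Heaven"],
    "Heaven": [],
}


def coverage_stats(coverage):
    """Return term -> (number of dictionaries covering it, total words written)."""
    return {term: (len(arts), sum(arts.values())) for term, arts in coverage.items()}


def inbound_counts(links):
    """Return term -> number of other articles linking to it (in-degree)."""
    counts = Counter({term: 0 for term in links})
    for targets in links.values():
        for target in targets:
            counts[target] += 1
    return dict(counts)


stats = coverage_stats(coverage)
inbound = inbound_counts(links)

# Rank terms by dictionary coverage, then by total words, as a crude importance proxy.
for term in sorted(stats, key=lambda t: stats[t], reverse=True):
    print(term, stats[term], inbound[term])
```

Counts like these are at best a crude proxy for importance, but they show the kind of automated analysis that becomes possible once articles from different dictionaries are tied to the same term.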

I’m not sure i’ve provided enough information here to give a clear sense of what might be covered in such a talk, but i welcome any feedback from potential BibleTech attendees (or others) as to whether this sounds interesting, and which aspects of it you’d most like to learn about.

6 thoughts on “BibleTech 2009 Topic: the Libronix Controlled Vocabulary”

  1. Sounds fascinating to me. Though sadly I won’t be attending, combination of airfare and time (of the year and travel time). But then we are working on a Bible Dictionary 😉 any way for mere mortals to get at this data, to see if it could help in constructing the list of headwords/terms for our dictionary?

  2. Tim:
    I haven’t explored yet what the prospects might be for sharing some of this data (and it may also be a bit too early). But i am interested in sharing where it makes sense, and i’d love to see more community standardization (rising tides and all that). If i do wind up talking about the LCV, the slides (and probably audio) will wind up on the web, and i’ll definitely have an answer to your question by then.

  3. The previous post (BibleTech 2009 Topic …) doesn’t work for me either. It’s also not in your list of recent posts.

    I wanted to comment on it and ask a little more about BK. Is this just something available in Logos or are you actually going to make it available for machine readability on the web somewhere (search engines, etc.)? If it was publicly available that would really establish Logos’ web presence as a trusted source.

  4. Joel:
    I’m still unsure about this problem: some kind of WordPress weirdness maybe.
    * can you confirm that you now see that post’s title in the “recent posts” list on the right? (i can, but i didn’t notice it was missing until you mentioned it)
    * does the “recent posts” link, and the one in this post with text “my previous post”, now take you to the post?
    * if not, can you confirm the actual URL that doesn’t work is the one in comment #2 above?

    As to sharing: we’re developing the BK to enrich our product, both as primary content, and also as an indexing and navigational infrastructure. We’re committed in principle to sharing at least some aspects of it: at the same time, Logos has invested substantial resources in developing it, so some aspects will only be in our product.

    At the last BibleTech, we had an informal gathering of people interested in standardizing identifiers for Bible names, and established a Google Group for this. While i did a little work to organize things and to share some data from Reinier de Blois of the United Bible Societies, mostly i’ve dropped the ball on follow-up, with the traditional excuse of busy-ness. I hope to remedy that prior to the next BibleTech so we can perhaps make a little progress once people are in the same room again.

