God’s Word | our words
meaning, communication, & technology
following Jesus, the Word made flesh
January 25th, 2010

BibleTech:2010 Talk – The Logos Controlled Vocabulary

The program for BibleTech:2010 has been up for a couple of weeks now, and i’ve been delinquent in pointing that out. We’ve got a full roster of really interesting talks that run the gamut from friendly warm technology to hard-core geekishness: Bible translation, social media, Biblical linguistics, mobile computing, preaching, publishing, tweeting, and more. And this year, it’s in San Jose, CA: i’m hoping that will open up attendance to some folks who have the misfortune to not live in the beautiful Pacific NW. The dates are March 26-27, 2010.

I’ll be giving two talks this year: here’s my abstract for the first one, on the Logos Controlled Vocabulary.


Dozens of books provide terminology from the field of Biblical studies, principally Bible dictionaries, encyclopedias, and other subject-oriented reference works. However, the terminology used varies between books, authors, and publishers, and doesn’t always include all the terms a user might employ to find information.

The Logos Controlled Vocabulary (LCV) organizes content from multiple Bible dictionaries to integrate information across the Logos library. As a controlled vocabulary, the LCV identifies, organizes, and systematizes a specific set of terms for indexing content, capturing inter-term relationships, and expressing term hierarchies. Like other kinds of metadata, this infrastructure then supports applications in search, discovery, and general knowledge management. The initial version of the LCV (shipping now with Logos 4) comprises some 11,100 terms, and continues to grow as more reference works are added. It also provides the backbone of http://topics.logos.com, a website for user contributions.

This talk will describe the building of the LCV, how we’re using it now, and how we plan to use and extend it in the future. This includes some interesting new capabilities for machine learning from existing prose content. For example:

  • what are the prototypical Bible references, names, or phrases used to discuss a topic?
  • can we learn anything about the importance of topics by looking at how much is written about them, how many dictionaries cover them, and other kinds of automated analysis?
  • what knowledge can be gleaned from the topology of terminology linkage (what links to what)?
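As a rough illustration of the last question, here is a minimal sketch of one kind of link-topology analysis: counting how many other terms point at each term. The link data is made up for the example and the function name is hypothetical; nothing here describes how the LCV itself is stored or analyzed.

```python
from collections import Counter

# Made-up cross-reference data: each term maps to the terms its articles link to.
links = {
    "Heresy": ["Orthodoxy", "Gnosticism"],
    "Gnosticism": ["Heresy"],
    "Orthodoxy": ["Heresy"],
    "Heaven": ["Kingdom of God"],
}


def inbound_counts(links):
    """Count how many distinct other terms link to each term."""
    counts = Counter()
    for source, targets in links.items():
        for target in set(targets):  # ignore repeated links from one source
            if target != source:     # ignore self-links
                counts[target] += 1
    return counts


print(inbound_counts(links).most_common(1))  # [('Heresy', 2)]
```

Inbound link count is only one possible signal; it is used here because it is the simplest thing that "what links to what" makes available.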

Update: we’ve decided in general to retire the “Libronix” name for Logos technologies, so i’m trying to get on board by starting to call this the Logos Controlled Vocabulary.

November 2nd, 2009

Logos 4 Launches Today

I’m thrilled to announce that we’re releasing Logos Bible Software 4 today. This is a complete rewrite from the ground up of the best Bible study software on the planet, so that makes this an exciting day in my book.

Logos 4 sports an entirely new interface to make it easier than ever to find what you’re looking for and keep your study space organized and effective. There’s a wealth of new, visually oriented resources, and better controls for working through the enormous space of resources Logos makes available. There’s even an iPhone app for no extra charge!

That’s the marketing view (and i stand behind it). But this means much more to me on a very personal level. It’s been almost 3 years since i came to Logos, and this will be the first time most of my work has seen the light of day. Specifically, Logos 4 contains the work of my colleagues and me in several new areas:

  • Biblical People, which organizes information about the 3300 individuals, groups of people, and deities named in the Biblical text. It includes a comprehensive list of references, their family relationships, links to dictionary articles, and links to related items. It also includes family tree and story-based diagrams. And everything is hyperlinked.
  • Biblical Places includes all the same kinds of information for 1200 named places from the Bible: cities, regions, even geographic features like rivers and mountains. Along with the data, there are 60 new high-resolution maps commissioned by Logos and covering the major Biblical events, as well as a mega-map that shows all the places together.
  • Biblical Things describes the physical objects of the Bible: animals, plants, body parts, clothing, food and drink, and much more, as well as specific items like Noah’s ark and Goliath’s sword and weights and measures. There are more than 1000 objects here, which also bring together thousands of images from across the library.
  • There’s also a new collection of high-resolution infographics illustrating different aspects of the Biblical world (and i’m extra proud that the bulk of this work was managed by my wife Donna)
  • In addition to regular word search (which is much faster than ever), under the hood is the Libronix Controlled Vocabulary (LCV), working to organize 11,000 different subjects in the Biblical studies literature and coordinating information across the library.

So if you’ve been following my posts on the Bible Knowledgebase … well, now it’s here. I can’t overstate how important i think this is: this is quite literally the first time in the centuries-old history of Biblical studies that this information has been made available in this way. The LCV isn’t quite as visible (yet), but it’s also an important organizing feature that will continue to grow in power going forward.

I hope you’re catching my sense of excitement about these new resources (and this says nothing about all the hard work of my dozens of colleagues in other areas). I hope i’ve piqued your interest to learn more about Logos 4. It really is a watershed event in Bible software.

Obligatory disclaimer: i work for Logos and highly value what i do there. So i’m not the least bit objective about this. (more detailed disclosures)

March 30th, 2009

BibleTech:2009 Postlude

BibleTech:2009 is past now, and (just like last year) was a great opportunity both to hear new ideas about Bible and technology and to meet and talk with many others with common interests. The few scattered thoughts i jotted down as i was live-blogging talks certainly don’t do justice to the richness of many of the presentations: so don’t judge the quality of their talks by my quick-take notes.

I’ve got slides from my talk on the Bible Knowledgebase posted now on SemanticBible: the navigational structure above them isn’t in place yet, but you should be able to follow the link directly to get there. Once again, i’ve used Slidy for the presentation, and that process went a little more smoothly this time (which probably just means i’ve gotten better at it). View the source if you want to see how it works.

[Important note: if you were at my talk and wrote down the URL for the slides, i had it wrong. The correct URL is:

http://semanticbible.com/other/talks/2009/bibletech/BK.html

Yes, i know that Cool URIs don't change, which is why i wanted to make this one adjustment before publishing them, so i won't have to change it in the future.]

At some point there should be audio from the talk posted on the BibleTech site (probably on the BibleTech speakers page, which has links to talks from last year and audio where available). Future Blogos posts on the Bible Knowledgebase will go in my WordPress category of that name (RSS feed here), and will also be tagged with bk if you want to follow along.

March 28th, 2009

Linne: The Near-Future of the Bible – Scenarios, Methods and Structures of Futures Studies

FutureS with an ’s’: we don’t know what will happen, but we can imagine a range of possibilities within the cone of plausibility. The farther out you go, the broader the range of possibilities. Kevin Kelly (Wired magazine): the problem with Christianity is that every generation has expected Jesus to return, so they don’t look beyond their generation to think about what Christianity will look like in 1000 years (see http://qideas.org/shorts/)

Method: The S-Curve – early adoption, followed by loss of interest, then mass adoption.

Method: Framing – set scope and focus, adjust attitudes, set objectives.

Method: Scanning. Map the system.

Method: Forecasting. Look at drivers and uncertainties. Generate and prioritize ideas.

Method: STEEP. Look at what’s happening in Social, Technological, Economic, Environmental, and Political arenas.

Method: Visioning. What are the implications of our forecasts? Challenge assumptions. Think big.

Method: Planning. Think strategically about what future you want, and develop options for it.

Method: Acting. Communicate results, create an action agenda, and develop strategic thinking.

Some possible future scenarios:

  • the Digitally Illuminated Bible. A convergence of factors: Kindle/iPhone, BibleTech conference, Green Movement. What if paper is outlawed: what happens to Bible publication?
  • the Bible as Service Oriented Architecture. Can we make our meta-aids and interpretations so good that the text itself effectively disappears?
  • the Bible as a Digitally Sacred Cow. What if the Great Firewall of China makes the Bible unavailable online?

Some baseline scenarios for building our own scenarios:

  • in 2040: Human population will hit 8B, and then decline for the first time ever. Average age will be 50/60 years old. 80% of humans will live in cities. China will overtake the US economy. 90% of humanity connected via the internet. True AI will be achieved. Seat of Christianity is NOT the US (today, half of S. Korea is Christian). Read Jesus in Beijing.

Other resources:

March 28th, 2009

Ruter: Open Scriptures: Picking up the Mantle of the RE:Greek-Open Source Initiative

The background of this talk: Zack Hubert’s talk from the last BibleTech. Zack developed a very useful web site which ultimately failed because he couldn’t maintain it, and couldn’t get other developers to pitch in and help.

The vision: an open web repository for integrated scriptural data and a platform for building applications of scripture (OpenScriptures.org). What kinds of data? Manuscripts, translations, versification systems, morphosyntactic parsings, user tags/annotations/cross-references. But it takes a lot of effort to get started with all this data, each of which is typically in its own format, and unlinked to other data.

Linked data principles (from timbl):

  • use URIs as names for things
  • use HTTP URIs so that people can look up those names
  • provide useful information behind the URIs
  • and links to other URIs so they can discover more things

“… the more things you have to connect together, the more powerful it is.” Can we connect things together through a unified manuscript that links together semantic units (words, phrases, clauses)?

Manuscript unification: normalize a manuscript (lowercase and remove diacritics: no spelling normalization yet), insert and save links to the unified manuscript. Then for additional manuscripts, normalize, merge links, and save them. Now you’ve got all the attested readings linked together. This unified manuscript now has an automated critical apparatus. [demo here of the manuscript comparator]
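To make the normalization step concrete, here is a minimal sketch of it, followed by a very rough stand-in for the comparison step. The function names are hypothetical, and the alignment uses Python's difflib purely for illustration: the talk did not say how Open Scriptures actually aligns manuscripts.

```python
import difflib
import unicodedata


def normalize(token):
    """Lowercase a token and strip diacritics (no spelling normalization)."""
    decomposed = unicodedata.normalize("NFD", token.lower())
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", stripped)


def compare(manuscript_a, manuscript_b):
    """Align two texts on normalized tokens and return only the differences."""
    a = [normalize(t) for t in manuscript_a.split()]
    b = [normalize(t) for t in manuscript_b.split()]
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    return [
        (tag, a[i1:i2], b[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# The second string is an invented reading that drops one word, for illustration only.
print(compare("Ἐν ἀρχῇ ἦν ὁ λόγος", "εν αρχη ην λογος"))
```

Tokens that differ only in case or accents compare as equal after normalization, so the only differences left are the ones a critical apparatus would care about.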

Potential applications include:

  • translation comparator (can also help with the versification problem)
  • comprehensive concordance
  • translation-independent cross-references (e.g. NT quotations of the OT)
  • interlinear/bilingual editions

You can automatically link manuscripts in the same language, but not different languages. Use collective intelligence to capture semantic linking between languages. Use the “games with a purpose” (GWAP) approach to gather links.

Copyright is a major challenge: you can’t link texts together if you can’t access them, and you can’t share them if they’re not open. Recently MorphGNT texts have been taken down from several sites because they’re not freely sharable. If the key benefit is connections between data, then data (including texts) should be more valuable if they’re sharable and connected. One solution: an Open Scriptures Platform that connects content owners, developers, and end-users. Passionate developers could build applications based on content licensed to Open Scriptures (as a proxy), and Open Scriptures makes sure that end-users provide revenue to content owners.

March 27th, 2009

BibleTech09 is On!

I’ll be live-blogging selected sessions from BibleTech today and tomorrow. If you’re here (at the Seattle Hilton), come hear my talk this afternoon at 3!

January 8th, 2009

Addressability Matters

Ever since Adam named the beasts (Gen 2:19-20), labels have mattered to humanity: it’s pretty hard to hold a conversation if you have to start with “you know that really big gray beast with the cute little ears that sits in the river all day with just its eyes showing?”, instead of just “hippopotamus”.

Information on the web works the same way. Most (but not all!) web pages have the equivalent of a name, their Uniform Resource Locator (URL), which tells your browser how to bring up the page. But too many conversations about web pages are still like the hippopotamus conversation: “just go to www.frooble.com, then type ‘shebang’ in the search box, and look about half-way down the page on the left side …”. In other words, that little tidbit of information isn’t addressable: i can’t give you a name for it, i can only tell you to travel over the river, through the woods, and then turn left at the 3rd oak tree.

Though there’s usually no good technical reason, this is still so often true for our web-enabled world. For example, i admit to my chagrin that i only just now figured out the URL for my Facebook profile, even though i had looked for it (half-heartedly) several times previously. (I happened to stumble over somebody else’s, saw the pattern, and then plugged my own name and ID in the URL instead). Having a URL that’s both explicit and understandable enables this kind of URL hacking, which is a really powerful technique.

Here’s a small example (combined with a shameless plug). The HTML designers for the upcoming Bible Tech conference have added page targets for speakers to the Speakers page. So even though there’s one long list, you can get to just the right spot on the list by following the link to my talk. And if i show you the URL

http://www.bibletechconference.com/speakers.htm#SeanBoisen-2009

and explain the schema ([baseURL]#[Firstname][Lastname]-year), you can get to my talks from last year too. That’s a nice bit of design, and part of a much larger and important architectural practice called Representational State Transfer or REST. As another example, you can probably figure out how to change this URL

http://bible.logos.com/passage/NIV/Ge 2.19-20

to get you to Mark 4.1-12 in the ESV instead (though you might stumble if you use a colon instead of a period to separate chapter and verse).
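Both URL patterns above are regular enough to generate in code. Here is a minimal sketch: the function names are made up, and the "Mark" book label in the usage line is a guess, since the abbreviations the site accepts aren't spelled out here.

```python
from urllib.parse import quote

SPEAKERS_PAGE = "http://www.bibletechconference.com/speakers.htm"
PASSAGE_BASE = "http://bible.logos.com/passage"


def speaker_url(first_name, last_name, year):
    """Build a speakers-page link: [baseURL]#[Firstname][Lastname]-year."""
    return f"{SPEAKERS_PAGE}#{first_name}{last_name}-{year}"


def passage_url(version, book, chapter, verses):
    """Build a passage link, separating chapter and verse with a period."""
    reference = f"{book} {chapter}.{verses}"
    # quote() percent-encodes the space, as a browser would when sending the URL.
    return f"{PASSAGE_BASE}/{version}/{quote(reference)}"


print(speaker_url("Sean", "Boisen", 2009))
# http://www.bibletechconference.com/speakers.htm#SeanBoisen-2009
print(passage_url("NIV", "Ge", 2, "19-20"))
# http://bible.logos.com/passage/NIV/Ge%202.19-20
print(passage_url("ESV", "Mark", 4, "1-12"))  # book label is a guess
```

The point is less the code than the fact that it can be written at all: once the pattern is explicit, anyone can construct a valid address without visiting the site first.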

A lot of important things only become possible once you start to provide names for your resources. That’s a big part of the justification for the complex tangle of ideas called the Semantic Web, or if that’s too high-falutin’ for you, just call it smarter web design for information integration.

PS: i realized later it wasn’t just that i couldn’t figure out how to construct a Facebook URL: you have to make a badge first to get an addressable URL, which seems pretty non-obvious!

November 4th, 2008

More BibleTech 2009 Topics

Like a presidential candidate, i’m down to the wire for deciding what BibleTech talks to propose (happily, unlike them, i haven’t been campaigning for my ideas for months now!). The two i posted about last week — Bible Knowledgebase and Libronix Controlled Vocabulary — are the strongest contenders. But here’s a grab bag of some additional topics i’ve thought about, for you to cheer for, sneer at, or go off and implement yourself (so i don’t have to!). Let me know what you think.

Web Search for Bible References

At BibleTech:2008 i gave a talk on Bibleref: a Microformat for Bible References, as one approach to the problem of how content providers (web site authors, bloggers, etc.) can identify Bible references in what they create. Reftagger provides a different, more automated approach to the same problem.

But as i pointed out last January, that’s really only part of the problem — in fact, the smallest part. Because for every one blogger who adopts bibleref markup or installs Reftagger,  there will be 1000 more who’ve never heard of either one. In the earliest days of the web, you had to add keywords to your HTML to make it easy for search engines to find you: now, Google finds most plain text without any special work on your part. How do we accomplish the same thing for Bible references out on the web, so they can be reliably found regardless of whether the author took special care to identify them, despite differences in abbreviations or punctuation style, and being smart about verse ranges?
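To give a feel for the problem, here is a deliberately tiny sketch of finding references in plain text. It assumes a three-book abbreviation table and a single "book chapter:verse" pattern, both invented for the example; real-world text has many more books, abbreviations, and reference styles than this handles.

```python
import re

# Tiny, hypothetical abbreviation table; a real one would cover every book.
BOOKS = {
    "genesis": "Gen", "gen": "Gen", "ge": "Gen",
    "mark": "Mark", "mk": "Mark",
    "john": "John", "jn": "John",
}

REFERENCE = re.compile(
    r"\b(?P<book>" + "|".join(sorted(BOOKS, key=len, reverse=True)) + r")\.?\s+"
    r"(?P<chapter>\d+)\s*[:.]\s*(?P<start>\d+)"
    r"(?:\s*[-\u2013]\s*(?P<end>\d+))?",  # optional verse range, hyphen or en dash
    re.IGNORECASE,
)


def find_references(text):
    """Return (book, chapter, first_verse, last_verse) tuples found in text."""
    found = []
    for match in REFERENCE.finditer(text):
        start = int(match.group("start"))
        end = int(match.group("end") or start)
        book = BOOKS[match.group("book").lower()]
        found.append((book, int(match.group("chapter")), start, end))
    return found


def mentions(text, book, chapter, verse):
    """True if any reference in the text covers the given verse."""
    return any(
        b == book and c == chapter and first <= verse <= last
        for b, c, first, last in find_references(text)
    )


print(find_references("See Gen 2:19-20 and Mk. 4.1-12, also John 3:16."))
# [('Gen', 2, 19, 20), ('Mark', 4, 1, 12), ('John', 3, 16, 16)]
print(mentions("Compare Ge 2.19-20 here.", "Gen", 2, 20))  # True
```

Treating every reference as a verse range is what makes a search for Gen 2:20 find a page that only says "Gen 2:19-20".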

Making Bible2.0 Work

For years now, multiple sites on the web have offered Bible texts (in case you didn’t notice, Logos launched a beta version of their own, bible.logos.com, recently). In the last few years, several sites have gone beyond that to a Web 2.0 style that i call “Bible 2.0”, by allowing users to contribute their own content through tags, external links, personal comments, etc. When Web 2.0 first became cool, several arose quickly and then withered (like xpound.org), while others are still out there. YouVersion is perhaps the best developed (Blogos post); Bibleserver is another.

While the concept is still fairly new, there’s enough out there to begin to evaluate:

  • what works well about these sites? what doesn’t work so well?
  • what’s missing to get these kinds of sites to have the same value as more popular Web 2.0 sites like del.icio.us, flickr, etc.?
  • what are some new ways in which Bible2.0 could support Bible study in small groups, informal web communities, etc.?

Audience Choice

If you follow Blogos, what would you suggest i talk about? Any ideas for a visualization you’d love to see, or some other algorithmic/data topic?

October 28th, 2008

BibleTech 2009 Topic: the Libronix Controlled Vocabulary

Next in my experiment to gather feedback on possible BibleTech 2009 topics: the Libronix Controlled Vocabulary. This is the second of my two major activities over the last year (the other was described in my previous post), and therefore a pretty strong contender for a BibleTech presentation.

Unlike the Bible Knowledgebase, which is about real-world entities in the Biblical text, the Libronix Controlled Vocabulary (LCV) organizes terminology from the field of Biblical studies, principally Bible dictionaries, encyclopedias, and other kinds of subject-oriented reference works. A controlled vocabulary identifies, organizes, and systematizes a specific set of terms for indexing content, capturing inter-term relationships, and expressing term hierarchies. Like other kinds of metadata, this infrastructure then supports applications in search, discovery, and general knowledge management. The initial version of the LCV was built by merging content from 7 of the most important Bible dictionaries in Libronix, and currently comprises some 11k terms: i expect it will eventually grow to 15k or perhaps more.

One interesting aspect of working in the specific domain of Biblical studies is that there is a core set of subjects that are common to many or most Bible dictionaries. This includes named individuals and places in the Bible, but also subjects like Heaven or Heresy. But while one dictionary has an article on Heresy (NBD [Libronix link], or Eastons [Libronix link]), another might have one entitled “Heresy and Orthodoxy in the NT” (Anchor [Libronix link]). These articles may have both common content and significant differences, stemming from their intended audiences (scholarly vs. popular), theological orientation, comprehensiveness, etc. The LCV provides a way to capture some of these similarities, as well as enabling some interesting new capabilities for machine learning from existing prose content. For example:

  • what are the prototypical Bible references, names, or phrases used to discuss a topic?
  • can we learn anything about the importance of topics by looking at how much is written about them, how many dictionaries cover them, and other kinds of automated analysis?
  • what knowledge can be gleaned from the topology of terminology linkage (what links to what)?
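Here is a minimal sketch of the underlying idea, using the Heresy articles mentioned above: variant headwords are grouped under one preferred term, and dictionary coverage falls out as a simple count. The data layout and function names are assumptions made for illustration (as is filing the Anchor article under "Heresy"), and the "Dictionary X" row is a placeholder rather than real data.

```python
from collections import defaultdict

# (dictionary, article headword, preferred term)
ARTICLES = [
    ("NBD", "Heresy", "Heresy"),
    ("Easton's", "Heresy", "Heresy"),
    ("Anchor", "Heresy and Orthodoxy in the NT", "Heresy"),
    ("Dictionary X", "Heaven", "Heaven"),  # placeholder row, not real data
]


def build_vocabulary(articles):
    """Group article headwords under their preferred term, keyed by dictionary."""
    vocabulary = defaultdict(dict)
    for dictionary, headword, preferred in articles:
        vocabulary[preferred][dictionary] = headword
    return dict(vocabulary)


def headword_index(articles):
    """Map every headword variant to its preferred term."""
    return {headword: preferred for _, headword, preferred in articles}


def coverage(vocabulary):
    """Rank terms by how many dictionaries have an article on them."""
    return sorted(
        ((term, len(by_dictionary)) for term, by_dictionary in vocabulary.items()),
        key=lambda pair: (-pair[1], pair[0]),
    )


vocabulary = build_vocabulary(ARTICLES)
print(coverage(vocabulary))  # [('Heresy', 3), ('Heaven', 1)]
print(headword_index(ARTICLES)["Heresy and Orthodoxy in the NT"])  # Heresy
```

Coverage count is the crudest of the importance signals listed above; article length or link structure could be layered onto the same grouping.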

I’m not sure i’ve provided enough information here to give a clear sense of what might be covered in such a talk, but i welcome any feedback from potential BibleTech attendees (or others) as to whether this sounds interesting, and which aspects of it you’d most like to learn about.

October 28th, 2008

BibleTech 2009 Topic: the Bible Knowledgebase

My most significant activity at Logos over the last year and a half has been building a database of people, places, and things i call the Bible Knowledgebase (BK). I’ve posted on numerous aspects of this project before (collected in this category), and thanks to lots of hard work by a number of individuals, we’re closing in on a relatively complete internal version. This won’t be released until the next major version of Logos software, so its public debut is still some ways off.

So one strong candidate for a BibleTech talk is a review of the BK, a machine-readable knowledge base of semantically-organized Bible data that is linked to Biblical texts to support search, navigation, and visualization. The thousands of entities in the BK (people, places, and things, along with their names) have a variety of attributes that are appropriate to their type: people have family relationships, places have geo-coordinates, etc. Relationships between entities support discovery and exploration.
Unlike knowledge expressed in prose (like Bible dictionaries), BK data provides reusable content that can serve a variety of purposes. It also provides an important integration framework for Libronix resources, in the general spirit of Tim Berners-Lee’s Linked Data idea.
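To show what "attributes appropriate to their type" might look like, here is a minimal sketch with typed entities and a couple of relationship queries. The class names, fields, and keys are hypothetical and say nothing about the BK's real schema; the coordinates for Jerusalem are approximate.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Person:
    key: str
    names: List[str]
    father: Optional[str] = None  # key of another Person, if known


@dataclass
class Place:
    key: str
    names: List[str]
    latitude: float
    longitude: float


people = {
    p.key: p
    for p in [
        Person("abraham", ["Abraham", "Abram"]),
        Person("isaac", ["Isaac"], father="abraham"),
        Person("jacob", ["Jacob", "Israel"], father="isaac"),
        Person("esau", ["Esau"], father="isaac"),
    ]
}
jerusalem = Place("jerusalem", ["Jerusalem"], 31.78, 35.22)  # approximate


def ancestors(key, people):
    """Follow father links upward and return the chain of keys."""
    chain = []
    current = people[key].father
    while current is not None:
        chain.append(current)
        current = people[current].father
    return chain


def children(key, people):
    """Return the keys of everyone whose father is the given person."""
    return sorted(p.key for p in people.values() if p.father == key)


print(ancestors("jacob", people))  # ['isaac', 'abraham']
print(children("isaac", people))   # ['esau', 'jacob']
```

Because the relationships are data rather than prose, the same records can feed a family tree diagram, a search index, or a map without being rewritten for each use.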

Some other topics the talk might address:

  • visualizing and learning from the graph of relationships
  • BK as an information architecture for other Libronix resources
  • challenges in building and using BK
  • some specific tools that have proved useful in managing BK development
  • a possible future for community participation in BK extension

So now, the audience participation portion of our program:

  • would you be interested in hearing a talk like this at BibleTech 2009?
  • what aspects are most/least interesting to you?

I’d encourage you to post a comment with your responses.