Bible Knowledgebase: What, Why, How

(Post 2 in a series on building the Bible Knowledgebase, unfortunately delayed by a plague of web hosting problems)

The What: BK (Bible Knowledgebase) is reference information about the world of the Bible, expressed using Semantic Web standards and tools. The Semantic Web refers to moving from a world of networked pages displayed for humans (HTML, the vast majority of the current World Wide Web) to semantically characterized information that is machine-readable, and that therefore supports a variety of uses such as search, browsing, and visualization. Tim Berners-Lee likes to describe it as moving from a web of documents (meant to be read by humans) to a web of data (meant to be read by computers).

Initially, the scope is every named thing in the Bible (people and places are the bulk of the cases, but there are also languages, ethnic groups, holidays, and numerous others). Eventually i hope to extend this to unnamed but described entities: for example, the Samaritan woman of John 4 is never named, but we know her ethnicity, where she lived, some people she interacted with, and other facts.

The Why: the Bible Knowledgebase will support

Knowledge exploration and discovery: just as hyperlinked web pages lead you to new information, linked facts about individuals will lead to other individuals or resources about them.
Smart (semantic) indexing: for a given passage, you’ll know which John/Mary/James is referred to, not just the collection of individuals who share that name. Searching will provide more precisely targeted results, because reference material will be disambiguated.
Visualization: rich data sets support graphic displays that give an overview of information that would otherwise be scattered across numerous different passages.
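
To make the semantic indexing idea concrete, here is a minimal sketch in Python. It is not how BK itself is implemented: the `bk:` identifiers, the `INDEX` table, and both function names are hypothetical, invented only for illustration. The point is that each name in a passage resolves to one specific individual rather than to everyone who shares that name.

```python
# Hypothetical passage index: each name mentioned in a passage maps to one
# disambiguated entity. The "bk:" identifiers are made up for this sketch;
# the post does not show BK's actual naming scheme.
INDEX = {
    "Mark 1:4": {"John": "bk:John_the_Baptist"},
    "Mark 1:19": {
        "James": "bk:James_son_of_Zebedee",
        "John": "bk:John_son_of_Zebedee",
    },
    "Galatians 1:19": {"James": "bk:James_brother_of_Jesus"},
}


def resolve(passage, name):
    """Return the entity a name refers to in a passage, or None if unindexed."""
    return INDEX.get(passage, {}).get(name)


def passages_for(entity):
    """Return the indexed passages that mention one specific individual."""
    return sorted(p for p, names in INDEX.items() if entity in names.values())
```

With an index like this, a search for James the brother of Jesus returns only Galatians 1:19, even though Mark 1:19 also mentions someone named James.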

The How:

I’ve designed an OWL ontology that captures an initial set of entity types and relationships between them
Information from Logos’ Biblical People feature and New Testament Names has been merged into the initial dataset
Both the ontology and the instance data will be extended to incorporate additional information. There’s no principled stopping point, but i expect to grow BK from its current size of ~100k RDF triples to perhaps 100-1000 times that size (roughly 10 to 100 million triples).
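
For readers new to RDF, a triple is simply a subject, a predicate, and an object. The sketch below uses plain Python tuples rather than a real RDF toolkit, and shows how linked facts about one individual lead to another. Only `rdf:type` is standard RDF vocabulary; the `bk:` prefix, the `bk:childOf` property, and the entity names are assumptions made for this example, not BK's actual ontology.

```python
# A handful of illustrative triples (subject, predicate, object).
# "bk:childOf" and the "bk:" identifiers are hypothetical; the real BK
# ontology may model family relationships differently.
TRIPLES = [
    ("bk:Zebedee", "rdf:type", "bk:Person"),
    ("bk:James_son_of_Zebedee", "rdf:type", "bk:Person"),
    ("bk:John_son_of_Zebedee", "rdf:type", "bk:Person"),
    ("bk:James_son_of_Zebedee", "bk:childOf", "bk:Zebedee"),
    ("bk:John_son_of_Zebedee", "bk:childOf", "bk:Zebedee"),
]


def objects(subject, predicate):
    """Return every object linked from a subject by a predicate."""
    return {o for s, p, o in TRIPLES if s == subject and p == predicate}


def subjects(predicate, obj):
    """Return every subject linked to an object by a predicate."""
    return {s for s, p, o in TRIPLES if p == predicate and o == obj}


def siblings(person):
    """Follow childOf links up to each parent, then back down to other children."""
    found = set()
    for parent in objects(person, "bk:childOf"):
        found |= subjects("bk:childOf", parent)
    found.discard(person)  # a person is not their own sibling
    return sorted(found)
```

Nothing in the data states directly that James and John are brothers; `siblings("bk:James_son_of_Zebedee")` finds John by following the two `bk:childOf` links through Zebedee.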

In the (perhaps unlikely) event that you’re in the intersection between

Blogos readers, and
attendees at the Semantic Technology Conference in San Jose next week

i’ll be giving a presentation about this work Thursday morning.