BibleTech talk: Automatically Learning Topical Content

I’ve posted the slides from my BibleTech 2013 talk. Here’s the abstract:

Continued work on the Logos Controlled Vocabulary (BibleTech 2010, “A Controlled Vocabulary for Biblical Studies”) has produced a unique collection of topic-aligned content across more than 50 different Bible dictionaries, encyclopedias, and topical indexes in both English and Spanish. This presentation will describe the information we’re learning automatically from this content, including:

  • determining concept importance
  • associating concepts with Bible references
  • extracting and associating names and descriptive terms for concepts
  • relating concepts to each other

You can see the other talks at the BibleTech website. I’ve had a number of positive comments on the talk, which is always gratifying. Slowly but surely, we’re climbing up the data stack …


3/21/2013 update: I’ve used the Slidy framework for presentations for several years because i like the way it puts the whole content out on the web in HTML. However, @JohnRGentry pointed out that my slides don’t work on the iPad because moving forward and backward requires keys. There’s a newer version of Slidy which does support swiping to move through slides, though my experience with it on both Safari and Chrome on iOS hasn’t been great: it’s not easy to register a swipe, and title text sometimes gets lost. I assume these are issues with javascript support on iOS, though i’m not really sure. I’ll try to update my slides to the newest version of Slidy, which will help a little, but i’ll also look for another framework that’s more tablet friendly.

LCV Talk at Semantic Technology Conference

I’ll be giving a talk at the Semantic Technology Conference, June 23 from 7:30AM8:20am (ouch!), in San Francisco, CA. The talk title is “Using a Controlled Vocabulary for Managing a Digital Library Platform“: no talk page yet, but the abstract follows. If you’re there, come by and say hello!

(Astute readers will note some similarities between this and my upcoming BibleTech talk. But the audiences are quite different, so the content will be too. This talk will provide “a practical case study on semantically organizing reference material to support search and navigation, using a controlled vocabulary.”)

Abstract

Encyclopedias and other subject-oriented reference books frequently present the same content using different names: and users often look for this information using other names altogether.

The Logos Controlled Vocabulary (LCV) organizes parallel but distinct content in the domain of Biblical studies to integrate reference information and support search, discovery, and knowledge management. The LCV captures

  • preferred and alternate terminology
  • inter-term relationships
  • term hierarchy
  • linkage to other semantic information

The initial version of the LCV (now shipping in the Logos digital library platform) comprises some 11,000 terms, and continues to grow as more reference works are added. It also provides the backbone of http://topics.logos.com, a website for user contributions to terminology and content.

This talk will describe the building of the LCV, how we’re using it now, and how we plan to use and extend it in the future.

Keywords: , , , ,

BibleTech:2010 Talk – The Logos Controlled Vocabulary

The program for BibleTech:2010 has been up for a couple of weeks now, and i’ve been delinquent in failing to point that out. We’ve got a full roster of really interesting talks that span the gamut from friendly warm technology to hard-core geekishness: Bible translation, social media, Biblical linguistics, mobile computing, preaching, publishing, tweeting, and more. And this year, it’s in San Jose, CA: i’m hoping that will open up attendance to some folks who have the misfortune to not live in the beautiful Pacific NW. The dates are March 26-27, 2010.

I’ll be giving two talks this year: here’s my abstract for the first one, on the Libronix Logos Controlled Vocabulary.


Dozens of books provide terminology from the field of Biblical studies, principally Bible dictionaries, encyclopedias, and other subject-oriented reference works. However, the terminology used varies between books, authors, and publishers, and doesn’t always include all the terms a user might employ to find information.

The Libronix Logos Controlled Vocabulary (LCV) organizes content from multiple Bible dictionaries to integrate information across the Logos library. As a controlled vocabulary, the LCV identifies, organizes, and systematizes a specific set of terms for indexing content, capturing inter-term relationships, and expressing term hierarchies. Like other kinds of metadata, this infrastructure then supports applications in search, discovery, and general knowledge management. The initial version of the LCV (shipping now with Logos 4) comprises some 11,100 terms, and continues to grow as more reference works are added. It also provides the backbone of http://topics.logos.com, a website for user contributions.

This talk will describe the building of the LCV, how we’re using it now, and how we plan to use and extend it in the future. This includes some interesting new capabilities for machine learning from existing prose content. For example:

  • what are the prototypical Bible references, names, or phrases used to discuss a topic?
  • can we learn anything about the importance of topics by looking at how much is written about them, how many dictionaries cover them, and other kinds of automated analysis?
  • what knowledge can be gleaned from the topology of terminology linkage (what links to what)?

Update: we’ve decided in general to retire the “Libronix” name for Logos technologies, so i’m trying to get on board by starting to call this the Logos Controlled Vocabulary.