Automatically Learning Topical Content

Sean Boisen, [myfirstname], @SeanBoisen

Director of Content Innovation, Logos Bible Software

BibleTech:2013, Seattle WA March 15, 2013

Presentation URL:

#BibleTech, #BibleData

Semantic Organization at Logos

Why Topical Analysis?

The Logos Controlled Vocabulary (LCV)

LCV Example: Deceit

This organizes knowledge about language and subjects in biblical studies

Newer LCV Features

Connecting different datasets

Connecting Concepts to Terms and References

Deriving term-concept connections
Concept: An item from the LCV (a "topic")
Book: Traditional container for content
Article: Subject-oriented unit of prose from a book
Term: A "word" from an article
Reference: A Bible reference term
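
One way to picture how these units fit together is as a small data model. This is only an illustrative sketch: the class name, field names, and sample values are hypothetical and do not reflect the actual Logos schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Article:
    """Subject-oriented unit of prose from a book, tagged with one concept."""
    book: str                 # title of the containing book
    concept: str              # LCV concept (topic) the article is about
    terms: List[str] = field(default_factory=list)       # "words" in the article
    references: List[str] = field(default_factory=list)  # Bible references cited


# Hypothetical example data, not taken from the LCV corpus.
article = Article(
    book="Example Bible Dictionary",
    concept="Priscilla",
    terms=["tentmaker", "Aquila", "Corinth"],
    references=["Ac 18:26", "Ro 16:3"],
)
```

Keeping terms and references as separate lists mirrors the distinction drawn above: a reference is a special kind of term that points into the Bible text.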

The LCV Corpus: Overview

Resources by term count and reference percentage

Breadth of Article Sources for Concepts

Binned number of sources per concept
So one definition of "important" is "broad"
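
Breadth in this sense can be computed by counting the distinct books that contain an article on each concept. The sketch below assumes the corpus can be reduced to (concept, book) pairs; the function name and the sample data are hypothetical.

```python
from collections import defaultdict

# Hypothetical (concept, book) pairs: one pair per article found for a concept.
articles = [
    ("Priscilla", "Dictionary A"),
    ("Priscilla", "Dictionary B"),
    ("Priscilla", "Dictionary A"),   # a second article in the same book
    ("Ointment", "Dictionary A"),
]


def breadth_by_concept(pairs):
    """Return {concept: number of distinct books with an article on it}."""
    books = defaultdict(set)
    for concept, book in pairs:
        books[concept].add(book)
    return {concept: len(titles) for concept, titles in books.items()}


breadth = breadth_by_concept(articles)
print(breadth)  # {'Priscilla': 2, 'Ointment': 1}
```

Using a set per concept means that several articles from the same book count once, so the number measures how widely a concept is covered rather than how often.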

Distribution of Terms for Concepts

Number of terms per concept
Another definition of "important" is how much is written about it
I didn't freehand this! Gives some confidence that scale is helping here.
Other measures we don't yet have data for: popularity (search, or topic guide) and "significance" according to some other standard. But breadth and term count are good cross-book approximations.

Learning Concept → Reference Associations

concept types
Ac 18:26 20
2 Ti 4:19 19
Ro 16:3 16
1 Co 16:19 15
Ac 18:18 9
Ac 18:2 8
Ac 18:1-3 3
Ac 18:2-3 3
Ro 16:3-5 3
Ro 16:4 3

42 references, 137 instances
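
Counts like these can be produced by tallying every reference cited in the articles about a concept. The sketch below is a minimal illustration under two assumptions: each article is already reduced to a list of reference strings, and every mention counts as one instance. The function name and the sample lists are hypothetical.

```python
from collections import Counter

# Hypothetical input: the references cited by each article about one concept.
articles_for_concept = [
    ["Ac 18:26", "Ro 16:3", "Ac 18:2"],
    ["Ac 18:26", "2 Ti 4:19"],
    ["Ro 16:3", "Ac 18:26"],
]


def reference_counts(articles):
    """Tally how often each reference is cited across all articles."""
    counts = Counter()
    for refs in articles:
        counts.update(refs)
    return counts


counts = reference_counts(articles_for_concept)
distinct_references = len(counts)        # number of different references
total_instances = sum(counts.values())   # number of citations overall
print(counts.most_common(2))  # [('Ac 18:26', 3), ('Ro 16:3', 2)]
```

The two summary numbers correspond to the "references" and "instances" figures reported for a concept: the first counts distinct references, the second counts every citation.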

Learning Reference → Concept Associations

Learning Term → Concept Associations

key terms in Priscilla article
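
The slides do not say how key terms were scored. One common approach, assumed here purely for illustration, is TF-IDF: a term scores highly when it is frequent in the target article but rare in other articles. The function name and the toy word lists are hypothetical.

```python
import math
from collections import Counter


def key_terms(target, background, top_n=3):
    """Rank the terms of `target` by TF-IDF against `background` articles."""
    docs = [target] + background
    n_docs = len(docs)

    # Document frequency: in how many articles does each term appear?
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(doc))

    # Term frequency within the target article, weighted by rarity elsewhere.
    tf = Counter(target)
    scores = {
        term: (count / len(target)) * math.log(n_docs / doc_freq[term])
        for term, count in tf.items()
    }
    # Highest score first; ties are broken alphabetically.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_n]


# Hypothetical tokenized articles, not real corpus data.
target = ["priscilla", "aquila", "tentmaker", "paul", "paul", "church"]
background = [["paul", "church", "letter"], ["church", "elder", "paul"]]
print(key_terms(target, background))  # ['aquila', 'priscilla', 'tentmaker']
```

In this toy example "paul" and "church" appear in every article, so they score zero even though "paul" is the most frequent word in the target.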

Concept Relationships through Reference Vectors

Mk 14:3 19
Lk 23:56 16
Es 2:12 14
Is 1:6 13
Ps 133:2 13
Ex 30:23-25 12
Job 41:31 12
Lk 7:46 12
Ec 7:1 11
Je 8:22 11

Top Ointment Matches by Reference

spikenard 0.0018
box 0.0243
flask 0.0590
nard 0.0652
alabaster 0.0905
spice 0.1347
oil 0.1383
Cosmetics_Object 0.1387
BethanyOfJudith_Place 0.1424
BethanyOnTheMountOfOlives_Place 0.1439
perfume 0.15021
Bethany_Place 0.15391
recreation 0.15791
perfumery 0.15908
anointing 0.16091
leper 0.16426
herb 0.16465
Bethabara_Place 0.16714
precious 0.16828
vial 0.17194
Are these useful? Would an editor have thought of them?
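
The slides do not state which measure produced these scores. Because the values ascend from the closest match, they look like distances, and cosine distance between reference-count vectors is one plausible choice. The sketch below assumes that measure. The Ointment vector uses the first three counts listed above; the candidate vectors and all names are hypothetical.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two sparse count vectors (dicts)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # treat an empty vector as maximally distant
    return 1.0 - dot / (norm_a * norm_b)


# Reference vector for Ointment: first three counts from the list above.
ointment = {"Mk 14:3": 19, "Lk 23:56": 16, "Es 2:12": 14}

# Hypothetical reference vectors for two candidate matches.
candidates = {
    "alabaster": {"Mk 14:3": 5, "Lk 7:46": 1},
    "leper": {"Mk 14:3": 1, "Le 13:2": 6},
}

# Smaller distance means a closer match, so sort ascending.
ranked = sorted(
    (cosine_distance(ointment, vec), name) for name, vec in candidates.items()
)
for distance, name in ranked:
    print(f"{name} {distance:.4f}")
```

Candidates that share heavily cited references with Ointment end up near the top, which is how terms an editor might not have listed by hand can surface.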


Resources and References

About This Presentation

