God’s Word | our words
meaning, communication, & technology
following Jesus, the Word made flesh
January 25th, 2008

Mark Miller: New Culture, New Media

What makes great communication? Story. (lots of pointers to websites doing New Media) Vox Pop Network (Mars Hill) is a good example of a church using new media. Media Convergence will force current software companies to re-think their models. Mashups are a new mix between professionally-created content and user-generated content. RSS is becoming a new broadcast model. wefeelfine.org is one interesting Flash visualization of a variety of feelings taken off the web. (Lots of discussion around how communication is changing, and new modes of communication. Search “continuous partial attention” for more about how multiple simultaneous communication modes affect us.)

January 25th, 2008

Kurt Fuqua: High-Quality Machine Translation Using a Semantic Interlingua

Many languages still require a Bible translation, and the traditional approach is time-consuming and therefore not yet up to scale. Machine translation through an interlingua offers another approach to meeting the demand for new translations. Analysis of the source (exegesis) is the hardest part: but this only has to be done once, and then re-used multiple times in generating new translations.

Having a good interlingua is critical. InterLing, a predicate logic interlingua, was created in 1996 as an open, freely available standard. One intermediate goal short of full translation is generating an interlinear: not completely fluid, but a useful tool for leaders. You can automate layout using XML with word correlations: the same is true for general Bible publication. For example, cross-references are language independent. Languages that have only a New Testament translation can bootstrap Old Testament translations from the NT grammar and related resources.

January 25th, 2008

Andi Wu: Treebanks of Biblical Texts

Andi’s from the Asia Bible Society, and is building Bible treebanks that give the syntactic structure of verses, a joint project between Asia Bible Society and Groves Center at Westminster Theological Seminary. Treebanks support data-driven analysis, study tools, syntactic search, and tree alignment for evaluating translations, concordancing, and other applications.

Dynamic treebanking: use a parser to directly generate trees, but rather than editing the trees, correct the lexical attributes which guide the parser to get the correct analysis. [so it's a compilation process rather than static data annotation] Editing trees directly is painful (and can be dangerous), and inter-annotator consistency can be a problem.
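A toy sketch of the "correct the lexicon, then regenerate the tree" idea. The three-word lexicon, the tag names, and the single grammar rule are all invented for illustration; this is not the parser or data format described in the talk.

```python
# Toy lexicon: each word form carries the attribute the parser relies on.
# "spoke" is deliberately mis-tagged to simulate an annotation error.
lexicon = {"the": "DET", "word": "NOUN", "spoke": "NOUN"}

def parse(tokens, lexicon):
    """Tiny rule-based parser that only knows S -> NP VP, NP -> DET NOUN."""
    tags = [lexicon[t] for t in tokens]
    if tags == ["DET", "NOUN", "VERB"]:
        return ("S", ("NP", tokens[0], tokens[1]), ("VP", tokens[2]))
    return None  # no analysis under the current lexical attributes

sentence = ["the", "word", "spoke"]
before = parse(sentence, lexicon)  # fails because of the bad attribute
lexicon["spoke"] = "VERB"          # fix the attribute, never the tree itself
after = parse(sentence, lexicon)   # the tree is regenerated from scratch
print(before)
print(after)
```

Because the tree is always derived output, every sentence that uses the corrected word picks up the fix on the next run, which is where the consistency benefit would come from.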

Current status: Hebrew prosodic treebank is completely parsed. Hebrew syntactic treebank is parsed (except for Daniel), and being manually checked. Greek syntactic treebank is parsed and being checked. For the prosodic treebank of Hebrew (using Masoretic cantillation marks), every verse has been successfully parsed, and more than 99% of the verses received one and only one parse.

Syntactic treebanks use a combination of phrase-structure grammar (at the phrase level) and dependency grammar (at the clause level), since Hebrew and Greek are non-configurational languages at the clause level. They use multiple trees for ambiguous parses. He showed an interesting dynamic interlinear: selecting a word in one language highlights the corresponding words in the other language.
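The dynamic interlinear can be pictured as a lookup over word-alignment links. This is a minimal sketch only: the index-pair representation, the sample links, and the function names are assumptions, not the format shown in the talk.

```python
# Hypothetical alignment data: (source index, target index) pairs.
source = ["ἐν", "ἀρχῇ", "ἦν", "ὁ", "λόγος"]
target = ["In", "the", "beginning", "was", "the", "Word"]
links = [(0, 0), (1, 1), (1, 2), (2, 3), (3, 4), (4, 5)]

def aligned(index, links, from_source=True):
    """Return the indices on the other side linked to the selected word."""
    if from_source:
        return sorted(t for s, t in links if s == index)
    return sorted(s for s, t in links if t == index)

def highlight(words, selected):
    """Render a line of text with the selected words in brackets."""
    return " ".join(f"[{w}]" if i in selected else w for i, w in enumerate(words))

# Select the second Greek word, then the fourth English word.
print(highlight(target, aligned(1, links)))
print(highlight(source, aligned(3, links, from_source=False)))
```

Storing links as index pairs lets one word map to several words on the other side, which an interlinear needs whenever the two languages do not line up one-to-one.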

Open issues: verse- versus sentence-based parsing, discontinuous constituents. Ultimately he’d like to get to a logical form (language-independent) representation. “This kind of work is much more fun than Microsoft”!

January 25th, 2008

Karl Hofmann: Building Community or Building Babel?

His hope is that technology practitioners will think more carefully about how technology impacts our approach to the Bible. Strong influence: Ivan Illich, a Catholic priest who was involved in the early discussions that led to Vatican 2, and left that process. “The distortion of the best thing becomes the worst thing.” Two books:

Also Rivers North of the Future, conversations with Ivan Illich.

How is the Bible technology? (audience participation produced some of the comments here) Examples: many of the Biblical texts were meant to be heard (not read). Our manuscript traditions and text-critical process affect the “output”. We need to be involved with the Author, not just the text, but also with the right hermeneutic. There are other “forms” of the Bible than print, and software especially is pushing on that definition. Web technology brings new opportunities for community.

The story of Babel from Gen 11 talks about technological development and people “making a name for themselves” through language. What is it that we want to build, and how will it enable us to communicate with each other? The “reification of a community of belief”. Technology can be a double-edged sword.

The technology that’s been added to the Bible (typography, chapter and verse divisions, etc.) allows us to get away from what’s actually being said. “The magic is at run time”, the impact of the Word on the hearer. Just adding vowels to Hebrew can distinguish communities of faith.

January 25th, 2008

James Tauber: MorphGNT

The beginnings of MorphGNT were seeing the functional annotation of the Friberg AGNT, and CCAT at UPenn in Beta code. An early realization in working with CCAT is that the data would be much more usable if regularized into one lemma per line: you can then use Unix command line utilities. Further investigation revealed thousands of errors (though some were systematic and hence easily fixed), including some deeper analytic ones. Building systems to generate data from scratch has been an important part of the process of identifying errors.
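To illustrate why a one-record-per-line layout is so convenient, here is a small sketch that does the kind of counting a Unix pipeline would do. The four-column layout (reference, part of speech, form, lemma) and the sample rows are made up for illustration and are not the actual MorphGNT file format.

```python
from collections import Counter

# Hypothetical layout: reference, part of speech, surface form, lemma.
SAMPLE = """\
John.1.1 P ἐν ἐν
John.1.1 N ἀρχῇ ἀρχή
John.1.1 V ἦν εἰμί
John.1.1 T ὁ ὁ
John.1.1 N λόγος λόγος
John.1.1 C καὶ καί
John.1.1 T ὁ ὁ
John.1.1 N λόγος λόγος
John.1.1 V ἦν εἰμί
"""

def parse(text):
    """Split each non-empty line into a record with named fields."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        ref, pos, form, lemma = line.split()
        rows.append({"ref": ref, "pos": pos, "form": form, "lemma": lemma})
    return rows

def lemma_counts(rows, pos=None):
    """Count lemmas, optionally restricted to one part of speech."""
    return Counter(r["lemma"] for r in rows if pos is None or r["pos"] == pos)

rows = parse(SAMPLE)
print(lemma_counts(rows).most_common(3))
print(lemma_counts(rows, pos="N"))
```

With one record per line, each question about the text becomes a filter plus a count, which is the same shape of work that sort, uniq, and awk do on the command line.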

Much of the content of Mounce’s book (with reference to morphology) could be replicated with a single awk command on the MorphGNT data.

Helped start the Electronic New Testament Manuscripts Project in 1996: but it was too early, and people didn’t understand what it meant to put things on the web.

Much of the early challenge was simply putting Greek on the web. This led to GreekGIF, a series of images of Greek letters that enabled more readable representations. “I’m relieved to say this is no longer necessary”!

Early involvement with XML put PhD plans on hold. Around 2002, started working on automatically generating inflected forms (initially driven by Mounce’s classes). In 2004, released v5 of MorphGNT, now with Unicode. zhubert.com was the next major development in the use of MorphGNT, and a milestone event. Since then, been working on other corrections (which haven’t yet led to a new version), started PhD studies, and also started collaborating with Ulrik Sandborg-Petersen. A current interest is splitting the text from the analysis: you really need an additional field to identify the analysis for a particular form to eliminate all ambiguity. Also working on splitting off the lexicon: morphological analysis, semantic domain, and other attributes, as well as standardizing lexical representations.

The myth of vocabulary coverage: “the 100 most common words account for 66% of the text”. But these words typically don’t have information content, and you really need about 95% of the words in a verse to understand it (according to learning theorists). Really, we need a new kind of graded reader that’s optimized for early comprehension, clause-based, form (rather than lexeme) based, and gives context in English. Progressive substitution of Greek phrases into an English text helps provide a gradual transition.
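The gap between corpus-level coverage and per-verse comprehension is easy to see with a small calculation. This sketch uses a toy English stand-in for a tokenized text; the sample verses and function names are assumptions for illustration, and the 95% figure is the threshold quoted above.

```python
from collections import Counter

def top_n_coverage(tokens, n):
    """Fraction of all tokens accounted for by the n most frequent words."""
    counts = Counter(tokens)
    covered = sum(c for _, c in counts.most_common(n))
    return covered / len(tokens)

def verse_coverage(verse_tokens, known):
    """Fraction of the tokens in one verse that the reader already knows."""
    return sum(t in known for t in verse_tokens) / len(verse_tokens)

# Toy stand-in for a tokenized text, grouped by verse.
verses = [
    "in the beginning was the word".split(),
    "and the word was with god".split(),
    "and the word was god".split(),
]
tokens = [t for v in verses for t in v]

# Pretend the reader has learned only the three most frequent words.
known = {w for w, _ in Counter(tokens).most_common(3)}

print(round(top_n_coverage(tokens, 3), 2))
print([round(verse_coverage(v, known), 2) for v in verses])
```

In this toy sample the three most common words cover 10 of the 17 tokens, yet no single verse comes close to 95% known words. That is the same pattern the talk describes at full scale.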
Web site: http://morphgnt.org.

January 25th, 2008

Blogging BibleTech:2008

It’s only 10AM day one, and I’ve already had several great conversations here at BibleTech, where some of the leading figures in the intersection of technology and Biblical study are gathered.

I’m planning to informally blog a few of the talks, just so those of you who didn’t come will know what you’re missing! You’ll find these (and hopefully other posts about the conference) here behind the Technorati tag “bibletech08”.
