Happy Powers Of Ten Day

It’s been 30 years since Charles and Ray Eames made their brief movie “Powers of Ten”, illustrating what it would be like to zoom out and in from the perspective we normally enjoy to ones at the cosmic and microscopic levels.
Today, applications like Google Earth (and Hollywood spy movies) have made the “Long Zoom” familiar to many of us: but the Eames’ film was one of the earliest attempts to depict this kind of thinking. The Eames Office suggests each October 10th as an opportunity to think about both the Big and Small Picture (literally).
I’ve been thinking for a while now about how to bring a Long Zoom perspective to the visualization and navigation of the Bible: i’ll be giving a talk about this at Bible Tech 2008. In the meantime, Happy Powers of Ten Day!
View the movie Powers of Ten:

Tag Clouds for Visualization

Tim comments over at the LibraryThing blog about click-based tag clouds, like this one from the State of Delaware website.

Click-based Tag Cloud

I’m not so sure why this seems surprising or innovative. Tim rightly notes that tag clouds are more commonly used to represent tagged items. But fundamentally, a tag cloud is just a kind of textual histogram, where

  • rather than a horizontal axis, the axis is wrapped across multiple lines (like text), making it more compact
  • rather than a bar whose height indicates magnitude, the font size (typically scaled) shows magnitude

So you can use a tag cloud for any kind of frequency distribution whose labels are textual. For example:

  • word counts from a document (i used to have Hyper-concordance views like this, though they’ve gone missing in action)
  • article titles for a blog, where the magnitude might be # of page views, # of citations by Google/del.icio.us/Connotea/you name it, # of sentences
  • wiki pages by # of outbound or inbound links
  • content and prosody measures for a text (see this old Blogos post)
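
The "textual histogram" idea can be sketched in a few lines of Python. This is only an illustration: the function names, the point-size range, and the choice of logarithmic scaling are all assumptions, not something taken from any of the sites mentioned above.

```python
import math
from collections import Counter


def tag_cloud_sizes(counts, min_pt=10, max_pt=36):
    """Map each label's count to a font size in points, using log scaling."""
    # Ignore zero or negative counts, which have no logarithm.
    logs = {label: math.log(n) for label, n in counts.items() if n > 0}
    if not logs:
        return {}
    lo, hi = min(logs.values()), max(logs.values())
    span = hi - lo
    sizes = {}
    for label, value in logs.items():
        # When every count is equal, fall back to the smallest size.
        fraction = (value - lo) / span if span else 0.0
        sizes[label] = round(min_pt + fraction * (max_pt - min_pt))
    return sizes


def word_counts(text):
    """Count lowercased words: the 'word counts from a document' case."""
    words = (w.strip(".,;:!?\"'()").lower() for w in text.split())
    return Counter(w for w in words if w)


if __name__ == "__main__":
    sample = "the cat saw the dog and the dog saw the cat run"
    sizes = tag_cloud_sizes(word_counts(sample))
    # Labels in alphabetical order, wrapped like text in a real cloud.
    for label in sorted(sizes):
        print(f"{label}: {sizes[label]}pt")
```

Any of the other distributions in the list (page views per article title, links per wiki page) would work the same way: only the dictionary of counts changes.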

I’m not surprised people like them: they can be a very effective visualization tool. But i am surprised people are surprised by the fact that they’re being used in more than just one way.

Visualizing Text with LiveInk

Bob Pritchett recently blogged about visual text formatting: one example is LiveInk. I’ve wondered for some time how we might improve reading now that we don’t really need to have one-fontsize-fits-all, linear textual arrangements (for example, sizing text by prominence).

Apropos of this, i’m reading Robin Williams’ Non-Designers Design Book (hat tip to Coding Horror), a good starting point for people who aren’t professional designers but still have to do some kind of design (pretty much all of us these days). Two of the basic principles are Alignment and Proximity: elements that are close or aligned will seem related (whether they really are or not!).

Back to LiveInk: here’s one of their demo examples.

LiveInk Sample

While breaking the sentence up definitely makes it more scannable, i have some trouble parsing the result, and i think Williams’ principle of Alignment helps explain it. For example, the alignment of “means” and “among adults” makes me think they’re somehow related. But they’re not: “among adults” modifies “physical activity”, and the linguist in me thinks it ought to therefore be moved farther to the right. Of course, you can only push right so far before running out of room, and maybe that’s the practical explanation for the alignment here (LiveInk’s site suggests they have solid research behind what they do).

Visualizing Bible Data at Many Eyes

When i presented my work on the Composite Gospel Index at the 2005 SBL meeting, i included a number of treemaps showing different overviews and perspectives on the Gospel data. I put my slides on the web, but i was frustrated that i couldn’t share the treemaps themselves! The excellent Treemap software (from the HCIL group at the University of Maryland) that i used to create them is a desktop application. I could dump screenshots, but interactivity is one of the cool features of a treemap, so not being able to put the treemaps on the web (without a bunch of Java programming i didn’t know how to do) felt like a serious omission. It takes a lot more words to explain what quickly becomes clear when you have the visual (that’s why we do visualization!). (note there was some kind of problem with the data set i put there, which should now be fixed.)

So i was excited to learn about Many Eyes, a new site from some established researchers in infoviz that allows you to experiment with and share visualizations. You upload your data in a tab-delimited format with headers, and Many Eyes guesses the data types (currently just text and number). Once the data is stored, you can try any number of different visualizations, including several that go beyond the usual: bar charts, histograms, bubble charts, network diagrams, and treemaps. The visualizations can be published to their website so others can view and interact with them (or create new ones using your data set).
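
To give a feel for the upload step, here is a rough Python sketch of reading tab-delimited data with headers and guessing whether each column is text or number. It is not how Many Eyes actually does it (that isn't documented here); the function name and the column names in the sample are made up for illustration.

```python
import csv
import io


def guess_column_types(tsv_text):
    """Return {header: "number" or "text"} for tab-delimited text with a header row."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    # Assume every column is numeric until a non-numeric value turns up.
    types = {name: "number" for name in (reader.fieldnames or [])}
    for row in reader:
        for name, value in row.items():
            if name is None or value is None or value.strip() == "":
                continue  # skip ragged rows and empty cells
            try:
                float(value)
            except ValueError:
                types[name] = "text"
    return types


if __name__ == "__main__":
    # Hypothetical column names; the rows use pericopes mentioned in this post.
    sample = (
        "pericope\ttitle\tsources\n"
        "Pericope.122\tJesus feeds five thousand\t4\n"
        "Pericope.119\tJesus prepares the disciples for persecution\t1\n"
    )
    print(guess_column_types(sample))
```

Note that a column with no values at all stays "number" in this sketch, which is an arbitrary choice.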
Here’s a linked graphic to a treemap that shows individual pericopes, grouped across gospels, colored by how many sources include that pericope (1-4 gospels), and sized by the number of verses (for each source) that make up that pericope. So at the upper left is Pericope.122, “Jesus feeds five thousand”, which has the most verses of all the pericopes (since it occurs in all four gospels, and each gospel has a fairly lengthy description).

The coloring makes it easy to see which pericopes are common or unique among the Gospels: so one column over, 4th down is Pericope.119, “Jesus prepares the disciples for persecution”, the largest pericope source that is unique to an individual Gospel (though of course elements of this occur in the other Gospels: some might consider this particular instance an artifact of my methodology for dividing and grouping pericopes). In the next column to the right are Pericope.264, “Jesus condemns the religious leaders”, and Pericope.213, “Jesus tells the parable of the lost son”, both lengthy pericopes from a single source (and here i don’t think the data are vulnerable to the same criticism about my methodology).
You can also re-arrange the hierarchy and select different attributes for the treemap (though i can’t claim the data labels are obvious enough to make that easy!). Many Eyes doesn’t let you do everything the UMd Treemap software does: for example, i’ve found it helpful to color the individual pericope sources by their source gospel, but apparently you can only color by numeric fields, so that’s not possible with Many Eyes. But in my experience, simpler-yet-web-enabled often beats sophisticated-yet-trapped-on-my-desktop.
I’ve long been interested in information visualization, and in my new position at Logos Research Systems i’m hoping to have more opportunity to explore how visual presentation can help people understand the Bible in new ways.

(Thanks to O’Reilly Radar for pointing out the Many Eyes site)

Bible Mapping Sites

The ESV Blog had a post last week about BibleMap.org, a new interactive mapping application that combines the ESV Bible text, a Google Maps display, and articles from the International Standard Bible Encyclopedia (ISBE). So you can find a passage, click on the hyperlinks for place names, and see a satellite picture of e.g. where Nazareth is actually located (unfortunately, Google can’t show you what it looked like 2000 years ago!).
Of course, it’s wonderful that people are making these kinds of applications available: thinking about the place names in the Bible is an essential part of really understanding the context, though i suspect most Bible readers tend to simply gloss over them. This kind of tight integration can help bring the world of the Bible alive to modern readers.
Nevertheless, without faulting the creators of this site, i can’t help but wish for more:

  • This is a classic example of a stovepipe application: while it’s got a lot of useful data (linking verses to place names, place names to lat-longs, and place names to ISBE articles), all of that data is embedded in the application (the website) itself. That’s fine if all you want is to use it, but not if you want to re-use it. If instead there were a web service behind this, there could be multiple versions of this same basic capability, without having to re-engineer the basic data. I’ve ridden the hobby horse of data before applications before, and this is a basic tenet of Web 2.0 thinking. The most recent version of New Testament Names has some Google Earth data (which i used for this map in my SBL presentation) for just this reason, though (like BibleMap.org) it’s not complete.
  • I can easily guess why they chose the ISBE: it’s the most comprehensive Bible reference work in the public domain. But it’s not the most up-to-date (if it were, it probably wouldn’t be in the public domain!), and the depth of information sometimes goes well beyond what casual readers want. Which raises the fundamental question: what’s the right level of information for a reference like this? Most readers won’t care about proximity to modern archaeological sites, and would instead rather have basic information like best guesses as to how large a town was, prominent physical features, etc. Much of this information doesn’t exist in ISBE (or other resources, for that matter).
  • Once you start down the road of information integration (using hyperlinks or other mechanisms), you hate to stop. Wouldn’t it be great if the ISBE text itself was also hyperlinked with place names? The first part of the ISBE article on Nazareth reads

    “A town in Galilee, the home of Joseph. and the Virgin Mary, and for about 30 years the scene of the Saviour’s life (Matthew 2:23; Mark 1:9; Luke 2:39,51; 4:16, etc.). He was therefore called Jesus of Nazareth, although His birthplace was Bethlehem; …”

    Unsurprisingly, definitions for place names typically use other place names to put things in context. Without the hyperlinks here, the text becomes a bit of a dead-end.

  • Their display for John 1:28 shows a classic example of why simple string matching gets you most, but not quite all, of the way: Bethany isn’t the same as “Bethany beyond the Jordan”. Happily, there are few enough of these cases in the Bible texts that they can generally be fixed by hand: but having fixed them, that disambiguation becomes another critical piece of data that shouldn’t be stovepiped.
  • Viewing a little of the geographic context mostly leaves me wanting more. Back to my example of Nazareth: i’d like to see additional overlays of other towns (and of course, that’s specific to the context of a given passage) as well as other features like travel routes and named bodies of water, since showing that town alone doesn’t tell you much. There’s also the subtle issue of what’s the right zoom level: for Matt.4.13, you’d want the map to show both Nazareth and Capernaum, rather than being closely focused in on Nazareth alone.
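
The Bethany problem above is partly a matter of matching order. As a minimal sketch (assuming a simple list of known place names, and using a made-up sentence rather than any particular translation), trying longer names before shorter ones keeps "Bethany beyond the Jordan" from being tagged as plain "Bethany":

```python
import re


def find_places(text, place_names):
    """Return (start, end, name) for each place found, preferring the longest name."""
    if not place_names:
        return []
    # Sort longest first so "Bethany beyond the Jordan" is tried before "Bethany".
    ordered = sorted(place_names, key=len, reverse=True)
    pattern = re.compile(r"\b(?:" + "|".join(re.escape(n) for n in ordered) + r")\b")
    return [(m.start(), m.end(), m.group(0)) for m in pattern.finditer(text)]


if __name__ == "__main__":
    names = ["Bethany", "Jerusalem", "Bethany beyond the Jordan"]
    # Illustrative sentence only, not a quotation from a Bible text.
    sentence = "John was baptizing at Bethany beyond the Jordan, not the Bethany near Jerusalem."
    for start, end, name in find_places(sentence, names):
        print(start, end, name)
```

This only helps when the fuller name actually appears in the text. Where a bare "Bethany" could mean either place, the hand-made disambiguation is still needed, which is exactly the data worth sharing.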

It’s always easier to critique than to create, i know. My point is simply this: while these early integrations of open tools like Google Maps with Bible study applications are exciting, much more is still possible.

CSS for Interlinear Styling

Many Bible readers are introduced to interlinear displays through a New Testament that shows the correspondence between English and Greek terms. Lots of other language-related phenomena can also be aptly visualized using interlinear displays. For example, the ESV English-Greek Reverse Interlinear New Testament (which i’ve recently begun using through Logos Bible Software) stacks up five levels of information:

  • the English text
  • the Greek terms (re-arranged to match the English order: that’s the “reverse” part)
  • the Greek lemma
  • the Greek part of speech and morphological analysis
  • the Strong’s number (which is really just a notational variant of the lemma)

Interlinears are good for other things too: today i was working on an interlinear display to show the alignment of speech recognizer output (which makes lots of mistakes) with the correct transcript: that’s one way to make it easier to compare and analyze the kinds of mistakes it makes. Most any kind of information that can be defined to correspond to lexical items can be usefully displayed this way.
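
Before anything can be displayed, the two word sequences have to be aligned. Here is one rough way to do that in Python, using the standard library's difflib as a stand-in: it is not the minimum-edit-distance alignment that speech recognition scoring normally uses, and the function name and sample words are invented for illustration.

```python
from difflib import SequenceMatcher


def align_words(reference, hypothesis):
    """Align two word lists; return (ref_words, hyp_words, label) groups."""
    matcher = SequenceMatcher(a=reference, b=hypothesis, autojunk=False)
    groups = []
    # Each opcode covers a run of words that is equal, replaced, deleted, or inserted.
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        ref_part = " ".join(reference[i1:i2])
        hyp_part = " ".join(hypothesis[j1:j2])
        groups.append((ref_part, hyp_part, tag))
    return groups


if __name__ == "__main__":
    ref = "recognize speech with this".split()
    hyp = "wreck a nice beach with this".split()
    for ref_part, hyp_part, tag in align_words(ref, hyp):
        print(f"{tag:8} | {ref_part!r} | {hyp_part!r}")
```

Each group is one column of the interlinear display: the correct words on one line, the recognizer's words on the next.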

So i figured it was worth a little effort to try to figure out a principled way to organize and display this kind of information in a browser. I had to dig deep into the CSS Guide, but eventually discovered the display: inline-block; declaration.

It works like this for my speech recognition example: there’s a div whose class is utterance, and which contains a span for each aligned/interlinear group. Each element in the alignment group is also in a span. This file shows the idea: view the source to see the details and the (minimal) CSS styling (but see below if you use Firefox). Look Ma, no tables! And because it’s CSS, things just work like they should if you change the text size, alter the size of the window, etc.
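
For readers who can't view the linked file, the structure described above can be sketched like this. The Python below just generates the markup: the utterance class comes from the description, while the group class name, the CSS rules, and the function name are assumptions for illustration rather than the contents of the original file.

```python
from html import escape

# Assumed minimal styling: each group sits inline, and its members stack vertically.
CSS = """\
.utterance .group { display: inline-block; margin: 0 0.5em 0.5em 0; text-align: center; }
.utterance .group span { display: block; }
"""


def interlinear_html(groups):
    """Render aligned groups (tuples of strings) as nested spans inside a div."""
    parts = ['<div class="utterance">']
    for group in groups:
        # An empty member still gets a span so the rows keep lining up.
        rows = "".join(f"<span>{escape(item) or '&nbsp;'}</span>" for item in group)
        parts.append(f'  <span class="group">{rows}</span>')
    parts.append("</div>")
    return "\n".join(parts)


if __name__ == "__main__":
    # Hypothetical aligned pairs: correct transcript on top, recognizer output below.
    groups = [("recognize speech", "wreck a nice beach"), ("with this", "with this")]
    print(f"<style>\n{CSS}</style>")
    print(interlinear_html(groups))
```

Because each group is an inline-block, the groups wrap like words when the window narrows, while the members inside a group stay stacked.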
Googling afterwards, i found some posts along similar lines from James Tauber and Pinyin News, both using <p> tags, which (as Pinyin News notes) is a minor violation of markup semantics: these are words, after all, not paragraphs. The inline-block approach is a little cleaner since it just uses spans (though i couldn’t argue this is a huge issue).
I also loved this quote from Pinyin News:

The interlinear version of the Scriptures is the prototype or ideal of all translation.
— Walter Benjamin

Problem: Firefox doesn’t seem to recognize the inline-block value! Opera does (no big surprise), as does IE 6 (that was a surprise!). Usually IE is the loser in the browser-compliance wars, but here IE gets it right. That’s disappointing, since Firefox is my usual browser of choice: but (as Eric Meyer suggests) the right approach is usually to stick with the standards, and hope the browsers eventually catch up.

AJAX Timeline

In the “How Cool is That!” Department: this very morning, i was looking (for the umpteenth time) for some not-invented-by-me and open/semi-standard way to author event information in an XML format, to be rendered graphically in the form of a timeline. I’d like to record and organize some major events of my life (while i can still remember most of them!) and have a visualization of the results: i’m also interested in expressing genealogical information this way. This time around, i found the Historical Event Markup Language project, which i intend to take a closer look at. It looks promising, and i said to myself at the time “wouldn’t it be cool to create a visual timeline of early Christian history?”.
So tonight, going to the SIMILE project at MIT’s web site, i found something new: their Timeline project, which offers a DHTML widget for making timelines. Check out this very detailed timeline of Jewish and Christian history: you really need a 100″ monitor to get the big picture here! (hint: the top inch or so is a control you grab to scroll the window right and left)

Blogos Posting Frequency

One nice consequence of the move to WordPress was a tidy set of data about my posting history, organized by months, from my first post in April 2003 through September (not yet complete of course). It’s currently on the sidebar on the left (though i’ll probably get rid of this in the layout: it takes too much space, and the links aren’t quite right anyway). Here it is in the form of a sparkline instead:

Sparkline of Number of Blogos Posts Per Month

A few details are obvious:

  • Typical initial blogging enthusiasm resulting in lots of posts (blue dot: high of 41 in June 2003), waning down to the first zero-post month (first red dot) in July 2004.
  • A definite seasonal effect: summer has generally been a quieter time (the second red dot is zero posts in August 2005), and the two peaks after the decline have been in winter (Jan 2005, 12 posts; March 2006, 18 posts).
  • The blue zone is 1 standard deviation, and i’ve been there consistently since about 12 months after starting.
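
The markers on a sparkline like this come from a few simple statistics. Here is a minimal Python sketch, assuming the band is one standard deviation on either side of the mean (the post doesn't say exactly how the band was computed) and using a made-up series rather than the real Blogos counts:

```python
from statistics import mean, pstdev


def sparkline_summary(counts):
    """Return the high point, the zero months, and a one-standard-deviation band."""
    avg = mean(counts)
    sd = pstdev(counts)
    # Index of the first month with the highest count.
    high_index = max(range(len(counts)), key=counts.__getitem__)
    return {
        "high": (high_index, counts[high_index]),
        "zeros": [i for i, n in enumerate(counts) if n == 0],
        # Post counts can't be negative, so clip the lower edge at zero.
        "band": (max(0.0, avg - sd), avg + sd),
    }


if __name__ == "__main__":
    monthly_posts = [4, 8, 0, 4]  # hypothetical counts, one per month
    print(sparkline_summary(monthly_posts))
```

The high point would get the blue dot, the zero months the red dots, and the band the shaded zone.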

Since i have the exported data in a tidy text file, a little parsing would let me create a parallel graphic for length of post, hopefully a little more consistent over time.

Other resources on sparklines: