God’s Word | our words
meaning, communication, & technology
following Jesus, the Word made flesh
September 27th, 2006

CSS for Interlinear Styling

Many Bible readers are introduced to interlinear displays through a New Testament that shows the correspondence between English and Greek terms. Lots of other language-related phenomena can also be aptly visualized using interlinear displays. For example, the ESV English-Greek Reverse Interlinear New Testament (which i’ve recently begun using through Logos Bible Software) stacks up five levels of information:

  • the English text
  • the Greek terms (re-arranged to match the English order: that’s the “reverse” part)
  • the Greek lemma
  • the Greek part of speech and morphological analysis
  • the Strong’s number (which is really just a notational variant of the lemma)

Interlinears are good for other things too: today i was working on an interlinear display to show the alignment of speech recognizer output (which makes lots of mistakes) with the correct transcript: that’s one way to make it easier to compare and analyze the kinds of mistakes it makes. Most any kind of information that can be defined to correspond to lexical items can be usefully displayed this way.

So i figured it was worth a little effort to try to figure out a principled way to organize and display this kind of information in a browser. I had to dig deep into the CSS Guide, but eventually discovered the display: inline-block; declaration.

It works like this for my speech recognition example: there’s a div whose class is utterance, and which contains a span for each aligned/interlinear group. Each element in the alignment group is also in a span. This file shows the idea: view the source to see the details and the (minimal) CSS styling (but see below if you use Firefox). Look Ma, no tables! And because it’s CSS, things just work like they should if you change the text size, alter the size of the window, etc.
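As a rough illustration of the structure just described, here is a minimal Python sketch that generates the markup. Only the class name utterance comes from the post; the class name group, the stylesheet rules, and the sample words are assumptions made for the example, not the contents of the linked file.

```python
from html import escape

# Hypothetical stylesheet: only the class name "utterance" comes from the post.
# Each group flows inline like a word; its levels stack as blocks inside it.
CSS = """\
.utterance .group { display: inline-block; margin: 0 0.5em 1em 0; vertical-align: top; }
.utterance .group span { display: block; }
"""


def render_utterance(groups):
    """Return a div.utterance holding one span per aligned group.

    Each group is a sequence of strings, one string per interlinear level.
    """
    parts = ['<div class="utterance">']
    for levels in groups:
        # One inner span per level; text is escaped so it is safe in HTML.
        inner = "".join(f"<span>{escape(text)}</span>" for text in levels)
        parts.append(f'  <span class="group">{inner}</span>')
    parts.append("</div>")
    return "\n".join(parts)


if __name__ == "__main__":
    # Made-up alignment: transcript word on top, recognizer output below.
    sample = [("recognize", "wreck a nice"), ("speech", "beach")]
    print(f"<style>\n{CSS}</style>")
    print(render_utterance(sample))
```

Because each group is an inline-block, the groups wrap like ordinary words when the window narrows, while the levels inside a group stay stacked.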
Googling afterwards, i found some posts along similar lines from James Tauber and Pinyin News, both using <p> tags, which (as Pinyin News notes) is a minor violation of markup semantics: these are words, after all, not paragraphs. The inline-block approach is a little cleaner since it just uses spans (though i couldn’t argue this is a huge issue).
I also loved this quote from Pinyin News:

The interlinear version of the Scriptures is the prototype or ideal of all translation.
— Walter Benjamin

Problem: Firefox doesn’t seem to recognize the inline-block value! Opera does (no big surprise), as does IE 6 (that was a surprise!). Usually IE is the loser in the browser-compliance wars, but here IE gets it right. That’s disappointing, since Firefox is my usual browser of choice: but (as Eric Meyer suggests) the right approach is usually to stick with the standards, and hope the browsers eventually catch up.

September 5th, 2006

AJAX Timeline

In the “How Cool is That!” Department: this very morning, i was looking (for the umpteenth time) for some not-invented-by-me and open/semi-standard way to author event information in an XML format, to be rendered graphically in the form of a timeline. I’d like to record and organize some major events of my life (while i can still remember most of them!) and have a visualization of the results: i’m also interested in expressing genealogical information this way. This time around, i found the Historical Event Markup Language project, which i intend to take a closer look at. It looks promising, and i said to myself at the time “wouldn’t it be cool to create a visual timeline of early Christian history?”.
So tonight, going to the SIMILE project at MIT’s web site, i found something new: their Timeline project, which offers a DHTML widget for making timelines. Check out this very detailed timeline of Jewish and Christian history: you really need a 100″ monitor to get the big picture here! (hint: the top inch or so is a control you grab to scroll the window right and left)

September 3rd, 2006

Blogos Posting Frequency

One nice consequence of the move to WordPress was a tidy set of data about my posting history, organized by months, from my first post in April 2003 through September (not yet complete of course). It’s currently on the sidebar on the left (though i’ll probably get rid of this in the layout: it takes too much space, and the links aren’t quite right anyway). Here it is in the form of a sparkline instead:

(image: Sparkline of Number of Blogos Posts Per Month)

A few details are obvious:

  • Typical initial blogging enthusiasm resulting in lots of posts (blue dot: high of 41 in June 2003), waning down to the first zero-post month (first red dot) in July 2004.
  • A definite seasonal effect: summer has generally been a quieter time (the second red dot is zero posts in August 2005), and the two peaks after the decline have been in winter (Jan 2005, 12 posts; March 2006, 18 posts).
  • The blue zone is 1 standard deviation, and i’ve been there consistently since about 12 months after starting
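For readers who want to reproduce the shaded zone from their own exported data, here is a small Python sketch of a one-standard-deviation band around the mean. The post doesn’t say how its band was computed, so the use of the population standard deviation is an assumption, and the counts in the usage example are invented sample values, not the real Blogos figures.

```python
from statistics import mean, pstdev


def one_sd_band(counts):
    """Return (low, high): the mean minus and plus one standard deviation.

    Uses the population standard deviation; that choice is an assumption.
    """
    m = mean(counts)
    sd = pstdev(counts)
    return (m - sd, m + sd)


def months_in_band(counts):
    """Return one boolean per month: True when the count lies inside the band."""
    low, high = one_sd_band(counts)
    return [low <= c <= high for c in counts]


if __name__ == "__main__":
    # Invented monthly post counts, for illustration only.
    sample = [2, 4, 4, 4, 5, 5, 7, 9]
    print(one_sd_band(sample))
    print(months_in_band(sample))
```

The same two functions would apply unchanged to a per-month series of post lengths.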

Since i have the exported data in a tidy text file, a little parsing would let me create a parallel graphic for length of post, hopefully a little more consistent over time.

Other resources on sparklines:

September 3rd, 2006

Personal Information Management and Windows Programs

One of the remarkable transformations in my life over the last decade is the extraordinary amount of time and attention i give to personal information management. Back in the Paper Age, i had books and file cabinets, along with stacks of paper to be read or acted on: that, along with my memory, was about it. In the Digital Age, though, both the scope and the intensity of activity in information management are immensely magnified. Now i have mountains of saved email recording projects, decisions, and conversations with others. I have to work hard at simply organizing my email folders (as i was painfully reminded recently, when work forced me to move from POP mail to IMAP), to keep the volume and complexity from becoming overwhelming. I do the same in organizing my hard drive. Even then, i find i need search appliances like Google Desktop just to search the things i myself have created: there’s simply too much to remember where it is, and i regularly discover things that apparently i created but i’d completely forgotten.

Several years ago, i got a PDA so i could keep my information resources (contacts, task lists, other bits of information) with me when i was away from my laptop. One of the most useful programs i have (which i selected based on the availability of a PocketPC version) is a password manager, FlexWallet 2005 (which i can enthusiastically endorse). At present, i have 188 entries, the vast majority being work or personal websites that require password access (and that omits a number of inconsequential ones that i just keep in email because i hardly ever use them).

We’ve all used bookmarks ever since there were browsers: these days, though, i tend to track websites of interest through del.icio.us (you can view my public items here), since tagging is easier than top-down categorization, and you get a lot of benefit from the digital commons of what others have tagged. I have a lot of information on my Amazon wishlist (two of them, actually), and now i have to struggle against the fragmentation of multiple sites that want to maintain my information. Increasingly, i use web-like mechanisms (for example, several wikis) for storing local information, as an alternative to file-and-folder organization.
One consequence of all this complexity is that i deliberately farm more and more information out to my prosthetic information devices. I simply don’t try to remember dates, phone numbers, passwords, or anything else: that’s what these systems are for. Granted, some of this is simply because i tend to obsess about capture and organization: lots of other folks just let it go. But we really are approaching a future where the costs of storage are so low that you can just store everything. The associated problems are, how do i find it, and increasingly, what do i pay attention to? (not far behind is, how do i keep the management task alone from becoming an end in itself?)

(image: Windows Programs By Function)

In this spirit, i decided to try an experiment with the Windows Start Menu of my new laptop. Since i had the previous one for three years, and my laptop has become the nexus of information management, i had a lot of programs installed: critical ones from work that i use every week, but also exploratory ones, and a host of personal applications as well. Of course, under Windows, every application wants to be at the top level of your list of programs: but after a while, that leads to a menu so long it spills over into two or three columns. Worse, however, is that you have to find a program by knowing what it’s called: wouldn’t it be better to instead organize by what they do?

The screendump at left shows what i came up with: a set of about a dozen top-level task categories, with an occasional sub-category, and then programs arranged underneath. So the Edit category takes me to XMLSpy (an XML editor), several web editors, and a sub-category of graphics editors. Browse includes browsers like Firefox, but also a WordNet information browser. Communicate includes FTP, PuTTY, VNC, etc. Copy is an interesting category: it includes Sonic (for burning DVDs), and ReaderWare (you could argue this is a database application instead). Manage is a bit of a catch-all, with sub-categories for managing local services (Apache, wireless utilities, backup, etc.), local hardware (keyboard and mouse), and local data (Google Desktop, MySQL, WinZip, etc.).
Of course, this isn’t completely neat and tidy. Though you can put things in multiple categories (they’re just shortcuts after all), some are a little difficult to categorize at all: for example, what’s the functional category for Cygwin? (i put it under Program) And Browse doesn’t really capture all the things i do with Firefox (but i get it off the frequently-used programs anyway). But it helps reinforce the notion of “what am i doing now”, and (perhaps implicitly) why?

September 2nd, 2006

Moving from Radio Userland to WordPress

One of the many headaches resulting from a recent change of laptops was a semi-forced abandonment of Radio Userland as my blogging platform. I’d been considering the move for some time (Scobleizer’s experience helped): while i owe a debt of gratitude to Radio as my introduction to blogging three and a half years ago, the problems had been mounting. Radio is cool because it bundles together a bunch of different features into a desktop application: you get a server, a scripting language, an RSS reader, and lots more. But it’s quirky (probably for the same reason), it was always a major effort to change or fix something, and the software’s looking increasingly, well, shabby compared with other alternatives. Back when i first started blogging, i spent an inordinate amount of time (some of it enjoyable) tweaking my site’s appearance, incorporating new features, etc. Now i’ve got plenty of other ways to waste time, and i just want to write.
Since i had let my subscription lapse, moving to a new laptop meant either pay up or say goodbye. Tacking this change on top of several others (more about that in a later post) made it especially painful, which explains three solid weeks of blogging silence.

Deciding where to go next was almost a foregone conclusion: WordPress was one of several client-side installations available on my hosting service, and looking at this very helpful comparison chart made it pretty clear. The main selling points for me were things like the right cost (free), rich but user-friendly template creation, painless syndication, categories, and a strong user community (i’m looking for low-hassle). Comments were a key thing that just didn’t work anymore with Userland: they were slow to access, i couldn’t get notifications (so people would leave comments i’d never know about), and comment spam was winning hands-down: i’m hoping to do better with WP. The ability to import from other platforms turned out to be a major plus too, though it wasn’t one of my criteria.

I’m plagued by the Geek’s Law of Difficulty in Hindsight: objects in the rear view mirror appear much harder than they did when you were approaching them. Basic installation was a piece of cake, but i guess i’m pretty picky about the cosmetics (something i spent waay too much time on with my Radio blog). I started out with the Tiga theme, and spent a little time trying the built-in controls for modifying its appearance (the Tigarator :-) ), but eventually broke down and just modified the stylesheet directly. However, at least there was a stylesheet (i had to create my own for my Userland blog), and it’s very clearly organized. I still have a few tweaks to make to the presentation, but i’m mostly there.

Then there was importing old posts, something i hadn’t even considered (i figured i’d just leave them where they were, in their former Userland glory). Here i found two guides that were useful, though they didn’t cover it all. This wiki entry laid out one set of details for importing from Radio’s RSS feeds, but it started out with several requirements i didn’t meet, and got more complex from there. Instead, i took this road, which involved exporting from Radio to a MovableType format, then importing from that into WordPress. The big advantage here is that the export process produces one big text file in a structured format: so you can do search-and-replace, or even scripting on it, and incrementally build toward the right outcome.

There were several gotchas:

  • When i previously moved my Radio blog off their hosting onto another host, i must have missed updating a few links: these were easily replaced with a text editor
  • Way too many posts wound up without titles (i suspect a bug in the Exporter), and if i had three posts in a row without titles, the last preceding title got used for all four. So i had to go through and add titles by hand (but since i had a text file, i could do so directly, despite the tedium).
  • For some reason newlines in the MovableType HTML got turned into extra blank lines in WordPress (as though they had been a <br />): that seems like a clear mistake in interpreting the data on import. So i had to remove all the newlines in sequences like “>[newline]<”.
  • Radio macros were quite a bit trickier, and i had a number of them that i used for ESV references, pericope references, and so forth. As an example, i had a macro called pericopeRef that took an ID argument, and generated the URL for the relevant pericope in the Composite Gospel: <%pericopeRef("020")%> would generate http://www.semanticbible.com/cgi/2004/11/Pericope.020.xml. Unfortunately, the Exporter doesn’t expand these macros, and since the arguments can vary, there isn’t a simple way to do search-and-replace. I wound up writing a little Perl script to do macro expansion: i’d be glad to share it if you’re interested.
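The last two fixes lend themselves to a short script over the exported text file. What follows is a hedged sketch in Python, not the Perl script mentioned above: it handles only the pericopeRef macro, assumes the URL pattern from the single example holds for every ID, and uses function names invented for illustration.

```python
import re

# URL pattern copied from the post's one example; assumed to hold for all IDs.
PERICOPE_URL = "http://www.semanticbible.com/cgi/2004/11/Pericope.{id}.xml"

# Matches <%pericopeRef("020")%>, tolerating optional whitespace.
MACRO = re.compile(r'<%\s*pericopeRef\(\s*"([^"]+)"\s*\)\s*%>')

# Matches one or more newlines sitting directly between '>' and '<'.
TAG_GAP = re.compile(r">(?:\r?\n)+<")


def expand_pericope_refs(text):
    """Replace each pericopeRef macro call with the URL it would generate."""
    return MACRO.sub(lambda m: PERICOPE_URL.format(id=m.group(1)), text)


def join_tag_newlines(text):
    """Drop newlines between adjacent tags; other newlines are left alone."""
    return TAG_GAP.sub("><", text)


def clean_export(text):
    """Apply both fixes to the contents of the exported file."""
    return join_tag_newlines(expand_pericope_refs(text))


if __name__ == "__main__":
    sample = '<p>See <a href="<%pericopeRef("020")%>">this pericope</a></p>\n<p>Next</p>'
    print(clean_export(sample))
```

Because the argument is captured by the regular expression, varying IDs are handled in a single pass, which is what plain search-and-replace can’t do.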

One thing that didn’t make it, though, was the comments: as the previous post noted, there isn’t any easy way to bring them over. I’m sorry to not bring them along, but since they weren’t working all that well, and there’s a big pile of spam there, it seemed an acceptable loss. You can still find them on legacy posts by mapping from the WordPress format to the old format. For example, a recent post on xpound.org (which drew a comment by Josh, who runs the site) had the following URLs:

So if you drop the title of the post and preceding slash off the WP version, and tack on “.html”, you’ll find the old version, which links in the comment. (note you have to start from the permalink for the WP version)
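The mapping rule can be written as a tiny function. This is only a sketch of the rule as worded above: it assumes the permalink ends with the post’s title as its last path segment, and the URL in the usage example is a made-up placeholder, not one of the actual post URLs.

```python
def legacy_url(wp_permalink):
    """Map a WordPress permalink to the old-format URL.

    Drops the final path segment (the post title) and the slash before it,
    then appends ".html". Assumes the title is the last path segment.
    """
    trimmed = wp_permalink.rstrip("/")           # ignore any trailing slash
    base, _, _title = trimmed.rpartition("/")    # split off the title segment
    return base + ".html"


if __name__ == "__main__":
    # Hypothetical permalink, for illustration only.
    print(legacy_url("http://example.com/2006/08/15/some-post-title/"))
```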
Also, posts with multiple categories got turned into category composites: instead of “reading” and “character”, these got a category of “reading; character”. Fortunately there were only a few of these, and i just went back in WordPress and fixed them by hand.
Lots more work than i anticipated, but now that i’m (mostly) moved into WordPress citizenship, i’m hoping it will all be worth it!

September 1st, 2006

Google Books Provides Out-of-copyright Works

TechCrunch had the news recently that Google now allows downloads of out-of-copyright books. While Project Gutenberg started down this road, Google is digitizing books at a much larger scale, and their holdings are different in many respects. For example, here’s the table of contents for “A Harmony of the Gospels for Historical Study”, by William Arnold Stevens and Ernest De Witt Burton, published in 1893.

Of course, works from the last 70 or so years are still protected by copyright: if you think only those recent works are worth reading, this won’t matter to you.

September 1st, 2006

Sortable Pericope Index

Will has put up a nice version of the pericope index for the Composite Gospel that’s dynamically sortable by book. This is something that’s been needed since the beginning: the two fundamental orderings are by pericope sequence (the original), and by individual author (Will’s version). I got bogged down trying to make it work in XSLT and never got it done: thanks to Will for contributing!

I’d put this up on SemanticBible in place of the current one, except that i’m in the process of trying to re-arrange the Composite Gospel presentation altogether. In particular, i’ve been trying to come up with a more visual navigation metaphor, which has proven surprisingly tough: more on this in a later post.
