Wednesday, December 11, 2013

A good trick? (and is it legit?)

Every couple of years I team-teach the lit half of a "Connected" pair of classes with my friend, the mathematician Bill Goldbloom Bloch. "The Edge of Reason" links my SciFi class with his Math Thought class over the course of an entire year, with us alternating teaching days and each prof sitting in on and participating in all of the other's classes. It's incredibly fun, and I learn a lot about math and, maybe more importantly, about how mathematicians think. Bill says that most mathematicians have 1 or 2 "good tricks," ways of conceptualizing the world or handling problems, that allow them to make multiple discoveries. Going beyond mathematicians: physicist Richard Feynman had his "integrate over all paths" trick, Einstein had his visualizations, etc. My own "good tricks" have included "read the whole thing" (you'd be amazed by how few scholars do) and "push the metaphor until it breaks."

Now I think I have a new one. And it's mathematical.

This summer our research group was working on the problem of thorn / eth distribution. We were having trouble visualizing the data. I don't know why, but suddenly into my mind popped the notion of a rolling average, something I think I'd learned way back in high school and which had shown up when I was being creative with budgets to avoid laying people off during the financial crisis: it turns out that the amount of money you are allowed to draw from an endowment's revenue stream is based on a rolling average of the returns in several previous quarters. This saved me during the crash, as we had a little more money right at the beginning, since the previous quarters were propping up the average, so we could at least give visiting and part-time faculty a year or two to try to find something else instead of just dropping them into a terrible economy (my sole accomplishment as department chair was that I didn't lay anyone off or fail to renew a contract).

So I started calculating the rolling ratio of θ (total number of þ divided by total number of þ plus total number of ð) through a text: choose a "window" of words or letters, add up all the thorns and eths in that window, calculate θ, and then move the window one unit to the right and re-calculate. The plots of the rolling ratios turn out to be very interesting. I'm just finishing up a paper now on what they might tell us about a work's textual history.
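The procedure just described can be sketched in a few lines of Python. This is only a minimal illustration, not the research group's actual code: the function name `rolling_theta`, the choice to slide the window one character at a time, and the decision to report `None` where a window contains no þ or ð at all are all assumptions made for the example.

```python
def rolling_theta(text, window):
    """Return theta = thorns / (thorns + eths) for every window position.

    The window slides one character at a time (an assumption; a window
    measured in words would work the same way). A window that contains
    neither thorn nor eth yields None, because theta is undefined there.
    """
    chars = text.lower()  # fold capital thorn and eth to lower case
    if window <= 0 or window > len(chars):
        raise ValueError("window must be between 1 and the text length")

    is_thorn = [1 if c == "þ" else 0 for c in chars]
    is_eth = [1 if c == "ð" else 0 for c in chars]

    # Counts for the first window.
    thorns = sum(is_thorn[:window])
    eths = sum(is_eth[:window])

    result = []
    for start in range(len(chars) - window + 1):
        if start > 0:
            # Slide right by one: add the entering character,
            # subtract the one that just left on the left edge.
            thorns += is_thorn[start + window - 1] - is_thorn[start - 1]
            eths += is_eth[start + window - 1] - is_eth[start - 1]
        total = thorns + eths
        result.append(thorns / total if total else None)
    return result
```

Calling `rolling_theta(text, 500)` on a transcription would give one θ value per window position, ready to plot against position in the text.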

But I have been worried--following a chance remark by Janet Bately at ISAS Dublin--that all we were detecting with θ was the frequency of first-, second- or third-person plural present tense or plural imperative verbs. These forms end with an interdental, and there certainly seems to be a correlation between terminal interdentals and scribal use of ð (most famously by the B-scribe of Beowulf, but elsewhere as well). I wanted to know if θ was just a complicated proxy measurement for portions of the poem in the plural present tense or the imperative.

So we developed another measure, τ, which is the ratio of terminal interdentals (þ and ð) to the total number of interdentals in a passage. We calculated τ as a rolling ratio as well, and then compared the plots of τ and θ.
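A rough Python sketch of τ, again under stated assumptions rather than as a record of how we actually computed it: words are split on whitespace, ASCII punctuation is stripped from word edges, the window is measured in words, and a window with no interdentals gives `None`. The names `tau` and `rolling_tau` are made up for the illustration.

```python
import string

INTERDENTALS = {"þ", "ð"}


def tau(words):
    """Terminal interdentals divided by all interdentals in a list of words.

    Returns None when the words contain no thorn or eth at all.
    """
    lowered = [w.lower() for w in words if w]
    total = sum(1 for w in lowered for c in w if c in INTERDENTALS)
    terminal = sum(1 for w in lowered if w[-1] in INTERDENTALS)
    return terminal / total if total else None


def rolling_tau(text, window):
    """tau for each window of `window` consecutive words, sliding one word at a time."""
    # Strip punctuation from word edges so "cweð," still counts as ending in eth.
    words = [w.strip(string.punctuation) for w in text.split()]
    words = [w for w in words if w]
    return [tau(words[i:i + window]) for i in range(len(words) - window + 1)]
```

To compare τ with θ position by position, both would need to be calculated over the same windows (same unit, same length).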

Sometimes these plots appear to be negatively correlated with each other: when τ goes down, θ increases, but other times, not so much. And just looking at the graphs wasn't entirely satisfactory. So I calculated Pearson correlation coefficients between τ and θ. It turns out that these are pretty ambiguous when applied to whole texts, generally being on the order of .3 (1.0 would be perfect correlation and 0 would be no correlation at all). That wasn't entirely helpful: with an r of .3, tense and number could be contributing to θ, but other things (textual history) could be as well.

Then last night I was staring in frustration at the τ and θ graphs for the Old English Genesis, and it hit me: there was a visible correlation between τ and θ in Genesis B, but not in Genesis A. I quickly calculated the Pearson correlation coefficient for each poem and indeed, Genesis B is highly correlated, with an r of .69, while Genesis A is only weakly correlated.

And here's where both the "good trick" and my question of legitimacy come in. I realized that I could do the rolling window trick with the correlation coefficient. Calculate τ and θ, then choose a window length and calculate the correlation coefficient for that window. Then shift to the right and recalculate r. Plot the whole thing.

Except that it was hard to read the plot, since you ended up with both positive and negative correlations (negative correlation just means that when one variable goes up, the other goes down; it's just as much a correlation as a positive one). So I had the idea of taking the absolute value of r and plotting that. When you do so, you get very interesting results. Genesis B, for example, jumps right out of the Genesis plot. So too does the canticle-sourced material in Daniel and the section of Christ III that's based on the sermon of Caesarius of Arles.
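For readers who want to try the windowed correlation themselves, here is a minimal pure-Python sketch. It assumes the τ and θ series are already aligned lists of numbers with no undefined values, and the names `pearson_r` and `rolling_abs_r` are invented for the example. A window in which either series is constant has no defined r, so the sketch reports `None` there.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists.

    Returns None if either list is constant (r is undefined in that case).
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length, at least 2 long")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    return sxy / math.sqrt(sxx * syy)


def rolling_abs_r(tau_values, theta_values, window):
    """|r| between the two series in each window, sliding one step at a time."""
    if len(tau_values) != len(theta_values):
        raise ValueError("series must be the same length")
    out = []
    for i in range(len(tau_values) - window + 1):
        r = pearson_r(tau_values[i:i + window], theta_values[i:i + window])
        out.append(None if r is None else abs(r))
    return out
```

Taking the absolute value throws away the sign, so a strongly negative stretch and a strongly positive stretch look the same in the plot; keeping the signed values alongside is cheap if the direction ever matters.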

My tentative conclusion: because not all scribes consistently followed the "terminal interdental to be represented by ð" rule, the correlation between τ and θ is actually useful data. Instead of simply invalidating θ, the correlation--and its absence--tells you something about the copying history of the text. My hunch is that it's the later scribes who produce segments with closely correlated τ and θ, so when we don't see the correlation, we can hypothesize that we're looking at a text that was written and copied earlier, so that the inertia of the earlier forms is influencing the final copy.

But my worry is that a rolling Pearson's correlation coefficient is somehow statistically or mathematically illegitimate. You've got two rolling ratios (τ and θ), each of which over-samples many of the same data points (because the same point is going to influence multiple windows) and then you're doing the same kind of rolling comparison with over-sampling with the relatively complex Pearson formula.  I'm worried that my lack of mathematical and statistical sophistication has led me to miss something that should cancel out something else. Unfortunately, it is finals week, so I can't meet with my friend and co-author the statistician for a while at least, so I just have to live with being both excited at a potential discovery and worried that at any moment the intellectual floor is going to collapse out from under it.
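One informal way to probe this worry before the statistician is free is a simulation: build two series that are independent by construction, smooth each with a rolling mean, and see how large the windowed |r| gets from the overlap alone. The sketch below is only that, a sanity check under assumed settings (coin-flip data, the hypothetical function name `null_mean_abs_r`, windows with a constant series counted as 0), not a substitute for a proper significance test.

```python
import math
import random


def rolling_mean(values, window):
    """Plain rolling mean, sliding one step at a time."""
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


def abs_r(xs, ys):
    """|Pearson r|; a window where either series is constant counts as 0.0
    (an arbitrary choice made for this sketch)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy / math.sqrt(sxx * syy))


def null_mean_abs_r(length, smooth_window, corr_window, trials=20, seed=0):
    """Average windowed |r| between two INDEPENDENT smoothed random series.

    Any value well above 0 here comes from the rolling windows themselves,
    since the underlying coin flips are unrelated by construction.
    """
    if length - smooth_window + 1 < corr_window:
        raise ValueError("series too short for these window sizes")
    rng = random.Random(seed)
    trial_means = []
    for _ in range(trials):
        a = rolling_mean([rng.randint(0, 1) for _ in range(length)], smooth_window)
        b = rolling_mean([rng.randint(0, 1) for _ in range(length)], smooth_window)
        rs = [abs_r(a[i:i + corr_window], b[i:i + corr_window])
              for i in range(len(a) - corr_window + 1)]
        trial_means.append(sum(rs) / len(rs))
    return sum(trial_means) / len(trial_means)
```

If the baseline that comes out of `null_mean_abs_r` with realistic window sizes is close to the |r| values seen in a real text, the peaks in that text would deserve more suspicion; if the real peaks sit far above it, that is at least mildly reassuring.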

Sunday, November 17, 2013


For the overlapping parts, regular correlation coefficient is .349. Spearman's Rho is .357.  But all that changes a lot if you correct for that one place in the middle where the top graph goes up and the bottom one goes down. I think this happens because for most of the overlapping section we are comparing apples with apples, but just at that point, an orange or two got dropped in.

The challenge is to decide when you have reached that magical point where you are manipulating your data rather than evaluating it.

And that's what I'll be spending tomorrow morning doing.

Friday, November 08, 2013

A Little Formula

Turns out that you can find out stuff about Old English texts with just a simple formula:

For any text of length n
with a sub-segment of length w < n

where k is the first term in w
þ is the total number of thorns in the segment;
ð is the total number of eths in the segment;
and w+k ≤ n.
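The formula itself appears to have been an image that did not survive in this copy of the post. Going by the definition of θ given in the later "good trick" post (total thorns divided by total thorns plus total eths), it was presumably something along these lines; the subscripted notation is an assumption, not the original typesetting:

```latex
\theta_{k,w} \;=\; \frac{\text{þ}_{k,w}}{\text{þ}_{k,w} + \text{ð}_{k,w}},
\qquad w < n, \quad w + k \le n
```

Here the subscript k,w just marks that the counts are taken over the segment of length w that begins at term k.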

The real tricks are figuring out if what you're detecting is significant or just a product of stochastic variation and, if it is statistically significant, whether or not it is just an epiphenomenon of a less interesting process.

As Richard Feynman, one of my intellectual heroes, once said “The first principle is that you must not fool yourself and you are the easiest person to fool.”

Which is why I've been learning to debug programs in Python and re-learning Stats II from 20+ years ago.

Unfortunately, at least one of the more striking findings is looking like it's just an epiphenomenon. But the good news is that the other discoveries seem like they are pretty robust.

Sunday, September 29, 2013

Father Brag

My son's first touchdown. 

Thursday, September 26, 2013

The Killer-Barney Effect

In some of the Icelandic sagas in which he appears, Bjarni Brodd-Helgason is a generally peaceful man, even though he got the nickname Víga-Bjarni (Killer-Barney) when he had to kill some of his relatives at Bodvarsdalr. In Vápnfirðinga Saga he is reluctant to take revenge; he is eager to reconcile in Voðu-Brands þáttr; and he’s clever and honorable in Þorsteins þáttr stangarhoggs. So the nickname is somewhat at odds with the character, especially in these sagas that come from the East, where Bjarni was from. The disjunction between name and personality seems to be the point, especially in Thorstein the Staff-struck.

Víga-Bjarni’s name, however, appears to have overpowered his character in later sagas from the West, where people either had not known Bjarni Brodd-Helgason, or the transmitted knowledge of his personality was forgotten. In this material, Killer-Barney is now a blood-loving, death-dealing maniac.

I hate you, 
you hate me, 
I had to slaughter members of my family....

We can call this phenomenon, in which a traditional referent, like Víga-Bjarni's name, loses the link with its original extra-textual and contextual meaning and instead develops as part of a new, intra-textual tradition, The Killer-Barney Effect.

Monday, September 23, 2013

How to Think: The Liberal Arts and Their Enduring Value

I have a new lecture course out from Recorded Books' Modern Scholar series:
How to Think: The Liberal Arts and Their Enduring Value

It's an 8-lecture course:

1. The Liberal Arts: Where did they come from?
2. Separating Science
3. Tools to Rule
4. Can the Liberal Arts Make you a Better Person?
5. The Best Reasons: Solving Complex Problems, Preserving and Transmitting Culture
6. Beowulf: A Case Study of the Richness of the Liberal Arts Tradition
7. What's Wrong with the Liberal Arts? (And How to Fix it).
8. A Defense and Celebration of the Liberal Arts.

The CD set is available from Amazon here at this link.
The direct link to all of my courses on Recorded Books is here.

John Alexander, the founder of The Modern Scholar and my producer for all 12 courses, has formed Scholarly Sojourns: beautiful, flawless educational tours throughout the world. In Summer 2014 I am leading tours to Anglo-Saxon Britain, Iceland and Tolkien's England. We could meet up!

Tuesday, September 03, 2013

Wisdom from Neal Stephenson

Some people try to communicate "out of a conviction that the world must be amenable to human understanding, and that if you can understand something, you can explain it in words: fancy words if that helps, plain words if possible. But in any case you can reach out to other minds through the medium of words." And by doing this, you are saying "here is something cool that I want to share with you for no other reason than making a spark jump between minds."

From the Foreword to David Foster Wallace's Everything and More.

Wednesday, August 21, 2013

The book is out!

The new book is out!

Tradition and Influence in Anglo-Saxon Literature: An Evolutionary, Cognitivist Approach.

This book was more years in the making than I like to think about.  I would never have finished it if it weren't for the unexpected help of Jack Zipes, who, in a kind of Tolkienian eucatastrophe, swooped in right when things were most dire.

The cover was designed by Wheaton students Leah Smith and Amira Pualwan. From it people of a particular age may be able to guess at what album I most overplayed in the summer of 1982.

Sunday, July 21, 2013

I'm very happy to announce that we'll be hosting the next Mythcon at Wheaton College.

Fantasy literature does not fit comfortably into any scheme. Both old and new, traditional and innovative, popular and elite, mainstream and esoteric, escapist and engaged, high-tech and anti-technology, fantasy defies definitions and transcends categories, dramatizing the incompleteness of our understanding of our own imaginations. At Mythcon 45 we will discuss the place of fantasy in our culture, our institutions, and our hearts. 

Scholar Guest of Honor:
Richard West

Author Guest of Honor:
To be announced

keep an eye on

Monday, July 08, 2013

Intellectual Environments

We've been having another successful season of research in the Lexomics Lab this summer: discoveries made, complex things figured out, methods invented, tools created.  And most of all, it has, once again, been really, really fun.  Much more fun, and much more intellectually exciting, than anything I ever did in grad school (or in undergrad or as a professor, for that matter).

I've been wondering why.

Probably the simplest reason is that English students don't naturally work in a lab environment. Our ideal seems to be lots of solitary time with books and a computer and then a quick chance to show off what we've written in a seminar or at a conference.  It's not that we're not social, but that our socialization comes after the fact, not during the research.  The daily give and take, the continuously social nature of the lab, isn't really part of our experience. So we get all the way through our studies never knowing that we're missing out on the most intellectual fun you can have. 

We were lucky to have the use of a single big room, with one wall of white-board, desks and monitors around the other walls, and moveable smaller tables, which we configured as a "library table" and a work table around which we put our laptops. Since we all worked together, we talked a lot, working through ideas out loud, overhearing the challenges various parts of the project faced (i.e., the Beowulf-focused people hearing the discussions of the Cynewulf-focused people and the Shakespeare-focused people and the Old Norse-focused people and the software people, etc.).

So it was all shared and cross-pollinated and social and fun.

But also there was another factor in play, I think. Because my students are undergraduates who range from freshmen to seniors, and in major from English to Computer Science to Chemistry to Philosophy, there was always more than one person in the audience who didn't know something. And so it was always ok to ask for more explanation.

We completely avoided that most damaging of statements a teacher or mentor can make: "What! You don't already know that!"

It's a damaging statement, because it encourages the student, in the future, to bluff, to hide ignorance or cover it up.  And when you do that, you never end up learning what you need to learn.

Intellectual bluff is such an important part of grad school, and the job market, and the tenure game, that it's absolutely ingrained throughout academic culture. And it's debilitating, because people aren't willing to ask questions that would expose their ignorance, and so they don't get answers or explanations that could enable them to solve particular problems and make real contributions.

But in a room with as many computer-scientists as medievalists, everybody is ignorant about lots of stuff and so, over the course of the summer, we've gotten to the point where we can confess that ignorance, get an explanation, and then learn more.  It's exhilarating.

Quick example: I was trying to derive some kind of measure of comparative vocabulary homogeneity for different segmentations of a text. I had an elaborate formula that I thought made sense, but I couldn't get numbers out of it that were consistent with what we thought we knew. Finally, my student research partner (the chemistry major) just plotted a bunch of data on a graph, and, uh-oh, it made a straight, diagonal line. I didn't have enough variables in my equation, so x+y always equalled one. So I was being a math doofus. But by doing so, in front of all my students and some colleagues and the computer science students, who were half listening in, I modeled, I think, the way to approach such a problem: take a stab at it, fail, take another stab, fail better, get some help (I dragged a physics professor in from the hallway), take another stab, fail even better, and maybe finally get somewhere.

And hopefully what the students learned--and their amazing contributions suggest that they did--was that being a scholar is about seeking the answers, not already having them (or pretending to).

Monday, July 01, 2013

On Discipline

If the experiment contradicts the theory, then the theory is wrong.

                                                  —Richard Feynman (paraphrase)  

It can be so hard to accept this rule sometimes, but just keep telling yourself...

Progress comes when the experiment contradicts the theory.

Saturday, June 22, 2013

Saturday, June 01, 2013

More Stages of Textual Transmission

Very General Hypothesis: 
There are more stages of transmission in Anglo-Saxon manuscripts (i.e., between original source and the manuscript we actually possess) than are generally recognized.

Anecdotal Evidence: 
Nearly every time I start to work with a text that is not among those most familiar to me, I discover that there is evidence for multiple stages of transmission.

Paul Remley shows that there could be as many as eight stages between the biblical book of Daniel, two Latin canticles (the Oratio Azariae and the Canticum Trium Puerorum), and the manuscript texts of Daniel and Azarias.

I'm pretty certain that the OE translation of Chrodegang's Rule was done in the earlier 10th century, between 940 and 950, and then copied again between 984 and 1006 (when the names from the Old Minster at Winchester were added), before being copied in the 11th century to create the manuscript version we now have.

The Lexomics Research Group has come across some evidence that strongly suggests more than one stage of textual transmission behind a few Exeter Book poems, including Christ III, Guthlac A, and the Descent into Hell.

Beowulf is consistent with being at least a copy of a copy: Lapidge's argument for the exemplar being in a different hand, the problems the scribes have with proper names, and the apparent haplography in a few lines all point that way, and now some work by the Lexomics group strongly suggests a rather complex textual history.

I think that many scholars (I certainly was one of them) have unconsciously adopted the idea that the earlier textual history of Anglo-Saxon texts was like their late textual history: texts sat un-read and un-revised in a monastic library for centuries, only to be copied over in the 10th century, to then be neglected after the Conquest and sit un-read until, after surviving the cataclysms of the Dissolution, the English civil war and the Cotton Fire, they are finally re-discovered in the 18th century.

Also, because our ability to locate texts in time and space is very limited (our resolution is extremely coarse grained), we unconsciously adopt the position that texts might be handled, copied or read once in a century or so.

Tentative Conclusions:
Evidence scattered throughout the corpus suggests instead that there was a rich and complex history of textual transmission, modification, editing and augmentation throughout the Anglo-Saxon period. If this is so, then texts may have been very much more "alive" and in use than we have previously thought. They may have been less of a quiet archive than a regularly consulted set of intellectual resources (cf. the heavily glossed texts that are the product of Athelwold's and Dunstan's "Aldhelm Seminar" at Glastonbury).

The texts we have may not represent only a single temporal slice of a culture but instead be multiple re-workings of inherited material, and our authors may have been widely read in their vernacular literature as well as in the Latin tradition.

Friday, March 29, 2013

Mechthild Gretsch, RIP

One of my prized academic possessions
is a small yellow post-it note that I received in 2005.

With great trepidation I had sent Prof. Mechthild Gretsch an offprint of my article, "Re-Dating the Old English Translation of the Enlarged Rule of Chrodegang: The Evidence of the Prose Style," from JEGP. I had originally submitted this article to another journal, which had asked Mechthild's husband, Prof. Helmut Gneuss, to be the outside reviewer. Helmut sent his review directly to me as well as to the editors: typed, seemingly on a manual typewriter, on a narrow sheet of paper. My German isn't great, but I knew enough that when I read the words "grosse mangel" in the second line, I broke out in a cold sweat. Needless to say, the journal didn't publish the article, but by addressing the helpful criticisms of Helmut and another reviewer I was able to fix the essay and get it published in JEGP.

But it was not because of her connection with Helmut that I sent my article (which was my best technical work up to that point) to Mechthild, but because she was the living Anglo-Saxonist whose work I most admired.  But she didn't know that, I had never met her personally, and she had a fearsome reputation as a scholar who did not hesitate to point out errors and lacks, particularly if they were grosse. It had taken some willpower, and maybe a drink or two, to put the offprint in an envelope and mail it off.

You see, though I never told her this, my intellectual debt to Mechthild is very great.

In 2001 I had been stuck. The success of Beowulf and the Critics was combining with the difficulty I was having in putting together my first monograph on Anglo-Saxon to pull me away from the field. Kalamazoo that year had been a big, depressing disappointment. What other people seemed to find exciting did nothing for me, and the terrible job market had caused a number of my friends to leave academia altogether. The intellectual spark had gone out.  Anglo-Saxon studies was following a path that led only to insignificant but all-consuming quibbling. The field was entangled in miserable thickets of personal and institutional politics, and those who--through the positions they occupied, if not the work they were no longer doing--should have led were instead dissipating the hard-won intellectual inheritance of our titanic forebears (not on debauchery, more's the pity, but on orthodoxy, groveling, scheming). It was just a radical change from my feelings of immense excitement at ISAS '95 at Stanford or '97 at Palermo or '99 at Notre Dame. I wanted out, to be away from this whole field that I had loved so much.

I clearly remember sitting on the floor of O'Hare airport at 6:30 a.m. on Sunday morning exhausted (having gone to bed at 3:00 and gotten up at 4:30), bored, and with a five-hour wait ahead of me, thinking that this was going to be my last Kalamazoo. I would focus on Tolkien, get my tenure in a couple years, and spend my energies on my 1-year-old daughter.

Then, to pass the time, I opened a book I had bought: Mechthild Gretsch's The Intellectual Foundations of the English Benedictine Reform. I wasn't expecting much. The Cambridge UP Anglo-Saxon England books are always well-done, by smart (albeit connected) people with good training. They're usually correct in whatever it is that they say, the kinds of books that smart, connected people create to gather up and transfer knowledge.  But they aren't exciting. Or at least one never had been until I started reading Gretsch.

Then my whole intellectual world changed.

Here, finally, was what I had been looking for without knowing it: an approach to Anglo-Saxon texts that was scientifically rigorous, linked always to specific knowledge taken from texts rather than guesses or abstractions. Here were arguments that built upon one another with the same kind of logical beauty as those in math or physics. But what these arguments did was absolute romance: they brought the dead to life.

And Gretsch did this with Psalter glosses.

Psalter glosses! Possibly the most boring set of "texts" in the history of earth. But by sorting through layers of such glosses, categorizing them, understanding the ways that the medieval authors had tried to understand their Latin sources, the mistakes they made, even, perhaps, the puns that amused them, Gretsch had reconstructed an "Aldhelm Seminar" at Glastonbury in the tenth century. Æthelwold and Dunstan had gathered their followers together and sweated out the difficulties of Aldhelm's "dense wood of Latin." And then they had burst out to change English culture forever.

By the time my flight arrived, I had recaptured my love for Anglo-Saxon studies. I read Intellectual Foundations the way Richard Feynman said to read a physics textbook: read until you reach a point where you don't understand, then go back to the beginning and re-read, hopefully getting further the second time. When I finally understood each step of the argument, I was ready (and able) to write my own book.

In the process I stumbled upon stylistic patterns in the Old English translation of the Rule of Chrodegang that allowed me to re-date that text and put it in its correct Benedictine Reform context. I tried to make that argument as logically rigorous as Mechthild's argument in Intellectual Foundations, and I guess, thanks to the influence of that model, I succeeded.

My post-it note from Mechthild reads, in part:

Dear Michael,
Thanks for your offprint.
I think you are right.

It is in part due to Mechthild's scholarly example and her achievement that "I think you are right" is a greater compliment than any effusive praise could be.

Thank you, Mechthild, for your inspirational scholarship, for your uncompromising intellectual toughness and for your blend of logical rigor and creativity. Thank you for being so kind and encouraging. I hope you are resting peacefully and that you now know the authors of the glosses and their purposes, that you've had a chat with Æthelwold and Dunstan and Wulfstan and Ælfric and that when they tell you, "I think you are right" it gives your spirit as much joy as your words gave mine.

Tuesday, March 26, 2013

Feynman on "Pretentious Science"

The "work" is always

(1) completely un-understandable,

(2) vague and indefinite,

(3) something correct that is obvious and self-evident, worked out by a long and difficult analysis, and presented as an important discovery, or

(4) a claim based on the stupidity of the author that some obvious and correct fact, accepted and checked for years is, in fact, false (these are the worst: no argument will convince the idiot),

(5) an attempt to do something, probably impossible, but certainly of no utility, which, it is finally revealed at the end, fails or

(6) just plain wrong.

There is a great deal of "activity in the field" these days, but this "activity" is mainly in showing that the previous "activity" of someone else resulted in an error or in nothing useful or in something promising.