Sunday, March 1, 2009

Tournament of Books Bracketology #2

And we're back with more super-geeky, highly untrustworthy analysis of The Morning News's Tournament of Books, 2009!

Previously on Bracketology, we've covered how seeding tends to play out, whether the gender of the author has an effect on the outcome of a given matchup, and whether any publisher has been particularly successful in previous tournaments. (Click that link for full coverage.)

Today we'll look at the effects of page count on judges' decisions—do judges tend to get bored with longer books and penalize them? Or do they tend to treat shorter books as more frivolous and less deserving of advancing to the next round? Let's roll the tape.

First, let me preface this with one note: the 1120-page Pynchon wrist-breaker Against the Day (which is 320 pages longer than the next longest book) skews what is a small data set, so keep that in mind when reading the averages (the medians and the matchup-by-matchup records should be less sensitive to that one outlier):
  • In the first round, the average page count for winning books was about 390 pages; for losers, about 340. The median page counts are 321.5 pages for winners and 316.5 for losers. Obviously, not much of a difference here: even if we itemize by matchup, over all four years (2005-2008) the longer book won 17 times and the shorter book 15 times. Over the past two years, it's been even: 8-8.
  • The second round is a little more interesting when broken down by matchup: overall record of 10 longer wins, 6 shorter wins; past two years, 7 longer book wins, only 1 win by a shorter book. The winners in Round Two are on average about 440 pages long; the losers have an average of about 345. Medians are: winners, 365 pp; losers, 306.
  • In the semifinals, the shorter books lead a counter-offensive, winning 5 of 8 times overall, and 3 of 4 times in the past two years. However, the averages now get really screwy, as the Pynchon brick weighs more heavily among fewer books: winners in this round are on average 360 pages long, while losers are a silly average of 515 pages. Medians make a little more sense: 365 for winners, 379 for losing books.
  • The Zombie Round results are stunning when broken out by matchup: the longer book has never won. Who cares about means and medians (295 winners, 555 losers; 306.5, 448, respectively)—you can take this one to the bookie: Andrew Womack and Rosecrans Baldwin, who have always been the judges for the Zombie Round, evidently can't stand long books, so bet on the short horse.
  • Champions do tend to be the longer of the two books: 3 out of 4, in fact, have been longer, and by an average of about 80 pages, which is not insignificant (about an hour to an hour and a half of reading for most people, I would imagine). The medians do make this difference disappear: 322 pages for winners, 320.5 for losers. The averages are 355 pages for champions, 318 for runners-up.
  • The overall record for all matchups in all rounds over all four years is actually split completely evenly between longer and shorter books: 33 matchups have been won by the longer of the two books in the bracket, 33 have been won by the shorter.
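For a rough sense of whether any of these per-round records is distinguishable from a coin flip, here is a minimal sketch of an exact two-sided sign test. The per-round records are the ones quoted in the bullets above; the Zombie Round record is not stated directly, so the sketch derives it from the 33-33 overall total (an assumption that the totals cover exactly these five rounds). The names `records` and `sign_test_p` are made up for illustration.

```python
from math import comb

# (longer-book wins, shorter-book wins), as quoted in the bullets above.
records = {
    "First Round": (17, 15),
    "Second Round": (10, 6),
    "Semifinals": (3, 5),
    "Championship": (3, 1),
}
overall = (33, 33)

# Assumption: the Zombie Round record is whatever is left of the overall total.
records["Zombie Round"] = (
    overall[0] - sum(longer for longer, _ in records.values()),
    overall[1] - sum(shorter for _, shorter in records.values()),
)


def sign_test_p(longer, shorter):
    """Two-sided exact binomial p-value, assuming each matchup is a 50/50 coin flip."""
    n = longer + shorter
    k = min(longer, shorter)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


for name, (longer, shorter) in records.items():
    print(f"{name}: {longer}-{shorter}, p = {sign_test_p(longer, shorter):.3f}")
```

Under these assumptions the Zombie Round comes out at 0-6, with a p-value of about 0.031. That is the only round that looks unusual, and with five rounds tested on so few matchups it should be read as a curiosity rather than proof.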
We can take away a few things from these rather crude results, I think, especially if I refine things a bit. Below is a table with the average number of pages of all books involved in a given round, and the averages removing Against the Day:
Round        Average   Ave. w/o Pynchon
1st Round    364.45    352.46
2nd Round    391.72    368.23
Semifinals   436.94    391.40
Zombie       423.75    360.45
Champion     335.75    335.75
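The "without Pynchon" column can be reproduced from the plain averages alone, which is a handy sanity check. The sketch below assumes the number of books involved in each round over the four years (64, 32, 16, and 12); those counts are inferred from the bracket structure and the matchup totals, not stated in the table. The function name `mean_without` is hypothetical.

```python
PYNCHON_PAGES = 1120  # Against the Day

# round: (reported average, ASSUMED number of books involved over four years)
rounds = {
    "1st Round": (364.45, 64),
    "2nd Round": (391.72, 32),
    "Semifinals": (436.94, 16),
    "Zombie": (423.75, 12),
}


def mean_without(avg, n, outlier):
    """Mean of the remaining n - 1 values after removing one known value."""
    return (avg * n - outlier) / (n - 1)


without_pynchon = {
    name: round(mean_without(avg, n, PYNCHON_PAGES), 2)
    for name, (avg, n) in rounds.items()
}

for name, value in without_pynchon.items():
    print(f"{name}: {value:.2f}")
```

With those assumed counts, the results match the second column of the table to two decimal places. The Champion row is left out because its two averages are identical, meaning Against the Day never appeared in a final.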
When you remove Pynchon, the average page count by round actually follows something (very, very roughly) approximating a normal curve, which is a little odd, yet pretty neat, just to give a rough sense of the trending. In order to get any sense out of this, however, we need to drill down and create some finer categories for length. I've separated all books into four groups: over 500pp, under 250pp, between 250 and 350, and between 350 and 500pp. (W's=Wins, Z=Zombie, Ch=Championship)

Category       Books in Field   1st Rd W's   2nd Rd W's   SF W's   Z W's   Ch W's
> 500          9                6            4            1        0       1
< 250          15               7            2            2        2       1
> 250, < 350   25               12           6            2        4       2
> 350, < 500   15               7            4            3        0       0

As you can see, the first two rounds are hard on the shortest category, books under 250pp, which is what pushes the average up in the 2nd and then the semifinal rounds. But the semifinal round absolutely kills the 500+pp. books, which is what knocks the average down in the following rounds. The average page count continues to drop as we see some more attrition in the between 350 and 500 pp. books. The between 250 and 350 pp. category obviously does the best overall, and its attrition is fairly consistent down the line. (Again, note the preference in the Zombie Round for this shorter category.) If you want to stake your money or your reputation on a book, I think the numbers stand behind this range.
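To put the attrition described above on a common scale, here is a minimal sketch that converts the raw win counts into round-to-round advancement rates. The counts are my reading of the table (books in field, then wins in the first round, second round, and semifinals); the Zombie and Championship columns are left out because zombies re-enter the bracket, so those wins are not a simple continuation of the same sequence. The function name `advancement_rates` is hypothetical.

```python
# Columns: books in field, 1st-round wins, 2nd-round wins, semifinal wins.
counts = {
    "> 500 pp": (9, 6, 4, 1),
    "< 250 pp": (15, 7, 2, 2),
    "250-350 pp": (25, 12, 6, 2),
    "350-500 pp": (15, 7, 4, 3),
}


def advancement_rates(row):
    """Share of books entering each round that won it, for consecutive rounds."""
    return [round(won / entered, 2) for entered, won in zip(row, row[1:])]


rates = {category: advancement_rates(row) for category, row in counts.items()}

for category, (first, second, semi) in rates.items():
    print(f"{category}: 1st {first:.2f}, 2nd {second:.2f}, SF {semi:.2f}")
```

Read this way, the under-250 group's second-round rate falls to 0.29 and the over-500 group's semifinal rate falls to 0.25, which lines up with the two drop-offs noted above. The sample in each cell is tiny, so these rates are only a rough guide.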

Books which are between 250 and 350 pp. make up the bulk of this year's field: The White Tiger, The Lazarus Project, Unaccustomed Earth, My Revolutions, The Disreputable History of Frankie Landau-Banks, The Dart League King, Steer Towards Rock, Netherland, Home, and Harry, Revised. Going back to previous analyses, I think we can rule out the #4 seeds, Harry, Revised, Steer Towards Rock, Disreputable History…, and The Dart League King. It's possible one will catch some attention, but unless Mark Sarvas has whipped his blog readers to vote for him as a potential Zombie revival, none of these will probably be seen past the second round. My Revolutions, similarly, likely won't make it out of the second round, judging by past performance of #3 seeds. If Home wins its first round matchup, it will almost certainly win in the second round, since FSG always wins in Round 2, but probably won't win in the Semifinals. Unaccustomed Earth, published by Knopf, will probably add to the Knopf Second Round Curse: it probably won't make it to the Semifinals, as the Second Round has become Knopf's tomb.

I'll take a look at what will factor into the remaining possibilities (White Tiger, Lazarus Project, and Netherland) in another post, and also run some numbers regarding judges.

2 comments:

Anonymous said...

Andrew Seal: Let me be the first to congratulate you on your fine, fine commentary. I've decided right now to officially make you my (R)ToB Bill Simmons (ESPN.com). I plan to use your data and analysis in the online bets I'm about to make. You're gonna give Warner and Guilfoile a serious run for their commentator's money. (Which commentating, for what it's worth, is far more Statler and Waldorf [and I like the Muppets, but still; I don't look to them for analysis] than Bill Simmons.)

Anyway, I'm gushing. I'm a fan of what you're doing here with ToB. Keep it coming.

--Matt Evans

JM said...

You left out Bolano in that last little analysis. Which is too bad because it's going to win, despite its girth.