Can K*n D*venport do Math? A four-part series (Part I)

I don’t like K*n D*venport (KD). I know he brought the beautiful Spring Awakening revival to Broadway in 2015 and also produced this season’s glorious Once on This Island, and I am super thankful that these shows made it to Broadway. However, he’s done some truly shady shit in the past, and he also presents himself as an analytics expert when it comes to Broadway. I realize that he, as a producer, has a lot of experience in the theater industry that I don’t have. But as a trained statistician, I have a lot of issues with some of the methods he uses in the analytics on his blog, The Producer’s Perspective. In Part I, I want to tackle something he posted in January 2015: “Sure, these Broadway shows recouped! But why? A by the numbers infographic.” Which brings us to today’s question:

Does K*n D*venport know how to round numbers?

I spend a lot of time thinking about how hard it is to measure success on Broadway. One approach is to think like a producer who has no interest in theater and whose only goal is to make money. (Just writing this breaks my heart.) In this situation, recoupment of capitalization costs is a really useful outcome to consider. Loosely, initial capitalization costs are the amount of money that is needed to put a production together from the ground up, which is separate from a show’s weekly running costs (weekly wages for cast and crew, rent paid to the theater, etc.). A show is only really considered a ‘hit’ or ‘financial success’ if the investors are paid back the initial capitalization. When I saw KD’s post about recoupment statistics, I thought: Great! First of all, we love infographics. Plus, here’s someone who may have some more insider information about shows that were less transparent about their finances, or at least less boastful about their successes. Indeed, the footnote at the end of his infographic says:

*Only commercial musicals were studied for this infographic. In order to determine if a show recouped or not, we looked for public announcements and talked to some of our inside sources. In order to have recouped for this study, a musical needed to earn back its capitalization while in performances on Broadway. Some shows may have recouped after closing on Broadway through subsidiary rights, etc. Those shows were not included in our “recouped” category.

His infographic only includes original Broadway musicals that opened between 1994 and 2014. I would love to know exactly which of these musicals recouped – but KD doesn’t provide the counts behind any of his calculations. He doesn’t provide a list of these musicals, or even the number of musicals that recouped. However, it dawned on me that I could use his statistics to do some back-calculations, recreate the list of musicals that recouped with my own research, and verify my list by seeing if I can replicate the numbers on the infographic. He claims that 44% of shows that recouped won the Tony Award for Best Musical, 16% were Disney shows, and so on. I should be able to use these results to recreate his original list.

A preliminary list of musicals that opened from 1994-2014 and recouped by January 7, 2015

Below, I’ve included a list of the 32 musicals for which I could find official press releases reporting successful recoupment, along with their initial capitalization costs where available. I couldn’t find a press release saying Rent had recouped, but that may be because I didn’t try very hard to find it, because…well…obviously. Same goes for Aladdin, but KD mentions Aladdin by name in his infographic, so I know it’s definitely on his list. I also made special note of the date of KD’s post, because Motown The Musical announced that it had recouped only nine days after the infographic was published.

Opening Date  Musical                                     Capitalization ($)
1994-04-18    Beauty and the Beast
1996-04-29    Rent
1997-11-13    The Lion King                               20 million
2000-03-23    Aida
2000-10-26    The Full Monty                              7.5 million
2001-04-19    The Producers                               10.5 million
2001-09-20    Urinetown                                   3.7 million
2001-10-18    Mamma Mia!                                  10 million
2002-08-15    Hairspray                                   10.5 million
2002-10-24    Movin’ Out                                  10 million
2003-07-31    Avenue Q                                    3.5 million
2003-10-16    The Boy From Oz                             8.25 million
2003-10-30    Wicked                                      14 million
2005-03-17    Spamalot                                    12 million
2005-05-02    The 25th Annual Putnam County Spelling Bee  3.5 million
2005-11-06    Jersey Boys
2005-12-01    The Color Purple                            11 million
2006-05-01    The Drowsy Chaperone                        8 million
2006-11-16    Mary Poppins
2006-12-10    Spring Awakening                            6 million
2008-03-09    In the Heights                              10 million
2008-11-13    Billy Elliot: The Musical                   18 million
2009-04-15    Next To Normal                              4 million
2009-10-19    Memphis                                     12 million
2011-03-24    The Book of Mormon                          11.4 million
2012-03-18    Once                                        5.5 million
2012-03-29    Newsies The Musical                         5 million
2013-04-04    Kinky Boots                                 13.5 million
2013-04-11    Matilda The Musical                         16 million
2013-11-17    A Gentleman’s Guide to Love & Murder        7.5 million
2014-01-12    Beautiful The Carole King Musical           13 million
2014-03-20    Aladdin

Musicals I thought might have recouped but could not confirm or deny: The Full Monty, Fosse, Curtains, Smokey Joe’s Cafe, Ragtime, and more. This suggests that the number of musicals on KD’s list is somewhere in the 30s.

Finding the denominator: how many musicals are on KD’s list?

Mathematically, how would we go about calculating the number of new musicals that recouped from 1994-2014, according to KD? Let’s discuss in general, outside of the context of theater, how we might approach this problem.

Suppose we’re trying to recreate a list of items and we know:

  • 23% of the items meet some criterion A, and
  • There are somewhere between 10 and 30 items on that list.

We know right off the bat that there can’t be 10 items on the list, because if 2 items met criterion A, you’d calculate 20%, and if 3 items met criterion A, you’d calculate 30%. (We are dealing only with whole numbers here.) There also can’t be 11 items: if 2 items met criterion A, you’d calculate 2/11 ≈ 0.182, or 18%, and if 3 items did, you’d calculate 3/11 ≈ 0.273, or 27%. Likewise, 2/12 ≈ 0.167, or 17%, and 3/12 = 25%, so 12 doesn’t work either. But if you had 13 items, and 3 items met criterion A, you’d get 3/13 ≈ 0.2308, or 23%. You might then think that your list has 13 items because the percentages work out.

But what if you had 22 items and 5 met criterion A? 5/22 ≈ 0.227, which also rounds to 23%. If 6 of 26 items meet criterion A, 6/26 ≈ 0.2308, or 23%. Finally, if you had 30 items and 7 met criterion A, 7/30 ≈ 0.2333, or 23%. Therefore, if we only have the two pieces of information outlined above, then we don’t know if there are 13, 22, 26, or 30 items on the list, because they all can give us the percentage we need. The rows corresponding to the lists of these sizes are highlighted in the table below.

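[Table 2: the percentages each list size can produce, with the rows for lists of 13, 22, 26, and 30 items highlighted]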
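The search above is easy to sketch in a few lines of Python. This is only an illustration: the function names are made up for this example, and I’m assuming percentages are rounded to the nearest whole number with halves rounded up.

```python
def rounds_to(k, n, pct):
    """True if k/n, expressed as a percentage and rounded half up, equals pct."""
    # Integer arithmetic for floor(100*k/n + 0.5), which avoids float surprises.
    return (200 * k + n) // (2 * n) == pct


def sizes_matching(pct, lo, hi):
    """List sizes n in [lo, hi] where some whole count k out of n rounds to pct."""
    return [
        n for n in range(lo, hi + 1)
        if any(rounds_to(k, n, pct) for k in range(n + 1))
    ]


# Which list sizes between 10 and 30 can produce "23%"?
print(sizes_matching(23, 10, 30))  # [13, 22, 26, 30]
```

Doing the rounding with integers means a borderline value like 37.5% is handled the same way every time.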

But suppose we added some more information:

  • 19% of the items meet some criterion B.

Now, of the lists with 13, 22, 26, or 30 items, we find that only the list with 26 items can produce the percentage we need: 5/26 ≈ 0.192, or 19%.
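Here is a minimal Python sketch of how each extra percentage narrows down the candidates. The helper names are hypothetical, and I’m again assuming whole-number percentages rounded half up.

```python
def achievable(pct, n):
    """True if some whole count k out of n rounds (half up) to pct percent."""
    return any((200 * k + n) // (2 * n) == pct for k in range(n + 1))


def sizes_matching_all(pcts, lo, hi):
    """List sizes n in [lo, hi] that can produce every percentage in pcts."""
    return [
        n for n in range(lo, hi + 1)
        if all(achievable(p, n) for p in pcts)
    ]


# One percentage leaves four candidates; a second one pins the size down.
print(sizes_matching_all([23], 10, 30))      # [13, 22, 26, 30]
print(sizes_matching_all([23, 19], 10, 30))  # [26]
```

The more percentages we know, the fewer list sizes survive, which is why a long list of published percentages should be enough to identify the denominator.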

KD’s infographic includes the following percentages, given here in ascending order and excluding repeats: 7%, 9%, 16%, 19%, 22%, 25%, 31%, 34%, 38%, 40%, 44%, 47%, 50%, 53%, 78%, 81%. From our preliminary list of musicals that recouped, we would probably guess that there are between 30 and 45 items on his list. So, all we have to do is scale up the table we assembled earlier and look for the rows that include all the percentages we need. There are so many percentages that it ought to be pretty easy to find the right number of musicals in KD’s list, right?

Not quite. It turns out that there is no reasonable number of musicals on KD’s list that can mathematically produce the percentages he uses in his infographic, unless he is rounding his numbers incorrectly.

I wrote a short R script to identify this number, and the closest we can get to those 16 percentages is a list of size 32. Fortunately, this is the number of musicals on my preliminary list – but that doesn’t guarantee that we have the same lists. A list of 32 musicals gives us all the right percentages except for two: 7% (musicals that opened in the summer) and 40% (won the Tony Award for Best Book). While it is true that if you just keep increasing the number of musicals in your list, you will eventually get all the percentages you need, the smallest number that gives us the 16 required percentages is 68. There’s no way 68 shows recouped between 1994 and 2014 when there were only 32 published press releases. This is why I used the word ‘reasonable’ in the above paragraph. The table below shows the raw counts for a list of size 32, along with the descriptors included in the infographic.

[Table 3: raw counts for a list of size 32, alongside the descriptors included in the infographic]

For musicals that won the Tony Award for Best Book, 13/32 ≈ 40.6% gets us close to 40%, but this (obviously) rounds to 41%. Maybe this was, in fact, just a rounding error. But for musicals that opened in the summer, 2/32 = 6.25% and 3/32 ≈ 9.38%, and nobody would round up or down to 7% from those numbers.
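For anyone who wants to check this, here is a minimal Python sketch of the same kind of search. It is not the R script mentioned above, just an illustration with made-up names, and it assumes every percentage on the infographic was rounded half up to a whole number.

```python
# The 16 distinct percentages from the infographic, as listed above.
INFOGRAPHIC_PCTS = [7, 9, 16, 19, 22, 25, 31, 34, 38, 40, 44, 47, 50, 53, 78, 81]


def achievable(pct, n):
    """True if some whole count k out of n rounds (half up) to pct percent."""
    return any((200 * k + n) // (2 * n) == pct for k in range(n + 1))


def missing(pcts, n):
    """Percentages in pcts that no whole count out of n can produce."""
    return [p for p in pcts if not achievable(p, n)]


# For each plausible list size, record which percentages cannot be produced.
in_range = {n: missing(INFOGRAPHIC_PCTS, n) for n in range(30, 46)}
best = min(in_range, key=lambda n: len(in_range[n]))

# Smallest list size of any kind that reproduces all 16 percentages.
smallest_exact = next(
    n for n in range(1, 1000) if not missing(INFOGRAPHIC_PCTS, n)
)

print(best, in_range[best])  # 32 [7, 40]
print(smallest_exact)        # 68
```

Under these assumptions, no list size from 30 to 45 reproduces all 16 percentages. A list of 32 comes closest, missing only 7% and 40%.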

So did I really just type all of this just to say that KD either made a rounding error or an unfortunate typo? Well, no, not really. The thing that bothers me about the infographic is that it provides zero useful contextual information. Cool, so 53% of new musicals that recouped were adaptations. What I’d really like to know is: what % of musicals that didn’t recoup were adaptations? Of musicals that are adaptations, what % recouped? Sure, adaptations might make up a larger share of musicals that recouped than jukebox musicals (19%), but there are way more adaptations that make it to Broadway compared to jukebox musicals. There are also some really odd and questionable statistics KD chose to include. The one that really irks me is the statistic that “81% had caucasian protagonists.” Again, how does this percentage compare to musicals that didn’t recoup? And how exactly are we defining “Caucasian protagonist”? Are we talking about the actors in the show, or the characters they play? Does Wicked count? (Sure, Glinda has to be blonde, but she sure as hell does not need to be white. I’m getting sidetracked.) And is he trying to say that musicals with white protagonists are really successful? He also mentions that “Only 3 shows featured stars in the cast.” What’s a star?

As a statistician and data scientist, this stuff drives me crazy. There’s a movement in science recently towards reproducibility, which means that another person should be able to obtain your input data, implement your definitions and methods, and generate the same exact results. I know that KD is not a scientist and it might not be in his best financial interests to disclose his methodology, but I am, and I’m not here to make money.

This is what made me want to start writing about theater and statistics in the first place. I want to treat this like science and I want my research to be fully reproducible. I’m here because I want to transparently discuss my results and analyses and share them with others to receive feedback and criticism, not sell you a subscription. I think we can learn a lot about theater from data, and this means being open about how we think about things, not baiting readers to get them to buy a subscription.

The comments section on KD’s post is full of questions like the ones I mentioned above. He replies to none of them, but please leave me comments if you have any thoughts or ideas about this stuff! I would love nothing more than to hear from all you other theater/data nerds out there.


Next time, in Part II of “Can K*n D*venport do Math? A four-part series”, I’ll start checking whether my list of 32 shows is right and whether I can actually reproduce the counts and percentages in the blog post. Be warned: something weird is going on with how KD is counting shows that won best score at the Tonys, and if anyone has any ideas about how he’s defining “completely original”… let me know in the comments.

 
