probability

March 2016 [Edited: April 2016]

Brewing Some Numbers

Schlitz Beer

In the early 1980s Joseph Schlitz Brewing Co. came up with an ingenious campaign to improve its dwindling market share. They bought TV air time and conducted blind taste tests live during football games, culminating in one during the Super Bowl in 1981. Among Schlitz's competitors at that time were Budweiser, Miller and Michelob, and so Schlitz pitted their product against these three. As subjects for the tests they recruited not just any beer drinkers, but those who avowedly preferred Bud, Miller, or Michelob. They wanted to show that a fair number, hopefully a large proportion of them, would actually like Schlitz better. They intentionally (and for very good reasons) did not invite Schlitz fans.

The test went something like this: They asked 100 Michelob drinkers to taste two mugs of beer, one containing Michelob and the other Schlitz. This was a blind test, so the mugs were unlabeled and identical, and participants weren't told which mug contained which brand of beer. After sampling the brew the subjects made their choice. At the end of the test the preferences of all 100 participants were tallied and the scores announced.

Here's the backstory as to why Schlitz dared try this trial, and live on air at that. According to Charles Wheelan (from whom I learned of this episode), manufacturers had known all along that these brands tasted practically the same. There was little discernible difference between them, and so a blind taste test between any two brands was in essence simply a toss-up. Given a mug of Bud and a mug of Schlitz, for example, a Bud drinker has roughly a 50% chance of picking either one.

This is why Schlitz intentionally excluded anyone who preferred their brand—the likelihood of these devoted Schlitz drinkers choosing a non-Schlitz was also 50%. Hardly something Schlitz wanted to show.

Schlitz could be so confident that their ruse, I mean, test would work in their favor because they must've had staff savvy enough to compute the probabilities. And truth be told coming up with the numbers is rather elementary. It merely involves the use of what (glassy-eyed) Prob 101 students take up in class—the binomial probability distribution.

So what numbers did Schlitz get? We can't know for sure but we can come up with some ballpark figures.

Assuming that any two brands taste practically the same, we set the probability p of a drinker choosing Schlitz over the other brand to 0.5. Since there were 100 participants, that's equivalent to having 100 trials (analogous to 100 coin tosses). (We can assume the subjects and their taste testing are independent of one another, i.e., one person tasting and choosing does not affect another person's test and choice.)
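To make the coin-toss analogy concrete, here is a minimal simulation sketch in Python. It is not anything Schlitz is known to have done; the function name, the seed, and the choice of 10,000 simulated tests are all assumptions made purely for illustration. Each simulated drinker picks Schlitz with probability 0.5, independently of everyone else.

```python
import random

def simulate_taste_test(n_drinkers=100, p_schlitz=0.5, rng=random):
    """Return how many of n_drinkers pick Schlitz in one simulated blind test."""
    # Each drinker is an independent "coin toss" that lands Schlitz with
    # probability p_schlitz (assumed to be 0.5, as in the text).
    return sum(1 for _ in range(n_drinkers) if rng.random() < p_schlitz)

rng = random.Random(1981)  # arbitrary fixed seed so reruns give the same numbers
results = [simulate_taste_test(rng=rng) for _ in range(10_000)]

print("lowest score :", min(results))
print("highest score:", max(results))
print("average score:", sum(results) / len(results))
```

The scores should cluster around 50, with the extremes giving a feel for how far a single 100-person test can plausibly stray from an even split.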

So what's the probability that at least 40 Miller drinkers (i.e., 40 or more) would choose Schlitz? That's the probability of exactly 40 drinkers choosing Schlitz + the probability of exactly 41 drinkers choosing Schlitz + ... + the probability of 99 drinkers choosing Schlitz + the probability of all 100 drinkers choosing Schlitz. In mathematical lingo, that's P(x ≥ 40).

Rather than providing the binomial equation and doing the mind-numbingly tedious calculations by hand, let's use a binomial distribution calculator, one which displays the results in a convenient table form. To use it, type in the number of trials, which in this case is 100, and the probability p, which is 0.5. Then click on the button labeled "Create Binomial Probability Table".

From the table we see that P(x ≥ 40) is around 98%.

And the probability of getting at least 45 drinkers? P(x ≥ 45) = 86%.
50 or more? P(x ≥ 50) = 54%.
Getting at least 60 is close to hoping against hope: P(x ≥ 60) < 3%.
And the probability of getting over 70 is so low that it would take a miracle from the god of fermentation.
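For anyone who would rather script it than click through an online calculator, the tail probabilities above can be reproduced with a few lines of Python. This is only a sketch under the same assumptions as the text (100 independent drinkers, p = 0.5); the helper names binom_pmf and prob_at_least are made up for this example.

```python
from math import comb

def binom_pmf(k, n=100, p=0.5):
    """Probability that exactly k of n drinkers pick Schlitz."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def prob_at_least(k, n=100, p=0.5):
    """P(x >= k): add up the probabilities of k, k+1, ..., n."""
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))

# Thresholds discussed in the text.
for k in (40, 45, 50, 60, 70):
    print(f"P(x >= {k}) = {prob_at_least(k):.4%}")
```

Changing p or n in the calls lets you see how quickly the picture shifts if the two beers are not quite a toss-up.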

Wheelan says that versus Michelob, Schlitz got exactly 50 taste testers choosing their brew. Not bad at all.

A full-page Schlitz ad which appeared in the February 4, 1981 issue of The Cornell Daily Sun provides numbers for some of the tests conducted. In the first test of Bud drinkers, 46 out of 100 picked Schlitz. In the second test one week later exactly half (of a hundred) chose Schlitz.

A short NYTimes article reports the numbers for the Miller tests. In those tests, conducted a week apart, 37% and 38% of the participants chose Schlitz. Not as dramatic as with the Bud and Michelob crowd, but still it's safe to assume that among Miller drinkers there must've been a few who thought: Wow, over a third of our buddies actually like Schlitz better. Maybe it ain't so bad. Might give it a try one of these days. (Of course, the knee-jerk reaction of explaining away such cognitive dissonance would probably have kicked in for most of the die-hard Millerites; readers of psychologist Leon Festinger will get the pun.)

Let's take a closer look at Schlitz's poor(er) performance against Miller. In the binomial probability table we see that Schlitz was almost guaranteed to have at the very least 30 drinkers picking their brew. That's about the worst they could realistically do in such a test. The probability of getting fewer than 30 is close to infinitesimal. So getting at least 37 wasn't at all unexpected. What would've been a head scratcher (and pink slips for those in the marketing department) would've been if they got fewer than 30.

On the other hand, the probability of getting 37 or fewer is a mere 0.6%. That's low indeed. What makes the poor performance vis-a-vis Miller intriguing is that Schlitz got 37 on one test and 38 on the other. That makes us wonder whether our assumption of p = 0.5 is significantly off the mark. Perhaps there is in fact just a hint of flavor or aroma that distinguishes Miller from Schlitz, enough to tip off some drinkers. If so, then p would be less than 0.5 (remember, p is the probability of choosing Schlitz). If we had to compute the probability of choosing Schlitz over Miller using just the taste test results, we'd get p = 0.375 (out of the 200 taste testers, 37 + 38 = 75 chose Schlitz; therefore, p = 75 / 200 = 0.375). Could it be that p is in fact closer to 0.4 than 0.5?
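The lower-tail figures and the pooled estimate can be checked with a short Python sketch. As before, it assumes 100 independent drinkers per test and p = 0.5 for the tail probabilities; the helper name prob_at_most is made up for this example.

```python
from math import comb

def prob_at_most(k, n, p):
    """P(x <= k) for a binomial with n trials and success probability p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(0, k + 1))

n, p = 100, 0.5
print(f"P(x < 30)  = {prob_at_most(29, n, p):.6f}")  # fewer than 30 means 29 or fewer
print(f"P(x <= 37) = {prob_at_most(37, n, p):.4f}")

# Pooled estimate of p from the two Miller tests (37 and 38 out of 100 each).
picked_schlitz = 37 + 38
tasters = 100 + 100
p_hat = picked_schlitz / tasters
print(f"pooled estimate of p = {p_hat}")
```

Pooling the two tests this way treats them as one 200-person sample, which is itself an assumption: it presumes the two groups of Miller drinkers were comparable.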

A total of 200 drinkers is not that large a sample. We'd be far more confident that p is less than 0.5 if Schlitz had recruited something like a thousand participants. So we can't entirely rule out that Schlitz merely had a run of bad luck and that there is no real difference between their beer and Miller's.
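One rough way to see how much the sample size matters is to put an approximate 95% interval around the estimated p. The sketch below uses the textbook normal approximation, which is a choice made here for simplicity, not something taken from Schlitz or Wheelan. The 1,000-participant case is entirely hypothetical: it simply assumes the same 37.5% share had shown up in a larger sample.

```python
from math import sqrt

def approx_interval(successes, n, z=1.96):
    """Rough 95% interval for p using the normal approximation."""
    p_hat = successes / n
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width

observed = approx_interval(75, 200)         # the two Miller tests pooled
hypothetical = approx_interval(375, 1000)   # made-up larger sample, same 37.5% share

for label, (low, high) in (("n = 200 ", observed), ("n = 1000", hypothetical)):
    print(f"{label}: {low:.3f} to {high:.3f} (width {high - low:.3f})")
```

With five times as many tasters the interval narrows by a factor of about the square root of 5, which is the sense in which a thousand participants would pin p down more firmly.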

Just food for thought. Now go get a cup of tea, Earl Grey, hot, to make it go down a lot smoother. No intoxicants, please. Alcohol and clear thinking don't go together very well.