negative binomial distribution | College Math Teaching

March 6, 2010

Probability, Evolution and Intelligent Design

Filed under: data fitting, evolution, media, negative binomial distribution, probability, science — collegemathteaching @ 10:48 pm

I always enjoy seeing a bit of mathematics in the mainstream media. One place where it occurred was in Jerry Coyne’s review (in The New Republic) of some popular “science” books that attempted to attack evolutionary theory. The review is called The Great Mutator.

Much of the review is about the mechanisms of evolution (and the ubiquitous “wind sweeping through the junkyard and making a 747” argument is demolished). But there is some mathematics used to illustrate an example:

Suppose a complex adaptation involves twenty parts, represented by twenty dice, each one showing a six. The adaptation is fueled by random mutation, represented by throwing the dice. Behe’s way of getting this adaptation requires you to roll all twenty dice simultaneously, waiting until they all come up six (that is, all successful mutations must happen together).
The probability of getting this outcome is very low; in fact, if you tossed the dice once per second, it would take about a hundred million years to get the right outcome.
But now let us build the adaptation step by step, as evolutionary theory dictates. You start by rolling the first die, and keep rolling it until a six comes up. When it does, you keep that die (a successful first step in the adaptation) and move on to the next one. You toss the second die until it comes up six (the second step), and so on until all twenty dice show a six. On average, this would take about a hundred and twenty rolls, or a total of two minutes at one roll per second.

So, how does the mathematics work?

In the first example, the probability of getting 20 sixes in any one roll of all 20 dice is, of course, $(1/6)^{20}$. If we repeat the experiment and stop at our first “all 20” outcome, the number of tries follows the geometric distribution with $p = (1/6)^{20}$, and the expected number of tries up to and including the first “success” (the all-20-sixes outcome) is $1/p = 6^{20} \approx 3.656 \times 10^{15}$. At a rate of 1 try per second, that is about 115.86 million years (using 24 hour days and 365.25 days per year).
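The arithmetic above is easy to check. Here is a minimal sketch in Python; the function name and the constant are my own, chosen only for illustration.

```python
# Seconds in a year, using 24 hour days and 365.25 days per year.
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60


def expected_tries_all_at_once(num_dice=20, sides=6):
    """Mean of a geometric distribution with p = (1/sides)**num_dice, i.e. 1/p."""
    return sides ** num_dice


tries = expected_tries_all_at_once()
years = tries / SECONDS_PER_YEAR  # at one try per second
print(f"{tries:.3e} tries, about {years / 1e6:.2f} million years")
```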

Now if we roll the first die until the first 6 comes up, then the second, the third, etc., and stop when we obtain the 20th six, the total number of rolls follows the negative binomial distribution with $p = 1/6, r = 20$. The expected value here is $r/p = 6(20) = 120$ tries. That is a total of 2 minutes at one try per second.
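If you would rather see the 120 come out of an experiment than out of a formula, a small simulation will do. This is only a sketch: the function names, the number of trials, and the seed are arbitrary choices of mine.

```python
import random


def rolls_until_all_sixes(num_dice=20, rng=random):
    """Roll one die at a time, moving to the next die only after a six appears."""
    total_rolls = 0
    for _ in range(num_dice):
        while True:
            total_rolls += 1
            if rng.randint(1, 6) == 6:
                break
    return total_rolls


def average_rolls(trials=20000, seed=0):
    """Average total number of rolls over many independent repetitions."""
    rng = random.Random(seed)
    return sum(rolls_until_all_sixes(rng=rng) for _ in range(trials)) / trials


print(average_rolls())  # should land close to r/p = 120
```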

Of course one can do better than that: we could roll the whole set of 20 dice, set aside every six that comes up, then roll the remaining dice, again set aside every new six, and continue until no dice remain.

Working out that distribution would be an excellent exercise!
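For readers who want a head start on the exercise, here is one way to set it up, under my own assumption that each roll of all the remaining dice counts as a single try. Each die then waits a geometric number of rounds for its six, independently of the others, so the number of rounds $T$ is the maximum of 20 independent geometric variables: $P(T \le k) = (1 - (5/6)^k)^{20}$, and $E[T] = \sum_{k \ge 0} P(T > k)$. The sketch below just sums that series numerically; the function names and the tolerance are illustrative.

```python
def prob_done_within(k, num_dice=20, p=1 / 6):
    """P(T <= k): every die has shown a six within its first k rounds."""
    return (1 - (1 - p) ** k) ** num_dice


def expected_rounds(num_dice=20, p=1 / 6, tol=1e-12):
    """E[T] as the sum over k >= 0 of P(T > k), stopped once terms drop below tol."""
    total, k = 0.0, 0
    while True:
        tail = 1 - prob_done_within(k, num_dice, p)
        if tail < tol:
            return total
        total += tail
        k += 1


print(expected_rounds())  # far fewer rounds than the 120 single-die rolls
```

With a single die this reduces to the ordinary geometric mean $1/p = 6$, which is a handy sanity check.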

But let’s return to the negative binomial distribution versus the geometric distribution case: if the probability of a mutation is $p$ and the number of required mutations is $r$, then the factor by which the “all at once” model overstates the expected waiting time, as a ratio of expected values, is $(1/p)^r/(r(1/p)) = (1/p)^{(r-1)}/r$, which grows exponentially in $r$ for every $p$ with $0 < p < 1$.
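To get a feel for how fast this ratio grows, one can tabulate it for the dice example. The sketch below uses exact rational arithmetic; the function name is hypothetical.

```python
from fractions import Fraction


def expected_value_ratio(p, r):
    """(1/p)**r divided by r/p: all-at-once mean over step-by-step mean."""
    p = Fraction(p)
    all_at_once = (1 / p) ** r   # geometric mean with success probability p**r
    step_by_step = r / p         # negative binomial mean
    return all_at_once / step_by_step


for r in (1, 2, 5, 10, 20):
    print(r, float(expected_value_ratio(Fraction(1, 6), r)))
```

For $p = 1/6$ and $r = 20$ the ratio is $6^{19}/20$, roughly $3 \times 10^{13}$.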

Note: the negative binomial distribution appears in another way: sometimes, scientists wish to calculate the number of mutations per time period. The Poisson distribution sometimes fails here because not all mutations have the same probability. So what one can do is modify the Poisson distribution by allowing the Poisson parameter itself to vary according to a gamma distribution (the exponential distribution is the special case with shape parameter 1, which yields a geometric distribution); the two parameters of the gamma distribution (shape and scale) then determine the two parameters of the negative binomial distribution.
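The mixture idea can be checked by simulation. The sketch below draws a Poisson rate from a gamma distribution (an exponential distribution is the special case with shape 1), then draws a Poisson count with that rate. The resulting counts should have mean shape × scale and variance shape × scale × (1 + scale), which are the negative binomial mean and variance with $r$ equal to the shape and $p = 1/(1 + \text{scale})$. All names and parameter values here are my own; the Poisson sampler is Knuth's simple multiplication method, which is adequate only for moderate rates.

```python
import math
import random


def poisson_sample(lam, rng):
    """Knuth's method: multiply uniforms until the product drops to exp(-lam)."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def gamma_poisson_counts(shape, scale, n, seed=0):
    """Draw n counts, each Poisson with its own gamma-distributed rate."""
    rng = random.Random(seed)
    return [poisson_sample(rng.gammavariate(shape, scale), rng) for _ in range(n)]


def mean_and_variance(xs):
    """Sample mean and unbiased sample variance."""
    m = sum(xs) / len(xs)
    v = sum((x - m) ** 2 for x in xs) / (len(xs) - 1)
    return m, v


shape, scale = 3.0, 2.0  # illustrative values only
counts = gamma_poisson_counts(shape, scale, 50000)
mean, var = mean_and_variance(counts)
print("mean:", mean, "expected:", shape * scale)
print("variance:", var, "expected:", shape * scale * (1 + scale))
```

Note that the variance exceeds the mean, which a plain Poisson model cannot reproduce; that overdispersion is the usual reason for reaching for the negative binomial.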

Instructions on how to fit the negative binomial distribution to data can be found here.
