# College Math Teaching

## April 5, 2019

Let’s start with an example from sports: basketball free throws. At certain times in a game, a player is awarded a free throw: the player stands 15 feet away from the basket and is allowed an unguarded shot, and a made basket is worth 1 point. In the NBA, a player will take 2 or 3 shots; the rules are slightly different for college basketball.

Each player will have a “free throw percentage” which is the number of made shots divided by the number of attempts. For NBA players, the league average is .672 with a variance of .0074.

Now suppose you want to determine how well a player will do, given, say, a sample of the player’s data. Under classical (aka “frequentist”) statistics, one looks at how well the player has done, calculates the sample percentage ($\hat{p}$), and then determines a confidence interval for the true $p$: using the normal approximation to the binomial distribution, this works out to $\hat{p} \pm z_{\frac{\alpha}{2}} \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$.

Yes, I know: for someone who has played a long time, one has career statistics. So imagine one is trying to extrapolate for a new player with limited data.

That seems straightforward enough. But what if one samples the player’s shooting during an unusually good or unusually bad streak? Example: former NBA star Larry Bird once made 71 straight free throws. If that were the sample, $\hat{p} = 1$ and the estimated variance $\hat{p}(1-\hat{p})/n$ is zero, so the confidence interval collapses to a single point! Needless to say, that trend is highly unlikely to continue.
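To see the degenerate interval concretely, here is a minimal Python sketch of the normal-approximation interval, assuming the standard form $\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$. The function name `wald_interval` and the 48-of-71 comparison sample are made up for illustration; only the 71-for-71 streak comes from the example above.

```python
import math
from statistics import NormalDist


def wald_interval(makes, attempts, confidence=0.95):
    """Normal-approximation confidence interval for a binomial proportion."""
    p_hat = makes / attempts
    # two-sided critical value z_{alpha/2}
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / attempts)
    return p_hat - half_width, p_hat + half_width


# The streak: p_hat = 1, so the standard error is 0 and the interval is a point.
streak_lo, streak_hi = wald_interval(71, 71)
print(streak_lo, streak_hi)

# Hypothetical, more typical sample of the same size, for comparison.
typical_lo, typical_hi = wald_interval(48, 71)
print(typical_lo, typical_hi)
```

The streak sample returns the same number for both endpoints, which is exactly the zero-width interval complained about above.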

Classical frequentist statistics doesn’t offer a way out but Bayesian Statistics does.

This is a good introduction:

But here is a simple, “rough and ready” introduction. Bayesian statistics uses not only the observed sample, but a proposed distribution for the parameter of interest (in this case, $p$, the probability of making a free throw). The proposed distribution is called a prior distribution or just prior, and it is often labeled $g(p)$.

Here we are dealing with what amounts to 71 Bernoulli trials, each with the same success probability $p$ (for which the league average of .672 serves as our starting guess). The random variable describing the outcome of each individual shot has probability mass function $p^{y_i}(1-p)^{1-y_i}$ where $y_i = 1$ for a make and $y_i = 0$ for a miss.

Our goal is to calculate what is known as a posterior distribution (or just posterior) which describes $g$ after updating with the data; we’ll call that $g^*(p)$.

How we go about it: use the principles of joint distributions, likelihood functions and marginal distributions to calculate $g^*(p|y_1, y_2, \ldots, y_n) = \frac{L(y_1, y_2, \ldots, y_n|p)g(p)}{\int^{1}_{0}L(y_1, y_2, \ldots, y_n|p)g(p)dp}$, where the integral runs over $[0,1]$ because $p$ is a probability and the prior is zero elsewhere.

The denominator “integrates out” p to turn that into a marginal; remember that the $y_i$ are set to the observed values. In our case, all are 1 with $n = 71$.

What works well is to use the beta distribution for the prior. Note: the pdf is $\frac{\Gamma (a+b)}{\Gamma(a) \Gamma(b)} x^{a-1}(1-x)^{b-1}$ for $0 \le x \le 1$, and if one uses $p = x$, this works very well because it has the same form in $p$ as the likelihood. The mean is $\mu = \frac{a}{a+b}$ and the variance is $\sigma^2 = \frac{ab}{(a+b)^2(a+b+1)}$, so given the required mean and variance, one can work out $a, b$ algebraically.
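One way to do that algebra, sketched with the shorthand $s = a + b$ (introduced only for this step): the mean gives $a = \mu s$ and $b = (1-\mu)s$, and substituting these into the variance formula leaves a single equation in $s$:

$$\sigma^2 = \frac{(\mu s)\,(1-\mu)s}{s^2(s+1)} = \frac{\mu(1-\mu)}{s+1} \quad\Longrightarrow\quad s = \frac{\mu(1-\mu)}{\sigma^2} - 1, \qquad a = \mu s, \qquad b = (1-\mu)s$$

This only yields a valid beta distribution when $\sigma^2 < \mu(1-\mu)$, so that $s > 0$.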

Now look at the numerator which consists of the product of a likelihood function and a density function: up to constant $k$, if we set $\sum^n_{i=1} y_i = y$ we get $k p^{y+a-1}(1-p)^{n-y+b-1}$
The denominator: same thing, but $p$ gets integrated out and the constant $k$ cancels; basically the denominator is what makes the fraction into a density function.

So, in effect, we have $kp^{y+a-1}(1-p)^{n-y+b-1}$ which is just a beta distribution with new $a^* =y+a, b^* =n-y + b$.
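If you would rather not take the "it is just another beta" claim on faith, it can be checked numerically. The sketch below evaluates likelihood times prior on a grid, divides by a midpoint-rule approximation of the denominator, and compares the result with the beta density with parameters $a + y$ and $b + n - y$. The prior Beta(2, 3) and the data (7 makes in 10 shots) are hypothetical numbers chosen only to keep the check small; the helper names are mine as well.

```python
import math


def beta_pdf(p, a, b):
    """Beta(a, b) density at p, for 0 < p < 1."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(p) + (b - 1) * math.log(1 - p))


def grid_posterior(y, n, a, b, points=2001):
    """Posterior density on a grid: likelihood * prior, normalized numerically."""
    width = 1.0 / points
    grid = [(i + 0.5) * width for i in range(points)]  # cell midpoints in (0, 1)
    unnormalized = [p**y * (1 - p) ** (n - y) * beta_pdf(p, a, b) for p in grid]
    evidence = sum(unnormalized) * width  # midpoint-rule value of the denominator
    return grid, [u / evidence for u in unnormalized]


# Hypothetical prior and data, for illustration only.
a, b, n, y = 2.0, 3.0, 10, 7
grid, posterior = grid_posterior(y, n, a, b)

# Largest pointwise difference from the closed-form Beta(a + y, b + n - y).
max_gap = max(
    abs(density - beta_pdf(p, a + y, b + n - y))
    for p, density in zip(grid, posterior)
)
print(max_gap)
```

The printed gap should be tiny, which is the numerical version of saying the constant $k$ cancels.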

So, I will spare you the calculation except to say that the NBA prior with $\mu = .672, \sigma^2 =.0074$ leads to $a = 19.355, b= 9.447$.

Now the update: with $y = n = 71$ (all makes, no misses), $a^* = 71+19.355 = 90.355$ and $b^* = 0 + 9.447 = 9.447$.
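For readers who want to reproduce these numbers, here is a minimal Python sketch that solves the mean and variance formulas for $a$ and $b$ and then applies the update rule $a^* = y + a$, $b^* = n - y + b$. The function names are mine. Feeding in the rounded inputs .672 and .0074 gives values that agree with the ones quoted here to about two decimal places; I assume the small difference comes from rounding in the quoted mean and variance.

```python
def beta_from_moments(mean, var):
    """Beta parameters (a, b) with the given mean and variance."""
    total = mean * (1 - mean) / var - 1  # this is a + b
    if total <= 0:
        raise ValueError("variance too large for a beta distribution with this mean")
    return mean * total, (1 - mean) * total


def update(a, b, makes, attempts):
    """Posterior beta parameters after observing `makes` out of `attempts`."""
    return a + makes, b + (attempts - makes)


# League-wide prior from the text.
a, b = beta_from_moments(0.672, 0.0074)
print(a, b)

# Update with the 71-for-71 streak.
a_post, b_post = update(a, b, 71, 71)
post_mean = a_post / (a_post + b_post)
print(a_post, b_post, post_mean)
```

Note that the posterior mean lands between the prior mean of .672 and the sample proportion of 1, pulled strongly toward the data because 71 shots outweigh the roughly 29 "pseudo-shots" ($a + b$) carried by the prior.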

What does this look like? (I used this calculator)

That is the prior. Now for the posterior:

Yes, shifted to the right, and very narrow as well. The information has changed, but we avoid the absurd contention that $p = 1$ with a confidence interval of zero width.

We can now calculate a “credible interval” of, say, 90 percent, to see where $p$ most likely lies: use the cumulative distribution function of the posterior to find this out:

And note that $P(p < .85) = .042, P(p < .95) = .958 \rightarrow P(.85 < p < .95) = .916$. In fact, Bird’s lifetime free throw shooting percentage is .886, which is well within this 91.6 percent credible interval, based on sampling from this one freakish streak.
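These two probabilities can also be checked without an online calculator. The sketch below integrates the posterior beta density numerically with composite Simpson's rule, using only the Python standard library. The helper names and the step count are arbitrary choices of mine, and the shortcut of returning 0 at the endpoints assumes $a, b > 1$, which holds for the posterior here.

```python
import math


def beta_pdf(x, a, b):
    """Beta(a, b) density; returns 0 at the endpoints (valid when a, b > 1)."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))


def beta_cdf(x, a, b, steps=20000):
    """P(p < x) by composite Simpson's rule on [0, x]; `steps` must be even."""
    h = x / steps
    total = beta_pdf(0.0, a, b) + beta_pdf(x, a, b)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * beta_pdf(i * h, a, b)
    return total * h / 3


# Posterior parameters from the text.
a_post, b_post = 90.355, 9.447

lower_tail = beta_cdf(0.85, a_post, b_post)  # compare with the .042 quoted above
upper_cut = beta_cdf(0.95, a_post, b_post)   # compare with the .958 quoted above
coverage = upper_cut - lower_tail            # probability that .85 < p < .95
print(lower_tail, upper_cut, coverage)
```

If a statistics library is available, its beta CDF would be the more natural tool; the hand-rolled integration is used here only to keep the example dependency-free.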