# College Math Teaching

## December 30, 2012

### 2012 in review

Filed under: Uncategorized — collegemathteaching @ 10:33 pm

The WordPress.com stats helper monkeys prepared a 2012 annual report for this blog.

Here’s an excerpt:

600 people reached the top of Mt. Everest in 2012. This blog got about 11,000 views in 2012. If every person who reached the top of Mt. Everest viewed this blog, it would have taken 18 years to get that many views.

## December 13, 2012

### Domains and Anti Derivatives (Indefinite Integration)

Grading student exams sometimes inspires me to revisit elementary topics. For example, I recently spoke about some unusual (but mostly correct) integration techniques used by students on a final exam.

I’ll recap (and adjust the example slightly): on a recent exam, a student encountered $\int \frac{2}{1-x^2} dx$. I had expected the student to use the usual partial fractions expansion to obtain $\int \frac{1}{1+x} dx + \int \frac{1}{1-x} dx = \ln|1+x| - \ln|1-x| + C$, which is valid when $x \ne \pm 1$. I admit to being a bad professor and not being picky about domains.

But one student noticed the $1 - x^2$ in the denominator of the fraction and so used the trig substitution $x = \sin(\theta), dx = \cos(\theta) d\theta$, which leads to the following integral: $\int \frac{2}{\cos(\theta)} d\theta = 2\ln|\sec(\theta) + \tan(\theta)| + C$, which leads to $2\ln|\frac{1}{\sqrt{1-x^2}} + \frac{x}{\sqrt{1-x^2}}| + C = 2\ln|\frac{1+x}{\sqrt{1-x^2}}| + C = \ln|1+x| - \ln|1-x| + C$ for $x \in (-1,1)$. Note that, strictly speaking, the “final answer” is really defined for all $x \ne \pm 1$, though the equalities do not hold outside of the domain for $x$ used in the original trig substitution.

And yes, I was a bad professor; I gave full credit to this answer even though we “lost domain” during the string of equalities.

But that got me to wondering: is there a trig substitution that works for $|x| > 1$? Answer: of course:

$\int \frac{2}{1-x^2} dx = -\int \frac{2}{x^2 -1} dx$. Now use $x = \sec(\theta), dx = \sec(\theta) \tan(\theta) d\theta$, which leads to $-2\int \csc(\theta) d\theta = 2\ln|\csc(\theta) + \cot(\theta)| + C = 2\ln|\frac{x}{\sqrt{x^2-1}} + \frac{1}{\sqrt{x^2 -1}}| + C = \ln|1+x| - \ln|1-x| + C$, which is our solution for $|x| > 1$.

So, if one REALLY wanted to use trig substitutions for this problem, one could, and in a way that covers the entire domain.

But…as our existence and uniqueness theorems imply, once we get a candidate for an anti-derivative that “works” on the domain, it really doesn’t matter if we did “illegal” steps to get it; we need only show that it is an anti-derivative and is valid for the entire domain of the integrand.
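That “just differentiate the candidate” check is easy to carry out by machine. Here is a quick sketch with sympy (assuming it is available; this is a verification aid, not part of the course):

```python
import sympy as sp

x = sp.symbols('x')

# Candidate anti-derivative from the computation above (constant dropped);
# on (-1, 1) the absolute values are redundant, and the expression
# differentiates the same way wherever it is defined.
F = sp.log(1 + x) - sp.log(1 - x)

# Differentiate and compare against the original integrand 2/(1 - x^2).
difference = sp.simplify(sp.diff(F, x) - 2/(1 - x**2))
print(difference)  # 0
```

The derivative matches the integrand symbolically, which is all the uniqueness argument requires.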

Now if one wants a more detailed discussion of domain issues for anti-derivatives, I can recommend the article “The Importance of Being Continuous” by D. J. Jeffrey, which appeared in Mathematics Magazine, Vol. 67, pp. 294–300. (A reprint can be found here; scroll down a bit. This mathematician has written quite a bit!) Note: I can recommend this little paper as it talks about the domains of the anti-derivatives themselves and not just the domains assumed in doing the calculations along the way, or the domains of validity of the substitutions. Note: integral tables and computer algebra systems don’t always give the anti-derivative with the “largest” possible domain. One has to watch for that.

## December 7, 2012

### “Unusual” Student Integral Tricks….that are correct!

Filed under: academia, calculus, editorial, elementary mathematics, integrals, integration by substitution, pedagogy — collegemathteaching @ 2:52 am

To make this list, the student has to do the integral correctly, but choose a painfully inefficient way of doing it.

On today’s final exam alone (from the most innocent to the most unusual and inefficient….)

1. $\int x\sqrt{x+1}\, dx$
Ok, there are two standard methods. The first (and easiest) is to do the change of variable $u = x+1$, which transforms this to $\int (u-1)\sqrt{u}\, du$, which is very easy to do. The second method: integration by parts, with $u = x, dv = \sqrt{x+1}\, dx$, etc. It is an algebraic exercise to see that one gets the same answer either way, though the answers look different at first.

One answer that I saw: $u = \sqrt{x+1}, u^2-1 = x, 2u\, du = dx$, which leads to $\int 2(u^2 -1)u^2\, du$, which of course is doable. So this isn’t that far off the easiest path; hence this entry only gets an “honorable mention”.

2. $\int \frac{1}{9-x^2} dx$. Of course, I thought that I was testing “partial fractions,” which leads to an answer of $\frac{1}{6}(\ln|3+x| - \ln|3-x|)+C$. Fair enough. But what did one of my students do? Well, this looked like trig substitution to him, so: $x = 3\sin(t), dx = 3\cos(t)\, dt$, and the integral was transformed to $\int \frac{3\cos(t)}{9\cos^2(t)}\, dt = \frac{1}{3}\int \sec(t)\, dt = \frac{1}{3}\ln|\sec(t) + \tan(t)|+C$, which transforms back to $\frac{1}{3}\ln|\frac{3}{\sqrt{9-x^2}} + \frac{x}{\sqrt{9-x^2}}| + C = \frac{1}{3}(\ln|3+x| - \frac{1}{2}(\ln|3-x|+\ln|3+x|))+C$, which is, of course, the correct answer.

Yes, I know that there are domain issues with the trig substitution (that is, the integral exists for all values of $x \ne \pm 3$), but I wasn’t being that picky. Besides, this trig substitution is really setting $t = \arcsin(\frac{x}{3})$, and we are really just choosing a convenient “branch” (meaning: viewing the domain “mod $(-1,1)$”) of the $\arcsin(x)$ function.

3. $\int \frac{(\arcsin(x))^2}{\sqrt{1-x^2}} dx$. Easy, you say? Why not let $u = \arcsin(x), du = \frac{1}{\sqrt{1-x^2}}\, dx$, etc. Yes, most did it that way. But then we had a couple do the following: $x = \sin(t), dx = \cos(t)\, dt, \arcsin(x) = t$, which leads to $\int t^2\, dt = \frac{t^3}{3} + C$, which transforms to $\frac{(\arcsin(x))^3}{3} + C$, which is the correct answer. 🙂
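Each of these answers can be verified the same way: differentiate and compare with the integrand. For trick 3, a quick sympy check (assuming sympy is available; this is just a sanity check, not how the students worked):

```python
import sympy as sp

x = sp.symbols('x')

F = sp.asin(x)**3 / 3                          # the students' answer
integrand = sp.asin(x)**2 / sp.sqrt(1 - x**2)  # the original integrand

# The difference of derivative and integrand should simplify to zero.
check = sp.simplify(sp.diff(F, x) - integrand)
print(check)  # 0
```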

Well, I tell my classes that “this isn’t a gymnastics meet; there are no “degree of difficulty points”” but some insist on trying to entertain me anyway. 🙂

## December 4, 2012

### Teaching Linear Regression and ANOVA: using “cooked” data with Excel

During the linear regression section of our statistics course, we do examples with spreadsheets. Many spreadsheets have data processing packages that will do linear regression and provide output which includes things such as confidence intervals for the regression coefficients, the $r, r^2$ values, and an ANOVA table. I sometimes use this output as motivation to plunge into the study of ANOVA (analysis of variance) and have found “cooked” linear regression examples to be effective teaching tools.

The purpose of this note is NOT to provide an introduction to the type of ANOVA that is used in linear regression (one can find a brief introduction here or, of course, in most statistics textbooks) but to show a simple example using the “random number generation” features in Excel (with the data analysis pack loaded).

I’ll provide some screen shots to show what I did.

If you are familiar with Excel (or spread sheets in general), this note will be too slow-paced for you.

Brief Background (informal)

I’ll start the “ANOVA for regression” example with a brief discussion of what we are looking for: suppose we have some data which can be thought of as a set of $n$ points in the plane, $(x_i, y_i)$. Of course the set of $y$ values has a variance, which is calculated as $\frac{1}{n-1} \sum^n_{i=1}(y_i - \bar{y})^2 = \frac{1}{n-1}SS$.

It turns out that the “sum of squares” $SS = \sum^n_{i=1} (y_i - \hat{y_i})^2 + \sum^n_{i=1}(\hat{y_i} - \bar{y})^2$, where the first term is called the “sum of squares error” and the second term is called the “sum of squares regression”; or: $SS = SSE + SSR$. Here is an informal way of thinking about this: SS is what you use to calculate the “sample variation” of the y values (one divides this term by $n-1$). This “grand total” can be broken into two parts: the first part is the difference between the actual y values and the y values predicted by the regression line. The second is the difference between the predicted y values (from the regression) and the average y value. Now imagine if the regression slope term $\beta_1$ were equal to zero; then the SSE term would be, in effect, the SS term, and the second term SSR would be, in effect, zero ($\bar{y} - \bar{y}$). If we denote the standard deviation of the y’s by $\sigma$, then $\frac{SSR/\sigma^2}{SSE/((n-2)\sigma^2)}$ is a ratio of independent chi-square variables (each divided by its degrees of freedom) and is therefore $F$ with 1 numerator and $n-2$ denominator degrees of freedom. If $\beta_1 = 0$ or were not statistically significant, we’d expect the ratio to be small.

For example: if the regression line fit the data perfectly, the SSE term would be zero and the SSR term would equal the SS term, as the predicted y values would equal the actual y values. Hence the ratio of (SSR/constant) over (SSE/constant) would be infinite.

That is, the ratio that we use roughly measures the percentage of variation of the y values that comes from the regression line versus the percentage that comes from the error from the regression line. Note that it is customary to denote SSE/(n-2) by MSE and SSR/1 by MSR (Mean Square Error, Mean Square Regression).
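The SS = SSE + SSR bookkeeping above can also be sketched outside of a spreadsheet. Here is a minimal Python version (assuming numpy is installed; the six data points are made up for illustration, not taken from the Excel example below):

```python
import numpy as np

# Hypothetical data with a clear linear trend plus a little noise.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([9.2, 13.8, 19.1, 24.3, 28.7, 34.1])
n = len(y)

# Least-squares fit y ≈ b0 + b1*x (polyfit returns slope first).
b1, b0 = np.polyfit(x, y, 1)
y_hat = b0 + b1 * x

SS  = np.sum((y - y.mean())**2)       # total sum of squares
SSE = np.sum((y - y_hat)**2)          # sum of squares error
SSR = np.sum((y_hat - y.mean())**2)   # sum of squares regression

F = (SSR / 1) / (SSE / (n - 2))       # MSR / MSE
print(SS, SSE + SSR, F)
```

For this (nearly linear) toy data, SS equals SSE + SSR up to rounding and the F value is huge, just as in the “perfect line” Excel example below.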

The smaller the numerator is relative to the denominator, the less the regression explains.

The following examples using Excel spread sheets are designed to demonstrate these concepts.

The examples are as follows:

Example one: a perfect regression line with “perfect” normally distributed residuals (remember that the usual hypothesis test on the regression coefficients depend on the residuals being normally distributed).

Example two: a regression line in which the y-values have a uniform distribution (and are not really related to the x-values at all).

Examples three and four: show what happens when the regression line is “perfect” and the residuals are normally distributed, but have greater standard deviations than they do in Example One.

First, I created some x values and then came up with the line $y = 4 + 5x$. I then used the formula bar as shown to create that “perfect line” of data in the column called “fake” as shown. Excel allows one to copy and paste formulas such as these.

This is the result after copying:

Now we need to add some residuals to give us a non-zero SSE. This is where the “random number generation” feature comes in handy. One goes to the data tag and then to “data analysis”

and clicks on “random number generation”:

This gives you a dialogue box. I selected “normal distribution”; then I selected “0” for the mean and “1” for the standard deviation. Note: the assumption underlying the confidence interval calculation for the regression parameter confidence intervals is that the residuals are normally distributed and have an expected value of zero.

I selected a column for output (as many rows as x-values) which yields a column:

Now we add the random numbers to the column “fake” to get a simulated set of y values:

That yields the column Y as shown in this next screenshot. Also, I used the random number generator to generate random numbers in another column; this time I used the uniform distribution on [0,54]; I wanted the “random set of potential y values” to have roughly the same range as the “fake data” y-values.

Y holds the “non-random” fake data and YR holds the data for the “Y’s really are randomly distributed” example.

I then decided to generate two more “linear” sets of data; in these cases I used the random number generator to generate normal residuals of larger standard deviation and then create Y data to use as a data set; the columns of residuals are labeled “mres” and “lres” and the columns of new data are labeled YN and YVN.

Note: in the “linear trend data” I added the random numbers to the exact linear model y’s labeled “fake” to get the y’s to represent data; in the “random-no-linear-trend” data column I used the random number generator to generate the y values themselves.
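The column-building steps above can be sketched in code as well. A minimal Python version (assuming numpy is installed; the x values, seed, and residual standard deviations are my assumptions for illustration, not the blog's actual spreadsheet values):

```python
import numpy as np

rng = np.random.default_rng(0)

# "Perfect line" column (the 'fake' column): y = 4 + 5x exactly.
x = np.arange(1.0, 31.0)      # 30 hypothetical x values
fake = 4 + 5 * x

# Example 1: add N(0, 1) residuals to the exact line.
Y = fake + rng.normal(0.0, 1.0, size=x.size)

# "No trend" column: the y values themselves are uniform on [0, 54],
# unrelated to x.
YR = rng.uniform(0.0, 54.0, size=x.size)

# Examples 3 and 4: same line, residuals with larger standard deviations
# (sd = 5 and sd = 15 are illustrative choices).
YN  = fake + rng.normal(0.0, 5.0, size=x.size)
YVN = fake + rng.normal(0.0, 15.0, size=x.size)
```

Y tracks x almost perfectly while YR shows essentially no linear relationship, mirroring the spreadsheet columns.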

Now it is time to run the regression package itself. In Excel, simple linear regression is easy. Just go to the data analysis tab and click, then click “regression”:

This gives a dialogue box. Be sure to tell the routine that you have “headers” to your columns of numbers (non-numeric descriptions of the columns) and note that you can select confidence intervals for your regression parameters. There are other things you can do as well.

You can select where the output goes. I selected a new data sheet.

Note the output: the $r$ value is very close to 1, the p-values for the regression coefficients are small, and the calculated regression line (used to generate the $\hat{y_i}$’s) is:
$y = 3.70 + 5.01x$. Also note the ANOVA table: the SSR (sum of squares regression) is very, very large compared to the SSE (sum of squares residual), as expected; the variation in the y values is almost completely explained by the variation in the predicted y values from the regression line. Hence we obtain an obscenely large F value; we easily reject the null hypothesis (that $\beta_1 = 0$).

This is what a plot of the calculated regression line with the “fake data” looks like:

Yes, this is unrealistic, but this is designed to demonstrate a concept. Now let’s look at the regression output for the “uniform y values” (y values generated at random from a uniform distribution of roughly the same range as the “regression” y-values):

Note: $r^2$ is nearly zero, we fail to reject the null hypothesis that $\beta_1 = 0$ and note how the SSE is roughly equal to the SS; the reason, of course, is that the regression line is close to $y = \bar{y}$. The calculated $F$ value is well inside the “fail to reject” range, as expected.

A plot looks like:

The next two examples show what happens when one “cooks” up a regression line with residuals that are normally distributed, have mean equal to zero, but have larger standard deviations. Watch how the $r$ values change, as well as how the SSR and SSE values change. Note how the routine fails to come up with a statistically significant estimate for the “constant” part of the regression line but the slope coefficient is handled easily. This demonstrates the effect of residuals with larger standard deviations.
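For readers without Excel, the same experiment can be sketched with scipy (assuming numpy and scipy are installed; the seed, x values, and standard deviations are illustrative choices, not the spreadsheet's):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = np.arange(1.0, 31.0)

# Fit y = 4 + 5x plus normal residuals of increasing standard deviation
# and watch r^2 fall as the residuals grow.
r_squared = []
for sd in (1.0, 5.0, 15.0):   # hypothetical residual standard deviations
    y = 4 + 5 * x + rng.normal(0.0, sd, size=x.size)
    fit = stats.linregress(x, y)
    r_squared.append(fit.rvalue**2)
    print(f"sd={sd}: r^2={fit.rvalue**2:.4f}, p={fit.pvalue:.3g}")
```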

## December 1, 2012

### One challenge of teaching “brief calculus” (“business calculus”, “applied calculus”, etc.)

Today’s exam covered elementary integrals and partial derivatives; in our course we usually mention two variable functions and show how to calculate some “easy” partial derivatives.

So today’s exam saw a D/F student show up late (as usual); keep in mind this is an 8 am class (no class prior to it). He, as usual, got little or nothing correct. Of course we had the usual $\int \frac{1}{x^2} dx = \ln(x^x) + C, \int^1_0 3e^{5x}dx = (15e^5 -15) + C$, etc.

But there was this too: note that we had barely discussed partial derivatives and how to calculate them “by the formula.” But I did give the following bonus question: “Is it possible to have a function $f(x,y)$ where $f_x = x^3 + y^3$ and $f_y = 3xy$?” Yes, this is a common question in multivariable calculus (e.g., “is this vector field conservative?”), but remember this is a “brief calculus” course.

A few students took the challenge; some computed $\int(x^3 + y^3)dx = \frac{x^4}{4}+ xy^3 + C, \int (3xy)dy = \frac{3}{2}xy^2+C$ and noted that the two functions cannot be made to match (I didn’t expect them to recognize that functions of one variable alone represent the constants of integration). Some took the second partials and noted $f_{xy} = 3y^2, f_{yx} = 3y$, and that these don’t match. Again, this was NOT a problem that we practiced.
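The students’ second-partials argument can be checked mechanically. A quick sympy sketch (assuming sympy is available; this is just a verification of the bonus question, not something the class would do):

```python
import sympy as sp

x, y = sp.symbols('x y')

fx = x**3 + y**3   # the proposed f_x
fy = 3*x*y         # the proposed f_y

# If such an f existed (with continuous second partials), the mixed
# partials would have to agree: d(fx)/dy == d(fy)/dx.
print(sp.diff(fx, y), sp.diff(fy, x))  # 3*y**2 vs 3*y: no such f exists
```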

Another instance: given the ideal gas law $PV = nRT$, I challenged them to show $\frac{\partial P}{\partial V}\frac{\partial V}{\partial T}\frac{\partial T}{\partial P} = -1$, and someone got it!
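For the record, that cyclic-product identity can be verified symbolically as well. A sympy sketch (assuming sympy is available):

```python
import sympy as sp

P, V, T, n, R = sp.symbols('P V T n R', positive=True)

# Solve PV = nRT for each variable in turn and take the cyclic partials.
dP_dV = sp.diff(n*R*T/V, V)    # P as a function of V (T held fixed)
dV_dT = sp.diff(n*R*T/P, T)    # V as a function of T (P held fixed)
dT_dP = sp.diff(P*V/(n*R), P)  # T as a function of P (V held fixed)

# The raw product is -nRT/(PV); substituting the gas law P = nRT/V
# collapses it to -1.
product = sp.simplify(dP_dV * dV_dT * dT_dP)
print(sp.simplify(product.subs(P, n*R*T/V)))  # -1
```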

Bottom line: in one course, we have some bright, interested students who enjoy thinking and we have some who either don’t or can’t. This makes teaching difficult; if one tries to “teach to the mean” one is teaching to the empty set. It is almost: either bore half the class, or blow away half the class.