College Math Teaching

May 11, 2015

The hypervolume of the n-ball enclosed by the standard (n-1)-sphere

I am always looking for interesting calculus problems to demonstrate various concepts and perhaps generate some interest in pure mathematics.
And yes, I like to “blow off some steam” by spending some time having some non-technical mathematical fun with elementary mathematics.

This post uses only:

1. Integration by parts and basic reduction formulas.
2. Trig substitution.
3. Calculation of volumes (and hyper volumes) by the method of cross sections.
4. Induction
5. Elementary arithmetic involving factorials.

The quest: find a formula for the (hyper)volume of the region \{(x_1, x_2, x_3, ..., x_k) | \sum_{i=1}^k x_i^2 \leq R^2 \} \subset R^k

We will assume that the usual tools of calculus work as advertised.

Start. If we denote the (hyper)volume of the k-ball by V_k , we will start with the assumption that V_1 = 2R ; that is, the distance between the endpoints of [-R,R] is 2R.

Step 1: we show, via induction, that V_k =c_kR^k where c_k is a constant and R is the radius.

Our proof will be inefficient; this is deliberate, for instructional purposes.

We know that V_1 = 2R , hence the induction hypothesis holds for the first case and c_1 = 2 . We now work the second case explicitly because, for the beginner, the technique will be easier to follow further along if we first do the k = 2 case.

Yes, I know that you know that V_2 = \pi R^2 and you’ve seen many demonstrations of this fact. Here is another: let’s calculate this using the method of “area by cross sections”. Here is x^2 + y^2 = R^2 with some y = c cross sections drawn in.

[figure: the circle x^2 + y^2 = R^2 with horizontal cross sections drawn in]

Now do the calculation by integrals: we will use symmetry, do the upper half only, and multiply our result by 2. At each level y = c , call the radius from the center line to the circle R(y) , so the total length of the “y is constant” cross section is 2R(y) ; we “multiply by the thickness” dy to obtain V_2 = 4 \int^{y=R}_{y=0} R(y) dy .

But remember that the curve in question is x^2 + y^2 = R^2 and so if we set x = R(y) we have R(y) = \sqrt{R^2 -y^2} and so our integral is 4 \int^{y=R}_{y=0}\sqrt{R^2 -y^2}  dy

Now this integral is no big deal. But HOW we solve it will help us down the road. So here, we use the change of variable (aka “trigonometric substitution”): y = Rsin(t), dy = Rcos(t) dt to change the integral to:

4 \int^{\frac{\pi}{2}}_0 R^2 cos^2(t) dt = 4R^2 \int^{\frac{\pi}{2}}_0  cos^2(t) dt therefore

V_2 = c_2 R^2 where:

c_2 = 4\int^{\frac{\pi}{2}}_0  cos^2(t) dt

Yes, I know that this is an easy integral to solve, but I first presented the result this way in order to make a point.

Of course, c_2 = 4\int^{\frac{\pi}{2}}_0  cos^2(t) dt = 4\int^{\frac{\pi}{2}}_0 (\frac{1}{2} + \frac{1}{2}cos(2t)) dt = \pi

Therefore, V_2 =\pi R^2 as expected.

Exercise for those seeing this for the first time: compute c_3 and V_3 by using the above methods.

Inductive step: Assume V_k = c_kR^k . Now calculate using the method of cross sections above (and here we move away from x-y coordinates to more general labeling); each cross section at height x_{k+1} is a k-ball of radius R(x_{k+1}) , so:

V_{k+1} = 2\int^R_0 c_k (R(x_{k+1}))^k dx_{k+1} = 2 c_k \int^R_0 (R(x_{k+1}))^k dx_{k+1}

Now we do the substitutions: first of all, we note that x_1^2 + x_2^2 + ... + x_{k}^2 + x_{k+1}^2 = R^2 and so

x_1^2 + x_2^2 + ... + x_k^2 = R^2 - x_{k+1}^2 . Now for the key observation: x_1^2 + x_2^2 + ... + x_k^2 = (R(x_{k+1}))^2 and so R(x_{k+1}) = \sqrt{R^2 - x_{k+1}^2}

Now substitute for R(x_{k+1}) to obtain:

V_{k+1} = c_k 2\int^R_0 (R^2 - x_{k+1}^2)^{\frac{k}{2}} dx_{k+1}

Now do the substitution x_{k+1} = Rsin(t), dx_{k+1} = Rcos(t)dt and the integral is now:

V_{k+1} = c_k 2\int^{\frac{\pi}{2}}_0 R^{k+1} cos^{k+1}(t) dt = c_k(2 \int^{\frac{\pi}{2}}_0 cos^{k+1}(t) dt)R^{k+1} which is what we needed to show.

In fact, we have shown a bit more. We’ve shown that c_1 = 2 =2 \int^{\frac{\pi}{2}}_0(cos(t))dt, c_2 = 2 \cdot 2\int^{\frac{\pi}{2}}_0 cos^2(t) dt = c_1 2\int^{\frac{\pi}{2}}_0 cos^2(t) dt and, in general,

c_{k+1} = c_{k}(2 \int^{\frac{\pi}{2}}_0 cos^{k+1}(t) dt) = 2^{k+1} \int^{\frac{\pi}{2}}_0(cos^{k+1}(t))dt \int^{\frac{\pi}{2}}_0(cos^{k}(t))dt \int^{\frac{\pi}{2}}_0(cos^{k-1}(t))dt .....\int^{\frac{\pi}{2}}_0(cos(t))dt
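This product formula is easy to let the machine spot check; here is a quick Python sketch (mine, not from the post; it assumes scipy is available):

from math import cos, pi
from scipy.integrate import quad

def c(k):
    # build c_k from c_1 = 2 via the recursion c_j = c_{j-1} * 2 * Int_0^{pi/2} cos^j(t) dt
    val = 2.0  # c_1
    for j in range(2, k + 1):
        integral, _ = quad(lambda t, j=j: cos(t) ** j, 0.0, pi / 2)
        val *= 2.0 * integral
    return val

print(c(2), pi)          # both approximately 3.14159
print(c(3), 4 * pi / 3)  # both approximately 4.18879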

Finishing the formula

We now need to calculate these easy calculus integrals: in this case the reduction formula:

\int cos^n(x) dx = \frac{1}{n}cos^{n-1}(x)sin(x) + \frac{n-1}{n} \int cos^{n-2}(x) dx is useful (it is merely integration by parts). Now use the limits and elementary calculation to obtain:

\int^{\frac{\pi}{2}}_0 cos^n(x) dx = \frac{n-1}{n} \int^{\frac{\pi}{2}}_0 cos^{n-2}(x)dx to obtain:

\int^{\frac{\pi}{2}}_0 cos^n(x) dx = (\frac{n-1}{n})(\frac{n-3}{n-2})......(\frac{3}{4})\frac{\pi}{4} if n is even and:
\int^{\frac{\pi}{2}}_0 cos^n(x) dx = (\frac{n-1}{n})(\frac{n-3}{n-2})......(\frac{4}{5})\frac{2}{3} if n is odd.

Now to come up with something resembling a closed formula let’s experiment and do some calculation:

Note that c_1 = 2, c_2 = \pi, c_3 = \frac{4 \pi}{3}, c_4 = \frac{\pi^2}{2}, c_5 = \frac{2^3 \pi^2}{3 \cdot 5} = \frac{8 \pi^2}{15}, c_6 = \frac{\pi^3}{3 \cdot 2} = \frac{\pi^3}{6} .

So we can make the inductive conjecture that c_{2k} = \frac{\pi^k}{k!} and see how it holds up: c_{2k+2} = 2^2 \int^{\frac{\pi}{2}}_0(cos^{2k+2}(t))dt \int^{\frac{\pi}{2}}_0(cos^{2k+1}(t))dt \frac{\pi^k}{k!}

= 2^2 ((\frac{2k+1}{2k+2})(\frac{2k-1}{2k})......(\frac{3}{4})\frac{\pi}{4})((\frac{2k}{2k+1})(\frac{2k-2}{2k-1})......\frac{2}{3})\frac{\pi^k}{k!}

Now notice the telescoping effect of the fractions from the 2\int^{\frac{\pi}{2}}_0 cos^{2k+1}(t) dt factor. All factors cancel except for the (2k+2) in the first denominator and the 2 in the last numerator, as well as the \frac{\pi}{4} factor. This leads to:

c_{2k+2} = 2^2(\frac{\pi}{4})\frac{2}{2k+2} \frac{\pi^k}{k!} = \frac{\pi^{k+1}}{(k+1)!} as required.

Now we need to calculate c_{2k+1} = 2\int^{\frac{\pi}{2}}_0(cos^{2k+1}(t))dt c_{2k} = 2\int^{\frac{\pi}{2}}_0(cos^{2k+1}(t))dt \frac{\pi^k}{k!}

= 2 (\frac{2k}{2k+1})(\frac{2k-2}{2k-1})......(\frac{4}{5})(\frac{2}{3})\frac{\pi^k}{k!} = 2 \frac{(2k)(2k-2)(2k-4)...(4)(2)}{(2k+1)(2k-1)...(5)(3)} \frac{\pi^k}{k!}

To simplify this further: split up the k! in the denominator and put one of its factors between each pair of odd factors:

= 2 \frac{(2k)(2k-2)(2k-4)...(4)(2)}{(2k+1)(k)(2k-1)(k-1)...(5)(2)(3)(1)} \pi^k . Now multiply the numerator and the denominator by 2^k , pairing one factor of 2 with each factor k - m in the denominator, to obtain:

2 \cdot 2^k \frac{(2k)(2k-2)(2k-4)...(4)(2)}{(2k+1)(2k)(2k-1)(2k-2)...(4)(3)(2)} \pi^k . Now gather the factor of 2 from each of the terms 2k, 2k-2, ... in the numerator:

= 2 \cdot 2^k \cdot 2^k \pi^k \frac{k!}{(2k+1)!} = 2 \frac{(4 \pi)^k k!}{(2k+1)!} which is the required formula.

So to summarize:

V_{2k} = \frac{\pi^k}{k!} R^{2k}

V_{2k+1}= \frac{2 k! (4 \pi)^k}{(2k+1)!}R^{2k+1}
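Not in the original post, but here is a short Python sketch of mine that encodes the two closed formulas; comparing its output with the values c_1 through c_6 computed above is a decent error check:

from math import pi, factorial

def V(n, R=1.0):
    # hypervolume of the n-ball of radius R, from the closed formulas above
    k = n // 2
    if n % 2 == 0:
        return (pi ** k / factorial(k)) * R ** n
    return (2 * factorial(k) * (4 * pi) ** k / factorial(2 * k + 1)) * R ** n

print([round(V(n), 5) for n in range(1, 7)])
# [2.0, 3.14159, 4.18879, 4.9348, 5.26379, 5.16771] for radius 1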

Note the following: lim_{k \rightarrow \infty} c_{k} = 0 . If this seems strange at first, think of it this way: imagine the n-ball “inscribed” in an n-cube, which has hypervolume (2R)^n . Then consider the ratio \frac{2^n R^n}{c_n R^n} = \frac{2^n}{c_n} , which grows without bound; that is, the n-ball holds a smaller and smaller percentage of the hypervolume of the n-cube that it is inscribed in. Note that 2^n is also the number of corners of the n-cube; one might say that the rounding off of the corners gets more severe as the number of dimensions increases.

One also notes that for fixed radius R, lim_{n \rightarrow \infty} V_n = 0 as well.

There are other interesting aspects to this limit: for what dimension n does the maximum hypervolume occur? As you might expect: this depends on the radius involved; a quick glance at the hyper volume formulas will show why. For more on this topic, including an interesting discussion on this limit itself, see Dave Richardson’s blog Division by Zero. Note: his approach to finding the hyper volume formula is also elementary but uses polar coordinate integration as opposed to the method of cross sections.
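For the curious, here is a small sketch (mine, with the same closed formulas repeated so it stands alone) that locates the maximizing dimension for a few radii:

from math import pi, factorial

def V(n, R):
    # same formulas as in the summary above
    k = n // 2
    if n % 2 == 0:
        return (pi ** k / factorial(k)) * R ** n
    return (2 * factorial(k) * (4 * pi) ** k / factorial(2 * k + 1)) * R ** n

for R in (1.0, 1.5, 2.0):
    n_best = max(range(1, 101), key=lambda n: V(n, R))
    print(R, n_best, round(V(n_best, R), 3))
# the unit ball's volume peaks in dimension 5; a larger radius pushes the peak higher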

October 29, 2014

Hyperbolic Trig Functions and integration…

In college calculus courses, I’ve always wrestled with “how much to cover in the hyperbolic trig functions” section.

On one hand, the hyperbolic trig functions make some integrals much easier. On the other hand: well, it isn’t as if our classes are populated with the highest caliber student (I don’t teach at MIT); many struggle with the standard trig functions. There is only so much that the average young mind can absorb.

In case your memory is rusty:

cosh(x) =\frac{e^x + e^{-x}}{2}, sinh(x) = \frac{e^x -e^{-x}}{2} , and then it is immediate that the analogues of the standard “half/double angle” formulas hold; we do remember that \frac{d}{dx}cosh(x) = sinh(x), \frac{d}{dx}sinh(x) = cosh(x).

What is less immediate is the following: sinh^{-1}(x)  = ln(x+\sqrt{x^2+1}), cosh^{-1}(x) = ln(x + \sqrt{x^2 -1}) (x \ge 1).

Exercise: prove these formulas. Hint: if sinh(y) = x then e^{y} - 2x - e^{-y} = 0 , so multiply both sides by e^{y} to obtain e^{2y} - 2x e^y - 1 = 0 ; now use the quadratic formula to solve for e^y and keep in mind that e^y is positive.

For the other formula: same procedure, and remember that we are using the x \ge 0 branch of cosh(x) and that cosh(x) \ge 1

The following follows easily: \frac{d}{dx} sinh^{-1} (x) = \frac{1}{\sqrt{x^2 + 1}} (just set up sinh(y) = x and use implicit differentiation followed by noting cosh^2(x) -sinh^2(x) = 1 . ) and \frac{d}{dx} cosh^{-1}(x) = \frac{1}{\sqrt{x^2-1}} (similar derivation).

Now, we are off and running.

Example: \int \sqrt{x^2 + 1} dx =

We can make the substitution x =sinh(u), dx = cosh(u) du and obtain \int cosh^2(u) du = \int \frac{1}{2} (cosh(2u) + 1)du = \frac{1}{4}sinh(2u) + \frac{1}{2} u + C . Now use sinh(2u) = 2 sinh(u)cosh(u) and we obtain:

\frac{1}{2}sinh(u)cosh(u) + \frac{u}{2} + C . The back substitution isn’t that hard if we recognize cosh(u) = \sqrt{sinh^2(u) + 1} so we have \frac{1}{2} sinh(u) \sqrt{sinh^2(u) + 1} + \frac{u}{2} + C . Back substitution is now easy:

\frac{1}{2} x \sqrt{x^2+1} + \frac{1}{2} ln(x + \sqrt{x^2 + 1}) + C . No integration by parts is required and the dreaded \int sec^3(x) dx integral is avoided. Ok, I was a bit loose about the domains here; we can make this valid for negative values of x by using an absolute value with the ln(x + \sqrt{x^2 + 1}) term.
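If the back substitution makes you (or your students) nervous, a numerical check is quick; this Python sketch (mine) compares a central difference derivative of the answer with the integrand:

from math import sqrt, log

def F(x):
    # the antiderivative found above, with C = 0
    return 0.5 * x * sqrt(x * x + 1) + 0.5 * log(x + sqrt(x * x + 1))

h = 1e-6
for x in (0.5, 1.0, 2.0):
    numeric = (F(x + h) - F(x - h)) / (2 * h)  # approximates F'(x)
    print(numeric, sqrt(x * x + 1))            # the two columns should agree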

August 31, 2014

The convolution integral: do some examples in Calculus III or not?

For us, calculus III is the most rushed of the courses, especially if we start with polar coordinates. Getting to the “three integral theorems” is a real chore. (OK, Green’s theorem, the Divergence theorem and Stokes’ theorem are really just \int_{\Omega} d \sigma = \int_{\partial \Omega} \sigma , but that is the subject of another post.)

But watching this lecture made me wonder: should I say a few words about how to calculate a convolution integral?

Note: I’ve discussed a type of convolution integral with regards to solving differential equations here.

In the context of Fourier Transforms, the convolution integral is defined as it was in analysis class: f*g = \int^{\infty}_{-\infty} f(x-t)g(t) dt . Typically, we insist that the functions be, say, L^1 and note that it is a bit of a chore to show that the convolution of two L^1 functions is L^1 ; one proves this via the Fubini-Tonelli Theorem.

(The straight-out product of two L^1 functions need not be L^1 ; e.g., consider f(x) = \frac{1}{\sqrt{x}} for x \in (0,1] and zero elsewhere, and take f \cdot f = \frac{1}{x} .)

So, assuming that the integral exists, how do we calculate it? Easy, you say? Well, it can be, after practice.

But to test out your skills, let f(x) = g(x) be the function that is 1 for x \in [\frac{-1}{2}, \frac{1}{2}] and zero elsewhere. So, what is f*g ???

So, it is easy to see that f(x-t)g(t) only assumes the value of 1 on a specific region of the (x,t) plane and is zero elsewhere; this is just like doing an iterated integral of a two variable function; at least the first step. This is why it fits well into calculus III.

f(x-t)g(t) = 1 for the following region: (x,t), -\frac{1}{2} \le x-t \le \frac{1}{2}, -\frac{1}{2} \le t \le \frac{1}{2}

This region is the parallelogram with vertices at (-1, -\frac{1}{2}), (0, -\frac{1}{2}), (0, \frac{1}{2}), (1, \frac{1}{2}) .

[figure: the parallelogram region in the (x,t) plane]

Now we see that we can’t do the integral in one step. So, the function we are integrating f(x-t)f(t) has the following description:

f(x-t)f(t)=\left\{\begin{array}{c} 1, x \in [-1,0], -\frac{1}{2} \le t \le \frac{1}{2}+x \\ 1, x\in [0,1], -\frac{1}{2}+x \le t \le \frac{1}{2} \\ 0 \text{ elsewhere} \end{array}\right.

So the convolution integral is \int^{\frac{1}{2} + x}_{-\frac{1}{2}} dt = 1+x for x \in [-1,0) and \int^{\frac{1}{2}}_{-\frac{1}{2} + x} dt = 1-x for x \in [0,1] .

That is, of course, the tent map that we described here. The graph is shown here:

[figure: graph of the tent map]

So, it would appear to me that a good time to do a convolution exercise is right when we study iterated integrals; just tell the students that this is a case where one “stops before doing the outside integral”.
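And if one wants the machine to confirm the tent: here is a short Python sketch (mine; it assumes numpy) that approximates f*f by a Riemann sum:

import numpy as np

dt = 0.001
t = np.arange(-2.0, 2.0, dt)
f = np.where(np.abs(t) <= 0.5, 1.0, 0.0)  # the indicator of [-1/2, 1/2]

conv = np.convolve(f, f) * dt             # Riemann sum approximation of (f*f)(x)
x = -4.0 + dt * np.arange(len(conv))      # support of the discrete convolution

for xv in (-0.5, 0.0, 0.5):
    i = int(np.argmin(np.abs(x - xv)))
    print(xv, round(float(conv[i]), 3))   # about 0.5, 1.0, 0.5: the tent map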

August 25, 2014

Fourier Transform of the “almost Gaussian” function with a residue integral

This is based on the lectures on the Fourier Transform by Brad Osgood from Stanford:

And here, F(f)(s) = \int^{\infty}_{-\infty} e^{-2 \pi i st} f(t) dt provided the integral converges.

The “almost Gaussian” integrand is f(t) = e^{-\pi t^2} ; one can check that \int^{\infty}_{-\infty} e^{-\pi t^2} dt = 1 . One way is to use the fact that \int^{\infty}_{-\infty} e^{-x^2} dx = \sqrt{\pi} and do the substitution x = \sqrt{\pi} t; of course one should be able to demonstrate the fact to begin with. (side note: a non-standard way involving symmetries and volumes of revolution discovered by Alberto Delgado can be found here)

So, during this lecture, Osgood shows that F(e^{-\pi t^2}) = e^{-\pi s^2} ; that is, this modified Gaussian function is “its own Fourier transform”.

I’ll sketch out what he did in the lecture at the end of this post. But just for fun (and to make a point) I’ll give a method that uses an elementary residue integral.

Both methods start by using the definition: F(s) = \int^{\infty}_{-\infty} e^{-2 \pi i ts} e^{-\pi t^2} dt

Method 1: combine the exponential functions in the integrand:

\int^{\infty}_{-\infty} e^{-\pi(t^2 +2  i ts)}  dt . Now complete the square to get: \int^{\infty}_{-\infty} e^{-\pi(t^2 +2  i ts-s^2)-\pi s^2}  dt

Now factor out the factor involving s alone and write as a square: e^{-\pi s^2}\int^{\infty}_{-\infty} e^{-\pi(t+is)^2}  dt

Now, make the substitution x = t+is, dx = dt to obtain:

e^{-\pi s^2}\int^{\infty+is}_{-\infty+is} e^{-\pi x^2}  dx

Now we show that the above integral is really equal to e^{-\pi s^2}\int^{\infty}_{-\infty} e^{-\pi x^2}  dx = e^{-\pi s^2} (1) = e^{-\pi s^2}

To show this, we evaluate \int_{\gamma} e^{-\pi z^2} dz along the rectangular path \gamma with vertices -x, x, x+is, -x+is and let x \rightarrow \infty .

[figure: the rectangular contour]
Now the integral around the contour is 0 because e^{-\pi z^2} is entire (analytic everywhere).

We wish to calculate the negative of the integral along the top boundary of the contour (the top is traversed right to left). Integrating along the bottom gives 1 in the limit.
As for the sides: fix s ; on the vertical segments z = x + iy with 0 \le y \le s we have |e^{-\pi z^2}| = e^{-\pi(x^2 - y^2)} \le e^{-\pi(x^2 - s^2)} , and this goes to zero as x \rightarrow \infty . So the integrals along the vertical paths approach zero; therefore the integrals along the top and bottom contours agree in the limit and the result follows.

Method 2: The method in the video
This uses “differentiation under the integral sign”, which we talk about here.

Start with F(s) = \int^{\infty}_{-\infty} e^{-2 \pi i ts} e^{-\pi t^2} dt and note \frac{dF}{ds} = \int^{\infty}_{-\infty} (-2 \pi i t) e^{-2 \pi i ts} e^{-\pi t^2} dt

Now we do integration by parts: u = e^{-2 \pi i ts}, dv = (-2 \pi i t)e^{-\pi t^2} \rightarrow v = i e^{-\pi t^2}, du = (-2 \pi i s)e^{-2 \pi i ts} and the integral becomes:

i e^{-\pi t^2} e^{-2 \pi i ts}|^{\infty}_{-\infty} - (i)(-2 \pi i s) \int^{\infty}_{-\infty} e^{-2 \pi i ts} e^{-\pi t^2} dt

Now the first term is zero for all values of s as t \rightarrow \pm \infty . The second term is merely:

-(2 \pi s) \int^{\infty}_{-\infty} e^{-2 \pi i ts} e^{-\pi t^2} dt = -(2 \pi s) F(s) .

So we have shown that \frac{d F}{ds} = (-2 \pi s)F which is a differential equation in s which has solution F = F_0 e^{- \pi s^2} (a simple separation of variables calculation will verify this). Now to solve for the constant F_0 note that F(0) = \int^{\infty}_{-\infty} e^{0} e^{-\pi t^2} dt = 1 .

The result follows.
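Either way, the conclusion is easy to spot check numerically; a sketch (mine, using scipy): since e^{-\pi t^2} is even, the sine part of the transform integrates to zero and only the cosine part survives.

from math import cos, exp, pi
from scipy.integrate import quad

def F(s):
    # real part of the Fourier transform; the imaginary part vanishes by oddness
    val, _ = quad(lambda t: cos(2 * pi * s * t) * exp(-pi * t * t), -10.0, 10.0)
    return val

for s in (0.0, 0.5, 1.0):
    print(F(s), exp(-pi * s * s))  # the two columns should agree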

Now: which method was easier? The second required differential equations and differentiating under the integral sign; the first required an easy residue integral.

By the way: the video comes from an engineering class. Engineers need to know this stuff!

August 21, 2014

Calculation of the Fourier Transform of a tent map, with a calculus tip….

I’ve been following these excellent lectures by Professor Brad Osgood of Stanford. As an aside: yes, he is dynamite in the classroom, but there is probably a reason that Stanford is featuring him. 🙂

And yes, his style is good for obtaining a feeling of camaraderie that is absent in my classroom; at least in the lower division “service” classes.

This lecture takes us from Fourier Series to Fourier Transforms. Of course, he admits that the transition here is really a heuristic trick with symbolism; it isn’t a bad way to initiate an intuitive feel for the subject though.

However, the point of this post is to offer an “algebra of calculus” trick for dealing with the sort of calculations that one might encounter.

By the way, if you say “hey, just use a calculator” you will be BANNED from this blog!!!! (just kidding…sort of. 🙂 )

So here is the deal: let f(x) represent the tent map: the support of f is [-1,1] and it has the following graph:

[figure: graph of the tent map]

The formula is: f(x)=\left\{\begin{array}{c} x+1,x \in [-1,0) \\ 1-x ,x\in [0,1] \\ 0 \text{ elsewhere} \end{array}\right.

So, the Fourier Transform is F(f) = \int^{\infty}_{-\infty} e^{-2 \pi i st}f(t)dt = \int^0_{-1} e^{-2 \pi i st}(1+t)dt + \int^1_0e^{-2 \pi i st}(1-t)dt

Now, this is an easy integral to do, conceptually, but there is the issue of carrying constants around and being tempted to make “on the fly” simplifications along the way, thereby leading to irritating algebraic errors.

So my tip: just let a = -2 \pi i s and do the integrals:

\int^0_{-1} e^{at}(1+t)dt + \int^1_0e^{at}(1-t)dt and substitute and simplify later:

Now the integrals become: \int^{1}_{-1} e^{at}dt + \int^0_{-1}te^{at}dt - \int^1_0 te^{at} dt.
These are easy to do; the first is merely \frac{1}{a}(e^a - e^{-a}) and the next two have the same anti-derivative, which can be obtained by an “integration by parts” calculation: \frac{t}{a}e^{at} -\frac{1}{a^2}e^{at} ; evaluating the limits yields:

-\frac{1}{a^2}-(\frac{-1}{a}e^{-a} -\frac{1}{a^2}e^{-a}) - (\frac{1}{a}e^{a} -\frac{1}{a^2}e^a)+ (-\frac{1}{a^2})

Add the first integral and simplify and we get: -\frac{1}{a^2}(2 - (e^{a} + e^{-a})) . NOW use a = -2\pi i s (so a^2 = -4 \pi^2 s^2 ) and we have the integral is \frac{1}{4 \pi^2 s^2}(2 - (e^{2 \pi i s} + e^{-2 \pi i s})) = \frac{1}{4 \pi^2 s^2}(2 - 2cos(2 \pi s)) by Euler’s formula.

Now we need some trig to get this into a form that is “engineering/scientist” friendly; here we turn to the formula sin^2(x) = \frac{1}{2}(1-cos(2x)) , so 2 - 2cos(2 \pi s) = 4sin^2(\pi s) and our answer is \frac{sin^2( \pi s)}{(\pi s)^2} = (\frac{sin(\pi s)}{\pi s})^2 , which is often denoted (sinc(s))^2 , as the “normalized” sinc function is given by sinc(x) = \frac{sin(\pi x)}{\pi x} (we want the function to have zeros at the nonzero integers and to “equal” one at x = 0 ; remember that famous limit!).

So, the point is that using a made the algebra a whole lot easier.
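As a check on all that algebra, here is a Python sketch (mine, using scipy) that computes the transform numerically and compares it with (\frac{sin(\pi s)}{\pi s})^2 :

from math import cos, pi, sin
from scipy.integrate import quad

def tent(t):
    return 1.0 - abs(t) if abs(t) <= 1.0 else 0.0

def F(s):
    # the tent map is even, so the transform reduces to a cosine integral
    val, _ = quad(lambda t: cos(2 * pi * s * t) * tent(t), -1.0, 1.0)
    return val

for s in (0.25, 0.5, 1.5):
    print(F(s), (sin(pi * s) / (pi * s)) ** 2)  # the columns should agree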

Now, if you are shaking your head and muttering about how this calculation was crude and that one usually uses “convolution” instead: this post is probably too elementary for you. 🙂

August 6, 2014

Where “j” comes from

I laughed at what was said from 30:30 to 31:05 or so:

If you are wondering why your engineering students want to use j = \sqrt{-1} , it is because, in electrical engineering, i usually stands for “current”.

Though many of you know this, this lesson also gives an excellent reason to use the complex form of the Fourier series; e.g., if f is piecewise smooth and has period 1, write f(x) = \Sigma^{k = \infty}_{k=-\infty}c_k e^{i 2k\pi x} (usual abuse of the equals sign) rather than writing it out in sines and cosines. Of course, \overline{c_{-k}} = c_k if f is real valued.

How is this easier? Well, when you give a demonstration as to what the coefficients have to be (assuming that the series exists to begin with), the orthogonality condition is very easy to deal with. Calculate: \int^1_0 e^{i 2k\pi x}e^{-i 2m\pi x} dx = \int^1_0 e^{i 2(k-m)\pi x} dx = 0 for k \ne m . There is nothing to it; easy integral. Of course, one has to demonstrate the validity of e^{ix} = cos(x) + isin(x) and show that the usual differentiation rules work ahead of time, but you need to do that only once.
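A quick numerical illustration of that orthogonality (a sketch of mine; the helper name inner is not standard):

from cmath import exp, pi
from scipy.integrate import quad

def inner(k, m):
    # <e^{i 2 pi k x}, e^{i 2 pi m x}> on [0,1]; conjugating the second factor flips the sign of m
    g = lambda x: exp(2j * pi * (k - m) * x)
    re, _ = quad(lambda x: g(x).real, 0.0, 1.0)
    im, _ = quad(lambda x: g(x).imag, 0.0, 1.0)
    return complex(re, im)

print(inner(3, 3))  # (1+0j): same frequency
print(inner(3, 5))  # essentially 0: distinct frequencies are orthogonal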

July 31, 2014

Stupid question: why does it appear to us that differentiation is easier than anti-differentiation?

Filed under: calculus, elliptic curves, integrals — Tags: , — collegemathteaching @ 8:05 pm

This post is inspired by my rereading a favorite book of mine: Underwood Dudley’s Mathematical Cranks

[figure: cover of Mathematical Cranks]

There was the chapter about the circumference of an ellipse. Now, given \frac{x^2}{a^2} + \frac{y^2}{b^2} = 1 it isn’t hard to see that ds^2 = dx^2 + dy^2 and so, going with the portion in the first quadrant, one can derive that the circumference is given by an elliptic integral of the second kind, which is one of those integrals that can NOT be solved in “closed form” by anti-differentiation of elementary functions.

There are lots of integrals like this; e. g. \int e^{x^2} dx is a very famous example. Here is a good, accessible paper on the subject of non-elementary integrals (by Marchisotto and Zakeri).

So this gets me thinking: why is anti-differentiation so much harder than taking the derivative? Is this because of the functions that we’ve chosen to represent the “elementary anti-derivatives”?

I know; this is not a well formulated question; but it has always bugged me. Oh yes, I am teaching two sections of first semester calculus this upcoming semester.

April 7, 2014

Numerical integration: why the brain is still required…

Filed under: class room experiment, integrals, numerical methods, pedagogy — Tags: — collegemathteaching @ 4:59 pm

I gave the following demonstration in class today: \int^1_0 sin^2(512 \pi x) dx =

Now, of course, even a C student in calculus II would be able to solve this exactly using sin^2(u) = \frac{1}{2} - \frac{1}{2}cos(2u) to obtain: \int^1_0 sin^2(512 \pi x) dx=\frac{1}{2}

But what about the “just bully” numerical methods we’ve learned?

Romberg integration fails miserably, at least at first:

[figure: Romberg integration table; every entry is 0]

(for those who don’t know about Romberg integration: the first column gives trapezoid rule approximations, the second gives Simpson’s rule approximations and the third gives Boole’s rule; the value of \Delta x gets cut in half as the rows go down).

I said “at first” because if one goes to, say, 20 rows, one can start to get near the correct answer.

Adaptive quadrature is an even bigger fail:

[figure: adaptive quadrature output, also 0]

The problem here is that this routine quits when the refined Simpson’s rule approximation agrees with the less refined approximation (to within a certain tolerance), and here, the approximations are both zero, hence there is perfect agreement, very early in the process.
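The failure is easy to reproduce without special software; here is a Python sketch (mine) of the composite trapezoid rule (the first column of the Romberg table) that shows the sampling problem directly:

from math import sin, pi

def trapezoid(f, a, b, n):
    h = (b - a) / n
    return h * (0.5 * f(a) + 0.5 * f(b) + sum(f(a + i * h) for i in range(1, n)))

f = lambda x: sin(512 * pi * x) ** 2
for n in (8, 64, 512, 2048):
    print(n, trapezoid(f, 0.0, 1.0, n))
# n = 8, 64, 512 return (essentially) 0: every node lands on a zero of the integrand.
# Only a grid finer than the oscillation (here n = 2048) returns the correct 0.5.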

So, what to do?

One should note, of course, that the integrand is positive except for a finite number of points where it is zero. Hence one knows right away that the results are bogus.

One quick way to get closer: just tweak the limits of integration by a tiny amount and calculate, say, \int^{.999}_{.001} sin^2(512 \pi x) dx and do some mathematics!

[figure: the numerical result after tweaking the limits]

The point: the integration routines cannot replace thinking.

April 4, 2014

Gaussian Quadrature and Legendre polynomials

In a previous post, I discussed the Legendre polynomials. We now know that if P_n is the n’th Legendre polynomial then:

1. If k < n, \int^{1}_{-1}P_k(x)P_n(x)dx = 0 .
2. If Q(x) has degree k < n then \int^{1}_{-1}Q(x)P_n(x)dx = 0 .
3. P_n(x) has n simple roots in (-1,1) which are symmetric about 0.
4. P_n(x) is an odd function if n is odd and an even function if n is even.
5. P_0, P_1, P_2,....P_n form an orthogonal basis for the vector space of polynomials of degree n or less.

To make these notes complete, I’d like to review the Lagrange Interpolating polynomial:

Given a set of points (x_0, y_0), (x_1, y_1), .....(x_n, y_n) where the x_0, x_1, ...x_n are all distinct, the Lagrange polynomial through these points (called “nodes”) is defined to be:
L_n(x) = y_0 \frac{(x-x_1)(x-x_2)....(x-x_n)}{(x_0 -x_1)(x_0-x_2)...(x_0-x_n)} +y_1 \frac{(x-x_0)(x-x_2)...(x-x_n)}{(x_1-x_0)(x_1-x_2)....(x_1-x_n)} + ....+ y_n\frac{(x-x_0)(x-x_1)...(x-x_{n-1})}{(x_n-x_0)(x_n-x_1)....(x_n-x_{n-1})}

It is easy to see that L_n(x_i) = y_i because the coefficient polynomial associated with the y_i term is 1 at x_i and the other coefficient polynomials are zero there. We call the coefficient polynomials L_{n,i}(x) = \Pi^n_{j=0, j \ne i}\frac{(x-x_j)}{(x_i-x_j)}

It can be shown by differentiation that if f is a function that has n+1 continuous derivatives on some open interval containing [a,b] and if L_n(x) is the Lagrange polynomial running through the nodes (x_i, f(x_i)) where all of the x_i \in [a,b] , then f(x) - L_n(x) = \frac{f^{(n+1)}(\omega)}{(n+1)!}\Pi^n_{i=0}(x-x_i) for some \omega \in [a,b] . So if the n+1’st derivative of f is identically zero, then the interpolation is exact.
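In code, the interpolating polynomial is only a few lines; here is a Python sketch (mine) of the coefficient polynomials L_{n,i} and the interpolant:

def basis(xs, i, x):
    # L_{n,i}(x) = product over j != i of (x - x_j)/(x_i - x_j)
    out = 1.0
    for j, xj in enumerate(xs):
        if j != i:
            out *= (x - xj) / (xs[i] - xj)
    return out

def lagrange(xs, ys, x):
    return sum(y * basis(xs, i, x) for i, y in enumerate(ys))

# three nodes reproduce a quadratic exactly: its third derivative is zero
xs, ys = [-1.0, 0.0, 1.0], [1.0, 0.0, 1.0]  # samples of y = x^2
print(lagrange(xs, ys, 0.5))                # 0.25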

So, we introduce some notation: let P_m(x) be the m’th Legendre polynomial, let r_1, r_2,...r_m be its roots, and let c_i = \int^1_{-1} L_{m,i}(x) dx , where L_{m,i}(x) = \Pi^m_{j=1, j \ne i}\frac{(x-r_j)}{(r_i-r_j)} is the i’th coefficient polynomial for the Lagrange polynomial through the nodes r_1, ..., r_m .

Now let P(x) be any polynomial of degree less than m. First: the Lagrange polynomial L_m through the nodes (r_1, P(r_1)), (r_2, P(r_2)), ...(r_m, P(r_m)) represents P(x) exactly, as the error term is a multiple of the m’th derivative of P(x) , which is zero.

Therefore \int^{1}_{-1} P(x) dx = \int^{1}_{-1} L_m (x)dx = \int^{1}_{-1} P(r_1 )L_{m,1}(x) +P(r_2 )L_{m,2}(x) + ....+ P(r_m )L_{m,m}(x) dx = \sum^{m}_{i=1} P(r_i )\int^1_{-1} L_{m,i}(x) dx

Now here is the key: \int^1_{-1} L_{m,i}(x) dx only depends on the Legendre polynomial being used and NOT on P(x) .

Hence we have a quadrature formula that is exact for all polynomials of degree less than m .

But, thanks to the division algorithm, we can do even better. We show that this quadrature scheme is exact for all polynomials P whose degree is less than 2m . This is where the orthogonality of the Legendre polynomials will be used.

Let P be a polynomial of degree less than 2m . Then by the division algorithm, P(x) = Q(x)P_m(x) + R(x) where BOTH Q(x) and R(x) have degree less than m.

Now note the following: P(r_i) = Q(r_i)P_m(r_i) + R(r_i) = R(r_i) ; that is where the fact that the r_i are the roots of the Legendre polynomials matters.
Then if we integrate: \int^1_{-1} P(x) dx = \int^{1}_{-1} Q(x)P_m(x) dx + \int^{1}_{-1} R(x)dx . Now because of property 2 above, \int^{1}_{-1} Q(x)P_m(x)dx =0 .

Now because the degree of R is less than m , \int^1_{-1}R(x) dx = \sum^{m}_{i=1} R(r_i )\int^1_{-1} L_{m,i}(x) dx =\sum^{m}_{i=1} P(r_i )\int^1_{-1} L_{m,i}(x) dx

This means: this quadrature scheme using the Legendre polynomials of degree m is EXACT for all polynomials of degree less than 2m.

Of course, one must know the roots r_i and the corresponding \int^1_{-1} L_{m,i}(x) dx but these have been tabulated.

Why this is exciting:

Let’s try this technique on: \int^1_{-1} e^x dx with, say, m = 5. We know that the exact answer is e-e^{-1} . But using the quadrature method with a spreadsheet:
[figure: spreadsheet carrying out the 5-point Gaussian quadrature]

The roots r_i are in column A, rows 5-9. The weights (coefficients; the \int^1_{-1} L_{m,i}(x) dx ) are in column B, rows 5-9. In column F we apply the function e^x to the roots in column A; the products of the weights with e^{r_i} are in column G, and then we do a simple sum to get a very accurate approximation to the integral.

The accuracy: this is due to the quadrature method being EXACT on the first ten terms (through degree 9) of the series for e^{x} = \sum^{\infty}_{j=0} \frac{x^j}{j!} . The inexactness comes from the higher order Taylor terms.

Note: to convert \int^{b}_{a} f(x) dx , one uses the change of variable u = \frac{2}{b-a} x - \frac{a+b}{b-a} to convert this to \frac{(b-a)}{2}\int^{1}_{-1} f(\frac{(b-a)}{2}u +\frac{a+b}{2}) du
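The whole scheme is a couple of lines with numpy’s tabulated nodes and weights; here is a sketch (mine) reproducing the spreadsheet example and the change of variables:

import numpy as np

nodes, weights = np.polynomial.legendre.leggauss(5)  # the r_i and the weights

# integral of e^x over [-1,1], as in the spreadsheet above
print(np.sum(weights * np.exp(nodes)), np.exp(1) - np.exp(-1))

# integral of e^x over a general [a,b] via the substitution in the note
a, b = 0.0, 2.0
mapped = 0.5 * (b - a) * nodes + 0.5 * (a + b)
print(0.5 * (b - a) * np.sum(weights * np.exp(mapped)), np.exp(2) - 1.0)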

April 1, 2014

Legendre Polynomials: elementary linear algebra proof of orthogonality

In our numerical analysis class, we are coming up on Gaussian Quadrature (a way of finding a numerical estimate for integrals). Here is the idea: given an interval [a,b] and a positive integer n we’d like to select numbers x_i \in [a,b], i \in \{1,2,3,...n\} and weights c_i so that \int^b_a f(x) dx is estimated by \sum^n_{i=1} c_i f(x_i) and that this estimate is exact for polynomials of degree n or less.

You’ve seen this in calculus classes: for example, Simpson’s rule uses x_1 =a, x_2 = \frac{a+b}{2}, x_3 = b and uses c_1 = \frac{b-a}{6}, c_2 =\frac{2(b-a)}{3}, c_3 =\frac{b-a}{6} and is exact for polynomials of degree 3 or less.
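A quick check of that exactness claim (a sketch of mine): apply the Simpson weights to a cubic.

def simpson(f, a, b):
    # Simpson's rule with the weights quoted above
    return (b - a) / 6 * (f(a) + 4 * f((a + b) / 2) + f(b))

f = lambda x: x ** 3 - 2 * x + 1
print(simpson(f, 0.0, 2.0))  # 2.0, and indeed the integral of x^3 - 2x + 1 over [0,2] is 4 - 4 + 2 = 2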

So, Gaussian quadrature is a way of finding such a formula that is exact for polynomials of degree less than or equal to a given fixed degree.

I might discuss this process in detail in a later post, but the purpose of this post is to discuss a tool used in developing Gaussian quadrature formulas: the Legendre polynomials.

First of all: what are these things? You can find a couple of good references here and here; note that one can often “normalize” these polynomials by multiplying by various constants.

One way these come up: they are polynomial solutions to the following differential equation: \frac{d}{dx}((1-x^2)\frac{d}{dx} P_n(x)) + n(n+1)P_n(x) = 0 . To see that these solutions are indeed polynomials (for integer values of n ), try the power series method expanded about x = 0 ; the singular points (regular singular points) occur at x = \pm 1 .

Though the Legendre differential equation is very interesting, it isn’t the reason we are interested in these polynomials. What interests us is that these polynomials have the following properties:

1. If one uses the inner product f \cdot g = \int^1_{-1} f(x) g(x) dx for the vector space of all polynomials (real coefficients) of finite degree, these polynomials are mutually orthogonal; that is, if n \ne m, P_m(x) \cdot P_n (x) = \int^1_{-1} P_n(x)P_m(x) dx = 0 .

2. deg(P_n(x)) = n .

Properties 1 and 2 imply that for all integers n , \{P_0(x), P_1(x), P_2(x), ....P_n(x) \} form an orthogonal basis for the vector subspace of all polynomials of degree n or less. It follows immediately that if Q(x) is any polynomial of degree k < m , then Q(x) \cdot P_m(x) = 0 ( Q(x) is a linear combination of P_j(x) where each j < m )

Now these properties can be proved from the very definitions of the Legendre polynomials (see the two references); for example, one can note that P_n is an eigenfunction for the Hermitian operator \frac{d}{dx}((1-x^2)\frac{d}{dx} P_n(x)) with associated eigenvalue n(n+1) , and such eigenfunctions are orthogonal.

This little result is fairly easy to see: call the Hermitian operator A and let m \ne n, A(P_m) =\lambda_m P_m, A(P_n) =\lambda_n P_n , with \lambda_n \ne \lambda_m .

Then consider: (A(P_m) \cdot P_n) = (\lambda_m P_m \cdot P_n) = \lambda_m (P_m \cdot P_n ) . But because A is Hermitian, (A(P_m) \cdot P_n) = (P_m \cdot A(P_n)) = (P_m \cdot \lambda_n P_n) = \lambda_n (P_m \cdot P_n) . Therefore, \lambda_m (P_m \cdot P_n ) = \lambda_n(P_m \cdot P_n) which means that P_m \cdot P_n = 0 .

Of course, one still has to show that this operator is Hermitian and this is what the second reference does (in effect).

The proof that the operator is Hermitian isn’t hard: assume that f, g both meet an appropriate condition (say, twice differentiable on some interval containing [-1,1] ).
Then use integration by parts with dv =\frac{d}{dx} ((1-x^2) \frac{d}{dx}f(x)) dx, u =g(x) : \int^1_{-1} \frac{d}{dx} ((1-x^2) \frac{d}{dx}f(x))g(x) dx = ((1-x^2) \frac{d}{dx}f(x))g(x)|^1_{-1}-\int^1_{-1}(1-x^2)\frac{d}{dx} f(x) \frac{d}{dx}g(x) dx . But ((1-x^2) \frac{d}{dx}f(x))g(x)|^1_{-1} =0 and the result follows by symmetry.

But not every student in my class has had the appropriate applied mathematics background (say, a course in partial differential equations).

So, we will take a more basic, elementary linear algebra approach to these. For our purposes, we’d like to normalize these polynomials to be monic (have leading coefficient 1).

Our approach

Use the Gram–Schmidt process from linear algebra on the basis: 1, x, x^2, x^3, x^4.....

Start with P_0 = 1 and let U_0 = \frac{1}{\sqrt{2}} ; here the U_i are the polynomials normalized to unit length (that is, \int^{1}_{-1} (U_i(x))^2 dx = 1 ): U_i(x) = \sqrt{\frac{1}{\int^1_{-1}(P_i(x))^2 dx}} P_i(x)

Next let P_1(x) =x, U_1(x) = \sqrt{\frac{3}{2}} x

Let P_2(x) = x^2 - \sqrt{\frac{3}{2}} x \int^{1}_{-1} \sqrt{\frac{3}{2}} x \cdot x^2 dx -\frac{1}{\sqrt{2}}\int^{1}_{-1} \frac{1}{\sqrt{2}}x^2 dx = x^2 -\frac{1}{3} . Note that this is not too bad since many of the integrals are just integrals of an odd function over [-1,1] , which are zero.

So the general definition:

P_{n+1}(x) = x^{n+1} - U_n(x) \int^1_{-1}x^{n+1} U_n(x) dx - U_{n-1}(x)\int^1_{-1} x^{n+1} U_{n-1}(x) dx - .... - \frac{1}{\sqrt{2}}\int^1_{-1} \frac{1}{\sqrt{2}} x^{n+1} dx
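If you’d like to watch Gram–Schmidt grind these out, here is a sympy sketch (mine); it projects with the non-normalized inner products, which is equivalent:

import sympy as sp

x = sp.symbols('x')

def monic_legendre(n):
    # Gram-Schmidt on 1, x, x^2, ... with inner product integral_{-1}^{1} f g dx
    ps = [sp.Integer(1)]
    for k in range(1, n + 1):
        p = x ** k
        for q in ps:
            p -= q * sp.integrate(x ** k * q, (x, -1, 1)) / sp.integrate(q * q, (x, -1, 1))
        ps.append(sp.expand(p))
    return ps

print(monic_legendre(3))  # [1, x, x**2 - 1/3, x**3 - 3*x/5]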

What about the roots?
Here we can establish that each P_m(x) has m distinct, real roots in (-1,1) . Suppose P_m(x) has only k < m distinct roots of odd multiplicity in (-1,1) , say x_1, x_2, ...x_k . Let W(x) = (x-x_1)(x-x_2)...(x-x_k) ; note that W has degree k < m . Now P_m(x)W(x) has only roots of even multiplicity, hence the polynomial P_m(x)W(x) cannot change sign on [-1,1] . But \int^{1}_{-1} P_m(x)W(x) dx = 0 because W has degree strictly less than m ; yet the integral of a nonzero polynomial that never changes sign cannot be zero. That is impossible. So P_m(x) has at least m distinct roots of odd multiplicity; but since P_m(x) has degree m, it has exactly m roots and they are all simple.

