College Math Teaching

March 16, 2019

The beta function integral: how to evaluate it

My interest in “beta” functions comes from their utility in Bayesian statistics. A nice 78 minute introduction to Bayesian statistics and how the beta distribution is used can be found here; you need to understand basic mathematical statistics concepts such as “joint density”, “marginal density”, “Bayes’ Rule” and “likelihood function” to follow the youtube lecture. To follow this post, one should know the standard “3 semesters” of calculus and know what the gamma function is (the extension of the factorial function to the real numbers); previous exposure to the standard “polar coordinates” proof that \int^{\infty}_{-\infty} e^{-x^2} dx = \sqrt{\pi} would be very helpful.

So, what is the beta function? It is \beta(a,b) = \frac{\Gamma(a) \Gamma(b)}{\Gamma(a+b)} where \Gamma(x) = \int_0^{\infty} t^{x-1}e^{-t} dt . Note that \Gamma(n+1) = n! for non-negative integers n . The gamma function is the unique “logarithmically convex” extension of the factorial function to the real line, where “logarithmically convex” means that the logarithm of the function is convex; that is, the second derivative of the log of the function is positive. Roughly speaking, this means that the function exhibits growth behavior similar to (or greater than) that of e^{x^2} .
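A quick sanity check of these gamma function facts (a minimal sketch in Python, assuming numpy and scipy are available):

```python
import numpy as np
from math import factorial
from scipy.integrate import quad
from scipy.special import gamma

# Gamma(n+1) = n! for non-negative integers n
for n in range(6):
    assert np.isclose(gamma(n + 1), factorial(n))

# The defining integral Gamma(x) = int_0^infty t^(x-1) e^(-t) dt
# at a non-integer argument, x = 2.5:
x = 2.5
val, err = quad(lambda t: t**(x - 1) * np.exp(-t), 0, np.inf)
print(val, gamma(x))  # both approximately 1.3293
```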

Now it turns out that the beta density function is defined as follows: \frac{\Gamma(a+b)}{\Gamma(a)\Gamma(b)} x^{a-1}(1-x)^{b-1} for 0 < x < 1 . One can check that the integral \int_0^1 x^{a-1}(1-x)^{b-1} dx is proper for a, b \geq 1 and a convergent improper integral when 0 < a < 1 or 0 < b < 1 .

I'll do this in two steps. Step one will convert the beta integral into an integral involving powers of sine and cosine. Step two will be to write \Gamma(a) \Gamma(b) as a product of two integrals, do a change of variables and convert to an improper integral on the first quadrant. Then I'll convert to polar coordinates to show that this integral is equal to \Gamma(a+b) \beta(a,b)

Step one: converting the beta integral to a sine/cosine integral. Restrict t \in [0, \frac{\pi}{2}] and then do the substitution x = sin^2(t), dx = 2 sin(t)cos(t) dt . Then the beta integral becomes: \int_0^1 x^{a-1}(1-x)^{b-1} dx = 2\int_0^{\frac{\pi}{2}} (sin^2(t))^{a-1}(1-sin^2(t))^{b-1} sin(t)cos(t)dt = 2\int_0^{\frac{\pi}{2}} (sin(t))^{2a-1}(cos(t))^{2b-1} dt

Step two: transforming the product of two gamma functions into a double integral and evaluating using polar coordinates.

Write \Gamma(a) \Gamma(b) = \int_0^{\infty} x^{a-1} e^{-x} dx  \int_0^{\infty} y^{b-1} e^{-y} dy

Now do the conversion x = u^2, dx = 2udu, y = v^2, dy = 2vdv to obtain:

\int_0^{\infty} 2u^{2a-1} e^{-u^2} du  \int_0^{\infty} 2v^{2b-1} e^{-v^2} dv (there is a tiny amount of algebra involved)

From which we now obtain

4\int^{\infty}_0 \int^{\infty}_0 u^{2a-1}v^{2b-1} e^{-(u^2+v^2)} dudv

Now we switch to polar coordinates, remembering the rdrd\theta that comes from evaluating the Jacobian of x = rcos(\theta), y = rsin(\theta)

4 \int^{\frac{\pi}{2}}_0 \int^{\infty}_0 r^{2a +2b -1} (cos(\theta))^{2a-1}(sin(\theta))^{2b-1} e^{-r^2} dr d\theta

This splits into two integrals:

2 \int^{\frac{\pi}{2}}_0 (cos(\theta))^{2a-1}(sin(\theta))^{2b-1} d \theta 2\int^{\infty}_0 r^{2a +2b -1}e^{-r^2} dr

The first of these integrals is just \beta(a,b) (the substitution \theta \rightarrow \frac{\pi}{2} - \theta swaps the roles of sine and cosine and recovers the form from step one), so now we have:

\Gamma(a) \Gamma(b) = \beta(a,b) 2\int^{\infty}_0 r^{2a +2b -1}e^{-r^2} dr

The second integral: we just use r^2 = x \rightarrow 2rdr = dx \rightarrow \frac{1}{2}\frac{1}{\sqrt{x}}dx = dr to obtain:

2\int^{\infty}_0 r^{2a +2b -1}e^{-r^2} dr = \int^{\infty}_0 x^{a+b-\frac{1}{2}} e^{-x} \frac{1}{\sqrt{x}}dx = \int^{\infty}_0 x^{a+b-1} e^{-x} dx =\Gamma(a+b) (yes, I cancelled the 2 with the 1/2)

And so the result follows.

That seems complicated for a simple little integral, doesn’t it?
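It is, but a few lines of numerical quadrature confirm that all three expressions above agree (a sketch; the test values of a and b are arbitrary):

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import gamma

a, b = 2.5, 1.7  # arbitrary test values with a, b > 0

beta_integral, _ = quad(lambda x: x**(a - 1) * (1 - x)**(b - 1), 0, 1)
trig_form, _ = quad(lambda t: 2 * np.sin(t)**(2*a - 1) * np.cos(t)**(2*b - 1),
                    0, np.pi / 2)
gamma_form = gamma(a) * gamma(b) / gamma(a + b)

print(beta_integral, trig_form, gamma_form)  # all three agree
```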

April 24, 2018

And I trolled my complex variables class

Filed under: advanced mathematics, analysis, class room experiment, complex variables — collegemathteaching @ 6:34 pm

One question on my last exam: find the Laurent series for \frac{1}{z + 2i} centered at z = -2i which converges on the punctured plane |z+2i| > 0 . And yes, about half the class missed it.

I am truly evil.

March 12, 2018

And I embarrass myself….integrate right over a couple of poles…

Filed under: advanced mathematics, analysis, calculus, complex variables, integrals — Tags: — collegemathteaching @ 9:43 pm

I didn’t have the best day Thursday; I was very sick (felt as if I had been in a boxing match: chills, aches, etc.) but was good to go on Friday (no cough, etc.).

So I walk into my complex variables class seriously underprepared for the lesson but decide to tackle the integral

\int^{\pi}_0 \frac{1}{1+sin^2(t)} dt

Of course, you know the easy way to do this, right?

\int^{\pi}_0 \frac{1}{1+sin^2(t)} dt =\frac{1}{2}  \int^{2\pi}_0 \frac{1}{1+sin^2(t)} dt and evaluate the latter integral as follows:

sin(t) = \frac{1}{2i}(z-\frac{1}{z}), dt = \frac{dz}{iz} (this follows from restricting z to the unit circle |z| =1 and setting z = e^{it} \rightarrow dz = ie^{it}dt ), obtaining a rational function of z which has isolated poles inside (and off of) the unit circle, and then using the residue theorem to evaluate.

So 1+sin^2(t) \rightarrow 1+\frac{-1}{4}(z^2 -2 + \frac{1}{z^2}) = \frac{1}{4}(-z^2 + 6 -\frac{1}{z^2}) And then the integral is transformed to:

\frac{1}{2}\frac{1}{i}(-4)\int_{|z|=1}\frac{dz}{z^3 -6z +\frac{1}{z}} =2i \int_{|z|=1}\frac{zdz}{z^4 -6z^2 +1}

Now the denominator factors: (z^2 -3)^2 -8  which means z^2 = 3 - \sqrt{8}, z^2 = 3+ \sqrt{8} but only the roots z = \pm \sqrt{3 - \sqrt{8}} lie inside the unit circle.
Let w =  \sqrt{3 - \sqrt{8}}

Write: \frac{z}{z^4 -6z^2 +1} = \frac{\frac{z}{z^2 -(3 + \sqrt{8})}}{(z-w)(z+w)}

Now calculate: \frac{\frac{w}{w^2 -(3 + \sqrt{8})}}{2w} = \frac{1}{2} \frac{-1}{2 \sqrt{8}} and \frac{\frac{-w}{w^2 -(3 + \sqrt{8})}}{-2w} = \frac{1}{2} \frac{-1}{2 \sqrt{8}}

Adding we get \frac{-1}{2 \sqrt{8}} so by the residue theorem 2i \int_{|z|=1}\frac{zdz}{z^4 -6z^2 +1} = 2i \cdot 2 \pi i \cdot \frac{-1}{2 \sqrt{8}} = \frac{2 \pi}{\sqrt{8}}=\frac{\pi}{\sqrt{2}}
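A quick numerical check (a sketch with scipy) that the residue calculation landed in the right place:

```python
import numpy as np
from scipy.integrate import quad

val, _ = quad(lambda t: 1.0 / (1.0 + np.sin(t)**2), 0, np.pi)
print(val, np.pi / np.sqrt(2))  # both approximately 2.2214
```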

Ok…that is fine as far as it goes and correct. But what stumped me: suppose I did not evaluate \int^{2\pi}_0 \frac{1}{1+sin^2(t)} dt and divide by two but instead just went with:

\int^{\pi}_0 \frac{1}{1+sin^2(t)} dt \rightarrow 4i \int_{\gamma}\frac{zdz}{z^4 -6z^2 +1} where \gamma is the upper half of |z| = 1 ? Well, \frac{z}{z^4 -6z^2 +1} has a primitive away from those poles so isn’t this just 4i \int^{-1}_{1}\frac{zdz}{z^4 -6z^2 +1} , right?

So why not just integrate along the x-axis to obtain 4i \int^{-1}_{1}\frac{xdx}{x^4 -6x^2 +1} = 0 because the integrand is an odd function?

This drove me crazy. Until I realized…the poles….were…on…the…real…axis. ….my goodness, how stupid could I possibly be???

To the student who might not have followed my point: let \gamma be the upper half of the circle |z|=1 taken in the standard direction; then \int_{\gamma} \frac{1}{z} dz = i \pi if you do this properly (hint: set z(t) = e^{it}, dz = ie^{it}dt, t \in [0, \pi] ). Now attempt to integrate from 1 to -1 along the real axis. What goes wrong? What goes wrong is exactly what I missed in the above example.
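For the same student, here is the hint carried out numerically (a sketch in Python): the semicircle gives i\pi on the nose, while the “shortcut” along the real axis runs straight through the pole at the origin.

```python
import numpy as np

# Parametrize the upper half of |z| = 1 by z(t) = e^{it}, t in [0, pi];
# then dz = i e^{it} dt and the integrand (1/z) dz contributes i dt exactly.
t = np.linspace(0, np.pi, 10001)
z = np.exp(1j * t)
dz_dt = 1j * np.exp(1j * t)
vals = dz_dt / z
integral = np.sum((vals[1:] + vals[:-1]) / 2 * np.diff(t))  # trapezoid rule
print(integral)  # approximately 0 + 3.14159j, i.e. i*pi

# Integrating 1/x from 1 to -1 along the real axis is not even defined:
# the pole at x = 0 sits on the path, exactly the trap described above.
```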

February 22, 2018

What is going on here: sum of cos(nx)…

Filed under: analysis, derivatives, Fourier Series, pedagogy, sequences of functions, series, uniform convergence — collegemathteaching @ 9:58 pm

This started innocently enough; I was attempting to explain why we have to be so careful when we attempt to differentiate a power series term by term: when one talks about infinite sums, the “sum of the derivatives” might fail to exist.

Anyone who is familiar with Fourier Series and the square wave understands this well:

\frac{4}{\pi} \sum^{\infty}_{k=1} \frac{1}{2k-1}sin((2k-1)x)  = (\frac{4}{\pi})( sin(x) + \frac{1}{3}sin(3x) + \frac{1}{5}sin(5x) +.....) yields the “square wave” function (taking the value zero at the jump discontinuities)

Here I graphed the partial sum up to 2k-1 = 21 .
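A sketch of these partial sums in Python (evaluating at single points rather than graphing):

```python
import numpy as np

def square_wave_partial(x, terms):
    """Partial sum (4/pi) * sum_{k=1}^{terms} sin((2k-1)x)/(2k-1)."""
    k = np.arange(1, terms + 1)
    return (4 / np.pi) * np.sum(np.sin((2 * k - 1) * x) / (2 * k - 1))

# The sums head to +1 for 0 < x < pi and are exactly 0 at the jump x = 0:
for terms in (5, 50, 500):
    print(terms, square_wave_partial(np.pi / 2, terms),
          square_wave_partial(0.0, terms))
```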

Now the limit function fails to even be continuous. But it is differentiable away from the jump discontinuities, and its derivative is zero at every point where it exists.

(recall: here we have pointwise convergence; to get a differentiable limit, we need other conditions such as uniform convergence together with uniform convergence of the derivatives).

But, just for the heck of it, let’s differentiate term by term and see what we get:

(\frac{4}{\pi})\sum^{\infty}_{k=1} cos((2k-1)x) = (\frac{4}{\pi})(cos(x) + cos(3x) + cos(5x) + cos(7x) +.....)...

It is easy to see that this result doesn’t even converge to a function of any sort.

Example: let’s see what happens at x = \frac{\pi}{4}: cos(\frac{\pi}{4}) = \frac{1}{\sqrt{2}}

cos(\frac{\pi}{4}) + cos(3\frac{\pi}{4}) =0

cos(\frac{\pi}{4}) + cos(3\frac{\pi}{4}) + cos(5\frac{\pi}{4}) = -\frac{1}{\sqrt{2}}

cos(\frac{\pi}{4}) + cos(3\frac{\pi}{4}) + cos(5\frac{\pi}{4}) + cos(7\frac{\pi}{4}) = 0

And this repeats over and over again; no limit is possible.

Something similar happens for x = \frac{p}{q}\pi where p, q are relatively prime positive integers.

But something weird is going on with this sum. I plotted the terms with 2k-1 \in \{1, 3, ...35 \}

(and yes, I am using \frac{\pi}{4} csc(x) as a type of “envelope function”)

BUT…if one, say, looks at cos(29x) + cos(31x) + cos(33x) + cos(35x)

we really aren’t getting convergence (even at irrational multiples of \pi ). But SOMETHING is going on!

I decided to plot up to cos(61x) .

Something is going on, though it isn’t convergence. Note: by accident, I found that the pattern falls apart when I skipped one of the terms.

This is something to think about.

I wonder: for all x \in (0, \pi) , is sup_{n \in \{1, 3, 5, 7, ....\}}|\sum_{k \in \{1,3,...,n\}}cos(kx)| \leq |csc(x)| , and can we somehow get close to csc(x) by allowing enough terms, where the value of x at which the partial sums approach the envelope depends on how many terms we are using (not always the same value of x )?
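The guess is on the right track. Telescoping the identity 2sin(x)cos((2k-1)x) = sin(2kx) - sin((2k-2)x) gives the closed form \sum^n_{k=1} cos((2k-1)x) = \frac{sin(2nx)}{2sin(x)} , so the partial sums stay bounded by \frac{1}{2}|csc(x)| and sweep back and forth under that envelope without converging. A quick spot check (a sketch; the identity is classical, the code just verifies it at one point):

```python
import numpy as np

def S(n, x):
    """Sum of cos(kx) over odd k = 1, 3, ..., 2n-1."""
    k = np.arange(1, 2 * n, 2)
    return np.sum(np.cos(k * x))

x = 1.0  # a generic point in (0, pi)
for n in (5, 10, 100, 1000):
    closed_form = np.sin(2 * n * x) / (2 * np.sin(x))
    print(n, S(n, x), closed_form, abs(S(n, x)) <= abs(1 / np.sin(x)))
```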

October 11, 2016

The bias we have toward the rational numbers

Filed under: analysis, Measure Theory — Tags: , , — collegemathteaching @ 5:39 pm

A brilliant scientist (full tenure at the University of Chicago) has a website called “Why Evolution is True”. He wrote an article titled “why is pi irrational” and seemed to be under the impression that being “irrational” was somehow special or unusual.

That is an easy impression to have; after all, in almost every example we use rationals, or sometimes special irrationals (e. g. multiples of pi, e , square roots, etc.).

We even condition our students to think that way. Time and time again, I’ve seen questions such as “if f(.9) = .94, f(.95) = .9790, f(1.01) = 1.043 then it is reasonable to conclude that f(1) = ____”. It is as if we want students to think that functions take integers to integers.

The reality is that the set of rationals has measure zero on the real line, so if one were to randomly select a number from the real line and the selection was truly random, the probability of the number being rational would be zero!

So, it would be far, far stranger had “pi” turned out to be rational. But that just sounds so strange.

So, why do the rationals have measure zero? I dealt with that in a more rigorous way elsewhere (and it is basic analysis) but I’ll give a simplified proof.

The set of rationals is countable, so one can label all of them as q(n), n \in \{0, 1, 2, ... \} . Now consider the following covering of the rational numbers: U_n = (q(n) - \frac{1}{2^{n+1}}, q(n) + \frac{1}{2^{n+1}}) . The length of each open interval is \frac{1}{2^n} . Of course there will be overlapping intervals but that isn’t important. What is important is that if one sums the lengths one gets \sum^{\infty}_{n = 0} \frac{1}{2^n} = \frac{1}{1-\frac{1}{2}} = 2 . So the rationals can be covered by a collection of open sets whose total length is less than or equal to 2.

But there is nothing special about 2; one can then find new coverings: U_n = (q(n) - \frac{\epsilon}{2^{n+1}}, q(n) + \frac{\epsilon}{2^{n+1}}) and the total length is now less than or equal to 2 \epsilon where \epsilon is any positive real number. Since \epsilon can be taken as small as we please, the set of rationals can be said to have measure zero.
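Here is a sketch of the covering argument in Python: enumerate rationals in (0,1) by increasing denominator and hand the n-th one an interval of length \epsilon / 2^n .

```python
from fractions import Fraction

def rationals_in_unit_interval():
    """Enumerate the rationals in (0,1): q(0), q(1), q(2), ..."""
    seen = set()
    q = 2
    while True:
        for p in range(1, q):
            r = Fraction(p, q)
            if r not in seen:
                seen.add(r)
                yield r
        q += 1

eps = 0.001
total_length = 0.0
gen = rationals_in_unit_interval()
for n in range(1000):
    next(gen)                      # the n-th rational gets an interval...
    total_length += eps / 2**n     # ...of length eps / 2^n
print(total_length)  # < 2 * eps = 0.002, no matter how many terms we take
```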

June 7, 2016

Pop-math: getting it wrong but being close enough to give the public a feel for it

Space filling curves: for now, we’ll just work on continuous functions f: [0,1] \rightarrow [0,1] \times [0,1] \subset R^2 .

A curve is typically defined as a continuous function f: [0,1] \rightarrow M where M is, say, a manifold (a second countable metric space in which every point has a neighborhood homeomorphic either to R^k or, for manifolds with boundary, to a half space of R^k ). Note: though we often think of smooth or piecewise linear curves, we don’t have to do so. Also, we can allow for self-intersections.

However, if we don’t put restrictions such as these, weird things can happen. It can be shown (and the video suggests a construction, which is correct) that there exists a continuous, ONTO function f: [0,1] \rightarrow [0,1] \times [0,1] ; such a gadget is called a space filling curve.

It follows from elementary topology that such an f cannot be one to one: if it were, then because the domain is compact, f would have to be a homeomorphism. But the respective spaces are not homeomorphic. For example: the closed interval is disconnected by the removal of any non-end point, whereas the closed square has no such separating point.

Therefore, if f is a space filling curve, the inverse image of at least one point contains more than one point; the inverse (as a function) cannot be defined.

And THAT is where this article and video go off the rails, though, practically speaking, one can approximate the space filling curve as closely as one pleases by an embedded curve (one that IS one to one) and therefore snake the curve through any desired number of points (pixels?).
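For the curious, here is a sketch of one standard family of such embedded approximations, the Hilbert curve (the recursion below is one common formulation, not necessarily the construction from the video):

```python
def hilbert_curve(order):
    """Vertices of the order-n Hilbert curve: an embedded (one to one)
    path visiting every cell of a 2^n by 2^n grid exactly once."""
    if order == 0:
        return [(0, 0)]
    prev = hilbert_curve(order - 1)
    s = 2 ** (order - 1)
    lower_left = [(y, x) for (x, y) in prev]                       # transpose
    upper_left = [(x, y + s) for (x, y) in prev]                   # shift up
    upper_right = [(x + s, y + s) for (x, y) in prev]              # shift up/right
    lower_right = [(2 * s - 1 - y, s - 1 - x) for (x, y) in prev]  # anti-transpose
    return lower_left + upper_left + upper_right + lower_right

pts = hilbert_curve(3)            # 64 points, one per grid cell
assert len(set(pts)) == len(pts)  # the approximation IS one to one
# consecutive points are adjacent cells, so this is a genuine embedded curve:
assert all(abs(a - c) + abs(b - d) == 1
           for (a, b), (c, d) in zip(pts, pts[1:]))
```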

So, enjoy the video which I got from here (and yes, the text of this post has the aforementioned error)

January 26, 2016

More Fun with Divergent Series: redefining series convergence (Cesàro, etc.)

Filed under: analysis, calculus, sequences, series — Tags: , , — collegemathteaching @ 10:21 pm

This post is more designed to entertain myself than anything else. It builds on a previous post which talks about deleting enough terms from a divergent series to make it a convergent one.

This post is inspired by Chapter 8 of Konrad Knopp’s classic Theory and Application of Infinite Series. The title of the chapter is Divergent Series.

Notation: when I talk about a series converging, I mean “converging” in the usual sense; e. g. if s_n = \sum_{k=0}^{k=n} a_k and lim_{n \rightarrow \infty}s_n = s then \sum_{k=0}^{\infty} a_k is said to be convergent with sum s .

All of this makes sense since things like limits are carefully defined. But as Knopp points out, in the “days of old”, mathematicians saw these series as formal objects rather than the result of careful construction. So some of these mathematicians (like Euler) had no problem saying things like \sum^{\infty}_{k=0} (-1)^k = 1-1+1-1+1..... = \frac{1}{2} . Now this is complete nonsense by our usual modern definition. But we might note that \frac{1}{1-x} = \sum^{\infty}_{k=0} x^k for -1 < x < 1 and note that x = -1 IS in the domain of the left hand side.

So, is there a way of redefining the meaning of “infinite sum” that gives us this result, while not changing the value of convergent series (defined in the standard way)? As Knopp points out in his book, the answer is “yes” and he describes several definitions of summation that

1. Do not change the value of an infinite sum that converges in the traditional sense and
2. Allow for more series to converge.

We’ll discuss one of these methods, commonly referred to as Cesàro summation. There are ways to generalize this.

How this came about

Consider the Euler example: 1 -1 + 1 -1 + 1 -1...... . Clearly, s_{2k} = 1, s_{2k+1} = 0 and so this geometric series diverges. But notice that the arithmetic average of the partial sums, computed as c_n = \frac{s_0 + s_1 +...+s_n}{n+1} , does tend to \frac{1}{2} as n tends to infinity: c_{2n} = \frac{n+1}{2n+1} whereas c_{2n+1} = \frac{n+1}{2n+2} = \frac{1}{2} , and both of these quantities tend to \frac{1}{2} as n tends to infinity.
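A sketch of this computation in Python:

```python
import numpy as np

terms = np.array([(-1) ** k for k in range(20)])  # 1, -1, 1, -1, ...
s = np.cumsum(terms)                              # partial sums s_n
c = np.cumsum(s) / np.arange(1, len(s) + 1)       # Cesaro means c_n
print(s[:8])  # 1 0 1 0 1 0 1 0           -- no limit
print(c[:8])  # 1.0 0.5 0.667 0.5 0.6 ... -- tending to 1/2
```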

So, we need to see that this method of summing is workable; that is, do infinite sums that converge in the previous sense still converge to the same number with this method?

The answer is, of course, yes. Here is how to see this: Let x_n be a sequence that converges to zero. Then for any \epsilon > 0 we can find M such that k > M implies that |x_k| < \epsilon . So for n > k we have \frac{x_1 + x_2 + ...+ x_{k-1} + x_k + ...+ x_n}{n} = \frac{x_1+ ...+x_{k-1}}{n} + \frac{x_k + x_{k+1} + ....x_n}{n} Because k is fixed, the first fraction tends to zero as n tends to infinity. The second fraction is smaller than \epsilon in absolute value. But \epsilon is arbitrary, hence this arithmetic average of this null sequence is itself a null sequence.

Now let x_n \rightarrow L and let c_n = \frac{x_1 + x_2 + ...+ x_{k-1} + x_k + ...+ x_n}{n} . Now subtract: c_n-L =  \frac{(x_1-L) + (x_2-L) + ...+ (x_{k-1}-L) +(x_k-L) + ...+ (x_n-L)}{n} , and the x_n-L form a null sequence. Then so do the c_n-L .

Now to be useful, we’d have to show that series that are summable in the Cesàro sense obey things like the multiplicative laws; they do but I am too lazy to show that. See the Knopp book.

I will mention a couple of interesting (to me) things though. Neither is really profound.

1. If a series diverges to infinity (that is, if for any positive M there exists n such that for all k \geq n, s_k > M ), then this series is NOT Cesàro summable. It is relatively easy to see why: given such an M and k , consider \frac{s_1 + s_2 + s_3 + ...+s_{k-1} + s_k + s_{k+1} + ...s_n}{n} = \frac{s_1+ s_2 + ...+s_{k-1}}{n} + \frac{s_k + s_{k+1} .....+s_{n}}{n} , which is greater than \frac{n-k}{n} M plus a quantity that tends to zero, for large n . Hence the Cesàro means become unbounded.

Upshot: there is no hope of making something like \sum^{\infty}_{n=1} \frac{1}{n} into a convergent series by this method. Now there is a way of making an alternating, divergent series into a convergent one via doing something like a “double Cesàro sum” (take arithmetic averages of the arithmetic averages) but that is a topic for another post.

2. Cesàro summation may speed up convergence of an alternating series which passes the alternating series test, OR it might slow it down. I’ll have to develop this idea more fully. But I invite the reader to try Cesàro summation for \sum^{\infty}_{k=1} (-1)^{k+1} \frac{1}{k} and on \sum^{\infty}_{k=1} (-1)^{k+1} \frac{1}{k^2} and on \sum^{\infty}_{k=0} (-1)^k \frac{1}{2^k} . In the first two cases, the series converges slowly enough that Cesàro summation speeds up convergence. Cesàro slows down the convergence of the geometric series though. It is interesting to ponder why.
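Here is a sketch of that experiment (200 terms of each series, comparing the error of the plain partial sum with the error of the Cesàro mean; the limits ln 2, \pi^2/12 and 2/3 are standard):

```python
import numpy as np

def last_partial_sum_and_cesaro(terms):
    s = np.cumsum(terms)
    c = np.cumsum(s) / np.arange(1, len(s) + 1)
    return s[-1], c[-1]

k = np.arange(1, 201)
experiments = [
    ("alternating harmonic", (-1.0) ** (k + 1) / k, np.log(2)),
    ("alternating 1/k^2", (-1.0) ** (k + 1) / k**2, np.pi**2 / 12),
    ("geometric (-1/2)^k", (-1.0) ** (k - 1) / 2.0 ** (k - 1), 2.0 / 3.0),
]
for name, terms, limit in experiments:
    s, c = last_partial_sum_and_cesaro(terms)
    print(name, "partial sum error:", abs(s - limit),
          "Cesaro error:", abs(c - limit))
```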

January 6, 2016

On all but a set of measure zero

Filed under: analysis, physics, popular mathematics, probability — Tags: — collegemathteaching @ 7:36 pm

This blog isn’t about cosmology or about arguments over religion. But it is unusual to hear “on all but a set of measure zero” in the middle of a pop-science talk: (2:40-2:50)

May 11, 2015

The hypervolume of the n-ball enclosed by a standard n-1 sphere

I am always looking for interesting calculus problems to demonstrate various concepts and perhaps generate some interest in pure mathematics.
And yes, I like to “blow off some steam” by spending some time having some non-technical mathematical fun with elementary mathematics.

This post uses only:

1. Integration by parts and basic reduction formulas.
2. Trig substitution.
3. Calculation of volumes (and hyper volumes) by the method of cross sections.
4. Induction
5. Elementary arithmetic involving factorials.

The quest: find a formula for the (hyper)volume of the region \{(x_1, x_2, x_3,....x_k) | \sum_{i=1}^k x_i^2 \leq R^2 \} \subset R^k

We will assume that the usual tools of calculus work as advertised.

Start. If we denote the (hyper)volume of the k-ball by V_k , we will start with the assumption that V_1 = 2R ; that is, the distance between the endpoints of [-R,R] is 2R .

Step 1: we show, via induction, that V_k =c_kR^k where c_k is a constant and R is the radius.

Our proof will be inefficient for instructional purposes.

We know that V_1 =2R hence the induction hypothesis holds for the first case and c_1 = 2 . We now work the k = 2 case in detail because, for the beginner, the technique will be easier to follow further along if we do so.

Yes, I know that you know that V_2 = \pi R^2 and you’ve seen many demonstrations of this fact. Here is another: let’s calculate this using the method of “area by cross sections”. Here is x^2 + y^2 = R^2 with some y = c cross sections drawn in.


Now do the calculation by integrals: we will use symmetry and only do the upper half and multiply our result by 2. At each level y = c , call the radius from the center line to the circle R(y) , so the total length of the “y is constant” cross section is 2R(y) ; we then “multiply by the thickness” dy to obtain V_2 = 4 \int^{y=R}_{y=0} R(y) dy .

But remember that the curve in question is x^2 + y^2 = R^2 and so if we set x = R(y) we have R(y) = \sqrt{R^2 -y^2} and so our integral is 4 \int^{y=R}_{y=0}\sqrt{R^2 -y^2}  dy

Now this integral is no big deal. But HOW we solve it will help us down the road. So here, we use the change of variable (aka “trigonometric substitution”): y = Rsin(t), dy = Rcos(t)dt to change the integral to:

4 \int^{\frac{\pi}{2}}_0 R^2 cos^2(t) dt = 4R^2 \int^{\frac{\pi}{2}}_0  cos^2(t) dt therefore

V_2 = c_2 R^2 where:

c_2 = 4\int^{\frac{\pi}{2}}_0  cos^2(t) dt

Yes, I know that this is an easy integral to solve, but I first presented the result this way in order to make a point.

Of course, c_2 = 4\int^{\frac{\pi}{2}}_0  cos^2(t) dt = 4\int^{\frac{\pi}{2}}_0 (\frac{1}{2} + \frac{1}{2}cos(2t)) dt = \pi

Therefore, V_2 =\pi R^2 as expected.

Exercise for those seeing this for the first time: compute c_3 and V_3 by using the above methods.

Inductive step: Assume V_k = c_kR^k . Now calculate using the method of cross sections above (and here we move away from x-y coordinates to more general labeling):

V_{k+1} = 2\int^R_0 V_k \ dx_{k+1} = 2 \int^R_0 c_k (R(x_{k+1}))^k dx_{k+1} =c_k 2\int^R_0 (R(x_{k+1}))^k dx_{k+1}

Now we do the substitutions: first of all, we note that x_1^2 + x_2^2 + ...x_{k}^2 + x_{k+1}^2 = R^2 and so

x_1^2 + x_2^2 ....+x_k^2 = R^2 - x_{k+1}^2 . Now for the key observation: x_1^2 + x_2^2 ..+x_k^2 = (R(x_{k+1}))^2 and so R(x_{k+1}) = \sqrt{R^2 - x_{k+1}^2}

Now use the induction hypothesis to note:

V_{k+1} = c_k 2\int^R_0 (R^2 - x_{k+1}^2)^{\frac{k}{2}} dx_{k+1}

Now do the substitution x_{k+1} = Rsin(t), dx_{k+1} = Rcos(t)dt and the integral is now:

V_{k+1} = c_k 2\int^{\frac{\pi}{2}}_0 R^{k+1} cos^{k+1}(t) dt = c_k(2 \int^{\frac{\pi}{2}}_0 cos^{k+1}(t) dt)R^{k+1} which is what we needed to show.

In fact, we have shown a bit more. We’ve shown that c_1 = 2 =2 \int^{\frac{\pi}{2}}_0(cos(t))dt, c_2 = 2 \cdot 2\int^{\frac{\pi}{2}}_0 cos^2(t) dt = c_1 2\int^{\frac{\pi}{2}}_0 cos^2(t) dt and, in general,

c_{k+1} = c_{k}(2 \int^{\frac{\pi}{2}}_0 cos^{k+1}(t) dt) = 2^{k+1} \int^{\frac{\pi}{2}}_0(cos^{k+1}(t))dt \int^{\frac{\pi}{2}}_0(cos^{k}(t))dt \int^{\frac{\pi}{2}}_0(cos^{k-1}(t))dt .....\int^{\frac{\pi}{2}}_0(cos(t))dt

Finishing the formula

We now need to calculate these easy calculus integrals: in this case the reduction formula:

\int cos^n(x) dx = \frac{1}{n}cos^{n-1}(x)sin(x) + \frac{n-1}{n} \int cos^{n-2}(x) dx is useful (it is merely integration by parts). Now use the limits and elementary calculation to obtain:

\int^{\frac{\pi}{2}}_0 cos^n(x) dx = \frac{n-1}{n} \int^{\frac{\pi}{2}}_0 cos^{n-2}(x)dx to obtain:

\int^{\frac{\pi}{2}}_0 cos^n(x) dx = (\frac{n-1}{n})(\frac{n-3}{n-2})......(\frac{3}{4})\frac{\pi}{4} if n is even and:
\int^{\frac{\pi}{2}}_0 cos^n(x) dx = (\frac{n-1}{n})(\frac{n-3}{n-2})......(\frac{4}{5})\frac{2}{3} if n is odd.
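A sketch of the reduction formula at work, checked against direct quadrature:

```python
import numpy as np
from scipy.integrate import quad

def wallis(n):
    """int_0^{pi/2} cos^n(x) dx via the reduction formula above."""
    if n == 0:
        return np.pi / 2
    if n == 1:
        return 1.0
    return (n - 1) / n * wallis(n - 2)

for n in range(2, 8):
    direct, _ = quad(lambda x: np.cos(x) ** n, 0, np.pi / 2)
    print(n, wallis(n), direct)  # agree to quadrature accuracy
```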

Now to come up with something resembling a closed formula let’s experiment and do some calculation:

Note that c_1 = 2, c_2 = \pi, c_3 = \frac{4 \pi}{3}, c_4 = \frac{\pi^2}{2}, c_5 = \frac{2^3 \pi^2}{3 \cdot 5} = \frac{8 \pi^2}{15}, c_6 = \frac{\pi^3}{3 \cdot 2} = \frac{\pi^3}{6} .

So we can make the inductive conjecture that c_{2k} = \frac{\pi^k}{k!} and see how it holds up: c_{2k+2} = 2^2 \int^{\frac{\pi}{2}}_0(cos^{2k+2}(t))dt \int^{\frac{\pi}{2}}_0(cos^{2k+1}(t))dt \frac{\pi^k}{k!}

= 2^2 ((\frac{2k+1}{2k+2})(\frac{2k-1}{2k})......(\frac{3}{4})\frac{\pi}{4})((\frac{2k}{2k+1})(\frac{2k-2}{2k-1})......\frac{2}{3})\frac{\pi^k}{k!}

Now notice the telescoping effect of the fractions from the c_{2k+1} factor. All factors cancel except for the (2k+2) in the first denominator and the 2 in the first numerator, as well as the \frac{\pi}{4} factor. This leads to:

c_{2k+2} = 2^2(\frac{\pi}{4})\frac{2}{2k+2} \frac{\pi^k}{k!} = \frac{\pi^{k+1}}{(k+1)!} as required.

Now we need to calculate c_{2k+1} = 2\int^{\frac{\pi}{2}}_0(cos^{2k+1}(t))dt c_{2k} = 2\int^{\frac{\pi}{2}}_0(cos^{2k+1}(t))dt \frac{\pi^k}{k!}

= 2 (\frac{2k}{2k+1})(\frac{2k-2}{2k-1})......(\frac{4}{5})\frac{2}{3}\frac{\pi^k}{k!} = 2 \frac{(2k)(2k-2)(2k-4)...(4)(2)}{(2k+1)(2k-1)...(5)(3)} \frac{\pi^k}{k!}

To simplify this further: split up the factors of the k! in the denominator and put one between each denominator factor:

= 2 \frac{(2k)(2k-2)(2k-4)...(4)(2)}{(2k+1)(k)(2k-1)(k-1)...(5)(2)(3)(1)} \pi^k . Now multiply the denominator by 2^k , putting one factor of 2 with each factor k, k-1, ..., 1 of the interleaved k! ; also multiply by 2^k in the numerator to obtain:

(2) 2^k \frac{(2k)(2k-2)(2k-4)...(4)(2)}{(2k+1)(2k)(2k-1)(2k-2)...(6)(5)(4)(3)(2)} \pi^k . Now pull a factor of 2 out of each of the numerator factors 2k, 2k-2, ... to obtain another 2^k times k! :

= (2) 2^k 2^k \pi^k \frac{k!}{(2k+1)!} = 2 \frac{(4 \pi)^k k!}{(2k+1)!} which is the required formula.

So to summarize:

V_{2k} = \frac{\pi^k}{k!} R^{2k}

V_{2k+1}= \frac{2 k! (4 \pi)^k}{(2k+1)!}R^{2k+1}
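A sketch of these formulas in Python (which also previews the limit discussed next):

```python
from math import factorial, pi

def ball_volume(n, R=1.0):
    """Hypervolume of the n-ball from the closed formulas above."""
    if n % 2 == 0:
        k = n // 2
        return pi**k / factorial(k) * R**n
    k = (n - 1) // 2
    return 2 * factorial(k) * (4 * pi)**k / factorial(2 * k + 1) * R**n

for n in range(1, 13):
    # c_n, and the fraction of the circumscribed cube [-1,1]^n it fills:
    print(n, ball_volume(n), ball_volume(n) / 2**n)
# For R = 1, c_n peaks at n = 5 and then decreases toward zero.
```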

Note the following: lim_{k \rightarrow \infty} c_{k} = 0 . If this seems strange at first, think of it this way: imagine the n-ball being “inscribed” in an n-cube which has hyper volume (2R)^n . Then consider the ratio \frac{2^n R^n}{c_n R^n} = 2^n \frac{1}{c_n} ; that is, the n-ball holds a smaller and smaller percentage of the hyper volume of the n-cube that it is inscribed in; note the 2^n corresponds to the number of corners in the n-cube. One might see that the rounding gets more severe as the number of dimensions increases.

One also notes that for fixed radius R, lim_{n \rightarrow \infty} V_n = 0 as well.

There are other interesting aspects to this limit: for what dimension n does the maximum hypervolume occur? As you might expect: this depends on the radius involved; a quick glance at the hyper volume formulas will show why. For more on this topic, including an interesting discussion on this limit itself, see Dave Richardson’s blog Division by Zero. Note: his approach to finding the hyper volume formula is also elementary but uses polar coordinate integration as opposed to the method of cross sections.

April 10, 2015

Cantor sets and countable products of discrete spaces (0, 1)^Z

Filed under: advanced mathematics, analysis, point set topology, sequences — Tags: , , , — collegemathteaching @ 11:09 am

This might seem like a strange topic but right now our topology class is studying compact metric spaces. One of the more famous of these is the “Cantor set” or “Cantor space”. I discussed the basics of these here.

Now if you know the relationship between a countable product of two point discrete spaces (in the product topology) and Cantor spaces/Cantor Sets, this post is probably too elementary for you.

Construction: start with a two point set D = \{0, 2 \} and give it the discrete topology. The reason for choosing 0 and 2 to represent the elements will become clear later. Of course, D is a compact metric space (with the discrete metric: d(x,y) = 1 if x \neq y ).

Now consider the infinite product of such spaces with the product topology: C = \Pi^{\infty}_{i=1} D_i where each D_i is homeomorphic to D . It follows from the Tychonoff Theorem that C is compact, though we can prove this directly: let \{ U_{\alpha} \}_{\alpha \in I} be any open cover for C . Choose an arbitrary U_{\alpha} from this open cover and a basic open set U \subset U_{\alpha} ; because we are using the product topology, U = O_1 \times O_2 \times ....O_k \times (\Pi_{i=k+1}^{\infty} D_i ) where each O_i is a one or two point set. This means that C - U is a union of at most 2^k -1 sets of the same basic form, which requires at most 2^k -1 elements of the open cover to cover.

Now let’s examine some properties.

Clearly the space is Hausdorff (T_2 ) and uncountable.

1. Every point of C is a limit point of C . To see this: denote x \in C by the sequence \{x_i \} where x_i \in \{0, 2 \} . Then any open set containing \{ x_i \} contains a basic open set O_1 \times O_2 \times...O_k \times \Pi^{\infty}_{i=k+1} D_i and hence contains ALL points \{y_i\} where y_i = x_i for i \in \{1, 2, ...k \} . So all points of C are accumulation points of C ; in fact they are condensation points (or perfect limit points).

(Refresher: accumulation points are those for which every open neighborhood contains an infinite number of points of the set in question; condensation points are those for which every open neighborhood contains an uncountable number of points; and perfect limit points are those for which every open neighborhood contains as many points as the set in question has (same cardinality).)

2. C is totally disconnected (the components are one point sets). Here is how we will show this: given x, y \in C, x \neq y , there exist disjoint open sets U_x, U_y, x \in U_x, y \in U_y, U_x \cup U_y = C . Proof of claim: if x \neq y there exists a first coordinate k for which x_k \neq y_k (that is, a first k for which the canonical projection maps disagree: \pi_k(x) \neq \pi_k(y) ). Then
U_x = D_1 \times D_2 \times ....\times D_{k-1} \times x_k \times \Pi^{\infty}_{i=k+1} D_i,

U_y = D_1 \times D_2 \times.....\times D_{k-1} \times y_k \times \Pi^{\infty}_{i = k+1} D_i

are the required disjoint open sets whose union is all of C .

3. C is second countable: a countable basis consists of the products determined by finite sequences of 0’s and 2’s followed by an infinite product of the D_i .

4. C is metrizable as well; d(x,y) = \sum^{\infty}_{i=1} \frac{|x_i - y_i|}{3^i} . Note that this metric separates points. For suppose x \neq y but d(x,y) = 0 . There is a first k with x_k \neq y_k , and since every term of the sum is non-negative,

0 = d(x,y) \geq \frac{|x_k - y_k|}{3^k} = \frac{2}{3^k} > 0

which is impossible.
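This is where the choice of the digits 0 and 2 pays off: the map \{x_i\} \rightarrow \sum^{\infty}_{i=1} \frac{x_i}{3^i} sends C onto the set of points of [0,1] whose ternary expansion avoids the digit 1, which is exactly the “middle thirds” Cantor set discussed further below. A sketch on truncated sequences (the helper names are mine):

```python
def ternary_point(digits):
    """Map a finite {0, 2}-sequence to the point of [0,1] it determines."""
    return sum(d / 3**i for i, d in enumerate(digits, start=1))

def d(x, y):
    """The metric from item 4, truncated to the given digits."""
    return sum(abs(a - b) / 3**i for i, (a, b) in enumerate(zip(x, y), start=1))

x = (0, 2, 2, 0, 2)
y = (0, 2, 0, 0, 2)            # first disagreement at index k = 3
print(ternary_point(x), ternary_point(y))  # two middle-thirds points
print(d(x, y), 2 / 3**3)       # d(x,y) >= 2/3^k > 0 when x != y
```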

5. By construction C is uncountable, though this also follows from the fact that C is compact, Hausdorff and dense in itself.

6. C \times C is homeomorphic to C . The homeomorphism is given by f( \{x_i \}, \{y_i \}) = \{ x_1, y_1, x_2, y_2,... \} \in C . It follows that C is homeomorphic to a finite product with itself (product topology). Here we use the fact that if f: X \rightarrow Y is a continuous bijection with X compact and Y Hausdorff then f is a homeomorphism.

Now we can say a bit more: if C_i is a copy of C then \Pi^{\infty}_{i =1 } C_i is homeomorphic to C . This will follow from subsequent work, but we can prove this right now, provided we review some basic facts about countable products and counting.

First let’s show that there is a bijection between Z \times Z and Z (here Z denotes the positive integers). A bijection is suggested by this diagram:


which has the following formula (coming up with it is fun; it uses the fact that \sum^k _{n=1} n = \frac{k(k+1)}{2} ):

\phi(k,1) = \frac{(k)(k+1)}{2} for k even
\phi(k,1) = \frac{(k-1)(k)}{2} + 1 for k odd
\phi(k-j, j+1) =\phi(k,1) + j for k odd, j \in \{1, 2, ...k-1 \}
\phi(k-j, j+1) = \phi(k,1) - j for k even, j \in \{1, 2, ...k-1 \}
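A sketch of these formulas in code, with a check that they really do give a bijection (the code is a direct transcription of the formulas above, using the diagonal index k = i + j - 1 as bookkeeping):

```python
def phi(i, j):
    """The diagonal enumeration above: a bijection from pairs of
    positive integers to the positive integers."""
    k = i + j - 1                      # (i, j) lies on the k-th diagonal
    if k % 2 == 1:                     # odd diagonals run one way...
        return (k - 1) * k // 2 + 1 + (j - 1)
    return k * (k + 1) // 2 - (j - 1)  # ...even diagonals the other

# spot checks consistent with the example given further below:
assert [phi(1, 1), phi(1, 2), phi(2, 1), phi(3, 1), phi(2, 2), phi(1, 3)] \
       == [1, 2, 3, 4, 5, 6]

# bijectivity on the first 29 diagonals:
values = [phi(i, j) for i in range(1, 30) for j in range(1, 30) if i + j <= 30]
assert sorted(values) == list(range(1, len(values) + 1))
```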

Here is a different bijection; it is a fun exercise to come up with the relevant formulas:


Now let’s give the map between \Pi^{\infty}_{i=1} C_i and C . Let \{ y_i \} \in C and denote the elements of \Pi^{\infty}_{i=1} C_i by \{ x^i_j \} where \{ x_1^1, x_2^1, x_3^1, ....\} \in C_1, \{x_1^2, x_2^2, x_3^2, ....\} \in C_2, ....\{x_1^k, x_2^k, .....\} \in C_k ... .

We now describe a map f: C \rightarrow \Pi^{\infty}_{i=1} C_i by

f(\{y_i \}) = \{ x^i_j \} = \{y_{\phi(i,j)} \}

Example: x^1_1 = y_1, x^1_2 = y_2, x^2_1 = y_3, x^3_1 =y_4, x^2_2 = y_5, x^1_3 =y_6,...

That this is a bijection between compact Hausdorff spaces is immediate. If we show that f^{-1} is continuous, we will have shown that f is a homeomorphism.

But that isn’t hard to do. Let U \subset C be open; U = U_1 \times U_2 \times U_3.... \times U_{m-1} \times \Pi^{\infty}_{k=m} D_k . Then there is some k_m for which \phi(k_m, 1) \geq m . Then if f^i_j denotes the (i,j) component of f , we see that for all i+j \geq k_m+1, f^i_j(U) = D (these are entries on or below the diagonal containing (k_m,1) , depending on whether k_m is even or odd).

So f(U) is of the form V_1 \times V_2 \times ....V_{k_m} \times \Pi^{\infty}_{i = k_m +1} C_i where each V_j is open in C_j . This is an open set in the product topology of \Pi^{\infty}_{i=1} C_i so this shows that f^{-1} is continuous. Therefore f^{-1} is a homeomorphism, therefore so is f.

Ok, what does this have to do with Cantor Sets and Cantor Spaces?

If you know what the “middle thirds” Cantor Set is, I urge you to stop reading now and prove that that Cantor set is indeed homeomorphic to C as we have described it. I’ll give this quote from Willard, page 121 (hardback edition), section 17.9 in Chapter 6:

The proof is left as an exercise. You should do it if you think you can’t, since it will teach you a lot about product spaces.

What I will do is give a geometric description of a Cantor set and show that this description, which easily includes the “deleted interval” Cantor sets that are used in analysis courses, is homeomorphic to C .

Set up
I’ll call this set F and describe it as follows:

F \subset R^n (for those interested in the topology of manifolds this poses no restrictions since any manifold embeds in R^n for sufficiently high n ).

Reminder: the diameter of a set F \subset R^n will be lub \{ d(x,y) | x, y \in F \}
Let \epsilon_1, \epsilon_2, \epsilon_3 .....\epsilon_k ... be a strictly decreasing sequence of positive real numbers such that \epsilon_k \rightarrow 0 .

Let F^0 be some closed n-ball in R^n (that is, F^0 is a subset homeomorphic to a closed n-ball; we will use that convention throughout).

Let F^1_{(0) }, F^1_{(2)} be two disjoint closed n-balls in the interior of F^0 , each of which has diameter less than \epsilon_1 .

F^1 = F^1_{(0) } \cup F^1_{(2)}

Let F^2_{(0, 0)}, F^2_{(0,2)} be disjoint closed n-balls in the interior of F^1_{(0) } and F^2_{(2,0)}, F^2_{(2,2)} be disjoint closed n-balls in the interior of F^1_{(2) } , each of which (all 4 balls) has diameter less than \epsilon_2 . Let F^2 = F^2_{(0, 0)}\cup F^2_{(0,2)} \cup F^2_{(2, 0)} \cup F^2_{(2,2)}


To describe the construction inductively we will use a bit of notation: a_i \in \{0, 2 \} for all i \in \{1, 2, ...\} and \{a_i \} will represent an infinite sequence of such a_i .
Now if F^k has been defined, we let F^{k+1}_{(a_1, a_2, ...a_{k}, 0)} and F^{k+1}_{(a_1, a_2,....,a_{k}, 2)} be disjoint closed n-balls of diameter less than \epsilon_{k+1} which lie in the interior of F^k_{(a_1, a_2,....a_k) } . Note that F^{k+1} consists of 2^{k+1} disjoint closed n-balls.

Now let F = \cap^{\infty}_{i=1} F^i . Since these are compact sets with the finite intersection property (\cap^{m}_{i=1}F^i = F^m \neq \emptyset for all m ), F is nonempty and compact. Now for any choice of sequence \{a_i \} we have F_{ \{a_i \} } =\cap^{\infty}_{i=1} F^i_{(a_1, ...a_i)} is nonempty by the finite intersection property. On the other hand, if x, y \in F, x \neq y then d(x,y) = \delta > 0 so choose \epsilon_m such that \epsilon_m < \delta . Then x, y lie in different components of F^m since the diameters of these components are less than \epsilon_m .

Then we can say that the sets F_{ \{a_i \} } uniquely define the points of F . We can call such points x_{ \{a_i \} } .

Note: in the subspace topology, the F^k_{(a_1, a_2, ...a_k)} are open sets, as well as being closed.

Finding a homeomorphism from F to C .
Let f: F \rightarrow C be defined by f( x_{ \{a_i \} } ) = \{a_i \} . This is a bijection. To show continuity: consider the open set U =  y_1 \times y_2 ....\times y_m \times \Pi^{\infty}_{i=m+1} D_i . Under f this pulls back to the open set (in the subspace topology) F^{m+1}_{(y_1, y_2, ...y_m, 0 ) } \cup F^{m+1}_{(y_1, y_2, ...y_m, 2)} , hence f is continuous. Because F is compact and C is Hausdorff, f is a homeomorphism.

This ends part I.

We have shown that the Cantor sets defined geometrically and defined via “deleted intervals” are homeomorphic to C . What we have not shown is the following:

Let X be a compact Hausdorff space which is dense in itself (every point is a limit point) and totally disconnected (components are one point sets). Then X is homeomorphic to C . That will be part II.

