College Math Teaching

October 11, 2016

The bias we have toward the rational numbers

Filed under: analysis, Measure Theory — collegemathteaching @ 5:39 pm

A brilliant scientist (full tenure at the University of Chicago) has a website called “Why Evolution is True”. He wrote an article titled “why is pi irrational” and seemed to be under the impression that being “irrational” was somehow special or unusual.

That is an easy impression to have; after all, in almost every example we use rationals, or sometimes special irrationals (e. g. multiples of pi, powers of e, square roots, etc.).

We even condition our students to think that way. Time and time again, I’ve seen questions such as “if f(.9) = .94, f(.95) = .9790, f(1.01) = 1.043 , then it is reasonable to conclude that f(1) = \underline{\ \ \ } ”. It is as if we want students to think that functions take integers to integers.

The reality is that the set of rationals has measure zero on the real line, so if one were to randomly select a number from the real line and the selection was truly random, the probability of the number being rational would be zero!

So, it would be far, far stranger had “pi” turned out to be rational. But that just sounds so strange.

So, why do the rationals have measure zero? I dealt with that in a more rigorous way elsewhere (and it is basic analysis) but I’ll give a simplified proof.

The set of rationals is countable, so one can label all of them as q(n), n \in \{0, 1, 2, ... \} . Now consider the following covering of the rational numbers: U_n = (q(n) - \frac{1}{2^{n+1}}, q(n) + \frac{1}{2^{n+1}}) . The length of each open interval is \frac{1}{2^n} . Of course there will be overlapping intervals, but that isn’t important. What is important is that if one sums the lengths one gets \sum^{\infty}_{n = 0} \frac{1}{2^n} = \frac{1}{1-\frac{1}{2}} = 2 . So the rationals can be covered by a collection of open sets whose total length is less than or equal to 2.

But there is nothing special about 2; one can then find new coverings: U_n = (q(n) - \frac{\epsilon}{2^{n+1}}, q(n) + \frac{\epsilon}{2^{n+1}}) and the total length is now less than or equal to 2 \epsilon where \epsilon is any positive real number. Since \epsilon can be made as small as we please, the set of rationals is said to have measure zero.
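Here is a numerical illustration of the covering (the script and the particular enumeration of the rationals in (0,1) by increasing denominator are mine; any enumeration of the rationals works just as well):

```python
from fractions import Fraction
from itertools import islice

def rationals():
    """Enumerate the rationals in (0, 1) by increasing denominator (one ordering of many)."""
    seen = set()
    q = 2
    while True:
        for p in range(1, q):
            r = Fraction(p, q)
            if r not in seen:
                seen.add(r)
                yield r
        q += 1

def cover(eps, n_terms):
    """Intervals U_n = (q(n) - eps/2^(n+1), q(n) + eps/2^(n+1)) and their total length."""
    centers = list(islice(rationals(), n_terms))
    intervals = [(float(r) - eps / 2**(n + 1), float(r) + eps / 2**(n + 1))
                 for n, r in enumerate(centers)]
    total = sum(b - a for a, b in intervals)
    return intervals, total

intervals, total = cover(0.001, 60)
print(total)  # 2 * eps * (1 - 2^-60): just under 0.002
```

No matter how many rationals we cover, the total length never exceeds 2 \epsilon .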

June 7, 2016

Pop-math: getting it wrong but being close enough to give the public a feel for it

Space filling curves: for now, we’ll just work on continuous functions f: [0,1] \rightarrow [0,1] \times [0,1] \subset R^2 .

A curve is typically defined as a continuous function f: [0,1] \rightarrow M where M is, say, a manifold (a second countable metric space in which every point has a neighborhood homeomorphic to R^k , or to the half space R^k_{+} if we allow boundary points). Note: though we often think of smooth or piecewise linear curves, we don’t have to do so. Also, we can allow for self-intersections.

However, if we don’t put restrictions such as these, weird things can happen. It can be shown (and the video suggests a construction, which is correct) that there exists a continuous, ONTO function f: [0,1] \rightarrow [0,1] \times [0,1] ; such a gadget is called a space filling curve.

It follows from elementary topology that such an f cannot be one to one: if it were, then since the domain is compact and the range is Hausdorff, f would have to be a homeomorphism. But the respective spaces are not homeomorphic. For example: the closed interval is disconnected by the removal of any non-end point, whereas the closed square has no such separating point.

Therefore, if f is a space filling curve, the inverse image of some points must contain more than one point, so the inverse (as a function) cannot be defined.

And THAT is where this article and video go off the rails; though, practically speaking, one can approximate the space filling curve as closely as one pleases by an embedded curve (one that IS one to one) and therefore snake the curve through any desired number of points (pixels?).

So, enjoy the video which I got from here (and yes, the text of this post has the aforementioned error)

January 26, 2016

More Fun with Divergent Series: redefining series convergence (Cesàro, etc.)

Filed under: analysis, calculus, sequences, series — collegemathteaching @ 10:21 pm

This post is more designed to entertain myself than anything else. It builds on a previous post which talks about deleting enough terms from a divergent series to make it a convergent one.

This post is inspired by Chapter 8 of Konrad Knopp’s classic Theory and Application of Infinite Series. The title of the chapter is Divergent Series.

Notation: when I talk about a series converging, I mean “converging” in the usual sense; that is, if s_n = \sum_{k=0}^{k=n} a_k and lim_{n \rightarrow \infty}s_n = s then \sum_{k=0}^{\infty} a_k is said to be convergent with sum s .

All of this makes sense since things like limits are carefully defined. But as Knopp points out, in the “days of old”, mathematicians saw these series as formal objects rather than the result of careful construction. So some of these mathematicians (like Euler) had no problem saying things like \sum^{\infty}_{k=0} (-1)^k = 1-1+1-1+1..... = \frac{1}{2} . Now this is complete nonsense by our usual modern definition. But we might note that \frac{1}{1-x} = \sum^{\infty}_{k=0} x^k for -1 < x < 1 and note that x = -1 IS in the domain of the left hand side.

So, is there a way of redefining the meaning of “infinite sum” that gives us this result, while not changing the value of convergent series (defined in the standard way)? As Knopp points out in his book, the answer is “yes” and he describes several definitions of summation that

1. Do not change the value of an infinite sum that converges in the traditional sense and
2. Allow for more series to converge.

We’ll discuss one of these methods, commonly referred to as Cesàro summation. There are ways to generalize this.

How this came about

Consider the Euler example: 1 -1 + 1 -1 + 1 -1...... . Clearly, s_{2k} = 1, s_{2k+1} = 0 , and so this geometric series diverges. But notice that the arithmetic average of the partial sums, computed as c_n = \frac{s_0 + s_1 +...+s_n}{n+1} , does tend to \frac{1}{2} as n tends to infinity: c_{2n} = \frac{n+1}{2n+1} whereas c_{2n+1} = \frac{n+1}{2n+2} , and both of these quantities tend to \frac{1}{2} as n tends to infinity.

So, we need to see that this method of summing is workable; that is, do infinite sums that converge in the previous sense still converge to the same number with this method?

The answer is, of course, yes. Here is how to see this: Let x_n be a sequence that converges to zero. Then for any \epsilon > 0 we can find M such that k \geq M implies that |x_k| < \epsilon . So for n > M we have \frac{x_1 + x_2 + ...+ x_{M-1} + x_M + ...+ x_n}{n} = \frac{x_1+ ...+x_{M-1}}{n} + \frac{x_M + x_{M+1} + ...+x_n}{n} . Because M is fixed, the first fraction tends to zero as n tends to infinity. The second fraction is smaller than \epsilon in absolute value. But \epsilon is arbitrary; hence the arithmetic averages of this null sequence themselves form a null sequence.

Now let x_n \rightarrow L and let c_n = \frac{x_1 + x_2 + ...+ x_n}{n} . Note that c_n - L = \frac{(x_1-L) + (x_2-L) + ...+ (x_n-L)}{n} , and x_n - L forms a null sequence; hence so does c_n - L .
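Here is the computation in code (a quick script of mine, run on Grandi’s series and on a convergent geometric series):

```python
def cesaro_means(terms):
    """Arithmetic averages c_n of the partial sums s_0, ..., s_n."""
    s, total, means = 0.0, 0.0, []
    for n, a in enumerate(terms):
        s += a          # s_n
        total += s      # s_0 + ... + s_n
        means.append(total / (n + 1))
    return means

# Grandi's series 1 - 1 + 1 - 1 + ...: the partial sums oscillate between 1 and 0,
# but the Cesaro means tend to 1/2
grandi = [(-1)**k for k in range(10000)]
print(cesaro_means(grandi)[-1])  # 0.5

# a series that converges in the ordinary sense keeps its sum (here, 2):
geometric = [(1 / 2)**k for k in range(10000)]
print(cesaro_means(geometric)[-1])  # close to 2 (the error is roughly 2/n)
```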

Now to be useful, we’d have to show that series that are summable in the Cesàro sense obey things like the multiplicative laws; they do, but I am too lazy to show that. See the Knopp book.

I will mention a couple of interesting (to me) things though. Neither is really profound.

1. If a series diverges to infinity (that is, if for any positive M there exists n such that for all k \geq n, s_k > M ), then this series is NOT Cesàro summable. It is relatively easy to see why: given such an M and k , consider \frac{s_1 + s_2 + s_3 + ...+s_{k-1} + s_k + s_{k+1} + ...s_n}{n} = \frac{s_1+ s_2 + ...+s_{k-1}}{n} + \frac{s_k + s_{k+1} .....+s_{n}}{n} ; the second fraction is greater than \frac{n-k}{n} M for large n . Hence the Cesàro means become unbounded.

Upshot: there is no hope in making something like \sum^{\infty}_{n=1} \frac{1}{n} into a convergent series by this method. Now there is a way of making an alternating, divergent series into a convergent one via doing something like a “double Cesàro sum” (take arithmetic averages of the arithmetic averages) but that is a topic for another post.

2. Cesàro summation may speed up convergence of an alternating series which passes the alternating series test, OR it might slow it down. I’ll have to develop this idea more fully. But I invite the reader to try Cesàro summation for \sum^{\infty}_{k=1} (-1)^{k+1} \frac{1}{k} and on \sum^{\infty}_{k=1} (-1)^{k+1} \frac{1}{k^2} and on \sum^{\infty}_{k=0} (-1)^k \frac{1}{2^k} . In the first two cases, the series converge slowly enough that Cesàro summation speeds up convergence. Cesàro summation slows down the convergence of the geometric series, though. It is interesting to ponder why.
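Here is one way to run that experiment (my own quick script; the limits of the three series are ln 2 , \pi^2/12 and \frac{2}{3} respectively):

```python
import math

def partial_sums(terms):
    """Running partial sums of a list of terms."""
    s, out = 0.0, []
    for a in terms:
        s += a
        out.append(s)
    return out

def cesaro(terms):
    """Cesaro means (arithmetic averages of the partial sums)."""
    total, out = 0.0, []
    for n, s in enumerate(partial_sums(terms)):
        total += s
        out.append(total / (n + 1))
    return out

N = 2000
alt_harmonic = [(-1)**(k + 1) / k for k in range(1, N + 1)]     # sums to ln 2
alt_squares  = [(-1)**(k + 1) / k**2 for k in range(1, N + 1)]  # sums to pi^2/12
alt_geom     = [(-1)**k / 2**k for k in range(N)]               # sums to 2/3

for name, terms, limit in [("1/k", alt_harmonic, math.log(2)),
                           ("1/k^2", alt_squares, math.pi**2 / 12),
                           ("1/2^k", alt_geom, 2 / 3)]:
    s_err = abs(partial_sums(terms)[-1] - limit)
    c_err = abs(cesaro(terms)[-1] - limit)
    print(f"{name}: ordinary error {s_err:.1e}, Cesaro error {c_err:.1e}")
```

For the geometric series the ordinary partial sums converge so fast that the Cesàro means, which drag along the early partial sums, clearly lose.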

January 6, 2016

On all but a set of measure zero

Filed under: analysis, physics, popular mathematics, probability — collegemathteaching @ 7:36 pm

This blog isn’t about cosmology or about arguments over religion. But it is unusual to hear “on all but a set of measure zero” in the middle of a pop-science talk: (2:40-2:50)

May 11, 2015

The hypervolume of the n-ball enclosed by a standard n-1 sphere

I am always looking for interesting calculus problems to demonstrate various concepts and perhaps generate some interest in pure mathematics.
And yes, I like to “blow off some steam” by spending some time having some non-technical mathematical fun with elementary mathematics.

This post uses only:

1. Integration by parts and basic reduction formulas.
2. Trig substitution.
3. Calculation of volumes (and hyper volumes) by the method of cross sections.
4. Induction
5. Elementary arithmetic involving factorials.

The quest: find a formula that finds the (hyper)volume of the region \{(x_1, x_2, x_3,....x_k) | \sum_{i=1}^k x_i^2 \leq R^2 \} \subset R^k

We will assume that the usual tools of calculus work as advertised.

Start. If we denote the (hyper)volume of the k-ball by V_k , we will start with the assumption that V_1 = 2R ; that is, the distance between the endpoints of [-R,R] is 2R .

Step 1: we show, via induction, that V_k =c_kR^k where c_k is a constant and R is the radius.

Our proof will be inefficient for instructional purposes.

We know that V_1 = 2R , hence the induction hypothesis holds for the first case and c_1 = 2 . We now work through the k = 2 case because, for the beginner, the technique will be easier to follow further along if we see it here first.

Yes, I know that you know that V_2 = \pi R^2 and you’ve seen many demonstrations of this fact. Here is another: let’s calculate this using the method of “area by cross sections”. Here is x^2 + y^2 = R^2 with some y = c cross sections drawn in.


Now do the calculation by integrals: we will use symmetry, do the upper half only, and multiply our result by 2. At each y = y_c level, call the radius from the center line to the circle R(y) , so the total length of the “y is constant” cross section is 2R(y) , and we multiply by the thickness dy to obtain V_2 = 4 \int^{y=R}_{y=0} R(y) dy .

But remember that the curve in question is x^2 + y^2 = R^2 and so if we set x = R(y) we have R(y) = \sqrt{R^2 -y^2} and so our integral is 4 \int^{y=R}_{y=0}\sqrt{R^2 -y^2}  dy

Now this integral is no big deal. But HOW we solve it will help us down the road. So here, we use the change of variable (aka “trigonometric substitution”): y = Rsin(t), dy = Rcos(t) dt to change the integral to:

4 \int^{\frac{\pi}{2}}_0 R^2 cos^2(t) dt = 4R^2 \int^{\frac{\pi}{2}}_0  cos^2(t) dt therefore

V_2 = c_2 R^2 where:

c_2 = 4\int^{\frac{\pi}{2}}_0 cos^2(t) dt

Yes, I know that this is an easy integral to solve, but I first presented the result this way in order to make a point.

Of course, c_2 = 4\int^{\frac{\pi}{2}}_0 cos^2(t) dt = 4\int^{\frac{\pi}{2}}_0 (\frac{1}{2} + \frac{1}{2}cos(2t)) dt = \pi

Therefore, V_2 =\pi R^2 as expected.
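If you like, the integral can also be checked numerically with a homemade midpoint rule (a quick sketch of mine, not part of the derivation):

```python
import math

def midpoint_integral(f, a, b, n=100000):
    """Composite midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

c2 = 4 * midpoint_integral(lambda t: math.cos(t)**2, 0.0, math.pi / 2)
print(c2, math.pi)  # the two agree to many decimal places
```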

Exercise for those seeing this for the first time: compute c_3 and V_3 by using the above methods.

Inductive step: Assume V_k = c_kR^k Now calculate using the method of cross sections above (and here we move away from x-y coordinates to more general labeling):

V_{k+1} = 2\int^R_0 V_k(R(x_{k+1})) dx_{k+1} = 2 \int^R_0 c_k (R(x_{k+1}))^k dx_{k+1} = c_k 2\int^R_0 (R(x_{k+1}))^k dx_{k+1}

Now we do the substitutions: first of all, we note that x_1^2 + x_2^2 + ...x_{k}^2 + x_{k+1}^2 = R^2 and so

x_1^2 + x_2^2 ....+x_k^2 = R^2 - x_{k+1}^2 . Now for the key observation: x_1^2 + x_2^2 ..+x_k^2 = (R(x_{k+1}))^2 and so R(x_{k+1}) = \sqrt{R^2 - x_{k+1}^2}

Now use the induction hypothesis to note:

V_{k+1} = c_k 2\int^R_0 (R^2 - x_{k+1}^2)^{\frac{k}{2}} dx_{k+1}

Now do the substitution x_{k+1} = Rsin(t), dx_{k+1} = Rcos(t)dt and the integral is now:

V_{k+1} = c_k 2\int^{\frac{\pi}{2}}_0 R^{k+1} cos^{k+1}(t) dt = c_k(2 \int^{\frac{\pi}{2}}_0 cos^{k+1}(t) dt)R^{k+1} which is what we needed to show.

In fact, we have shown a bit more. We’ve shown that c_1 = 2 =2 \int^{\frac{\pi}{2}}_0(cos(t))dt, c_2 = 2 \cdot 2\int^{\frac{\pi}{2}}_0 cos^2(t) dt = c_1 2\int^{\frac{\pi}{2}}_0 cos^2(t) dt and, in general,

c_{k+1} = c_k (2 \int^{\frac{\pi}{2}}_0 cos^{k+1}(t) dt) = 2^{k+1} \int^{\frac{\pi}{2}}_0(cos^{k+1}(t))dt \int^{\frac{\pi}{2}}_0(cos^{k}(t))dt \int^{\frac{\pi}{2}}_0(cos^{k-1}(t))dt .....\int^{\frac{\pi}{2}}_0(cos(t))dt

Finishing the formula

We now need to calculate these easy calculus integrals: in this case the reduction formula:

\int cos^n(x) dx = \frac{1}{n}cos^{n-1}(x)sin(x) + \frac{n-1}{n} \int cos^{n-2}(x) dx is useful (it is merely integration by parts). Now use the limits and an elementary calculation to obtain:

\int^{\frac{\pi}{2}}_0 cos^n(x) dx = \frac{n-1}{n} \int^{\frac{\pi}{2}}_0 cos^{n-2}(x)dx to obtain:

\int^{\frac{\pi}{2}}_0 cos^n(x) dx = (\frac{n-1}{n})(\frac{n-3}{n-2})......(\frac{3}{4})\frac{\pi}{4} if n is even and:
\int^{\frac{\pi}{2}}_0 cos^n(x) dx = (\frac{n-1}{n})(\frac{n-3}{n-2})......(\frac{4}{5})\frac{2}{3} if n is odd.

Now to come up with something resembling a closed formula let’s experiment and do some calculation:

Note that c_1 = 2, c_2 = \pi, c_3 = \frac{4 \pi}{3}, c_4 = \frac{\pi^2}{2}, c_5 = \frac{2^3 \pi^2}{3 \cdot 5} = \frac{8 \pi^2}{15}, c_6 = \frac{\pi^3}{3 \cdot 2} = \frac{\pi^3}{6} .
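These small values can be reproduced exactly from the recursion c_{k+1} = c_k \cdot 2\int^{\frac{\pi}{2}}_0 cos^{k+1}(t) dt and the reduction formula. Here is a sketch of mine that tracks each c_k as a rational multiple of a power of \pi (the (rational, power-of-pi) representation is my own bookkeeping choice):

```python
from fractions import Fraction

def wallis(n):
    """Exact value of int_0^{pi/2} cos^n(t) dt, returned as (rational, power of pi)."""
    if n == 0:
        return (Fraction(1, 2), 1)   # pi / 2
    if n == 1:
        return (Fraction(1), 0)      # 1
    r, p = wallis(n - 2)             # reduction formula: (n-1)/n times the n-2 integral
    return (Fraction(n - 1, n) * r, p)

def c(k):
    """The constant in V_k = c_k R^k, from c_1 = 2 and c_j = c_{j-1} * 2 * wallis(j)."""
    coeff, p = Fraction(2), 0
    for j in range(2, k + 1):
        r, q = wallis(j)
        coeff *= 2 * r
        p += q
    return coeff, p                  # c_k = coeff * pi**p

print(c(2))  # (Fraction(1, 1), 1)   -> pi
print(c(5))  # (Fraction(8, 15), 2)  -> 8*pi^2/15
print(c(6))  # (Fraction(1, 6), 3)   -> pi^3/6
```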

So we can make the inductive conjecture that c_{2k} = \frac{\pi^k}{k!} and see how it holds up: c_{2k+2} = 2^2 \int^{\frac{\pi}{2}}_0(cos^{2k+2}(t))dt \int^{\frac{\pi}{2}}_0(cos^{2k+1}(t))dt \frac{\pi^k}{k!}

= 2^2 ((\frac{2k+1}{2k+2})(\frac{2k-1}{2k})......(\frac{3}{4})\frac{\pi}{4})((\frac{2k}{2k+1})(\frac{2k-2}{2k-1})......\frac{2}{3})\frac{\pi^k}{k!}

Now notice the telescoping effect of the fractions from the c_{2k+1} factor. All factors cancel except for the (2k+2) in the first denominator and the 2 in the first numerator, as well as the \frac{\pi}{4} factor. This leads to:

c_{2k+2} = 2^2(\frac{\pi}{4})\frac{2}{2k+2} \frac{\pi^k}{k!} = \frac{\pi^{k+1}}{(k+1)!} as required.

Now we need to calculate c_{2k+1} = 2\int^{\frac{\pi}{2}}_0(cos^{2k+1}(t))dt c_{2k} = 2\int^{\frac{\pi}{2}}_0(cos^{2k+1}(t))dt \frac{\pi^k}{k!}

= 2 (\frac{2k}{2k+1})(\frac{2k-2}{2k-1})......(\frac{4}{5})(\frac{2}{3})\frac{\pi^k}{k!} = 2 \frac{(2k)(2k-2)(2k-4)..(4)(2)}{(2k+1)(2k-1)...(5)(3)} \frac{\pi^k}{k!}

To simplify this further: split up the factors of the k! in the denominator and put one between each denominator factor:

= 2 \frac{(2k)(2k-2)(2k-4)..(4)(2)}{(2k+1)(k)(2k-1)(k-1)...(5)(2)(3)(1)} \pi^k . Now multiply the denominator by 2^k , pairing one factor of 2 with each factor coming from the k! ; to compensate, multiply the numerator by 2^k as well to obtain:

(2) 2^k \frac{(2k)(2k-2)(2k-4)..(4)(2)}{(2k+1)(2k)(2k-1)(2k-2)...(6)(5)(4)(3)(2)} \pi^k . Now the denominator is (2k+1)! , and factoring a 2 out of each of the k even factors in the numerator shows the numerator is 2^k k! :

= (2) 2^k 2^k \pi^k \frac{k!}{(2k+1)!} = 2 \frac{(4 \pi)^k k!}{(2k+1)!} which is the required formula.

So to summarize:

V_{2k} = \frac{\pi^k}{k!} R^{2k}

V_{2k+1}= \frac{2 k! (4 \pi)^k}{(2k+1)!}R^{2k+1}

Note the following: lim_{k \rightarrow \infty} c_{k} = 0 . If this seems strange at first, think of it this way: imagine the n-ball being “inscribed” in an n-cube which has hyper volume (2R)^n . Then consider the ratio \frac{2^n R^n}{c_n R^n} = 2^n \frac{1}{c_n} ; that is, the n-ball holds a smaller and smaller percentage of the hyper volume of the n-cube that it is inscribed in; note the 2^n corresponds to the number of corners in the n-cube. One might see that the rounding gets more severe as the number of dimensions increases.

One also notes that for fixed radius R, lim_{n \rightarrow \infty} V_n = 0 as well.
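Here is a quick script (mine) that evaluates the closed formulas and watches the ball-to-cube volume ratio collapse:

```python
import math

def c_even(k):
    """c_{2k} = pi^k / k!"""
    return math.pi**k / math.factorial(k)

def c_odd(k):
    """c_{2k+1} = 2 * k! * (4 pi)^k / (2k+1)!"""
    return 2 * math.factorial(k) * (4 * math.pi)**k / math.factorial(2 * k + 1)

print(round(c_even(1), 6))  # c_2 = pi     -> 3.141593
print(round(c_odd(1), 6))   # c_3 = 4pi/3  -> 4.18879

# fraction of the circumscribing n-cube (side 2R, R = 1) filled by the n-ball:
for n in [2, 6, 10, 20]:
    c_n = c_even(n // 2) if n % 2 == 0 else c_odd(n // 2)
    print(n, c_n / 2**n)  # shrinks rapidly with the dimension
```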

There are other interesting aspects to this limit: for what dimension n does the maximum hypervolume occur? As you might expect: this depends on the radius involved; a quick glance at the hyper volume formulas will show why. For more on this topic, including an interesting discussion on this limit itself, see Dave Richardson’s blog Division by Zero. Note: his approach to finding the hyper volume formula is also elementary but uses polar coordinate integration as opposed to the method of cross sections.

April 10, 2015

Cantor sets and countable products of discrete spaces (0, 1)^Z

Filed under: advanced mathematics, analysis, point set topology, sequences — collegemathteaching @ 11:09 am

This might seem like a strange topic but right now our topology class is studying compact metric spaces. One of the more famous of these is the “Cantor set” or “Cantor space”. I discussed the basics of these here.

Now if you know the relationship between a countable product of two point discrete spaces (in the product topology) and Cantor spaces/Cantor Sets, this post is probably too elementary for you.

Construction: start with a two point set D = \{0, 2 \} and give it the discrete topology. The reason for choosing 0 and 2 to represent the elements will become clear later. Of course, D is a compact metric space (with the discrete metric: d(x,y) = 1 if x \neq y , d(x,x) = 0 ).

Now consider the infinite product of such spaces with the product topology: C = \Pi^{\infty}_{i=1} D_i where each D_i is homeomorphic to D . It follows from the Tychonoff Theorem that C is compact. We can also sketch a direct argument: any open cover of C can be refined to a cover by basic open sets, each of the form U = O_1 \times O_2 \times ....O_k \times (\Pi_{i=k+1}^{\infty} D_i ) where each O_i is a one or two point set. Given one such U , the complement C - U is contained in the union of at most 2^k - 1 of the cylinder sets determined by the first k coordinates, and repeating the argument within each of these cylinders (a König’s lemma style argument shows the process terminates) yields a finite subcover.

Now let’s examine some properties.

Clearly the space is Hausdorff (T_2 ) and uncountable.

1. Every point of C is a limit point of C . To see this: denote x \in C by the sequence \{x_i \} where x_i \in \{0, 2 \} . Then any open set containing \{ x_i \} contains a basic open set of the form O_1 \times O_2 \times...O_k \times \Pi^{\infty}_{i=k+1} D_i , which contains ALL points y = \{ y_i \} with y_i = x_i for i \in \{1, 2, ...k \} . So all points of C are accumulation points of C ; in fact they are condensation points (or perfect limit points).

(Refresher: accumulation points are those for which every open neighborhood contains infinitely many points of the set in question; condensation points are those for which every open neighborhood contains uncountably many points of the set; and perfect limit points are those for which every open neighborhood contains as many points as the set in question has (the same cardinality).)

2. C is totally disconnected (the components are one point sets). Here is how we will show this: given x, y \in C, x \neq y , there exist disjoint open sets U_x, U_y, x \in U_x, y \in U_y, U_x \cup U_y = C . Proof of claim: if x \neq y there exists a first coordinate k for which x_k \neq y_k (that is, a first k for which the canonical projection maps disagree: \pi_k(x) \neq \pi_k(y) ). Then
U_x = D_1 \times D_2 \times ....\times D_{k-1} \times x_k \times \Pi^{\infty}_{i=k+1} D_i,

U_y = D_1 \times D_2 \times.....\times D_{k-1} \times y_k \times \Pi^{\infty}_{i = k+1} D_i

are the required disjoint open sets whose union is all of C .

3. C is second countable: the basic open sets, which are specified by finite sequences of 0’s and 2’s in the first few coordinates followed by the full product of the remaining D_i , form a countable basis.

4. C is metrizable as well: d(x,y) = \sum^{\infty}_{i=1} \frac{|x_i - y_i|}{3^i} . Note that this metric separates points: suppose x \neq y and let k be the first index with x_k \neq y_k , so |x_k - y_k| = 2 . Since every term of the sum is nonnegative,

d(x,y) \geq \frac{|x_k - y_k|}{3^k} = \frac{2}{3^k} > 0

so distinct points are a positive distance apart.

5. C is uncountable; this follows, for example, from the fact that C is compact, Hausdorff and dense in itself.

6. C \times C is homeomorphic to C . The homeomorphism is given by f( \{x_i \}, \{y_i \}) = \{ x_1, y_1, x_2, y_2,... \} \in C . It follows that C is homeomorphic to a finite product with itself (product topology). Here we use the fact that if f: X \rightarrow Y is a continuous bijection with X compact and Y Hausdorff then f is a homeomorphism.
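Item 6’s interleaving map is easy to play with on finite prefixes (a sketch in code; the names are mine, and of course actual points of C are infinite sequences):

```python
def interleave(x, y):
    """The map f((x_i), (y_i)) = (x_1, y_1, x_2, y_2, ...), on finite prefixes."""
    out = []
    for a, b in zip(x, y):
        out.extend([a, b])
    return out

def uninterleave(z):
    """The inverse: even-position entries give back x, odd-position entries give back y."""
    return list(z[0::2]), list(z[1::2])

x, y = [0, 2, 2, 0], [2, 2, 0, 0]
z = interleave(x, y)
print(z)  # [0, 2, 2, 2, 2, 0, 0, 0]
assert uninterleave(z) == (x, y)
```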

Now we can say a bit more: if C_i is a copy of C then \Pi^{\infty}_{i =1 } C_i is homeomorphic to C . This will follow from subsequent work, but we can prove this right now, provided we review some basic facts about countable products and counting.

First let’s show that there is a bijection between Z^{+} \times Z^{+} and Z^{+} (the positive integers). A bijection is suggested by this diagram:


which has the following formula (coming up with it is fun; it uses the fact that \sum^k _{n=1} n = \frac{k(k+1)}{2} ):

\phi(k,1) = \frac{(k)(k+1)}{2} for k even
\phi(k,1) = \frac{(k-1)(k)}{2} + 1 for k odd
\phi(k-j, j+1) =\phi(k,1) + j for k odd, j \in \{1, 2, ...k-1 \}
\phi(k-j, j+1) = \phi(k,1) - j for k even, j \in \{1, 2, ...k-1 \}
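The formulas can be checked by brute force (a quick script of mine; I index a pair (i, j) by its diagonal k = i + j - 1 , which recovers the \phi(k-j, j+1) form used above):

```python
def phi(i, j):
    """Diagonal (back-and-forth) bijection from pairs of positive integers to positive integers."""
    k = i + j - 1                                # (i, j) sits on diagonal k: (i, j) = (k-(j-1), j)
    if k % 2 == 0:
        return k * (k + 1) // 2 - (j - 1)        # phi(k, 1) - (j - 1), stepping down the diagonal
    return (k - 1) * k // 2 + 1 + (j - 1)        # phi(k, 1) + (j - 1), stepping up the diagonal

# the first few values match the example below
print([phi(1, 1), phi(1, 2), phi(2, 1), phi(3, 1), phi(2, 2), phi(1, 3)])  # [1, 2, 3, 4, 5, 6]

# brute-force bijectivity check on the first 29 full diagonals
hits = {phi(i, j) for i in range(1, 31) for j in range(1, 31) if i + j <= 30}
print(hits == set(range(1, 29 * 30 // 2 + 1)))  # True
```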

Here is a different bijection; it is a fun exercise to come up with the relevant formulas:


Now let’s give the map between \Pi^{\infty}_{i=1} C_i and C . Let \{ y_i \} \in C and denote the elements of \Pi^{\infty}_{i=1} C_i by \{ x^i_j \} where \{ x_1^1, x_2^1, x_3^1....\} \in C_1, \{x_1^2, x_2^2, x_3^2, ....\} \in C_2, ....\{x_1^k, x_2^k, .....\} \in C_k ... .

We now describe a map f: C \rightarrow \Pi^{\infty}_{i=1} C_i by

f(\{y_i \}) = \{ x^i_j \} = \{y_{\phi(i,j)} \}

Example: x^1_1 = y_1, x^1_2 = y_2, x^2_1 = y_3, x^3_1 =y_4, x^2_2 = y_5, x^1_3 =y_6,...

That this is a bijection between compact Hausdorff spaces is immediate. If we show that f^{-1} is continuous, we will have shown that f is a homeomorphism.

But that isn’t hard to do. Let U \subset C be a basic open set; U = U_1 \times U_2 \times U_3.... \times U_{m-1} \times \Pi^{\infty}_{k=m} D_k . Then there is some k_m for which \phi(k_m, 1) \geq m . If f^i_j denotes the (i,j) component of f , we see that for all i+j \geq k_m+1, f^i_j(U) = D (these are the entries on or below the diagonal containing (k_m, 1) , depending on whether k_m is even or odd).

So f(U) is of the form V_1 \times V_2 \times ....V_{k_m} \times \Pi^{\infty}_{i = k_m +1} C_i where each V_j is open in C_j . This is an open set in the product topology of \Pi^{\infty}_{i=1} C_i so this shows that f^{-1} is continuous. Therefore f^{-1} is a homeomorphism, therefore so is f.

Ok, what does this have to do with Cantor Sets and Cantor Spaces?

If you know what the “middle thirds” Cantor Set is, I urge you to stop reading now and prove that that Cantor set is indeed homeomorphic to C as we have described it. I’ll give this quote from Willard, page 121 (Hardback edition), section 17.9 in Chapter 6:

The proof is left as an exercise. You should do it if you think you can’t, since it will teach you a lot about product spaces.

What I will do is give a geometric description of a Cantor set and show that this description, which easily includes the “deleted interval” Cantor sets that are used in analysis courses, is homeomorphic to C .

Set up
I’ll call this set F and describe it as follows:

F \subset R^n (for those interested in the topology of manifolds this poses no restrictions since any manifold embeds in R^n for sufficiently high n ).

Reminder: the diameter of a set F \subset R^n will be lub \{ d(x,y) | x, y \in F \}
Let \epsilon_1, \epsilon_2, \epsilon_3 .....\epsilon_k ... be a strictly decreasing sequence of positive real numbers such that \epsilon_k \rightarrow 0 .

Let F^0 be some closed n-ball in R^n (that is, F^0 is a subset homeomorphic to a closed n-ball; we will use that convention throughout).

Let F^1_{(0) }, F^1_{(2)} be two disjoint closed n-balls in the interior of F^0 , each of which has diameter less than \epsilon_1 .

F^1 = F^1_{(0) } \cup F^1_{(2)}

Let F^2_{(0, 0)}, F^2_{(0,2)} be disjoint closed n-balls in the interior of F^1_{(0) } and F^2_{(2,0)}, F^2_{(2,2)} be disjoint closed n-balls in the interior of F^1_{(2) } , each of which (all 4 balls) has diameter less than \epsilon_2 . Let F^2 = F^2_{(0, 0)}\cup F^2_{(0,2)} \cup F^2_{(2, 0)} \cup F^2_{(2,2)}


To describe the construction inductively we will use a bit of notation: a_i \in \{0, 2 \} for all i \in \{1, 2, ...\} and \{a_i \} will represent an infinite sequence of such a_i .
Now if F^k has been defined, we let F^{k+1}_{(a_1, a_2, ...a_{k}, 0)} and F^{k+1}_{(a_1, a_2,....,a_{k}, 2)} be disjoint closed n-balls of diameter less than \epsilon_{k+1} which lie in the interior of F^k_{(a_1, a_2,....a_k) } . Note that F^{k+1} consists of 2^{k+1} disjoint closed n-balls.

Now let F = \cap^{\infty}_{i=1} F^i . Since these are compact sets with the finite intersection property (\cap^{m}_{i=1}F^i = F^m \neq \emptyset for all m ), F is non empty and compact. Now for any choice of sequence \{a_i \} we have F_{ \{a_i \} } =\cap^{\infty}_{i=1} F^i_{(a_1, ...a_i)} is nonempty by the finite intersection property. On the other hand, if x, y \in F, x \neq y then d(x,y) = \delta > 0 , so choose \epsilon_m such that \epsilon_m < \delta . Then x, y lie in different components of F^m since the diameters of these components are less than \epsilon_m .

Then we can say that the F_{ \{a_i \} } uniquely define the points of F . We can call such points x_{ \{a_i \} } .

Note: in the subspace topology, the F^k_{(a_1, a_2, ...a_k)} are open sets, as well as being closed.

Finding a homeomorphism from F to C .
Let f: F \rightarrow C be defined by f( x_{ \{a_i \} } ) = \{a_i \} . This is a bijection. To show continuity: consider the basic open set U = y_1 \times y_2 ....\times y_m \times \Pi^{\infty}_{i=m+1} D_i . Under f this pulls back to the set F \cap (F^{m+1}_{(y_1, y_2, ...y_m, 0 )} \cup F^{m+1}_{(y_1, y_2, ...y_m, 2)}) , which is open in the subspace topology; hence f is continuous. Because F is compact and C is Hausdorff, f is a homeomorphism.

This ends part I.

We have shown that the Cantor sets defined geometrically and defined via “deleted intervals” are homeomorphic to C . What we have not shown is the following:

Let X be a compact Hausdorff space which is dense in itself (every point is a limit point) and totally disconnected (components are one point sets). Then X is homeomorphic to C . That will be part II.

January 16, 2015

Power sets, Function spaces and puzzling notation

I’ll probably be posting point-set topology stuff due to my being excited about teaching the course…finally.

Power sets and exponent notation
If A is a set, then the power set of A , often denoted by 2^A , is a set that consists of all subsets of A .

For example, if A = \{1, 2, 3 \} , then 2^A = \{ \emptyset , \{1 \}, \{ 2 \}, \{3 \}, \{1, 2 \}, \{1,3 \}, \{2, 3 \}, \{1, 2, 3 \} \} . Now it is no surprise that if the set A is finite and has n elements, then 2^A has 2^n elements.

However, there is another helpful way of listing 2^A . A subset of A can be defined by which elements of A that it has. So, if we order the elements of A as 1, 2, 3 then the power set of A can be identified as follows: \emptyset = (0, 0, 0), \{1 \} = (1, 0, 0), \{ 2 \} = (0,1,0), \{ 3 \} = (0, 0, 1), \{1,2 \} = (1, 1, 0), \{1,3 \} = (1, 0, 1), \{2,3 \} = (0, 1, 1), \{1, 2, 3 \} = (1, 1, 1)

So there is a natural correspondence between the elements of a power set and a sequence of binary digits. Of course, this makes the counting much easier.
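The identification is easy to play with in code (a small sketch of mine; the bitmask plays the role of the binary indicator sequence):

```python
def power_set(elements):
    """All subsets of `elements`, one per binary indicator vector (bitmask)."""
    n = len(elements)
    subsets = []
    for mask in range(2**n):
        # bit i of `mask` is the indicator of elements[i]
        subsets.append({e for i, e in enumerate(elements) if mask >> i & 1})
    return subsets

ps = power_set([1, 2, 3])
print(len(ps))       # 8, i.e. 2^3
print({1, 3} in ps)  # True
```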

The binary notation might seem like an unnecessary complication at first, but now consider the power set of the natural numbers: 2^N . Of course, listing the subsets would be, at least, cumbersome if not impossible! But here the binary notation really shows its value. Remember that the binary notation is a sequence of 0’s and 1’s where a 0 in the i’th slot means that the i’th element isn’t in the subset and a 1 means that it is.

Since a subset of the natural numbers is defined by its list of elements, every subset has an infinite binary sequence associated with it. We can order the sequence in the usual order 1, 2, 3, 4, ….
and the sequence 1, 0, 0, 0…… corresponds to the set with just 1 in it, the sequence 1, 0, 1, 0, 1, 0, 1, 0,… corresponds to the set consisting of all odd integers, etc.

Then, of course, one can use Cantor’s Diagonal Argument to show that 2^N is uncountable; in fact, if one uses the fact that every non-negative real number has a binary expansion (possibly infinite), one then shows that 2^N has the same cardinality as the real numbers.
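The diagonal argument itself can be illustrated on finite truncations (my own toy example): flip the diagonal entries of any list of binary sequences and you get a sequence that differs from the i’th row in the i’th slot.

```python
def diagonal_complement(rows):
    """Given a list of 0/1 sequences (row i must have length > i),
    flip the diagonal: the result differs from row i in slot i."""
    return [1 - rows[i][i] for i in range(len(rows))]

rows = [
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 1, 0, 1],
    [1, 0, 0, 1],
]
d = diagonal_complement(rows)
print(d)  # [1, 0, 1, 0] -- not equal to any row on the list
assert all(d[i] != rows[i][i] for i in range(len(rows)))
```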

Power notation
We can expand on this power notation. Remember that 2^A can be thought of as setting up a “slot” or an “index” for each element of A and then assigning a 1 or 0 for every element of A . One can then think of this in an alternate way: 2^A can be thought of as the set of ALL functions from the elements of A to the set \{ 0, 1 \} . This coincides with the “power set” concept, as set membership is determined by being either “in” or “not in”. So, the set in the exponent can be thought of either as the indexing set with the base as the value each indexed slot can take on (sequences, in the case that the exponent set is either finite or countably infinite), OR this can be thought of as the set of all functions where the exponent set is the domain and the base set is the range.

Remember, we are talking about ALL possible functions and not all “continuous” functions, or all “morphisms”, etc.

So, N^N can be thought of as either the set of all possible sequences of positive integers, or, equivalently, the set of all functions from N to N .

Then R^N is the set of all real number sequences (i. e. the types of sequences we study in calculus), or, equivalently, the set of all real valued functions of the positive integers.

Now it is awkward to try to assign an ordering to the reals, so when we consider R^R it is best to think of this as the set of all functions f: R \rightarrow R , or equivalently, the set of all strings which are indexed by the real numbers and have real values.

Note that sequences don’t really seem to capture R^R in the way that they capture, say, R^N . But there is another concept that does, and that concept is the concept of the net, which I will talk about in a subsequent post.

January 9, 2015

Poincare Conjecture and Ricci Flow

My area of research, if you can say that I still have an area of research, is geometric topology. Yes, despite everything, I’ve managed to stay moderately active.

One big development in the past decade and a half is the solution to the Poincare Conjecture and the use of Ricci Flow to solve it (Perelman did the proof).

As far as what the Poincare Conjecture is about:

(If you’ve had some algebraic topology: the Poincare Conjecture says that an object that has the same algebraic information as the 3 dimensional sphere IS the three dimensional sphere, topologically speaking).

Now the proof uses Ricci Flow. Yes, to understand what Ricci flow is about, one has to understand differential geometry. BUT if you’ve had some brush with vector calculus (say, the amount that one teaches in a typical “Calculus III” course), one can get some intuition for this concept here.

Watch the video: it is fun. 🙂

Now when you get to the end, here is what is going on: instead of viewing a space (such as, say, the 2-d sphere) as being embedded in a larger space, one can talk about the space as being intrinsic; that is, not “sitting in” some ambient space. Then every point can be assigned some intrinsic curvature, and Ricci flow works in that setting.

Of course, one CAN always find a space to isometrically embed your space in (Nash embedding theorem) and still pretend that the space is embedded somewhere else; some “first course in differential topology” texts do exactly that.

October 1, 2014

Osgood’s uniqueness theorem for differential equations

I am teaching a numerical analysis class this semester and we just started the section on differential equations. I want them to understand when we can expect to have a solution and when a solution satisfying a given initial condition is going to be unique.

We had gone over the “existence” theorem, which basically says: given y' = f(x,y) and initial condition y(x_0) = y_0 , where (x_0,y_0) \in int(R) for some rectangle R in the x,y plane, if f(x,y) is a continuous function over R , then we are guaranteed to have at least one solution to the differential equation, valid so long as (x, y(x)) stays in R .

I might post a proof of this theorem later; however, an outline of how a proof goes will be useful here. With no loss of generality, assume that x_0 = 0 and the rectangle has the lines x = -a, x = a as vertical boundaries. Let \phi_0(x) = y_0 + f(0, y_0)x , the line through (0, y_0) of slope f(0, y_0) . Now partition the interval [-a, a] into -a, -\frac{a}{2}, 0, \frac{a}{2}, a and create a polygonal path as follows: use slope f(0, y_0) at (0, y_0) , slope f(\frac{a}{2}, y_0 + \frac{a}{2}f(0, y_0)) at (\frac{a}{2}, y_0 +  \frac{a}{2}f(0, y_0)) and so on to the right; reverse this process going left. The idea: we are using Euler’s differential equation approximation method to obtain an initial piecewise approximation. Then do this again for step size \frac{a}{4} , then \frac{a}{8} , and so on.
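The Euler polygon construction can be sketched in a few lines; this is my own illustration, not from the post: the function f , the interval, and the step sizes are chosen just for demonstration (the step-halving and the subsequence extraction are, of course, the analytic part of the proof).

```python
# Sketch of the Euler polygonal construction: starting at (0, y0), repeatedly
# step forward with slope f(x, y) to build one piecewise-linear approximating curve.
def euler_polygon(f, y0, a, h):
    """Return the vertices (x, y) of the Euler polygon on [0, a] with step h."""
    xs, ys = [0.0], [y0]
    while xs[-1] < a:
        x, y = xs[-1], ys[-1]
        xs.append(x + h)
        ys.append(y + h * f(x, y))   # Euler step: follow the slope field
    return list(zip(xs, ys))

# Illustrative example: y' = y, y(0) = 1; halving h gives the successive
# approximation curves used in the proof outline.
for h in (0.5, 0.25, 0.125):
    pts = euler_polygon(lambda x, y: y, 1.0, 1.0, h)
    print(h, round(pts[-1][1], 4))   # approaches e ≈ 2.7183 as h shrinks
```

(Only the forward half of the construction is shown; the post builds the polygon to the left of 0 the same way.)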

In this way, we obtain an infinite family of continuous approximation curves. Because f(x,y) is continuous over R , it is also bounded; hence the curves have slopes whose magnitudes are bounded by some M. Hence this family is equicontinuous (for any given \epsilon one can use \delta = \frac{\epsilon}{M} in continuity arguments, no matter which curve in the family we are talking about). Of course, these curves are uniformly bounded, hence by the Arzela-Ascoli Theorem (not difficult) we can extract a subsequence of these curves which converges to a limit function.

Seeing that this limit function satisfies the differential equation isn’t that hard; if one chooses t, s \in (-a,a) close enough, one shows that | \frac{\phi_k(t) - \phi_k(s)}{t-s} - f(t, \phi(t))| < \epsilon .

Now for uniqueness: the standard theorem says that if, in addition, f satisfies a Lipschitz condition in the second variable, that is, there is some K > 0 where |f(x,y_1)-f(x,y_2)| \le K|y_1-y_2| for all (x,y_1), (x,y_2) \in R , then the differential equation y'=f(x,y) has exactly one solution \phi with \phi(0) = y_0 , which is valid so long as the graph (x, \phi(x)) remains in R .

Here is the proof: if K > 0 satisfies |f(x,y_1)-f(x,y_2)| \le K|y_1-y_2| , then certainly |f(x,y_1)-f(x,y_2)| < 2K|y_1-y_2| whenever y_1 \ne y_2 . This is clear but perhaps a strange step.
But now suppose that there are two solutions, say y_1(x) and y_2(x) , where y_1(0) = y_2(0) . So set z(x) = y_1(x) - y_2(x) and note the following: z'(x) = y_1'(x) - y_2'(x) = f(x,y_1)-f(x,y_2) , so |z'(x)| = |f(x,y_1)-f(x,y_2)| < 2K|z(x)| . Now suppose there is some x_1 > 0 where z(x_1) > 0 . A Mean Value Theorem argument applied to z (since z(0) = 0 ) means that we can select our x_1 inside an interval on which z > 0 and z' > 0 .

So, on this selected interval about x_1 we have z'(x) < 2Kz(x) (since z > 0 there, we can remove the absolute value signs).

Now we set up the comparison differential equation: Y' = 2KY, Y(x_1) = z(x_1) , which has the unique solution Y=z(x_1)e^{2K(x-x_1)} , whose graph is always positive; in particular, Y(0) = z(x_1)e^{-2Kx_1} > 0 . Note that the graphs of z(x) and Y(x) meet at (x_1, z(x_1)) . But z'(x_1) < 2Kz(x_1) = Y'(x_1) , so just to the left of x_1 the graph of z must lie above the graph of Y : there is some \delta > 0 where z(x_1 - \delta) > Y(x_1 - \delta) .

But since z(0) = 0 < Y(0) , the graph of z starts below the graph of Y ; so there is a last point x_2 \in [0, x_1 - \delta) where z(x_2) = Y(x_2) , with z > Y just to the right of x_2 . At x_2 this forces z'(x_2) \ge Y'(x_2) = 2KY(x_2) = 2Kz(x_2) , which contradicts z'(x_2) < 2Kz(x_2) .

So, no such point x_1 can exist.

Note that we used the fact that the solution to Y' = 2KY, Y(x_1) > 0 is always positive. Though this is an easy differential equation to solve, note the key fact: if we tried to separate the variables, we’d calculate \int_0^y \frac{1}{2Kt} dt and find that this is an improper integral which diverges to positive \infty ; hence the solution can never reach zero nor change sign. So, if we had Y' = g(Y) where g(t) > 0 and \int_0^y \frac{1}{g(t)} dt is a divergent improper integral, we would get exactly the same result for exactly the same reason.
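This divergence criterion can be illustrated numerically; the following is my own sketch, not from the post, with a crude midpoint rule and illustrative function names. The integral of 1/g near 0 blows up for the Lipschitz-style bound g(t) = 2Kt , but stays bounded for g(t) = t^{4/5} , which is exactly why equations like y' = t(y-2)^{4/5} can fail to have unique solutions.

```python
# Illustration: estimate the improper integral of 1/g over [eps, 1] as eps -> 0.
def integral_to_one(g, eps, n=100000):
    """Crude midpoint-rule estimate of the integral of 1/g over [eps, 1]."""
    h = (1.0 - eps) / n
    return sum(h / g(eps + (i + 0.5) * h) for i in range(n))

K = 1.0
for eps in (1e-2, 1e-4, 1e-6):
    lipschitz = integral_to_one(lambda t: 2 * K * t, eps)  # grows like ln(1/eps)/2K
    holder = integral_to_one(lambda t: t ** 0.8, eps)      # stays bounded near 5
    print(eps, round(lipschitz, 3), round(holder, 3))
```

The first column of estimates grows without bound as eps shrinks (uniqueness holds); the second settles down (uniqueness can fail).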

Hence we can recover Osgood’s Uniqueness Theorem which states:

If f(x,y) is continuous on R and for all (x, y_1), (x, y_2) \in R we have |f(x,y_1)-f(x,y_2)| \le g(|y_1-y_2|) , where g is a positive function and \int_0^y \frac{1}{g(t)} dt diverges to \infty at y=0 , then the differential equation y'=f(x,y) has exactly one solution with \phi(0) = y_0 , which is valid so long as the graph (x, \phi(x)) remains in R .

September 23, 2014

Ok, what do you see here? (why we don’t blindly trust software)

I had Dfield8 from MATLAB propose solutions to y' = t(y-2)^{\frac{4}{5}} meeting the following initial conditions:

y(0) = 0, y(0) = 3, y(0) = 2.


Now, of course, one of these solutions is non-unique. But, of all of the solutions drawn: do you trust ANY of them? Why or why not?

Note: you really don’t have to do much calculus to see what is wrong with at least one of these. But, if you must know, the general solution is given by y(t) = (\frac{t^2}{10} +C)^5 + 2 (and, of course, the equilibrium solution y = 2 ). But that really doesn’t provide more information than the differential equation does.
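As a quick sanity check (my own illustration, not from the post), one can verify numerically that both the equilibrium solution and the nontrivial solution satisfy the initial value problem with y(0) = 2 , which is exactly the non-uniqueness the plot should show:

```python
# The IVP y' = t(y-2)^(4/5), y(0) = 2 has (at least) two solutions:
# the equilibrium y = 2 and y = (t^2/10)^5 + 2.
def f(t, y):
    # abs keeps the fractional power real-valued if y dips below 2
    return t * abs(y - 2) ** 0.8

def y_equilibrium(t):
    return 2.0

def y_nontrivial(t):
    return (t * t / 10.0) ** 5 + 2.0

# Both satisfy y(0) = 2; check y' = f(t, y) via a central-difference derivative.
for y in (y_equilibrium, y_nontrivial):
    t, h = 1.5, 1e-6
    dy = (y(t + h) - y(t - h)) / (2 * h)
    print(abs(dy - f(t, y(t))) < 1e-4)  # True for both solutions
```

So any numerical solver starting from y(0) = 2 must silently pick one of infinitely many valid solutions, which is one reason not to trust the software blindly here.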

By the way, here are some “correct” plots of the solutions, (up to uniqueness)

