College Math Teaching

February 5, 2016

More fun with selective sums of divergent series

Just a reminder: if \sum_{k=1}^{\infty} a_k is a series and c_1, c_2, ...c_n ,, is some sequence consisting of 0’s and 1’s then a selective sum of the series is \sum_{k=1}^{\infty} c_k a_k . The selective sum concept is discussed in the MAA book Real Infinite Series (MAA Textbooks) by Bonar and Khoury (2006) and I was introduced to the concept by Ferdinands’s article Selective Sums of an Infinite Series in the June 2015 edition of Mathematics Magazine (Vol. 88, 179-185).

There is much of interest there, especially if one considers convergent series or alternating series.

This post will be about divergent series of positive terms for which lim_{n \rightarrow \infty} a_n = 0 and a_{n+1} < a_n for all n .

The first fun result is this one: any selected x > 0 is a selective sum of such a series. The proof of this isn’t that bad. Since lim_{n \rightarrow \infty} a_n = 0 we can find a smallest n such that a_n \leq x . Clearly if a_n = x we are done: our selective sum has c_n = 1 and the rest of the c_k = 0 .

If not, set n_1 = n and note that because the series diverges, there is a largest m_1 so that \sum_{k=n_1}^{m_1} a_k \leq x . Now if \sum_{k=n_1}^{m_1} a_k = x we are done, else let \epsilon_1 = x - \sum_{k=n_1}^{m_1} a_k and note \epsilon_1 < a_{m_1+1} . Now because the a_k tend to zero, there is some first n_2 so that a_{n_2} \leq \epsilon_1 . If this is equality then the required sum is a_{n_2} + \sum_{k=n_1}^{m_1} a_k , else we can find the largest m_2 so that \sum_{k=n_1}^{m_1} a_k + \sum_{k=n_2}^{m_2} a_k \leq x

This procedure can be continued indefinitely. So if we label \sum_{k=n_j}^{m_{j}} a_k = s_j we see that s_1 + s_2 + ...s_{n} = t_{n} form an increasing, bounded sequence which converges to the least upper bound of its range, and it isn’t hard to see that the least upper bound is x because x-t_{n} =\epsilon_n < a_{m_n+1}

So now that we can obtain any positive real number as the selective sum of such a series, what can we say about the set of all selective sums for which almost all of the c_k = 0 (that is, all but a finite number of the c_k are zero).

Answer: the set of all such selective sums are dense in the real line, and this isn’t that hard to see, given our above construction. Let (a,b) be any open interval in the real line and let a < x < b . Then one can find some N such that for all n > N we have x - a_n > a . Now consider our construction and choose m large enough such that x - t_m > x - a_n > a . Then the t_m represents the finite selected sum that lies in the interval (a,b) .

We can be even more specific if we now look at a specific series, such as the harmonic series \sum_{k=1}^{\infty} \frac{1}{k} . We know that the set of finite selected sums forms a dense subset of the real line. But it turns out that the set of select sums is the rationals. I’ll give a slightly different proof than one finds in Bonar and Khoury.

First we prove that every rational in (0,1] is a finite select sum. Clearly 1 is a finite select sum. Otherwise: Given \frac{p}{q} we can find the minimum n so that \frac{1}{n} \leq \frac{p}{q} < \frac{1}{n-1} . If \frac{p}{q} = \frac{1}{n} we are done. Otherwise: the strict inequality shows that pn-p < q which means pn-q < p . Then note \frac{p}{q} - \frac{1}{n} = \frac{pn-q}{qn} and this fraction has a strictly smaller numerator than p . So we can repeat our process with this new rational number. And this process must eventually terminate because the numerators generated from this process form a strictly decreasing sequence of positive integers. The process can only terminate when the new faction has a numerator of 1. Hence the original fraction is some sum of fractions with numerator 1.

Now if the rational number r in question is greater than one, one finds n_1 so that \sum^{n_1}_{k=1} \frac{1}{k} \leq r but \sum^{n_1+1}_{k=1} \frac{1}{k} > r . Then write r-\sum^{n_1+1}_{k=1} \frac{1}{k} and note that its magnitude is less than \frac{1}{n_1+1} . We then use the procedure for numbers in (0,1) noting that our starting point excludes the previously used terms of the harmonic series.

There is more we can do, but I’ll stop here for now.

January 26, 2016

More Fun with Divergent Series: redefining series convergence (Cesร ro, etc.)

Filed under: analysis, calculus, sequences, series — Tags: , , — collegemathteaching @ 10:21 pm

This post is more designed to entertain myself than anything else. This builds up a previous post which talks about deleting enough terms from a divergent series to make it a convergent one.

This post is inspired by Chapter 8 of Konrad Knopp’s classic Theory and Application of Infinite Series. The title of the chapter is Divergent Series.

Notation: when I talk about a series converging, I mean “converging” in the usual sense; e. g. if s_n = \sum_{k=0}^{k=n} a_k and lim_{n \rightarrow \infty}s_n = s then \sum_{k=0}^{\infty} a_k is said to be convergent with sum s .

All of this makes sense since things like limits are carefully defined. But as Knopp points out, in the “days of old”, mathematicians say these series as formal objects rather than the result of careful construction. So some of these mathematicians (like Euler) had no problem saying things like \sum^{\infty}_{k=0} (-1)^k = 1-1+1-1+1..... = \frac{1}{2} . Now this is complete nonsense by our usual modern definition. But we might note that \frac{1}{1-x} = \sum^{\infty}_{k=0} x^k for -1 < x < 1 and note that x = -1 IS in the domain of the left hand side.

So, is there a way of redefining the meaning of “infinite sum” that gives us this result, while not changing the value of convergent series (defined in the standard way)? As Knopp points out in his book, the answer is “yes” and he describes several definitions of summation that

1. Do not change the value of an infinite sum that converges in the traditional sense and
2. Allows for more series to coverge.

We’ll discuss one of these methods, commonly referred to as Cesàro summation. There are ways to generalize this.

How this came about

Consider the Euler example: 1 -1 + 1 -1 + 1 -1...... . Clearly, s_{2k} = 1, s_{2k+1} = 0 and so this geometric series diverges. But notice that the arithmetic average of the partial sums, computed as c_n = \frac{s_0 + s_1 +...+s_n}{n+1} does tend to \frac{1}{2} as n tends to infinity: c_{2n} = \frac{\frac{2n}{2}}{2n+1} = \frac{n}{2n+1} whereas c_{2n+1} = \frac{\frac{2n}{2}}{2n+2} =\frac{n}{2n+2} and both of these quantities tend to \frac{1}{2} as n tends to infinity.

So, we need to see that this method of summing is workable; that is, do infinite sums that converge in the previous sense still converge to the same number with this method?

The answer is, of course, yes. Here is how to see this: Let x_n be a sequence that converges to zero. Then for any \epsilon > 0 we can find M such that k > M implies that |x_k| < \epsilon . So for n > k we have \frac{x_1 + x_2 + ...+ x_{k-1} + x_k + ...+ x_n}{n} = \frac{x_1+ ...+x_{k-1}}{n} + \frac{x_k + x_{k+1} + ....x_n}{n} Because k is fixed, the first fraction tends to zero as n tends to infinity. The second fraction is smaller than \epsilon in absolute value. But \epsilon is arbitrary, hence this arithmetic average of this null sequence is itself a null sequence.

Now let x_n \rightarrow L and let c_n = \frac{x_1 + x_2 + ...+ x_{k-1} + x_k + ...+ x_n}{n} Now subtract note c_n-L =  \frac{(x_1-L) + (x_2-L) + ...+ (x_{k-1}-L) +(x_k-L) + ...+ (x_n-L)}{n} and the x_n-L forms a null sequence. Then so do the c_n-L .

Now to be useful, we’d have to show that series that are summable in the Cesàro obey things like the multiplicative laws; they do but I am too lazy to show that. See the Knopp book.

I will mention a couple of interesting (to me) things though. Neither is really profound.

1. If a series diverges to infinity (that is, if for any positive M there exists n such that for all k \geq n, s_k > M , then this series is NOT Cesàro summable. It is relatively easy to see why: given such an M, k then consider \frac{s_1 + s_2 + s_3 + ...+s_{k-1} + s_k + s_{k+1} + ...s_n}{n} = \frac{s_1+ s_2 + ...+s_{k-1}}{n} + \frac{s_k + s_{k+1} .....+s_{n}}{n} which is greater than \frac{n-k}{n} M for large n . Hence the Cesàro partial sum becomes unbounded.

Upshot: there is no hope in making something like \sum^{\infty}_{n=1} \frac{1}{n} into a convergent series by this method. Now there is a way of making an alternating, divergent series into a convergent one via doing something like a “double Cesàro sum” (take arithmetic averages of the arithmetic averages) but that is a topic for another post.

2. Cesàro summation may speed up convergent of an alternating series which passes the alternating series test, OR it might slow it down. I’ll have to develop this idea more fully. But I invite the reader to try Cesàro summation for \sum^{\infty}_{k=1} (-1)^{k+1} \frac{1}{k} and on \sum^{\infty}_{k=1} (-1)^{k+1} \frac{1}{k^2} and on \sum^{\infty}_{k=0} (-1)^k \frac{1}{2^k} . In the first two cases, the series converges slowly enough so that Cesàro summation speeds up convergence. Cesàro slows down the convergence in the geometric series though. It is interesting to ponder why.

January 14, 2016

Trimming a divergent series into a convergent one

Filed under: calculus, induction, sequences, series — Tags: , , — collegemathteaching @ 10:28 pm

This post is motivated by this cartoon

which I found at a Evelyn Lamb’s post on an AMS Blog, this fun Forbes math post by Kevin Kundson and by a June 2015 article in Mathematics Magazine by R. John Ferdinands called Selective Sums of an Infinite Series.

Here is the following question: start with a divergent series of positive terms which form a decreasing (non-increasing) sequence which tends to zero, say, \sum^{\infty}_{k =1} \frac{1}{k} . Now how does one select a subset of series terms to delete so as to obtain a convergent series? The Kundson article shows that one can do this with the harmonic series by, say, deleting all numbers that contain a specific digit (say, 9). I’ll talk about the proof here. But I’d like to start more basic and to bring in language used in the Ferdinands article.

So, let’s set the stage: we will let \sum a_k denote the divergent sum in question. All terms will be positive, a_{k} \geq a_{k+1} for all k and lim_{k \rightarrow \infty} a_k = 0 . Now let c_k represent a sequence where c_k \in \{0,1\} for all k ; then \sum c_ka_k is called a selective sum of \sum a_k . I’ll call the c_k the selecting sequence and, from the start, rule out selecting sequences that are either eventually 1 (which means that the selected series diverges since the original series did) or eventually zero (just a finite sum).

Now we’ll state a really easy result:

There is some non-eventually constant c_k such that \sum c_ka_k converges. Here is why: because lim_{k \rightarrow \infty} a_k = 0 , for each n \in \{1,2,3...\} one can find a maximal index n_j, n_j \notin \{n_1, n_2, ...n_{j-1} \} so that \frac{1}{2^n} > a_{n_j} . Now select c_k = 1 if k \in \{n_1, n_2, n_3,... \} and c_k =0 otherwise. Then \sum \frac{1}{2^k} > \sum c_ka_k and therefore the selected series converges by comparison with a convergent geometric series.

Of course, this result is petty lame; this technique discards a lot of terms. A cheap way to discard “fewer” terms (“fewer” meaning: in terms of “set inclusion”): Do the previous construction, but instead of using \frac{1}{2} use \frac{M}{M+1} where M is a positive integer of choice. Note that \sum^{\infty}_{k=1} (\frac{M}{M+1})^k = M

Here is an example of how this works: Consider the divergent series \sum \frac{1}{\sqrt{k}} and the convergent geometric series \sum (\frac{1000}{1001})^k Of course \frac{1000}{1001} < 1 so c_1 = 0 but then for k \in \{2,3,....4169 \} we have (\frac{1000}{1001})^k > \frac{1}{\sqrt{k}} . So c_k = 1 for k \in \{2,3,4,....4169 \} . But c_{4170} = 0 because (\frac{1000}{1001})^{4170} < \frac{1}{\sqrt{4170}}. The next non-zero selection coefficient is c_{4171} as (\frac{1000}{1001})^{4170} > \frac{1}{\sqrt{4171}} .

Now playing with this example, we see that \frac{1}{\sqrt{k}} > (\frac{1000}{1001})^{4171} for k \in \{4172, 4173,....4179 \} but not for k = 4180 . So c_k = 0 for k \in \{4172,....4179 \} and c_{4180} = 1 . So the first few n_j are \{2, 3, ....4169, 4171, 4180 \} . Of course the gap between the n_j grows as k does.

Now let’s get back to the cartoon example. From this example, we’ll attempt to state a more general result.

Claim: given \sum^{\infty}_{k=1} c_k \frac{1}{k} where c_k = 0 if k contains a 9 as one of its digits, then \sum^{\infty}_{k=1} c_k \frac{1}{k} converges. Hint on how to prove this (without reading the solution): count the number of integers between 10^k and 10^{k+1} that lack a 9 as a digit. Then do a comparison test with a convergent geometric series, noting that every term \frac{1}{10^k}, \frac{1}{10^k + 1}......, \frac{1}{8(10^k) +88} is less than or equal to \frac{1}{10^k} .

How to prove the claim: we can start by “counting” the number of integers between 0 and 10^k that contain no 9’s as a digit.

Between 0 and 9: clearly 0-8 inclusive, or 9 numbers.

Between 10 and 99: a moment’s thought shows that we have 8(9) = 72 numbers with no 9 as a digit (hint: consider 10-19, 20-29…80-89) so this means that we have 9 + 8(9) = 9(1+8) = 9^2 numbers between 0 and 99 with no 9 as a digit.

This leads to the conjecture: there are 9^k numbers between 0 and 10^k -1 with no 9 as a digit and (8)9^{k-1} between 10^{k-1} and 10^k-1 with no 9 as a digit.

This is verified by induction. This is true for k = 1

Assume true for k = n . Then to find the number of numbers without a 9 between 10^n and 10^{n+1} -1 we get 8 (9^n) which then means we have 9^n + 8(9^n) = 9^n (8+1) = 9^{n+1} numbers between 0 and 10^{n+1}-1 with no 9 as a digit. So our conjecture is proved by induction.

Now note that 0+ 1 + \frac{1}{2} + ....+ \frac{1}{8} < 8*1*1

\frac{1}{10} + ...+ \frac{1}{18} + \frac{1}{20} + ...+ \frac{1}{28} + \frac{1}{30} + ...+ \frac{1}{88} < 8*9*\frac{1}{10}

\frac{1}{100} + ...\frac{1}{88} + \frac{1}{200} + ....\frac{1}{888} < 8*(9^2)\frac{1}{100}

This establishes that \sum_{k=10^n}^{10^{n+1}-1} c_k \frac{1}{k} < 8*(9^k)\frac{1}{10^k}

So it follows that \sum^{\infty}_{k=1} c_k \frac{1}{k} < 8\sum^{\infty}{k=0} (\frac{9}{10})^k = 8 \frac{1}{1-\frac{9}{10}} = 80 and hence our selected sum is convergent.

Further questions: ok, what is going on is that we threw out enough terms of the harmonic series for the series to converge. Between terms \frac{1}{10^k} and \frac{1}{10^{k+1}-1} we allowed 8*(9^k) terms to survive.

This suggests that if we permit up to M (10-\epsilon)^k terms between 10^k and 10^{k+1}-1 to survive (M, \epsilon fixed and positive) then we will have a convergent series. I’d be interested in seeing if there is an generalization of this.

But I am tried, I have a research article to review and I need to start class preparation for the upcoming spring semester. So I’ll stop here. For now. ๐Ÿ™‚

April 10, 2015

Cantor sets and countable products of discrete spaces (0, 1)^Z

Filed under: advanced mathematics, analysis, point set topology, sequences — Tags: , , , — collegemathteaching @ 11:09 am

This might seem like a strange topic but right now our topology class is studying compact metric spaces. One of the more famous of these is the “Cantor set” or “Cantor space”. I discussed the basics of these here.

Now if you know the relationship between a countable product of two point discrete spaces (in the product topology) and Cantor spaces/Cantor Sets, this post is probably too elementary for you.

Construction: start with a two point set D = \{0, 2 \} and give it the discrete topology. The reason for choosing 0 and 2 to represent the elements will become clear later. Of course, D_2 is a compact metric space (with the discrete metric: d(x,y) = 1 if x \neq y .

Now consider the infinite product of such spaces with the product topology: C = \Pi^{\infty}_{i=1} D_i where each D_i is homeomorphic to D . It follows from the Tychonoff Theorem that C is compact, though we can prove this directly: Let \cup_{\alpha \in I} U_{\alpha} be any open cover for C . Then choose an arbitrary U from this open cover; because we are using the product topology U = O_1 \times O_2 \times ....O_k \times (\Pi_{i=k+1}^{\infty} D_i ) where each O_i is a one or two point set. This means that the cardinality of C - U is at most 2^k -1 which requires at most 2^k -1 elements of the open cover to cover.

Now let’s examine some properties.

Clearly the space is Hausdorff (T_2 ) and uncountable.

1. Every point of C is a limit point of C . To see this: denote x \in C by the sequence \{x_i \} where x_i \in \{0, 2 \} . Then any open set containing \{ x_i \} is O_1 \times O_2 \times...O_k \times \Pi^{\infty}_{i=k+1} D_i and contains ALL points y_i where y_i = x_i for i = \{1, 2, ...k \} . So all points of C are accumulation points of C ; in fact they are condensation points (or perfect limit points ).

(refresher: accumulation points are those for which every open neighborhood contains an infinite number of points of the set in question; condensation points contain an uncountable number of points, and perfect limit points are those for which every open neighborhood contains as many points as the set in question has (same cardinality).

2. C is totally disconnected (the components are one point sets). Here is how we will show this: given x, y \in C, x \neq y, there exists disjoint open sets U_x, U_y, x \in U_x, y \in U_y, U_x \cup U_y = C . Proof of claim: if x \neq y there exists a first coordinate k for which x_k \neq y_k (that is, a first k for which the canonical projection maps disagree (\pi_k(x) \neq pi_k(y) ). Then
U_x = D_1 \times D_2 \times ....\times D_{k-1} \times x_k \times \Pi^{\infty}_{i=k+1} D_i,

U_y = D_1 \times D_2 \times.....\times D_{k-1} \times y_k \times \Pi^{\infty}_{i = k+1} D_i

are the required disjoint open sets whose union is all of C .

3. C is countable, as basis elements for open sets consist of finite sequences of 0’s and 2’s followed by an infinite product of D_i .

4. C is metrizable as well; d(x,y) = \sum^{\infty}_{i=1} \frac{|x_i - y_i|}{3^i} . Note that is metric is well defined. Suppose x \neq y . Then there is a first k, x_k \neq y_k . Then note

d(x,y) = \frac{|x_k - y_k|}{3^k} + \sum^{\infty}_{i = k+1} \frac{|x_i - y_i|}{3^i} \rightarrow |x_k -y_k| =2 = \sum^{\infty}_{i=1} \frac{|x_{i+k} -y_{i+k}|}{3^i} \leq \frac{1}{3} \frac{1}{1 -\frac{2}{3}} =1

which is impossible.

5. By construction C is uncountable, though this follows from the fact that C is compact, Haudorff and dense in itself.

6. C \times C is homeomorphic to C . The homeomorphism is given by f( \{x_i \}, \{y_i \}) = \{ x_1, y_1, x_2, y_2,... \} \in C . It follows that C is homeomorphic to a finite product with itself (product topology). Here we use the fact that if f: X \rightarrow Y is a continuous bijection with X compact and Y Hausdorff then f is a homeomorphism.

Now we can say a bit more: if C_i is a copy of C then \Pi^{\infty}_{i =1 } C_i is homeomorphic to C . This will follow from subsequent work, but we can prove this right now, provided we review some basic facts about countable products and counting.

First lets show that there is a bijection between Z \times Z and Z . A bijection is suggested by this diagram:


which has the following formula (coming up with it is fun; it uses the fact that \sum^k _{n=1} n = \frac{k(k+1)}{2} :

\phi(k,1) = \frac{(k)(k+1)}{2} for k even
\phi(k,1) = \frac{(k-1)(k)}{2} + 1 for k odd
\phi(k-j, j+1) =\phi(k,1) + j for k odd, j \in \{1, 2, ...k-1 \}
\phi(k-j, j+1) = \phi(k,1) - j for k even, j \in \{1, 2, ...k-1 \}

Here is a different bijection; it is a fun exercise to come up with the relevant formulas:


Now lets give the map between \Pi^{\infty}_{i=1} C_i and C . Let \{ y_i \} \in C and denote the elements of \Pi^{\infty}_{i=1} C_i by \{ x^i_j \} where \{ x_1^1, x_2^1, x_3^ 1....\} \in C_1, \{x_1^2, x_2 ^2, x_3^3, ....\} \in C_2, ....\{x_1^k, x_2^k, .....\} \in C_k ... .

We now describe a map f: C \rightarrow \Pi^{\infty}_{i=1} C_i by

f(\{y_i \}) = \{ x^i_j \} = \{y_{\phi(i,j)} \}

Example: x^1_1 = y_1, x^1_2 = y_2, x^2_1 = y_3, x^3_1 =y_4, x^2_2 = y_5, x^1_3 =y_6,...

That this is a bijection between compact Hausdorff spaces is immediate. If we show that f^{-1} is continuous, we will have shown that f is a homeomorphism.

But that isn’t hard to do. Let U \subset C be open; U = U_1 \times U_2 \times U_3.... \times U_{m-1} \times \Pi^{\infty}_{k=m} C_k . Then there is some k_m for which \phi(k_m, 1) \geq M . Then if f^i_j denotes the i,j component of f we wee that for all i+j \geq k_m+1, f^i_j(U) = C (these are entries on or below the diagonal containing (k,1) depending on whether k_m is even or odd.

So f(U) is of the form V_1 \times V_2 \times ....V_{k_m} \times \Pi^{\infty}_{i = k_m +1} C_i where each V_j is open in C_j . This is an open set in the product topology of \Pi^{\infty}_{i=1} C_i so this shows that f^{-1} is continuous. Therefore f^{-1} is a homeomorphism, therefore so is f.

Ok, what does this have to do with Cantor Sets and Cantor Spaces?

If you know what the “middle thirds” Cantor Set is I urge you stop reading now and prove that that Cantor set is indeed homeomorphic to C as we have described it. I’ll give this quote from Willard, page 121 (Hardback edition), section 17.9 in Chapter 6:

The proof is left as an exercise. You should do it if you think you can’t, since it will teach you a lot about product spaces.

What I will do I’ll give a geometric description of a Cantor set and show that this description, which easily includes the “deleted interval” Cantor sets that are used in analysis courses, is homeomorphic to C .

Set up
I’ll call this set F and describe it as follows:

F \subset R^n (for those interested in the topology of manifolds this poses no restrictions since any manifold embeds in R^n for sufficiently high n ).

Reminder: the diameter of a set F \subset R^n will be lub \{ d(x,y) | x, y \in F \}
Let \epsilon_1, \epsilon_2, \epsilon_3 .....\epsilon_k ... be a strictly decreasing sequence of positive real numbers such that \epsilon_k \rightarrow 0 .

Let F^0 be some closed n-ball in R^n (that is, F^) is a subset homeomorphic to a closed n-ball; we will use that convention throughout)

Let F^1_{(0) }, F^1_{(2)} be two disjoint closed n-balls in the interior of F^0 , each of which has diameter less than \epsilon_1 .

F^1 = F^1_{(0) } \cup F^1_{(2)}

Let F^2_{(0, 0)}, F^2_{(0,2)} be disjoint closed n-balls in the interior F^1_{(0) } and F^2_{(2,0)}, F^2_{(2,2)} be disjoint closed n-balls in the interior of F^1_{(2) } , each of which (all 4 balls) have diameter less that \epsilon_2 . Let F^2 = F^2_{(0, 0)}\cup F^2_{(0,2)} \cup F^2_{(2, 0)} \cup F^2_{(2,2)}


To describe the construction inductively we will use a bit of notation: a_i \in \{0, 2 \} for all i \in \{1, 2, ...\} and \{a_i \} will represent an infinite sequence of such a_i .
Now if F^k has been defined, we let F^{k+1}_{(a_1, a_2, ...a_{k}, 0)} and F^{k+1}_{(a_1, a_2,....,a_{k}, 2)} be disjoint closed n-balls of diameter less than \epsilon_{k+1} which lie in the interior of F^k_{(a_1, a_2,....a_k) } . Note that F^{k+1} consists of 2^{k+1} disjoint closed n-balls.

Now let F = \cap^{\infty}_{i=1} F^i . Since these are compact sets with the finite intersection property (\cap^{m}_{i=1}F^i =F^i \neq \emptyset for all m ), F is non empty and compact. Now for any choice of sequence \{a_i \} we have F_{ \{a_i \} } =\cap^{\infty}_{i=1} F^i_{(a_1, ...a_i)} is nonempty by the finite intersection property. On the other hand, if x, y \in F, x \neq y then d(x,y) = \delta > 0 so choose \epsilon_m such that \epsilon_m < \delta . Then x, y lie in different components of F^m since the diameters of these components are less than \epsilon_m .

Then we can say that the F_{ \{a_i} \} uniquely define the points of F . We can call such points x_{ \{a_i \} }

Note: in the subspace topology, the F^k_{(a_1, a_2, ...a_k)} are open sets, as well as being closed.

Finding a homeomorphism from F to C .
Let f: F \rightarrow C be defined by f( x_{ \{a_i \} } ) = \{a_i \} . This is a bijection. To show continuity: consider the open set U =  y_1 \times y_2 ....\times y_m \times \Pi^{\infty}_{i=m} D_i . Under f this pulls back to the open set (in the subspace topology) F^{m+1}_{(y1, y2, ...y_m, 0 ) } \cup F^{m+1}_{(y1, y2, ...y_m, 2)} hence f is continuous. Because F is compact and C is Hausdorff, f is a homeomorphism.

This ends part I.

We have shown that the Cantor sets defined geometrically and defined via “deleted intervals” are homeomorphic to C . What we have not shown is the following:

Let X be a compact Hausdorff space which is dense in itself (every point is a limit point) and totally disconnected (components are one point sets). Then X is homeomorphic to C . That will be part II.

March 15, 2015

Compact spaces and Tychonoff’s Theorem I

Note to the reader: when I first started this post, I thought it would be a few easy paragraphs. The size began to swell, so I am taking this post in several bite sized parts. This is part I.

Pretty much everyone knows what a compact space is. But not everyone is up on equivalent definitions and on how to prove that the arbitrary product of compact spaces is compact.
The genesis of this blog post is David Wright’s Proceedings of the American Mathematical Society paper (1994) on Tychonoff’s Theorem.

Since I am also writing for my undergraduate topology class, I’ll keep things elementary where possible and perhaps put in more detail than a professional mathematician would have patience for.

I should start by saying why this topic is dear to me: my research area is knot theory; in particular I studied embeddings of the circle into the 3-dimensional sphere S^3 , which can be thought of as the “compactification” of R^3 ; basically one starts with R^3 and then adds a point \infty and declares that the neighborhoods of this new point will be generated by sets of the following form: \{ (x,y,z) | x^2 + y^2 + z^2 > M^2 \} The reason we do this: we often study the complement of the embedded circle, and it is desirable to have a compact set as the complement.

I’ve also studied (in detail) certain classes of embeddings of the real line into non-compact manifolds; to make this study a bit more focused, I insist that such embeddings be “proper” in that the inverse image of a compact set be compact. Hence compactness comes up again, even when the objects of study are not compact.

So, what do we mean by “compact”?

Instead of just blurting out the definition and equivalent formulations, I’ll try to give some intuition. If we are talking about a subset of a metric space, a compact subset is one that is both closed and bounded. Now that is NOT the definition of compactness, though it is true that:

Given a set X \subset R^n , X is compact if and only if X is both closed (as a topological subset) and bounded (in that it fits within a sufficiently large closed ball). In R^1 compact subsets can be thought of as selected finite unions and arbitrary intersections of closed intervals. In the higher dimensions, think of the finite union and arbitrary intersections of things like closed balls.

Now it is true that if f:X \rightarrow Y is continuous, then if X is a compact topological space, then f(X) is compact (either as a space, or in the subspace topology, if f is not onto.

This leads to a big theorem of calculus: the Extreme Value Theorem: if f:R^n \rightarrow R is continuous over a compact subset D \subset R^n then f attains both a maximum and a minimum value over D .

Now in calculus, we rarely use the word “compact” but instead say something about D be a closed, bounded subset. In the case where n = 1 we usually say that D =[a,b] , a closed interval.

So, in terms of intuition, if one is thinking about subsets of R^n , one can think of a compact space as a domain on which any continuous real valued function always attains both a minimum and a maximum value.

Now for the definition
We need some terminology: a collection of open sets U_{\alpha} is said to be an open cover of a space X if \cup_{\beta \in I } U_{\beta} = X and if A \subset X a collection of open sets is said to be an open cover of A if A \subset \cup_{\beta \in I } U_{\beta} A finite subcover is a finite subcollection of the open sets such that \cup^k_{i=1} U_i = \cup_{\beta \in I} U_{\beta} .

Here is an example: (\frac{3}{4}, 1] \cup^{\infty}_{n=1} [0, \frac{n}{n+1}) is an open cover of [0,1] in the subspace topology. A finite subcover (from this collection) would be [0, \frac{4}{5}) \cup (\frac{3}{4}, 1]

Let X be a topological space. We say that X is a compact topological space if any open over of X has a finite subcover. If C \subset X we say that C is a compact subset of X if any open cover of C has a finite subcover.

Prior to going through examples, I think that it is wise to mention something. One logically equivalent definition is this: A space (or a subset) is compact if every cover by open basis elements has a finite subcover. Here is why: if X is compact, then ANY open cover has a finite subcover, and an open cover by basis elements is an open cover. On the other hand: if we assume the “every open cover by open basis elements has a finite subcover” condition: then if \mathscr{U} is an open cover, then we can view this open cover as an open cover of the basis elements whose union is each open U_{\beta} \in \mathscr{U} . This open cover of basis elements has a finite subcover of basis elements..say B_1, B_2, ....B_k . Then for each basis element, choose a single U_i \in \mathscr{U} for which B_i \subset U_i . That is the required open subcover.

Now, when convenient, we can assume that the open cover in question (during a proof) consists of basic open sets. That will simplify things at times.

So, what are some compact spaces and sets, and what are some basic facts?

Let’s see some compact sets, some non compact sets and see some basic facts.

1. Let X be any topological space and A \subset X a finite subset. Then A is a compact subset. Proof: given any open cover of A choose one open set per element of A which contains said element.

2. Let R have the usual topology. Then the integers Z \subset R^1 is not a compact subset; choose the open cover \cup^{\infty}_{n = -\infty} (n - \frac{1}{4}, n+ \frac{1}{4}) is an infinite cover with no finite subcover. In fact, ANY unbounded subset A \subset R^n in the usual metric topology fails to be compact: for a \in A with d(a, 0) \geq n choose B_a(\frac{1}{n}) ; clearly this open cover can have no finite subcover.

3. The finite union of compact subsets is compact (easy exercise).

4. If C \subset X is compact and X is a Hausdorff topological space (T_2 ) then C is closed. Here is why: let x \notin C and for every c \in C choose U_c, V_c open where x \in U_c, c \in V_c . Now \cup_{c \in C}V_c is an open set which contains C and has a finite subcover \cup_{i=1}^k V_i Note that each U_i is an open set which contains x and now we have only a finite number of these. Hence x \in \cap^k_{i=1} U_i which is disjoint from \cup_{i=1}^k V_i which contains C . Because x was an arbitrary point in X -C , X-C is open which means that C is closed. Note: this proof, with one minor addition, shows that a compact Hausdorff space is regular (T_3 ) we need only show that a closed subset of a compact Hausdorff space is compact. That is easy enough to do: let \mathscr{U} be an open cover for C ; then the collection \mathscr{U} \cup (X-C) is an open cover for X , which has a finite subcover. Let that be \cup^k_{i=1} U_i \cup (X-C) where each U_i \in \mathscr{U} . Now since X-C does not cover C, \cup^k_{i=1} U_i does.

So we have proved that a closed subset of a compact set is compact.

5. Let R (or any infinite set) be given the finite complement topology (that is, the open sets are the empty set together with sets whose complements consist of a finite number of points). Then ANY subset is compact! Here is why: let C be any set and let \mathscr{U} be any open cover. Choose U_1 \in \mathscr{U}. Since X -U_1 is a finite set of points, only a finite number of them can be in C , say c_1, c_2, ...c_k . Then for each of these, select one open set in the open cover that contains the point; that is the finite subcover.

Note: this shows that non-closed sets can be compact sets, if the underlying topology is not Hausdorff.

6. If f: X \rightarrow Y is continuous and onto and X is compact, then so is Y . Proof: let \cup_{\beta \in I} U_{\beta} cover Y and note that \cup_{\beta}f^{-1}(U_{\beta}) covers X , hence a finite number of these open sets cover: X = \cup^{k}_{i=1}f^{-1}(U_i). Therefore \cup^k_{i=1}U_i covers Y . Note: this shows that being compact is a topological invariant; that is, if two spaces are homeomorphic, then either both spaces are compact or neither one is.

7. Ok, let’s finally prove something. Let R^1 have the usual topology. Then [0, 1] (and therefore any closed interval) is compact. This is (part) of the famous Heine-Borel Theorem. The proof uses the least upper bound axiom of the real numbers.

Let \mathscr{U} be any open cover for [0,1] . If no finite subcover exists, let x be the least upper bound of the subset F of [0,1] that CAN be covered by a finite subcollection of \mathscr{U} . Now x > 0 because at least one element of \mathscr{U} contains 0 and therefore contains [0, \delta ) for some \delta > 0 . Assume that x < 1 . Now suppose x \in F , that is x is part of the subset that can be covered by a finite subcover. Then because x \in U_{\beta} for some U_{\beta} \in \mathscr{U} then (x-\delta, x + \delta) \subset U_{\beta} which means that x + \delta \in F , which means that x isn’t an upper bound for F .

Now suppose x \notin F ; then because x < 1 there is still some U_{\beta} where (x-\delta, x+ \delta) \subset U_{\beta} . But since x = lub(F) then x - \delta \in F and so [0, x- \delta ) \subset F . So if F can be covered by \cup^k_{i=1} U_i then \cup^k_{i=1} U_i \cup U_{\beta} is a finite subcover of [0, x + \delta ) which means that x was not an upper bound. It follows that x = 1 which means that the unit interval is compact.

Now what about the closed ball in R^n ? The traditional way is to show that the closed ball is a closed subset of a closed hypercube in R^n and so if we show that the product of compact spaces is compact, we would be done. That is for later.

8. Now endow R^1 with the lower limit topology. That is, the open sets are generated by basis elements [a, b) . Note that the lower limit topology is strictly finer than the usual topology. Now in this topology: [0,1] is not compact. (note: none of (0,1), [0,1), (0, 1] are compact in the coarser usual topology, so there is no need to consider these). To see this, cover [0,1] by \cup ^{\infty}_{n=1} [0, \frac{n}{n+1}) \cup [1, \frac{3}{2}) and it is easy to see that this open cover has no finite subcover. In fact, with a bit of work, one can show that every compact subset is at most countable and nowhere dense; in fact, if A is compact in the lower limit topology and a \in A there exists some y_a where (y_a, a) \cap A = \emptyset .

January 16, 2015

Power sets, Function spaces and puzzling notation

I’ll probably be posting point-set topology stuff due to my being excited about teaching the course…finally.

Power sets and exponent notation
If A is a set, then the power set of A , often denoted by 2^A , is a set that consists of all subsets of A .

For example, if A = \{1, 2, 3 \} , then 2^A = \{ \emptyset , \{1 \}, \{ 2 \}, \{3 \}, \{1, 2 \}, \{1,3 \}, \{2, 3 \}, \{1, 2, 3 \} \} . Now is is no surprise that if the set A is finite and has n elements, then 2^A has 2^n elements.

However, there is another helpful way of listing 2^A . A subset of A can be defined by which elements of A that it has. So, if we order the elements of A as 1, 2, 3 then the power set of A can be identified as follows: \emptyset = (0, 0, 0), \{1 \} = (1, 0, 0), \{ 2 \} = (0,1,0), \{ 3 \} = (0, 0, 1), \{1,2 \} = (1, 1, 0), \{1,3 \} = (1, 0, 1), \{2,3 \} = (0, 1, 1), \{1, 2, 3 \} = (1, 1, 1)

So there is a natural correspondence between the elements of a power set and a sequence of binary digits. Of course, this makes the counting much easier.

The binary notation might seem like an unnecessary complication at first, but now consider the power set of the natural numbers: 2^N . Of course, listing the power sets would be, at least, cumbersome if not impossible! But there the binary notation really shows its value. Remember that the binary notation is a sequence of 0’s and 1’s where a 0 in the i’th slot means that element isn’t an element in a subset and a 1 means that it is.

Since a subset of the natural numbers is defined by its list of elements, every subset has an infinite binary sequence associated with it. We can order the sequence in the usual order 1, 2, 3, 4, ….
and the sequence 1, 0, 0, 0…… corresponds to the set with just 1 in it, the sequence 1, 0, 1, 0, 1, 0, 1, 0,… corresponds to the set consisting of all odd integers, etc.

Then, of course, one can use Cantor’s Diagonal Argument to show that 2^N is uncountable; in fact, if one uses the fact that every non-negative real number has a binary expansion (possibly infinite), one then shows that 2^N has the same cardinality as the real numbers.

Power notation
We can expand on this power notation. Remember that 2^A can be thought of setting up a “slot” or an “index” for each element of A and then assigning a 1 or 0 for every element of A . One can then think of this in an alternate way: 2^A can be thought of as the set of ALL functions from the elements of A to the set \{ 0, 1 \} . This coincides with the “power set” concept as set membership is determined by being either “in” or “not in”. So, the set in the exponent can be thought of either as the indexing set and the base as the value each indexed value can take on (sequences, in the case that the exponent set is either finite or countably finite), OR this can be thought of as the set of all functions where the exponent set is the domain and the base set is the range.

Remember, we are talking about ALL possible functions and not all “continuous” functions, or all “morphisms”, etc.

So, N^N can be thought of as either set set of all possible sequences of positive integers, or, equivalently, the set of all functions of N  to N .

Then R^N is the set of all real number sequences (i. e. the types of sequences we study in calculus), or, equivalently, the set of all real valued functions of the positive integers.

Now it is awkward to try to assign an ordering to the reals, so when we consider R^R it is best to think of this as the set of all functions f: R \rightarrow R , or equivalently, the set of all strings which are indexed by the real numbers and have real values.

Note that sequences don’t really seem to capture R^R in the way that they capture, say, R^N . But there is another concept that does, and that concept is the concept of the net, which I will talk about in a subsequent post.

February 17, 2014

Aitken acceleration: another “easy” example for the students

In a previous post I showed some spreadsheet data to demonstrate the Aitken acceleration process. Here, I’ll go through an example where a sequence converges linearly: let p_n = L +a^n + b^n where 0 < a < b < 1 . We use the form q_n = \frac{p_{n+2}p_{n} -(p_{n+1})^2}{p_{n+2} - 2p_{n+1} + p_n} (I am too lazy to use the traditional “p-hat” notation). First, note that the denominator works out to a^n(1-a)^2 +b^n(1-b)^2

The numerator is a tiny bit more work: the L^2 terms cancel and as far as the rest:
L(a^{n+2} + b^{n+2}) + L(a^n + b^n) +(a^{n+2}+b^{n+2})(a^n + b^2)-2L(a^{n+1}+b^{n+1})-(a^{n+1}+b^{n+1})^2
which simplifies to a term involving L and one that doesn’t. Here is the term involving L :

L(a^{n+2}-2a^{n+1}  + a^n + b^{n+2} -2b^{n+1} +b^n) = L(a^n(1-a)^2 +b^n(1-b)^2)

which, of course, is just L times the denominator.

Now the terms not involving L: (a^{n+2}+b^{n+2})(a^n+b^n) - (a^{n+1} + b^{n+1})^2 = b^na^n(b^2+a^2-2ab) = b^na^n(b-a)^2

So our fraction is merely

\frac{L((a^n(1-a)^2 +b^n(1-b)^2)) + b^na^n(b-a)^2}{a^n(1-a)^2 +b^n(1-b)^2} = L + \frac{b^na^n(b-a)^2}{a^n(1-a)^2 +b^n(1-b)^2}

This can be rearranged to L + \frac{(b-a)^2}{\frac{(1-a)^2}{b^n} + \frac{(1-b)^2}{a^n}}

Clearly as n goes to infinity, the error goes to zero very quickly. It might be instructive to look at the ratio of the errors for p_n and q_n :

This ratio is
(a^n + b^n)\frac{a^n(1-a)^2 + b^n(1-b)^2}{a^nb^n(b-a)^2} =(a^n +b^n)(\frac{1}{b^n}(\frac{1-a}{b-a})^2 + \frac{1}{a^n}(\frac{1-b}{b-a})^2)

Note that in the right hand factor: both squared factors are fixed and the coefficients go to infinity as n goes to infinity. If one multiplies out, one obtains:

((\frac{a}{b})^n +1)(\frac{1-a}{b-a})^2 +  ((\frac{b}{a})^n +1)(\frac{1-b}{b-a})^2 . In the limit, the first term decreases to (\frac{1-a}{b-a})^2 and the second goes to infinity.

Hence the errors in the accelerated sequence are smaller.

February 8, 2014

Demonstrating Aitken’s sequence acceleration

Right now, we are studying the various root finding algorithms in Numerical Analysis (bisection, Newton’s, etc.)

Such algorithms yield a sequence of numbers, which (hopefully) converge to a solution: p_0, p_1, p_2, p_3, .....

Of course, each point in the sequence is obtained by calculations and, if there were a way to combine these points so as to obtain a sequence that converges faster (while not adding much to the computation complexity), there is some benefit. And yes, it is possible to use the sequence points, do the series manipulation and use the manipulated points in the root finding algorithm itself (e. g. Steffensen’s method).

In this post, I’ll talk about Aitken’s method and how one can cook up examples that not only show that the method can work but give the students some intuition as to why it might work.

I’ll provide just a bit of background in the event that the general reader comes across this.

Let p_i \rightarrow p . If we have \frac{|p_{n+1} - p|}{(|p_n -p|)^{\alpha}} \rightarrow \lambda (\lambda is positive, of course) we say that the convergence is LINEAR if \alpha = 1 and \lambda < 1 the inequality must be strict. If \alpha = 2 then we say that convergence is quadratic (regardless of \lambda .)

To see the reason for the terminology, just multiply both sides of the “defining equation” by the denominator. In the linear case : |p_{n+1} - p| = \lambda |p_{n} - p| so one can think: “get new approximation by multiplying the old approximation by some constant less than one”. For example: p_n = \frac{1}{2^n} exhibits linear convergence to 0; that is a decent way to think about it.

Now (in the linear convergence case anyway), suppose you think of your approximation having an error that shrinks with iteration but shrinks in the following way: the n’th iteration looks like p_n = p + a^n + b^n where a, b are constants strictly between 0 and 1. Of course, as n goes to infinity, p_n approaches the limit p as the error terms die away.

Aitken’s method is this: let’s denote a new sequence by q_n . (I am not using the traditional P-hat out of laziness) Then q_n = p_n -\frac{(p_{n+1} - p_n)^2}{p_{n+2} -2p_{n+1} + p_{n}} . To see how this formula is obtained, check out these excellent course notes. Or the Wiki article is pretty good.

Look at the numerator of what is being subtracted off: if we have the terms written (very roughly) as p_{n+1} = p + a^{n+1} + b^{n+1}, p_n = a^n + b^n and if, say, a is much larger than b, then a^{n+1} will be closer to a^n than b^{n+1} is to b^n , hence more of this error will be subtracted away.

Yes, I know that his is simple-minded, but it “gets the spirit” of the process. I’ve set up some spreadsheets with “cooked” sequences linearly converging to zero p_n = a^n + b^n and showed how the Aitken process works there. Note: the spreadsheet round off errors start occurring at the 10^{-16} range; you can see that here.


(click to see the larger image)

To see an abstract example where p_n = L + a^n + b^n where a, b \in (0,1) , go to the next post in this series.

January 20, 2014

A bit more prior to admin BS

One thing that surprised me about the professor’s job (at a non-research intensive school; we have a modest but real research requirement, but mostly we teach): I never knew how much time I’d spend doing tasks that have nothing to do with teaching and scholarship. Groan….how much of this do I tell our applicants that arrive on campus to interview? ๐Ÿ™‚

But there is something mathematical that I want to talk about; it is a follow up to this post. It has to do with what string theorist tell us: \sum^{\infty}_{k = 1} k = -\frac{1}{12} . Needless to say, they are using a non-standard definition of “value of a series”.

Where I think the problem is: when we hear “series” we think of something related to the usual process of addition. Clearly, this non-standard assignment doesn’t related to addition in the way we usually think about it.

So, it might make more sense to think of a “generalized series” as a map from the set of sequences of real numbers (or: the infinite dimensional real vector space) to the real numbers; the usual “limit of partial sums” definition has some nice properties with respect to sequence addition, scalar multiplication and with respect to a “shift operation” and addition, provided we restrict ourselves to a suitable collection of sequences (say, those whose traditional sum of components are absolutely convergent).

So, this “non-standard sum” can be thought of as a map f:V \rightarrow R^1 where f(\{1, 2, 3, 4, 5,....\}) \rightarrow -\frac{1}{12} . That is a bit less offensive than calling it a “sum”. ๐Ÿ™‚

November 18, 2013

And I get sloppy….divergence of n!x^n

Filed under: calculus, sequences, series — Tags: — collegemathteaching @ 9:29 pm

In class I was demonstrating the various open intervals of absolute convergence and gave the usual \sum k!x^k as an example of a series that converges at x = 0 only. I mentioned that “\sum k!x^k doesn’t even pass the divergence test”, which, as it turns out, is true. But why? (yes, it is easier to just use the ratio test and be done with it)

Well, I should have noted: if x > 0 , then x > \frac{1}{m} for some integer m, then for k > m we have k!x^k > \frac{1*2*3...*m *(m+1)*(m+2)...*k}{m*m*m...*m*m*m...*m} and one can see that this is a finite number times a number which is growing without bound. Hence the sequence of terms of the series grows without bound for any positive value of x .

Older Posts »

Create a free website or blog at