College Math Teaching

May 21, 2021

Introduction to infinite series for inexperienced calculus teachers

Filed under: calculus, mathematics education, pedagogy, Power Series, sequences, series — oldgote @ 1:26 pm

Let me start by saying what this is NOT: this is not an introduction for calculus students (too steep), nor is it intended for experienced calculus teachers. Nor is this a “you should teach it THIS way” or “introduce the concepts in THIS order or emphasize THESE topics” piece; that is for the individual teacher to decide.

Rather, this is a quick overview to help the new teacher (or for the teacher who has not taught it in a long time) decide for themselves how to go about it.

And yes, I’ll be giving a lot of opinions; disagree if you like.

What series will be used for.

Of course, infinite series have applications in probability theory (discrete density functions, expectation and higher moment values of discrete random variables), financial mathematics (perpetuities), etc. and these are great reasons to learn about them. But in calculus, these tend to be background material for power series.

Power series: \sum^{\infty}_{k=0} a_k (x-c)^k . The most important thing is to determine the open interval of absolute convergence; that is, the interval on which \sum^{\infty}_{k=0} |a_k (x-c)^k | converges.

We teach that these regions of convergence are *always* symmetric about x = c : the series converges at x = c only, on some open interval (c-\delta, c+ \delta) , or on the whole real line. Side note: this is an interesting place to point out the influence that the calculus of complex variables has on real variable calculus! These open intervals are the most important aspect, as one can prove that one can differentiate and integrate said series “term by term” on the open interval of absolute convergence; sometimes one can extend the results to the boundary of the interval.

Therefore, if time is limited, I tend to focus on material more relevant for series that are absolutely convergent, though there are some interesting (and fun) things one can do with a series which is conditionally convergent (convergent, but not absolutely convergent; e. g. \sum^{\infty}_{k=1} (-1)^{k+1} {1 \over k} ).

Important principles: I think it is a good idea to first deal with geometric series and then series with positive terms…make that “non-negative” terms.

Geometric series: \sum ^{\infty}_{k =0} x^k ; here we see that for x \neq 1 , \sum ^{n}_{k =0} x^k= {1-x^{n+1} \over 1-x } , and the partial sum equals n+1 for x = 1 . To show this, do the old “shifted sum” addition: S = 1 + x + x^2 + ...+x^n , xS = x+x^2 + ...+x^{n+1} , then subtract: S-xS = (1-x)S = 1-x^{n+1} , as most of the terms cancel in the subtraction.

Now to show when the geometric series converges (convergence being the standard kind: with S_n = \sum^n_{k = 0} c_k the “n’th partial sum,” the series \sum^{\infty}_{k = 0} c_k converges if and only if the sequence of partial sums S_n converges; yes, there are other types of convergence):

We’ve established that for the geometric series S_n =  {1-x^{n+1} \over 1-x } , so we get convergence exactly when |x^{n+1}| goes to zero, which happens if and only if |x| < 1 .
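If you want to see this numerically before proving it, here is a quick check (a Python sketch; the function name and test values are mine):

```python
# Compare the brute-force partial sum of the geometric series with the
# closed form (1 - x^(n+1)) / (1 - x), and watch what |x| < 1 buys us.
def geometric_partial_sum(x, n):
    """S_n = sum_{k=0}^{n} x^k, computed by direct addition."""
    return sum(x**k for k in range(n + 1))

for x in (0.5, -0.9, 1.5):
    n = 50
    direct = geometric_partial_sum(x, n)
    closed = (1 - x**(n + 1)) / (1 - x)
    print(f"x = {x:5.2f}: S_50 = {direct:.6f}, closed form = {closed:.6f}")
# For |x| < 1 the sums settle near 1/(1 - x); for x = 1.5 they blow up.
```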

Why geometric series: two of the most common series tests (root and ratio tests) involve a comparison to a geometric series. Also, the geometric series concept is used both in the theory of improper integrals and in measure theory (e. g., showing that the rational numbers have measure zero).

Series of non-negative terms. For now, we’ll assume that \sum a_k has all a_k \geq 0 (suppressing the indices).

Main principle: though most texts talk about the various tests, I believe that most of the tests really rest on three key principles, two of which are the geometric series and the following result about sequences of positive numbers:

Key sequence result: every bounded, monotone increasing sequence of positive numbers converges to its least upper bound.

True: many calculus texts don’t do that much with the least upper bound concept, but I feel it is intuitive enough to at least mention. If the least upper bound is, say, b , then if a_n is the sequence in question, for any small, positive \delta there has to be some N  > 0 such that a_N > b-\delta . Then because a_n is monotone increasing, b \geq a_{m} > b-\delta for all m > N .

The third key principle is “common sense” : if \sum c_k converges (standard convergence) then c_k \rightarrow 0 as a sequence. This is pretty clear if the c_k are non-negative; the idea is that the sequence of partial sums S_n cannot converge to a limit unless |S_n -S_{n+1}| becomes arbitrarily small. Of course, this is true even if the terms are not all positive.

Secondary results: I think that the next results are “second order” results: the main convergence tests depend on these, and these depend on the three key principles that we just discussed.

The first of these secondary results is the direct comparison test for series of non-negative terms:

Direct comparison test

If 0< c_n \leq b_n  and \sum b_n converges, then so does \sum c_n . If \sum c_n diverges, then so does \sum b_n .

The proof is basically the “bounded monotone sequence” principle applied to the partial sums. I like to call it the “if you are taller than an NBA center then you are tall” principle.

Evidently, some see this result as a “just get to something else” result, but it is extremely useful; one can apply this to show that the exponential of a square matrix is defined; it is the principle behind the Weierstrass M-test, etc. Do not underestimate this test!

Absolute convergence: this is the most important kind of convergence for power series as this is the type of convergence we will have on an open interval. A series is absolutely convergent if \sum |c_k| converges. Now, of course, absolute convergence implies convergence:

Note that 0 \leq |c_k| -c_k \leq 2|c_k| , so if \sum |c_k| converges, then \sum (|c_k|-c_k) converges by direct comparison. Now write c_k = |c_k|-(|c_k| -c_k) ; then \sum c_k = \sum |c_k| -\sum (|c_k|-c_k ) is the difference of two convergent series and therefore converges.

Integral test This is an important test for convergence at a point. This test assumes that f is a non-negative, non-increasing function on some [1, \infty) (that is, a >b \rightarrow f(a) \leq f(b) ). Then \sum f(n) converges if and only if \int_1^{\infty} f(x)dx converges as an improper integral.

Proof: \sum_{n=2} f(n) is just a right endpoint Riemann sum for \int_1^{\infty} f(x)dx , so if the integral converges, the sequence of partial sums is an increasing, bounded sequence, hence convergent. Now if the sum converges, note that \sum_{n=1} f(n) is the left endpoint estimate for \int_1^{\infty} f(x)dx , so the integral can be defined as a limit of a bounded, increasing sequence and therefore converges.


Note: we need the hypothesis that f is non-increasing. Example: the function f(x) = \begin{cases}  x , & \text{ if } x \notin \{1, 2, 3,...\} \\ 0, & \text{ otherwise} \end{cases} certainly has \sum f(n) converging but \int^{\infty}_{1} f(x) dx diverging.

Going the other way, defining f(x) = \begin{cases}  2^n , & \text{ if }  x \in [n, n+2^{-2n}] \\0, & \text{ otherwise} \end{cases} gives an unbounded function whose sum \sum_{n=1} 2^n diverges, but whose integral converges to \sum_{n=1} 2^{-n} =1 . The “boxes” get taller and skinnier.


Now wait a minute: we haven’t really gone over how students will do most of their homework and exam problems. We’ve covered none of these: p-test, limit comparison test, ratio test, root test. Ok, logically, we have but not practically.

Let’s remedy that. First, start with the “point convergence” tests.

p-test. This says that \sum {1 \over k^p} converges if p> 1 and diverges otherwise. Proof: Integral test.
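A quick numerical illustration (a Python sketch; nothing here is a proof, since the harmonic series grows without bound but very slowly):

```python
# Partial sums of sum 1/k^p: p = 1 creeps upward forever, p = 2 levels off.
def p_series_partial(p, n):
    return sum(1.0 / k**p for k in range(1, n + 1))

for n in (10**2, 10**4, 10**6):
    print(f"n = {n:>7}: p=1 gives {p_series_partial(1, n):8.4f}, "
          f"p=2 gives {p_series_partial(2, n):.6f}")
# The p = 2 column approaches pi^2/6 = 1.644934...; the p = 1 column grows
# like ln(n) and never stops.
```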

Limit comparison test Given two series of positive terms: \sum b_k and \sum c_k

Suppose lim_{k \rightarrow \infty} {b_k \over c_k} = L

If \sum c_k converges and 0 \leq L < \infty then so does \sum b_k .

If \sum c_k diverges and 0 < L \leq \infty then so does \sum b_k

I’ll show the “converge” part of the proof for L > 0 : choose \epsilon = L , then N such that n > N \rightarrow  {b_n \over c_n } < 2L . This means \sum_{k=N+1} b_k \leq 2L \sum_{k=N+1} c_k , and we get convergence by direct comparison. (For L = 0 , use \epsilon = 1 , so that eventually b_n < c_n .) See how useful that test is?

But note what is going on: it really isn’t necessary for lim_{k \rightarrow \infty} {b_k \over c_k}  to exist; for the convergence case it is only necessary that there be some M for which M >  {b_k \over c_k}  for all sufficiently large k ; if one is familiar with the limit superior (“limsup”), that is enough to make the test work.

We will see this again.

Why limit comparison is used: Something like \sum {1 \over 4k^5-2k^2-14} clearly converges, but nailing down the proof with direct comparison can be hard. But a limit comparison with \sum {1 \over k^5} is pretty easy.

Ratio test This test is most commonly used when the series has powers and/or factorials in it. Basically: given \sum c_n , consider lim_{k \rightarrow \infty} {c_{k+1} \over c_{k}} = L (if the limit exists; if it doesn’t, stay tuned).

If L < 1 the series converges. If L > 1 the series diverges. If L = 1 the test is inconclusive.

Note: if it turns out that there exists some N >0 such that for all n > N we have {c_{n+1} \over c_n } < \gamma < 1 , then the series converges (we can use the limsup concept here as well).

Why this works: suppose there exists some N >0 such that for all n > N we have {c_{n+1} \over c_n } < \gamma < 1 . Then for n > N , write \sum_{k=n} c_k = c_n + c_{n+1} + c_{n+2} + ....

now factor out a c_n to obtain c_n (1 + {c_{n+1} \over c_n} + {c_{n+2} \over c_n} + {c_{n+3} \over c_{n}} +....)

Now multiply the terms by 1 in a clever way:

c_n (1 + {c_{n+1} \over c_n} + {c_{n+2} \over c_{n+1}}{c_{n+1} \over c_n} + {c_{n+3} \over c_{n+2}}  {c_{n+2} \over c_{n+1}}  {c_{n+1} \over c_{n}}   +....) See where this is going: each ratio is less than \gamma so we have:

\sum_{k=n} c_k \leq c_n \sum_{j=0} (\gamma)^j which is a convergent geometric series.

See: there are the geometric series and the direct comparison test, again.

Root Test No, this is NOT the same as the ratio test. In fact, it is a bit “stronger” than the ratio test, in that the root test will work for anything the ratio test works for, but there are some series for which the root test works and the ratio test comes up empty.

I’ll state the “lim sup” version of the root test: if there exists some N such that, for all n>N , we have (c_n)^{1 \over n} < \gamma < 1 , then the series converges (exercise: find the “divergence version”).

As before: if the condition is met, \sum_{k=n} c_k \leq \sum_{k=n} \gamma^k , so the original series converges by direct comparison.

Now as far as my previous remark about the ratio test: Consider the series:

1 + ({1 \over 3}) + ({2 \over 3})^2 + ({1 \over 3})^3 + ({2 \over 3})^4 +...({1 \over 3})^{2k-1} +({2 \over 3})^{2k} ...

Yes, this series is bounded by the convergent geometric series with r = {2 \over 3} and therefore converges by direct comparison. And the limsup version of the root test works as well.

But the ratio test is a disaster, as {({2 \over 3})^{2k}  \over  ({1 \over 3})^{2k-1} } ={2^{2k} \over 3 } , which is unbounded, while {({1 \over 3})^{2k+1}  \over  ({2 \over 3})^{2k} }  ={1 \over (2^{2k} 3) } .
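One can watch this happen numerically; here is a small sketch (Python; the helper name is mine):

```python
# Terms of the mixed series: c_k = (1/3)^k for odd k, (2/3)^k for even k.
# The k-th roots stay at 1/3 or 2/3 (so the root test gives convergence),
# while the ratios c_{k+1}/c_k swing between huge and tiny.
def c(k):
    return (1/3)**k if k % 2 == 1 else (2/3)**k

for k in range(1, 12):
    print(f"k = {k:2d}: ratio = {c(k+1)/c(k):12.4f}, "
          f"k-th root = {c(k)**(1/k):.4f}")
```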

What about non-absolute convergence (aka “conditional convergence”)

Series like \sum_{k=1} (-1)^{k+1} {1 \over k} converge but do NOT converge absolutely (p-test). On one hand, such series are a LOT of fun…but the convergence is very slow and unstable, and so one might say that these series are not as important as series that converge absolutely. But there is a lot of interesting mathematics to be had here.

So, let’s chat about these a bit.

We say \sum c_k is conditionally convergent if the series converges but \sum |c_k| diverges.

One elementary tool for dealing with these is the alternating series test:

for this, let c_k >0 and for all k, c_{k+1} < c_k .

Then \sum_{k=1} (-1)^{k+1} c_k converges if and only if c_k \rightarrow 0 as a sequence.

That the sequence of terms goes to zero is necessary. That it is sufficient in this alternating case: first note that the partial sums are bounded above by c_1 (as the magnitudes get steadily smaller) and below by c_1 - c_2 (same reason). Note also that S_{2k+2} = S_{2k} +c_{2k+1} - c_{2k+2} > S_{2k} , so the partial sums of even index form an increasing bounded sequence and therefore converge to some limit, say, L . But S_{2k+1} = S_{2k} + c_{2k+1} and c_{2k+1} \rightarrow 0 , so by a routine “epsilon-N” argument the odd partial sums converge to L as well.

Of course, there are conditionally convergent series that are NOT alternating. And conditionally convergent series have some interesting properties.

One of the most interesting properties is that such series can be “rearranged” (“derangement” in Knopp’s book) to converge to any number of choice, or to diverge to infinity, or to have no limit at all.

Here is an outline of the arguments:

To rearrange a series to converge to L , start with the positive terms (whose sum must diverge, as the series is conditionally convergent) and add them up until L is exceeded; stop just after L is exceeded. Call that partial sum u_1 . Note: this could be 0 terms. Now use the negative terms to go to the left of L , stopping just after passing it. Call that partial sum l_1 . Then move to the right, past L again, with the positive terms; note that the overshoot is smaller as the terms are smaller. This is u_2 . Then go back again to get l_2 to the left of L . Repeat.

Note that at every stage, every partial sum after the first pass of L lies between some u_i and l_i , and the u_i, l_i bracket L with the gap shrinking to become arbitrarily small.

To rearrange a series to diverge to infinity: add positive terms until the partial sum exceeds 1. Add a negative term. Then add positive terms until the partial sum exceeds 2. Add a negative term. Repeat this for each positive integer n .

Have fun with this; you can have the partial sums end up all over the place.
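For those who want to experiment, here is a sketch of the “converge to a chosen target” rearrangement for the alternating harmonic series (Python; the target value and names are mine):

```python
# Rearrange 1 - 1/2 + 1/3 - ... to converge to a chosen target: add positive
# terms (odd denominators) while at or below the target, negative terms
# (even denominators) while above it, and repeat forever.
def rearranged_sum(target, num_terms=200000):
    next_odd, next_even = 1, 2
    total = 0.0
    for _ in range(num_terms):
        if total <= target:
            total += 1.0 / next_odd
            next_odd += 2
        else:
            total -= 1.0 / next_even
            next_even += 2
    return total

print(rearranged_sum(3.14159))   # hovers near the chosen target
```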

That’s it for now; I might do power series later.

May 10, 2021

Series convergence tests: the “harder to use in calculus 1” tests may well be the most useful.

I talked about the root and ratio tests here, and how the root test is the stronger of the two. What I should point out is that the proof of the root test depends on the basic comparison test.

And so..a professor on Twitter asked:

Of course, one proves the limit comparison test by the direct comparison test. But in a calculus course, the limit comparison test might appear to be more readily useful. Example:

Show \sum {1 \over k^2-1} (starting at k = 2 ) converges.

So..what about the direct comparison test?

As someone pointed out: the direct comparison test can work very well when you don’t know much about the terms of the series.

One example can be found when one shows that the matrix exponential e^A is defined, where A is an n \times n matrix.

For those unfamiliar: e^A = \sum^{\infty}_{k=0} {A^k \over k!} where the powers make sense as A is square and we merely add the corresponding matrix entries.

What enables convergence is the factorial in the denominators of the individual terms; the i-j’th element of each A^k can get only so large.

But how does one prove convergence?

The usual way is to dive into matrix norms; one that works well is |A| = \sum_{(i,j)} |a_{i,j}| : just sum up the absolute values of the entries (the taxicab norm, or l_1 norm).

Then one can show |AB| \leq |A||B| and |a_{i,j}| \leq |A| and together this implies the following:

For any index k where a^k_{i,j} is the i-j’th element of A^k we have:

| a^k_{i,j}  | \leq |A^k| \leq |A|^k

It then follows that | [ e^A ]_{i,j} | \leq \sum^{\infty}_{k=0} {|A^k |\over k!} \leq  \sum^{\infty}_{k=0} {|A|^k \over k!} =e^{|A|} . Therefore every series that determines an entry of the matrix e^A is absolutely convergent by direct comparison, and is therefore a convergent series.
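Here is a numerical sketch of the above (Python with numpy; the 60-term cutoff and the test matrix are my choices, not part of the proof):

```python
# Sum the matrix exponential series directly; the k! in the denominator is
# what makes every entry's series converge, as the comparison above shows.
import numpy as np

def expm_series(A, terms=60):
    """Approximate e^A by summing A^k / k! term by term."""
    result = np.eye(A.shape[0])
    power = np.eye(A.shape[0])
    for k in range(1, terms):
        power = power @ A / k        # power now holds A^k / k!
        result = result + power
    return result

A = np.array([[0.0, 1.0], [-1.0, 0.0]])
print(expm_series(A))
# For this A the answer is a rotation: [[cos 1, sin 1], [-sin 1, cos 1]].
```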

February 5, 2016

More fun with selective sums of divergent series

Just a reminder: if \sum_{k=1}^{\infty} a_k is a series and c_1, c_2, ...c_n , ... is some sequence consisting of 0’s and 1’s, then a selective sum of the series is \sum_{k=1}^{\infty} c_k a_k . The selective sum concept is discussed in the MAA book Real Infinite Series (MAA Textbooks) by Bonar and Khoury (2006); I was introduced to the concept by Ferdinands’s article Selective Sums of an Infinite Series in the June 2015 edition of Mathematics Magazine (Vol. 88, 179-185).

There is much of interest there, especially if one considers convergent series or alternating series.

This post will be about divergent series of positive terms for which lim_{n \rightarrow \infty} a_n = 0 and a_{n+1} < a_n for all n .

The first fun result is this one: any x > 0 is a selective sum of such a series. The proof of this isn’t that bad. Since lim_{n \rightarrow \infty} a_n = 0 , we can find a smallest n such that a_n \leq x . Clearly if a_n = x we are done: our selective sum has c_n = 1 and the rest of the c_k = 0 .

If not, set n_1 = n and note that because the series diverges, there is a largest m_1 so that \sum_{k=n_1}^{m_1} a_k \leq x . Now if \sum_{k=n_1}^{m_1} a_k = x we are done, else let \epsilon_1 = x - \sum_{k=n_1}^{m_1} a_k and note \epsilon_1 < a_{m_1+1} . Now because the a_k tend to zero, there is some first n_2 so that a_{n_2} \leq \epsilon_1 . If this is equality then the required sum is a_{n_2} + \sum_{k=n_1}^{m_1} a_k , else we can find the largest m_2 so that \sum_{k=n_1}^{m_1} a_k + \sum_{k=n_2}^{m_2} a_k \leq x

This procedure can be continued indefinitely. So if we label \sum_{k=n_j}^{m_{j}} a_k = s_j , we see that s_1 + s_2 + ...s_{n} = t_{n} forms an increasing, bounded sequence which converges to the least upper bound of its range, and it isn’t hard to see that the least upper bound is x because x-t_{n} =\epsilon_n < a_{m_n+1} .
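A greedy one-term-at-a-time variant of this construction is easy to code (a Python sketch; it scans the harmonic series and keeps any term that doesn’t overshoot):

```python
# Greedily build a selective sum of the harmonic series approaching x > 0:
# keep 1/k whenever adding it does not push the running total past x.
def greedy_selective_sum(x, num_terms=10**6):
    total, selected = 0.0, []
    for k in range(1, num_terms + 1):
        if total + 1.0 / k <= x:
            total += 1.0 / k
            selected.append(k)
    return total, selected

total, selected = greedy_selective_sum(2.5)
print(total, selected[:8])   # the total creeps up toward 2.5 from below
```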

So now that we can obtain any positive real number as the selective sum of such a series, what can we say about the set of all selective sums for which almost all of the c_k = 0 (that is, all but a finite number of the c_k are zero)?

Answer: the set of all such selective sums is dense in the real line, and this isn’t that hard to see, given our above construction. Let (a,b) be any open interval in the real line and let a < x < b . Now run the construction for x and choose m large enough that x - t_m < x - a . Then t_m > a , and since t_m < x < b , the finite selective sum t_m lies in the interval (a,b) .

We can be even more specific if we now look at a specific series, such as the harmonic series \sum_{k=1}^{\infty} \frac{1}{k} . We know that the set of finite selective sums forms a dense subset of the real line. But it turns out that the set of finite selective sums is precisely the set of positive rationals. I’ll give a slightly different proof than one finds in Bonar and Khoury.

First we prove that every rational in (0,1] is a finite selective sum. Clearly 1 is a finite selective sum. Otherwise: given \frac{p}{q} , we can find the minimum n so that \frac{1}{n} \leq \frac{p}{q} < \frac{1}{n-1} . If \frac{p}{q} = \frac{1}{n} we are done. Otherwise: the strict inequality shows that pn-p < q , which means pn-q < p . Then note \frac{p}{q} - \frac{1}{n} = \frac{pn-q}{qn} , and this fraction has a strictly smaller numerator than p . So we can repeat our process with this new rational number, and the process must eventually terminate because the numerators generated form a strictly decreasing sequence of positive integers. The process can only terminate when the new fraction has a numerator of 1. Hence the original fraction is a sum of distinct fractions with numerator 1 (the denominators strictly increase at each step).
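The proof above is an algorithm (the classic greedy “Egyptian fraction” expansion), and it is short to code; here is a sketch using exact rational arithmetic:

```python
# Write p/q in (0, 1] as a sum of distinct unit fractions by repeatedly
# subtracting the largest unit fraction that fits, as in the proof above.
from fractions import Fraction
from math import ceil

def unit_fraction_expansion(p, q):
    r = Fraction(p, q)
    denominators = []
    while r > 0:
        n = ceil(1 / r)              # the minimal n with 1/n <= r
        denominators.append(n)
        r -= Fraction(1, n)
    return denominators

print(unit_fraction_expansion(5, 7))   # [2, 5, 70]: 5/7 = 1/2 + 1/5 + 1/70
```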

Now if the rational number r in question is greater than one, one finds n_1 so that \sum^{n_1}_{k=1} \frac{1}{k} \leq r but \sum^{n_1+1}_{k=1} \frac{1}{k} > r . Then write r-\sum^{n_1}_{k=1} \frac{1}{k} and note that its magnitude is less than \frac{1}{n_1+1} . We then use the procedure for numbers in (0,1] , noting that our starting point excludes the previously used terms of the harmonic series.

There is more we can do, but I’ll stop here for now.

January 26, 2016

More Fun with Divergent Series: redefining series convergence (Cesàro, etc.)

Filed under: analysis, calculus, sequences, series — collegemathteaching @ 10:21 pm

This post is more designed to entertain myself than anything else. This builds on a previous post, which talks about deleting enough terms from a divergent series to make it a convergent one.

This post is inspired by Chapter 8 of Konrad Knopp’s classic Theory and Application of Infinite Series. The title of the chapter is Divergent Series.

Notation: when I talk about a series converging, I mean “converging” in the usual sense; e. g. if s_n = \sum_{k=0}^{k=n} a_k and lim_{n \rightarrow \infty}s_n = s then \sum_{k=0}^{\infty} a_k is said to be convergent with sum s .

All of this makes sense since things like limits are carefully defined. But as Knopp points out, in the “days of old” mathematicians saw these series as formal objects rather than the result of careful construction. So some of these mathematicians (like Euler) had no problem saying things like \sum^{\infty}_{k=0} (-1)^k = 1-1+1-1+1..... = \frac{1}{2} . Now this is complete nonsense by our usual modern definition. But we might note that \frac{1}{1-x} = \sum^{\infty}_{k=0} x^k for -1 < x < 1 , and that x = -1 IS in the domain of the left hand side.

So, is there a way of redefining the meaning of “infinite sum” that gives us this result, while not changing the value of convergent series (defined in the standard way)? As Knopp points out in his book, the answer is “yes” and he describes several definitions of summation that

1. Do not change the value of an infinite sum that converges in the traditional sense and
2. Allow more series to converge.

We’ll discuss one of these methods, commonly referred to as Cesàro summation. There are ways to generalize this.

How this came about

Consider the Euler example: 1 -1 + 1 -1 + 1 -1...... . Clearly, s_{2k} = 1, s_{2k+1} = 0 , and so this geometric series diverges. But notice that the arithmetic average of the partial sums, computed as c_n = \frac{s_0 + s_1 +...+s_n}{n+1} , does tend to \frac{1}{2} as n tends to infinity: c_{2n} = \frac{n+1}{2n+1} whereas c_{2n+1} = \frac{n+1}{2n+2} = \frac{1}{2} , and both of these quantities tend to \frac{1}{2} as n tends to infinity.
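This is easy to watch numerically (a Python sketch; names are mine):

```python
# Cesaro means of Grandi's series 1 - 1 + 1 - 1 + ...: the partial sums
# bounce between 1 and 0, but their running averages settle at 1/2.
def cesaro_means(terms):
    partial, running, means = 0.0, 0.0, []
    for n, a in enumerate(terms):
        partial += a
        running += partial
        means.append(running / (n + 1))
    return means

grandi = [(-1)**k for k in range(1000)]
print(cesaro_means(grandi)[-3:])   # all very close to 0.5
```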

So, we need to see that this method of summing is workable; that is, do infinite sums that converge in the previous sense still converge to the same number with this method?

The answer is, of course, yes. Here is how to see this: let x_n be a sequence that converges to zero. Then for any \epsilon > 0 we can find M such that k > M implies |x_k| < \epsilon . So for n > k we have \frac{x_1 + x_2 + ...+ x_{k-1} + x_k + ...+ x_n}{n} = \frac{x_1+ ...+x_{k-1}}{n} + \frac{x_k + x_{k+1} + ....+x_n}{n} . Because k is fixed, the first fraction tends to zero as n tends to infinity, and the second fraction is smaller than \epsilon in absolute value. But \epsilon is arbitrary; hence the arithmetic averages of this null sequence form a null sequence themselves.

Now let x_n \rightarrow L and let c_n = \frac{x_1 + x_2 + ...+ x_{k-1} + x_k + ...+ x_n}{n} . Now note that c_n-L =  \frac{(x_1-L) + (x_2-L) + ...+ (x_{k-1}-L) +(x_k-L) + ...+ (x_n-L)}{n} , and the x_n-L form a null sequence. Then so do the c_n-L .

Now to be useful, we’d have to show that series that are summable in the Cesàro sense obey things like the multiplicative laws; they do, but I am too lazy to show that. See the Knopp book.

I will mention a couple of interesting (to me) things though. Neither is really profound.

1. If a series diverges to infinity (that is, if for any positive M there exists n such that for all k \geq n, s_k > M ), then this series is NOT Cesàro summable. It is relatively easy to see why: given such an M and k , consider \frac{s_1 + s_2 + s_3 + ...+s_{k-1} + s_k + s_{k+1} + ...s_n}{n} = \frac{s_1+ s_2 + ...+s_{k-1}}{n} + \frac{s_k + s_{k+1} .....+s_{n}}{n} ; the second fraction is greater than \frac{n-k}{n} M , which exceeds \frac{M}{2} for large n , while the first tends to zero. Hence the Cesàro partial sums become unbounded.

Upshot: there is no hope in making something like \sum^{\infty}_{n=1} \frac{1}{n} into a convergent series by this method. Now there is a way of making an alternating, divergent series into a convergent one via doing something like a “double Cesàro sum” (take arithmetic averages of the arithmetic averages) but that is a topic for another post.

2. Cesàro summation may speed up convergence of an alternating series which passes the alternating series test, OR it might slow it down. I’ll have to develop this idea more fully. But I invite the reader to try Cesàro summation on \sum^{\infty}_{k=1} (-1)^{k+1} \frac{1}{k} , on \sum^{\infty}_{k=1} (-1)^{k+1} \frac{1}{k^2} , and on \sum^{\infty}_{k=0} (-1)^k \frac{1}{2^k} . In the first two cases, the series converges slowly enough that Cesàro summation speeds up convergence; Cesàro slows down the convergence of the geometric series, though. It is interesting to ponder why.
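Here is a sketch the reader can use for that experiment (Python; the 1000-term cutoff is an arbitrary choice of mine):

```python
# Compare the ordinary partial sum with the Cesaro mean for the three series
# suggested above; the exact sums are ln 2, pi^2/12, and 2/3.
from math import log, pi

def partial_and_cesaro(terms):
    total, running = 0.0, 0.0
    for a in terms:
        total += a
        running += total
    return total, running / len(terms)

N = 1000
tests = [("alternating harmonic", [(-1)**k / (k+1) for k in range(N)], log(2)),
         ("alternating 1/k^2",    [(-1)**k / (k+1)**2 for k in range(N)], pi**2/12),
         ("geometric r = -1/2",   [(-1)**k / 2**k for k in range(N)], 2/3)]
for name, terms, exact in tests:
    s, c = partial_and_cesaro(terms)
    print(f"{name}: partial-sum error {abs(s-exact):.2e}, "
          f"Cesaro error {abs(c-exact):.2e}")
```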

January 14, 2016

Trimming a divergent series into a convergent one

Filed under: calculus, induction, sequences, series — collegemathteaching @ 10:28 pm

This post is motivated by this cartoon
[cartoon: harmonic-series]

which I found at Evelyn Lamb’s post on an AMS blog, by this fun Forbes math post by Kevin Knudson, and by a June 2015 article in Mathematics Magazine by R. John Ferdinands called Selective Sums of an Infinite Series.

Here is the question: start with a divergent series of positive terms which form a decreasing (non-increasing) sequence tending to zero, say, \sum^{\infty}_{k =1} \frac{1}{k} . Now how does one select a subset of series terms to delete so as to obtain a convergent series? The Knudson article shows that one can do this with the harmonic series by, say, deleting all terms whose denominators contain a specific digit (say, 9). I’ll talk about the proof here. But I’d like to start more basic and to bring in language used in the Ferdinands article.

So, let’s set the stage: we will let \sum a_k denote the divergent sum in question. All terms will be positive, a_{k} \geq a_{k+1} for all k and lim_{k \rightarrow \infty} a_k = 0 . Now let c_k represent a sequence where c_k \in \{0,1\} for all k ; then \sum c_ka_k is called a selective sum of \sum a_k . I’ll call the c_k the selecting sequence and, from the start, rule out selecting sequences that are either eventually 1 (which means that the selected series diverges since the original series did) or eventually zero (just a finite sum).

Now we’ll state a really easy result:

There is some non-eventually-constant c_k such that \sum c_ka_k converges. Here is why: because lim_{k \rightarrow \infty} a_k = 0 , for each j \in \{1,2,3...\} one can find a minimal index n_j, n_j \notin \{n_1, n_2, ...n_{j-1} \} , so that \frac{1}{2^j} > a_{n_j} . Now select c_k = 1 if k \in \{n_1, n_2, n_3,... \} and c_k =0 otherwise. Then \sum \frac{1}{2^k} > \sum c_ka_k , and therefore the selected series converges by comparison with a convergent geometric series.

Of course, this result is pretty lame; this technique discards a lot of terms. A cheap way to discard “fewer” terms (“fewer” in terms of set inclusion): do the previous construction, but instead of using \frac{1}{2} use \frac{M}{M+1} where M is a positive integer of choice. Note that \sum^{\infty}_{k=1} (\frac{M}{M+1})^k = M .

Here is an example of how this works: consider the divergent series \sum \frac{1}{\sqrt{k}} and the convergent geometric series \sum (\frac{1000}{1001})^k . Of course \frac{1000}{1001} < 1 , so c_1 = 0 ; but then for k \in \{2,3,....4169 \} we have (\frac{1000}{1001})^k > \frac{1}{\sqrt{k}} . So c_k = 1 for k \in \{2,3,4,....4169 \} . But c_{4170} = 0 because (\frac{1000}{1001})^{4170} < \frac{1}{\sqrt{4170}} . The next non-zero selection coefficient is c_{4171} , as (\frac{1000}{1001})^{4170} > \frac{1}{\sqrt{4171}} .

Now playing with this example, we see that \frac{1}{\sqrt{k}} > (\frac{1000}{1001})^{4171} for k \in \{4172, 4173,....4179 \} but not for k = 4180 . So c_k = 0 for k \in \{4172,....4179 \} and c_{4180} = 1 . So the first few n_j are \{2, 3, ....4169, 4171, 4180 \} . Of course the gap between the n_j grows as k does.
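The crossover index above can be checked with a few lines (a Python sketch):

```python
# Find where (1000/1001)^k first drops below 1/sqrt(k); the initial run of
# selected indices in the example above ends just before this point.
from math import sqrt

r = 1000 / 1001
k = 2
while r**k > 1 / sqrt(k):
    k += 1
print("first failure at k =", k)
```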

Now let’s get back to the cartoon example. From this example, we’ll attempt to state a more general result.

Claim: given \sum^{\infty}_{k=1} c_k \frac{1}{k} where c_k = 0 if k contains a 9 as one of its digits, the series \sum^{\infty}_{k=1} c_k \frac{1}{k} converges. Hint on how to prove this (without reading the solution): count the number of integers between 10^k and 10^{k+1} that lack a 9 as a digit. Then do a comparison test with a convergent geometric series, noting that every surviving term \frac{1}{m} with 10^k \leq m < 10^{k+1} is less than or equal to \frac{1}{10^k} .

How to prove the claim: we can start by “counting” the number of integers between 0 and 10^k that contain no 9’s as a digit.

Between 0 and 9: clearly 0-8 inclusive, or 9 numbers.

Between 10 and 99: a moment’s thought shows that we have 8(9) = 72 numbers with no 9 as a digit (hint: consider 10-19, 20-29…80-89) so this means that we have 9 + 8(9) = 9(1+8) = 9^2 numbers between 0 and 99 with no 9 as a digit.

This leads to the conjecture: there are 9^k numbers between 0 and 10^k -1 with no 9 as a digit and (8)9^{k-1} between 10^{k-1} and 10^k-1 with no 9 as a digit.

This is verified by induction; it is true for k = 1 .

Assume true for k = n . Then to find the number of numbers without a 9 between 10^n and 10^{n+1} -1 we get 8 (9^n) which then means we have 9^n + 8(9^n) = 9^n (8+1) = 9^{n+1} numbers between 0 and 10^{n+1}-1 with no 9 as a digit. So our conjecture is proved by induction.

Now note that 1 + \frac{1}{2} + ....+ \frac{1}{8} < 8*1*1

\frac{1}{10} + ...+ \frac{1}{18} + \frac{1}{20} + ...+ \frac{1}{28} + \frac{1}{30} + ...+ \frac{1}{88} < 8*9*\frac{1}{10}

\frac{1}{100} + ...+ \frac{1}{188} + \frac{1}{200} + ....+ \frac{1}{888} < 8*(9^2)\frac{1}{100}

This establishes that \sum_{k=10^n}^{10^{n+1}-1} c_k \frac{1}{k} < 8*(9^n)\frac{1}{10^n}

So it follows that \sum^{\infty}_{k=1} c_k \frac{1}{k} < 8\sum^{\infty}_{k=0} (\frac{9}{10})^k = 8 \frac{1}{1-\frac{9}{10}} = 80 , and hence our selected sum is convergent.
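Anyone wanting to see how slowly this selected series creeps along can try a direct computation (a Python sketch; the true sum of this “9-free” harmonic series is known to be about 22.92, well under our crude bound of 80):

```python
# Partial sums of the harmonic series with every k containing the digit 9
# deleted.  Convergence is real but extremely slow.
def nine_free_partial(n):
    return sum(1.0 / k for k in range(1, n + 1) if '9' not in str(k))

for n in (10**3, 10**5, 10**6):
    print(f"n = {n:>8}: partial sum = {nine_free_partial(n):.4f}")
```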

Further questions: ok, what is going on is that we threw out enough terms of the harmonic series for the series to converge. Between terms \frac{1}{10^k} and \frac{1}{10^{k+1}-1} we allowed 8*(9^k) terms to survive.

This suggests that if we permit up to M (10-\epsilon)^k terms between 10^k and 10^{k+1}-1 to survive ( M, \epsilon fixed and positive), then we will have a convergent series. I’d be interested in seeing if there is a generalization of this.

But I am tired, I have a research article to review, and I need to start class preparation for the upcoming spring semester. So I’ll stop here. For now. 🙂

April 10, 2015

Cantor sets and countable products of discrete spaces (0, 1)^Z

Filed under: advanced mathematics, analysis, point set topology, sequences — collegemathteaching @ 11:09 am

This might seem like a strange topic but right now our topology class is studying compact metric spaces. One of the more famous of these is the “Cantor set” or “Cantor space”. I discussed the basics of these here.

Now if you know the relationship between a countable product of two point discrete spaces (in the product topology) and Cantor spaces/Cantor Sets, this post is probably too elementary for you.

Construction: start with a two point set D = \{0, 2 \} and give it the discrete topology. The reason for choosing 0 and 2 to represent the elements will become clear later. Of course, D is a compact metric space (with the discrete metric: d(x,y) = 1 if x \neq y ).

Now consider the infinite product of such spaces with the product topology: C = \Pi^{\infty}_{i=1} D_i where each D_i is homeomorphic to D . It follows from the Tychonoff Theorem that C is compact, though we can prove this directly: we may assume the open cover consists of basic open sets (see the remark on basis covers in the compactness post below). Choose an arbitrary U from this open cover; because we are using the product topology, U = O_1 \times O_2 \times ....O_k \times (\Pi_{i=k+1}^{\infty} D_i ) where each O_i is a one or two point set. This means that C - U is the union of at most 2^k -1 “cylinder” sets determined by the first k coordinates, which requires at most 2^k -1 additional elements of the open cover to cover.

Now let’s examine some properties.

Clearly the space is Hausdorff (T_2 ) and uncountable.

1. Every point of C is a limit point of C . To see this: denote x \in C by the sequence \{x_i \} where x_i \in \{0, 2 \} . Then any open set containing \{ x_i \} contains a basic open set O_1 \times O_2 \times...O_k \times \Pi^{\infty}_{i=k+1} D_i , which in turn contains ALL points \{y_i\} with y_i = x_i for i \in \{1, 2, ...k \} . So all points of C are accumulation points of C ; in fact they are condensation points (or perfect limit points).

(Refresher: accumulation points are those for which every open neighborhood contains an infinite number of points of the set in question; condensation points are those for which every open neighborhood contains an uncountable number of points; and perfect limit points are those for which every open neighborhood contains as many points as the set in question has, i.e., the same cardinality.)

2. C is totally disconnected (the components are one point sets). Here is how we will show this: given x, y \in C, x \neq y , there exist disjoint open sets U_x, U_y, x \in U_x, y \in U_y, U_x \cup U_y = C . Proof of claim: if x \neq y there exists a first coordinate k for which x_k \neq y_k (that is, a first k for which the canonical projection maps disagree: \pi_k(x) \neq \pi_k(y) ). Then
U_x = D_1 \times D_2 \times ....\times D_{k-1} \times x_k \times \Pi^{\infty}_{i=k+1} D_i,

U_y = D_1 \times D_2 \times.....\times D_{k-1} \times y_k \times \Pi^{\infty}_{i = k+1} D_i

are the required disjoint open sets whose union is all of C .

3. C is second countable (it has a countable basis), as basic open sets are determined by finite sequences of 0’s and 2’s followed by an infinite product of D_i .

4. C is metrizable as well: d(x,y) = \sum^{\infty}_{i=1} \frac{|x_i - y_i|}{3^i} . Note that this metric is well defined: the series converges, since each term is at most \frac{2}{3^i} , and it separates points. Suppose x \neq y and let k be the first index with x_k \neq y_k , so that |x_k - y_k| = 2 . Then

d(x,y) \geq \frac{|x_k - y_k|}{3^k} = \frac{2}{3^k} > 0 ,

and in fact the first disagreement dominates the tail, since \sum^{\infty}_{i = k+1} \frac{|x_i - y_i|}{3^i} \leq \sum^{\infty}_{i=k+1} \frac{2}{3^i} = \frac{1}{3^k} < \frac{2}{3^k} ; this is one place where the choice of 0 and 2 (rather than 0 and 1) pays off.

5. By construction C is uncountable, though this also follows from the fact that C is compact, Hausdorff and dense in itself.

6. C \times C is homeomorphic to C . The homeomorphism is given by f( \{x_i \}, \{y_i \}) = \{ x_1, y_1, x_2, y_2,... \} \in C . It follows that C is homeomorphic to a finite product with itself (product topology). Here we use the fact that if f: X \rightarrow Y is a continuous bijection with X compact and Y Hausdorff then f is a homeomorphism.

Now we can say a bit more: if C_i is a copy of C then \Pi^{\infty}_{i =1 } C_i is homeomorphic to C . This will follow from subsequent work, but we can prove this right now, provided we review some basic facts about countable products and counting.

First let’s show that there is a bijection between Z \times Z and Z . A bijection is suggested by this diagram:

[diagram: diagonal enumeration of Z \times Z ]

which has the following formula (coming up with it is fun; it uses the fact that \sum^k _{n=1} n = \frac{k(k+1)}{2} ):

\phi(k,1) = \frac{(k)(k+1)}{2} for k even
\phi(k,1) = \frac{(k-1)(k)}{2} + 1 for k odd
\phi(k-j, j+1) =\phi(k,1) + j for k odd, j \in \{1, 2, ...k-1 \}
\phi(k-j, j+1) = \phi(k,1) - j for k even, j \in \{1, 2, ...k-1 \}
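These formulas can be sanity-checked by machine; here is a sketch (Python; phi below is my translation of the formulas, rewritten to take the pair (i, j) directly):

```python
# Verify that the diagonal enumeration is a bijection on the first 50
# complete diagonals: the values should be exactly 1, 2, ..., with no gaps.
def phi(i, j):
    k = i + j - 1                      # (i, j) sits on diagonal k
    if k % 2 == 1:                     # odd diagonals are numbered upward
        return (k - 1) * k // 2 + 1 + (j - 1)
    return k * (k + 1) // 2 - (j - 1)  # even diagonals downward

values = sorted(phi(i, j) for i in range(1, 51) for j in range(1, 51)
                if i + j <= 51)
print(values == list(range(1, len(values) + 1)))   # True
```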

Here is a different bijection; it is a fun exercise to come up with the relevant formulas:

[diagram: another enumeration of Z \times Z ]

Now let’s give the map between \Pi^{\infty}_{i=1} C_i and C . Let \{ y_i \} \in C and denote the elements of \Pi^{\infty}_{i=1} C_i by \{ x^i_j \} where \{ x_1^1, x_2^1, x_3^ 1....\} \in C_1, \{x_1^2, x_2 ^2, x_3^2, ....\} \in C_2, ....\{x_1^k, x_2^k, .....\} \in C_k ... .

We now describe a map f: C \rightarrow \Pi^{\infty}_{i=1} C_i by

f(\{y_i \}) = \{ x^i_j \} = \{y_{\phi(i,j)} \}

Example: x^1_1 = y_1, x^1_2 = y_2, x^2_1 = y_3, x^3_1 =y_4, x^2_2 = y_5, x^1_3 =y_6,...

That this is a bijection between compact Hausdorff spaces is immediate. If we show that f^{-1} is continuous, we will have shown that f is a homeomorphism.

But that isn’t hard to do. Let U \subset C be open; U = U_1 \times U_2 \times U_3.... \times U_{m-1} \times \Pi^{\infty}_{k=m} D_k . Then there is some k_m for which \phi(k_m, 1) \geq m . Then if f^i_j denotes the i,j component of f , we see that for all i+j \geq k_m+1, f^i_j(U) = D , i.e., that coordinate is unconstrained (these are the entries on or below the diagonal containing (k_m,1) , depending on whether k_m is even or odd).

So f(U) is of the form V_1 \times V_2 \times ....V_{k_m} \times \Pi^{\infty}_{i = k_m +1} C_i where each V_j is open in C_j . This is an open set in the product topology of \Pi^{\infty}_{i=1} C_i so this shows that f^{-1} is continuous. Therefore f^{-1} is a homeomorphism, therefore so is f.

Ok, what does this have to do with Cantor Sets and Cantor Spaces?

If you know what the “middle thirds” Cantor Set is, I urge you to stop reading now and prove that that Cantor set is indeed homeomorphic to C as we have described it. I’ll give this quote from Willard, page 121 (hardback edition), section 17.9 in Chapter 6:

The proof is left as an exercise. You should do it if you think you can’t, since it will teach you a lot about product spaces.

What I will do: I’ll give a geometric description of a Cantor set and show that this description, which easily includes the “deleted interval” Cantor sets that are used in analysis courses, is homeomorphic to C .

Set up
I’ll call this set F and describe it as follows:

F \subset R^n (for those interested in the topology of manifolds this poses no restrictions since any manifold embeds in R^n for sufficiently high n ).

Reminder: the diameter of a set F \subset R^n will be lub \{ d(x,y) | x, y \in F \}
Let \epsilon_1, \epsilon_2, \epsilon_3 .....\epsilon_k ... be a strictly decreasing sequence of positive real numbers such that \epsilon_k \rightarrow 0 .

Let F^0 be some closed n-ball in R^n (that is, F^0 is a subset homeomorphic to a closed n-ball; we will use that convention throughout).

Let F^1_{(0) }, F^1_{(2)} be two disjoint closed n-balls in the interior of F^0 , each of which has diameter less than \epsilon_1 .

F^1 = F^1_{(0) } \cup F^1_{(2)}

Let F^2_{(0, 0)}, F^2_{(0,2)} be disjoint closed n-balls in the interior of F^1_{(0) } and F^2_{(2,0)}, F^2_{(2,2)} be disjoint closed n-balls in the interior of F^1_{(2) } , each of which (all 4 balls) has diameter less than \epsilon_2 . Let F^2 = F^2_{(0, 0)}\cup F^2_{(0,2)} \cup F^2_{(2, 0)} \cup F^2_{(2,2)} .

[figure: cantorset]

To describe the construction inductively we will use a bit of notation: a_i \in \{0, 2 \} for all i \in \{1, 2, ...\} and \{a_i \} will represent an infinite sequence of such a_i .
Now if F^k has been defined, we let F^{k+1}_{(a_1, a_2, ...a_{k}, 0)} and F^{k+1}_{(a_1, a_2,....,a_{k}, 2)} be disjoint closed n-balls of diameter less than \epsilon_{k+1} which lie in the interior of F^k_{(a_1, a_2,....a_k) } . Note that F^{k+1} consists of 2^{k+1} disjoint closed n-balls.

Now let F = \cap^{\infty}_{i=1} F^i . Since these are compact sets with the finite intersection property ( \cap^{m}_{i=1}F^i =F^m \neq \emptyset for all m ), F is non-empty and compact. Now for any choice of sequence \{a_i \} we have F_{ \{a_i \} } =\cap^{\infty}_{i=1} F^i_{(a_1, ...a_i)} nonempty, again by the finite intersection property. On the other hand, if x, y \in F, x \neq y , then d(x,y) = \delta > 0 , so choose \epsilon_m such that \epsilon_m < \delta . Then x, y lie in different components of F^m , since the diameters of these components are less than \epsilon_m .

Then we can say that the F_{ \{a_i \} } uniquely define the points of F . We can call such points x_{ \{a_i \} } .

Note: in the subspace topology, the F^k_{(a_1, a_2, ...a_k)} are open sets, as well as being closed.

Finding a homeomorphism from F to C .
Let f: F \rightarrow C be defined by f( x_{ \{a_i \} } ) = \{a_i \} . This is a bijection. To show continuity: consider the basic open set U =  \{y_1\} \times \{y_2\} \times ....\times \{y_m\} \times \Pi^{\infty}_{i=m+1} D_i . Under f this pulls back to the open set (in the subspace topology) F^{m+1}_{(y_1, y_2, ...y_m, 0) } \cup F^{m+1}_{(y_1, y_2, ...y_m, 2)} , hence f is continuous. Because F is compact and C is Hausdorff, f is a homeomorphism.

This ends part I.

We have shown that the Cantor sets defined geometrically and defined via “deleted intervals” are homeomorphic to C . What we have not shown is the following:

Let X be a compact Hausdorff space which is dense in itself (every point is a limit point) and totally disconnected (components are one point sets). Then X is homeomorphic to C . That will be part II.

March 15, 2015

Compact spaces and Tychonoff’s Theorem I

Note to the reader: when I first started this post, I thought it would be a few easy paragraphs. The size began to swell, so I am taking this post in several bite sized parts. This is part I.

Pretty much everyone knows what a compact space is. But not everyone is up on equivalent definitions and on how to prove that the arbitrary product of compact spaces is compact.
The genesis of this blog post is David Wright’s Proceedings of the American Mathematical Society paper (1994) on Tychonoff’s Theorem.

Since I am also writing for my undergraduate topology class, I’ll keep things elementary where possible and perhaps put in more detail than a professional mathematician would have patience for.

I should start by saying why this topic is dear to me: my research area is knot theory; in particular I studied embeddings of the circle into the 3-dimensional sphere S^3 , which can be thought of as the “compactification” of R^3 ; basically one starts with R^3 , adds a point \infty , and declares that the neighborhoods of this new point will be generated by sets of the following form: \{ \infty \} \cup \{ (x,y,z) | x^2 + y^2 + z^2 > M^2 \} . The reason we do this: we often study the complement of the embedded circle, and it is desirable to work inside a compact space.

I’ve also studied (in detail) certain classes of embeddings of the real line into non-compact manifolds; to make this study a bit more focused, I insist that such embeddings be “proper” in that the inverse image of a compact set be compact. Hence compactness comes up again, even when the objects of study are not compact.

So, what do we mean by “compact”?

Instead of just blurting out the definition and equivalent formulations, I’ll try to give some intuition. If we are talking about a subset of R^n , a compact subset is one that is both closed and bounded. Now that is NOT the definition of compactness, though it is true that:

Given a set X \subset R^n , X is compact if and only if X is both closed (as a topological subset) and bounded (in that it fits within a sufficiently large closed ball). In R^1 compact subsets can be thought of as selected finite unions and arbitrary intersections of closed intervals. In the higher dimensions, think of the finite union and arbitrary intersections of things like closed balls.

Now it is true that if f:X \rightarrow Y is continuous and X is a compact topological space, then f(X) is compact (either as a space, or in the subspace topology, if f is not onto).

This leads to a big theorem of calculus: the Extreme Value Theorem: if f:R^n \rightarrow R is continuous over a compact subset D \subset R^n then f attains both a maximum and a minimum value over D .

Now in calculus, we rarely use the word “compact” but instead say something about D be a closed, bounded subset. In the case where n = 1 we usually say that D =[a,b] , a closed interval.

So, in terms of intuition, if one is thinking about subsets of R^n , one can think of a compact space as a domain on which any continuous real valued function always attains both a minimum and a maximum value.

Now for the definition
We need some terminology: a collection of open sets \{U_{\beta}\}_{\beta \in I} is said to be an open cover of a space X if \cup_{\beta \in I } U_{\beta} = X , and if A \subset X , a collection of open sets is said to be an open cover of A if A \subset \cup_{\beta \in I } U_{\beta} . A finite subcover is a finite subcollection of the open sets such that \cup^k_{i=1} U_i = \cup_{\beta \in I} U_{\beta} .

Here is an example: \{ (\frac{3}{4}, 1] \} \cup \{ [0, \frac{n}{n+1}) \}^{\infty}_{n=1} is an open cover of [0,1] in the subspace topology. A finite subcover (from this collection) would be \{ [0, \frac{4}{5}) , (\frac{3}{4}, 1] \} .

Let X be a topological space. We say that X is a compact topological space if any open cover of X has a finite subcover. If C \subset X , we say that C is a compact subset of X if any open cover of C has a finite subcover.

Prior to going through examples, I think that it is wise to mention something. One logically equivalent definition is this: a space (or a subset) is compact if every cover by open basis elements has a finite subcover. Here is why: if X is compact, then ANY open cover has a finite subcover, and an open cover by basis elements is an open cover. On the other hand, assume the “every open cover by basis elements has a finite subcover” condition, and let \mathscr{U} be an open cover. Replace each U_{\beta} \in \mathscr{U} by the basis elements whose union is U_{\beta} ; the result is an open cover by basis elements, which has a finite subcover, say B_1, B_2, ....B_k . Then for each basis element, choose a single U_i \in \mathscr{U} for which B_i \subset U_i . That is the required finite subcover.

Now, when convenient, we can assume that the open cover in question (during a proof) consists of basic open sets. That will simplify things at times.

So, what are some compact spaces and sets, and what are some basic facts?

Let’s see some compact sets, some non compact sets and see some basic facts.

1. Let X be any topological space and A \subset X a finite subset. Then A is a compact subset. Proof: given any open cover of A choose one open set per element of A which contains said element.

2. Let R have the usual topology. Then the integers Z \subset R^1 do not form a compact subset: the open cover \cup^{\infty}_{n = -\infty} (n - \frac{1}{4}, n+ \frac{1}{4}) has no finite subcover. In fact, ANY unbounded subset A \subset R^n in the usual metric topology fails to be compact: cover A by the open balls B_a(1) of radius 1 centered at the points a \in A ; any finite subcollection of these is bounded and so cannot cover the unbounded A .

3. The finite union of compact subsets is compact (easy exercise).

4. If C \subset X is compact and X is a Hausdorff topological space (T_2 ), then C is closed. Here is why: let x \notin C , and for every c \in C choose disjoint open sets U_c, V_c where x \in U_c, c \in V_c . Now \cup_{c \in C}V_c is an open set which contains C and so has a finite subcover \cup_{i=1}^k V_i . Note that each corresponding U_i is an open set containing x , and now we have only a finite number of these. Hence x \in \cap^k_{i=1} U_i , which is disjoint from \cup_{i=1}^k V_i , which contains C . Because x was an arbitrary point of X -C , X-C is open, which means that C is closed. Note: this proof, with one minor addition, shows that a compact Hausdorff space is regular (T_3 ); we need only show that a closed subset of a compact Hausdorff space is compact. That is easy enough to do: let \mathscr{U} be an open cover for C ; then the collection \mathscr{U} \cup \{X-C\} is an open cover for X , which has a finite subcover. Let that be \cup^k_{i=1} U_i \cup (X-C) where each U_i \in \mathscr{U} . Now since X-C contains no point of C , \cup^k_{i=1} U_i covers C .

So we have proved that a closed subset of a compact set is compact.

5. Let R (or any infinite set) be given the finite complement topology (that is, the open sets are the empty set together with sets whose complements consist of a finite number of points). Then ANY subset is compact! Here is why: let C be any set and let \mathscr{U} be any open cover. Choose a nonempty U_1 \in \mathscr{U} . Since X -U_1 is a finite set of points, only a finite number of them can be in C , say c_1, c_2, ...c_k . Then for each of these, select one open set in the open cover that contains the point; these, together with U_1 , form the finite subcover.

Note: this shows that non-closed sets can be compact sets, if the underlying topology is not Hausdorff.

6. If f: X \rightarrow Y is continuous and onto and X is compact, then so is Y . Proof: let \cup_{\beta \in I} U_{\beta} cover Y and note that \cup_{\beta}f^{-1}(U_{\beta}) covers X , hence a finite number of these open sets cover: X = \cup^{k}_{i=1}f^{-1}(U_i). Therefore \cup^k_{i=1}U_i covers Y . Note: this shows that being compact is a topological invariant; that is, if two spaces are homeomorphic, then either both spaces are compact or neither one is.

7. Ok, let’s finally prove something. Let R^1 have the usual topology. Then [0, 1] (and therefore any closed interval) is compact. This is (part) of the famous Heine-Borel Theorem. The proof uses the least upper bound axiom of the real numbers.

Let \mathscr{U} be any open cover for [0,1] . If no finite subcover exists, let x be the least upper bound of the set F of points t \in [0,1] for which [0,t] CAN be covered by a finite subcollection of \mathscr{U} . Now x > 0 because at least one element of \mathscr{U} contains 0 and therefore contains [0, \delta ) for some \delta > 0 . Assume that x < 1 . Now suppose x \in F , that is, [0,x] can be covered by a finite subcover. Then because x \in U_{\beta} for some U_{\beta} \in \mathscr{U} , we have (x-\delta, x + \delta) \subset U_{\beta} for some \delta > 0 , which means that x + \frac{\delta}{2} \in F , which means that x isn’t an upper bound for F .

Now suppose x \notin F ; then because x < 1 there is still some U_{\beta} where (x-\delta, x+ \delta) \subset U_{\beta} . But since x = lub(F) , there is some t \in F with t > x - \delta , so [0, t] can be covered by some finite \cup^k_{i=1} U_i , and then \cup^k_{i=1} U_i \cup U_{\beta} is a finite subcover of [0, x + \frac{\delta}{2}] , which means that x was not an upper bound after all. It follows that x = 1 ; one more application of the same argument (some U_{\beta} contains (1-\delta, 1] , and [0, 1-\frac{\delta}{2}] has a finite subcover) shows that [0,1] itself has a finite subcover, which means that the unit interval is compact.

Now what about the closed ball in R^n ? The traditional way is to show that the closed ball is a closed subset of a closed hypercube in R^n and so if we show that the product of compact spaces is compact, we would be done. That is for later.

8. Now endow R^1 with the lower limit topology. That is, the open sets are generated by basis elements [a, b) . Note that the lower limit topology is strictly finer than the usual topology. Now in this topology, [0,1] is not compact. (Note: none of (0,1), [0,1), (0, 1] are compact in the coarser usual topology, so there is no need to consider these.) To see this, cover [0,1] by \cup ^{\infty}_{n=1} [0, \frac{n}{n+1}) \cup [1, \frac{3}{2}) ; it is easy to see that this open cover has no finite subcover. In fact, with a bit of work, one can show that every compact subset is at most countable and nowhere dense; in fact, if A is compact in the lower limit topology and a \in A , there exists some y_a where (y_a, a) \cap A = \emptyset .

January 16, 2015

Power sets, Function spaces and puzzling notation

I’ll probably be posting point-set topology stuff due to my being excited about teaching the course…finally.

Power sets and exponent notation
If A is a set, then the power set of A , often denoted by 2^A , is a set that consists of all subsets of A .

For example, if A = \{1, 2, 3 \} , then 2^A = \{ \emptyset , \{1 \}, \{ 2 \}, \{3 \}, \{1, 2 \}, \{1,3 \}, \{2, 3 \}, \{1, 2, 3 \} \} . Now it is no surprise that if the set A is finite and has n elements, then 2^A has 2^n elements.

However, there is another helpful way of listing 2^A . A subset of A can be defined by which elements of A that it has. So, if we order the elements of A as 1, 2, 3 then the power set of A can be identified as follows: \emptyset = (0, 0, 0), \{1 \} = (1, 0, 0), \{ 2 \} = (0,1,0), \{ 3 \} = (0, 0, 1), \{1,2 \} = (1, 1, 0), \{1,3 \} = (1, 0, 1), \{2,3 \} = (0, 1, 1), \{1, 2, 3 \} = (1, 1, 1)

So there is a natural correspondence between the elements of a power set and a sequence of binary digits. Of course, this makes the counting much easier.
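This correspondence is exactly how one enumerates subsets on a computer; here is a small sketch (Python):

```python
# Enumerate the power set of {1, 2, 3}: the 3-bit numbers 0..7 are the
# binary sequences, and each bit says whether the matching element is in.
A = [1, 2, 3]
for mask in range(2 ** len(A)):
    subset = {A[i] for i in range(len(A)) if mask & (1 << i)}
    print(f"{mask:03b} -> {subset if subset else 'empty set'}")
```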

The binary notation might seem like an unnecessary complication at first, but now consider the power set of the natural numbers: 2^N . Of course, listing the power sets would be, at least, cumbersome if not impossible! But there the binary notation really shows its value. Remember that the binary notation is a sequence of 0’s and 1’s where a 0 in the i’th slot means that element isn’t an element in a subset and a 1 means that it is.

Since a subset of the natural numbers is defined by its list of elements, every subset has an infinite binary sequence associated with it. We can order the sequence in the usual order 1, 2, 3, 4, ….
and the sequence 1, 0, 0, 0…… corresponds to the set with just 1 in it, the sequence 1, 0, 1, 0, 1, 0, 1, 0,… corresponds to the set consisting of all odd integers, etc.

Then, of course, one can use Cantor’s Diagonal Argument to show that 2^N is uncountable; in fact, if one uses the fact that every non-negative real number has a binary expansion (possibly infinite), one then shows that 2^N has the same cardinality as the real numbers.

Power notation
We can expand on this power notation. Remember that 2^A can be thought of as setting up a “slot” or an “index” for each element of A and then assigning a 1 or 0 to every element of A . One can then think of this in an alternate way: 2^A can be thought of as the set of ALL functions from the elements of A to the set \{ 0, 1 \} . This coincides with the “power set” concept, as set membership is determined by being either “in” or “not in”. So, the set in the exponent can be thought of either as the indexing set, with the base as the value each indexed slot can take on (sequences, in the case that the exponent set is either finite or countably infinite), OR this can be thought of as the set of all functions where the exponent set is the domain and the base set is the range.

Remember, we are talking about ALL possible functions and not all “continuous” functions, or all “morphisms”, etc.

So, N^N can be thought of as either the set of all possible sequences of positive integers, or, equivalently, the set of all functions from N  to N .

Then R^N is the set of all real number sequences (i. e. the types of sequences we study in calculus), or, equivalently, the set of all real valued functions of the positive integers.

Now it is awkward to try to assign an ordering to the reals, so when we consider R^R it is best to think of this as the set of all functions f: R \rightarrow R , or equivalently, the set of all strings which are indexed by the real numbers and have real values.

Note that sequences don’t really seem to capture R^R in the way that they capture, say, R^N . But there is another concept that does, and that concept is the concept of the net, which I will talk about in a subsequent post.

February 17, 2014

Aitken acceleration: another “easy” example for the students

In a previous post I showed some spreadsheet data to demonstrate the Aitken acceleration process. Here, I’ll go through an example where a sequence converges linearly: let p_n = L +a^n + b^n where 0 < a < b < 1 . We use the form q_n = \frac{p_{n+2}p_{n} -(p_{n+1})^2}{p_{n+2} - 2p_{n+1} + p_n} (I am too lazy to use the traditional “p-hat” notation). First, note that the denominator works out to a^n(1-a)^2 +b^n(1-b)^2

The numerator is a tiny bit more work: the L^2 terms cancel and as far as the rest:
L(a^{n+2} + b^{n+2}) + L(a^n + b^n) +(a^{n+2}+b^{n+2})(a^n + b^n)-2L(a^{n+1}+b^{n+1})-(a^{n+1}+b^{n+1})^2
which simplifies to a term involving L and one that doesn’t. Here is the term involving L :

L(a^{n+2}-2a^{n+1}  + a^n + b^{n+2} -2b^{n+1} +b^n) = L(a^n(1-a)^2 +b^n(1-b)^2)

which, of course, is just L times the denominator.

Now the terms not involving L: (a^{n+2}+b^{n+2})(a^n+b^n) - (a^{n+1} + b^{n+1})^2 = b^na^n(b^2+a^2-2ab) = b^na^n(b-a)^2

So our fraction is merely

\frac{L((a^n(1-a)^2 +b^n(1-b)^2)) + b^na^n(b-a)^2}{a^n(1-a)^2 +b^n(1-b)^2} = L + \frac{b^na^n(b-a)^2}{a^n(1-a)^2 +b^n(1-b)^2}

This can be rearranged to L + \frac{(b-a)^2}{\frac{(1-a)^2}{b^n} + \frac{(1-b)^2}{a^n}}

Clearly as n goes to infinity, the error goes to zero very quickly. It might be instructive to look at the ratio of the errors for p_n and q_n :

This ratio is
(a^n + b^n)\frac{a^n(1-a)^2 + b^n(1-b)^2}{a^nb^n(b-a)^2} =(a^n +b^n)(\frac{1}{b^n}(\frac{1-a}{b-a})^2 + \frac{1}{a^n}(\frac{1-b}{b-a})^2)

Note that in the right hand factor: both squared factors are fixed and the coefficients go to infinity as n goes to infinity. If one multiplies out, one obtains:

((\frac{a}{b})^n +1)(\frac{1-a}{b-a})^2 +  ((\frac{b}{a})^n +1)(\frac{1-b}{b-a})^2 . In the limit, the first term decreases to (\frac{1-a}{b-a})^2 and the second goes to infinity.

Hence the errors in the accelerated sequence are smaller.
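One can confirm the algebra numerically (a Python sketch; the values of L, a, b are arbitrary choices of mine):

```python
# Check that the Aitken value q_n really equals the closed form derived
# above: L + (b-a)^2 / ((1-a)^2 / b^n + (1-b)^2 / a^n).
L, a, b = 5.0, 0.3, 0.7

def p(n):
    return L + a**n + b**n

for n in range(1, 8):
    q = (p(n+2) * p(n) - p(n+1)**2) / (p(n+2) - 2*p(n+1) + p(n))
    predicted = L + (b - a)**2 / ((1 - a)**2 / b**n + (1 - b)**2 / a**n)
    print(f"n = {n}: q_n = {q:.12f}, closed form = {predicted:.12f}")
```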

February 8, 2014

Demonstrating Aitken’s sequence acceleration

Right now, we are studying the various root finding algorithms in Numerical Analysis (bisection, Newton’s, etc.)

Such algorithms yield a sequence of numbers, which (hopefully) converge to a solution: p_0, p_1, p_2, p_3, .....

Of course, each point in the sequence is obtained by calculations and, if there were a way to combine these points so as to obtain a sequence that converges faster (while not adding much to the computational complexity), there is some benefit. And yes, it is possible to use the sequence points, do the manipulation, and use the manipulated points in the root finding algorithm itself (e. g. Steffensen’s method).

In this post, I’ll talk about Aitken’s method and how one can cook up examples that not only show that the method can work but give the students some intuition as to why it might work.

I’ll provide just a bit of background in the event that the general reader comes across this.

Let p_i \rightarrow p . If we have \frac{|p_{n+1} - p|}{(|p_n -p|)^{\alpha}} \rightarrow \lambda ( \lambda is positive, of course), we say that the convergence is LINEAR if \alpha = 1 and \lambda < 1 (the inequality must be strict). If \alpha = 2 , then we say that convergence is QUADRATIC (regardless of \lambda ).

To see the reason for the terminology, just multiply both sides of the “defining equation” by the denominator. In the linear case: |p_{n+1} - p| \approx \lambda |p_{n} - p| , so one can think: “get the new error by multiplying the old error by some constant less than one”. For example: p_n = \frac{1}{2^n} exhibits linear convergence to 0; that is a decent way to think about it.

Now (in the linear convergence case anyway), suppose you think of your approximation having an error that shrinks with iteration but shrinks in the following way: the n’th iteration looks like p_n = p + a^n + b^n where a, b are constants strictly between 0 and 1. Of course, as n goes to infinity, p_n approaches the limit p as the error terms die away.

Aitken’s method is this: let’s denote a new sequence by q_n . (I am not using the traditional P-hat out of laziness) Then q_n = p_n -\frac{(p_{n+1} - p_n)^2}{p_{n+2} -2p_{n+1} + p_{n}} . To see how this formula is obtained, check out these excellent course notes. Or the Wiki article is pretty good.
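In code the formula is one line; here is a sketch (Python; the cooked example sequence mirrors the ones discussed below):

```python
# Aitken acceleration: map p_0, p_1, ... to the accelerated values q_n.
def aitken(p):
    return [p[n] - (p[n+1] - p[n])**2 / (p[n+2] - 2*p[n+1] + p[n])
            for n in range(len(p) - 2)]

a, b = 0.5, 0.8
p = [a**n + b**n for n in range(20)]    # "cooked" sequence with limit 0
q = aitken(p)
for n in (5, 10, 15):
    print(f"n = {n:2d}: p_n = {p[n]:.8f}, q_n = {q[n]:.8f}")
```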

Look at the numerator of what is being subtracted off: if we have the terms written (very roughly) as p_{n+1} = p + a^{n+1} + b^{n+1}, p_n = p + a^n + b^n and if, say, a is much larger than b , then a^{n+1} will be closer to a^n than b^{n+1} is to b^n , hence more of this error will be subtracted away.

Yes, I know that this is simple-minded, but it “gets the spirit” of the process. I’ve set up some spreadsheets with “cooked” sequences linearly converging to zero, p_n = a^n + b^n , and showed how the Aitken process works there. Note: the spreadsheet round-off errors start occurring at the 10^{-16} range; you can see that here.

[spreadsheet screenshot: Aitken acceleration]

To see an abstract example where p_n = L + a^n + b^n where a, b \in (0,1) , go to the next post in this series.

