College Math Teaching

April 6, 2013

Calculus and Analysis: the power of examples

In my non-math life I am an avid runner and walker. OK, my enthusiasm for these sports greatly exceeds my talent and accomplishments in them; I once (ONCE) broke 40 minutes for the 10K run and that was in 1982; the winner (a fellow named Bill Rodgers) finished 11 minutes ahead of me that day! 🙂 Now I’ve gotten even slower; my fastest 10K these days is around 53 minutes and I haven’t broken 50 since 2005. 😦

But alas I got a minor bug and had to skip today’s planned races; hence I am using this morning to blog about some math.

Real Analysis and Calculus
I’ve said this before and I’ll say it again: one of my biggest struggles with real analysis and calculus was that I often didn’t see the point of the nuances in the proofs of the big theorems. My immature intuition was one in which differentiable functions were, well, analytic (though I didn’t know that was my underlying assumption at the time). Their graphs were nice smooth lines, though I knew about corners (say, f(x) = |x| at x = 0 ).

So, it appears to me that one of the ways we can introduce the big theorems (along with the nuances) is to have a list of counterexamples at the ready and be ready to present these PRIOR to the proof; that way we can say “ok, HERE is why we need to include this hypothesis” or “here is why this simple minded construction won’t work.”

So, what are my favorite examples? Well, the function f(x) =\left\{ \begin{array}{c}e^{\frac{-1}{x^2}}, x \ne 0 \\ 0, x = 0 \end{array}\right. is a winner. This gives an example of a C^{\infty} function that is not analytic (on any open interval containing 0 ).

The family of examples I’d like to focus on today is f(x) =\left\{ \begin{array}{c}x^ksin(\frac{\pi}{x}), x \ne 0 \\ 0, x = 0 \end{array}\right. , k fixed, k \in \{1, 2, 3,...\}.

Note: henceforth, when I write f(x) = x^ksin(\frac{\pi}{x}) I’ll let it be understood that I mean the conditional function that I wrote above.

Use of this example:
1. Squeeze theorem in calculus: of course, |x| \ge |xsin(\frac{\pi}{x})| \ge 0 ; this is one time we can calculate a limit without using a function which one can merely “plug in”. It is easy to see that lim_{x \rightarrow 0 } |xsin(\frac{\pi}{x})| = 0 .

2. Use of the limit definition of derivative: one can see that lim_{h \rightarrow 0 }\frac{h^2sin(\frac{\pi}{h}) - 0}{h} =0 ; this is one case where we can’t merely “calculate”.

3. x^2sin(\frac{\pi}{x}) provides an example of a function that is differentiable at the origin but is not continuously differentiable there. It isn’t hard to see why; away from 0 the derivative is 2x sin(\frac{\pi}{x}) - \pi cos(\frac{\pi}{x}) and the limit as x approaches zero exists for the first term but not the second. Of course, by upping the power k one can find a function that is m times differentiable at the origin but not m times continuously differentiable there: take k = 2m , since each differentiation costs two powers of x in the worst-behaved term.
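The two limits in items 2 and 3 can be checked numerically. What follows is only a sketch: the names f, fprime, quotients and samples are made up for illustration, and floating point output suggests the behavior rather than proving it.

```python
import math

def f(x):
    """x^2 sin(pi/x), extended by f(0) = 0."""
    return 0.0 if x == 0 else x * x * math.sin(math.pi / x)

def fprime(x):
    """Derivative away from the origin: 2x sin(pi/x) - pi cos(pi/x)."""
    return 2 * x * math.sin(math.pi / x) - math.pi * math.cos(math.pi / x)

# Difference quotients at 0 equal h sin(pi/h), so they are bounded by |h|.
steps = [10.0 ** -j for j in range(1, 8)]
quotients = [(f(h) - f(0)) / h for h in steps]

# Along x = 1/n the derivative is close to -pi for even n and +pi for odd n,
# so it keeps jumping and has no limit at the origin.
samples = [fprime(1.0 / n) for n in range(1000, 1006)]

print(quotients)
print(samples)
```

The quotients shrink toward 0 while the sampled derivative values keep alternating in sign, which is the "differentiable but not continuously differentiable" behavior described above.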

4. The proof of the chain rule. Suppose f is differentiable at g(a) and g is differentiable at a. Then we know that f(g(x)) is differentiable at x=a and the derivative is f'(g(a))g'(a) . The “natural” proof (say, for g non-constant near x = a ) looks at the difference quotient: lim_{x \rightarrow a} \frac{f(g(x))-f(g(a))}{x-a} =lim_{x \rightarrow a} \frac{f(g(x))-f(g(a))}{g(x)-g(a)} \frac{g(x)-g(a)}{x-a} which works fine, so long as g(x) \ne g(a) . So what could possibly go wrong; surely the set of values of x for which g(x) = g(a) for a differentiable function is finite, right? 🙂 That is where x^2sin(\frac{\pi}{x}) comes into play; this equals zero at an infinite number of points in any neighborhood of the origin.

Hence the proof of the chain rule needs a workaround of some sort. This is a decent article on this topic; it discusses the usual workaround: define G(x) =\left\{ \begin{array}{c}\frac{f(g(x))-f(g(a))}{g(x)-g(a)}, g(x)-g(a) \ne 0 \\ f'(g(x)), g(x)-g(a) = 0 \end{array}\right. . Then it is easy to see that lim_{x \rightarrow a} \frac{f(g(x))-f(g(a))}{x-a} = lim_{x \rightarrow a}G(x)\frac{g(x)-g(a)}{x-a} since both quotients are zero whenever g(x) = g(a) and the limit of G(x) exists at x = a (it equals f'(g(a)) ).

Of course, one doesn’t have to worry about any of this if one introduces the “grown up” definition of derivative from the get-go (as in: best linear approximation) and if one has a very gifted class, why not?

5. The concept of “bounded variation” and the Riemann-Stieltjes integral: given functions f, g over some closed interval [a,b] and partitions P look at upper and lower sums of \sum_{x_i \in P} f(x_i)(g(x_{i}) - g(x_{i-1})) = \sum_{x_i \in P}f(x_i)\Delta g_i and if the upper and lower sums converge as the widths of the partitions go to zero, you have the integral \int^b_a f dg . But this works only if g has what is known as “bounded variation”: that is, there exists some number M > 0 such that M > \sum_{x_i \in P} |g(x_i)-g(x_{i-1})| for ALL partitions P. Now if g(x) is differentiable with a bounded derivative on [a,b] (e. g. g is continuously differentiable on [a,b] ) then it isn’t hard to see that g has bounded variation. Just let W be a bound for |g'(x)| and then use the Mean Value Theorem to replace each |g(x_i) - g(x_{i-1})| by |g'(x_i^*)||x_i - x_{i-1}| \leq W|x_i - x_{i-1}| and the result follows easily, since the total is at most W(b-a) .

So, what sort of function is continuous but NOT of bounded variation? Yep, you guessed it! Now to make the bookkeeping easier we’ll use its sibling function: xcos(\frac{\pi}{x}). 🙂 Now consider a partition of the following variety: P = \{0, \frac{1}{n}, \frac{1}{n-1}, ....\frac{1}{3}, \frac{1}{2}, 1\} . Example: say \{0, \frac{1}{5}, \frac{1}{4}, \frac{1}{3}, \frac{1}{2}, 1\} . Compute the variation: |0-(- \frac{1}{5})|+ |(- \frac{1}{5}) - \frac{1}{4}| + |\frac{1}{4} - (-\frac{1}{3})|+ |-\frac{1}{3} - \frac{1}{2}| + |\frac{1}{2} -(-1)| = 2(\frac{1}{5} + \frac{1}{4} + \frac{1}{3} + \frac{1}{2}) + 1 . This leads to trouble: these sums are unbounded as n grows, since the variation over the n-th partition is one less than twice the n-th partial sum of a divergent series (the Harmonic Series).
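Here is a small sketch of that bookkeeping in exact arithmetic. It uses the fact that cos(n\pi) = (-1)^n, so the value of xcos(\frac{\pi}{x}) at x = 1/n is (-1)^n/n; the function names are hypothetical and chosen only for this illustration.

```python
from fractions import Fraction

def g_at_reciprocal(n):
    """g(1/n) for g(x) = x cos(pi/x); cos(n*pi) = (-1)^n gives (-1)^n / n."""
    return Fraction((-1) ** n, n)

def variation(n):
    """Sum of |g(x_i) - g(x_{i-1})| over the partition {0, 1/n, ..., 1/2, 1}."""
    values = [Fraction(0)] + [g_at_reciprocal(m) for m in range(n, 0, -1)]
    return sum(abs(b - a) for a, b in zip(values, values[1:]))

def harmonic(n):
    """n-th partial sum of the harmonic series."""
    return sum(Fraction(1, m) for m in range(1, n + 1))

for n in (5, 50, 500):
    print(n, float(variation(n)), float(2 * harmonic(n) - 1))
```

For n = 5 the exact value is 107/30, matching the hand computation, and the two printed columns agree for every n, so the variation grows without bound along with the harmonic partial sums.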

6. The concept of Absolute Continuity: this is important when one develops the Fundamental Theorem of Calculus for the Lebesgue integral. You know what it means for f to be continuous on an interval. You know what it means for f to be uniformly continuous on an interval (basically, for the whole interval, the same \delta works for a given \epsilon no matter where you are, and if the interval is a closed one, an easy “compactness” argument shows that continuity and uniform continuity are equivalent). Absolute continuity is like uniform continuity on steroids. I’ll state it for a closed interval: f is absolutely continuous on an interval [a,b] if, given any \epsilon > 0 there is a \delta > 0 such that \sum |f(x_i) - f(y_{i})| < \epsilon whenever \sum |x_{i}-y_{i}| < \delta , where the (x_i, y_{i}) form a finite collection of pairwise disjoint intervals. An example of a function that is continuous on a closed interval but not absolutely continuous? Yes; f(x) = xcos(\frac{\pi}{x}) on any interval containing 0 is an example; the work that we did in paragraph 5 works nicely; just make the intervals pairwise disjoint.

February 3, 2011

Why A Bounded Condition is Necessary for Integral Convergence Theorems

Filed under: advanced mathematics, analysis, calculus, integrable function, integrals, Lebesgue Integration — collegemathteaching @ 1:07 am

One of the reasons that the Lebesgue theory is an improvement on the Riemann theory is that we have better convergence properties for integrals. For example, we could define the following sequence of functions: f _{k}(x)=\left\{ \begin{array}{c}1,x\notin \{1/2, 1/4, 3/4, ..., 1/2^k, 3/2^k,...,(2^{k} -1)/2^k\} \\ 0,x\in \{1/2, 1/4, 3/4, ..., 1/2^k, 3/2^k,...,(2^{k} -1)/2^k\} \end{array}\right. Then \int_{0}^{1} f_{k}(x)dx =1 for all k and f_{k}(x) \to f(x) pointwise, but f does not have a Riemann integral. It does have a Lebesgue integral, which is 1.

In fact, there are many convergence theorems, one of which is the Bounded Convergence Theorem:
If f_{k} is a sequence of measurable functions on a finite measure set S and the functions f_{k} are uniformly bounded on S and f_{k} \to f pointwise on S then lim \int_{S} f_{k} = \int_{S} lim f_{k} = \int_{S} f .

We’ll take on this convergence theorem a little later (in another post). But why do we need a condition of the “uniformly bounded” type?
We’ll present an elementary but fun counter example to the theorem if the “uniformly bounded” condition is dropped.

We’ll start our construction by considering the sequence \{1/2, 1/3, ....1/k,...\} . We then note that 1/n and 1/(n+1) are 1/(n(n+1)) units apart; hence we can form disjoint segments ( 1/n - 1/(3n(n+1)), 1/n + 1/(3n(n+1))) which can be used as the bases for disjoint isosceles triangles whose upper vertices are at the coordinates (1/n, 3n(n+1)) . Each of these triangles encloses an area of 1.
Now we can define a sequence of functions f_{k} by letting f_{k} = 0 off the base of the triangle centered at 1/k and letting the graph of f_{k} follow the upper two edges of the triangle. Then for all k we have \int_{0}^{1} f_{k}(x)dx = 1 . Hence lim \int_{0}^{1} f_{k} (x)dx = 1 . Also note that f_{k} \to f pointwise, where f(x) = 0 for all x \in [0,1] . This isn’t hard to see; first note that f_{k}(0) = 0 for all k , and for t > 0 we have f_{m} (t) = 0 for all m where (3m + 4)/(3m^2 +3m) < t . Of course, \int_{0}^{1} f(x)dx = 0 . Each f_{k} is bounded but the sequence is NOT uniformly bounded as the peaks of the triangles get higher and higher.
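A quick sketch of this construction in exact arithmetic may help. The helper names half_base, tent and area are hypothetical; the code just evaluates the tent functions described above and the triangle areas, and watches what happens at the fixed point t = 1/10.

```python
from fractions import Fraction

def half_base(k):
    """Half-width 1/(3k(k+1)) of the triangle centered at 1/k."""
    return Fraction(1, 3 * k * (k + 1))

def tent(k, x):
    """f_k(x): zero off the base, rising linearly to 3k(k+1) at x = 1/k."""
    center, w, height = Fraction(1, k), half_base(k), 3 * k * (k + 1)
    dist = abs(Fraction(x) - center)
    return height * (1 - dist / w) if dist < w else Fraction(0)

def area(k):
    """Area of the k-th triangle: (1/2) * base * height."""
    return Fraction(1, 2) * (2 * half_base(k)) * (3 * k * (k + 1))

t = Fraction(1, 10)
print([area(k) for k in range(2, 8)])
print([tent(k, t) for k in range(2, 20)])
```

Every area is 1, while at t = 1/10 only the tent centered at 1/10 is nonzero (it peaks at 330 there) and all later tents vanish, which is the pointwise convergence to 0.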

Note: I made a mistake when I first posted this; of course we don’t NEED “uniform boundedness” but we need either a boundedness condition or some condition that ensures that the measure of the unbounded parts is zero.

Example: enumerate the rationals between 0 and 1 as q_{i} and let each rational be written as q_{i} = p_{i}/d_{i} in lowest terms.
Then define f_{k}(x) to be d_{i} for x = q_{i} with i \leq k , and 1 otherwise. Then the f_{k} are NOT uniformly bounded as a sequence but \int_{0}^{1}f_{k} = 1 for all k and \int_{0}^{1}f =1 (where f_{k}\to f ).
But this is ok as f differs from the constant function y = 1 on a set of measure zero.

January 31, 2011

Cantor Sets in [0,1] and Lebesgue Measure

I still remember one of the more moronic things I have ever written on an exam; I said “Set E has measure zero and is therefore countable….”. My professor wrote “whatever happened to the Cantor Set”, which he had told us about and had covered extensively.

It was one of those things that I had heard but really didn’t internalize into a working part of my mathematical mind; the latter is difficult to do.

So what is a Cantor Set? Actually, it depends on who you ask. 🙂 A topologist is likely to give a different answer than an analyst; I’ll discuss what is going on.

First, I’ll construct the traditional Middle Thirds Cantor Set; this is an example of a subset of [0,1] which
1. Has Lebesgue outer measure zero (and is therefore measurable)
2. Is uncountable.
It also has some other interesting properties and can be generalized; as George Simmons in Introduction to Topology and Modern Analysis says on page 68:

[…] the Cantor set is a very intricate mathematical object and is just the sort of thing that mathematicians delight in.

Descriptions: I’ll describe the Cantor set by describing what is NOT in it: the intervals (1/3, 2/3), (1/9, 2/9), (7/9, 8/9), (1/27, 2/27), (7/27, 8/27), (19/27, 20/27), (25/27, 26/27), ..... Or, I can describe it as an intersection of closed sets: C_{1} = [0,1/3] \cup [2/3,1], C_{2} = [0, 1/9] \cup [2/9, 1/3] \cup [2/3,7/9] \cup [8/9,1],... and then the Cantor set C =\cap_{i=1}^{\infty} C_{i} . Another way of describing the Cantor middle thirds set is as the set of numbers in [0,1] that have a base three expansion which does not contain the digit “1” anywhere (numbers whose every base three expansion needs a “1” lie in the removed intervals, always the middle third).

Some facts are immediate: The Cantor set is closed as its complement is the union of open sets. It is bounded and hence compact. The middle thirds Cantor set has measure zero; here is why: if we add the measures of the complement: 1/3 + 2(1/9) + 4(1/27) + ..=(1/3)\sum_{k=0}^{\infty}(2/3)^k = (1/3)(1/(1-(2/3))) = 1 , so the Cantor set itself has measure 1 - 1 = 0 .
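The stages of the construction are easy to generate exactly. The sketch below (with hypothetical helper names next_stage and stage) builds the closed intervals of each stage, starting from [0,1], and prints how many there are and their total length.

```python
from fractions import Fraction

def next_stage(intervals):
    """Remove the open middle third of every closed interval [a, b]."""
    out = []
    for a, b in intervals:
        third = (b - a) / 3
        out.append((a, a + third))
        out.append((b - third, b))
    return out

def stage(k):
    """Closed intervals that remain after k rounds of removal."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(k):
        intervals = next_stage(intervals)
    return intervals

for k in range(1, 6):
    c_k = stage(k)
    print(k, len(c_k), sum(b - a for a, b in c_k))
```

The remaining length after k rounds is (2/3)^k, which tends to zero; that is the same fact as the removed lengths adding up to 1.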

But the Cantor set is uncountable! This can be seen in many ways; there is a topological proof; we’ll present a brute force one.
Let C denote the Cantor middle thirds set and let x \in C . Note that each C_{k} consists of 2^{k} disjoint intervals of length (1/3)^k . Now we will map x to a sequence of 1’s and 0’s: \{f_{1}(x), f_{2}(x), f_{3} (x), ....\} where f_{1}(x) = 0 if x \leq 1/3 and 1 if x \geq 2/3 . Then f_{2}(x) = 0 if x is in the first third of the interval of C_{1} that contains it and 1 if it is in the final third. We do this at every stage and note that the map from C to the set of all such sequences is onto. The map is also one to one: if x \neq y then they are some distance \epsilon > 0 apart; choose a stage at which the non-deleted intervals are shorter than this \epsilon ; no such interval can contain both x and y , so their sequences differ. Since the set of all sequences of 0’s and 1’s is uncountable, so is C .

We can say much more about the Cantor set(s); I’ll conclude with a few interesting tidbits:

1. It isn’t that tough to exhibit some elements of C that are NOT endpoints of deleted intervals; merely choose a base three fraction sum that does not contain 1’s in its numerator. Here is one example: choose
2/3 + 2/9 + 2/81 + 2/729 + ..... = (2/3)+ (2/9) (1 + 1/9 + 1/81 +...) =(2/3)+ (2/9)(1/(1-(1/9))) = 2/3 +1/4 = 11/12
2. One can alter the construction of the Cantor middle third set to obtain a Cantor set of any measure less than 1: do the same middle thirds construction but ensure that the length of each removed interval is \delta^k at stage k . Then one can compute the length of the removed intervals:
\delta + 2\delta^2 + 4\delta^3.....=\delta/(1-2\delta) . Now if r is the desired measure of the complement (less than 1, of course), one solves r =   \delta/(1-2\delta) for \delta to obtain \delta = r/(1+2r) . One notes that r = 1 yields \delta = 1/3 .
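Both tidbits can be checked with a few lines of exact arithmetic. This is only a sketch; ternary_digits and delta_for are hypothetical helper names, with delta_for implementing the formula \delta = r/(1+2r) from item 2.

```python
from fractions import Fraction

def ternary_digits(x, count):
    """First `count` base three digits of a fraction x in [0, 1)."""
    digits = []
    for _ in range(count):
        x *= 3
        d = int(x)       # the integer part is the next digit
        digits.append(d)
        x -= d
    return digits

def delta_for(r):
    """Solve r = delta / (1 - 2*delta) for delta."""
    return r / (1 + 2 * r)

print(ternary_digits(Fraction(11, 12), 10))
print(delta_for(Fraction(1)), delta_for(Fraction(1, 2)))
```

The digits of 11/12 come out as 2, 2, 0, 2, 0, 2, ... with no 1 anywhere, and the pattern never settles into all 0's or all 2's, which is why 11/12 is in the Cantor set without being an endpoint of a deleted interval. The second line recovers \delta = 1/3 for r = 1.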

The Cantor set (of any measure) has some interesting properties. For one, every point is a limit point (remember, it has the subspace topology). This isn’t hard to see; let I be a neighborhood of any point x of the Cantor set of width \epsilon and let k be such that (1/3)^k < \epsilon/2 . Then the non-deleted interval at stage k containing x must lie in I, and the endpoints of that interval belong to the Cantor set, which means that I contains other points of the Cantor set. This property is called “being dense in itself”. This property, plus being compact, is enough to prove that the set is uncountable.
The Cantor set is totally disconnected; that is, the only connected components are one point sets. This is why: given any two different points in a Cantor set, there is a deleted interval between them.

Note: it might not make sense to talk of THE Cantor set as one can construct Cantor sets of different measures. But it does make sense to talk about the Cantor set in terms of topology as any two non-empty compact, dense in itself, totally disconnected metric spaces are homeomorphic (in fact, homeomorphic to the countably infinite product of \{0, 1\} where every factor has the discrete topology and the whole space has the product topology). Note: sometimes this topological space is referred to as the Cantor SPACE with Cantor SET being reserved for the middle thirds type of construction.

Astonishingly, every non-empty compact metric space is the continuous image of the Cantor space (and therefore of the Cantor set); that is a topic for a later time.

What does this have to do with integration?
Remember that the standard Cantor middle thirds set has measure zero; hence two functions defined on [0,1] that are equal except on the Cantor set have equal integrals; we will also see that the Cantor set can cause some mischief when we start talking about the Fundamental Theorem of Calculus. 🙂
If you can’t wait, see here.

January 25, 2011

Some Sets Which Are Lebesgue Measurable

We’ve yet to show some Lebesgue measurable sets; we’ll do that now.
Note: by subadditivity of Lebesgue outer measure, we know that for all sets T , we have:
m^*(T) \leq m^*(T \cap E) + m^*(T \cap C(E)) for all sets E ; again here we use C( ) to denote the complement of a set.
So to show that a set is measurable, we need only show m^*(T) \geq m^*(T \cap E) + m^*(T \cap C(E)) for the set in question.

\mathbb{R} and \emptyset are measurable.

Proof: m^*(T \cap \mathbb{R}) + m^*(T\cap \emptyset) = m^*(T) + 0 = m^*(T)

So now we know that the set of measurable sets isn’t empty. 🙂
Actually we can do better than that.

Proposition If a set has Lebesgue outer measure zero, then it is measurable.
Proof: Let E have measure zero; that is, m^*(E) = 0 . Let T be arbitrary and consider
m^*(T \cap E) + m^*(T \cap C(E)) . But T \cap E \subset E and therefore m^*(E) \geq m^*(T \cap E) which implies that m^*(T \cap E) = 0 . Also T \cap C(E) \subset T which implies that m^*(T) \geq m^*(T\cap C(E)) = m^*(T \cap E) + m^*(T \cap C(E)) which is what we had to show.

Now we have from the definition: m^*(\{x\}) = 0 (one point sets have measure zero). Thus by countable subadditivity of outer measure, any countable set (e. g., the rationals) has measure zero.

What about sets (other than \mathbb{R} ) that have positive measure?
Here is how we are going to proceed: we will show that sets of the form (a, \infty) are measurable. Then that will mean that sets like [a,\infty) are measurable which will then imply that open intervals (a,b) are measurable. But then, because of the fact that the measurable sets form a sigma algebra (closed with respect to countable unions, intersections, complements), we will get, free of charge, all of the topologically open sets to be measurable (remember that the reals are a second countable topological space), all closed sets, all countable intersections of open sets (called G-delta sets) and all countable unions of closed sets (the F-sigma sets). Note: the smallest sigma-algebra that contains the open intervals is called the Borel Sets.

It is true that not all measurable sets are Borel sets, but that is a topic for another day.

Showing the intervals (a, \infty) are measurable.
Let T be arbitrary and let T_{1} = T \cap (a, \infty) and let T_{2} = T \cap (-\infty, a] . With no loss of generality we can assume that m^*(T) is finite, otherwise there is nothing to show.
So, given any \epsilon > 0 we can find a countable collection of intervals I_{i} that cover T such that m^*(T) + \epsilon \geq \sum_{i=1}^{\infty} l(I_{i}) =\sum_{i=1}^{\infty} m^*(I_{i}) (note that l(I_{i}) denotes the length of the interval, which is its outer measure).
Let I_{1,i} = I_{i} \cap (a,\infty) and I_{2,i} = I_{i} \cap (-\infty, a] ; each of these is an interval (or empty) and l(I_{1,i}) + l(I_{2,i}) = l(I_{i}) .
Now m^*(T_1) \leq \sum_{i=1}^{\infty} l(I_{1,i}) = \sum_{i=1}^{\infty} m^*(I_{1,i}) because T_{1} \subset \cup_{i=1}^{\infty} I_{1,i} and
m^*(T_2) \leq \sum_{i=1}^{\infty} l(I_{2,i}) = \sum_{i=1}^{\infty} m^*(I_{2,i}) because T_{2} \subset \cup_{i=1}^{\infty} I_{2,i}

So m^*(T_{1}) + m^*(T_{2}) \leq \sum_{i=1}^{\infty} (m^*(I_{1,i}) + m^*(I_{2,i})) = \sum_{i=1}^{\infty} (m^*(I_{i})) \leq m^*(T) + \epsilon
Since this is true for all \epsilon it follows that m^*(T_{1}) + m^*(T_{2}) \leq m^*(T) which is what we had to show.

So now that we have a feel for what sorts of sets are measurable (at least the Borel sets, and a bit more than that), we are ready to get back to Lebesgue integration. We’ve defined the Lebesgue integral for bounded functions over a closed interval; we can now move to unbounded functions and to some promised convergence theorems.

Note from my past
I’d imagine that most of us have written moronic things on examinations from time to time. I still remember writing the following on an analysis exam: “E has measure zero and is therefore countable”….to which my professor replied: “Nonsense…whatever happened to the Cantor set?”.

I’ll have to deal with the Cantor set in its own post; we’ll show that we can construct a Cantor set with zero measure and one of any given finite measure. This isn’t just a “cool thing”; it is also an essential part of some interesting counterexamples.

January 24, 2011

Lebesgue Measure: Outer Measure, Measure, Why the Caratheodory Criterion works

Filed under: advanced mathematics, analysis, Lebesgue Integration, Measure Theory — collegemathteaching @ 9:10 pm

In this post, I will distinguish between Lebesgue outer measure (denoted by m^{*}(E) ) and Lebesgue measure (denoted by m(E) ). Recall that Lebesgue outer measure of a set E is \inf \sum_{i=1}^{\infty} l(I_{i}) where \cup_{i=1}^{\infty} I_{i} is a covering of E by open intervals I_{i} = (a_{i},b_{i}) and l(I_{i}) = b_{i} - a_{i} . Lebesgue outer measure is defined for all subsets E of the real line but in a previous post we showed an example of a subset for which the combination of translation invariance and countable additivity does NOT hold for outer measure.

We need to restrict to subsets of the real line for which translation invariance and countable additivity hold; when we use Lebesgue outer measure on these sets we call it Lebesgue measure.

So, what we are going to show is this: if we restrict our sets to those sets E for which:
m^{*}(T) = m^{*}(T \cap E) + m^*(T\cap\tilde{E}) (where \tilde{E} is the set complement of E ) for ALL subsets T , then Lebesgue outer measure IS countably additive with respect to those subsets. Lebesgue outer measure applied to these sets is called Lebesgue measure.

Note: If m^{*}(T) = m^{*}(T \cap E) + m^*(T\cap\tilde{E}) is true we say that “E splits T ” and sometimes refer to T as a “test set”.

So, why is this condition the one that we want? Well, we’ll prove the following results assuming that the given sets E_{i} split all test sets T .

1. If E_{1} and E_{2} are measurable sets (i. e., split all sets T ) then E_{1} \cap E_{2} is also measurable (splits all sets) and E_{1} \cup E_{2} is also measurable.
Proof: first recall that m^{*}(T) \leq m^{*}(T\cap E) + m^{*}(T\cap\tilde{E}) by countable subadditivity of Lebesgue outer measure. We need to show the other inequality.

First recall that \tilde{E_{1}} \cap\tilde{E_{2}} = \tilde{(E_{1}\cup E_{2})} by DeMorgan’s laws (the latter is supposed to be the complement of the union E_{1}\cup E_{2} ).

Now because E_{2} is measurable, (1) m^*(T\cap\tilde{E_{1}})=m^*(T\cap\tilde{E_{1}}\cap E_{2})+m^*(T\cap\tilde{E_{1}}\cap\tilde{E_{2}})

Now recall that, from basic set theory, E_{1}\cup E_{2} = E_{1}\cup(E_{2}\cap\tilde{E_{1}})
So from countable subadditivity of outer measure:
(2) m^*(T\cap (E_{1}\cup E_{2})) \leq m^*(T\cap E_{1}) + m^*(T\cap (E_{2} \cap\tilde{E_{1}}))

So now attempt to compute: (3)
m^*(T\cap (E_{1} \cup E_{2})) + m^*(T \cap\tilde{(E_{1} \cup E_{2})}) \leq m^*(T\cap E_{1}) + m^*(T\cap (E_{2} \cap\tilde{E_{1}})) +m^*(T \cap\tilde{E_{1}} \cap\tilde{E_{2}} )

But E_{2} is measurable and therefore splits T \cap\tilde{E_{1}} and so the last two terms can be combined to m^*(T \cap\tilde{E_{1}}) (this is equation (1)), so the right hand side of (3) becomes m^*(T \cap E_{1}) + m^*(T \cap\tilde{E_{1}}) = m^*(T) because E_{1} is measurable. This is the inequality we needed to show.

So, it follows by a routine induction argument that a finite union of measurable sets is measurable.
Now, what about the intersection? If E_{1} and E_{2} are measurable, so are their complements (and vice versa; the definition is symmetric). Now recall that E_{1} \cap E_{2} = C({(\tilde{E_{1}} \cup\tilde{E_{2}})}) (note: the outer “C() ” denotes set complement as I couldn’t get the LaTeX command for the outer “tilde” to work) and the result follows.

2. Now we show finite additivity of disjoint measurable sets E_{i} :
We need to show that m^*(T \cap (\cup_{i=1}^{n} E_{i})) \geq \sum_{i=1}^{n} m^*(T \cap E_{i}) (the reverse inequality is subadditivity).
Clearly, the statement is true for n = 1 . Assume that the statement is true for all integers up to n - 1 .
Now by disjointness, T \cap (\cup_{i=1}^{n} E_{i}) \cap E_{n} = T\cap E_{n} and T \cap (\cup_{i=1}^{n} E_{i}) \cap\tilde{E_{n}} = T\cap(\cup_{i=1}^{n-1} E_{i}) .
Now E_{n} splits T \cap (\cup_{i=1}^{n} E_{i}) ; therefore
m^*( T \cap (\cup_{i=1}^{n} E_{i}) )= m^*(T \cap E_{n}) +m^*(T \cap (\cup_{i=1}^{n-1} E_{i})) =
m^*(T \cap E_{n})+ \sum_{i=1}^{n-1} m^*(T \cap E_{i})

3. We now need to show that the countable union of measurable sets is measurable.
First note that if E = \cup_{i=1}^{\infty}E_{i} we can assume that the E_{i} are disjoint. Here is why: Let F_{1} = E_{1} , F_{2} = E_{2} \cap\tilde{E_{1}} , F_{3} = E_{3} \cap\tilde{(E_{1}\cup E_{2})} and so on. Then \cup_{i=1}^{\infty} E_{i} = \cup_{i=1}^{\infty} F_{i} and the F_{i} are mutually disjoint (and measurable, by part 1). So we can assume with no loss of generality that the E_{i} have this property.
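The disjointification trick is purely set-theoretic, so it can be illustrated with small finite sets. In this sketch disjointify is a hypothetical helper and the sets E are made-up sample data; nothing here involves measure.

```python
def disjointify(sets):
    """Return F_1, F_2, ... where F_i is E_i minus everything in E_1, ..., E_{i-1}."""
    seen = set()
    result = []
    for e in sets:
        result.append(set(e) - seen)
        seen |= set(e)
    return result

E = [{1, 2, 3}, {2, 3, 4}, {4, 5}, {1, 5, 6}]
F = disjointify(E)
print(F)
```

The F's are pairwise disjoint and have the same union as the E's, which is all that the argument uses.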

Note: I am getting tired of the “tilde” notation and so will be using the C() notation to denote the set complement.

Now let G_{n} = \cup_{i=1}^{n}E_{i} . Then G_{n} is measurable and C(G_{n}) \supset C(E) . Then:
m^*(T) = m^*(T \cap G_{n})+m^*(T\cap C(G_{n})) \geq m^*(T \cap G_{n})+ m^*(T\cap C(E))
By finite additivity m^*(T \cap G_{n}) = \sum_{i=1}^{n} m^*(T \cap E_{i}) . Hence we can substitute into the right hand side of the inequality to obtain:
m^*(T) \geq \sum_{i=1}^{n} m^*(T \cap E_{i})+m^*(T \cap C(E))
This is true for all values of n .
This means m^*(T) \geq \sum_{i=1}^{\infty} m^*(T \cap E_{i})+m^*(T \cap C(E)) \geq m^*(T \cap E) + m^*(T\cap C(E)) ; the latter inequality following from countable subadditivity.

4. Now we can show finite additivity for disjoint measurable sets in Lebesgue measure (no longer outer measure) by replacing T with \mathbb{R} .

5. We now consider a countably infinite collection of disjoint measurable sets. We must show that m(\cup_{i=1}^{\infty} E_{i}) \geq \sum_{i=1}^{\infty} m(E_{i}) (the reverse inequality is countable subadditivity). This follows from the fact that \cup_{i=1}^{\infty} E_{i} \supset \cup_{i=1}^{k} E_{i} for all k . This means that m(\cup_{i=1}^{\infty} E_{i} ) \geq m(\cup_{i=1}^{k} E_{i} ) = \sum_{i=1}^{k} m(E_{i}) for all k , the equality being finite additivity. Hence the infinite sum is either bounded by m(\cup_{i=1}^{\infty} E_{i} ) and therefore converges, or is infinite, as is m(\cup_{i=1}^{\infty} E_{i} ) .

All of this shows that restricting the sets we consider to those that obey the Caratheodory criterion with respect to Lebesgue outer measure gives us the properties that we want.

Of course, we’ve yet to classify WHICH sets are measurable or which sets form important subclasses of measurable sets.
Here is an outline of what we are going to do: we’ll expand on lemmas which state things like “doing operations X, Y and Z to measurable sets yields a measurable set”; we’ve done this with respect to finite unions, finite intersections, and now countable unions. We’ll then see that such sets form a type of “set algebra” and what is known as a “sigma-algebra”.
That is for another day (soon) 🙂

Update Since we are close to closing the deal as far as the “set algebra” and “sigma-algebra” result, I’ll go ahead and finish in this post. First I’d like to recall something from basic set theory (I’ll use C(A) to denote the complement of the set A ):
C(B) \cup C(A) = C(A \cap B). I’ll leave the formal proof of this to the reader, but the basic idea is that if something isn’t an element of both A and B , then it must be in the complement of one or the other set. This leads to the following result for sets E_{i} :
\cup_{i=1}^{\infty} C(E_{i}) = C(\cap_{i=1}^{\infty} E_{i}) Again, I won’t prove this, but the idea is as follows: if something is an element of the left hand side, then it must be in the complement of at least one of the E_{i} which means it can’t be in ALL of them and therefore can’t be in the intersection.

What does this have to do with our measurable sets? We’ve just seen that the countable union of measurable sets is measurable and by symmetry, the complement of a measurable set is measurable. Hence the countable intersection of measurable sets is measurable.

Set algebras An algebra of sets is a collection of sets which is closed with respect to finite unions and complements; hence it is immediate that the set of measurable sets forms a set algebra.

Sigma algebras A set algebra is called a sigma-algebra if it is closed with respect to countable unions as well (and by DeMorgan’s laws: closed with respect to countable intersections as well). So we’ve just shown that the collection of measurable sets forms a sigma-algebra.

What we haven’t shown is a single measurable set as yet! THAT is what we are going to do next.

January 18, 2011

Lebesgue Integral: Bounded Functions on a Bounded Set

Now that we have an idea of what the Lebesgue integral is, how do we define it?

If we limit ourselves to bounded, measurable functions on the real line, we could do the following:
suppose there is a real number M>0 such that -M\leq f\leq M over [0,1]. Then for some integer k, we could set up the partition of the range:

-M=y_{-k}<y_{-k+1}<...<y_{k-1}<y_{k}=M
Now set up E_{-k}=f^{-1}([y_{-k},y_{-k+1})),....E_{k-1}=f^{-1}([y_{k-1},y_{k}))

And then have \varphi _{k}=\sum_{i=-k}^{k-1}y_{i}m(E_{i}) and \psi_{k}=\sum_{i=-k}^{k-1}y_{i+1}m(E_{i})

Note that \varphi _{k} plays the role of the lower sum and \psi _{k} plays the role of the upper sum. If \inf \psi _{k}=\sup \varphi _{k} the function is integrable (as it always is if f is measurable and bounded) and we define \int_{0}^{1}f(x)dx=\inf \psi _{k}=\sup \varphi _{k}.
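To see the lower and upper sums in action, here is a rough sketch for the particular function f(x) = x^2 on [0,1] with M = 1. The choices are assumptions made for illustration: the range levels are equally spaced ( y_i = i/k ), each set of the partition is the preimage of one slab [y_i, y_{i+1}) and is paired with the bottom of its slab in the lower sum and the top in the upper sum, and because this f is increasing each preimage is an interval whose length can be written down directly.

```python
import math

def lebesgue_sums(k):
    """Lower and upper sums for f(x) = x^2 on [0, 1] with range levels y_i = i/k."""
    lower = upper = 0.0
    for i in range(-k, k):
        y_lo, y_hi = i / k, (i + 1) / k
        if y_hi <= 0:
            continue  # f takes no negative values, so this preimage is empty
        # {x in [0,1] : y_lo <= x^2 < y_hi} is the interval [sqrt(y_lo), sqrt(y_hi))
        measure = math.sqrt(y_hi) - math.sqrt(max(y_lo, 0.0))
        lower += y_lo * measure
        upper += y_hi * measure
    return lower, upper

for k in (10, 100, 1000):
    print(k, lebesgue_sums(k))
```

The two sums differ by exactly the slab height 1/k times the total measure, so they squeeze together around 1/3 as k grows.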

Note: It is possible to define the Lebesgue integral without having the concept of measurable function first: we can start with a bounded function f and partition the range of f by y_{0},y_{1},...y_{n}. We can then look at all partitions E_{1}....E_{n} of [0,1] by measurable sets.

Then consider the characteristic functions \chi _{i}(x)=\left\{ \begin{array}{c}1,x\in E_{i} \\ 0,x\notin E_{i}\end{array}\right. so we can form generalized step functions \varphi_{n}(x)=\sum_{i=1}^{n}y_{i-1}\chi _{i}(x) and \psi_{n}(x)=\sum_{i=1}^{n}y_{i}\chi _{i}(x) , whose integrals are \sum_{i=1}^{n}y_{i-1}m(E_{i}) and \sum_{i=1}^{n}y_{i}m(E_{i}) , and call f integrable if \sup \{\int \varphi_{n},\varphi _{n}\leq f\}=\inf \{\int \psi _{n},\psi _{n}\geq f\}. Think of the first function as approximating f from below by generalized step functions, and the second as approximating f from above.

Note: one of the big deals about the Lebesgue integral is that we get better convergence properties; that is, if we have a sequence of integrable functions f_{n}\rightarrow f pointwise over a measurable set, then with only mild extra hypotheses, we can show that the limit function is also integrable and that the integral can be obtained as some sort of limit of integrals.

But to make any headway on such theorems, we’ll have to retreat to some theorems concerning measurable sets; so far we’ve shown that some sets are not measurable; we haven’t developed sufficient conditions for a set to be measurable.

January 17, 2011

Integration: Riemann Integration, Limitations and Lebesgue’s Idea

What about integration? Here we will see what Lebesgue integration is about, how it differs from Riemann integration and why we need to learn about the algebra of measurable sets.

Brief review of Riemann Integration

Remember that the idea was as follows: we limit ourselves to bounded functions. Suppose we want to compute \int_{a}^{b}f(x)dx. We partitioned the interval [a,b] into several subintervals:

a=x_{0}<x_{1}<x_{2}...<x_{n-1}<x_{n}=b. Let m_{i}=\inf f(\xi ),\xi \in \lbrack x_{i-1},x_{i}] and let M_{i}=\sup f(\omega ),\omega \in \lbrack x_{i-1},x_{i}]. Let \Delta x_{i}=x_{i}-x_{i-1}. Call this partition P.

Then L_{P}=\sum_{j=1}^{n}m_{j}\Delta x_{j} and U_{P}=\sum_{j=1}^{n}M_{j}\Delta x_{j} are called the lower sums and upper sums for f with respect to the partition P.

One proves theorems such as if Q is a refinement of partition P then L_{P}\leq L_{Q} and U_{Q}\leq U_{P} (that is, as you make the partition finer…with smaller intervals, the lower sums go up (or stay the same) and the upper sums go down (or stay the same)) and then one can define U to be the infimum (greatest lower bound) of all of the possible upper sums and L to be the supremum (least upper bound) of all of the possible lower sums. If U=L we then declare that to be the (Riemann) integral of f over [a,b].
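The refinement inequalities are easy to watch in a small exact computation. This sketch assumes an increasing function, so that the infimum on each subinterval sits at the left endpoint and the supremum at the right endpoint; darboux_sums, refine and square are hypothetical names, and refining by midpoints is just one convenient choice.

```python
from fractions import Fraction

def darboux_sums(f, points):
    """Lower and upper sums of an increasing f over the partition `points`."""
    pairs = list(zip(points, points[1:]))
    lower = sum(f(a) * (b - a) for a, b in pairs)  # inf is at the left endpoint
    upper = sum(f(b) * (b - a) for a, b in pairs)  # sup is at the right endpoint
    return lower, upper

def refine(points):
    """Insert the midpoint of every subinterval."""
    out = []
    for a, b in zip(points, points[1:]):
        out += [a, (a + b) / 2]
    return out + [points[-1]]

def square(x):
    return x * x

P = [Fraction(0), Fraction(1, 2), Fraction(1)]
Q = refine(P)
L_P, U_P = darboux_sums(square, P)
L_Q, U_Q = darboux_sums(square, Q)
print(L_P, L_Q, U_Q, U_P)
```

For x^2 on [0,1] this gives 1/8, 7/32, 15/32, 5/8 in that order, so the lower sum went up and the upper sum went down under refinement, with the integral 1/3 trapped between them.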

Note that this puts some restrictions on functions that can be integrated; for example f being unbounded, say from above, on a finite interval will prevent upper sums from being finite. Or, if there is some dense subset of [a,b], whose complement is also dense, on which f takes values that are a fixed distance away from the values that f attains on the complement of that subset, the upper and lower sums will never converge to a single value. So this not only puts restrictions on which functions have a Riemann integral, but it also precludes some “reasonable sounding” convergence theorems from being true.

For example, suppose we enumerate the rational numbers by q_{1},q_{2},...q_{k}... and define f_{1}(x)=\left\{\begin{array}{c}1,x\neq q_{1} \\ 0,x=q_{1}\end{array}\right. and then inductively define f_{k}(x)=\left\{\begin{array}{c}1,x\notin \{q_{1},q_{2},..q_{k}\} \\ 0,x\in \{q_{1},q_{2},..q_{k}\}\end{array}\right.  . Then f_{k}\rightarrow f=\left\{\begin{array}{c}1,x\notin \{q_{1},q_{2},..q_{k}....\} \\ 0,x\in \{q_{1},q_{2},..q_{k},...\}\end{array}\right. and for each k, \int_{0}^{1}f_{k}(x)dx=1 but f, the limit function, is not Riemann integrable.

So, there are a couple of things to note here:

1. The Riemann integral involves partitioning the interval to be integrated over without regard to the function being integrated at all; that is, if you were doing \int_{0}^{1}e^{\sqrt{x}}dx or \int_{0}^{1}\sin (x^{2})dx you wouldn’t partition [0,1] any differently.

2. The elements of any partition of the Riemann integral are intervals of finite length.

The Lebesgue integral changes these two features:

1. We’ll use information about the function being integrated to help us select partitions and

2. The elements of our partition need not be intervals of finite length; they just need to be measurable sets.

For example, suppose we wish to compute \int_{0}^{1}(4x-4x^{2})dx by using a Lebesgue integral. Here f(x)=4x-4x^{2}, whose range on [0,1] is [0,1].

Partition the range of f into 4 subintervals:

Y_{1}=0\leq y<.25,

Y_{2}=.25\leq y<.5,

Y_{3}=.5\leq y<.75,

Y_{4}=.75\leq y\leq 1.

Now consider the inverse image of these subintervals and label these:

E_{1}=f^{-1}(Y_{1})=[0,\frac{1}{2}-\frac{1}{4}\sqrt{3})\cup (\frac{1}{2}+\frac{1}{4}\sqrt{3},1]

E_{2}=f^{-1}(Y_{2})=[\frac{1}{2}-\frac{1}{4}\sqrt{3},\frac{1}{2}-\frac{1}{2}\sqrt{\frac{1}{2}})\cup (\frac{1}{2}+\frac{1}{2}\sqrt{\frac{1}{2}},\frac{1}{2}+\frac{1}{4}\sqrt{3}]

E_{3}=f^{-1}(Y_{3})=[\frac{1}{2}-\frac{1}{2}\sqrt{\frac{1}{2}},\frac{1}{4})\cup (\frac{3}{4},\frac{1}{2}+\frac{1}{2}\sqrt{\frac{1}{2}}]

E_{4}=f^{-1}(Y_{4})=[\frac{1}{4},\frac{3}{4}]

Then we form something similar to upper and lower sums. Recall the measure of an interval is just its length.

So we obtain something like an upper sum:

.25m(E_{1})+.5m(E_{2})+.75m(E_{3})+1m(E_{4})\approx .768

and a lower sum as well:

0m(E_{1})+.25m(E_{2})+.5m(E_{3})+.75m(E_{4})\approx .518


See the above figure for an illustration of an upper sum.
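To see these sums numerically, here is a small Python sketch. It takes f(x) = 4x - 4x^2 on [0,1], which is the function whose inverse images are the sets listed above, and it leans on one fact special to this f: the set where f(x) is at least y is an interval of length \sqrt{1-y} centered at 1/2. The function names (preimage_measure, lebesgue_sums) are invented for this illustration.

```python
import math


def preimage_measure(lo, hi):
    """Measure of {x in [0, 1] : lo <= f(x) < hi} for f(x) = 4x - 4x^2.

    {f >= y} is an interval of length sqrt(1 - y) centered at 1/2,
    so the measure we want is a difference of two such lengths.
    """
    return math.sqrt(1.0 - lo) - math.sqrt(1.0 - hi)


def lebesgue_sums(n):
    """Lower and upper sums from cutting the RANGE [0, 1] into n equal pieces."""
    lower = 0.0
    upper = 0.0
    for i in range(n):
        lo, hi = i / n, (i + 1) / n
        m = preimage_measure(lo, hi)
        lower += lo * m   # smallest value on the piece times its measure
        upper += hi * m   # largest value on the piece times its measure
    return lower, upper


print(lebesgue_sums(4))     # the four-piece partition of the range
print(lebesgue_sums(1000))  # a much finer partition of the range
```

The upper and lower sums differ by exactly the width of a range piece (since the inverse images have total measure 1), so both squeeze down on the value of the integral, 2/3, as the range partition is refined.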

Then we proceed by refining the partitions of our range; for this to work we need for the inverse image of the partitions of the range to be measurable sets; this is why we need theorems about what constitutes a measurable set.

A measurable function is one for which the inverse images of intervals (such as the pieces of a partition of its range) are measurable sets.

The Lebesgue integral can be defined as either the infimum of all the upper sums or the supremum of all of the lower sums.

If one wants to see how this works, try doing this for \int_{0}^{1}g(x)dx where
g(x)=\left\{ \begin{array}{c}1,x\notin Q \\ \frac{1}{q},x=\frac{p}{q}\end{array}\right. where Q is the rationals and \frac{p}{q} is in lowest terms.

Then the subinterval which includes 1 in any partition of the range will have an inverse image with measure 1 whereas all subintervals whose upper bounds are strictly less than 1 will have inverse images of measure zero (they contain only rational numbers). Hence it follows that \int_{0}^{1}g(x)dx=1 though g is not Riemann integrable.

Of course, this has been sketchy and we haven’t covered which types of sets are measurable. We’ll also discuss some convergence theorems.

January 13, 2011

A Non-Measurable Set

In our previous post, we talked about how to define a measure on a subset of the real line (or on [0,1] ).

In today’s post, we’ll give an example of a type of set for which no reasonable measure can be defined.
Let us recall what we wanted in a measure:

1. m([a,b])=b-a (and the same for open, and half open intervals). After all, this is supposed to be a length function. Of course, b > a .

2. m(E+r)=m(E): that is, measure should be “translation invariant” that is, if you translate a set E by adding the same constant to every element of E, the translated set should have the same measure.

3. m(\{x\})=m(\emptyset )=0 (the measure of a single point and the empty set ought to be zero)

4. If F\subset E then m(F)\leq m(E) and we’d like to have some additivity properties:

5. Suppose E=\cup _{i=1}^{\infty }E_{i}. Then we’d want m(E)\leq\sum_{i=1}^{\infty }m(E_{i}) provided the right hand sum converges (of course, this convergence is absolute since all terms are non-negative; hence the order of the terms is of no consequence; this uses our calculus results).

Of course, we are talking about a countable union; sums don’t make sense otherwise.

And, if the sets E_{i} are disjoint, we’d love to have m(E)=\sum_{i=1}^{\infty }m(E_{i})

Now let’s define a set for which we could never have a measure which meets the above properties. Yes, we will be using the Axiom of Choice, which, roughly speaking, states that given an infinite disjoint collection of sets, we can “choose” one element from each set.

Start by selecting an irrational number x\in \lbrack 0,1] . Then let the set E consist of x together with points y\in \lbrack 0,1] chosen so that the following is true: y - x is irrational and if z\in E with z\neq y then y - z is irrational. One way to think of this is to establish one point in E and then consider a maximal set in which no two distinct points are equal modulo the rationals. Or one can think of building this set inductively with an uncountable number of steps: start with x , choose y so that y - x is irrational, then add z so that both z - x and z - y are irrational and so on. Of course, all arithmetic is done modulo 1.

Now let’s look at the following: the rational numbers are countable so we can list the nonzero rational numbers in [0,1) as q_{1},q_{2},...,q_{k},.... Then consider the translates of E : E_1 =E+q_{1}, E_2 = E + q_{2},.... .
Then we can establish: E is disjoint from each E_{i} because, say, if y\in E and y = z + q_{i} with z\in E then y - z would be rational with both y, z\in E . That is impossible. For a similar reason, all of the E_{i} are disjoint.

Now if w\in E' (the complement of E in [0,1] ), then there is a z in E such that w - z is rational; hence the union of the E_{i} is the complement of E in [0,1].

So now we have the following situation: if measure is invariant under translation, E and each E_{i} have the same measure. Notice also that all of these sets are disjoint and that \lbrack 0,1]=E\cup \{\cup _{i=1}^{\infty }E_{i}\}

If each set has measure zero and we have countable additivity of disjoint sets, we’d have that [0,1] has measure zero. But if E has a non-zero measure and the measure is translation invariant, then we’d have [0,1] having infinite measure (remember that the infinite sum can’t converge to a positive number because each E_{i} has to have exactly the same measure).

So, no matter how we define measure, we will have to exclude at least some sets (such as E ).

One other thing to note: E has as its complement a countable number of translates of itself, hence it is impossible for E to meet the Carathéodory characterization requirement that E and its complement have measures that add up to the measure of [0,1].

January 12, 2011

The Lebesgue Integral: motivating the mastery of the tools of abstract mathematics

My Ph. D. basic math classes were a long time ago; in the fall of 1985 I enrolled in the beginning topology, algebra, analysis and algebraic topology classes. The algebra, analysis and topology classes were designed to get the student through the preliminary examinations.

The ironic thing is that while I passed analysis on my first try (summer of 1986) and algebra on the first try (summer of 1987), it took me three tries to get through topology which I finally passed in the winter of 1988. Of course, that is what I ended up specializing in.

But I can honestly say that most of my analysis was “studying the problems that might appear on the exam”; I really didn’t know what I was doing nor did I understand the material. I had no perspective whatsoever; I really didn’t know where the course was going. Therefore I remember almost none of it.

What does this have to do with the teaching of college mathematics? I see it this way: often, prior to tackling an abstract concept (say, Lebesgue Integration), one first assembles some tools (say, measure theory, the concept of a sigma algebra of sets, Borel sets, etc.) and then uses these tools as the need arises. But the tools are often developed without any context; I didn’t see why any of this was needed or what it would be used for. I had no perspective; I didn’t know WHY I had to be patient with these esoteric (to me at that time) tools.

So, what I remember now is to constantly remind the students where we are going with a topic; I try to explain WHY we need to be patient with this piece of drudgery or that.

Lebesgue Integration

My goal: I hope to review Lebesgue integration for my own benefit and hopefully, via some notes, perhaps provide some perspective for the student who is going through this process for the first time.

For my references, I am choosing A Primer of Lebesgue Integration by H. S. Bear and Real Analysis by H. L. Royden.

Why Lebesgue Integration?

Ground Rules

For the duration, I’ll be discussing the integration of positive functions over a closed, bounded interval [a,b]. In fact, I’ll go ahead and use [0,1] with no loss of generality.

Of course, if we are merely concerned with integrating functions which are piecewise continuous then the Riemann integral (say, as defined as the limit of upper and lower sums) works just fine.

However what happens if we want to, say, extend the class of functions that we can integrate?

Suppose, for example, we wish to compute \int_{0}^{1}\frac{1}{\sqrt{x}}dx or attempt to compute \int_{0}^{1}\frac{1}{x}dx?

Ok, strictly speaking, these functions aren’t defined on [0,1], so we can replace the integrand by, say a conditional function:

f(x)=\left\{ \begin{array}{c}\frac{1}{\sqrt{x}},x>0 \\ 0,x=0\end{array}\right. in the first case and similarly in the second case.

Of course, we tell our students that while these functions are NOT Riemann integrable (unbounded functions aren’t as the upper sum will never be finite for any finite partition), we can possibly extend the notion of integrability by using the improper integral technique, which, of course, shows that the first integral converges and the second one does not.

But again, we have a type of piecewise continuity. What happens when we don’t?

Let us examine another couple of functions:

g(x)=\left\{ \begin{array}{c}1,x\text{ rational} \\ 0,\text{ }x\text{ irrational}\end{array}\right.

h(x)=\left\{ \begin{array}{c}q,x=\frac{p}{q},\text{ in lowest terms, }p,q\in \{1,2,3..\} \\ 0\text{ otherwise}\end{array}\right.

Notice that g is bounded whereas h is not. Notice that neither is Riemann integrable as there is no hope of the upper and lower sums converging. But notice something else: we know that the rationals compose a minuscule portion of the real numbers; hence there might be another theory of integration that would allow us to integrate both of these functions and obtain 0 for the integral. But there isn’t a good way to obtain the integrals of these functions as a limit of Riemann integrals; there are no smaller intervals to work with. Hence the need for a more expansive theory of integration.

About the rational numbers: why do they “take up little space”? I’ll assume that the reader knows that the rationals are countable and the reals are not, hence there are many more irrational numbers than rational ones.

But as far as “taking up space”: what do we mean by that? We will answer this when we develop “measure theory”; that is, a way of assigning a generalization of a “length” to more complicated subsets of the real line. For now I’ll explain why the set of rational numbers doesn’t take up much space, even though they are a dense subset of the real line.

I will show that we can cover all of the rationals by a countable set of intervals whose lengths add up to an arbitrarily small number. That is,
for any given \varepsilon >0, I’ll construct a countable set of intervals [a_{i},b_{i}] such that \cup _{i=1}^{\infty}[a_{i},b_{i}] contains ALL of the rational numbers AND \sum_{i=1}^{\infty }(b_{i}-a_{i})\leq\varepsilon . Of course, these intervals are NOT disjoint; far from it. But their union contains all of the rational numbers.

Review: Recall \sum_{k=0}^{\infty }r^{k}=\frac{1}{1-r} for |r|<1.

Let q_{1},q_{2},q_{3}.... be an enumeration of the rational numbers (which are countable). Then for q_{k}, consider the interval [q_{k}-\frac{1}{2}(\frac{\varepsilon }{\varepsilon +1})^{k},q_{k}+\frac{1}{2}(\frac{\varepsilon }{\varepsilon +1})^{k}]. This interval has length (\frac{\varepsilon }{\varepsilon +1})^{k}.

Hence, using the geometric series with r=\frac{\varepsilon }{\varepsilon +1} and dropping the k=0 term, adding the lengths of the intervals together gives \sum_{k=1}^{\infty }(\frac{\varepsilon }{\varepsilon +1})^{k}=\frac{r}{1-r}=\varepsilon . So the sum of the lengths of closed intervals which cover all of the rationals can be made as small as desired.
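Here is a small Python sketch of this covering, done with exact rational arithmetic. It assumes the rationals are indexed starting at k = 1 and that the interval around the k-th rational has length r^k with r = \varepsilon /(\varepsilon +1); it only lists rationals in [0,1], and the function names (rationals_in_unit_interval, cover) are made up for this illustration.

```python
from fractions import Fraction


def rationals_in_unit_interval(count):
    """First `count` rationals of [0, 1], listed by increasing denominator."""
    seen = []
    q = 1
    while len(seen) < count:
        for p in range(q + 1):
            r = Fraction(p, q)
            if r not in seen:          # skip repeats such as 2/4 = 1/2
                seen.append(r)
                if len(seen) == count:
                    break
        q += 1
    return seen


def cover(eps, count):
    """Intervals [q_k - r^k/2, q_k + r^k/2] for k = 1..count, r = eps/(eps+1)."""
    r = eps / (eps + 1)
    return [(q - r**k / 2, q + r**k / 2)
            for k, q in enumerate(rationals_in_unit_interval(count), start=1)]


eps = Fraction(1, 10)
intervals = cover(eps, 50)
total_length = sum(b - a for a, b in intervals)
print(float(total_length))
```

However many rationals we cover this way, the total length of the intervals used stays strictly below \varepsilon , since the partial sums of the geometric series increase toward \varepsilon without reaching it.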

So, getting back to the problem of integration: how do we do this? The idea behind the Lebesgue integral is to first define this integral for
“simple functions” that is, functions that take on precisely one value over a set. For example: a step function can be thought of as a kind of
sum of simple functions.

Then if \chi (x) is a simple function over a set I and m(I) is the “measure” (a kind of length) of I, then the new integral would be defined as \int_{I}\chi (x)=\chi (x)m(I) (think of the integral in terms of “area”; this would be a height (in the “function direction”) times “width of the interval” operation).

What will be new is that we’ll consider I‘s that might be much more complicated than a simple disjoint union of intervals. We’ll also show that if A=I\cup J and the union is disjoint and if \chi (x) is simple over both I and J, then \int_{A}\chi = \int_{I}\chi + \int_{J}\chi .

So, going back to our example functions g and h: note that the “measure” of the rational numbers is zero as these can be covered by intervals whose lengths sum to arbitrarily small sums. So if we let Q denote the rational numbers in [0,1] and I denote the irrationals, note that g is a simple function over these two sets. Hence,

\int_{[0,1]}g= \int_{Q}g+ \int_{I}g=m(Q)\ast 1+m(I)\ast 0=0\ast 1+1\ast 0=0.

The second integral is more problematic; we’d have to break Q into a countable union of sets of points whose coordinates have denominators 2, 3, 4, … and prove a theorem about adding up a countable number of integrals (e.g., forming a sequence). Getting to that point will take some work, as you can see.

Measure Theory.

Ok, we want to determine a “generalized length” of subsets of the real line, (subsets of [0,1] for now) and these subsets can
be far more complicated than those which are finite disjoint collections of intervals (if you know what the Cantor “middle thirds” set is, you might
review that and if you don’t know what it is, look up “Cantor Set”; that will serve as a nice example of a complicated subset of the real line).
This is a main point of measure theory.

So, in abstract terms, what we are looking for is some sort of a map m from the collection of subsets of [0,1] to the non-negative real numbers that serves as some sort of a length function. It might be a good time to stop and ask yourself: “what properties would we want this map (called a “measure”) to have?”

Here are some key properties:

1. m([a,b])=b-a (and the same for open, and half open intervals). After all, this is supposed to be a length function. Of course, b > a .

2. m(E+r)=m(E): that is, measure should be “translation invariant” that is, if you translate a set E by adding the same constant to every element of E, the translated set should have the same measure.

3. m(\{x\})=m(\emptyset )=0 (the measure of a single point and the empty set ought to be zero)

4. If F\subset E then m(F)\leq m(E) and we’d like to have some additivity properties:

5. Suppose E=\cup _{i=1}^{\infty }E_{i}. Then we’d want m(E)\leq\sum_{i=1}^{\infty }m(E_{i}) provided the right hand sum converges (of course, this convergence is absolute since all terms are non-negative; hence the order of the terms is of no consequence; this uses our calculus results).

Of course, we are talking about a countable union; sums don’t make sense otherwise.

And, if the sets E_{i} are disjoint, we’d love to have m(E)=\sum_{i=1}^{\infty }m(E_{i})

So how would we go about defining such a measure?

Here is one standard way: Every subset of the real line can be covered by a countable union of open intervals. (Note: if you said to yourself
anything about a Lindelof space, you have no business reading this article unless it is to critique it). So given a set E, let \{(a_{i},b_{i})\} be a countable collection of open subintervals whose union contains E (that is, an open cover for E). If \sum_{i}(b_{i}-a_{i}) is finite, call that the “length” of the cover of E. Then consider the infimum of all lengths of covers of E (note: since E \subset \lbrack 0,1], there exists a cover of length 1 so such an infimum exists). This infimum is
called the measure of E.

This definition of measure gives us most of what we want (1-5). Note: proving 1 is a bit trickier than it might first appear; it is immediate that
m([a,b])\leq b-a. To go the other way, one might use the fact that every open cover of [a,b] has a finite subcover (the Heine-Borel Theorem from your analysis class) and induct on the number of open sets in your finite subcover. Features 2, 3, and 4 are pretty easy to demonstrate.

5 (called “countable subadditivity”) makes for an interesting exercise. I’ll sketch out a solution here:

Since each E_{i} has a measure, we’ll find a collection of open intervals I_{ij} that cover E_{i} and, if l(I_{ij}) denotes the length of the open interval, we can assume that m(E_{i})+\frac{\varepsilon }{2^{i}}>\sum_{j} l(I_{ij}). Then note that
\cup _{ij}I_{ij} covers E=\cup _{i=1}^{\infty }E_{i} and
\sum_{i}(m(E_{i})+\frac{\varepsilon }{2^{i}})>\sum_{i}\sum_{j} l(I_{ij})\rightarrow \sum_{i}(m(E_{i}))+\varepsilon >\sum_{i}\sum_{j} l(I_{ij}) for any \varepsilon >0.

Hence \sum_{i}(m(E_{i}))\geq m(E).

So what about the case where the E_{i} are disjoint; do we get equality?

The answer is…well, NO…not for ALL subsets E.

It will turn out that we will have to restrict our measure to certain subsets called “measurable sets”. The measurable sets form one large collection of subsets of [0,1] that “work”. These sets will be those sets that have this condition: m(E)+m(E^{\prime})=1 (where E^{\prime }=[0,1]-E; the set complement of E).

It isn’t obvious that this is the condition that we’ll need; this complement condition is called the Carathéodory characterization.

Note: I am being a little bit sloppy here. Strictly speaking, I should use the concept of “outer measure” (that is, the “infimum of the sum of the lengths of the covers”) when talking about the Carathéodory characterization and denote that by m^{*}(E) and only use m(E) when I am talking about the “measure” of the “measurable sets”; otherwise I would be using circular logic. But this isn’t a textbook so I’ll abuse notation.

Note: it is easy to establish that intervals and one point sets are measurable. To find other sets that are measurable, we’ll need to use results from the algebra of sets, sigma-algebras, etc.; the idea is to show that, say, the collection of measurable sets is closed with respect to countable unions, countable intersections, set complementation, etc.

This is one reason why set algebra topics are often covered in real analysis textbooks in the first sections or chapters. We’ll also show an example of a non-measurable set in a future post (2 dimensional versions of these come
into play in the Banach-Tarski “paradox”).

In the next section, we construct a non-measurable set.
