College Math Teaching

August 31, 2014

The convolution integral: do some examples in Calculus III or not?

For us, Calculus III is the most rushed of the courses, especially if we start with polar coordinates. Getting to the “three integral theorems” is a real chore. (OK, Green’s Theorem, the Divergence Theorem and Stokes’ Theorem are really all just \int_{\Omega} d \sigma = \int_{\partial \Omega} \sigma , but that is the subject of another post.)

But watching this lecture made me wonder: should I say a few words about how to calculate a convolution integral?

Note: I’ve discussed a type of convolution integral with regard to solving differential equations here.

In the context of Fourier Transforms, the convolution integral is defined as it was in analysis class: f*g = \int^{\infty}_{-\infty} f(x-t)g(t) dt . Typically, we insist that the functions be, say, L^1 and note that it is a bit of a chore to show that the convolution of two L^1 functions is L^1 ; one proves this via the Fubini-Tonelli Theorem.

(The straight-out product of two L^1 functions need not be L^1 ; e.g., consider f(x) = \frac {1}{\sqrt{x}} for x \in (0,1] and zero elsewhere; f is L^1 but f \cdot f = \frac{1}{x} on (0,1] is not.)

So, assuming that the integral exists, how do we calculate it? Easy, you say? Well, it can be, after practice.

But to test out your skills, let f(x) = g(x) be the function that is 1 for x \in [\frac{-1}{2}, \frac{1}{2}] and zero elsewhere. So, what is f*g ???

So, it is easy to see that f(x-t)g(t) only assumes the value of 1 on a specific region of the (x,t) plane and is zero elsewhere; this is just like doing an iterated integral of a two variable function; at least the first step. This is why it fits well into calculus III.

f(x-t)g(t) = 1 for the following region: (x,t), -\frac{1}{2} \le x-t \le \frac{1}{2}, -\frac{1}{2} \le t \le \frac{1}{2}

This region is the parallelogram with vertices at (-1, -\frac{1}{2}), (0, -\frac{1}{2}), (0, \frac{1}{2}), (1, \frac{1}{2}) .

[Figure: the parallelogram region in the (x,t) plane]

Now we see that we can’t do the integral in one step. So, the function we are integrating f(x-t)f(t) has the following description:

f(x-t)f(t)=\left\{\begin{array}{c} 1,x \in [-1,0], -\frac{1}{2} \le t \le \frac{1}{2}+x \\ 1 ,x\in [0,1], -\frac{1}{2}+x \le t \le \frac{1}{2} \\ 0 \text{ elsewhere} \end{array}\right.

So the convolution integral is \int^{\frac{1}{2} + x}_{-\frac{1}{2}} dt = 1+x for x \in [-1,0) and \int^{\frac{1}{2}}_{-\frac{1}{2} + x} dt = 1-x for x \in [0,1] .

That is, of course, the tent map that we described here. The graph is shown here:

[Figure: graph of the tent function f*g]
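For readers who want a numerical sanity check, here is a minimal Python sketch (not part of the original calculation; the helper names box, convolve_at and tent are made up for illustration). It approximates the convolution integral with a midpoint rule, assuming it is enough to integrate over t \in [-1,1] because g vanishes outside [\frac{-1}{2}, \frac{1}{2}] , and compares the result with 1-|x| .

```python
def box(x):
    """The function f = g from the example: 1 on [-1/2, 1/2], 0 elsewhere."""
    return 1.0 if -0.5 <= x <= 0.5 else 0.0


def convolve_at(x, n=20000, lo=-1.0, hi=1.0):
    """Midpoint-rule approximation of the integral of box(x - t) * box(t) dt.

    Integrating over [lo, hi] = [-1, 1] suffices because box(t) is zero
    outside [-1/2, 1/2].
    """
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        t = lo + (i + 0.5) * h
        total += box(x - t) * box(t)
    return total * h


def tent(x):
    """The claimed answer: 1 + x on [-1, 0), 1 - x on [0, 1], 0 elsewhere."""
    return max(0.0, 1.0 - abs(x))


if __name__ == "__main__":
    for x in (-1.0, -0.5, 0.0, 0.25, 0.75, 1.0):
        print(x, round(convolve_at(x), 3), tent(x))
```

The two printed columns should agree up to the error of the quadrature, which is roughly the step size because the integrand jumps at the edges of the parallelogram.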

So, it would appear to me that a good time to do a convolution exercise is right when we study iterated integrals; just tell the students that this is a case where one “stops before doing the outside integral”.


October 25, 2013

A Laplace Transform of a function of non-exponential order

Many differential equations textbooks (“First course” books) limit themselves to taking Laplace transforms of functions of exponential order. That is a reasonable thing to do. However I’ll present an example of a function NOT of exponential order that has a valid (if not very useful) Laplace transform.

Consider the following function, where n \in \{1, 2, 3,...\} :

g(t)= \begin{cases}      1,& \text{if } 0 \leq t \leq 1\\      10^n,              & \text{if } n \leq t \leq n+\frac{1}{100^n} \\  0,  & \text{otherwise}  \end{cases}

Now note the following: g is unbounded on [0, \infty) , \lim_{t \rightarrow \infty} g(t) does not exist and
\int^{\infty}_0 g(t)dt = 1 + \frac{1}{10} + \frac{1}{10^2} + .... = \frac{1}{1 - \frac{1}{10}} = \frac{10}{9}

One can think of the graph of g as a series of disjoint “rectangles”, each of width \frac{1}{100^n} and height 10^n (so each has area \frac{1}{10^n} ). The rectangles get skinnier and taller as n goes to infinity and there is a LOT of zero height in between the rectangles.

[Figure: sketch of the graph of g]

Needless to say, the “boxes” would be taller and skinnier.
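As a quick check of the area computation, here is a small Python sketch using exact rational arithmetic (the function name area_up_to is a made-up helper, not anything from a library). It adds the unit box to the first N rectangles, each of height 10^n and width \frac{1}{100^n} , and shows how far the partial sum is from \frac{10}{9} .

```python
from fractions import Fraction


def area_up_to(N):
    """Exact area under g on [0, N + 1]: the unit box plus rectangles 1..N."""
    total = Fraction(1)  # the piece of height 1 on [0, 1]
    for n in range(1, N + 1):
        height = Fraction(10) ** n
        width = Fraction(1, 100 ** n)
        total += height * width  # each rectangle has area 1 / 10**n
    return total


if __name__ == "__main__":
    for N in (1, 2, 5, 10):
        gap = Fraction(10, 9) - area_up_to(N)
        print(N, area_up_to(N), float(gap))
```

The gap is exactly \frac{1}{9 \cdot 10^N} , so the partial areas increase to \frac{10}{9} even though the heights blow up.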

Note: this example can be easily modified to provide an example of a function which is L^2 (square integrable) and which is unbounded on [0, \infty) . Hat tip to Ariel who caught the error.

It is easy to compute the Laplace transform of g :

G(s) = \int^{\infty}_0 g(t)e^{-st} dt . The transform exists if, say, s \geq 0 by routine comparison test as |e^{-st}| \leq 1 for that range of s and the calculation is easy:

G(s) = \int^{\infty}_0 g(t)e^{-st} dt = \frac{1}{s} (1-e^{-s}) + \frac{1}{s} \sum^{\infty}_{n=1} (\frac{10}{e^s})^n(1-e^{\frac{-s}{100^n}}) for s > 0 (for s = 0 the integral is just \frac{10}{9} , as computed above).

Note: if one wants to, one can see that the given series representation converges for s \geq 0 by using the ratio test and L’Hôpital’s rule.
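For the skeptical reader, here is a rough numerical sketch in Python, assuming s > 0 . The names G_series and G_quadrature are hypothetical helpers: the first sums a truncated version of the series for G(s) , the second integrates g(t)e^{-st} directly, rectangle by rectangle, with a midpoint rule. The truncation levels (60 terms, 8 rectangles, 400 sample points) are arbitrary choices.

```python
import math


def G_series(s, terms=60):
    """Series form of the transform for s > 0, truncated after `terms` rectangles."""
    total = (1.0 - math.exp(-s)) / s
    for n in range(1, terms + 1):
        # (10 / e^s)^n, written as exp(n * (ln 10 - s))
        growth = math.exp(n * (math.log(10.0) - s))
        # 1 - exp(-s / 100^n), computed with expm1 to avoid cancellation
        decay = -math.expm1(-s / 100.0 ** n)
        total += growth * decay / s
    return total


def G_quadrature(s, terms=8, m=400):
    """Direct midpoint-rule integration of g(t) * exp(-s t), rectangle by rectangle."""
    def midpoint(a, width):
        h = width / m
        return sum(math.exp(-s * (a + (j + 0.5) * h)) for j in range(m)) * h

    total = midpoint(0.0, 1.0)  # the piece of height 1 on [0, 1]
    for n in range(1, terms + 1):
        total += 10.0 ** n * midpoint(float(n), 100.0 ** (-n))
    return total


if __name__ == "__main__":
    for s in (2.0, 1.0, 0.1, 1e-6):
        print(s, G_series(s))
    print(G_quadrature(1.0))
```

As s shrinks toward 0 the series values should creep up toward \frac{10}{9} , the integral of g itself.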

February 3, 2011

Why A Bounded Condition is Necessary for Integral Convergence Theorems

Filed under: advanced mathematics, analysis, calculus, integrable function, integrals, Lebesgue Integration — collegemathteaching @ 1:07 am

One of the reasons that the Lebesgue theory is an improvement on the Riemann theory is that we have better convergence properties for integrals. For example, we could define the following sequence of functions: f _{k}(x)=\left\{ \begin{array}{c}1,x\notin \{1/2, 1/4, 3/4, ...1/2^k, 3/2^k,...(2^{k} -1)/2^k\} \\ 0,x\in \{1/2, 1/4, 3/4, ...1/2^k, 3/2^k,...(2^{k} -1)/2^k\} \end{array}\right. Then \int_{0}^{1} f_{k}(x)dx =1 for all k and f_{k}(x) \to f(x) pointwise, where f is 0 on the dyadic rationals in (0,1) and 1 elsewhere; f does not have a Riemann integral. It does have a Lebesgue integral, which is 1.

In fact, there are many convergence theorems, one of which is the Bounded Convergence Theorem:
If f_{k} is a sequence of measurable functions on a finite measure set S and the functions f_{k} are uniformly bounded on S and f_{k} \to f pointwise on S then \lim \int_{S} f_{k} = \int_{S} \lim f_{k} = \int_{S} f .

We’ll take on this convergence theorem a little later (in another post). But why do we need a condition of the “uniformly bounded” type?
We’ll present an elementary but fun counterexample to the theorem if the “uniformly bounded” condition is dropped.

We’ll start our construction by considering the sequence \{1/2, 1/3, ....1/k,...\} . We then note that 1/n and 1/(n+1) are 1/(n(n+1)) units apart; hence we can form disjoint segments ( 1/n - 1/(3n(n+1)), 1/n + 1/(3n(n+1))) which can be used as the bases for disjoint isosceles triangles whose upper vertex is at coordinate (1/n, 3n(n+1)) . Each of these triangles encloses an area of 1.
Now we can define a sequence of functions f_{k} by letting f_{k} = 0 off the base of the triangle centered at 1/k and letting the graph of f_{k} follow the upper two edges of the triangle. Then for all k we have \int_{0}^{1} f_{k}(x)dx = 1 . Hence \lim \int_{0}^{1} f_{k} (x)dx = 1 . Also note that f_{k} \to f pointwise, where f(x) = 0 for all x \in [0,1] . This isn’t hard to see; first note that f_{k}(0) = 0 for all k and, for t > 0 , f_{m} (t) = 0 for all m where (3m + 4)/(3m^2 +3m) < t (the left side is the right endpoint of the base of the triangle centered at 1/m ). Of course, \int_{0}^{1} f(x)dx = 0 . Each f_{k} is bounded but the sequence is NOT uniformly bounded as the peaks of the triangles get higher and higher.
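The construction can be checked with exact arithmetic. The following Python sketch is only an illustration (the helper names half_base, peak, f, area and right_edge are invented here); it builds the triangle functions as described, with base half-width 1/(3k(k+1)) and height 3k(k+1) .

```python
from fractions import Fraction


def half_base(k):
    """Half the width of the base of the triangle centered at 1/k."""
    return Fraction(1, 3 * k * (k + 1))


def peak(k):
    """Height of the triangle centered at 1/k."""
    return 3 * k * (k + 1)


def f(k, x):
    """Value of f_k at x: the triangle over its base, 0 elsewhere."""
    dist = abs(Fraction(x) - Fraction(1, k))
    w = half_base(k)
    if dist >= w:
        return Fraction(0)
    return peak(k) * (1 - dist / w)


def area(k):
    """Area of the k-th triangle: (1/2) * base * height."""
    return Fraction(1, 2) * (2 * half_base(k)) * peak(k)


def right_edge(k):
    """Right endpoint of the base, equal to (3k + 4) / (3k^2 + 3k)."""
    return Fraction(1, k) + half_base(k)


if __name__ == "__main__":
    t = Fraction(1, 10)
    for k in (5, 10, 11, 50):
        print(k, area(k), peak(k), f(k, t))
```

At the sample point t = 1/10 the value f_{k}(t) is 0 for every k \geq 11 , while each area stays equal to 1 and the peaks grow without bound.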

Note: I made a mistake when I first posted this; of course we don’t NEED “uniform boundedness” but we need either a boundedness condition or some condition that ensures that the measure of the unbounded parts is zero.

Example: order the rationals between 0 and 1 by q_{i} and let each rational be written as p_{i}/d_{i} in lowest terms.
Then define f_{k}(x) to be d_{i} for x = q_{i} with i \leq k and 1 otherwise. Then the f_{k} are NOT uniformly bounded as a sequence but \int_{0}^{1}f_{k} = 1 for all k and \int_{0}^{1}f =1 (where f_{k}\to f ).
But this is ok as f differs from the constant function y = 1 on a set of measure zero.

January 18, 2011

Lebesgue Integral: Bounded Functions on a Bounded Set

Now that we have an idea of what the Lebesgue integral is, how do we define it?

If we limit ourselves to bounded, measurable functions on the real line, we could do the following:
Suppose there is a real number M>0 such that -M\leq f\leq M over [0,1]. Then for some positive integer k, we could set up the following partition of the range:

-M=y_{-k},\frac{(1-k)M}{k}=y_{-k+1},\frac{(2-k)M}{k}=y_{-k+2},....0=y_{0},\frac{M}{k}=y_{1},....\frac{(k-1)M}{k}=y_{k-1},M=y_{k}

Now set up E_{i}=f^{-1}([y_{i},y_{i+1})) for i=-k,...,k-2 and E_{k-1}=f^{-1}([y_{k-1},y_{k}]) (the last interval is closed so that the value M is included).

And then have \varphi _{k}=\sum_{i=-k}^{k-1}y_{i}m(E_{i}) and \psi_{k}=\sum_{i=-k}^{k-1}y_{i+1}m(E_{i})

Note that \varphi _{k} plays the role of the lower sum and \psi _{k} plays the role of the upper sum. If \inf \psi _{k}=\sup \varphi _{k} the function is integrable (as it always is if f is measurable and bounded) and we define \int_{0}^{1}f(x)dx=\inf \psi _{k}=\sup \varphi _{k}.
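To see these sums in action, here is a minimal Python sketch for one concrete function, f(x)=x^{2} on [0,1] with M = 1 . This choice of f is mine, made only because the measure of each inverse image has a closed form (the set where a \leq x^{2} < b is an interval of length \sqrt{b}-\sqrt{a} ); the names preimage_measure and lebesgue_sums are made up.

```python
import math


def preimage_measure(a, b):
    """m({x in [0, 1] : a <= x**2 < b}), found by clipping [a, b) to the range [0, 1]."""
    lo = min(max(a, 0.0), 1.0)
    hi = min(max(b, 0.0), 1.0)
    return math.sqrt(hi) - math.sqrt(lo)


def lebesgue_sums(k, M=1.0):
    """Return (phi_k, psi_k) for f(x) = x**2 on [0, 1], assuming -M <= f <= M."""
    phi = psi = 0.0
    for i in range(-k, k):
        y_low, y_high = i * M / k, (i + 1) * M / k
        m = preimage_measure(y_low, y_high)
        phi += y_low * m   # lower value on this piece of the range
        psi += y_high * m  # upper value on this piece of the range
    return phi, psi


if __name__ == "__main__":
    for k in (10, 100, 1000):
        print(k, lebesgue_sums(k))
```

The two sums differ by exactly \frac{M}{k} times the total measure of [0,1], so they squeeze down on \frac{1}{3} as k grows.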

Note: It is possible to define the Lebesgue integral without having the concept of measurable function first: we can start with a bounded function
f and partition the range of f by y_{0},y_{1},...y_{n}. We can then look at all partitions E_{1}....E_{n} of [0,1] by measurable sets.

Then consider the characteristic functions \chi _{i}(x)=\left\{ \begin{array}{c}1,x\in E_{i} \\ 0,x\notin E_{i}\end{array}\right. so we can form generalized step functions \varphi_{n}=\sum_{i=1}^{n}y_{i-1}\chi _{i} and \psi_{n}=\sum_{i=1}^{n}y_{i}\chi _{i} , whose integrals are \sum_{i=1}^{n}y_{i-1}m(E_{i}) and \sum_{i=1}^{n}y_{i}m(E_{i}) respectively, and call f integrable if \sup \{\int \varphi_{n}:\varphi _{n}\leq f\}=\inf \{\int \psi _{n}:\psi _{n}\geq f\}. Think of the first function as approximating f from below by generalized step functions, and the second as approximating f from above.

Note: one of the big deals about the Lebesgue integral is that we get better convergence properties; that is, if we have a sequence of integrable functions f_{n}\rightarrow f pointwise over a measurable set, then with only mild extra hypotheses, we can show that the limit function is also integrable and that the integral can be obtained as some sort of limit of integrals.

But to make any headway on such theorems, we’ll have to retreat to some theorems concerning measurable sets; so far we’ve shown that some sets are not measurable; we haven’t developed sufficient conditions for a set to be measurable.

January 17, 2011

Integration: Riemann Integration, Limitations and Lebesgue’s Idea

What about integration? Here we will see what Lebesgue integration is about, how it differs from Riemann integration and why we need to learn about the algebra of measurable sets.

Brief review of Riemann Integration

Remember that the idea was as follows: we limit ourselves to bounded functions. Suppose we want to compute \int_{a}^{b}f(x)dx. We partition the interval [a,b] into several subintervals:

a=x_{0}<x_{1}<x_{2}...<x_{n-1}<x_{n}=b. Let m_{i}=\inf f(\xi ),\xi \in \lbrack x_{i-1},x_{i}] and let M_{i}=\sup f(\omega ),\omega \in \lbrack x_{i-1},x_{i}]. Let \Delta x_{i}=x_{i}-x_{i-1}. Call this partition P.

Then L_{P}=\sum_{j=1}^{n}m_{j}\Delta x_{j} and U_{P}=\sum_{j=1}^{n}M_{j}\Delta x_{j} are called the lower sums and upper sums for f with respect
to the partition P.

One proves theorems such as: if Q is a refinement of partition P then L_{P}\leq L_{Q} and U_{Q}\leq U_{P} (that is, as you make the partition finer, with smaller intervals, the lower sums go up (or stay the same) and the upper sums go down (or stay the same)). Then one can define U to be the infimum (greatest lower bound) of all of the possible upper sums and L to be the supremum (least upper bound) of all of the possible lower sums. If U=L we then declare that common value to be the (Riemann) integral of f over [a,b].
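Here is a small Python sketch of the refinement property, under simplifying assumptions that are mine: the function is increasing (so the infimum and supremum on each subinterval sit at the endpoints) and the partitions are uniform. The names riemann_sums and square are illustrative only.

```python
def riemann_sums(f, a, b, n):
    """Lower and upper sums of an increasing function f on n equal subintervals.

    For an increasing f, the infimum on each subinterval is at its left endpoint
    and the supremum is at its right endpoint.
    """
    h = (b - a) / n
    xs = [a + i * h for i in range(n + 1)]
    lower = sum(f(xs[i]) * h for i in range(n))
    upper = sum(f(xs[i + 1]) * h for i in range(n))
    return lower, upper


def square(x):
    return x * x


if __name__ == "__main__":
    for n in (4, 8, 16, 1024):
        print(n, riemann_sums(square, 0.0, 1.0, n))
```

Each doubling of n refines the previous partition, so the lower sums rise and the upper sums fall, trapping \frac{1}{3} between them.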

Note that this puts some restrictions on functions that can be integrated; for example f being unbounded, say from above, on a finite interval will prevent upper sums from being finite. Or, if there is some dense subset of [a,b] on which f attains values that are a set distance away from the values that f attains on the complement of that subset, the upper and lower sums will never converge to a single value. So this not only puts restrictions on which functions have a Riemann integral, but it also precludes some “reasonable sounding” convergence theorems from being true.

For example, suppose we enumerate the rational numbers by q_{1},q_{2},...q_{k}... and define f_{1}(x)=\left\{\begin{array}{c}1,x\neq q_{1} \\ 0,x=q_{1}\end{array}\right. and then inductively define f_{k}(x)=\left\{\begin{array}{c}1,x\notin \{q_{1},q_{2},..q_{k}\} \\ 0,x\in \{q_{1},q_{2},..q_{k}\}\end{array}\right.  . Then f_{k}\rightarrow f=\left\{\begin{array}{c}1,x\notin \{q_{1},q_{2},..q_{k}....\} \\ 0,x\in \{q_{1},q_{2},..q_{k},...\}\end{array}\right. and for each k, \int_{0}^{1}f_{k}(x)dx=1 but f, the limit function, is not Riemann integrable.

So, there are a couple of things to note here:

1. The Riemann integral involves partitioning the interval to be integrated over without regard to the function being integrated at all; that is, if you were doing \int_{0}^{1}e^{\sqrt{x}}dx or \int_{0}^{1}\sin (x^{2})dx you wouldn’t partition [0,1] any differently.

2. The elements of any partition of the Riemann integral are intervals of finite length.

The Lebesgue integral changes these two features;

1. We’ll use information about the function being integrated to help us select partitions and

2. The elements of our partition need not be intervals of finite length; they just need to be measurable sets.

For example, suppose we wish to compute \int_{0}^{1}(4x-4x^{2})dx by using a Lebesgue integral; here f(x)=4x-4x^{2} has range [0,1] on [0,1].

Partition the range of f into 4 subintervals:

Y_{1}=[0,.25),

Y_{2}=[.25,.5),

Y_{3}=[.5,.75),

Y_{4}=[.75,1].

Now consider the inverse image of these subintervals and label these:

E_{1}=f^{-1}(Y_{1})=[0,\frac{1}{2}-\frac{1}{4}\sqrt{3})\cup (\frac{1}{2}+\frac{1}{4}\sqrt{3},1]

E_{2}=f^{-1}(Y_{2})=[\frac{1}{2}-\frac{1}{4}\sqrt{3},\frac{1}{2}-\frac{1}{2}\sqrt{\frac{1}{2}})\cup (\frac{1}{2}+\frac{1}{2}\sqrt{\frac{1}{2}},\frac{1}{2}+\frac{1}{4}\sqrt{3}]

E_{3}=f^{-1}(Y_{3})=[\frac{1}{2}-\frac{1}{2}\sqrt{\frac{1}{2}},\frac{1}{4})\cup (\frac{3}{4},\frac{1}{2}+\frac{1}{2}\sqrt{\frac{1}{2}}]

E_{4}=f^{-1}(Y_{4})=[\frac{1}{4},\frac{3}{4}]

Then we form something similar to upper and lower sums. Recall the measure of an interval is just its length.

So we obtain something like an upper sum:

U=\frac{1}{4}m(E_{1})+\frac{1}{2}m(E_{2})+\frac{3}{4}m(E_{3})+1m(E_{4})

and a lower sum as well:

L=0m(E_{1})+\frac{1}{4}m(E_{2})+\frac{1}{2}m(E_{3})+\frac{3}{4}m(E_{4})

See the above figure for an illustration of an upper sum.
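The numbers are easy to produce. The following Python sketch assumes the integrand is f(x)=4x-4x^{2} (the function whose inverse images are the sets E_{1},...,E_{4} listed above) and uses the identity 4x-4x^{2}=1-(2x-1)^{2} to get the measure of each inverse image. The helper names level_set_measure and range_sums are invented for this illustration.

```python
import math


def level_set_measure(c_low, c_high):
    """m({x in [0, 1] : c_low <= 4x - 4x^2 < c_high}) for 0 <= c_low < c_high <= 1.

    Since 4x - 4x^2 = 1 - (2x - 1)^2, the set where f >= c is an interval of
    length sqrt(1 - c) centered at 1/2.
    """
    return math.sqrt(1.0 - c_low) - math.sqrt(1.0 - c_high)


def range_sums(n):
    """Lower and upper sums from cutting the range [0, 1] into n equal pieces."""
    lower = upper = 0.0
    for i in range(n):
        y_low, y_high = i / n, (i + 1) / n
        m = level_set_measure(y_low, y_high)
        lower += y_low * m
        upper += y_high * m
    return lower, upper


if __name__ == "__main__":
    for n in (4, 40, 400):
        print(n, range_sums(n))
```

With n = 4 this gives L of about 0.518 and U of about 0.768, which differ by exactly \frac{1}{4} and bracket the true value \frac{2}{3} ; finer partitions of the range tighten the bracket.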

Then we proceed by refining the partitions of our range; for this to work we need for the inverse image of the partitions of the range to be measurable sets; this is why we need theorems about what constitutes a measurable set.

A measurable function is one for which the inverse images of intervals (such as the pieces of a partition of its range) are measurable sets.

The Lebesgue integral can be defined as either the infimum of all the upper sums or the supremum of all of the lower sums.

If one wants to see how this works, try doing this for \int_{0}^{1}g(x)dx where
g(x)=\left\{ \begin{array}{c}1,x\notin Q \\ \frac{1}{q},x=\frac{p}{q}\end{array}\right. where Q is the rationals and \frac{p}{q} is in lowest terms.

Then the subinterval which includes 1 in any partition of the range will have an inverse image with measure 1 whereas all subintervals whose upper bounds are strictly less than 1 will have measure zero. Hence it follows that \int_{0}^{1}g(x)dx=1 though g is not Riemann integrable.

Of course, this has been sketchy and we haven’t covered which types of sets are measurable. We’ll discuss some convergence theorems as well.

January 12, 2011

The Lebesgue Integral: motivating the mastery of the tools of abstract mathematics

My Ph. D. basic math classes were a long time ago; in the fall of 1985 I enrolled in the beginning topology, algebra, analysis and algebraic topology classes. The algebra, analysis and topology classes were designed to get the student through the preliminary examinations.

The ironic thing is that while I passed analysis on my first try (summer of 1986) and algebra on the first try (summer of 1987), it took me three tries to get through topology which I finally passed in the winter of 1988. Of course, that is what I ended up specializing in.

But I can honestly say that most of my analysis was “studying the problems that might appear on the exam”; I really didn’t know what I was doing nor did I understand the material. I had no perspective whatsoever; I really didn’t know where the course was going. Therefore I remember almost none of it.

What does this have to do with the teaching of college mathematics? I see it this way: often, prior to tackling an abstract concept (say, Lebesgue Integration), one first assembles some tools (say, measure theory, the concept of a sigma algebra of sets, Borel sets, etc.) and then uses these tools as the need arises. But the tools are often developed without any context; I didn’t see why any of this was needed or what it would be used for. I had no perspective; I didn’t know WHY I had to be patient with these esoteric (to me at that time) tools.

So, what I remember now is to constantly remind the students where we are going with a topic; I try to explain WHY we need to be patient with this piece of drudgery or that.

Lebesgue Integration

My goal: I hope to review Lebesgue integration for my own benefit and hopefully, via some notes, perhaps provide some perspective for the student who is going through this process for the first time.

For my references, I am choosing A Primer of Lebesgue Integration by H. S. Bear and Real Analysis by H. L. Royden.

Why Lebesgue Integration?

Ground Rules

For the duration, I’ll be discussing the integration of positive functions over a closed, bounded interval [a,b]. In fact, I’ll go ahead and use [0,1] with no loss of generality.

Of course, if we are merely concerned with integrating functions which are piecewise continuous then the Riemann integral (say, as defined as the limit of upper and lower sums) works just fine.

However what happens if we want to, say, extend the class of functions that we can integrate?

Suppose, for example, we wish to compute \int_{0}^{1}\frac{1}{\sqrt{x}}dx or attempt to compute \int_{0}^{1}\frac{1}{x}dx?

Ok, strictly speaking, these functions aren’t defined on [0,1], so we can replace the integrand by, say a conditional function:

f(x)=\left\{ \begin{array}{c}\frac{1}{\sqrt{x}},x>0 \\ 0,x=0\end{array}\right. in the first case and similarly in the second case.

Of course, we tell our students that while these functions are NOT Riemann integrable (unbounded functions aren’t as the upper sum will never be finite for any finite partition), we can possibly extend the notion of integrability by using the improper integral technique, which, of course, shows that the first integral converges and the second one does not.

But again, we have a type of piecewise continuity. What happens when we don’t?

Let us examine another couple of functions:

g(x)=\left\{ \begin{array}{c}1,x\text{ rational} \\ 0,\text{ }x\text{ irrational}\end{array}\right.

h(x)=\left\{ \begin{array}{c}q,x=\frac{p}{q},\text{ in lowest terms, }p,q\in \{1,2,3..\} \\ 0\text{ otherwise}\end{array}\right.

Notice that g is bounded whereas h is not. Notice that neither is Riemann integrable as there is no hope of the upper and lower sums converging. But notice something else: we know that the rationals compose a minuscule portion of the real numbers; hence there might be another theory of integration that would allow us to integrate both of these functions and obtain 0 for the integral. But there isn’t a good way to obtain the integrals of these functions as a limit of Riemann integrals; there are no smaller intervals to work with. Hence the need for a more expansive theory of integration.

About the rational numbers: why do they “take up little space”? I’ll assume that the reader knows that the rationals are countable and the reals are not, hence there are many more irrational numbers than rational ones.

But as far as “taking up space”: what do we mean by that? We will answer this when we develop “measure theory”; that is, a way of assigning a generalization of a “length” to more complicated subsets of the real line. For now I’ll explain why the set of rational numbers doesn’t take up much space, even though they are a dense subset of the real line.

I will show that we can cover all of the rationals by a countable set of intervals whose lengths add up to an arbitrarily small number. That is,
for any given \varepsilon >0, I’ll construct a countable set of intervals [a_{i},b_{i}] such that \cup _{i=1}^{\infty}[a_{i},b_{i}] contains ALL of the rational numbers AND \sum_{i=1}^{\infty }(b_{i}-a_{i})\leq\varepsilon . Of course, these intervals are NOT disjoint; far from it. But their union contains all of the rational numbers.

Review: Recall \sum_{k=0}^{\infty }r^{k}=\frac{1}{1-r} for |r|<1.

Let q_{0},q_{1},q_{2}.... be an enumeration of the rational numbers (which are countable). Then for q_{k}, consider the interval [q_{k}-\frac{1}{2}(\frac{\varepsilon }{\varepsilon +1})^{k+1},q_{k}+\frac{1}{2}(\frac{\varepsilon }{\varepsilon +1})^{k+1}]. This interval has length (\frac{\varepsilon }{\varepsilon +1})^{k+1}.

Hence adding the lengths of the intervals together gives \sum_{k=0}^{\infty }(\frac{\varepsilon }{\varepsilon +1})^{k+1}=\frac{\varepsilon }{\varepsilon +1}\cdot \frac{1}{1-\frac{\varepsilon }{\varepsilon +1}}=\varepsilon . So the sum of the lengths of closed intervals which cover all of the rationals can be made as small as desired.
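A short Python sketch makes the geometric series concrete. It assumes the interval around q_{k} is given length (\frac{\varepsilon }{\varepsilon +1})^{k+1} , a choice of ratio and exponent made so that the series sums to exactly \varepsilon ; the helper names interval_length and total_length are hypothetical.

```python
from fractions import Fraction


def interval_length(eps, k):
    """Length of the interval around q_k: (eps / (eps + 1)) ** (k + 1)."""
    return (eps / (eps + 1)) ** (k + 1)


def total_length(eps, N):
    """Sum of the lengths of the first N intervals (k = 0, ..., N - 1)."""
    return sum(interval_length(eps, k) for k in range(N))


if __name__ == "__main__":
    eps = Fraction(1, 100)
    for N in (1, 5, 50):
        print(N, float(total_length(eps, N)), float(eps))
```

Every partial sum stays strictly below \varepsilon , and the shortfall after N intervals is exactly \varepsilon (\frac{\varepsilon }{\varepsilon +1})^{N} .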

So, getting back to the problem of integration: how do we do this? The idea behind the Lebesgue integral is to first define this integral for
“simple functions” that is, functions that take on precisely one value over a set. For example: a step function can be thought of as a kind of
sum of simple functions.

If \chi (x) is a simple function over a set I and m(I) is the “measure” (a kind of length) of I, then the new integral would be defined as \int_{I}\chi (x)=\chi (x)m(I) (think of the integral in terms of “area”; this would be a height (in the “function direction”) times “width of the interval” operation).

What will be new is that we’ll consider sets I that might be much more complicated than a simple disjoint union of intervals. We’ll also show that if A=I\cup J and the union is disjoint and if \chi (x) is simple over both I and J, then \int_{A}\chi = \int_{I}\chi + \int_{J}\chi

So, going back to our example functions g and h: note that the “measure” of the rational numbers is zero as these can be covered by intervals whose lengths sum to arbitrarily small sums. So if we let Q denote the rational numbers in [0,1] and I denote the irrationals, note that g is a simple function over these two sets. Hence,

\int_{[0,1]}g= \int_{Q}g+ \int_{I}g=m(Q)\ast 1+m(I)\ast 0=0\ast 1+1\ast 0=0.

The second integral is more problematic; we’d have to break Q into a countable union of points whose coordinates have denominators 2, 3, 4,….and to prove a theorem about adding up a countable number of integrals
(e. g., forming a sequence). Getting to that point will take some work, as you can see.

Measure Theory.

Ok, we want to determine a “generalized length” of subsets of the real line, (subsets of [0,1] for now) and these subsets can
be far more complicated than those which are finite disjoint collections of intervals (if you know what the Cantor “middle thirds” set is, you might
review that and if you don’t know what it is, look up “Cantor Set”; that will serve as a nice example of a complicated subset of the real line).
This is a main point of measure theory.

So, in abstract terms, what we are looking for is some sort of a map m from the collection of subsets of [0,1] to the non-negative real numbers that serves as some sort of a length function. It might be a good time to stop and ask yourself: “what properties would we want this map (called a “measure”) to have?”

Here are some key properties:

1. m([a,b])=b-a (and the same for open and half-open intervals). After all, this is supposed to be a length function. Of course, b > a .

2. m(E+r)=m(E): that is, measure should be “translation invariant” that is, if you translate a set E by adding the same constant to every element of E, the translated set should have the same measure.

3. m(\{x\})=m(\emptyset )=0 (the measure of a single point and the empty set ought to be zero)

4. If F\subset E then m(F)\leq m(E) and we’d like to have some additivity properties:

5. Suppose E=\cup _{i=1}^{\infty }E_{i}. Then we’d want m(E)\leq\sum_{i=1}^{\infty }m(E_{i}) provided the right hand sum converges (of course, this convergence is absolute since all terms are positive; hence the order of the terms is of no consequence; this uses our calculus results).

Of course, we are talking about a countable union; sums don’t make sense otherwise.

And, if the sets E_{i} are disjoint, we’d love to have m(E)=\sum_{i=1}^{\infty }m(E_{i})

So how would we go about defining such a measure?

Here is one standard way: Every subset of the real line can be covered by a countable union of open intervals. (Note: if you said to yourself anything about a Lindelöf space, you have no business reading this article unless it is to critique it). So given a set E, let \{(a_{i},b_{i})\} be a countable collection of open subintervals whose union contains E (that is, an open cover for E). If \sum_{i}(b_{i}-a_{i}) is finite, call that the “length” of the cover of E. Then consider the infimum of all lengths of covers of E (note: since E \subset \lbrack 0,1], there exist covers of finite length, and every length is at least 0, so such an infimum exists). This infimum is called the measure of E.
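To get a feel for this infimum, here is a Python sketch for the Cantor “middle thirds” set mentioned earlier. This worked example is my own illustration, not part of the definition: it relies on the standard fact that the 2^{n} closed intervals left at stage n of the construction contain the Cantor set, and it widens each one by a small pad on both sides to make an open cover. The names cantor_stage and cover_length are made up.

```python
from fractions import Fraction


def cantor_stage(n):
    """The 2**n closed intervals left after n rounds of removing middle thirds."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        next_intervals = []
        for a, b in intervals:
            third = (b - a) / 3
            next_intervals.append((a, a + third))
            next_intervals.append((b - third, b))
        intervals = next_intervals
    return intervals


def cover_length(n, pad=Fraction(0)):
    """Total length of the stage-n intervals, each widened by pad on both sides."""
    return sum((b - a) + 2 * pad for a, b in cantor_stage(n))


if __name__ == "__main__":
    for n in (1, 5, 12):
        print(n, len(cantor_stage(n)), float(cover_length(n)))
```

The cover lengths are (\frac{2}{3})^{n} plus the padding, and the padding can be taken as small as we like at each stage, so the infimum over covers is 0.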

This definition of measure gives us most of what we want (1-5). Note: proving 1 is a bit trickier than it might first appear; it is immediate that m([a,b])\leq b-a. To go the other way, one might use the fact that every open cover of [a,b] has a finite subcover (the Heine-Borel Theorem from your analysis class) and induct on the number of open sets in your finite subcover. Features 2, 3, and 4 are pretty easy to demonstrate.

5 (called “countable subadditivity”) makes for an interesting exercise. I’ll sketch out a solution here:

Since each E_{i} has a measure, we can find a collection of open intervals I_{ij} that cover E_{i} and, if l(I_{ij}) denotes the length of the open interval, we can assume that m(E_{i})+\frac{\varepsilon }{2^{i}}>\sum_{j} l(I_{ij}). Then note that
\cup _{ij}I_{ij} covers E=\cup _{i=1}^{\infty }E_{i} and
\sum_{i}(m(E_{i})+\frac{\varepsilon }{2^{i}})>\sum_{i}\sum_{j} l(I_{ij})\Rightarrow \sum_{i}(m(E_{i}))+\varepsilon >\sum_{i}\sum_{j} l(I_{ij})\geq m(E) for any \varepsilon >0.

Hence \sum_{i}(m(E_{i}))\geq m(E).

So what about the case where the E_{i} are disjoint; do we get equality?

The answer is…well, NO…not for ALL subsets E.

It will turn out that we will have to restrict our measure to certain subsets called “measurable sets”. The measurable sets form one large collection of subsets of [0,1] that “work”. These sets will be those sets that have this condition: m(E)+m(E^{\prime})=1 (where E^{\prime }=[0,1]-E; the set complement of E).

It isn’t obvious that this is the condition that we’ll need; this complement condition is called the Carathéodory characterization.

Note: I am being a little bit sloppy here. Strictly speaking, I should use the concept of “outer measure” (that is, the “infimum of the sum of the lengths of the covers”) when talking about the Carathéodory characterization and denote that by m^{*}(E) and only use m(E) when I am talking about the “measure” of the “measurable sets”; otherwise I would be using circular logic. But this isn’t a text book so I’ll abuse notation.

Note: it is easy to establish that intervals and one point sets are measurable. To find other sets that are measurable, we’ll need to use results from the algebra of sets, sigma-algebras, etc.; the idea is to show that, say, the collection of measurable sets is closed with respect to countable unions, countable intersections, set complementation, etc.

This is one reason why set algebra topics are often covered in real analysis textbooks in the first sections or chapters. We’ll also show an example of a non-measurable set in a future post (2 dimensional versions of these come
into play in the Banach-Tarski “paradox”).

In the next section, we construct a non-measurable set.
