January | 2011 | College Math Teaching

January 31, 2011

Cantor Sets in [0,1] and Lebesgue Measure

Filed under: advanced mathematics, analysis, calculus, cantor set, cantor space, Lebesgue Integration, Measure Theory, point set topology — collegemathteaching @ 2:34 am

I still remember one of the more moronic things I have ever written on an exam; I said “Set E has measure zero and is therefore countable….”. My professor wrote “whatever happened to the Cantor Set”, which he had told us about and had covered extensively.

It was one of those things that I had heard but really didn’t internalize into a working part of my mathematical mind; the latter is difficult to do.

So what is a Cantor Set? Actually, it depends on who you ask. 🙂 A topologist is likely to give a different answer than an analyst; I’ll discuss what is going on.

First, I’ll construct the traditional Middle Thirds Cantor Set; this is an example of a subset of $[0,1]$ which
1. Has Lebesgue outer measure zero (and is therefore measurable)
2. Is uncountable.
It also has some other interesting properties and can be generalized; as George Simmons in Introduction to Topology and Modern Analysis says on page 68:

[…] the Cantor set is a very intricate mathematical object and is just the sort of thing that mathematicians delight in.

Descriptions: I’ll describe the Cantor set by describing what is NOT in it: the intervals $(1/3, 2/3), (1/9, 2/9), (7/9, 8/9), (1/27, 2/27), (7/27, 8/27), (19/27, 20/27), (25/27, 26/27), ....$ . Or, I can describe it as an intersection of closed sets: $C_{1} = [0,1/3] \cup [2/3,1], C_{2} = [0, 1/9] \cup [2/9, 1/3] \cup [2/3,7/9] \cup [8/9,1],...$ and then the Cantor set $C =\cap_{i=1}^{\infty} C_{i}$ . Another way of describing the Cantor middle thirds set is as the set of numbers in $[0,1]$ that have a base three expansion which does not contain the digit “1” anywhere (the removed intervals, always the middle third, consist of the numbers that are forced to use the digit “1”; an endpoint such as $1/3 = 0.1000..._{3}$ survives because it can also be written as $0.0222..._{3}$ ).

Some facts are immediate: The Cantor set is closed as its complement is the union of open sets. It is bounded and hence compact. The middle thirds Cantor set has measure zero; here is why: add the measures of the removed intervals that make up the complement: $1/3 + 2(1/9) + 4(1/27) + ... =(1/3)\sum_{k=0}^{\infty}(2/3)^k = (1/3)(1/(1-(2/3))) = 1$ . The removed intervals are disjoint and have total length 1, so what is left of $[0,1]$ has measure zero.
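
For readers who like to see the numbers, here is a minimal sketch in Python of the closed sets $C_{k}$ . The function names `cantor_stage` and `total_length` are made up for this illustration; exact fractions are used so that no rounding creeps in. It prints, for each stage, the number of intervals, the length that is kept and the length that has been removed.

```python
from fractions import Fraction

def cantor_stage(k):
    """Return the closed intervals making up C_k as (left, right) pairs of Fractions."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(k):
        next_intervals = []
        for left, right in intervals:
            third = (right - left) / 3
            # keep the first and last thirds, drop the open middle third
            next_intervals.append((left, left + third))
            next_intervals.append((right - third, right))
        intervals = next_intervals
    return intervals

def total_length(intervals):
    """Sum of the lengths of a list of disjoint intervals."""
    return sum(right - left for left, right in intervals)

for k in range(1, 6):
    stage = cantor_stage(k)
    kept = total_length(stage)
    print(k, len(stage), kept, 1 - kept)
```

The kept length is $(2/3)^k$ at stage $k$ , which goes to zero, in agreement with the series computation above.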

But the Cantor set is uncountable! This can be seen in many ways; there is a topological proof; we’ll present a brute force one.
Let $C$ denote the Cantor middle thirds set and let $x \in C$ . Note that each $C_{k}$ consists of $2^{k}$ disjoint closed intervals of length $(1/3)^k$ . Now we will map $x$ to a sequence of 1’s and 0’s: $\{f_{1}(x), f_{2}(x), f_{3} (x), ....\}$ where $f_{1}(x) = 0$ if $x \leq 1/3$ and 1 if $x \geq 2/3$ . Then $f_{2}(x) = 0$ if $x$ is in the first third of the interval of $C_{1}$ that contains it and 1 if it is in the final third. We do this at every stage. The map from $C$ to the set of all sequences of 0’s and 1’s is onto: a given sequence picks out a nested sequence of closed intervals, and their intersection is a point of $C$ . The map is also one to one because if $x \neq y$ then they are some distance $\epsilon > 0$ apart; choose a stage $k$ at which the non-deleted intervals are shorter than this $\epsilon$ ; no such interval can contain both $x$ and $y$ , so the two sequences differ by stage $k$ . Since the set of all sequences of 0’s and 1’s is uncountable (Cantor’s diagonal argument), so is $C$ .
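
The map to sequences of 0’s and 1’s can be tried out on rational points. The following is only a sketch; `address` and `point_from_bits` are hypothetical helper names, and the rescaling trick (multiply by 3 after each step) is my own shortcut for "look at the next stage".

```python
from fractions import Fraction

def address(x, n):
    """First n terms f_1(x), ..., f_n(x) of the 0-1 sequence attached to x.

    Raises ValueError if x falls in a deleted middle third within the first n stages.
    """
    x = Fraction(x)
    bits = []
    for _ in range(n):
        if 0 <= x <= Fraction(1, 3):
            bits.append(0)
            x = 3 * x          # rescale the first third back to [0, 1]
        elif Fraction(2, 3) <= x <= 1:
            bits.append(1)
            x = 3 * x - 2      # rescale the final third back to [0, 1]
        else:
            raise ValueError("point lies in a deleted interval")
    return bits

def point_from_bits(bits):
    """Left endpoint of the non-deleted interval that has the given address."""
    return sum(Fraction(2 * b, 3 ** (i + 1)) for i, b in enumerate(bits))

print(address(Fraction(1, 4), 6))
print(point_from_bits([1, 0, 1]))
```

A finite address only pins a point down to an interval of length $(1/3)^n$ ; it takes the whole infinite sequence to single out one point of $C$ .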

We can say much more about the Cantor set(s); I’ll conclude with a few interesting tidbits:

1. It isn’t that tough to exhibit some elements of $C$ that are NOT endpoints of deleted intervals; merely choose a base three expansion that uses only the digits 0 and 2 and that does not end in all 0’s or all 2’s (the endpoints have expansions that do). Here is one example: choose
$2/3 + 2/9 + 2/81 + 2/729 + ..... = (2/3)+ (2/9) (1 + 1/9 + 1/81 +...) =(2/3)+ (2/9)(1/(1-(1/9))) = 2/3 +1/4 = 11/12$
2. One can alter the construction of the Cantor middle third set to obtain a Cantor set of any measure less than 1: do the same middle thirds construction but ensure that the length of each removed interval is $\delta^k$ at stage $k$ (there are $2^{k-1}$ of them). Then one can compute the length of the removed intervals:
$\delta + 2\delta^2 + 4\delta^3.....=\delta/(1-2\delta)$ . Now if $r$ is the desired measure of the complement (at most 1, of course), one solves $r = \delta/(1-2\delta)$ for $\delta$ to obtain $\delta = r/(1+2r)$ . One notes that $r = 1$ yields $\delta = 1/3$ , which is the standard construction.
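
Both computations in the list above are easy to check with exact arithmetic. This is a sketch only; the names `delta_for`, `removed_length` and `partial_sum_11_12` are invented for the illustration.

```python
from fractions import Fraction

def delta_for(r):
    """Solve r = delta/(1 - 2*delta) for delta, that is, delta = r/(1 + 2r)."""
    r = Fraction(r)
    return r / (1 + 2 * r)

def removed_length(delta, stages):
    """Length removed after the given number of stages: sum of 2^(k-1) * delta^k."""
    return sum(2 ** (k - 1) * delta ** k for k in range(1, stages + 1))

def partial_sum_11_12(terms):
    """Partial sums of 2/3 + 2/9 + 2/81 + 2/729 + ... (the example in item 1)."""
    return Fraction(2, 3) + sum(Fraction(2, 9 ** j) for j in range(1, terms + 1))

r = Fraction(1, 2)                 # desired measure of the complement
d = delta_for(r)
print(d, float(removed_length(d, 30)))
print(float(partial_sum_11_12(20)), float(Fraction(11, 12)))
```

With $r = 1/2$ the removed lengths approach $1/2$ , so the resulting Cantor set has measure $1/2$ .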

The Cantor set (of any measure) has some interesting properties. For one, every point is a limit point (remember, it has the subspace topology). This isn’t hard to see; let $I$ be a neighborhood of any point $x$ of the Cantor set of width $\epsilon$ and let $k$ be such that $(1/2)^k < \epsilon/2$ . At stage $k$ there are $2^k$ disjoint non-deleted intervals in $[0,1]$ , so the one containing $x$ has length less than $(1/2)^k$ and must lie in $I$ ; its endpoints belong to the Cantor set, which means that $I$ contains other points of the Cantor set. This property is called “being dense in itself”. This property, plus being compact, is enough to prove that the set is uncountable.
The Cantor set is totally disconnected; that is, the only connected components are one point sets. This is why: given any two different points in a Cantor set, there is a deleted interval between them.

Note: it might not make sense to talk of THE Cantor set as one can construct Cantor sets of different measures. But it does make sense to talk about the Cantor set in terms of topology, as any two nonempty compact, dense in itself, totally disconnected metric spaces are homeomorphic (in fact, each is homeomorphic to the countably infinite product of $\{0, 1\}$ where every factor has the discrete topology and the whole space has the product topology). Note: sometimes this topological space is referred to as the Cantor SPACE with Cantor SET being reserved for the middle thirds type of construction.

Astonishingly, every nonempty compact metric space is the continuous image of the Cantor space (and therefore of the Cantor set); that is a topic for a later time.

What does this have to do with integration?
Remember that the standard Cantor middle thirds set has measure zero; hence two functions defined on $[0,1]$ that are equal except on the Cantor set have equal integrals; we will also see that the Cantor set can cause some mischief when we start talking about the Fundamental Theorem of Calculus. 🙂
If you can’t wait, see here.


January 29, 2011

Taylor Series: student misunderstanding

Filed under: advanced mathematics, calculus, Power Series, student learning, Taylor Series — collegemathteaching @ 10:05 pm

I am going to take a break from the Lebesgue stuff and maybe write more on that tomorrow.
My numerical analysis class just turned in some homework and some of the students have misunderstandings about Taylor Series and Power Series. I’ll provide some helpful hints to perplexed students.

For the experts who might be reading this: my assumption is that we are dealing with functions $f$ which are real analytic over some interval. To students: this means that $f$ can be differentiated as often as we’d like, that the series converges absolutely on some open interval and that the remainder term goes to zero as the number of terms approaches infinity.

This post will be about computing such a series.
First, I’ll give a helpful reminder that is crucial in calculating these series: a Taylor series is really just a power series representation of a function. And if one finds a power series which represents a function over a given interval and is expanded about a given point, THAT SERIES IS UNIQUE, no matter how you come up with it. I’ll explain with an example:

Say you want to represent $f(x) = 1/(1-x)$ over the interval $(-1,1)$ . You could compute it this way: you probably learned about the geometric series and that $f(x) = 1/(1-x) = 1 + x + x^2 + x^3....+ x^k+.... = \sum_{i=0}^{\infty} x^i$ for $x \in (-1,1)$ .

Well, you could compute it by Taylor’s theorem which says that such a series can be obtained by:
$f(x) = f(0) + f^{'}(0)x + f^{''}(0)x^2/2! + f^{'''}(0)x^3/3! +.... = \sum_{k=0}^{\infty} f^{(k)}(0)x^k/k!$ . If you do such a calculation for $f(x) = 1/(1-x)$ one obtains $f^{'} = (1-x)^{-2}$ , $f^{''} = 2(1-x)^{-3}$ , $f^{'''} = 3!(1-x)^{-4} ....$ so that $f^{(k)}(0) = k!$ , and plugging into Taylor’s formula leads to the usual geometric series. That is, the series can be calculated by any valid method; one does NOT need to retreat to the Taylor definition for calculation purposes.
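
Here is a small numerical sanity check, offered as a sketch rather than a proof. It uses the closed form $f^{(k)}(0) = k!$ for $f(x) = 1/(1-x)$ and compares the Taylor partial sum with the geometric partial sum; the function names are made up for this example.

```python
from math import factorial

def kth_derivative_at_zero(k):
    """k-th derivative of 1/(1 - x) at x = 0, from f^(k)(x) = k! (1 - x)^(-(k+1))."""
    return factorial(k)

def taylor_partial_sum(x, n):
    """Sum of f^(k)(0) x^k / k! for k = 0..n."""
    return sum(kth_derivative_at_zero(k) * x ** k / factorial(k) for k in range(n + 1))

def geometric_partial_sum(x, n):
    """Sum of x^k for k = 0..n."""
    return sum(x ** k for k in range(n + 1))

x = 0.3
print(taylor_partial_sum(x, 40), geometric_partial_sum(x, 40), 1 / (1 - x))
```

The two partial sums agree because the coefficients $f^{(k)}(0)/k!$ are all equal to 1, which is the uniqueness statement in action.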

Example: in the homework problem, students were asked to calculate Taylor polynomials (of various orders and about $x=0$ ) for a function that looked like this:

$f(x) = 3x(sin(3x)) - (x-3)^2$ . Some students tried to calculate the various derivatives and plug into Taylor’s formula with grim results. It is much easier than that if one remembers that power series are unique! Sure, one CAN use Taylor’s formula but that doesn’t mean that one should. Instead it is much easier if one remembers that $sin(x) = x -x^3/3! + x^5/5! - x^7/7!......$ Now to get $sin(3x)$ one just substitutes $3x$ for $x$ and obtains: $sin(3x) = 3x -(x^3)3^3/3! + (x^5)3^5/5! -( x^7)3^7/7!......$ . Then $3x(sin(3x)) =9x^2 -(x^4)3^4/3! + (x^6)3^6/5! -( x^8)3^8/7!....$ and one subtracts off $(x-3)^2 = x^2 -6x +9$ to obtain the full power series: $-9 + 6x+ 8x^2 -(x^4)3^4/3! + (x^6)3^6/5! -( x^8)3^8/7!....= -9 + 6x+ 8x^2 +\sum_{k=2}^{\infty} (-1)^{k+1}x^{2k}3^{2k}/(2k-1)!$
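
If you want to convince yourself that the series obtained by substitution really does represent the function, a quick numerical comparison helps. This is a sketch under the assumption that 20 terms are plenty for the sample points chosen; `series` is a hypothetical helper name.

```python
from math import factorial, sin

def f(x):
    return 3 * x * sin(3 * x) - (x - 3) ** 2

def series(x, terms=20):
    """Partial sum -9 + 6x + 8x^2 + sum_{k=2}^{terms} (-1)^(k+1) 3^(2k) x^(2k) / (2k-1)!."""
    total = -9 + 6 * x + 8 * x ** 2
    for k in range(2, terms + 1):
        total += (-1) ** (k + 1) * 3 ** (2 * k) * x ** (2 * k) / factorial(2 * k - 1)
    return total

for x in (0.5, 1.0, -0.7):
    print(x, f(x), series(x))
```

No derivatives of $f$ were computed anywhere; the coefficients came entirely from the known series for sine.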

Now calculating the bound for the remainder after $k$ terms is, in general, a pain. Sure, one can estimate with a graph, but that sort of defeats the point of approximating to begin with; one can use rules of thumb which overstate the magnitude of the remainder term.


January 25, 2011

Some Sets Which Are Lebesgue Measurable

Filed under: advanced mathematics, analysis, Lebesgue Integration, mathematical ability, mathematics education, Measure Theory — collegemathteaching @ 8:49 pm

We’ve yet to show some Lebesgue measurable sets; we’ll do that now.
Note: by subadditivity of Lebesgue outer measure, we know that for all sets $T$ , we have:
$m^*(T) \leq m^*(T \cap E) + m^*(T \cap C(E))$ for every set $E$ , measurable or not; again here we use $C( )$ to denote the complement of a set.
So to show that a set is measurable, we need only show $m^*(T) \geq m^*(T \cap E) + m^*(T \cap C(E))$ for the set in question.

$\mathbb{R}$ and $\emptyset$ are measurable.

Proof: $m^*(T \cap \mathbb{R}) + m^*(T\cap \emptyset) =m^*(T) + 0 = m^*(T)$

So now we know that the set of measurable sets isn’t empty. 🙂
Actually we can do better than that.

Proposition If a set has Lebesgue outer measure zero, then it is measurable.
Proof: Let $E$ have measure zero; that is, $m^*(E) = 0$ . Let $T$ be arbitrary and consider
$m^*(T \cap E) + m^*(T \cap C(E))$ . But $T \cap E \subset E$ and therefore $m^*(E) \geq m^*(T \cap E)$ which implies that $m^*(T \cap E) = 0$ . Also $T \cap C(E) \subset T$ which implies that $m^*(T) \geq m^*(T\cap C(E)) = m^*(T \cap E) + m^*(T \cap C(E))$ which is what we had to show.

Now we have from the definition: $m^*(\{x\}) = 0$ (one point sets have measure zero). Thus by countable additivity of measure on disjoint measurable sets, any countable set (e. g., the rationals) has measure zero.
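
One can also see directly from the definition of outer measure that a countable set has measure zero: cover the $i$ th point by an open interval of length $\epsilon/2^i$ , so the total length of the cover is less than $\epsilon$ . Here is a sketch of that cover for the first few rationals in $[0,1]$ ; the enumeration order and the function names are my own choices for the illustration.

```python
from fractions import Fraction

def rationals_in_unit_interval(count):
    """First `count` rationals p/q in [0, 1], listed by increasing denominator."""
    seen, out = set(), []
    q = 1
    while len(out) < count:
        for p in range(q + 1):
            x = Fraction(p, q)
            if x not in seen:
                seen.add(x)
                out.append(x)
                if len(out) == count:
                    break
        q += 1
    return out

def cover(points, eps):
    """Open interval of length eps / 2^i centered on the i-th point (i = 1, 2, ...)."""
    eps = Fraction(eps)
    return [(x - eps / 2 ** (i + 1), x + eps / 2 ** (i + 1))
            for i, x in enumerate(points, start=1)]

points = rationals_in_unit_interval(50)
intervals = cover(points, Fraction(1, 100))
print(float(sum(b - a for a, b in intervals)))   # stays below eps = 0.01
```

However many points are covered, the total length never reaches $\epsilon$ , and $\epsilon$ was arbitrary.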

What about sets (other than $\mathbb{R}$ ) that have positive measure?
Here is how we are going to proceed: we will show that sets of the form $(a, \infty)$ are measurable. Then that will mean that sets like $[a,\infty)$ are measurable which will then imply that open intervals $(a,b)$ are measurable. But then, because of the fact that the measurable sets form a sigma algebra (closed with respect to countable unions, intersections, complements), we will get, free of charge, all of the topologically open sets to be measurable (remember that the reals are a second countable topological space), all closed sets, all countable intersections of open sets (called G-delta sets) and all countable unions of closed sets (the F-sigma sets). Note: the smallest sigma-algebra that contains the open intervals is called the Borel Sets.

It is true that not all measurable sets are Borel sets, but that is a topic for another day.

Showing the intervals $(a, \infty)$ are measurable.
Let $T$ be arbitrary and let $T_{1} = T \cap (a, \infty)$ and let $T_{2} = T \cap (-\infty, a]$ . With no loss of generality we can assume that $m^*(T)$ is finite, otherwise there is nothing to show.
So, given any $\epsilon > 0$ we can find a countable collection of intervals $I_{i}$ that cover $T$ such that $m^*(T) + \epsilon \geq \sum_{i=1}^{\infty} l(I_{i}) =\sum_{i=1}^{\infty} m^*(I_{i})$ (note that $l(I_{i})$ denotes the length of the interval, which is its outer measure).
Let $I_{1,i} = I_{i} \cap (a,\infty)$ and $I_{2,i} = I_{i} \cap (-\infty, a]$ . Each of these is an interval (or is empty) and $l(I_{1,i}) + l(I_{2,i}) = l(I_{i})$ .
Now $m^*(T_1) \leq \sum_{i=1}^{\infty} l(I_{1,i}) = \sum_{i=1}^{\infty} m^*(I_{1,i})$ because $T_{1} \subset \cup_{i=1}^{\infty} I_{1,i}$ and
$m^*(T_2) \leq \sum_{i=1}^{\infty} l(I_{2,i}) = \sum_{i=1}^{\infty} m^*(I_{2,i})$ because $T_{2} \subset \cup_{i=1}^{\infty} I_{2,i}$

So $m^*(T_{1}) + m^*(T_{2}) \leq \sum_{i=1}^{\infty} (m^*(I_{1,i}) + m^*(I_{2,i})) = \sum_{i=1}^{\infty} (m^*(I_{i})) \leq m^*(T) + \epsilon$
Since this is true for all $\epsilon > 0$ it follows that $m^*(T_{1}) + m^*(T_{2}) \leq m^*(T)$ which is what we had to show.
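
The heart of the argument is that cutting each covering interval at $a$ does not change the total length. A minimal sketch, with intervals written as pairs `(c, d)` and the helper names `split_at` and `length` invented for the example:

```python
def split_at(interval, a):
    """Split (c, d) into its parts inside (a, inf) and inside (-inf, a]; None if empty."""
    c, d = interval
    upper = (max(c, a), d) if d > a else None
    lower = (c, min(d, a)) if c < a else None
    return upper, lower

def length(interval):
    """Length of an interval given as a pair, with None standing for the empty set."""
    return 0 if interval is None else interval[1] - interval[0]

covering = [(0, 2), (1, 5), (4, 6)]
a = 3
pieces = [split_at(interval, a) for interval in covering]
upper_total = sum(length(u) for u, _ in pieces)
lower_total = sum(length(l) for _, l in pieces)
print(upper_total, lower_total, sum(length(i) for i in covering))
```

The two totals add up to the total length of the original covering, which is exactly the middle equality in the chain of inequalities above.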

So now that we have a feel for what sorts of sets are measurable (at least the Borel sets, and a bit more than that), we are ready to get back to Lebesgue integration. We’ve defined the Lebesgue integral for bounded functions over a closed interval; we can now move to unbounded functions and to some promised convergence theorems.

Note from my past
I’d imagine that most of us have written moronic things on examinations from time to time. I still remember writing the following on an analysis exam: “ $E$ has measure zero and is therefore countable”….to which my professor replied: “Nonsense…whatever happened to the Cantor set?”.

I’ll have to deal with the Cantor set in its own post; we’ll show that we can construct a Cantor set with zero measure and one of any given finite measure. This isn’t just a “cool thing”; it is also an essential part of some interesting counterexamples.


January 24, 2011

Lebesgue Measure: Outer Measure, Measure, Why the Caratheodory Criterion works

Filed under: advanced mathematics, analysis, Lebesgue Integration, Measure Theory — collegemathteaching @ 9:10 pm

In this post, I will distinguish between Lebesgue outer measure (denoted by $m^{*}(E)$ ) and Lebesgue measure (denoted by $m(E)$ ). Recall that Lebesgue outer measure of a set $E$ is $\inf \sum_{i=1}^{\infty} l(I_{i})$ where $\cup_{i=1}^{\infty} I_{i}$ is a covering of $E$ by open intervals $I_{i} = (a_{i},b_{i})$ and $l(I_{i}) = b_{i} - a_{i}$ ; the infimum is taken over all such countable coverings. Lebesgue outer measure is defined for all subsets $E$ of the real line but in a previous post we showed an example of a subset for which translation invariance and countable additivity can NOT both hold for outer measure.

We need to restrict to subsets of the real line for which translation invariance and countable additivity hold; when we use Lebesgue outer measure on these sets we call it Lebesgue measure.

So, what we are going to show is this: if we restrict our sets to those sets $E$ for which:
$m^{*}(T) = m^{*}(T \cap E) + m^*(T\cap\tilde{E})$ (where $\tilde{E}$ is the set complement of $E$ ) for ALL subsets $T$ , then Lebesgue outer measure IS countably additive with respect to those subsets. Lebesgue outer measure applied to these sets is called Lebesgue measure.

Note: If $m^{*}(T) = m^{*}(T \cap E) + m^*(T\cap\tilde{E})$ is true we say that “ $E$ splits $T$ ” and sometimes refer to $T$ as a “test set”.

So, why is this condition the one that we want? Well, we’ll prove the following results assuming that the given sets $E_{i}$ split all test sets $T$ .

1. If $E_{1}$ and $E_{2}$ are measurable sets (i. e., they split all sets $T$ ) then $E_{1} \cap E_{2}$ is also measurable (splits all sets) and $E_{1} \cup E_{2}$ is also measurable.
Proof: first recall that $m^{*}(T) \leq m^{*}(T\cap E) + m^{*}(T\cap\tilde{E})$ by countable subadditivity of Lebesgue outer measure. We need to show the other inequality.

First recall that $\tilde{E_{1}} \cap\tilde{E_{2}} = \tilde{(E_{1}\cup E_{2})}$ by DeMorgan’s laws (the latter is supposed to be the complement of the union $E_{1}\cup E_{2}$ ).

Now because $E_{2}$ is measurable, (1) $m^*(T\cap\tilde{E_{1}})=m^*(T\cap\tilde{E_{1}}\cap E_{2})+m^*(T\cap\tilde{E_{1}}\cap\tilde{E_{2}})$

Now recall that, from basic set theory, $E_{1}\cup E_{2} = E_{1}\cup(E_{2}\cap\tilde{E_{1}})$
So from countable subadditivity of outer measure:
(2) $m^*(T\cap (E_{1}\cup E_{2})) \leq m^*(T\cap E_{1}) + m^*(T\cap (E_{2} \cap\tilde{E_{1}}))$

So now attempt to compute: (3)
$m^*(T\cap (E_{1} \cup E_{2})) + m^*(T \cap\tilde{(E_{1} \cup E_{2})}) \leq m^*(T\cap E_{1}) + m^*(T\cap (E_{2} \cap\tilde{E_{1}})) +m^*(T \cap\tilde{E_{1}} \cap\tilde{E_{2}} )$

But $E_{2}$ is measurable and therefore splits $T \cap\tilde{E_{1}}$ (this is equation (1)) and so the last two terms can be combined to $m^*(T \cap\tilde{E_{1}})$ . So the right hand side of (3) becomes $m^*(T \cap E_{1}) + m^*(T \cap\tilde{E_{1}}) = m^*(T)$ because $E_{1}$ is measurable, and this is the inequality that we needed to show.

So, it follows by a routine induction argument that a finite union of measurable sets is measurable.
Now, what about the intersection? If $E_{1}$ and $E_{2}$ are measurable, so are their complements (and vice versa; the definition is symmetric). Now recall that $E_{1} \cap E_{2}$ = $C({(\tilde{E_{1}} \cup\tilde{E_{2}})})$ (note: the outer “ $C()$ ” denotes set complement as I couldn’t get the LaTex command for the outer “tilde” to work) and the result follows.

2. Now we show finite additivity of disjoint measurable sets $E_{i}$ :
We need to show that $m^*(T \cap (\cup_{i=1}^{n} E_{i})) \geq \sum_{i=1}^{n} m^*(T \cap E_{i})$ (the reverse inequality is subadditivity); in fact we will get equality directly.
Clearly, the statement is true for $n = 1$ . Assume that the statement is true for all integers up to $n - 1$ .
Now by disjointness, $T \cap (\cup_{i=1}^{n} E_{i}) \cap E_{n} = T\cap E_{n}$ and $T \cap (\cup_{i=1}^{n} E_{i}) \cap\tilde{E_{n}} = T\cap(\cup_{i=1}^{n-1} E_{i})$ .
Now $E_{n}$ splits $T \cap (\cup_{i=1}^{n} E_{i})$ , therefore
$m^*( T \cap (\cup_{i=1}^{n} E_{i}) )= m^*(T \cap E_{n}) +m^*(T \cap (\cup_{i=1}^{n-1} E_{i})) =$
$m^*(T \cap E_{n})+ \sum_{i=1}^{n-1} m^*(T \cap E_{i})$ , where the last step uses the induction hypothesis.

3. We now need to show that the countable union of measurable sets is measurable.
First note that if $E = \cup_{i=1}^{\infty}E_{i}$ we can assume that the $E_{i}$ are disjoint. Here is why: Let $F_{1} = E_{1}$ , $F_{2} = E_{2} \cap\tilde{E_{1}}$ , $F_{3} = E_{3} \cap\tilde{(E_{1}\cup E_{2})}$ and so on. Then $\cup_{i=1}^{\infty} E_{i} = \cup_{i=1}^{\infty} F_{i}$ and the $F_{i}$ are mutually disjoint; each $F_{i}$ is measurable by the results on finite unions, intersections and complements. So we can assume with no loss of generality that the $E_{i}$ have this property.
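
The “make them disjoint” trick is purely set theoretic, so it can be illustrated with small finite sets. This is a toy sketch (the name `disjointify` is made up); it says nothing about measurability, only about the bookkeeping.

```python
def disjointify(sets):
    """Return F_1, F_2, ... where F_i is E_i with E_1, ..., E_{i-1} removed."""
    seen = set()
    result = []
    for e in sets:
        result.append(set(e) - seen)   # F_i = E_i minus everything already covered
        seen |= set(e)
    return result

E = [{1, 2, 3}, {2, 3, 4}, {1, 5}]
F = disjointify(E)
print(F)
print(set().union(*E) == set().union(*F))
```

The unions agree and no element appears in two of the $F_{i}$ , which is all that the reduction needs.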

Note: I am getting tired of the “tilde” notation and so will be using the $C()$ notation to denote the set complement.

Now let $G_{n} = \cup_{i=1}^{n}E_{i}$ . Then $G_{n}$ is measurable and $C(G_{n}) \supset C(E)$ . Then:
$m^*(T) = m^*(T \cap G_{n})+m^*(T\cap C(G_{n})) \geq m^*(T \cap G_{n})+ m^*(T\cap C(E))$
By finite additivity $m^*(T \cap G_{n}) = \sum_{i=1}^{n} m^*(T \cap E_{i})$ . Hence we can substitute into the right hand side of the inequality to obtain:
$m^*(T) \geq \sum_{i=1}^{n} m^*(T \cap E_{i})+m^*(T \cap C(E))$
This is true for all values of $n$ .
This means $m^*(T) \geq \sum_{i=1}^{\infty} m^*(T \cap E_{i})+m^*(T \cap C(E)) \geq m^*(T \cap E) + m^*(T\cap C(E))$ ; the latter inequality following from countable subadditivity.

4. Now we can show finite additivity for disjoint measurable sets in Lebesgue measure (no longer outer measure) by replacing $T$ with $\mathbb{R}$ .

5. We now consider a countably infinite collection of disjoint measurable sets. We must show that $m(\cup_{i=1}^{\infty} E_{i}) \geq \sum_{i=1}^{\infty} m(E_{i})$ (the reverse inequality is countable subadditivity). This follows from the fact that $\cup_{i=1}^{\infty} E_{i} \supset \cup_{i=1}^{k} E_{i}$ for all $k$ . This means that $m(\cup_{i=1}^{\infty} E_{i} ) \geq m(\cup_{i=1}^{k} E_{i} ) = \sum_{i=1}^{k} m(E_{i})$ for all $k$ , the equality being finite additivity. Hence the infinite sum is either bounded by $m(\cup_{i=1}^{\infty} E_{i} )$ and therefore converges to something no larger, or is infinite, as is $m(\cup_{i=1}^{\infty} E_{i} )$ .

All of this shows that restricting the sets we consider to those that obey the Caratheodory criterion with respect to Lebesgue outer measure gives us the properties that we want.

Of course, we’ve yet to classify WHICH sets are measurable or which sets form important subclasses of measurable sets.
Here is an outline of what we are going to do: we’ll expand on lemmas which state things like “doing operations X, Y and Z to measurable sets yields a measurable set”; we’ve done this with respect to finite unions, finite intersections, and now countable unions. We’ll then see that such sets form a type of “set algebra” and what is known as a “sigma-algebra”.
That is for another day (soon) 🙂

Update Since we are close to closing the deal as far as the “set algebra” and “sigma-algebra” result, I’ll go ahead and finish in this post. First I’d like to recall something from basic set theory (I’ll use $C(A)$ to denote the complement of the set $A$ ):
$C(B) \cup C(A) = C(A \cap B)$ . I’ll leave the formal proof of this to the reader, but the basic idea is that if something isn’t an element of both $A$ and $B$ , then it must be in the complement of one or the other set. This leads to the following result for sets $E_{i}$ :
$\cup_{i=1}^{\infty} C(E_{i}) = C(\cap_{i=1}^{\infty} E_{i})$ Again, I won’t prove this, but the idea is as follows: if something is an element of the left hand side, then it must be in the complement of at least one of the $E_{i}$ which means it can’t be in ALL of them and therefore can’t be in the intersection.
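
DeMorgan’s law can be spot checked on a finite family of finite sets. Of course this is only an illustration inside a small made-up universe, not a proof of the countable version.

```python
def complement(a, universe):
    """Complement of the set a inside the given universe."""
    return universe - a

universe = set(range(12))
family = [set(range(0, 12, 2)), set(range(0, 12, 3)), {0, 1, 6, 11}]

# left hand side: union of the complements
lhs = set().union(*(complement(e, universe) for e in family))
# right hand side: complement of the intersection
rhs = complement(set.intersection(*family), universe)
print(lhs == rhs, sorted(rhs))
```

Here the intersection of the three sets is $\{0, 6\}$ , and both sides come out as everything else in the universe.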

What does this have to do with our measurable sets? We’ve just seen that the countable union of measurable sets is measurable and by symmetry, the complement of a measurable set is measurable. Hence the countable intersection of measurable sets is measurable.

Set algebras An algebra of sets is a collection of sets which is closed with respect to finite unions and complements; hence it is immediate that the set of measurable sets forms a set algebra.

Sigma algebras A set algebra is called a sigma-algebra if it is closed with respect to countable unions as well (and by DeMorgan’s laws: closed with respect to countable intersections as well). So we’ve just shown that the collection of measurable sets forms a sigma-algebra.

What we haven’t shown is a single measurable set as yet! THAT is what we are going to do next.


January 18, 2011

Lebesgue Integral: Bounded Functions on a Bounded Set

Filed under: advanced mathematics, analysis, calculus, integrable function, integrals, Lebesgue Integration, Measure Theory — collegemathteaching @ 2:24 am

Now that we have an idea of what the Lebesgue integral is, how do we define it?

If we limit ourselves to bounded, measurable functions on the real line, we could do the following:
suppose there is a real number $M>0$ such that $-M\leq f\leq M$ over $[0,1]$ . Then for some integer $k,$ we could set up the partition of the range:

$-M=y_{-k},\frac{(1-k)M}{k}=y_{-k+1},\frac{(2-k)M}{k}=y_{-k+2},....0=y_{0},\frac{M}{k}=y_{1},....\frac{(k-1)M}{k}=y_{k-1},M=y_{k}$

Now set up $E_{i}=f^{-1}([y_{i},y_{i+1}))$ for $i=-k,....,k-2$ and $E_{k-1}=f^{-1}([y_{k-1},y_{k}])$ (the last set is closed on the right so that the value $M$ is included).

And then have $\varphi _{k}=\sum_{i=-k}^{k-1}y_{i}m(E_{i})$ and $\psi_{k}=\sum_{i=-k}^{k-1}y_{i+1}m(E_{i})$

Note that $\varphi _{k}$ plays the role of the lower sum and $\psi _{k}$ plays the role of the upper sum. If $\inf \psi _{k}=\sup \varphi _{k}$ the function is integrable (as it always is if $f$ is measurable and bounded) and we define $\int_{0}^{1}f(x)dx=\inf \psi _{k}=\sup \varphi _{k}.$
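
To make the lower and upper sums concrete, here is a sketch for one specific function, $f(x) = x^2$ on $[0,1]$ with $M = 1$ . This choice of $f$ is mine, picked because the measure of the inverse image of $[c,d)$ has the closed form $\sqrt{d} - \sqrt{c}$ (after clipping $c$ and $d$ to $[0,1]$ ); the function names are hypothetical.

```python
from math import sqrt

def clip(t):
    """Clamp t to the interval [0, 1]."""
    return min(max(t, 0.0), 1.0)

def preimage_measure(c, d):
    """Measure of {x in [0, 1] : c <= x^2 < d}; an exact formula for this one f."""
    return sqrt(clip(d)) - sqrt(clip(c))

def lebesgue_sums(k, M=1.0):
    """Lower and upper sums for the range partition y_i = i*M/k, i = -k..k."""
    levels = [i * M / k for i in range(-k, k + 1)]
    lower = upper = 0.0
    for y_lo, y_hi in zip(levels, levels[1:]):
        m = preimage_measure(y_lo, y_hi)
        lower += y_lo * m      # value at the bottom of the slab
        upper += y_hi * m      # value at the top of the slab
    return lower, upper

for k in (4, 40, 400):
    print(k, lebesgue_sums(k))
```

The two sums squeeze the value $1/3$ and differ by exactly $M/k$ times the measure of $[0,1]$ , so they come together as $k$ grows.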

Note: It is possible to define the Lebesgue integral without having the concept of measurable function first: we can start with a bounded function
$f$ and partition the range of $f$ by $y_{0},y_{1},...y_{n}.$ We can then look at all partitions $E_{1}....E_{n}$ of $[0,1]$ by measurable sets.

Then consider the characteristic functions $\chi _{i}(x)=\left\{ \begin{array}{c}1,x\in E_{i} \\ 0,x\notin E_{i}\end{array}\right.$ so we can form a type of general step function $\varphi_{n}=\sum_{i=1}^{n}y_{i-1}\chi_{i}$ and $\psi_{n}=\sum_{i=1}^{n}y_{i}\chi_{i}$ , with integrals $\sum_{i=1}^{n}y_{i-1}m(E_{i})$ and $\sum_{i=1}^{n}y_{i}m(E_{i})$ , and call $f$ integrable if $\sup \{\int \varphi_{n},\varphi _{n}\leq f\}=\inf \{\int \psi _{n},\psi _{n}\geq f\}$ . Think of the first function as approximating $f$ from below by generalized step functions, and the second as approximating $f$ from above.

Note: one of the big deals about the Lebesgue integral is that we get better convergence properties; that is, if we have a sequence of integrable
functions $f_{n}\rightarrow f$ pointwise over a measurable set, then with only mild extra hypothesis, we can show that the limit function is also
integrable and that the integral can be obtained as some sort of limit of integrals.

But to make any headway on such theorems, we’ll have to retreat to some theorems concerning measurable sets; so far we’ve shown that some sets are not measurable; we haven’t developed sufficient conditions for a set to be measurable.


January 17, 2011

Integration: Riemann Integration, Limitations and Lebesgue’s Idea

Filed under: advanced mathematics, analysis, calculus, integrable function, integrals, Lebesgue Integration, Measure Theory — collegemathteaching @ 6:34 am

What about integration? Here we will see what Lebesgue integration is about, how it differs from Riemann integration and why we need to learn about the algebra of measurable sets.

Brief review of Riemann Integration

Remember that the idea was as follows: we limit ourselves to bounded functions. Suppose we want to compute $\int_{a}^{b}f(x)dx$ . We partition the interval $[a,b]$ into several subintervals:

$a=x_{0}<x_{1}<x_{2}...<x_{n-1}<x_{n}=b.$ Let $m_{i}=\inf f(\xi ),\xi \in \lbrack x_{i-1},x_{i}]$ and let $M_{i}=\sup f(\omega ),\omega \in \lbrack x_{i-1},x_{i}]$ . Let $\Delta x_{i}=x_{i}-x_{i-1}$ . Call this partition $P$ .

Then $L_{P}=\sum_{j=1}^{n}m_{j}\Delta x_{j}$ and $U_{P}=\sum_{j=1}^{n}M_{j}\Delta x_{j}$ are called the lower sums and upper sums for $f$ with respect
to the partition $P$ .

One proves theorems such as if $Q$ is a refinement of partition $P$ then $L_{P}\leq L_{Q}$ and $U_{Q}\leq U_{P}$ (that is, as you make the refinement finer…with smaller intervals, the lower sums go up (or stay the same) and the upper sums go down (or stay the same)) and then one can define $U$ to be the infimum (greatest lower bound) of all of the possible upper sums and $L$ to be the supremum (least upper bound) of all of the possible lower sums. If $U=L$ we then declare that to be the (Riemann) integral of $f$ over $[a,b].$
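
Here is a sketch of lower and upper sums and of the refinement property. To keep the infimum and supremum on each subinterval trivial to compute, the code assumes that $f$ is increasing (so they occur at the left and right endpoints); that restriction is mine, made only for the example.

```python
def riemann_sums(f, partition):
    """Lower and upper sums of an increasing function f for the given partition.

    For increasing f the infimum on [a, b] is f(a) and the supremum is f(b).
    """
    pairs = list(zip(partition, partition[1:]))
    lower = sum(f(a) * (b - a) for a, b in pairs)
    upper = sum(f(b) * (b - a) for a, b in pairs)
    return lower, upper

def uniform(n):
    """Partition of [0, 1] into n equal subintervals."""
    return [i / n for i in range(n + 1)]

def square(x):
    return x * x

coarse = riemann_sums(square, uniform(4))
fine = riemann_sums(square, uniform(8))   # uniform(8) refines uniform(4)
print(coarse, fine)
```

Passing to the refinement raises the lower sum and lowers the upper sum, with the integral $1/3$ trapped in between.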

Note that this puts some restrictions on functions that can be integrated; for example $f$ being unbounded, say from above, on a finite interval will prevent upper sums from being finite. Or, if there is some dense subset of $[a,b]$ (whose complement is also dense) on which $f$ takes values that are a set distance away from the values that $f$ attains on the complement of that subset, the upper and lower sums will never converge to a single value. So this not only puts restrictions on which functions have a Riemann integral, but it also precludes some “reasonable sounding” convergence theorems from being true.

For example, suppose we enumerate the rational numbers by $q_{1},q_{2},...q_{k}...$ and define $f_{1}(x)=\left\{\begin{array}{c}1,x\neq q_{1} \\ 0,x=q_{1}\end{array}\right.$ and then inductively define $f_{k}(x)=\left\{\begin{array}{c}1,x\notin \{q_{1},q_{2},..q_{k}\} \\ 0,x\in \{q_{1},q_{2},..q_{k}\}\end{array}\right.$ . Then $f_{k}\rightarrow f=\left\{\begin{array}{c}1,x\notin \{q_{1},q_{2},..q_{k}....\} \\ 0,x\in \{q_{1},q_{2},..q_{k},...\}\end{array}\right.$ and for each $k$ , $\int_{0}^{1}f_{k}(x)dx=1$ but $f$ , the limit function, is not Riemann integrable.

So, there are a couple of things to note here:

1. The Riemann integral involves partitioning the interval to be integrated over without regard to the function being integrated at all; that is, if you were doing $\int_{0}^{1}e^{\sqrt{x}}dx$ or $\int_{0}^{1}\sin (x^{2})dx$ you wouldn’t partition $[0,1]$ any differently.

2. The elements of any partition of the Riemann integral are intervals of finite length.

The Lebesgue integral changes these two features;

1. We’ll use information about the function being integrated to help us select partitions and

2. The elements of our partition need not be intervals of finite length; they just need to be measurable sets.

For example, suppose we wish to compute $\int_{0}^{1}(4x-4x^{2})dx$ by using a Lebesgue integral; here $f(x) = 4x - 4x^{2}$ takes values in $[0,1]$ .

Partition the range of $f$ into 4 subintervals:

$Y_{1}=0\leq y<.25,$

$Y_{2}=.25\leq y<.5,$

$Y_{3}=.5\leq y<.75,$

$Y_{4}=.75\leq y\leq 1.$

Now consider the inverse image of these subintervals and label these:

$E_{1}=f^{-1}(Y_{1})=[0,\frac{1}{2}-\frac{1}{4}\sqrt{3})\cup (\frac{1}{2}+\frac{1}{4}\sqrt{3},1]$

$E_{2}=f^{-1}(Y_{2})=[\frac{1}{2}-\frac{1}{4}\sqrt{3},\frac{1}{2}-\frac{1}{2}\sqrt{\frac{1}{2}})\cup (\frac{1}{2}+\frac{1}{2}\sqrt{\frac{1}{2}},\frac{1}{2}+\frac{1}{4}\sqrt{3}]$

$E_{3}=f^{-1}(Y_{3})=[\frac{1}{2}-\frac{1}{2}\sqrt{\frac{1}{2}},\frac{1}{4})\cup (\frac{3}{4},\frac{1}{2}+\frac{1}{2}\sqrt{\frac{1}{2}}]$

$E_{4}=f^{-1}(Y_{4})=[\frac{1}{4},\frac{3}{4}]$

Then we form something similar to upper and lower sums. Recall the measure of an interval is just its length.

So we obtain something like an upper sum:

$U=\frac{1}{4}m(E_{1})+\frac{1}{2}m(E_{2})+\frac{3}{4}m(E_{3})+1m(E_{4})$

and a lower sum as well:

$L=0m(E_{1})+\frac{1}{4}m(E_{2})+\frac{1}{2}m(E_{3})+\frac{3}{4}m(E_{4})$

See the above figure for an illustration of an upper sum.

Then we proceed by refining the partitions of our range; for this to work we need for the inverse image of the partitions of the range to be measurable sets; this is why we need theorems about what constitutes a measurable set.

A measurable function is one for which the inverse image of every interval in its range is a measurable set.

The Lebesgue integral can be defined as either the infimum of all the upper sums or the supremum of all of the lower sums.
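
The sums above can be computed numerically. The sketch below assumes the integrand is $f(x) = 4x - 4x^{2}$ , which is the function that the inverse images listed above correspond to; for it the set where $f \geq y$ is an interval of length $\sqrt{1-y}$ centered at $1/2$ . The function names are made up for this illustration.

```python
from math import sqrt

def level_set_measure(y):
    """Measure of {x in [0, 1] : 4x - 4x^2 >= y} for 0 <= y <= 1, namely sqrt(1 - y)."""
    return sqrt(1.0 - y)

def lebesgue_sums(n):
    """Lower and upper sums when the range [0, 1] is cut into n equal subintervals."""
    lower = upper = 0.0
    for i in range(n):
        y_lo, y_hi = i / n, (i + 1) / n
        # measure of the inverse image of the slab from y_lo to y_hi
        m = level_set_measure(y_lo) - level_set_measure(y_hi)
        lower += y_lo * m
        upper += y_hi * m
    return lower, upper

print(lebesgue_sums(4))       # the four-piece partition used above
print(lebesgue_sums(1000))    # both sums are close to 2/3
```

With four pieces the sums differ by $1/4$ ; refining the partition of the range closes the gap around the true value $2/3$ .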

If one wants to see how this works, try doing this for $\int_{0}^{1}g(x)dx$ where
$g(x)=\left\{ \begin{array}{c}1,x\notin Q \\ \frac{1}{q},x=\frac{p}{q}\end{array}\right.$ where $Q$ is the rationals and $\frac{p}{q}$ is in lowest terms.

Then the subinterval which includes 1 in any partition of the range will have an inverse image with measure 1 whereas all subintervals whose upper bounds are strictly less than 1 will have inverse images of measure zero (they contain only rational numbers). Hence it follows that $\int_{0}^{1}g(x)dx=1$ though $g$ is not Riemann integrable.

Of course, this has been sketchy and we haven’t covered which types of sets are measurable. We’ll also discuss some convergence theorems as well.


January 13, 2011

A Non-Measurable Set

Filed under: advanced mathematics, analysis, calculus, Lebesgue Integration, Measure Theory, transfinite induction — collegemathteaching @ 5:52 pm

In our previous post, we talked about how to define a measure on a subset of the real line (or on $[0,1]$ ).

In today’s post, we’ll give an example of a type of set for which no reasonable measure can be defined.
Let us recall what we wanted in a measure:

1. $m([a,b])=b-a$ (and the same for open, and half open intervals). After all, this is supposed to be a length function. Of course, $b > a$ .

2. $m(E+r)=m(E):$ that is, measure should be “translation invariant” that is, if you translate a set $E$ by adding the same constant to every element of $E,$ the translated set should have the same measure.

3. $m(\{x\})=m(\emptyset )=0$ (the measure of a single point and the empty set ought to be zero)

4. If $F\subset E$ then $m(F)\leq m(E)$ and we’d like to have some additivity properties:

5. Suppose $E=\cup _{i=1}^{\infty }E_{i}.$ Then we’d want $m(E)\leq\sum_{i=1}^{\infty }m(E_{i})$ provided the right hand sum converges (of course, this convergence is absolute since all terms are nonnegative; hence the order of the terms is of no consequence; this uses our calculus results).

Of course, we are talking about a countable union; sums don’t make sense otherwise.

And, if the sets $E_{i}$ are disjoint, we’d love to have $m(E)=\sum_{i=1}^{\infty }m(E_{i})$

Now let’s define a set for which we could never have a measure which meets the above properties. Yes, we will be using the Axiom of Choice, which, roughly speaking, states that given an infinite disjoint collection of nonempty sets, we can “choose” one element from each set.

Start by selecting an irrational number $x\in \lbrack 0,1]$ . Then let the set $E$ consist of $x$ together with points $y\in \lbrack 0,1]$ chosen so that the following is true: $y - x$ is irrational and if $z\in E$ with $z \neq y$ then $y - z$ is irrational. One way to think of this is to establish one point in $E$ and then consider a maximal set in which no two distinct points are equal modulo the rationals. Or one can think of building this set inductively with an uncountable number of steps: start with $x$ , choose $y$ so that $y - x$ is irrational, then add $z$ so that both $z - x$ and $z - y$ are irrational and so on. Of course, all arithmetic is done modulo 1.

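Now let’s look at the following: the rational numbers are countable so we can order them (since all arithmetic is done modulo 1, it suffices to list the nonzero rationals in $[0,1)$ ) as $q_{1},q_{2},\ldots$ . Then consider the translates of $E$ : $E_1 =E+q_{1}, E_2 = E + q_{2},\ldots$ .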
Then we can establish: $E$ is disjoint from each $E_{i}$ because, say, if $y\in E$ and $y = z + q_{i}$ with $z\in E$ then $y - z$ would be rational with both $y, z\in E$ . That is impossible. For a similar reason, all of the $E_{i}$ are disjoint.

Now if $w\in E'$ (the complement of $E$ in [0,1] ), then, because $E$ is maximal, there is a $z$ in $E$ such that $w - z$ is rational; hence the union of the $E_{i}$ is the complement of $E$ in [0,1].

So now we have the following situation: if measure is invariant under translation, $E$ and each $E_{i}$ have the same measure. Notice also that all of these sets are disjoint and that $\lbrack 0,1]=E\cup \{\cup _{i=1}^{\infty }E_{i}\}$

If each set has measure zero and we have countable additivity of disjoint sets, we’d have that [0,1] has measure zero. But if $E$ has a non-zero measure and the measure is translation invariant, then we’d have [0,1] having infinite measure (remember, the infinite sum cannot be finite and positive because each $E_{i}$ has to have exactly the same measure).

So, no matter how we define measure, we will have to exclude at least some sets (such as $E$ ).

One other thing to note: $E$ has as its complement a countable number of translates of itself; hence it is impossible for $E$ to meet the Carathéodory characterization requirement that $E$ and its complement have measures that add up to the measure of $[0,1]$ .


January 12, 2011

The Lebesgue Integral: motivating the mastery of the tools of abstract mathematics

Filed under: advanced mathematics, analysis, calculus, integrable function, integrals, Lebesgue Integration, mathematics education, Measure Theory, student learning — collegemathteaching @ 10:41 pm

My Ph. D. basic math classes were a long time ago; in the fall of 1985 I enrolled in the beginning topology, algebra, analysis and algebraic topology classes. The algebra, analysis and topology classes were designed to get the student through the preliminary examinations.

The ironic thing is that while I passed analysis on my first try (summer of 1986) and algebra on the first try (summer of 1987), it took me three tries to get through topology, which I finally passed in the winter of 1988. Of course, that is what I ended up specializing in.

But I can honestly say that most of my analysis was “studying the problems that might appear on the exam”; I really didn’t know what I was doing nor did I understand the material. I had no perspective whatsoever; I really didn’t know where the course was going. Therefore I remember almost none of it.

What does this have to do with the teaching of college mathematics? I see it this way: often, prior to tackling an abstract concept (say, Lebesgue Integration), one first assembles some tools (say, measure theory, the concept of a sigma algebra of sets, Borel sets, etc.) and then uses these tools as the need arises. But the tools are often developed without any context; I didn’t see why any of this was needed or what it would be used for. I had no perspective; I didn’t know WHY I had to be patient with these esoteric (to me at that time) tools.

So, what I remember now is to constantly remind the students where we are going with a topic; I try to explain WHY we need to be patient with this piece of drudgery or that.

Lebesgue Integration

My goal: I hope to review Lebesgue integration for my own benefit and hopefully, via some notes, perhaps provide some perspective for the student who is going through this process for the first time.

For my references, I am choosing A Primer of Lebesgue Integration by H. S. Bear and Real Analysis by H. L. Royden.

Why Lebesgue Integration?

Ground Rules

For the duration, I’ll be discussing the integration of positive functions over a closed, bounded interval $[a,b]$ . In fact, I’ll go ahead and use $[0,1]$ with no loss of generality.

Of course, if we are merely concerned with integrating functions which are piecewise continuous then the Riemann integral (say, as defined as the limit of upper and lower sums) works just fine.

However what happens if we want to, say, extend the class of functions that we can integrate?

Suppose, for example, we wish to compute $\int_{0}^{1}\frac{1}{\sqrt{x}}dx$ or attempt to compute $\int_{0}^{1}\frac{1}{x}dx$ ?

Ok, strictly speaking, these functions aren’t defined on $[0,1]$ , so we can replace the integrand by, say, a piecewise-defined function:

$f(x)=\left\{ \begin{array}{c}\frac{1}{\sqrt{x}},x>0 \\ 0,x=0\end{array}\right.$ in the first case and similarly in the second case.

Of course, we tell our students that while these functions are NOT Riemann integrable (unbounded functions aren’t as the upper sum will never be finite for any finite partition), we can possibly extend the notion of integrability by using the improper integral technique, which, of course, shows that the first integral converges and the second one does not.

But again, we have a type of piecewise continuity. What happens when we don’t?

Let us examine another couple of functions:

$g(x)=\left\{ \begin{array}{c}1,x\text{ rational} \\ 0,\text{ }x\text{ irrational}\end{array}\right.$

$h(x)=\left\{ \begin{array}{c}q,x=\frac{p}{q},\text{ in lowest terms, }p,q\in \{1,2,3..\} \\ 0\text{ otherwise}\end{array}\right.$

Notice that $g$ is bounded whereas $h$ is not. Notice that neither is Riemann integrable as there is no hope of the upper and lower sums converging to a common value. But notice something else: we know that the rationals compose a minuscule portion of the real numbers; hence there might be another theory of integration that would allow us to integrate both of these functions and obtain 0 for the integral. But there isn’t a good way to obtain the integrals of these functions as a limit of Riemann integrals; there are no smaller intervals to work with. Hence the need for a more expansive theory of integration.

About the rational numbers: why do they “take up little space”? I’ll assume that the reader knows that the rationals are countable and the reals are not, hence there are many more irrational numbers than rational ones.

But as far as “taking up space”: what do we mean by that? We will answer this when we develop “measure theory”; that is, a way of assigning a generalization of a “length” to more complicated subsets of the real line. For now I’ll explain why the set of rational numbers doesn’t take up much space, even though they are a dense subset of the real line.

I will show that we can cover all of the rationals by a countable set of intervals whose lengths add up to an arbitrarily small number. That is, for any given $\varepsilon >0,$ I’ll construct a countable set of intervals $[a_{i},b_{i}]$ such that $\cup _{i=1}^{\infty}[a_{i},b_{i}]$ contains ALL of the rational numbers AND $\sum_{i=1}^{\infty }(b_{i}-a_{i})\leq\varepsilon .$ Of course, these intervals are NOT disjoint; far from it. But their union contains all of the rational numbers.

Review: Recall $\sum_{k=0}^{\infty }r^{k}=\frac{1}{1-r}$ for $|r|<1$ .

Let $q_{1},q_{2},q_{3},\ldots$ be the set of rational numbers (which are countable). Then for $q_{k},$ consider the interval $[q_{k}-\frac{1}{2}(\frac{\varepsilon }{\varepsilon +1})^{k},q_{k}+\frac{1}{2}(\frac{\varepsilon }{\varepsilon +1})^{k}]$ . This interval has length $(\frac{\varepsilon }{\varepsilon +1})^{k}.$

Hence adding the lengths of the intervals together is the same as $\sum_{k=1}^{\infty }(\frac{\varepsilon }{\varepsilon +1})^{k}=\frac{\varepsilon }{\varepsilon +1}\cdot \frac{1}{1-\frac{\varepsilon }{\varepsilon +1}}=\frac{\varepsilon }{\varepsilon +1}(\varepsilon +1)=\varepsilon$ . So the sum of the lengths of closed intervals which covers all of the rationals can be made as small as desired.
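Here is a small Python sketch of the covering idea, just as a sanity check and not as a proof. It makes a few assumptions of my own: it only lists rationals in $[0,1]$ (ordered by denominator), it uses the ratio $r=\frac{\varepsilon }{\varepsilon +1}$ with the $k$-th interval having length $r^{k}$ for $k=1,2,3,\ldots$ (so that the geometric series sums to exactly $\varepsilon$ ), and the function names `first_rationals`, `cover` and `total_length` are made up for this illustration. Exact fractions are used so that there is no rounding.

```python
from fractions import Fraction


def first_rationals(count):
    """Return the first `count` distinct rationals in [0, 1], ordered by denominator."""
    seen, found = set(), []
    q = 1
    while len(found) < count:
        for p in range(q + 1):
            x = Fraction(p, q)
            if x not in seen:
                seen.add(x)
                found.append(x)
                if len(found) == count:
                    break
        q += 1
    return found


def cover(eps, count):
    """Center an interval of length r**k on the k-th rational, with r = eps/(eps+1)."""
    r = eps / (eps + 1)
    intervals = []
    for k, q in enumerate(first_rationals(count), start=1):
        half = r ** k / 2
        intervals.append((q - half, q + half))
    return intervals


def total_length(intervals):
    """Add up the lengths b - a of the intervals [a, b]."""
    return sum(b - a for a, b in intervals)


if __name__ == "__main__":
    eps = Fraction(1, 10)
    for n in (5, 10, 30):
        # The partial sums increase toward eps but never reach it.
        print(n, float(total_length(cover(eps, n))))
```

The partial sum over the first $n$ intervals is $\varepsilon (1-r^{n})$ , which stays below $\varepsilon$ no matter how many rationals we cover.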

So, getting back to the problem of integration: how do we do this? The idea behind the Lebesgue integral is to first define this integral for “simple functions”, that is, functions that take on precisely one value over a set. For example: a step function can be thought of as a kind of sum of simple functions.

If $\chi (x)$ is a simple function over a set $I$ and $m(I)$ is the “measure” (a kind of length) of $I,$ then the new integral would be defined as $\int_{I}\chi (x)=\chi (x)m(I)$ (think of the integral in terms of “area”; this would be a height (in the “function direction”) times “width of the interval” operation).

What will be new is that we’ll consider $I$’s that might be much more complicated than a simple disjoint union of intervals. We’ll also show that if $A=I\cup J$ and the union is disjoint and if $\chi (x)$ is simple over both $I$ and $J,$ then $\int_{A}\chi =$ $\int_{I}\chi +$ $\int_{J}\chi$ .

So, going back to our example functions $g$ and $h:$ note that the “measure” of the rational numbers is zero as these can be covered by intervals whose lengths sum to arbitrarily small sums. So if we let $Q$ denote the rational numbers in $[0,1]$ and $I$ denote the irrationals, note that $g$ is a simple function over these two sets. Hence,

$\int_{[0,1]}g=$ $\int_{Q}g+$ $\int_{I}g=m(Q)\ast 1+m(I)\ast 0=0\ast 1+1\ast 0=0.$

The second integral is more problematic; we’d have to break $Q$ into a countable union of sets of points whose denominators are 1, 2, 3, 4,… and to prove a theorem about adding up a countable number of integrals (e. g., forming a sequence). Getting to that point will take some work, as you can see.

Measure Theory.

Ok, we want to determine a “generalized length” of subsets of the real line (subsets of $[0,1]$ for now), and these subsets can be far more complicated than those which are finite disjoint collections of intervals (if you know what the Cantor “middle thirds” set is, you might review that and if you don’t know what it is, look up “Cantor Set”; that will serve as a nice example of a complicated subset of the real line). This is a main point of measure theory.

So, in abstract terms, what we are looking for is some sort of a map $m$ from the collection of subsets of $[0,1]$ to the non-negative real numbers that serves as some sort of a length function. It might be a good time to stop and ask yourself: “what properties would we want this map (called a “measure”) to have?”

Here are some key properties:

1. $m([a,b])=b-a$ (and the same for open, and half open intervals). After all, this is supposed to be a length function. Of course, $b > a$ .

2. $m(E+r)=m(E):$ that is, measure should be “translation invariant” that is, if you translate a set $E$ by adding the same constant to every element of $E,$ the translated set should have the same measure.

3. $m(\{x\})=m(\emptyset )=0$ (the measure of a single point and the empty set ought to be zero)

4. If $F\subset E$ then $m(F)\leq m(E)$ and we’d like to have some additivity properties:

5. Suppose $E=\cup _{i=1}^{\infty }E_{i}.$ Then we’d want $m(E)\leq\sum_{i=1}^{\infty }m(E_{i})$ provided the right hand sum converges (of course, this convergence is absolute since all terms are nonnegative; hence the order of the terms is of no consequence; this uses our calculus results).

Of course, we are talking about a countable union; sums don’t make sense otherwise.

And, if the sets $E_{i}$ are disjoint, we’d love to have $m(E)=\sum_{i=1}^{\infty }m(E_{i})$

So how would we go about defining such a measure?

Here is one standard way: Every subset of the real line can be covered by a countable union of open intervals. (Note: if you said to yourself anything about a Lindelöf space, you have no business reading this article unless it is to critique it). So given a set $E,$ let $\{(a_{i},b_{i})\}$ be a countable collection of open subintervals whose union contains $E$ (that is, an open cover for $E$ ). If $\sum_{i}(b_{i}-a_{i})$ is finite, call that the “length” of the cover of $E.$ Then consider the infimum of all lengths of covers of $E$ (note: since $E \subset \lbrack 0,1]$ , for every $\delta >0$ there exists a cover of length $1+\delta$ , and every length is at least 0, so such an infimum exists). This infimum is called the measure of $E.$

This definition of measure gives us most of what we want (1-5). Note: proving 1 is a bit trickier than it might first appear; it is immediate that $m([a,b])\leq b-a.$ To go the other way, one might use the fact that every open cover of $[a,b]$ has a finite subcover (the Heine-Borel Theorem from your analysis class) and induct on the number of open sets in your finite subcover. Features 2, 3, and 4 are pretty easy to demonstrate.

5 (called “countable subadditivity”) makes for an interesting exercise. I’ll sketch out a solution here:

Since each $E_{i}$ has a measure, we’ll find a collection of open intervals $I_{ij}$ that cover $E_{i}$ and, if $l(I_{ij})$ denotes the length of the open interval, we can assume that $m(E_{i})+\frac{\varepsilon }{2^{i}}>\sum_{j}l(I_{ij}).$ Then note that $\cup _{ij}I_{ij}$ covers $E=\cup _{i=1}^{\infty }E_{i}$ and $\sum_{i}(m(E_{i})+\frac{\varepsilon }{2^{i}})>\sum_{i}\sum_{j}l(I_{ij})\Rightarrow \sum_{i}m(E_{i})+\varepsilon >\sum_{i}\sum_{j}l(I_{ij})\geq m(E)$ for any $\varepsilon >0$ (the last inequality holds because $m(E)$ is the infimum of the lengths of all covers of $E$ ).

Hence $\sum_{i}m(E_{i})\geq m(E)$ .

So what about the case where the $E_{i}$ are disjoint; do we get equality?

The answer is…well, NO…not for ALL subsets $E.$

It will turn out that we will have to restrict our measure to certain subsets called “measurable sets”. The measurable sets form one large collection of subsets of $[0,1]$ that “work”. These sets will be those sets that have this condition: $m(E)+m(E^{\prime})=1$ (where $E^{\prime }=[0,1]-E$ ; the set complement of $E).$

It isn’t obvious that this is the condition that we’ll need; this complement condition is called the Carathéodory characterization.

Note: I am being a little bit sloppy here. Strictly speaking, I should use the concept of “outer measure” (that is, the “infimum of the sum of the lengths of the covers”) when talking about the Carathéodory characterization and denote that by $m^{*}(E)$ and only use $m(E)$ when I am talking about the “measure” of the “measurable sets”; otherwise I would be using circular logic. But this isn’t a textbook so I’ll abuse notation.

Note: it is easy to establish that intervals and one point sets are measurable. To find other sets that are measurable, we’ll need to use results from the algebra of sets, sigma-algebras, etc.; the idea is to show that, say, the collection of measurable sets is closed with respect to countable unions, countable intersections, set complementation, etc.

This is one reason why set algebra topics are often covered in real analysis textbooks in the first sections or chapters. We’ll also show an example of a non-measurable set in a future post (3 dimensional versions of these come into play in the Banach-Tarski “paradox”).

In the next section, we construct a non-measurable set.


January 7, 2011

The Dirac Delta Function in an Elementary Differential Equations Course

Filed under: applied mathematics, class room experiment, density function, differential equations, dirac delta function, distributions, generalized functions, Impulse, Laplace transform, mathematics education, Normal distribution — collegemathteaching @ 9:01 pm

The Dirac Delta Function in Differential Equations

The delta “function” is often introduced into differential equations courses during the section on Laplace transforms. Of course the delta “function” isn’t a function at all but rather what is known as a “distribution” (more on this later).

A typical introduction is as follows: if one is working in classical mechanics and one applies a force $F(t)$ to a constant mass $m$ at time $t,$ then one can define the impulse $I$ of $F$ over an interval $[a,b]$ by $I=\int_{a}^{b}F(t)dt=m(v(b)-v(a))$ where $v$ is the velocity. So we can do a translation to set $a=0$ and then consider a unit impulse and vary $F(t)$ according to where $b$ is; that is, define
$\delta ^{\varepsilon}(t)=\left\{ \begin{array}{c}\frac{1}{\varepsilon },0\leq t\leq \varepsilon \\ 0\text{ elsewhere}\end{array}\right. .$

Then $F(t)=\delta ^{\varepsilon }(t)$ is the force function that produces unit impulse for a given $\varepsilon >0.$

Then we wave our hands and say $\delta (t)=\lim _{\varepsilon \rightarrow 0}\delta ^{\varepsilon }(t)$ (this is a great reason to introduce the concept of the limit of functions in a later course) and then argue that for all functions that are continuous over an interval containing 0, $\int_{0}^{\infty }\delta (t)f(t)dt=f(0)$ .

The (hand waving) argument at this stage goes something like: “the mean value theorem for integrals says that there is a $c_{\varepsilon }$ between $0$ and $\varepsilon$ such that $\int_{0}^{\varepsilon }\delta^{\varepsilon }(t)f(t)dt=\frac{1}{\varepsilon}f(c_{\varepsilon})(\varepsilon -0)=f(c_{\varepsilon })$ . Therefore as $\varepsilon\rightarrow 0,$ $\int_{0}^{\varepsilon }\delta^{\varepsilon}(t)f(t)dt=f(c_{\varepsilon })\rightarrow f(0)$ by continuity. Therefore we can define the Laplace transform $L(\delta (t))=e^{-s0}=1.$ ”
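For students who want to see the hand waving in numbers, here is a short Python sketch. It approximates $\int_{0}^{\varepsilon }\delta ^{\varepsilon }(t)f(t)dt$ with a midpoint rule, which amounts to averaging $f$ over $[0,\varepsilon ]$ . The function name `smeared_value`, the number of sample points, and the test functions are my own choices for illustration.

```python
import math


def smeared_value(f, eps, n=1000):
    """Midpoint-rule approximation of the integral of (1/eps) * f(t) over [0, eps]."""
    h = eps / n
    return sum(f((i + 0.5) * h) for i in range(n)) * h / eps


if __name__ == "__main__":
    s = 2.0  # an arbitrary positive value of the Laplace variable
    for eps in (1.0, 0.1, 0.01, 0.001):
        # With f(t) = exp(-s t) this is the Laplace transform of delta^eps;
        # it should drift toward exp(-s * 0) = 1 as eps shrinks.
        print(eps, smeared_value(lambda t: math.exp(-s * t), eps))
```

Taking $f(t)=e^{-st}$ ties this to the claim $L(\delta (t))=1$ : the printed values are approximations of $L(\delta ^{\varepsilon }(t))=\frac{1-e^{-s\varepsilon }}{s\varepsilon }$ .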

Illustrating what the delta “function” does.

I came across this example by accident; I was holding a review session for students and asked for them to give me a problem to solve.

They chose $y^{\prime \prime }+ay^{\prime }+by=\delta$ (I can’t remember what $a$ and $b$ were but they aren’t important here as we will see) with initial conditions $y(0)=0,y^{\prime }(0)=-1$ .

So using the Laplace transform, we obtained:

$(s^{2}+as+b)Y-sy(0)-y^{\prime }(0)-ay(0)=1$

But with $y(0)=0,y^{\prime }(0)=-1$ this reduces to $(s^{2}+as+b)Y+1=1\rightarrow Y=0$

In other words, we have the “same solution” as if we had $y^{\prime\prime }+ay^{\prime }+by=0$ with $y(0)=0,y^{\prime }(0)=0$ .

So that might be a way to talk about the delta “function”; it is exactly the “impulse” one needs to “cancel out” an initial velocity of $-1$ or, equivalently, to give an initial velocity of $1$ and to do so instantly.
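One can watch this cancellation happen numerically. The sketch below replaces $\delta$ by the pulse $\delta ^{\varepsilon }$ and integrates the equation with a classical Runge-Kutta step. Since the original coefficients are not recorded here, the values $a=3$ and $b=2$ are purely hypothetical, as are the function name `peak_displacement`, the time horizon, and the step size.

```python
def peak_displacement(eps, a=3.0, b=2.0, t_end=1.0, steps_per_pulse=20):
    """Largest |y(t)| on [0, t_end] for y'' + a y' + b y = delta^eps, y(0)=0, y'(0)=-1.

    a and b are hypothetical sample coefficients. The step size is chosen so
    that the pulse ends exactly on a step boundary.
    """
    h = eps / steps_per_pulse
    n_steps = int(round(t_end / h))
    y, v = 0.0, -1.0
    peak = 0.0

    def accel(y_, v_, force):
        return force - a * v_ - b * y_

    for k in range(n_steps):
        # The force is 1/eps during the pulse and 0 afterwards.
        force = 1.0 / eps if k < steps_per_pulse else 0.0
        # One classical fourth-order Runge-Kutta step for the system (y, v).
        k1y, k1v = v, accel(y, v, force)
        k2y, k2v = v + 0.5 * h * k1v, accel(y + 0.5 * h * k1y, v + 0.5 * h * k1v, force)
        k3y, k3v = v + 0.5 * h * k2v, accel(y + 0.5 * h * k2y, v + 0.5 * h * k2v, force)
        k4y, k4v = v + h * k3v, accel(y + h * k3y, v + h * k3v, force)
        y += h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6
        v += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
        peak = max(peak, abs(y))
    return peak


if __name__ == "__main__":
    for eps in (0.1, 0.01, 0.001):
        print(eps, peak_displacement(eps))
```

As $\varepsilon$ shrinks, the largest displacement shrinks with it, which is the numerical shadow of $Y=0$ : in the limit the pulse cancels the initial velocity before the mass has moved.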

Another approach to the delta function

Though it is true that $\int_{-\infty }^{\infty }\delta^{\varepsilon }(t)dt=1$ for all $\varepsilon$ and $\int_{-\infty}^{\infty }\delta (t)dt=1$ by design, note that $\delta ^{\varepsilon }(t)$ fails to be continuous at $0$ and at $\varepsilon$ .

So, can we obtain the delta “function” as a limit of other functions that are everywhere continuous and differentiable?

In an attempt to find such a family of functions, it is a fun exercise to look at a limit of normal density functions with mean zero:

$f_{\sigma }(t)=\frac{1}{\sigma \sqrt{2\pi }}\exp (-\frac{1}{2\sigma ^{2}}t^{2})$ . Clearly for all $\sigma >0,\int_{-\infty }^{\infty }f_{\sigma}(t)dt=1$ and $\int_{0}^{\infty }f_{\sigma }(t)dt=\frac{1}{2}$ .

Here is the graph of some of these functions: we use $\sigma = .5$ , $\sigma = .25$ and $\sigma = .1$ respectively.

[Figure: graphs of the densities $f_{\sigma }$ ]

Calculating the Laplace transform

$L(\frac{1}{\sigma \sqrt{2\pi }}\exp (-\frac{1}{2\sigma ^{2}}t^{2}))=$ $\frac{1}{\sigma \sqrt{2\pi }}\int_{0}^{\infty }\exp (-\frac{1}{2\sigma^{2}}t^{2})\exp (-st)dt=$

Combine the exponentials and complete the square to obtain:

$\frac{1}{\sigma \sqrt{2\pi }}\int_{0}^{\infty }\exp (-\frac{1}{2\sigma ^{2}}(t+\sigma ^{2}s)^{2})\exp (\frac{s^{2}\sigma^{2}}{2})dt=\exp (\frac{s^{2}\sigma ^{2}}{2})[\frac{1}{\sigma \sqrt{2\pi }}\int_{0}^{\infty }\exp (-\frac{1}{2\sigma ^{2}}(t+\sigma^{2}s)^{2})dt]$

Now do the usual transformation to the standard normal random variable via $z=\dfrac{t+\sigma ^{2}s}{\sigma }$

And we obtain:

$L(f_{\sigma }(t))=\exp (\frac{s^{2}\sigma ^{2}}{2})P(Z>\sigma s)$ for all $\sigma >0$ . Note: assume $s>0$ and that $P$ denotes probability for a standard normal random variable $Z$ .

Now if we take a limit as $\sigma \rightarrow 0$ we get $\frac{1}{2}$ on the right hand side.
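As a check on the algebra, here is a Python sketch that compares the closed form $\exp (\frac{s^{2}\sigma ^{2}}{2})P(Z>\sigma s)$ with a direct numerical integration of the Laplace transform. It uses the identity $P(Z>x)=\frac{1}{2}\text{erfc}(x/\sqrt{2})$ from the standard library. The function names, the truncation of the integral at $12\sigma$ , and the number of Simpson panels are my own choices, so treat it as a rough check only.

```python
import math


def laplace_closed_form(sigma, s):
    """exp(s^2 sigma^2 / 2) * P(Z > sigma s), using P(Z > x) = erfc(x / sqrt(2)) / 2."""
    tail = 0.5 * math.erfc(sigma * s / math.sqrt(2.0))
    return math.exp(0.5 * (s * sigma) ** 2) * tail


def laplace_numeric(sigma, s, n=20000):
    """Composite Simpson's rule for the integral of f_sigma(t) * exp(-s t) on [0, 12 sigma].

    The integrand is negligible beyond 12 standard deviations; n must be even.
    """
    upper = 12.0 * sigma
    h = upper / n

    def integrand(t):
        return math.exp(-0.5 * (t / sigma) ** 2 - s * t) / (sigma * math.sqrt(2.0 * math.pi))

    total = integrand(0.0) + integrand(upper)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3.0


if __name__ == "__main__":
    s = 1.0
    for sigma in (0.5, 0.25, 0.1, 0.01):
        # The two columns should agree, and both should drift toward 1/2.
        print(sigma, laplace_closed_form(sigma, s), laplace_numeric(sigma, s))
```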

Hence, one way to define $\delta$ is as $2\lim _{\sigma \rightarrow0}f_{\sigma }(t)$ . This means that while $\lim_{\sigma \rightarrow0}\int_{-\infty }^{\infty }2f_{\sigma }(t)dt$ is off by a factor of 2, $\lim_{\sigma \rightarrow 0}\int_{0}^{\infty }2f_{\sigma }(t)dt=1$ as desired.

Since we now have derivatives of the functions to examine, why don’t we?

$\frac{d}{dt}2f_{\sigma }(t)=-\frac{2t}{\sigma ^{3}\sqrt{2\pi }}\exp (-\frac{1}{2\sigma ^{2}}t^{2})$ which is zero at $t=0$ for all $\sigma >0.$ But the behavior of the derivative is interesting: the derivative is at its minimum at $t=\sigma$ and at its maximum at $t=-\sigma$ (as we tell our probability students: the standard deviation is the distance from the origin to the inflection points) and as $\sigma \rightarrow 0,$ the inflection points get closer together and the second derivative at the origin approaches $-\infty ,$ which can be thought of as an instant drop from a positive velocity at $t=0$ .
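Both claims about the derivative are easy to spot check. The Python sketch below searches a grid for the minimum of $\frac{d}{dt}2f_{\sigma }(t)$ and estimates the second derivative at the origin by a central difference, to be compared with the exact value $-\frac{2}{\sigma ^{3}\sqrt{2\pi }}$ . The function names, the grid on $[-4\sigma ,4\sigma ]$ , and the difference step are assumptions made for this illustration.

```python
import math


def d_two_f(t, sigma):
    """Derivative of 2 * f_sigma at t."""
    return -2.0 * t / (sigma ** 3 * math.sqrt(2.0 * math.pi)) * math.exp(-t * t / (2.0 * sigma ** 2))


def argmin_on_grid(sigma, n=4000):
    """Grid point in [-4 sigma, 4 sigma] where the derivative is smallest."""
    grid = [(-4.0 + 8.0 * i / n) * sigma for i in range(n + 1)]
    return min(grid, key=lambda t: d_two_f(t, sigma))


def slope_at_origin(sigma):
    """Central-difference estimate of the second derivative of 2 * f_sigma at 0."""
    h = 1e-6 * sigma
    return (d_two_f(h, sigma) - d_two_f(-h, sigma)) / (2.0 * h)


if __name__ == "__main__":
    for sigma in (0.5, 0.25, 0.1):
        # The minimizer should sit at t = sigma; the slope should grow like -1/sigma^3.
        print(sigma, argmin_on_grid(sigma), slope_at_origin(sigma))
```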

Here are the graphs of the derivatives of the density functions that were plotted above; note how the part of the graph through the origin becomes more vertical as the standard deviation approaches zero.

[Figure: graphs of the derivatives of the densities]

