College Math Teaching

April 28, 2023

Taylor Polynomials without series (advantages, drawbacks)

Filed under: calculus, series, Taylor polynomial., Taylor Series — oldgote @ 12:32 am

I was in a weird situation this semester in my “applied calculus” (aka “business calculus”) class. I had an awkward amount of time left (1 week) and I still wanted to do something with Taylor polynomials, but I had nowhere near enough time to cover infinite series and power series.

So, I just went ahead and introduced it “user’s manual” style, knowing that I could justify this, if I had to (and no, I didn’t have to), even without series. BUT there are some drawbacks too.

Let’s see how this goes. We’ll work with series centered at c =0 (expand about 0) and assume that f has as many continuous derivatives as desired on an interval connecting 0 to x .

Now we calculate: \int^x_0 f'(t) dt = f(x) -f(0) , of course. But we could do the integral another way: let’s use parts and say u = f'(t), dv = dt \rightarrow du = f''(t) dt, v = (t-x) . Note the choice for v and that x is a constant in the integral. We then get f(x) -f(0)=\int^x_0 f'(t) dt = f'(t)(t-x)|^x_0 -\int^x_0f''(t)(t-x) dt . Evaluation:

f(x) =f(0)+f'(0)x -\int^x_0f''(t)(t-x) dt and we’ve completed the first step.

Though we *could* do the inductive step now, it is useful to grind through a second iteration to see the pattern.

We take our expression and compute \int^x_0f''(t)(t-x) dt by parts again, with u = f''(t), dv =(t-x) dt \rightarrow du =f'''(t) dt, v = {(t-x)^2 \over 2!} and insert into our previous expression:

f(x) =f(0)+f'(0)x - f''(t){(t-x)^2 \over 2!}|^x_0 + \int^x_0 f'''(t){(t-x)^2 \over 2!} dt which works out to:

f(x) = f(0)+f'(0)x +f''(0){x^2 \over 2} + \int^x_0 f'''(t){(t-x)^2 \over 2!} dt and note the alternating sign of the integral.

Now to use induction: assume that:

f(x) = f(0)+f'(0)x +f''(0){x^2 \over 2} + ....f^{(k)}(0){x^k \over k!} + (-1)^k \int^x_0 f^{(k+1)}(t) {(t-x)^k \over k!} dt

Now let’s look at the integral: as usual, use parts as before and we obtain:

(-1)^k (f^{(k+1)}(t) {(t-x)^{k+1} \over (k+1)!}|^x_0 - \int^x_0 f^{(k+2)}(t) {(t-x)^{k+1} \over (k+1)!} dt ). Taking some care with the signs we end up with

(-1)^k (-f^{(k+1)}(0){(-x)^{k+1} \over (k+1)! } )+ (-1)^{k+1}\int^x_0 f^{(k+2)}(t) {(t-x)^{k+1} \over (k+1)!} dt which works out to (-1)^{2k+2} (f^{(k+1)}(0) {x^{k+1} \over (k+1)!} )+ (-1)^{k+1}\int^x_0 f^{(k+2)}(t) {(t-x)^{k+1} \over (k+1)!} dt .

Substituting this evaluation into our inductive step equation gives the desired result.

And note: NOTHING was assumed except for f having the required number of continuous derivatives!
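If you want a quick numerical sanity check of the identity (nothing here is needed for the argument), here is a minimal sketch, assuming SciPy is available for the quadrature, using f(x) = e^x so that every derivative is e^t :

```python
# Check: f(x) = sum_{j=0}^{k} f^(j)(0) x^j/j!  +  (-1)^k * int_0^x f^(k+1)(t) (t-x)^k / k! dt
# for f(x) = e^x (every derivative of e^x is e^t, and f^(j)(0) = 1).
from math import exp, factorial
from scipy.integrate import quad

def taylor_with_remainder(x, k):
    poly = sum(x**j / factorial(j) for j in range(k + 1))
    remainder, _ = quad(lambda t: exp(t) * (t - x)**k / factorial(k), 0, x)
    return poly + (-1)**k * remainder

for k in (1, 3, 6):
    print(k, taylor_with_remainder(2.0, k), exp(2.0))   # every row should agree with e^2
```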

BUT…yes, there is a catch. The integral is often regarded as a “correction term.” But the Taylor polynomial is really only useful so long as the integral can be made small. And that is the issue with this approach: there are times when the integral cannot be made small; it is possible that x can be far enough out that the associated power series does NOT converge on (-x, x) and the integral picks that up, but it may well be hidden, or at least non-obvious.

And that is why, in my opinion, it is better to do series first.

Let’s show an example.

Consider f(x) = {1 \over 1+x } . We know from work with the geometric series that its series expansion is 1 -x +x^2-x^3....+(-1)^k x^k + .... and that the interval of convergence is (-1,1) . But note that f is smooth on [0, \infty) and so our Taylor polynomial, with integral correction, should work for x > 0 .

So, noting that f^{(k)}(x) = (-1)^k(k!)(1+x)^{-(k+1)} , our k-th Taylor polynomial relation is:

f(x) =1-x+x^2-x^3 .....+(-1)^kx^k +(-1)^k \int^x_0 (-1)^{k+1}(k+1)!{1 \over (1+t)^{k+2} } {(t-x)^k \over k!} dt

Let’s focus on the integral; the “remainder”, if you will.

Rewrite it as: (-1)^{2k+1} (k+1) \int^x_0 ({(t -x) \over (t+1) })^k {1 \over (t+1)^2} dt .

Now this integral really isn’t that hard to do, if we use an algebraic trick:

Rewrite ({(t -x) \over (t+1) })^k  = ({(t+1 -x-1) \over (t+1) })^k = (1-{(x+1) \over (t+1) })^k

Now the integral is a simple substitution integral: let u = 1-{(x+1) \over (t+1) } \rightarrow du = {(x+1) \over (t+1)^2} dt , so our integral is transformed into:

(-1) ({k+1 \over x+1}) \int^0_{-x} u^{k} du = (-1) {k+1 \over (k+1)(x+1)} (-(-x)^{k+1}) = {(-x)^{k+1} \over x+1} =(-1)^{k+1}{1 \over (x+1)}x^{k+1}

This remainder cannot be made small if x \geq 1 , no matter how big we make k .

But, in all honesty, this remainder could have been computed with simple algebra.

{1 \over x+1} =1-x+x^2....+(-1)^k x^k + R and now solve for R algebraically.
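Indeed, using the finite geometric sum 1 -x +x^2-....+(-1)^k x^k = {1-(-x)^{k+1} \over 1+x} , we get R = {1 \over 1+x} - {1-(-x)^{k+1} \over 1+x} = {(-x)^{k+1} \over 1+x} = (-1)^{k+1}{x^{k+1} \over 1+x} , which matches the integral computation above.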

The larger point is that the “error” is hidden in the integral remainder term, and this can be tough to see in the case where the function is smooth on the whole real line (or a half line) but the associated Taylor series has only a finite radius of convergence.

March 26, 2023

Annoying calculations: Beta integral

March 7, 2023

Teaching double integrals: why you should *always* sketch the region

The problem (from Larson’s Calculus, an Applied Approach, 10’th edition, Section 7.8, no. 18 in my paperback edition, no. 17 in the e-book edition) does not seem that unusual at a very quick glance:

The integral in question is \int^2_0 \int^{\sqrt{1-y^2}}_0 -5xy \ dx \ dy . AND, *if* you just blindly do the formal calculations:

-{5 \over 2} \int^2_0 x^2y|^{x=\sqrt{1-y^2}}_{x=0}  dy = -{5 \over 2} \int^2_0 y-y^3 dy = -{5 \over 2}(2-4) = 5 which is what the text has as “the answer.”

But come on. We took a function that was negative in the first quadrant, integrated it entirely in the first quadrant (in standard order) and ended up with a positive number??? I don’t think so!

Indeed, if we perform \int^2_0 \int^1_0 -5xy dxdy =-5 which is far more believable.

So, we KNOW something is wrong. Now let’s attempt to sketch the region first: the inner limit x = \sqrt{1-y^2} traces the quarter of the circle x^2+y^2 = 1 in the first quadrant, which is only real for 0 \leq y \leq 1 , yet the outer integral runs y from 0 to 2.

Oops! Note: if we just used the quarter circle boundary we obtain

\int^1_0 \int^{x=\sqrt{1-y^2}}_{x=0} -5xy dxdy = -{5 \over 8}

The 3-dimensional situation: the solid is limited by the graph of z = -5xy , the cylinder x^2+y^2 =1 , and the planes y=0, x =0 ; the plane y=2 lies outside of this cylinder.

Now think about what the “formal calculation” really calculated and wonder if it was just a coincidence that we got the absolute value of the integral taken over the rectangle 0 \leq x \leq 1, 0 \leq y \leq 2
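For the skeptical: here is a minimal numerical sketch of the two honest integrals above, assuming SciPy is available.

```python
from math import sqrt
from scipy.integrate import dblquad

f = lambda x, y: -5 * x * y   # dblquad integrates over the first argument of f first

# Quarter disk: outer y in [0, 1], inner x in [0, sqrt(1 - y^2)]  ->  -5/8
quarter_disk, _ = dblquad(f, 0, 1, lambda y: 0, lambda y: sqrt(1 - y * y))

# Rectangle: 0 <= x <= 1, 0 <= y <= 2  ->  -5
rectangle, _ = dblquad(f, 0, 2, lambda y: 0, lambda y: 1)

print(quarter_disk, rectangle)   # approximately -0.625 and -5.0; neither is +5
```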

October 7, 2021

A “weird” implicit graph

Filed under: calculus, implicit differentiation, pedagogy — oldgote @ 12:46 am

I was preparing some implicit differentiation exercises and decided to give this one:

If sin^2(y) + cos^2(x) =1 , find {dy \over dx} . That is fairly straightforward, no? But there is a bit more here than meets the eye, as I quickly found out. I graphed this on Desmos and:

What in the world? Then I pondered for a minute or two and then it hit me:

sin^2(y) = 1-cos^2(x) \rightarrow sin^2(y) = sin^2(x) \rightarrow sin(y) = \pm sin(x) \rightarrow y = \pm x + k \pi , which leads to families of lines with either slope 1 or slope -1 and y -intercepts at integer multiples of \pi .

Now, just blindly doing the problem we get 2sin(x)cos(x) = 2 {dy \over dx} cos(y)sin(y) which leads to: {dy \over dx} = {sin(x)cos(x) \over sin(y)cos(y)} = \pm {\sqrt{1-cos^2(x)} \sqrt{1-sin^2(x)} \over \sqrt{1-cos^2(y)} \sqrt{1-sin^2(y)}} = \pm 1 , since sin^2(y) = sin^2(x) and cos^2(y) = cos^2(x) by the original equation and the circle identity.
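For instance, along the line y = x we have sin(y)cos(y) = sin(x)cos(x) , so the formula gives {dy \over dx} = 1 ; along y = -x + \pi we have sin(y) = sin(x) and cos(y) = -cos(x) , so the formula gives -1 , matching the slopes of the two families of lines.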

May 21, 2021

Introduction to infinite series for inexperienced calculus teachers

Filed under: calculus, mathematics education, pedagogy, Power Series, sequences, series — oldgote @ 1:26 pm

Let me start by saying what this is NOT: this is not an introduction for calculus students (too steep), nor is it intended for experienced calculus teachers. Nor is it a “you should teach it THIS way” or “introduce the concepts in THIS order or emphasize THESE topics” piece; that is for the individual teacher to decide.

Rather, this is a quick overview to help the new teacher (or for the teacher who has not taught it in a long time) decide for themselves how to go about it.

And yes, I’ll be giving a lot of opinions; disagree if you like.

What series will be used for.

Of course, infinite series have applications in probability theory (discrete density functions, expectation and higher moment values of discrete random variables), financial mathematics (perpetuities), etc. and these are great reasons to learn about them. But in calculus, these tend to be background material for power series.

Power series: \sum^{\infty}_{k=0} a_k (x-c)^k ; the most important thing is to determine the open interval of absolute convergence, that is, the interval on which \sum^{\infty}_{k=0} |a_k (x-c)^k | converges.

We teach that these intervals are *always* symmetric about x = c (that is, the series converges at x = c only, on some open interval (c-\delta, c+ \delta) , or on the whole real line). Side note: this is an interesting place to point out the influence that the calculus of complex variables has on real variable calculus! These open intervals are the most important aspect, as one can prove that one can differentiate and integrate such a series “term by term” on the open interval of absolute convergence; sometimes one can extend the results to the boundary of the interval.

Therefore, if time is limited, I tend to focus on material more relevant for series that are absolutely convergent, though there are some interesting (and fun) things one can do with a series which is conditionally convergent (convergent, but not absolutely convergent; e. g. \sum^{\infty}_{k=1} (-1)^{k+1} {1 \over k} ).

Important principles: I think it is a good idea to first deal with geometric series and then series with positive terms…make that “non-negative” terms.

Geometric series: \sum ^{\infty}_{k =0} x^k ; here we see that for x \neq 1 , \sum ^{n}_{k =0} x^k= {1-x^{n+1} \over 1-x } (the sum is equal to n+1 when x = 1 ); to show this do the old “shifted sum” addition: S = 1 + x + x^2 + ...+x^n , xS = x+x^2 + ...+x^{n+1} , then subtract: S-xS = (1-x)S = 1-x^{n+1} , as most of the terms cancel with the subtraction.

Now to show the geometric series converges. (Convergence here is the standard kind: with \sum^n_{k = 0} c_k = S_n the n -th partial sum, the series \sum^{\infty}_{k = 0} c_k converges if and only if the sequence of partial sums S_n converges; yes, there are other types of convergence.)

We’ve established that for the geometric series S_n = {1-x^{n+1} \over 1-x } , so we get convergence exactly when |x^{n+1}| goes to zero, which happens precisely when |x| < 1 .
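A minimal numerical illustration of the closed form and the limit (plain Python, nothing beyond the standard library):

```python
def S(x, n):
    # n-th partial sum of the geometric series
    return sum(x**k for k in range(n + 1))

x = 0.9
for n in (10, 100, 1000):
    print(n, S(x, n), (1 - x**(n + 1)) / (1 - x))   # the two columns agree
print(1 / (1 - x))                                  # and the partial sums approach 1/(1-x) = 10
```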

Why geometric series: two of the most common series tests (root and ratio tests) involve a comparison to a geometric series. Also, the geometric series concept is used both in the theory of improper integrals and in measure theory (e. g., showing that the rational numbers have measure zero).

Series of non-negative terms. For now, we’ll assume that \sum a_k has all a_k \geq 0 (suppressing the indices).

Main principle: though most texts talk about the various tests, I believe that most of the tests really rest on three key principles, two of which are the geometric series and the following result about sequences of positive numbers:

Key sequence result: every bounded, monotone increasing sequence of positive numbers converges to its least upper bound.

True: many calculus texts don’t do that much with the least upper bound concept, but I feel it is intuitive enough to at least mention. If the least upper bound is, say, b , then for any small, positive \delta there has to be some N > 0 such that a_N > b-\delta . Then because a_n is monotone increasing, b \geq a_{m} > b-\delta for all m > N .

The third key principle is “common sense”: if \sum c_k converges (standard convergence) then c_k \rightarrow 0 as a sequence. This is pretty clear if the c_k are non-negative; the idea is that the sequence of partial sums S_n cannot converge to a limit unless |S_{n+1} -S_{n}| = |c_{n+1}| becomes arbitrarily small. Of course, this is true even if the terms are not all positive.

Secondary results I think that the next results are “second order” results: the main tests depend on these, and these depend on the three key principles we just discussed.

The first of these secondary results is the direct comparison test for series of non-negative terms:

Direct comparison test

If 0< c_n \leq b_n  and \sum b_n converges, then so does \sum c_n . If \sum c_n diverges, then so does \sum b_n .

The proof is basically the “bounded monotone sequence” principle applied to the partial sums. I like to call it “if you are taller than an NBA center then you are tall” principle.

Evidently, some see this result as a “just get to something else” result, but it is extremely useful; one can apply this to show that the exponential of a square matrix is defined; it is the principle behind the Weierstrass M-test, etc. Do not underestimate this test!

Absolute convergence: this is the most important kind of convergence for power series as this is the type of convergence we will have on an open interval. A series is absolutely convergent if \sum |c_k| converges. Now, of course, absolute convergence implies convergence:

Note 0 \leq |c_k| -c_k \leq 2|c_k| , and if \sum |c_k| converges, then \sum (|c_k|-c_k) converges by direct comparison. Now note c_k = |c_k|-(|c_k| -c_k) , so \sum c_k is the difference of two convergent series, \sum |c_k| -\sum (|c_k|-c_k ) , and therefore converges.

Integral test This is an important test for convergence at a point. This test assumes that f is a non-negative, non-increasing function on some [1, \infty) (that is, a >b \rightarrow f(a) \leq f(b) ). Then \sum f(n) converges if and only if \int_1^{\infty} f(x)dx converges as an improper integral.

Proof sketch: because f is non-increasing, \sum_{n=2} f(n) is a right endpoint Riemann sum that underestimates \int_1^{\infty} f(x)dx , so if the integral converges the partial sums form an increasing, bounded sequence and the series converges. Conversely, \sum_{n=1} f(n) is a left endpoint estimate that overestimates the integral, so if the sum converges then \int_1^{N} f(x)dx is bounded and increasing in N and the improper integral converges.


Note: we need the hypothesis that f is non-increasing. Example: the function f(x) = \begin{cases}  x , & \text{ if } x \notin \{1, 2, 3,...\} \\ 0, & \text{ otherwise} \end{cases} certainly has \sum f(n) converging but \int^{\infty}_{1} f(x) dx diverging.

Going the other way, defining f(x) = \begin{cases}  2^n , & \text{ if }  x \in [n, n+2^{-2n}] \\0, & \text{ otherwise} \end{cases} gives an unbounded function whose sum \sum_{n=1} 2^n diverges but whose integral converges to \sum_{n=1} 2^{-n} =1 . The “boxes” get taller and skinnier.


Now wait a minute: we haven’t really gone over how students will do most of their homework and exam problems. We’ve covered none of these explicitly: the p-test, limit comparison test, ratio test, root test. Logically we have, but not practically.

Let’s remedy that. First, start with the “point convergence” tests.

p-test. This says that \sum {1 \over k^p} converges if p> 1 and diverges otherwise. Proof: Integral test.
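A quick numerical illustration of the dichotomy (a sketch; the reference constants are just for comparison):

```python
from math import log, pi

for n in (10**2, 10**4, 10**6):
    p2 = sum(1.0 / k**2 for k in range(1, n + 1))   # bounded: approaches pi^2/6
    p1 = sum(1.0 / k for k in range(1, n + 1))      # unbounded: grows like log(n)
    print(n, p2, p1, log(n))
print(pi**2 / 6)
```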

Limit comparison test Given two series of positive terms: \sum b_k and \sum c_k

Suppose lim_{k \rightarrow \infty} {b_k \over c_k} = L

If \sum c_k converges and 0 \leq L < \infty then so does \sum b_k .

If \sum c_k diverges and 0 < L \leq \infty then so does \sum b_k

I’ll show the “converges” part of the proof (for L > 0 ; if L = 0 , use \epsilon = 1 instead): choose \epsilon = L , then N such that n > N \rightarrow  {b_n \over c_n } < 2L . This means \sum_{k=N+1} b_k \leq 2L\sum_{k=N+1} c_k and we get convergence by direct comparison. See how useful that test is?

But note what is going on: it really isn’t necessary for lim_{k \rightarrow \infty} {b_k \over c_k}  to exist; for the convergence case it is only necessary that there be some M with {b_k \over c_k} < M for all sufficiently large k ; if one is familiar with the limit superior (“limsup”), that is enough to make the test work.

We will see this again.

Why limit comparison is used: Something like \sum {1 \over 4k^5-2k^2-14} clearly converges, but nailing down the proof with direct comparison can be hard. But a limit comparison with \sum {1 \over k^5} is pretty easy.
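Here is a small sketch of that comparison in action (the series is summed starting at k = 2 so the denominator is positive):

```python
b = lambda k: 1.0 / (4 * k**5 - 2 * k**2 - 14)
c = lambda k: 1.0 / k**5

for k in (10, 100, 1000):
    print(k, b(k) / c(k))                        # the ratio b_k / c_k approaches 1/4

print(sum(b(k) for k in range(2, 200000)))       # a large partial sum; the tail is tiny
```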

Ratio test This test is most commonly used when the series has powers and/or factorials in it. Basically: given \sum c_n , consider lim_{k \rightarrow \infty} {c_{k+1} \over c_{k}} = L (if the limit exists; if it doesn’t, stay tuned).

If L < 1 the series converges. If L > 1 the series diverges. If L = 1 the test is inconclusive.

Note: if it turns out that there exists some N >0 such that for all n > N we have {c_{n+1} \over c_n } < \gamma < 1 , then the series converges (we can use the limsup concept here as well).

Why this works: suppose there exists some N >0 such that for all n > N we have {c_{n+1} \over c_n } < \gamma < 1 Then write \sum_{k=n} c_k = c_n + c_{n+1} + c_{n+2} + ....

now factor out a c_n to obtain c_n (1 + {c_{n+1} \over c_n} + {c_{n+2} \over c_n} + {c_{n+3} \over c_{n}} +....)

Now multiply the terms by 1 in a clever way:

c_n (1 + {c_{n+1} \over c_n} + {c_{n+2} \over c_{n+1}}{c_{n+1} \over c_n} + {c_{n+3} \over c_{n+2}}  {c_{n+2} \over c_{n+1}}  {c_{n+1} \over c_{n}}   +....) See where this is going: each ratio is less than \gamma so we have:

\sum_{k=n} c_k \leq c_n \sum_{j=0} (\gamma)^j which is a convergent geometric series.

See: there is the geometric series and the direct comparison test, again.

Root Test No, this is NOT the same as the ratio test. In fact, it is a bit “stronger” than the ratio test, in that the root test will work for anything the ratio test works for, but there are some series for which the root test works and the ratio test comes up empty.

I’ll state the “lim sup” version of the root test: if there exists some N such that, for all n>N , we have (c_n)^{1 \over n} < \gamma < 1 , then the series converges (exercise: find the “divergence” version).

As before: if the condition is met, \sum_{k=n} c_k \leq \sum_{k=n} \gamma^k , so the original series converges by direct comparison.

Now as far as my previous remark about the ratio test: Consider the series:

1 + ({1 \over 3}) + ({2 \over 3})^2 + ({1 \over 3})^3 + ({2 \over 3})^4 +...({1 \over 3})^{2k-1} +({2 \over 3})^{2k} ...

Yes, this series is bounded by the convergent geometric series with r = {2 \over 3} and therefore converges by direct comparison. And the limsup version of the root test works as well.

But the ratio test is a disaster, as {({2 \over 3})^{2k}  \over  ({1 \over 3})^{2k-1} } ={2^{2k} \over 3 } , which is unbounded, while {({1 \over 3})^{2k+1}  \over  ({2 \over 3})^{2k} }  ={1 \over 3 \cdot 2^{2k} } , which goes to zero; the limit of the ratios does not exist.
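A short numerical sketch of the contrast; term(n) below is my name for the n-th term of the interleaved series above:

```python
def term(n):
    # 1, (1/3), (2/3)^2, (1/3)^3, (2/3)^4, ...
    return (1 / 3)**n if n % 2 == 1 else (2 / 3)**n

for n in range(1, 9):
    ratio = term(n + 1) / term(n)     # oscillates between very large and very small values
    root = term(n) ** (1.0 / n)       # stays at 1/3 or 2/3, safely below 1
    print(n, ratio, root)
```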

What about non-absolute convergence (aka “conditional convergence”)

Series like \sum_{k=1} (-1)^{k+1} {1 \over k} converge but do NOT converge absolutely (p-test). On one hand, such series are a LOT of fun..but the convergence is very slow and unstable, and so one might say that these series are not as important as series that converge absolutely. But there is a lot of interesting mathematics to be had here.

So, let’s chat about these a bit.

We say \sum c_k is conditionally convergent if the series converges but \sum |c_k| diverges.

One elementary tool for dealing with these is the alternating series test:

for this, let c_k >0 and for all k, c_{k+1} < c_k .

Then \sum_{k=1} (-1)^{k+1} c_k converges if and only if c_k \rightarrow 0 as a sequence.

That the sequence of terms goes to zero is necessary. That it is sufficient in this alternating case: first note that the partial sums are bounded above by c_1 and below by c_1 - c_2 (as the magnitudes of the terms get steadily smaller). Note also that S_{2k+2} = S_{2k} +c_{2k+1} - c_{2k+2} > S_{2k} , so the partial sums of even index form an increasing, bounded sequence and therefore converge to some limit, say, L . But S_{2k+1} = S_{2k} + c_{2k+1} and c_{2k+1} \rightarrow 0 , so by a routine “epsilon-N” argument the odd partial sums converge to L as well.

Of course, there are conditionally convergent series that are NOT alternating. And conditionally convergent series have some interesting properties.

One of the most interesting properties is that such series can be “rearranged” (a “derangement” in Knopp’s book) to either converge to any number of our choice, or to diverge to infinity, or to have no limit at all.

Here is an outline of the arguments:

To rearrange a series to converge to L , start with the positive terms (whose sum must diverge, as the series is conditionally convergent) and add them up until the partial sum first exceeds L ; stop just after L is exceeded. Call that partial sum u_1 . Note: this could take zero terms. Now use the negative terms to go to the left of L , stopping with the first partial sum past L ; call that l_1 . Then move to the right, past L again, with the positive terms; note that the overshoot is smaller, as the terms are smaller. This is u_2 . Then go back again to get l_2 to the left of L . Repeat.

Note that at every stage, every partial sum after the first pass lies between some u_i and l_i , and the u_i, l_i bracket L with the gap between them shrinking to become arbitrarily small.

To rearrange a series to diverge to infinity: Add the positive terms to exceed 1. Add a negative term. Then add the terms to exceed 2. Add a negative term. Repeat this for each positive integer n .

Have fun with this; you can have the partial sums end up all over the place.
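Here is a minimal sketch of the rearrangement idea applied to the alternating harmonic series, steering the partial sums toward a target of 2 (plain Python; purely illustrative):

```python
def rearranged_partial_sum(target, n_terms=100000):
    pos = iter(1.0 / k for k in range(1, 10**7, 2))     # 1, 1/3, 1/5, ...
    neg = iter(-1.0 / k for k in range(2, 10**7, 2))    # -1/2, -1/4, -1/6, ...
    s = 0.0
    for _ in range(n_terms):
        s += next(pos) if s <= target else next(neg)    # overshoot, then correct, repeatedly
    return s

print(rearranged_partial_sum(2.0))   # close to 2.0, even though the usual sum is ln(2)
```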

That’s it for now; I might do power series later.

May 10, 2021

Series convergence tests: the “harder to use in calculus 1” tests may well be the most useful.

I talked about the root and ratio tests here, and how the root test is the stronger of the two tests. What I should point out is that the proof of the root test depends on the basic comparison test.

And so..a professor on Twitter asked:

Of course, one proves the limit comparison test by the direct comparison test. But in a calculus course, the limit comparison test might appear to be more readily useful..example:

Show \sum {1 \over k^2-1} converges.

So..what about the direct comparison test?

As someone pointed out: the direct comparison test can work very well when you don’t know much about the terms you are summing (in the example below, the entries of a matrix).

One example can be found when one shows that the matrix exponential e^A is defined, where A is an n \times n matrix.

For those unfamiliar: e^A = \sum^{\infty}_{k=0} {A^k \over k!} , where the powers make sense as A is square and the series is summed entry by entry.

What enables convergence is the factorial in the denominators of the individual terms; the i-j’th element of each A^k can get only so large.

But how does one prove convergence?

The usual way is to dive into matrix norms; one that works well is |A| = \sum_{(i,j)} |a_{i,j}| , that is, just sum up the absolute values of the elements (the “taxi cab” norm, or l_1 norm).

Then one can show |AB| \leq |A||B| and |a_{i,j}| \leq |A| and together this implies the following:

For any index k where a^k_{i,j} is the i-j’th element of A^k we have:

| a^k_{i,j}  | \leq |A^k| \leq |A|^k

It then follows that | [ e^A ]_{i,j} | \leq \sum^{\infty}_{k=0} {|A^k |\over k!} \leq  \sum^{\infty}_{k=0} {|A|^k \over k!} =e^{|A|} . Therefore every series that determines an entry of the matrix e^A is absolutely convergent by direct comparison, and is therefore a convergent series.
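Here is a small numerical sketch of the comparison in action, assuming NumPy and SciPy are available (the matrix A is just a made-up example):

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[0.0, 1.0], [-2.0, 0.5]])
l1 = lambda M: np.abs(M).sum()          # the "taxi cab" norm used above

partial = np.zeros_like(A)
term = np.eye(2)                        # A^0 / 0!
for k in range(1, 30):
    partial = partial + term
    term = term @ A / k                 # A^k / k!

print(partial)                          # agrees with expm(A) to many digits
print(expm(A))
print(l1(partial), np.exp(l1(A)))       # the partial sums respect the e^{|A|} bound
```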

July 14, 2020

An alternative to trig substitution, sort of..

Ok, just for fun: \int \sqrt{1+x^2} dx =

The usual is to use x =tan(t), dx =sec^2(t) dt which transforms this to the dreaded \int sec^3(t) dt integral, which is a double integration by parts.
Is there a way out? I think so, though the price one pays is a trickier conversion back to x.

Let’s try x =sinh(t) \rightarrow dx = cosh(t) dt , so upon substituting we obtain \int |cosh(t)|cosh(t) dt , and noting that cosh(t) > 0 always:

\int cosh^2(t)dt Now this can be integrated by parts: let u=cosh(t), dv = cosh(t) dt \rightarrow du =sinh(t) dt , v = sinh(t)

So \int cosh^2(t)dt = cosh(t)sinh(t) -\int sinh^2(t)dt but this easily reduces to:

\int cosh^2(t)dt = cosh(t)sinh(t) -\int cosh^2(t)-1 dt \rightarrow 2\int cosh^2(t)dt  = cosh(t)sinh(t) +t + C

Division by 2: \int cosh^2(t)dt = \frac{1}{2}(cosh(t)sinh(t)+t)+C

That was easy enough.

But we now have the conversion back to x : \frac{1}{2}cosh(t)sinh(t) \rightarrow \frac{1}{2}x \sqrt{1+x^2} , since cosh(t) = \sqrt{1+sinh^2(t)} = \sqrt{1+x^2} .

So far, so good. But what about t = arcsinh(x) ?

Write: sinh(t) = \frac{e^{t}-e^{-t}}{2} =  x \rightarrow e^{t}-e^{-t} =2x \rightarrow e^{t}-2x -e^{-t} =0

Now multiply both sides by e^{t} to get e^{2t}-2xe^t -1 =0 and use the quadratic formula to get e^t = \frac{1}{2}(2x\pm \sqrt{4x^2+4}) \rightarrow e^t = x \pm \sqrt{x^2+1}

We need e^t > 0 so e^t = x + \sqrt{x^2+1} \rightarrow t = ln|x + \sqrt{x^2+1}| and that is our integral:

\int \sqrt{1+x^2} dx = \frac{1}{2}x \sqrt{1+x^2} + \frac{1}{2} ln|x + \sqrt{x^2+1}| + C

I guess that this isn’t that much easier after all.
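Still, the answer is easy to check numerically; here is a minimal sketch assuming SciPy is available:

```python
from math import sqrt, log
from scipy.integrate import quad

F = lambda x: 0.5 * x * sqrt(1 + x * x) + 0.5 * log(x + sqrt(x * x + 1))  # antiderivative above
val, _ = quad(lambda x: sqrt(1 + x * x), 0, 1)
print(val, F(1) - F(0))   # the two numbers agree
```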

July 12, 2020

Logarithmic differentiation: do we not care about domains anymore?

Filed under: calculus, derivatives, elementary mathematics, pedagogy — collegemathteaching @ 11:29 pm

The introduction is for a student who might not have seen logarithmic differentiation before: (and yes, this technique is extensively used..for example it is used in the “maximum likelihood function” calculation frequently encountered in statistics)

Suppose you are given, say, f(x) =sin(x)e^x(x-2)^3(x+1) and you are told to calculate the derivative.

Calculus texts often offer the technique of logarithmic differentiation: write ln(f(x)) = ln(sin(x)e^x(x-2)^3(x+1)) = ln(sin(x)) + x + 3ln(x-2) + ln(x+1)
Now differentiate both sides: (ln(f(x)))' = \frac{f'(x)}{f(x)}  = \frac{cos(x)}{sin(x)} + 1 + \frac{3}{x-2} + {1 \over x+1}

Now multiply both sides by f(x) to obtain

f'(x) = f(x)(\frac{cos(x)}{sin(x)} + 1 + \frac{3}{x-2} + {1 \over x+1}) = sin(x)e^x(x-2)^3(x+1)(\frac{cos(x)}{sin(x)} + 1 + \frac{3}{x-2} + {1 \over x+1})

And this is correct…sort of. Why I say “sort of”: what happens at, say, x = 0 ? The derivative certainly exists there, but what about that second factor? Yes, the sin(x) gets cancelled out by the first factor, but AS WRITTEN, there is an oh-so-subtle problem with domains.

You can substitute x = k \pi ( k an integer) only after simplifying, which one might see as a limit process.
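A small SymPy sketch of the issue (assuming SymPy is available): the log-differentiation expression agrees with f'(x) as a simplified function, but naive evaluation at x = 0 runs into cos(0)/sin(0) :

```python
import sympy as sp

x = sp.symbols('x')
f = sp.sin(x) * sp.exp(x) * (x - 2)**3 * (x + 1)
log_form = f * (sp.cos(x) / sp.sin(x) + 1 + 3 / (x - 2) + 1 / (x + 1))

print(sp.simplify(sp.diff(f, x) - log_form))   # 0: the expressions agree after simplification
print(sp.diff(f, x).subs(x, 0))                # -8: the derivative certainly exists at x = 0
print(log_form.subs(x, 0))                     # nan: as written, cos(0)/sin(0) is undefined
```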

But let’s stop and take a closer look at the whole process: we started with f(x) = g_1(x) g_2(x) ...g_n(x) and then took the log of both sides. Where is the log defined? And when does ln(ab) = ln(a) + ln(b) ? You got it: this only works when a > 0, b > 0 .

So, on the face of it, ln(g_1 (x) g_2(x) ...g_n(x)) = ln(g_1(x) ) + ln(g_2(x) ) + ...ln(g_n(x)) is justified only when each g_i(x) > 0 .

Why can we get away with ignoring all of this, at least in this case?

Well, here is why:

1. If f(x) \neq 0 is a differentiable function then \frac{d}{dx} ln(|f(x)|) = \frac{f'(x)}{f(x)}
Yes, this is covered in the \int {dx \over x} material, but here goes: write

|f(x)| =   \begin{cases}      f(x) ,& \text{if } f(x) > 0 \\      -f(x),              & \text{otherwise}  \end{cases}

Now if f(x) > 0 we get { d \over dx} ln(f(x)) = {f'(x) \over f(x) } as usual. If f(x) < 0 then |f(x)| = -f(x) , so |f(x)|' = (-f(x))' = -f'(x) and in either case:

\frac{d}{dx} ln(|f(x)|) = \frac{f'(x)}{f(x)} as required.

THAT is the workaround for calculating {d \over dx } ln(g_1(x)g_2(x)..g_n(x)) where g_1(x)g_2(x)..g_n(x) \neq 0 : just calculate {d \over dx } ln(|g_1(x)g_2(x)..g_n(x)|) , noting that |g_1(x)g_2(x)..g_n(x)| = |g_1(x)| |g_2(x)|...|g_n(x)| .

Yay! We are almost done! But, what about the cases where at least some of the factors are zero at, say x= x_0 ?

Here, we have to bite the bullet and admit that we cannot take the log of the product where any of the factors have a zero, at that point. But this is what we can prove:

Given that g_1(x) g_2(x)...g_n(x) is a product of differentiable functions and g_1(a) = g_2(a) = ... = g_k(a) = 0 for some k \leq n , then
(g_1g_2...g_n)'(a) = lim_{x \rightarrow a}  g_1(x)g_2(x)..g_n(x) ({g_1'(x) \over g_1(x)} + {g_2'(x) \over g_2(x)} + ...+{g_n'(x) \over g_n(x)})

This works out to what we want by cancellation of factors.

Here is one way to proceed with the proof:

1. Suppose f, g are differentiable and f(a) = g(a) = 0 . Then (fg)'(a) = f'(a)g(a) + f(a)g'(a) = 0 and lim_{x \rightarrow a} f(x)g(x)({f'(x) \over f(x)} + {g'(x) \over g(x)}) = 0
2. Now suppose f, g are differentiable and f(a) =0 ,  g(a) \neq 0 . Then (fg)'(a) = f'(a)g(a) + f(a)g'(a) = f'(a)g(a) and lim_{x \rightarrow a} f(x)g(x)({f'(x) \over f(x)} + {g'(x) \over g(x)}) = f'(a)g(a)
3. Now apply the above to the product g_1(x) g_2(x)...g_n(x) of differentiable functions with g_1(a) = g_2(a) = ... = g_k(a) = 0 , k \leq n .
If k = n then (g_1g_2...g_n)'(a) = lim_{x \rightarrow a}  g_1(x)g_2(x)..g_n(x) ({g_1'(x) \over g_1(x)} + {g_2'(x) \over g_2(x)} + ...+{g_n'(x) \over g_n(x)}) =0 by inductive application of 1.

If k < n then let g_1...g_k = f, g_{k+1} ...g_n  =g as in 2. Then by 2, we have (fg)'(a) =  f'(a)g(a) . Now this quantity is zero unless k = 1 and f'(a) \neq 0 . But in this case note that lim_{x \rightarrow a} g_1(x)g_2(x)...g_n(x)({g_1'(x) \over g_1(x)} + {g_2'(x) \over g_2(x)} + ...+ {g_n'(x) \over g_n(x)})  = lim_{x \rightarrow a} g_2(x)...g_n(x)g_1'(x) =g(a)g_1'(a)

So there it is. Yes, it works ..with appropriate precautions.

July 10, 2020

This always bothered me about partial fractions…

Filed under: algebra, calculus, complex variables, elementary mathematics, integration by substitution — Tags: — collegemathteaching @ 12:03 am

Let’s look at an “easy” starting example: write \frac{1}{(x-1)(x+1)} = \frac{A}{x-1} + \frac{B}{x+1}
We know how that goes: multiply both sides by (x-1)(x+1) to get 1 = A(x+1) + B(x-1) and then since this must be true for ALL x , substitute x=-1 to get B = -{1 \over 2} and then substitute x = 1 to get A = {1 \over 2} . Easy-peasy.

BUT…why CAN you do such a substitution since the original domain excludes x =1, x = -1 ?? (and no, I don’t want to hear about residues and “poles of order 1”; this is calculus 2. )

Let’s start with \frac{1}{(x-1)(x+1)} = \frac{A}{x-1} + \frac{B}{x+1} with the restricted domain, say x \neq 1 .
Now multiply both sides by x-1 and note that, with the restricted domain x \neq 1 we have:

\frac{1}{x+1}  = A + \frac{B(x-1)}{x+1} . But both sides are equal on the domain (-1, 1) \cup (1, \infty) , and the limit of the left hand side is lim_{x \rightarrow 1} {1 \over x+1 } = {1 \over 2} . So the right hand side has a limit, which exists and is equal to A . So the result follows, and this works for the calculation of B as well.

Yes, no engineer will care about this. But THIS is the reason we can substitute the non-domain points.

As an aside: if you are trying to solve something like {x^2 + 3x + 2 \over (x^2+1)(x-3) } = {Ax + B \over x^2+1 } + {C \over x-3 } , one can do the denominator clearing and, as appropriate, substitute x = i and compare real and imaginary parts..and yes, now you can use poles and residues.
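If you want to check the decomposition in the aside, a one-line SymPy sketch (assuming SymPy is available) does it:

```python
import sympy as sp

x = sp.symbols('x')
expr = (x**2 + 3*x + 2) / ((x**2 + 1) * (x - 3))
print(sp.apart(expr))   # 2/(x - 3) - x/(x**2 + 1), i.e. A = -1, B = 0, C = 2
```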

March 16, 2019

The beta function integral: how to evaluate them

My interest in “beta” functions comes from their utility in Bayesian statistics. A nice 78 minute introduction to Bayesian statistics and how the beta distribution is used can be found here; you need to understand basic mathematical statistics concepts such as “joint density”, “marginal density”, “Bayes’ Rule” and “likelihood function” to follow the youtube lecture. To follow this post, one should know the standard “3 semesters” of calculus and know what the gamma function is (the extension of the factorial function to the real numbers); previous exposure to the standard “polar coordinates” proof that \int^{\infty}_{-\infty} e^{-x^2} dx = \sqrt{\pi} would be very helpful.

So, what is the beta function? It is \beta(a,b) = \frac{\Gamma(a) \Gamma(b)}{\Gamma(a+b)} where \Gamma(x) = \int_0^{\infty} t^{x-1}e^{-t} dt . Note that \Gamma(n+1) = n! for non-negative integers n . The gamma function is the unique “logarithmically convex” extension of the factorial function to the real line, where “logarithmically convex” means that the logarithm of the function is convex; that is, the second derivative of the log of the function is positive. Roughly speaking, this means that the function exhibits growth behavior similar to (or greater than) that of e^{x^2} .

Now it turns out that the beta density function is defined as follows: \frac{\Gamma(a+b)}{\Gamma(a)\Gamma(b)} x^{a-1}(1-x)^{b-1} for 0 < x < 1 ; one can see that the normalizing integral \int_0^1 x^{a-1}(1-x)^{b-1} dx is proper for a, b \geq 1 and a convergent improper integral when 0 < a < 1 or 0 < b < 1 .

I'll do this in two steps. Step one will convert the beta integral into an integral involving powers of sine and cosine. Step two will be to write \Gamma(a) \Gamma(b) as a product of two integrals, do a change of variables and convert to an improper integral on the first quadrant. Then I'll convert to polar coordinates to show that this integral is equal to \Gamma(a+b) \beta(a,b)

Step one: converting the beta integral to a sine/cosine integral. Restrict t \in [0, \frac{\pi}{2}] and then do the substitution x = sin^2(t), dx = 2 sin(t)cos(t) dt . Then the beta integral becomes: \int_0^1 x^{a-1}(1-x)^{b-1} dx = 2\int_0^{\frac{\pi}{2}} (sin^2(t))^{a-1}(1-sin^2(t))^{b-1} sin(t)cos(t)dt = 2\int_0^{\frac{\pi}{2}} (sin(t))^{2a-1}(cos(t))^{2b-1} dt

Step two: transforming the product of two gamma functions into a double integral and evaluating using polar coordinates.

Write \Gamma(a) \Gamma(b) = \int_0^{\infty} x^{a-1} e^{-x} dx  \int_0^{\infty} y^{b-1} e^{-y} dy

Now do the conversion x = u^2, dx = 2udu, y = v^2, dy = 2vdv to obtain:

\int_0^{\infty} 2u^{2a-1} e^{-u^2} du  \int_0^{\infty} 2v^{2b-1} e^{-v^2} dv (there is a tiny amount of algebra involved)

From which we now obtain

4\int^{\infty}_0 \int^{\infty}_0 u^{2a-1}v^{2b-1} e^{-(u^2+v^2)} dudv

Now we switch to polar coordinates, remembering the rdrd\theta that comes from evaluating the Jacobian of u = rcos(\theta), v = rsin(\theta)

4 \int^{\frac{\pi}{2}}_0 \int^{\infty}_0 r^{2a +2b -1} (cos(\theta))^{2a-1}(sin(\theta))^{2b-1} e^{-r^2} dr d\theta

This splits into two integrals:

2 \int^{\frac{\pi}{2}}_0 (cos(\theta))^{2a-1}(sin(\theta))^{2b-1} d \theta 2\int^{\infty}_0 r^{2a +2b -1}e^{-r^2} dr

The first of these integrals is just \beta(a,b) (by step one and the symmetry \beta(a,b) = \beta(b,a) ), so now we have:

\Gamma(a) \Gamma(b) = \beta(a,b) 2\int^{\infty}_0 r^{2a +2b -1}e^{-r^2} dr

The second integral: we just use r^2 = x \rightarrow 2rdr = dx \rightarrow \frac{1}{2}\frac{1}{\sqrt{x}}dx = dr to obtain:

2\int^{\infty}_0 r^{2a +2b -1}e^{-r^2} dr = \int^{\infty}_0 x^{a+b-\frac{1}{2}} e^{-x} \frac{1}{\sqrt{x}}dx = \int^{\infty}_0 x^{a+b-1} e^{-x} dx =\Gamma(a+b) (yes, I cancelled the 2 with the 1/2)

And so the result follows.
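For those who like to check such identities numerically, here is a minimal sketch (assuming SciPy is available) comparing \int_0^1 x^{a-1}(1-x)^{b-1} dx with \frac{\Gamma(a)\Gamma(b)}{\Gamma(a+b)} for a few values of a and b :

```python
from math import gamma
from scipy.integrate import quad

def beta_integral(a, b):
    val, _ = quad(lambda t: t**(a - 1) * (1 - t)**(b - 1), 0, 1)
    return val

for a, b in [(2.0, 3.0), (1.5, 0.5), (2.5, 1.5)]:
    print((a, b), beta_integral(a, b), gamma(a) * gamma(b) / gamma(a + b))
```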

That seems complicated for a simple little integral, doesn’t it?
