College Math Teaching

September 8, 2018

Proving a differentiation formula for f(x) = x ^(p/q) with algebra

Filed under: calculus, derivatives, elementary mathematics, pedagogy — collegemathteaching @ 1:55 am

Yes, I know that the proper way to do this is to prove the derivative formula for f(x) = x^n and then use, say, the implicit function theorem or perhaps the chain rule.

But an early question asked students to use the difference quotient method to find the derivative function (ok, the “gradient”) for f(x) = x^{\frac{3}{2}} . And yes, one way to do this is to simplify the difference quotient \frac{t^{\frac{3}{2}} -x^{\frac{3}{2}} }{t-x} by factoring t^{\frac{1}{2}} -x^{\frac{1}{2}} from both the numerator and the denominator of the difference quotient. But this is rather ad hoc, I think.

So what would one do with, say, f(x) = x^{\frac{p}{q}} where p, q are positive integers?

One way: look at the difference quotient \frac{t^{\frac{p}{q}}-x^{\frac{p}{q}}}{t-x} and do the following (before attempting a limit, of course): let u= t^{\frac{1}{q}}, v =x^{\frac{1}{q}} , at which point our difference quotient becomes \frac{u^p-v^p}{u^q -v^q}

Now it is clear that u-v is a common factor, but HOW it factors is essential.

So let’s look at a little bit of elementary algebra: one can show:

x^{n+1} - y^{n+1} = (x-y) (x^n + x^{n-1}y + x^{n-2}y^2 + ...+ xy^{n-1} + y^n)

= (x-y)\sum^{n}_{i=0} x^{n-i}y^i (hint: very much like the geometric sum proof).

Using this:

\frac{u^p-v^p}{u^q -v^q} = \frac{(u-v)\sum^{p-1}_{i=0} u^{p-1-i}v^i}{(u-v)\sum^{q-1}_{i=0} u^{q-1-i}v^i}=\frac{\sum^{p-1}_{i=0} u^{p-1-i}v^i}{\sum^{q-1}_{i=0} u^{q-1-i}v^i}

Now as t \rightarrow x we have u \rightarrow v (for the purposes of substitution) so we end up with:

\frac{\sum^{p-1}_{i=0} v^{p-1-i}v^i}{\sum^{q-1}_{i=0} v^{q-1-i}v^i}  = \frac{pv^{p-1}}{qv^{q-1}} = \frac{p}{q}v^{p-q} (the number of terms is easy to count).

Now back substitute to obtain \frac{p}{q} x^{\frac{(p-q)}{q}} = \frac{p}{q} x^{\frac{p}{q}-1} which, of course, is the familiar formula.

Note that this algebraic identity could have been used for the old f(x) = x^n case to begin with.
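As a quick numerical sanity check of the argument above, here is a minimal Python sketch. The function names are my own, and it assumes x > 0 so that the q-th roots are real. It evaluates the difference quotient in its cancelled "sum over sum" form and compares it with \frac{p}{q} x^{\frac{p}{q}-1} .

```python
def cancelled_quotient(t: float, x: float, p: int, q: int) -> float:
    """(t^(p/q) - x^(p/q)) / (t - x), written after cancelling u - v.

    Uses u = t^(1/q), v = x^(1/q); assumes t, x > 0.
    """
    u = t ** (1.0 / q)
    v = x ** (1.0 / q)
    numerator = sum(u ** (p - 1 - i) * v ** i for i in range(p))    # p terms
    denominator = sum(u ** (q - 1 - i) * v ** i for i in range(q))  # q terms
    return numerator / denominator


def power_rule(x: float, p: int, q: int) -> float:
    """The familiar formula (p/q) * x^(p/q - 1)."""
    return (p / q) * x ** (p / q - 1)


x, p, q = 2.0, 3, 2
for t in (2.1, 2.001, 2.00001, 2.0):
    print(t, cancelled_quotient(t, x, p, q), power_rule(x, p, q))
```

Because u - v has already been cancelled, the expression can be evaluated at t = x itself, where it agrees with the power rule up to rounding.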


February 22, 2018

What is going on here: sum of cos(nx)…

Filed under: analysis, derivatives, Fourier Series, pedagogy, sequences of functions, series, uniform convergence — collegemathteaching @ 9:58 pm

This started innocently enough; I was attempting to explain why we have to be so careful when we attempt to differentiate a power series term by term; that when one talks about infinite sums, the “sum of the derivatives” might fail to exist if the sum is infinite.

Anyone who is familiar with Fourier Series and the square wave understands this well:

\frac{4}{\pi} \sum^{\infty}_{k=1} \frac{1}{2k-1}sin((2k-1)x)  = (\frac{4}{\pi})( sin(x) + \frac{1}{3}sin(3x) + \frac{1}{5}sin(5x) +.....) yields the “square wave” function (plus zero at the jump discontinuities)

Here I graphed to 2k-1 = 21

Now the resulting function fails to even be continuous. But the resulting function is differentiable except for the points at the jump discontinuities and the derivative is zero for all but a discrete set of points.

(recall: here we have pointwise convergence; to get a differentiable limit, we need other conditions such as uniform convergence together with uniform convergence of the derivatives).

But, just for the heck of it, let’s differentiate term by term and see what we get:

(\frac{4}{\pi})\sum^{\infty}_{k=1} cos((2k-1)x) = (\frac{4}{\pi})(cos(x) + cos(3x) + cos(5x) + cos(7x) +.....)...

It is easy to see that this result doesn’t even converge to a function of any sort.

Example: let’s see what happens at x = \frac{\pi}{4}: cos(\frac{\pi}{4}) = \frac{1}{\sqrt{2}}

cos(\frac{\pi}{4}) + cos(3\frac{\pi}{4}) =0

cos(\frac{\pi}{4}) + cos(3\frac{\pi}{4}) + cos(5\frac{\pi}{4}) = -\frac{1}{\sqrt{2}}

cos(\frac{\pi}{4}) + cos(3\frac{\pi}{4}) + cos(5\frac{\pi}{4}) + cos(7\frac{\pi}{4}) = 0

And this repeats over and over again; no limit is possible.
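The cycling of the partial sums at x = \frac{\pi}{4} can be checked with a few lines of Python. This is only an illustrative sketch; the helper name is an assumption of mine.

```python
import math


def odd_cosine_partial_sums(x: float, n_terms: int) -> list[float]:
    """Running partial sums of cos(x) + cos(3x) + cos(5x) + ..."""
    total = 0.0
    sums = []
    for k in range(1, n_terms + 1):
        total += math.cos((2 * k - 1) * x)
        sums.append(total)
    return sums


sums = odd_cosine_partial_sums(math.pi / 4, 8)
print([f"{s:+.4f}" for s in sums])
```

Up to rounding, the printed values cycle through 1/\sqrt{2}, 0, -1/\sqrt{2}, 0 and then start over, so the sequence of partial sums has no limit.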

Something similar happens for x = \frac{p}{q}\pi where p, q are relatively prime positive integers.

But something weird is going on with this sum. I plotted the terms with 2k-1 \in \{1, 3, ...35 \}

(and yes, I am using \frac{\pi}{4} csc(x) as a type of “envelope function”)

BUT…if one, say, looks at cos(29x) + cos(31x) + cos(33x) + cos(35x)

we really aren’t getting a convergence (even at irrational multiples of \pi ). But SOMETHING is going on!

I decided to plot to cos(61x)

Something is going on, though it isn’t convergence. Note: by accident, I found that the pattern falls apart when I skipped one of the terms.

This is something to think about.

I wonder: for all x \in (0, \pi), sup_{n \in \{1, 3, 5, 7....\}}|\sum^{n}_{k \in \{1,3,..\}}cos(kx)| \leq |csc(x)| and we can somehow get close to csc(x) for given values of x by allowing enough terms…but the value of x is determined by how many terms we are using (not always the same value of x ).
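One observation that may bear on this question: multiplying by 2sin(x) and using 2sin(x)cos((2k-1)x) = sin(2kx) - sin((2k-2)x) makes the sum telescope, which gives \sum^{n}_{k=1} cos((2k-1)x) = \frac{sin(2nx)}{2sin(x)} for x \in (0, \pi) . The sketch below (function names are mine) checks this closed form numerically, together with the bound \frac{1}{2}|csc(x)| that it implies for the bare cosine sum without the \frac{4}{\pi} factor.

```python
import math


def odd_cosine_sum(x: float, n: int) -> float:
    """cos(x) + cos(3x) + ... + cos((2n-1)x), summed directly."""
    return sum(math.cos((2 * k - 1) * x) for k in range(1, n + 1))


def closed_form(x: float, n: int) -> float:
    """sin(2nx) / (2 sin(x)), from the telescoping identity."""
    return math.sin(2 * n * x) / (2 * math.sin(x))


for x in (0.3, 1.0, 2.5):
    for n in (1, 5, 18, 31):
        print(x, n, odd_cosine_sum(x, n), closed_form(x, n))
```

If the identity is right, the partial sums oscillate like sin(2nx) inside a csc-shaped envelope, which would explain both the visible pattern and the lack of convergence. It would also be consistent with the pattern falling apart when a term is skipped, since the telescoping then breaks.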

March 25, 2014

The error term and approximation of derivatives

I’ll go ahead and work with the common 3 point derivative formulas:

This is the three-point endpoint formula: (assuming that f has 3 continuous derivatives on the appropriate interval)

f'(x_0) = \frac{1}{2h}(-3f(x_0) + 4f(x_0+h) -f(x_0 + 2h)) + \frac{h^2}{3} f^{(3)}(\omega) where \omega is some point in the interval.

The three point midpoint formula is:

f'(x_0) = \frac{1}{2h}(f(x_0 + h) -f(x_0 -h)) -\frac{h^2}{6}f^{(3)}(\omega) .

These formulas can be derived either from the Taylor series centered at x_0 or by differentiating the Lagrange polynomial through the given points.

That isn’t the point of this note though.

The point: how can one demonstrate, by an example, the role the error term plays?

I suggest trying the following: let x vary from, say, 0 to 3 and let h = .25 . Now use the three point derivative estimates on the following functions:

1. f(x) = e^x .

2. g(x) = e^x + 10sin(\frac{\pi x}{.25}) .

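Note one: the three point estimates for the derivatives will be exactly the same for both f(x) and g(x) at the sample points x = 0, .25, .5, ... It is easy to see why: sin(\frac{\pi x}{.25}) vanishes at every multiple of .25 , so f and g agree at every point the formulas use.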

Note two: the “errors” will be very, very different. It is easy to see why: look at the third derivative term: for f(x) it is e^x , while for g(x) it is e^x -10(\frac{\pi}{.25})^3cos(\frac{\pi x}{.25}) .

The graphs show the story.


Clearly, the 3 point derivative estimates cannot distinguish these two functions for these “sample values” of x , but one can see how in the case of g , the degree that g wanders away from f is directly related to the higher order derivative of g .
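Here is a minimal Python sketch of this demonstration. The function names are my own, and the sample points are assumed to be the multiples of h = .25 . It applies both three point formulas to f and g and compares the results with the true derivative of g .

```python
import math

H = 0.25  # step size used in the demonstration


def f(x: float) -> float:
    return math.exp(x)


def g(x: float) -> float:
    return math.exp(x) + 10 * math.sin(math.pi * x / H)


def g_prime(x: float) -> float:
    """Exact derivative of g, for comparison."""
    return math.exp(x) + 10 * (math.pi / H) * math.cos(math.pi * x / H)


def midpoint(func, x0: float, h: float = H) -> float:
    """Three point midpoint estimate of func'(x0)."""
    return (func(x0 + h) - func(x0 - h)) / (2 * h)


def endpoint(func, x0: float, h: float = H) -> float:
    """Three point endpoint estimate of func'(x0)."""
    return (-3 * func(x0) + 4 * func(x0 + h) - func(x0 + 2 * h)) / (2 * h)


for i in range(13):  # x0 = 0, .25, ..., 3
    x0 = i * H
    print(f"{x0:5.2f}  mid f={midpoint(f, x0):9.4f}  mid g={midpoint(g, x0):9.4f}  "
          f"end g={endpoint(g, x0):9.4f}  true g'={g_prime(x0):10.4f}")
```

The estimates for f and g agree up to rounding, while the true derivative of g differs from them by roughly 40\pi at every sample point.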

March 14, 2014

Approximating the derivative and round off error: class demonstration

In numerical analysis we are covering “approximate differentiation”. One of the formulas we are using: f'(x_0) = \frac{f(x_0 + h) -f(x_0 -h)}{2h} - \frac{h^2}{6} f^{(3)}(\zeta) where \zeta is some number in [x_0 -h, x_0 + h] ; of course we assume that the third derivative is continuous in this interval.

The derivation can be done in a couple of ways: one can either use the degree 2 Lagrange polynomial through x_0-h, x_0, x_0 + h and differentiate or one can use the degree 2 Taylor polynomial expanded about x = x_0 and use x = x_0 \pm h and solve for f'(x_0) ; of course one runs into some issues with the remainder term if one uses the Taylor method.

But that isn’t the issue that I want to talk about here.

The issue: “what should we use for h ?” In theory, we should get a better approximation if we make h as small as possible. But if we are using a computer to make a numerical evaluation, we have to concern ourselves with round off error. So what we actually calculate will NOT be \frac{f(x_0 + h) -f(x_0 -h)}{2h} but rather \frac{\hat{f}(x_0 + h) -\hat{f}(x_0 -h)}{2h} where \hat{f}(x_0 \pm h) = f(x_0 \pm h) - e(x_0 \pm h) and e(x_0 \pm h) is the round off error incurred in calculating the function at x = x_0 \pm h (respectively).

So, it is an easy algebraic exercise to show that:

f'(x_0) - \frac{\hat{f}(x_0 + h) -\hat{f}(x_0 -h)}{2h} = - \frac{h^2}{6} f^{(3)}(\zeta)+\frac{e(x_0 +h) -e(x_0 -h)}{2h} and the magnitude of the actual error is bounded by \frac{h^2 M}{6} + \frac{\epsilon}{h} where M = max\{|f^{(3)}(\eta)|\} on some small neighborhood of x_0 and \epsilon is a bound on the round-off error of representing f(x_0 \pm h) .

It is an easy calculus exercise (“take the derivative and set equal to zero and check concavity” easy) to see that this error bound is a minimum when h = (\frac{3\epsilon}{M})^{\frac{1}{3}} .

Now, of course, it is helpful to get a “ball park” estimate for what \epsilon is. Here is one way to demonstrate this to the students: solve for \epsilon and obtain \frac{M h^3}{3} = \epsilon and then do some experimentation to determine \epsilon .

That is: obtain an estimate of the optimal h by using this “3 point midpoint” estimate for a known derivative near a value of x_0 for which M (a bound for the third derivative) is easy to obtain, and then use that h to obtain an educated guess for \epsilon .

Here are a couple of examples: one uses Excel and one uses MATLAB. I used f(x) = e^x at x = 0; of course f'(0) = 1 and M = 1 is reasonable here (just a tiny bit off). I did the 3-point estimation calculation for various values of h and saw where the error started to increase again.

Here is the Excel output for f(x) = e^x at x =0 and at x = 1 respectively. In the first case, use M = 1 and in the second M = e .

In the x = 0 case, we see that the error starts to increase again at about h = 10^{-5} ; the same sort of thing appears to happen for x = 1 .

So, in the first case, \epsilon is about \frac{1}{3} \times (10^{-5})^3 = 3.333 \times 10^{-16} ; it is roughly 10^{-15} at x =1 .

Note: one can also approach h by using powers of \frac{1}{2} instead; something interesting happens in the x = 0 case; the x = 1 case gives results similar to what we’ve shown. Reason (I think): 1 is easy to represent in base 2 and the powers of \frac{1}{2} can be represented exactly.

Now we turn to MATLAB and here we do something slightly different: we graph the error for different values of h . Since the values of h are very small, we use a -log_{10} scale by doing the following (approximating f'(0) for f(x) = e^x )

[Figure: MATLAB commands] By design, N = -log_{10}(H) . The graph looks like:


Now, the small error scale makes things hard to read, so we turn to using the log scale, this time on the y axis: let LE = -log_{10}(E) and run plot(N, LE):

[Figure: log-scale plot of the error] Sure enough, you can see where the peak is: about h = 10^{-5} , which agrees with the Excel result.
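The MATLAB commands themselves are not reproduced in the text, so the following is only a rough Python analogue of the experiment, not the original script. It computes the midpoint estimate of f'(0) for f(x) = e^x with h = 10^{-N} and prints the absolute error for each N . The names are my own, and the exact error values depend on the machine's floating point arithmetic.

```python
import math


def midpoint_error(h: float, x0: float = 0.0) -> float:
    """Absolute error of the three point midpoint estimate of d/dx e^x at x0."""
    estimate = (math.exp(x0 + h) - math.exp(x0 - h)) / (2 * h)
    return abs(estimate - math.exp(x0))


# N plays the role of -log10(h), as in the plots described above.
errors = {n: midpoint_error(10.0 ** -n) for n in range(1, 13)}

for n, err in errors.items():
    print(f"N = {n:2d}   h = 1e-{n:02d}   error = {err:.3e}")

best_n = min(errors, key=errors.get)
print("smallest error at N =", best_n)
```

In double precision one would expect the error to shrink like h^2 at first and then grow again once round-off dominates, with the turnaround somewhere near N = 5 .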

February 24, 2014

A real valued function that is differentiable at an isolated point

A friend of mine is covering the Cauchy-Riemann equations in his complex variables class and wondered if there is a real variable function that is differentiable at precisely one point.

The answer is “yes”, of course, but the example I could whip up on the spot is rather pathological.

Here is one example:

Let f be defined as follows:

f(x) =\left\{ \begin{array}{c} 0, x = 0 \\ \frac{1}{q^2}, x = \frac{p}{q} \\ x^2, x \ne \frac{p}{q}  \end{array}\right.

That is, f(x) = x^2 if x is irrational or zero, and f(x) is \frac{1}{q^2} if x is rational and x = \frac{p}{q} where gcd(p,q) = 1 .

Now calculate lim_{x \rightarrow 0+} \frac{f(x) - f(0)}{x-0} = lim_{x \rightarrow 0+} \frac{f(x)}{x}

Let \epsilon > 0 be given and choose a positive integer M so that M > \frac{1}{\epsilon} . Let 0 < \delta < \frac{1}{M} . Now if 0 < x < \delta and x is irrational, then \frac{f(x)}{x} = \frac{x^2}{x} = x < \frac{1}{M} < \epsilon .

Now the fun starts: if x is rational, then x = \frac{p}{q} < \frac{1}{M} with p, q positive integers, so q > pM \geq M and \frac{f(x)}{x} = \frac{\frac{1}{q^2}}{\frac{p}{q}} = \frac{1}{qp} \leq \frac{1}{q} < \frac{1}{M} < \epsilon .

We looked at the right hand limit; the left hand limit works in the same manner.

Hence the derivative of f exists at x = 0 and is equal to zero. But zero is the only place where this function is even continuous: f(x) > 0 whenever x \neq 0 , yet for any open interval I , inf \{|f(x)| : x \in I \} = 0 .
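The rational case of the estimate can be checked exactly with Python's fractions module. This sketch only covers rational inputs (irrationals cannot be represented exactly), and the helper names and the search cutoff max_q are assumptions of mine.

```python
from fractions import Fraction


def f_on_rationals(x: Fraction) -> Fraction:
    """f(p/q) = 1/q^2 for a nonzero rational in lowest terms, and f(0) = 0."""
    if x == 0:
        return Fraction(0)
    return Fraction(1, x.denominator ** 2)  # Fraction stores p/q in lowest terms


def worst_quotient(delta: Fraction, max_q: int) -> Fraction:
    """Largest f(x)/x over rationals 0 < x < delta with denominator <= max_q."""
    worst = Fraction(0)
    for q in range(1, max_q + 1):
        for p in range(1, q):
            x = Fraction(p, q)
            if x < delta:
                worst = max(worst, f_on_rationals(x) / x)
    return worst


delta = Fraction(1, 10)
print(worst_quotient(delta, 200))  # stays below delta, as the argument predicts
```

For \delta = \frac{1}{10} the worst case found is at x = \frac{1}{11} , where \frac{f(x)}{x} = \frac{1}{11} , in line with the bound \frac{1}{qp} < \frac{1}{M} .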

August 4, 2012

Day 2, Madison MAA Mathfest

The day started with a talk by Karen King from the National Council of Teachers of Mathematics.
I usually find math education talks to be dreadful, but this one was pretty good.

The talk was about the importance of future math teachers (K-12) actually having some math background. However, she pointed out that students just having passed math courses didn’t imply that they understood the mathematical issues that they would be teaching…and it didn’t imply that their students would do better.

She gave an example: about half of those seeking to teach high school math couldn’t explain why “division by zero” was undefined! They knew that it was undefined but couldn’t explain why. I found that astonishing since I knew that in high school.

Later, she pointed out that potential teachers with a math degree didn’t understand what the issues were in defining a number like 2^{\pi} . Of course, a proper definition of this concept requires at least limits or at least a rigorous definition of the log function and she was well aware that the vast majority of high school students aren’t ready for such things. Still, the instructor should be; as she said “we all wave our hands from time to time, but WE should know when we are waving our hands.”

She stressed that we need to get future math teachers to get into the habit (she stressed the word: “habit”) of always asking themselves “why is this true” or “why is it defined in this manner”; too many of our math major courses are rule bound, and at times we write our exams in ways that reward memorization only.

Next, Bernd Sturmfels gave the second talk in his series; this was called Convex Algebraic Geometry.

You can see some of the material here. He also led this into the concept of “Semidefinite programming”.

The best I can tell: one looks at the objects studied by algebraic geometers (root sets of polynomials of several variables) and then takes an “affine slice” of these objects.

One example: the “n-ellipse” is the set of points on the plane that satisfy \sum^n_{k=1} \sqrt{(x-u_k)^2 + (y-v_k)^2} = d where (u_k, v_k) are n given points in the plane.

Questions: what is the degree of the polynomial that describes the ellipse? What happens if we let d tend to zero? What is the smallest d for which the ellipse is non-vanishing (Fermat-Weber point)? Note: the 1-ellipse is the circle, the 2-ellipse is what we usually think of as an ellipse, and the 3-ellipse has degree 8.

Note: these types of surfaces can be realized as the determinant of a symmetric matrix; these matrices have real eigenvalues. We can plot curves over which an eigenvalue goes to zero and then changes sign. This process leads to what is known as a spectrahedron; this is a type of shape in space. A polyhedron can be thought of as the spectrahedron of a diagonal matrix.

Then one can seek to optimize a linear function over a spectrahedron; this leads to semidefinite programming, which, in general, is roughly as difficult as linear programming.

One use: some global optimization problems can be reduced to a semidefinite programming problem (not all).

Shorter Talks
There was a talk by Bob Palais which discussed the role of Rodrigues in the discovery of the quaternions. The idea is that Rodrigues discovered the quaternions before Hamilton did; but he talked about these in terms of rotations in space.

There were a few talks about geometry and how to introduce concepts to students; of particular interest was the concept of a geodesic. Ruth Berger talked about the “fish swimming in jello” model: basically suppose you had a sea of jello where the jello’s density was determined by its depth with the most dense jello (turning to infinite density) at the bottom; and it took less energy for the fish to swim in the less dense regions. Then if a fish wanted to swim between two points, what path would it take? The geometry induced by these geodesics results in the upper half plane model for hyperbolic space.

Nick Scoville gave a talk about discrete Morse theory. Here is a user’s guide. The idea: take a simplicial complex and assign numbers (integers) to the points, segments, triangles, etc. The assignment has to follow rules; basically the boundary of a complex has to have a lower number than what it bounds (with one exception….) and such an assignment leads to a Morse function. Critical sets can be defined and the various Betti numbers can be calculated.

Christopher Frayer then talked about the geometry of cubic polynomials. This is more interesting than it sounds.
Think about this: remember Rolle’s Theorem from calculus? There is an analogue of this in complex variables called the Gauss-Lucas Theorem. Basically, the roots of the derivative lie in the convex hull of the roots of the polynomial. Then there is Marden’s Theorem for polynomials of degree 3. One can talk about polynomials that have a root of z = 1 and two other roots in the unit circle; then one can study where the roots of the derivative lie. For a certain class of these polynomials, there is a dead circle tangent to the unit circle at 1 which encloses no roots of the derivative.

May 2, 2012

Composition of an analytic function with a non-analytic one

Filed under: advanced mathematics, analysis, complex variables, derivatives, Power Series, series — collegemathteaching @ 7:39 pm

On a take home exam, I gave a function of the type: f(z) = sin(k|z|) and asked the students to explain why such a function was continuous everywhere but not analytic anywhere.

This really isn’t hard but that got me to thinking: if f is analytic at z_0 and NON CONSTANT, is f(|z|) ever analytic? Before you laugh, remember that in calculus class, ln|x| is differentiable wherever x \neq 0 .

Ok, go ahead and laugh; after playing around with the Cauchy-Riemann equations a bit, I found that there was a much easier way, if f is analytic on some open neighborhood of a real number.

Since f is analytic at z_0 , z_0 real, write f = \sum ^ {\infty}_{k =0} a_k (z-z_0)^k and then compose f with |z| and substitute into the series. Now if this composition is analytic, pull out the Cauchy-Riemann equations for the composed function f(x+iy) = u(x,y) + iv(x,y) and it is now very easy to see that v_x = v_y =0 on some open disk which then implies by the Cauchy-Riemann equations that u_x = u_y = 0 as well which means that the function is constant.

So, what if z_0 is NOT on the real axis?

Again, we write f(x + iy) = u(x,y) + iv(x,y) and we use u_{X}, u_{Y} (and likewise v_{X}, v_{Y} ) to denote the partials of these functions with respect to the first and second variables respectively. Now f(|z|) = f(\sqrt{x^2 + y^2} + 0i) = u(\sqrt{x^2 + y^2},0) + iv(\sqrt{x^2 + y^2},0) . Now turn to the Cauchy-Riemann equations and calculate:
\frac{\partial}{\partial x} u = u_{X}\frac{x}{\sqrt{x^2+y^2}}, \frac{\partial}{\partial y} u = u_{X}\frac{y}{\sqrt{x^2+y^2}}
\frac{\partial}{\partial x} v = v_{X}\frac{x}{\sqrt{x^2+y^2}}, \frac{\partial}{\partial y} v = v_{X}\frac{y}{\sqrt{x^2+y^2}}
Insert into the Cauchy-Riemann equations:
\frac{\partial}{\partial x} u = u_{X}\frac{x}{\sqrt{x^2+y^2}}= \frac{\partial}{\partial y} v = v_{X}\frac{y}{\sqrt{x^2+y^2}}
-\frac{\partial}{\partial x} v = -v_{X}\frac{x}{\sqrt{x^2+y^2}}= \frac{\partial}{\partial y} u = u_{X}\frac{y}{\sqrt{x^2+y^2}}

From this and from the assumption that y \neq 0 we obtain after a little bit of algebra:
u_{X}\frac{x}{y}= v_{X}, u_{X} = -v_{X}\frac{x}{y}
This leads to u_{X}\frac{x^2}{y^2} = v_{X}\frac{x}{y}=-u_{X} which implies either that u_{X} is zero, which leads to the rest of the partials being zero (by C-R), or that \frac{x^2}{y^2} = -1 , which is absurd.

So f must have been constant.
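A numerical illustration of the conclusion, for the exam example f(z) = sin(k|z|) with k = 1 : if a function is complex differentiable at z_0 , its difference quotient must approach the same value from every direction. The sketch below compares the quotient along the real and imaginary directions. The helper names, the test point and the step size are my own choices.

```python
import cmath


def sin_of_modulus(z: complex, k: float = 1.0) -> complex:
    """The composed function sin(k|z|)."""
    return cmath.sin(k * abs(z))


def directional_quotient(func, z0: complex, direction: complex, h: float = 1e-6) -> complex:
    """Difference quotient (func(z0 + h*d) - func(z0)) / (h*d) for a unit direction d."""
    step = h * direction
    return (func(z0 + step) - func(z0)) / step


z0 = 0.5 + 0.5j

# sin(|z|): the two directions give clearly different values.
print(directional_quotient(sin_of_modulus, z0, 1),
      directional_quotient(sin_of_modulus, z0, 1j))

# sin(z) itself, for comparison: the two directions agree.
print(directional_quotient(cmath.sin, z0, 1),
      directional_quotient(cmath.sin, z0, 1j))
```

For sin(|z|) one quotient is (nearly) real and the other (nearly) purely imaginary, so no single complex derivative can exist at z_0 ; for sin(z) both approximate cos(z_0) .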

February 7, 2012

Forgotten Basic Algebra: or why we shouldn’t rely on the “conjugate trick”

Filed under: basic algebra, calculus, derivatives, elementary mathematics, how to learn calculus, pedagogy — collegemathteaching @ 7:01 pm

I’ll admit that, after 20 years of teaching at the university level, I sometimes get lazy. But…as I age, I must resist that temptation even though at times I find myself muttering “I don’t have 30 extra f*cking minutes to figure out how to do this…”

But often if I stick with it, it doesn’t take 30 “f*cking” minutes. 🙂

Here is an example: I was trying to remember how to calculate lim_{z \rightarrow w} \frac{z^{1/3} - w^{1/3}}{z - w} and was trying to remember instead of think. I looked at an old calculus book…to no avail…then I was shamed into thinking. About 2-3 minutes later it struck me:
“you know how to simplify \frac{u - v}{u^3 - v^3} don’t you?”

Problem solved…shame WIN.

Of course, things like lim_{z \rightarrow w} \frac{z^{7/8} - w^{7/8}}{z - w} are easily converted to things like \frac{u^7 - v^7}{u^8 - v^8} , etc.

This leads to another point. Often when we teach lim_{h \rightarrow 0} \frac{\sqrt{x + h} - \sqrt{x}}{h} we use the “conjugate trick” which only works for square roots. The above method works for the other fractional powers.
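For concreteness, here is the cube root case worked out as a sketch, with u = z^{1/3} , v = w^{1/3} and assuming w \neq 0 :

```latex
\lim_{z \to w} \frac{z^{1/3} - w^{1/3}}{z - w}
  = \lim_{u \to v} \frac{u - v}{u^3 - v^3}
  = \lim_{u \to v} \frac{u - v}{(u - v)(u^2 + uv + v^2)}
  = \frac{1}{3v^2}
  = \frac{1}{3} w^{-2/3}
```

The same substitution with u = z^{1/8} , v = w^{1/8} turns the 7/8 example into \frac{u^7 - v^7}{u^8 - v^8} \rightarrow \frac{7v^6}{8v^7} = \frac{7}{8} w^{-1/8} .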

January 12, 2012

So you want to take a course in complex variables

Ok, what should you have at your fingertips prior to taking such a course?

I consider the following to be minimal prerequisites:

Basic calculus

1. limits (epsilon-delta, 2-d limits)

2. limit definition of the derivative

3. basic calculus differentiation and integration formulas:
chain rule, product rule, quotient rule, integration and differentiation of polynomials, log, exponentials, basic trig functions, hyperbolic trig functions, inverse trig functions.

4. Fundamental Theorem of calculus.

5. Sequences (convergence)

6. Series: geometric series test, ratio test, comparison tests

7. Power series: interval of convergence, absolute convergence

8. Power series: term by term differentiation, term by term integrals

9. Taylor/Power series for 1/(1-x), sin(x), cos(x), exp(x)

Multi-variable calculus

1. partial derivatives

2. gradient

3. parametrized curves

4. polar coordinates

5. line and path integrals

6. conservative vector fields

7. Green’s Theorem (for integration around a closed loop in a plane)

The challenge
Some of complex variables will look “just like calculus”. And some of the calculations WILL be “just like calculus”; for example, it will turn out that if \delta is any piecewise smooth curve running from z_1 to z_2 then \int_{\delta} e^z dz = e^{z_2} - e^{z_1} . But in many cases, the similarity vanishes and more care must be taken.

You will learn many things such as:
1. The complex function sin(z) is unbounded!

2. No non-constant everywhere differentiable function is bounded; compare that to f(x) = \frac{1}{1+x^2} in calculus.

3. Integrals can have some strange properties. For example, if \delta is the unit circle taken once around in the standard direction, \int_{\delta} Log(z) dz depends on where one chooses to start and stop, even if the start and stop points are the same!

4. You’ll come to understand why the Taylor series (expanded about x = 0 ) for \frac{1}{1+x^2} has radius of convergence equal to one…it isn’t just an artifact of the trick used to calculate the series.

5. You’ll come to understand that being differentiable on an open disk is a very strong condition for complex functions; in particular being differentiable on an open disk means being INFINITELY differentiable on that open disk (compare to f(x) = x^{4/3} which has one derivative but NOT two derivatives at x = 0 ).

There is much more, of course.

November 3, 2011

Finding a Particular solution: the Convolution Method

Background for students
Remember that when one is trying to solve a non-homogeneous differential equation, say:
y^{\prime \prime} +3y^{\prime} +2y = cos(t) one finds the general solution to y^{\prime \prime} +3y^{\prime} +2y = 0 (which is called the homogeneous solution; in this case it is c_1 e^{-2t} + c_2 e^{-t} ) and then finds some solution to y^{\prime \prime} +3y^{\prime} +2y = cos(t) . This solution, called a particular solution, will not have an arbitrary constant. Hence that solution cannot meet an arbitrary initial condition.

But adding the homogeneous solution to the particular solution yields a general solution with arbitrary constants which can be solved for to meet a given initial condition.

So how does one obtain a particular solution?

Students almost always learn the so-called “method of undetermined coefficients”; this is used when the driving function is a sine, cosine, e^{at} , a polynomial, or some sum and product of such things. Basically, one assumes that the particular solution has a certain form, substitutes it into the differential equation, and then determines the coefficients. For example, in our example, one might try y_p = Acos(t) + Bsin(t) and then substitute into the differential equation to solve for A and B . One could also try a complex form; that is, try y_p = Ae^{it} , determine A , and then use the real part of the solution.

A second method for finding particular solution is to use variation of parameters. Here is how that goes: one obtains two linearly independent homogeneous solutions y_1, y_2 and then seeks a particular solution of the form y_p = v_1y_1 + v_2y_2 where v_1 = -\int \frac{f(t)y_2}{W} dt and v_2 = \int \frac{f(t)y_1}{W} dt where W is the determinant of the Wronskian matrix. This method can solve differential equations like y^{\prime \prime} + y = tan(t) and sometimes is easier to use when the driving function is messy.
But sometimes it can lead to messy, non transparent solutions when “undetermined coefficients” is much easier; for example, try solving y^{\prime \prime} + 4y = cos(5t) with variation of parameters. Then try to do it with undetermined coefficients; though the answers are the same, one method yields a far “cleaner” answer.

There is a third way that gives a particular solution that meets a specific initial condition. Though this method can yield a not-so-easy-to-do-by-hand integral and can sometimes lead to what I might call an answer in obscured form, the answer is in the form of a definite integral that can be evaluated by numerical integration techniques (if one wants, say, the graph of a solution).

This method is the Convolution Method. Many texts introduce convolutions in the Laplace transform section but there is no need to wait until then.

What is a convolution?
We can define the convolution of two functions f and g to be:
f*g = \int_0^t g(u)f(t-u)du . Needless to say, f and g need to meet appropriate “integrability” conditions; this is usually not a problem in a differential equations course.

Example: if f = e^t, g=cos(t) , then f*g = \frac{1}{2}(e^t - cos(t) + sin(t)) . Notice that the dummy variable gets “integrated out” and the variable t remains.

There are many properties of convolutions that I won’t get into here; one interesting one is that f*g = g*f ; proving this is an interesting exercise in change of variable techniques in integration.
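Since the convolution is just a definite integral, it can be approximated numerically. The following is a minimal sketch using composite Simpson's rule. The function name and the number of subintervals are my own choices, not part of any particular library. It checks the e^t , cos(t) example and the symmetry f*g = g*f at one value of t .

```python
import math


def convolve(f, g, t: float, n: int = 2000) -> float:
    """Approximate (f*g)(t) = integral from 0 to t of g(u) f(t-u) du.

    Composite Simpson's rule with n (even) subintervals.
    """
    if t == 0:
        return 0.0
    h = t / n
    total = g(0.0) * f(t) + g(t) * f(0.0)  # endpoint terms u = 0 and u = t
    for i in range(1, n):
        u = i * h
        weight = 4 if i % 2 else 2
        total += weight * g(u) * f(t - u)
    return total * h / 3


t = 1.3
numeric = convolve(math.exp, math.cos, t)
exact = 0.5 * (math.exp(t) - math.cos(t) + math.sin(t))
print(numeric, exact)
print(convolve(math.cos, math.exp, t))  # g*f, for comparison with f*g
```

The same routine can be used to graph a convolution solution when the integral has no convenient closed form.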

The Convolution Method
If y(t) is a homogeneous solution to a second order linear differential equation with constant coefficients (and leading coefficient 1) that meets the initial conditions y(0)=0, y^{\prime}(0) =1 and f is the forcing function, then y_p = f*y is the particular solution that meets y_p(0)=0, y_p^{\prime}(0) =0 .

How might we use this method and why is it true? We’ll answer the “how” question first.

Suppose we want to solve y^{\prime \prime} + y = tan(t) . The homogeneous solution is y_h = c_1 cos(t) + c_2 sin(t) and it is easy to see that we need c_1 = 0, c_2 = 1 to meet the y_h(0)=0, y^{\prime}_h(0) =1 condition. So a particular solution is sin(t)*tan(t) = tan(t)*sin(t)= \int_0^t tan(u)sin(t-u)du = \int_0^t tan(u)(sin(t)cos(u)-cos(t)sin(u))du = sin(t)\int_0^t sin(u)du - cos(t)\int_0^t \frac{sin^2(u)}{cos(u)}du = sin(t)(1-cos(t)) -cos(t)ln|sec(t) + tan(t)| + sin(t)cos(t) = sin(t) -cos(t)ln|sec(t)+tan(t)|

This particular solution meets y_p(0)=0, y_p^{\prime}(0) = 0 .
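The closed form above can be sanity checked numerically. The sketch below uses central differences (step sizes chosen by me, and t restricted to (-\frac{\pi}{2}, \frac{\pi}{2}) where tan(t) is defined) to confirm that y_p = sin(t) - cos(t)ln|sec(t)+tan(t)| approximately satisfies y^{\prime \prime} + y = tan(t) along with both initial conditions.

```python
import math


def y_p(t: float) -> float:
    """Particular solution sin(t) - cos(t) ln|sec(t) + tan(t)|."""
    return math.sin(t) - math.cos(t) * math.log(abs(1 / math.cos(t) + math.tan(t)))


def residual(t: float, h: float = 1e-4) -> float:
    """y_p'' + y_p - tan(t), with y_p'' from a central second difference."""
    second = (y_p(t + h) - 2 * y_p(t) + y_p(t - h)) / h ** 2
    return second + y_p(t) - math.tan(t)


def first_derivative(t: float, h: float = 1e-5) -> float:
    """Central difference estimate of y_p'(t)."""
    return (y_p(t + h) - y_p(t - h)) / (2 * h)


for t in (0.25, 0.5, 1.0):
    print(t, residual(t))          # should be close to zero
print(y_p(0.0), first_derivative(0.0))  # both initial values should be near zero
```

The residuals are not exactly zero because of the finite difference approximation, but they should be small compared with tan(t) itself.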

Why does this work?
This is where “differentiation under the integral sign” comes into play. So we write f*y = \int_0^t f(u)y(t-u)du .
Then (f*y)^{\prime} = ?

Look at the convolution integral as g(x,z) = \int_0^x f(u)y(z-u)du . Now think of x(t) = t, z(t) = t . Then from calculus III: \frac{d}{dt} g(x,z) = g_x \frac{dx}{dt} + g_z \frac{dz}{dt} . Of course, \frac{dx}{dt}=\frac{dz}{dt}=1 .
g_x= f(x)y(z-x) by the Fundamental Theorem of calculus and g_z = \int_0^x f(u) y^{\prime}(z-u) du by differentiation under the integral sign.

So we let x = t, z = t and we see \frac{d}{dt} (f*y) = f(t)y(0) + \int_0^t f(u) y^{\prime}(t-u) du which equals \int_0^t f(u) y^{\prime}(t-u) du because y(0) = 0 . Now by the same reasoning \frac{d^2}{dt^2} (f*y) = f(t)y^{\prime}(0) + \int_0^t f(u) y^{\prime \prime}(t-u) du = f(t)+ \int_0^t f(u) y^{\prime \prime}(t-u) du because y^{\prime}(0) = 1 .
Now substitute into the differential equation y^{\prime \prime} + ay^{\prime} + by = f(t) and use the linear property of integrals to obtain f(t) + \int_0^t f(u) (y^{\prime \prime}(t-u) + ay^{\prime}(t-u) + by(t-u))du = f(t) + \int_0^t f(u) (0)du = f(t)

It is easy to see that (f*y)(0) = 0. Now check the derivative at t = 0 : \frac{d}{dt} (f*y)(0) = f(0)y(0) + \int_0^0 f(u) y^{\prime}(0-u) du = 0 .
