College Math Teaching

August 1, 2017

Numerical solutions to differential equations: I wish that I had heard this talk first

The MAA Mathfest in Chicago was a success for me. I talked about some other talks I went to; my favorite was probably the one given by Douglas Arnold. I wish I had heard this talk prior to teaching numerical analysis for the first time.

Confession: my research specialty is knot theory (a subset of 3-manifold topology); all of my graduate program classes have been in pure mathematics. I last took numerical analysis as an undergraduate in 1980 and as a “part time, not taking things seriously” masters student in 1981 (at UTSA of all places).

In each course…I. Made. A. “C”.

Needless to say, I didn’t learn a damned thing, even though both professors gave decent courses. The fault was mine.

But…I was what my department had, and away I went to teach the course. The first couple of times, I studied hard and stayed maybe 2 weeks ahead of the class.
Nevertheless, I found the material fascinating.

When it came to understanding how to find a numerical approximation to the solution of an ordinary differential equation (say, first order), you have: y' = f(t,y) with some initial value y(0) . All of the techniques use some sort of “linearization of the function” technique: given a step size, approximate the value of the function at the end of the next step. One chooses a step size and some sort of scheme to approximate an “average slope” (e. g. Runge-Kutta is one of the best known).

This is a lot like numerical integration, but in integration, one knows y'(t) for all values; here you have to infer y'(t) from previous approximations of y(t) . And there are things like error (often calculated by using some sort of approximation to y(t) such as, say, the Taylor polynomial) and error terms which are based on things like the second derivative.

And yes, I faithfully taught all that. But what was unknown to me is WHY one might choose one method over another...and much of this is based on the type of problem that one is attempting to solve.

And this is the idea: take something like the Euler method, where one estimates y(t+h) \approx y(t) + y'(t)h . You repeat this process a bunch of times thereby obtaining a sequence of approximations for y(t) . Hopefully, you get something close to the “true solution” (unknown to you) (and yes, the Euler method is fine for existence theorems and for teaching, but it is too crude for most applications).

But the Euler method DOES yield a piecewise linear approximation to SOME f(t) which might be close to y(t)  (a good approximation) or possibly far away from it (a bad approximation). And this f(t) that you actually get from the Euler (or other method) is important.
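A minimal Python sketch of the Euler process just described. The test problem y' = y, y(0) = 1 (true solution e^t) and the names euler_path and f are assumptions chosen for illustration; they are not from the talk. The list of points it returns is exactly the piecewise linear f(t) that the method "actually" produces.

```python
import math

def euler_path(f, t0, y0, h, n_steps):
    """Return the list of (t, y) points produced by n_steps Euler steps."""
    points = [(t0, y0)]
    t, y = t0, y0
    for _ in range(n_steps):
        y = y + h * f(t, y)   # y(t+h) is approximated by y(t) + y'(t) h
        t = t + h
        points.append((t, y))
    return points

# Assumed test problem: y' = y, y(0) = 1, so the true value at t = 1 is e.
f = lambda t, y: y
for n in (10, 100, 1000):
    t_end, y_end = euler_path(f, 0.0, 1.0, 1.0 / n, n)[-1]
    print(n, "steps: y(1) is about", y_end, "error", abs(math.e - y_end))
```

The error shrinks as the step size shrinks, but only slowly, which is one reason the Euler method is considered too crude for most applications.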

It turns out that some implicit methods (using an approximation to obtain y(t+h) and then using THAT to refine your approximation) can lead to a more stable system of f(t) (the solutions that you actually obtain…not the one that you are seeking to obtain) in that this system of “actual functions” might not have a source or a sink…and therefore never spiral out of control. But this comes from the mathematics of the type of equations that you are seeking to obtain an approximation for. This type of example was presented in the talk that I went to.
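To make the stability point concrete, here is a small Python sketch. It is NOT the example from the talk; it assumes the standard test problem y' = -20y, y(0) = 1 and compares the explicit Euler method with the backward (implicit) Euler method, one well-known implicit scheme. For this linear equation the implicit step can be solved by hand, so no root finder is needed.

```python
LAM = 20.0  # assumed decay rate for the test problem y' = -LAM * y, y(0) = 1

def explicit_euler(h, n_steps):
    w = 1.0
    for _ in range(n_steps):
        w = w + h * (-LAM * w)      # slope evaluated at the current point
    return w

def backward_euler(h, n_steps):
    w = 1.0
    for _ in range(n_steps):
        # w_next = w + h * (-LAM * w_next), solved by hand for w_next
        w = w / (1.0 + h * LAM)
    return w

h, n = 0.2, 5   # step size deliberately too large for the explicit method
print("explicit:", explicit_euler(h, n))   # magnitude grows with each step
print("implicit:", backward_euler(h, n))   # decays toward 0, like the true solution
```

With the same step size, the explicit iterates grow in magnitude while the implicit iterates decay, even though both are first order methods.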

In other words, we need a large toolbox of approximations to use because some methods work better with certain types of problems.

I wish that I had known that before…but I know it now. 🙂


July 13, 2015

Trolled by Newton’s Law of Cooling…

Filed under: calculus, differential equations, editorial — Tags: , — collegemathteaching @ 8:55 pm

From a humor website: there is a Facebook account called “customer service” who trolls customers making complaints. Though that isn’t a topic here, it is interesting to see Newton’s Cooling Law get mentioned:

[Image: newtonscoolinglaw]

November 22, 2014

One upside to a topologist teaching numerical analysis…

Yes, I was glad when we hired people with applied mathematics expertise; though I am enjoying teaching numerical analysis, it is killing me. My training is in pure mathematics (in particular, topology) and so class preparation is very intense for me.

But I so love being able to show the students the very real benefits that come from the theory.

Here is but one example: right now, I am talking about numerical solutions to “stiff” differential equations; basically, a differential equation is “stiff” if the magnitude of the derivative is several orders of magnitude larger than the magnitude of the solution.

A typical example is the differential equation y' = -\lambda y , y(0) = 1 for \lambda > 0 . Example: y' = -20y, y(0) = 1 . Note that the solution y(t) = e^{-20t} decays very quickly to zero though the derivative is 20 times larger in magnitude than the solution.

One uses such an equation to test a method to see if it works well for stiff differential equations. One such method is the Euler method: w_{i+1} = w_{i} + h f(t_i, w_i) which becomes w_{i+1} = w_i - h \lambda w_i . There is a way of assigning a method to a polynomial; in this case the polynomial is p(\mu) = \mu - (1-h\lambda) and if the roots of this polynomial have modulus less than 1, then the method will converge. Well here, the root is (1-h\lambda) and calculating: -1 < 1 - h \lambda < 1 which implies that 0 < h \lambda < 2 . This is a good reference.

So for \lambda = 20 we find that h has to be less than \frac{1}{10} . And so I ran Euler’s method for the initial problem on [0,1] and showed that the solution diverged wildly for using 9 intervals, oscillated back and forth (with equal magnitudes) for using 10 intervals, and slowly converged for using 11 intervals. It is just plain fun to see the theory in action.
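The experiment is easy to repeat. Below is a minimal Python sketch (the function name euler_decay is mine, not from any library) that runs Euler’s method for y' = -20y, y(0) = 1 on [0,1] with 9, 10 and 11 intervals. Each step multiplies the current iterate by 1 - h\lambda , so the size of that factor tells the whole story.

```python
def euler_decay(lam, n_intervals, t_end=1.0):
    """Euler iterates w_0, ..., w_n for y' = -lam*y, y(0) = 1 on [0, t_end]."""
    h = t_end / n_intervals
    w = [1.0]
    for _ in range(n_intervals):
        w.append(w[-1] - h * lam * w[-1])   # w_{i+1} = (1 - h*lam) * w_i
    return w

for n in (9, 10, 11):
    w = euler_decay(20.0, n)
    print(n, "intervals: factor", 1 - 20.0 / n, "last iterate", w[-1])
```

With 9 intervals the factor has modulus greater than 1 and the iterates grow; with 10 the factor is -1 and the iterates flip sign with constant magnitude; with 11 the modulus is less than 1 and the iterates slowly shrink.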

October 1, 2014

Osgood’s uniqueness theorem for differential equations

I am teaching a numerical analysis class this semester and we just started the section on differential equations. I want them to understand when we can expect to have a solution and when a solution satisfying a given initial condition is going to be unique.

We had gone over the “existence” theorem, which basically says: given y' = f(x,y) and initial condition y(x_0) = y_0 where (x_0,y_0) \in int(R) where R is some rectangle in the x,y plane, if f(x,y) is a continuous function over R, then we are guaranteed to have at least one solution to the differential equation which is guaranteed to be valid so long as (x, y(x)) stays in R.

I might post a proof of this theorem later; however an outline of how a proof goes will be useful here. With no loss of generality, assume that x_0 = 0 and the rectangle has the lines x = -a, x = a as vertical boundaries. Let \phi_0(x) = y_0 + f(0, y_0)x , the line of slope f(0, y_0) through (0, y_0) . Now partition the interval [-a, a] into -a, -\frac{a}{2}, 0, \frac{a}{2}, a and create a polygonal path as follows: use slope f(0, y_0) at (0, y_0) , slope f(\frac{a}{2}, y_0 + \frac{a}{2}f(0, y_0)) at (\frac{a}{2}, y_0 +  \frac{a}{2}f(0, y_0)) and so on to the right; reverse this process going left. The idea: we are using Euler’s differential equation approximation method to obtain an initial piecewise approximation. Then do this again for step size \frac{a}{4}, and so on.

In this way, we obtain an infinite family of continuous approximation curves. Because f(x,y) is continuous over R , it is also bounded, hence the curves have slopes whose magnitudes are bounded by some M. Hence this family is equicontinuous (for any given \epsilon one can use \delta = \frac{\epsilon}{M} in continuity arguments, no matter which curve in the family we are talking about). Of course, these curves are uniformly bounded, hence by the Arzela-Ascoli Theorem (not difficult) we can extract a subsequence of these curves which converges to a limit function.

Seeing that this limit function satisfies the differential equation isn’t that hard; if one chooses t, s \in (-a, a) close enough, one shows that | \frac{\phi_k(t) - \phi_k(s)}{(t-s)} - f(t, \phi(t))| < \epsilon .

Now for uniqueness. The usual theorem: if f(x,y) is continuous on R and for all (x, y_1), (x, y_2) \in R we have a K > 0 where |f(x,y_1)-f(x,y_2)| \le K|y_1-y_2| then the differential equation y'=f(x,y) has exactly one solution where \phi(0) = y_0 which is valid so long as the graph (x, \phi(x) ) remains in R .

Here is the proof: we have K > 0 where |f(x,y_1)-f(x,y_2)| \le K|y_1-y_2| < 2K|y_1-y_2| whenever y_1 \ne y_2 . This is clear but perhaps a strange step.
But now suppose that there are two solutions, say y_1(x) and y_2(x) where y_1(0) = y_2(0) . So set z(x) = y_1(x) -y_2(x) and note the following: z'(x) = y_1'(x) - y_2'(x) = f(x,y_1)-f(x,y_2) and |z'(x)| = |f(x,y_1)-f(x,y_2)| < 2K|z(x)| wherever z(x) \ne 0 . Now suppose that the solutions differ somewhere, say at some x_1 > 0 where z(x_1) > 0 . A Mean Value Theorem argument applied to z means that we can assume that we can select our x_1 so that z' > 0 on an interval about x_1 (since z(0) = 0 ).

So, on this selected interval about x_1 we have z'(x) < 2Kz (we can remove the absolute value signs).

Now we set up the differential equation: Y' = 2KY, Y(x_1) = z(x_1) which has a unique solution Y=z(x_1)e^{2K(x-x_1)} whose graph is always positive; Y(0) = z(x_1)e^{-2Kx_1} . Note that the graphs of z(x), Y(x) meet at (x_1, z(x_1)) . But z'(x_1) < 2Kz(x_1) = Y'(x_1) , so there is some \delta > 0 where z(x_1 - \delta) > Y(x_1 - \delta) .

But since z(0) = 0 < Y(0) , the graph of z must cross the graph of Y from below at some point between 0 and x_1 - \delta , which requires z'(x) \ge Y'(x) there. That is impossible: wherever the graphs meet we have z = Y > 0 , so Y'(x) = 2KY = 2Kz > z'(x) at that point.

So, no such point x_1 can exist.

Note that we used the fact that the solution to Y' = 2KY, Y(x_1) > 0 is always positive. Though this is an easy differential equation to solve, note the key fact that if we tried to separate the variables, we’d calculate \int_0^y \frac{1}{Kt} dt and find that this is an improper integral which diverges to positive \infty hence its primitive cannot change sign nor reach zero. So, if we had Y' =2g(Y) where \int_0^y \frac{1}{g(t)} dt is an infinite improper integral and g(t) > 0 , we would get exactly the same result for exactly the same reason.

Hence we can recover Osgood’s Uniqueness Theorem which states:

If f(x,y) is continuous on R and for all (x, y_1), (x, y_2) \in R we have |f(x,y_1)-f(x,y_2)| \le g(|y_1-y_2|) where g is a positive function and \int_0^y \frac{1}{g(t)} dt diverges to \infty (the divergence coming from the lower limit t = 0 ) then the differential equation y'=f(x,y) has exactly one solution where \phi(0) = y_0 which is valid so long as the graph (x, \phi(x) ) remains in R .

September 23, 2014

Ok, what do you see here? (why we don’t blindly trust software)

I had Dfield8 from MATLAB propose solutions to y' = t(y-2)^{\frac{4}{5}} meeting the following initial conditions:

y(0) = 0, y(0) = 3, y(0) = 2.

[Image: homeworkexistanceuniqueness]

Now, of course, one of these solutions is non-unique. But, of all of the solutions drawn: do you trust ANY of them? Why or why not?

Note: you really don’t have to do much calculus to see what is wrong with at least one of these. But, if you must know, the general solution is given by y(t) = (\frac{t^2}{10} +C)^5 + 2 (and, of course, the equilibrium solution y = 2 ). But that really doesn’t provide more information than the differential equation does.
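For readers who want to check the general solution without doing the calculus, here is a small Python sketch. It reads (y-2)^{\frac{4}{5}} as the real fifth root of (y-2)^4 (an assumption about the intended branch), picks C to match each initial condition, and compares a centered difference quotient with the right hand side. The helper names are made up for this illustration.

```python
def f(t, y):
    # (y-2)^(4/5) read as the real fifth root of (y-2)^4, so it is never negative
    return t * abs(y - 2.0) ** 0.8

def general_solution(C):
    return lambda t: (t * t / 10.0 + C) ** 5 + 2.0

def max_residual(y, ts, dt=1e-6):
    """Largest |y'(t) - f(t, y(t))| over ts, using a centered difference."""
    return max(abs((y(t + dt) - y(t - dt)) / (2 * dt) - f(t, y(t))) for t in ts)

ts = [0.1 * k for k in range(1, 30)]
candidates = {
    "y(0)=0": general_solution(-(2.0 ** 0.2)),   # C^5 = -2
    "y(0)=3": general_solution(1.0),             # C^5 = 1
    "y(0)=2, C=0": general_solution(0.0),
    "y(0)=2, equilibrium": lambda t: 2.0,
}
for name, y in candidates.items():
    print(name, "y(0) =", y(0.0), "max residual =", max_residual(y, ts))
```

The last two candidates both pass through (0, 2) and both satisfy the equation, which is the non-uniqueness mentioned above.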

By the way, here are some “correct” plots of the solutions, (up to uniqueness)

[Image: homeworkexistanceuniqueness2]

July 30, 2014

Differential equations mentioned in National Review

Filed under: differential equations, media — Tags: , — collegemathteaching @ 10:29 pm

(hat tip: Vox)

The National Review excerpt:

[…]One part insecure hipsterism, one part unwarranted condescension, the two defining characteristics of self-professed nerds are (a) the belief that one can discover all of the secrets of human experience through differential equations and (b) the unlovely tendency to presume themselves to be smarter than everybody else in the world. Prominent examples include […]

(emphasis mine).

Oh noes! I love differential equations! 🙂

Yeah, I am just having fun with the quote; I couldn’t resist mentioning an article in the popular press that mentions differential equations. I am not sure that I’ll teach the chapter on “all the secrets of human experience” in my upcoming differential equations class though.

December 20, 2013

Teaching the basics of numerical methods for solving differential equations

This semester we had about a week to spend on numerical methods. My goal was to give them the basics of how a numerical method works: given y' = f(t,y), y(t_0) = y_0 one selects a step size \delta t and then one rides the slope of the tangent line: y(t_0 + \delta t) = y_0 + (\delta t) f(t_0, y_0) and repeats the process. This is the basic Euler method; one can do an averaging process to get a better slope (Runge-Kutta) and one, if one desires, can use previous points in a multi-step process (e. g. Adams-Bashforth, etc.). Ultimately, it is starting at a point and using the slope at that point to get a piecewise linear approximation to a solution curve.

But the results of such a process confused students. Example: if one used a spreadsheet to do the approximation process (e. g. Euler or Runge-Kutta order 4), one has an output something like this:

[Image: eulerexample]

So, there is confusion. They know how to get from one row to the other and what commands to type. But….”where is the solution?” they ask.

One has to emphasize what is obvious to us: the x, y columns ARE the approximate solution…a piecewise approximation of one anyway. What we have is a set of ordered pairs x, y(x) and these ordered pairs are points on the approximate solution to the differential equation that runs through those points. One cannot assume that the students understand this, even when they can do the algorithm.
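To show what such a table looks like outside of a spreadsheet, here is a hedged Python sketch. It is not the spreadsheet from class: the equation y' = y(2-y), y(0) = 1 is borrowed from the exam question below, the step size 0.25 is arbitrary, and the exact solution 1 + tanh(t) is included only for comparison. Each printed row is one point of the approximate solution.

```python
import math

def f(t, y):
    return y * (2.0 - y)          # assumed example: y' = y(2 - y)

def euler_step(t, y, h):
    return y + h * f(t, y)        # ride the tangent line

def rk4_step(t, y, h):
    k1 = f(t, y)
    k2 = f(t + h / 2, y + h / 2 * k1)
    k3 = f(t + h / 2, y + h / 2 * k2)
    k4 = f(t + h, y + h * k3)
    return y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)   # weighted average slope

def table(step, t0, y0, h, n_steps):
    """Rows (t, y): these ordered pairs ARE the approximate solution."""
    rows = [(t0, y0)]
    for _ in range(n_steps):
        t, y = rows[-1]
        rows.append((t + h, step(t, y, h)))
    return rows

h, n = 0.25, 8
exact = lambda t: 1.0 + math.tanh(t)
print("    t    Euler      RK4    exact")
for (t, ye), (_, yr) in zip(table(euler_step, 0.0, 1.0, h, n),
                            table(rk4_step, 0.0, 1.0, h, n)):
    print(f"{t:5.2f} {ye:8.5f} {yr:8.5f} {exact(t):8.5f}")
```

Plotting the (t, y) pairs from either column and connecting the dots gives the piecewise linear curve that answers “where is the solution?”.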

An Exam Question

As a bonus question, I gave the following graph:

[Image: grapheuler]

I then said: “I was testing out an Euler method routine on the differential equation y' = y(2-y), y(0)  = 1 and I got the following output.”

A) Is this solution mathematically possible?

B) If there is an error, is this an error induced by the Euler method or by an improper step size, or is a coding error more likely?

Many students got part A correct: some noted that y = 2 is an equilibrium solution and that this differential equation meets the “existence and uniqueness” criteria everywhere; hence the graph of the proposed solution intersecting the equilibrium solution is impossible.

Others noted that the slope of a solution at y = 2 would be zero; others noted that the slopes above y = 2 were negative and this proposed solution leveled out. Others noted that y = 2 is an attractor hence any solution near y = 2 would have to stay near there.

But no one got part B; one even went as far as to say that someone with a Ph. D. in math would never make such an elementary coding error (LOL!!!!!)

But the key here: the slopes ARE negative above y = 2 and a correct Euler method (regardless of step size…ok, within reason) would drive the curve down.
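A quick Python sketch of this claim, using a correct Euler routine on y' = y(2-y), y(0) = 1 . The two step sizes are my own choices for illustration: a small one, and one large enough that the iterates jump over y = 2 .

```python
def euler_values(h, n_steps, y0=1.0):
    """Euler iterates for y' = y(2 - y) starting from y(0) = y0."""
    ys = [y0]
    for _ in range(n_steps):
        y = ys[-1]
        ys.append(y + h * y * (2.0 - y))   # slope is negative whenever y > 2
    return ys

for h in (0.1, 0.75):
    ys = euler_values(h, 12)
    print("h =", h, [round(y, 4) for y in ys])
```

With the small step the iterates climb toward 2 from below; with the large step they overshoot, but every iterate above 2 is followed by a smaller one. A correct Euler routine does not level off above the equilibrium.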

So this WAS the result of a coding error.

What went wrong: I was running both RK-order 4 and Euler (for teaching purposes) and I stored the RK slope with one variable and calculated the “new Y” using the RK slope (obtained from the RK approximation) for the Euler method. Hence when the curve “jumped over” the y = 2, the new slope it picked up was a near zero slope from the RK approximation for the same value of t (which was near, but below, the y = 2 equilibrium).

My problem is that the two variables in the code differed by a single number (my bad). I was able to fix the problem very quickly though.

An aside
On another part of the test, I told them to solve y' = y (y-1) (y+3), y(0) = 1 and gave them the phrase: “hint: THINK first; this is NOT a hard calculation”. A few students got it, but mostly the A students. There was ONE D student who got it right!

December 18, 2013

Have you ever had a student like this one?

Filed under: academia, differential equations, numerical solution of differential equations — Tags: — collegemathteaching @ 6:39 pm

I am grading differential equations final exams. I have the usual mix…for the most part.

But I have 3 students who are very, very good. And as far as one of these: let’s just say that when he/she writes an answer down that differs from the one I produced for the key…I double check my own work.

And once in a while….I am quietly embarrassed. 🙂

Note: but even this student is confused about numerical methods to solve differential equations; I’ll have to address that (with a post) over break.

November 25, 2013

A fact about Laplace Transforms that no one cares about….

Filed under: differential equations, Laplace transform — Tags: — collegemathteaching @ 10:33 pm

Consider: sin(x) = x - \frac{x^3}{3!} + \frac{x^5}{5!}......

Now take the Laplace transform of the right hand side: \frac{1}{s^2} - \frac{3!}{s^4 3!} + \frac{5!}{s^6 5!} .... = \frac{1}{s^2} (1 -\frac{1}{s^2} + \frac{1}{s^4} ....)

This is equal to: \frac{1}{s^2} (\frac{1}{1 + \frac{1}{s^2}}) for s > 1 which is, of course, \frac{1}{1 + s^2} which is exactly what you would expect.

This technique works for e^{x} but gives nonsense for e^{x^2} .

Update: note that we can get a power series for e^{x^2} = 1 + x^2 + \frac{x^4}{2!} + \frac{x^6}{3!} + .... which, on a term by term basis, transforms to \frac{1}{s} + \frac{2!}{s^3} + \frac{4!}{s^5 2!} + \frac{6!}{s^7 3!} + ... = \frac{1}{s} \sum_{k=0} (\frac{1}{s^2})^k\frac{(2k)!}{k!} which only converges at s = \infty .
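Both observations can be checked numerically. The Python sketch below (function names invented for this post) sums the transformed sine series at s = 2 and compares it with \frac{1}{1+s^2} , then looks at the ratio of consecutive terms of the transformed e^{x^2} series, which works out to \frac{2(2k+1)}{s^2} and eventually exceeds 1 for every finite s.

```python
def sine_series_transform(s, n_terms):
    """Partial sum of (1/s^2) * sum_k (-1)^k / s^(2k)."""
    return sum((-1) ** k / s ** (2 * k + 2) for k in range(n_terms))

def exp_x2_term_ratio(s, k):
    """Ratio of term k+1 to term k in sum_k (2k)! / (k! s^(2k+1))."""
    return 2.0 * (2 * k + 1) / (s * s)

def first_growing_index(s):
    """Smallest k at which the terms of the e^{x^2} series start to grow."""
    k = 0
    while exp_x2_term_ratio(s, k) <= 1.0:
        k += 1
    return k

s = 2.0
print(sine_series_transform(s, 30), 1.0 / (1.0 + s * s))
for s in (2.0, 10.0, 100.0):
    print("s =", s, "terms grow from k =", first_growing_index(s))
```

A larger s only delays the growth of the terms; it never prevents it, which is why the series for e^{x^2} fails for all finite s.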

November 12, 2013

Why I teach multiple methods for the inverse Laplace Transform.

I’ll demonstrate with a couple of examples:

y''+4y = sin(2t), y(0) = y'(0) = 0

If we use the Laplace transform, we obtain: (s^2+4)Y = \frac{2}{s^2+4} which leads to Y = \frac{2}{(s^2+4)^2} . Now we’ve covered how to do this without convolutions. But the convolution integral is much easier: write Y = \frac{2}{(s^2+4)^2} = \frac{1}{2} \frac{2}{s^2+4}\frac{2}{s^2+4} which means that y = \frac{1}{2}(sin(2t)*sin(2t)) = \frac{1}{2}\int^t_0 sin(2u)sin(2t-2u)du = -\frac{1}{4}tcos(2t) + \frac{1}{8}sin(2t) .

Note: if the integral went too fast for you and you don’t want to use a calculator, use sin(2t-2u) = sin(2t)cos(2u) - cos(2t)sin(2u) and the integral becomes \frac{1}{2}\int^t_0 sin(2t)cos(2u)sin(2u) -cos(2t)sin^2(2u)du =

\frac{1}{2} (sin(2t))\frac{1}{4}sin^2(2u)|^t_0 - cos(2t)(\frac{1}{4})( t - \frac{1}{4}sin(4u))|^t_0 =

\frac{1}{8}sin^3(2t) - \frac{1}{4}tcos(2t) +\frac{1}{16}sin(4t)cos(2t) =

\frac{1}{8}(sin^3(2t) +sin(2t)cos^2(2t))-\frac{1}{4}tcos(2t)

= \frac{1}{8}sin(2t)(sin^2(2t) + cos^2(2t))-\frac{1}{4}tcos(2t) = -\frac{1}{4}tcos(2t) + \frac{1}{8}sin(2t)
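If you would rather trust a computer than the trig identities, here is a small Python sketch that evaluates the convolution integral numerically (composite Simpson’s rule, my choice of quadrature) and compares it with the closed form above. The function names are made up for this check.

```python
import math

def convolution_solution(t, n=2000):
    """(1/2) * integral_0^t sin(2u) sin(2t - 2u) du by composite Simpson (n even)."""
    if t == 0.0:
        return 0.0
    h = t / n
    g = lambda u: math.sin(2 * u) * math.sin(2 * t - 2 * u)
    total = g(0.0) + g(t)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(i * h)   # Simpson weights 4, 2, 4, ...
    return 0.5 * total * h / 3

def closed_form(t):
    return -0.25 * t * math.cos(2 * t) + 0.125 * math.sin(2 * t)

for t in (0.5, 1.0, 3.0):
    print(t, convolution_solution(t), closed_form(t))
```

The two columns should agree to many decimal places at each sample value of t.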

Now if we had instead: y''+4y = sin(t), y(0)=0, y'(0) = 0

The Laplace transform of the equation becomes (s^2+4)Y = \frac{1}{s^2+1} and hence Y = \frac{1}{(s^2+1)(s^2+4)} . One could use the convolution method but partial fractions works easily: one can use the calculator (“algebra” plus “expand”) or:

\frac{A+Bs}{s^2+4} + \frac{C + Ds}{s^2+1} =\frac{1}{(s^2+4)(s^2+1)} . Get a common denominator and match numerators:

(A+Bs)(s^2+1) + (C+Ds)(s^2+4)  = 1 . One can use several methods to resolve this: here we will use s = i to see (C + Di)(3) = 1 which means that D = 0 and C = \frac{1}{3} . Now use s = 2i to obtain (A + 2iB)(-3) = 1 which means that B = 0, A = -\frac{1}{3} so Y = \frac{1}{3} (\frac{1}{s^2+1} - \frac{1}{s^2+4}) so y = \frac{1}{3} (sin(t) - \frac{1}{2} sin(2t)) = \frac{1}{3}sin(t) -\frac{1}{6}sin(2t)
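A quick numerical sanity check of this answer, sketched in Python (the function names are mine): compare the original Y with its partial fraction form at a few values of s, and plug y back into y''+4y = sin(t) using a centered second difference for y''.

```python
import math

def Y(s):
    return 1.0 / ((s * s + 1.0) * (s * s + 4.0))

def Y_partial(s):
    return (1.0 / 3.0) * (1.0 / (s * s + 1.0) - 1.0 / (s * s + 4.0))

def y(t):
    return math.sin(t) / 3.0 - math.sin(2 * t) / 6.0

def ode_residual(t, dt=1e-4):
    """y'' + 4y - sin(t), with y'' from a centered second difference."""
    ypp = (y(t + dt) - 2 * y(t) + y(t - dt)) / (dt * dt)
    return ypp + 4 * y(t) - math.sin(t)

for s in (0.5, 1.0, 3.0):
    print("s =", s, Y(s), Y_partial(s))
for t in (0.0, 1.0, 2.5):
    print("t =", t, "residual", ode_residual(t))
```

The residuals are not exactly zero because of the finite difference, but they should be tiny compared with the size of sin(t).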

So, sometimes the convolution leads us to the answer quicker than other techniques and sometimes other techniques are easier.

Of course, the convolution method has utility beyond the Laplace transform setting.
