# College Math Teaching

## August 1, 2017

### Numerical solutions to differential equations: I wish that I had heard this talk first

The MAA Mathfest in Chicago was a success for me. I talked about some other talks I went to; my favorite was probably the one given by Douglas Arnold. I wish I had had this talk prior to teaching numerical analysis for the fist time.

Confession: my research specialty is knot theory (a subset of 3-manifold topology); all of my graduate program classes have been in pure mathematics. I last took numerical analysis as an undergraduate in 1980 and as a “part time, not taking things seriously” masters student in 1981 (at UTSA of all places).

In each course…I. Made. A. “C”.

Needless to say, I didn’t learn a damned thing, even though both professors gave decent courses. The fault was mine.

But…I was what my department had, and away I went to teach the course. The first couple of times, I studied hard and stayed maybe 2 weeks ahead of the class.
Nevertheless, I found the material fascinating.

When it came to understanding how to find a numerical approximation to an ordinary differential equation (say, first order), you have: $y' = f(t,y)$ with some initial value for both $y'(0), y(0)$. All of the techniques use some sort of “linearization of the function” technique to: given a step size, approximate the value of the function at the end of the next step. One chooses a step size, and some sort of schemes to approximate an “average slope” (e. g. Runga-Kutta is one of the best known).

This is a lot like numerical integration, but in integration, one knows $y'(t)$ for all values; here you have to infer $y'(t)$ from previous approximations of %latex y(t) \$. And there are things like error (often calculated by using some sort of approximation to $y(t)$ such as, say, the Taylor polynomial, and error terms which are based on things like the second derivative.

And yes, I faithfully taught all that. But what was unknown to me is WHY one might choose one method over another..and much of this is based on the type of problem that one is attempting to solve.

And this is the idea: take something like the Euler method, where one estimates $y(t+h) \approx y(t) + y'(t)h$. You repeat this process a bunch of times thereby obtaining a sequence of approximations for $y(t)$. Hopefully, you get something close to the “true solution” (unknown to you) (and yes, the Euler method is fine for existence theorems and for teaching, but it is too crude for most applications).

But the Euler method DOES yield a piecewise linear approximation to SOME $f(t)$ which might be close to $y(t)$ (a good approximation) or possibly far away from it (a bad approximation). And this $f(t)$ that you actually get from the Euler (or other method) is important.

It turns out that some implicit methods (using an approximation to obtain $y(t+h)$ and then using THAT to refine your approximation can lead to a more stable system of $f(t)$ (the solution that you actually obtain…not the one that you are seeking to obtain) in that this system of “actual functions” might not have a source or a sink…and therefore never spiral out of control. But this comes from the mathematics of the type of equations that you are seeking to obtain an approximation for. This type of example was presented in the talk that I went to.

In other words, we need a large toolbox of approximations to use because some methods work better with certain types of problems.

I wish that I had known that before…but I know it now. 🙂

## November 22, 2014

### One upside to a topologist teaching numerical analysis…

Yes, I was glad when we hired people with applied mathematics expertise; though I am enjoying teaching numerical analysis, it is killing me. My training is in pure mathematics (in particular, topology) and so class preparation is very intense for me.

But I so love being able to show the students the very real benefits that come from the theory.

Here is but one example: right now, I am talking about numerical solutions to “stiff” differential equations; basically, a differential equation is “stiff” if the magnitude of the differential equation is several orders of magnitude larger than the magnitude of the solution.

A typical example is the differential equation $y' = -\lambda y$, $y(0) = 1$ for $\lambda > 0$. Example: $y' = -20y, y(0) = 1$. Note that the solution $y(t) = e^{-20t}$ decays very quickly to zero though the differential equation is 20 times larger.

One uses such an equation to test a method to see if it works well for stiff differential equations. One such method is the Euler method: $w_{i+1} = w_{i} + h f(t_i, w_i)$ which becomes $w_{i+1} = w_i -20h \lambda w_i$. There is a way of assigning a method to a polynomial; in this case the polynomial is $p(\mu) = \mu - (1+h\lambda)$ and if the roots of this polynomial have modulus less than 1, then the method will converge. Well here, the root is $(1+h\lambda)$ and calculating: $-1 > 1+ h \lambda > 1$ which implies that $-2 > h \lambda > 0$. This is a good reference.

So for $\lambda = 20$ we find that $h$ has to be less than $\frac{1}{10}$. And so I ran Euler’s method for the initial problem on $[0,1]$ and showed that the solution diverged wildly for using 9 intervals, oscillated back and forth (with equal magnitudes) for using 10 intervals, and slowly converged for using 11 intervals. It is just plain fun to see the theory in action.

## November 11, 2014

### Math professor FAIL: CHECK THE OBVIOUS first.

Numerical analysis: I had the students run a “5 step predictor, 4 step corrector, variable step size” routine to solve a differential equation. This was a homework problem.
A student reported that the routine failed.

So, I looked at it…the interval for the solution was $[1, 1.2]$ and the initial step size was…. $0.05$ and what is 5 times .05? Yep, the correction part overshot the right endpoint and the “next h” was indeed negative.

This came to me when I was eating dinner…and sure enough..

OMG, I am an idiot. 🙂

## December 20, 2013

### Teaching the basics of numerical methods for solving differential equations

This semester we had about a week to spend on numerical methods. My goal was to give them the basics of how a numerical method works: given $y' = f(t,y), y(t_0) = y_0$ one selects a step size $\delta t$ and then one rides the slope of the tangent line: $y(t_0 + \delta t) = y_0 + (\delta t) f(t_0, y_0)$ and repeat the process. This is the basic Euler method; one can do an averaging process to get a better slope (Runge-Kutta) and one, if one desires, can use previous points in a multi-step process (e. g. Adams-Bashforth, etc.). Ultimately, it is starting at a point and using the slope at that point to get a piece wise linear approximation to a solution curve.

But the results of such a process confused students. Example: if one used a spreadsheet to do the approximation process (e. g. Euler or Runge-Kutta order 4), one has an output something like this:

So, there is confusion. They know how to get from one row to the other and what commands to type. But….”where is the solution?” they ask.

One has to emphasize what is obvious to us: the $x, y$ columns, is the approximate solution…a piece wise approximation of one anyway. What we have is a set $x, y(x)$ where these ordered pairs are points in the approximate solution to the differential equation that runs through those points. One cannot assume that the students understand this, even when they can do the algorithm.

An Exam Question

As a bonus question, I gave the following graph:

I then said: “I was testing out an Euler method routine on the differential equation $y' = y(2-y), y(0) = 1$ and I got the following output.

A) Is this solution mathematically possible?

B) If there is an error, is this an error induced by the Euler method or by an improper step size, or is a coding error more likely?

Many students got part A correct: some noted that $y = 2$ is an equilibrium solution and that this differential equation meets the “existence and uniqueness” criteria everywhere; hence the graph of the proposed solution intersecting the equilibrium solution is impossible.

Others noted that the slope of a solution at $y = 2$ would be zero; others noted that the slopes above $y = 2$ were negative and this proposed solution leveled out. Others noted that $y = 2$ is an attractor hence any solution near $y = 2$ would have to stay near there.

But no one got part B; one even went as far to say that someone with a Ph. D. in math would never make such an elementary coding error (LOL!!!!!)

But the key here: the slopes ARE negative above $y = 2$ and a correct Euler method (regardless of step size…ok, within reason) would drive the curve down.

So this WAS the result of a coding error.

What went wrong: I was running both RK-order 4 and Euler (for teaching purposes) and I stored the RK slope with one variable and calculated the “new Y” using the RK slope (obtained from the RK approximation) for the Euler method. Hence when the curve “jumped over” the $y = 2$, the new slope it picked up was a near zero slope from the RK approximation for the same value of $t$ (which was near, but below the $y = 2$ equilibrium.

My problem is that the two variables in the code differed by a single number (my bad). I was able to fix the problem very quickly though.

An aside
On another part of the test, I told them to solve $y' = y (y-1) (y+3), y(0) = 1$ and gave them the phrase: “hint: THINK first; this is NOT a hard calculation”. A few students got it, but mostly the A students. There was ONE D student who got it right!

## December 18, 2013

### Have you ever had a student like this one?

Filed under: academia, differential equations, numerical solution of differential equations — Tags: — collegemathteaching @ 6:39 pm

I am grading differential equations final exams. I have the usual mix…for the most part.

But I have 3 students who are very, very good. And as far as one of these: let’s just say that when he/she writes an answer down that differs from the one I produced for the key…I double check my own work.

And once in a while….I am quietly embarrassed. 🙂

Note: but even this student is confused about numerical methods to solve differential equations; I’ll have to address that (with a post) over break.

## October 10, 2011

### The Picard Iterates: how they can yield an interval of existence.

One of the many good things about my teaching career is that as I teach across the curriculum, I fill in the gaps of my own education.
I got my Ph. D. in topology (low dimensional manifolds; in particular, knot theory) and hadn’t seen much of differential equations beyond my “engineering oriented” undergraduate course.

Therefore, I learned more about existence and uniqueness theorems when I taught differential equations; though I never taught the existence and uniqueness theorems in a class, I learned the proofs just for my own background. In doing so I learned about the Picard iterated integral technique for the first time; how this is used to establish “uniqueness of solution” can be found here.

However I recently discovered (for myself) what thousands of mathematicians already know: the Picard process can be used to yield an interval of existence for a solution for a differential equation, even if we cannot obtain the solution in closed form.

The situation
I assigned my numerical methods class to solve $y'= t + y^2$ with $y(0)=1$ and to produce the graph of $y(t)$ from $t = 0$ to $t = 3$.

There is a unique solution to this and the solution is valid so long as the $t$ and $y$ value of the solution curve stays finite; note that $\frac{\partial }{\partial y}f(t,y)=2y.$

So, is it possible that the $y$ values for this solution become unbounded?

What follows are the notes I gave to my class.

Numeric output seems to indicate this, but numeric output is NOT proof.

To find a proof of this, let’s turn to the Picard iteration technique. We
know that the Picard iterates will converge to the unique solution.

$y_{0}=1$

$y_{1}=1+\int_{0}^{t}x+1dx=\frac{1}{2}t^{2}+t+1$

$y_{2}=1+\int_{0}^{t}x+(\frac{1}{2}x^{2}+x+1)^{2}dx=$

$y_{2}=\frac{1}{20}t^{5}+\frac{1}{4}t^{4}+\frac{2}{3}t^{3}+\frac{3}{2}t^{2}+t+1$

The integrals get pretty ugly around here; I used MATLAB to calculate the
higher order iterates. I’ll show you $y_{3}$

$y_{3}=\frac{49}{60}t^{5}+\frac{13}{12}t^{4}+\frac{4}{3}t^{3}+\frac{3}{2}t^{2}+t+1+O(t^{11})$

where $O(t^{11})$ means assorted polynomial terms from order 6 to 11.

Here is one more:

$y_{4}=\frac{17}{12}t^{5}+\frac{17}{12}t^{4}+\frac{4}{3}t^{3}+\frac{3}{2}t^{2}+t+1+O(t^{23})$

We notice some patterns developing here. First of all, the coefficient of
the $t^{n}$ term is staying the same for all $y_{m}$ where $m\geq n.$

That is tedious to prove. But what is easier to show (and sufficient) is
that the coefficients for the $t^{n}$ terms for $y_{n}$ all appear to be
bigger than 1. This is important!

Why? If we can show that this is the case, then our ”limit” solution $\sum_{k=0}^{\infty }a_{k}t^{k}$ will have an interval of convergence less than 1. Why? Substitute $t=1$ and see that the sum
diverges because the $a_{k}$ not only fail to converge to zero, but they
stay greater than 1.

So, can we prove this general pattern?

YES!

Here is the idea: $y_{m}=q(t)+p(t)$ where $p(t)$ is a polynomial of order $m$
and $q(t)$ is a polynomial whose terms all have order $m+1$ or greater.

Now put into the Picard process:

$y_{m+1}=1+\int_{0}^{t}((q(x)+p(x))^{2}+xdx=$

$1+\int_{0}^{t}((q(x)^{2}+2p(x)q(x))dx+\int_{0}^{t}(p(x))^{2}+xdx$

Note: all of the terms of $y_{m+1}$ of degree $m+1$ or higher must come from
the second integral.

Now by induction we can assume that all of the coefficients of the
polynomial $p(x)$ are greater than or equal to one.

When we ”square out” the polynomial, the coefficients of the new
polynomial will consist of the sum of positive numbers, each of which is
greater than 1. For the coefficients of the polynomial $(p(x))^{2}$ of
degree $m$ or higher: if one is interested in the $k^{\prime }th$
coefficient, one has to add at least $k+1$ numbers together, each of which
is bigger than one.

Now when one does the integration on these particular terms, one, of course,
divides by $k+1$ (power rule for integration). But that means that the
coefficient (after integration) is then greater than 1.

Here is a specific example:

Say $p(x)=a+bx+cx^{2}+dx^{3}$

Now $p(x)^{2}=a^{2}+(ab+ab)x+(ac+ca+b^{2})x^{2}+(ad+da+bc+cb)x^{3}+\{O(x^{6})\}$

Remember that $a,b,c,d$ are all greater than or equal to one.

Now $p(x)^{2}+x=a^{2}+(ab+ab+1)x+(ac+ca+b^{2})x^{2}+(ad+da+bc+cb)x^{3}+\{O(x^{6})\}$

Now when we integrate term by term, we get:

$\int_{0}^{t}(p(x))^{2}+xdx=a^{2}x+\frac{1}{2}(ab+ab+1)x^{2}+\frac{1}{3}(ac+ca+b^{2})x^{3}+\frac{1}{4}(ad+da+bc+cb)x^{4}+\{O(x^{7})\}$

But note that $ab+ab+1>2,ac+ca+b^{2}\geq 3,$ and $ad+da+bc+cb\geq 4$

Since all of the factors are greater than or equal to 1.

Hence in our new polynomial approximation, the order 4 terms or less all
have coefficients which are greater than or equal to one.

We can make this into a Proposition:

Proposition
Suppose $p(x)=\sum_{j=0}^{k}a_{j}x^{j}$ where each $a_{j}\geq 1.$

If $q(t)=\sum_{j=0}^{2k+1}b_{j}x^{j}=1+\int_{0}^{x}((p(t))^{2}+t)dt$

Then for all $j\leq k+1,b_{j}\geq 1.$

Proof. Of course, $b_{0}=1,b_{1}=a_{0}^{2},$ and $b_{2}=\frac{2a_{0}a_{1}+1}{2}$

Let $n\leq k+1.$

Then we can calculate: (since all of the $a_{n-1},a_{n-2},....a_{1}$ are
defined):

If $n$ is odd, then $b_{n}=\frac{1}{n}(2a_{0}a_{n-1}+2a_{1}a_{n-2}+...2a_{\frac{n-3}{2}}a_{\frac{n+1}{2}}+(a_{\frac{n-1}{2}})^{2})\geq \frac{1}{n}(2\ast \frac{n-1}{2}+1)=1$

If $n$ is even then $b_{n}=\frac{1}{n}(2a_{0}a_{n-1}+2a_{1}a_{n-2}+....2a_{\frac{n-1}{2}}a_{\frac{n+1}{2}})\geq \frac{1}{n}(2\ast \frac{n}{2})=1$

The Proposition is proved.

Of course, this possibly fails for $b_{n}$ where $n>k+1$ as we would fail to
have a sufficient number of terms in our sum.

Now if one wants a challenge, one can modify the above arguments to show that the coefficients of the approximating polynomial never get ”too big”; that is, the coefficient of the $k^{\prime }th$ order term is less than, say, $k$.

It isn’t hard to show that $b_{n}\leq \max a_{i}^{2}$ where $i\in\{0,1,2,...n-1\}$

Then one can compare to the derivative of the geometric series to show that
one gets convergence on an interval up to but not including 1.