# College Math Teaching

## February 17, 2014

### Aitken acceleration: another “easy” example for the students

In a previous post I showed some spreadsheet data to demonstrate the Aitken acceleration process. Here, I’ll go through an example where a sequence converges linearly: let $p_n = L +a^n + b^n$ where $0 < a < b < 1$. We use the form $q_n = \frac{p_{n+2}p_{n} -(p_{n+1})^2}{p_{n+2} - 2p_{n+1} + p_n}$ (I am too lazy to use the traditional “p-hat” notation). First, note that the denominator works out to $a^n(1-a)^2 +b^n(1-b)^2$

The numerator takes a tiny bit more work: the $L^2$ terms cancel, and the rest is:
$L(a^{n+2} + b^{n+2}) + L(a^n + b^n) +(a^{n+2}+b^{n+2})(a^n + b^n)-2L(a^{n+1}+b^{n+1})-(a^{n+1}+b^{n+1})^2$
which simplifies to a term involving $L$ and one that doesn’t. Here is the term involving $L$:

$L(a^{n+2}-2a^{n+1} + a^n + b^{n+2} -2b^{n+1} +b^n) = L(a^n(1-a)^2 +b^n(1-b)^2)$

which, of course, is just $L$ times the denominator.

Now the terms not involving $L$: $(a^{n+2}+b^{n+2})(a^n+b^n) - (a^{n+1} + b^{n+1})^2 = b^na^n(b^2+a^2-2ab) = b^na^n(b-a)^2$

So our fraction is merely

$\frac{L((a^n(1-a)^2 +b^n(1-b)^2)) + b^na^n(b-a)^2}{a^n(1-a)^2 +b^n(1-b)^2}$ $= L + \frac{b^na^n(b-a)^2}{a^n(1-a)^2 +b^n(1-b)^2}$

This can be rearranged to $L + \frac{(b-a)^2}{\frac{(1-a)^2}{b^n} + \frac{(1-b)^2}{a^n}}$
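The algebra above is easy to check numerically. Here is a quick sanity check in Python; the values $a = 0.5$, $b = 0.8$, $L = 1$ are illustrative choices of mine, nothing special:

```python
# Sanity check: Aitken's formula applied to p_n = L + a^n + b^n should match
# the closed form q_n = L + a^n b^n (b-a)^2 / (a^n (1-a)^2 + b^n (1-b)^2).
# The values a = 0.5, b = 0.8, L = 1.0 are illustrative, not special.
a, b, L = 0.5, 0.8, 1.0

def p(n):
    return L + a**n + b**n

for n in range(8):
    # Aitken's formula, written as in the post
    q = (p(n + 2) * p(n) - p(n + 1)**2) / (p(n + 2) - 2 * p(n + 1) + p(n))
    # The error term derived above
    err = (a**n * b**n * (b - a)**2) / (a**n * (1 - a)**2 + b**n * (1 - b)**2)
    assert abs(q - (L + err)) < 1e-10
```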

Clearly as $n$ goes to infinity, the error goes to zero very quickly. It might be instructive to look at the ratio of the errors for $p_n$ and $q_n$:

This ratio is
$(a^n + b^n)\frac{a^n(1-a)^2 + b^n(1-b)^2}{a^nb^n(b-a)^2} = (a^n +b^n)(\frac{1}{b^n}(\frac{1-a}{b-a})^2 + \frac{1}{a^n}(\frac{1-b}{b-a})^2)$

Note that in the right-hand factor, both squared factors are fixed while the coefficients $\frac{1}{b^n}$ and $\frac{1}{a^n}$ go to infinity as $n$ goes to infinity. Multiplying out, one obtains:

$((\frac{a}{b})^n +1)(\frac{1-a}{b-a})^2 + ((\frac{b}{a})^n +1)(\frac{1-b}{b-a})^2$. In the limit, the first term decreases to $(\frac{1-a}{b-a})^2$ and the second goes to infinity.

Hence the ratio of the errors grows without bound: the errors in the accelerated sequence are not merely smaller, they shrink much faster than those of the original sequence.
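A short numerical experiment makes the blow-up of this ratio concrete (again, $a = 0.5$, $b = 0.8$ are illustrative values of mine):

```python
# The ratio |p_n - L| / |q_n - L| should grow without bound,
# roughly like (b/a)^n ((1-b)/(b-a))^2 for large n.
# Illustrative values only: a = 0.5, b = 0.8, L = 0.
a, b, L = 0.5, 0.8, 0.0

def p(n):
    return L + a**n + b**n

def q(n):
    # Aitken's formula from the post
    return p(n) - (p(n + 1) - p(n))**2 / (p(n + 2) - 2 * p(n + 1) + p(n))

ratios = [abs(p(n) - L) / abs(q(n) - L) for n in range(0, 20, 5)]

# Each ratio is larger than the last: the acceleration improves with n
assert all(r1 < r2 for r1, r2 in zip(ratios, ratios[1:]))
```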

## February 8, 2014

### Demonstrating Aitken’s sequence acceleration

Right now, we are studying the various root finding algorithms in Numerical Analysis (bisection, Newton’s, etc.)

Such algorithms yield a sequence of numbers which (hopefully) converges to a solution: $p_0, p_1, p_2, p_3, \ldots$

Of course, each point in the sequence is obtained by calculation, so if there were a way to combine these points to obtain a sequence that converges faster (while not adding much computational complexity), there would be some benefit. And yes, it is possible to take the sequence points, do the manipulation, and use the manipulated points in the root-finding algorithm itself (e.g., Steffensen’s method).

In this post, I’ll talk about Aitken’s method and how one can cook up examples that not only show that the method can work but give the students some intuition as to why it might work.

I’ll provide just a bit of background in the event that the general reader comes across this.

Let $p_n \rightarrow p$. If we have $\frac{|p_{n+1} - p|}{(|p_n -p|)^{\alpha}} \rightarrow \lambda$ (with $\lambda$ positive, of course), we say that the convergence is LINEAR if $\alpha = 1$ and $0 < \lambda < 1$ (the inequality must be strict). If $\alpha = 2$ then we say that the convergence is quadratic (regardless of $\lambda$).

To see the reason for the terminology, just multiply both sides of the “defining equation” by the denominator. In the linear case: $|p_{n+1} - p| \approx \lambda |p_{n} - p|$ for large $n$, so one can think: “get the new error by multiplying the old error by some constant less than one.” For example, $p_n = \frac{1}{2^n}$ exhibits linear convergence to 0 with $\lambda = \frac{1}{2}$; that is a decent way to think about it.
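A tiny check of that example (nothing here beyond the $p_n = \frac{1}{2^n}$ sequence just mentioned):

```python
# For p_n = 1/2^n -> 0, the successive error ratio |p_{n+1} - 0| / |p_n - 0|
# is the constant lambda = 1/2: textbook linear convergence.
p_seq = [1 / 2**n for n in range(12)]
ratios = [p_seq[n + 1] / p_seq[n] for n in range(11)]
assert all(r == 0.5 for r in ratios)  # exact, since powers of 2 are exact floats
```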

Now (in the linear convergence case anyway), suppose you think of your approximation having an error that shrinks with iteration but shrinks in the following way: the n’th iteration looks like $p_n = p + a^n + b^n$ where $a, b$ are constants strictly between 0 and 1. Of course, as $n$ goes to infinity, $p_n$ approaches the limit $p$ as the error terms die away.

Aitken’s method is this: denote the new sequence by $q_n$ (I am not using the traditional “p-hat” notation out of laziness). Then $q_n = p_n -\frac{(p_{n+1} - p_n)^2}{p_{n+2} -2p_{n+1} + p_{n}}$. To see how this formula is obtained, check out these excellent course notes, or the Wikipedia article, which is pretty good.
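For readers who want to experiment, here is a minimal sketch of the formula in code; the cooked sequence $p_n = p + a^n + b^n$ below uses my own illustrative values $p = 2$, $a = 0.5$, $b = 0.9$:

```python
def aitken(seq):
    """Apply Aitken's Delta^2 formula to a list of iterates p_0, p_1, ...
    Returns the accelerated sequence q_n (two terms shorter)."""
    return [seq[n] - (seq[n + 1] - seq[n])**2
            / (seq[n + 2] - 2 * seq[n + 1] + seq[n])
            for n in range(len(seq) - 2)]

# Cooked linearly convergent sequence, as in the post: p_n = p + a^n + b^n.
# The values p = 2, a = 0.5, b = 0.9 are illustrative choices.
p_limit, a, b = 2.0, 0.5, 0.9
p_seq = [p_limit + a**n + b**n for n in range(15)]
q_seq = aitken(p_seq)

# The accelerated terms land much closer to the limit than the originals
assert abs(q_seq[10] - p_limit) < abs(p_seq[10] - p_limit) / 10
```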

Look at the numerator of what is being subtracted off: if we have the terms written (very roughly) as $p_{n+1} = p + a^{n+1} + b^{n+1}, p_n = p + a^n + b^n$ and if, say, $a$ is much larger than $b$, then $a^{n+1}$ will be closer (in ratio) to $a^n$ than $b^{n+1}$ is to $b^n$, hence more of this error will be subtracted away.

Yes, I know that this is simple-minded, but it “gets the spirit” of the process. I’ve set up some spreadsheets with “cooked” sequences converging linearly to zero, $p_n = a^n + b^n$, and showed how the Aitken process works there. Note: spreadsheet round-off errors start occurring at the $10^{-16}$ range; you can see that here.


To see an abstract example where $p_n = L + a^n + b^n$ where $a, b \in (0,1)$, go to the next post in this series.