# College Math Teaching

## July 14, 2020

### An alternative to trig substitution, sort of..

Ok, just for fun: $\int \sqrt{1+x^2} dx =$

The usual approach is to use $x =tan(t), dx =sec^2(t) dt$, which transforms this to the dreaded $\int sec^3(t) dt$ integral, which is a double integration by parts.
Is there a way out? I think so, though the price one pays is a trickier conversion back to x.

Let’s try $x =sinh(t) \rightarrow dx = cosh(t) dt$ so upon substituting we obtain $\int |cosh(t)|cosh(t) dt$ and noting that $cosh(t) > 0$ always:

$\int cosh^2(t)dt$. Now this can be integrated by parts: let $u=cosh(t), dv = cosh(t) dt \rightarrow du =sinh(t) dt, v = sinh(t)$

So $\int cosh^2(t)dt = cosh(t)sinh(t) -\int sinh^2(t)dt$ but this easily reduces to:

$\int cosh^2(t)dt = cosh(t)sinh(t) -\int (cosh^2(t)-1) dt \rightarrow 2\int cosh^2(t)dt = cosh(t)sinh(t) + t + C$

Division by 2: $\int cosh^2(t)dt = \frac{1}{2}(cosh(t)sinh(t)+t)+C$

That was easy enough.

But we now have the conversion to x: $\frac{1}{2}cosh(t)sinh(t) \rightarrow \frac{1}{2}x \sqrt{1+x^2}$

So far, so good. But what about $t \rightarrow arcsinh(x)$?

Write: $sinh(t) = \frac{e^{t}-e^{-t}}{2} = x \rightarrow e^{t}-e^{-t} =2x \rightarrow e^{t}-2x -e^{-t} =0$

Now multiply both sides by $e^{t}$ to get $e^{2t}-2xe^t -1 =0$ and use the quadratic formula to get $e^t = \frac{1}{2}(2x\pm \sqrt{4x^2+4}) \rightarrow e^t = x \pm \sqrt{x^2+1}$

We need $e^t > 0$ so $e^t = x + \sqrt{x^2+1} \rightarrow t = ln|x + \sqrt{x^2+1}|$ and that is our integral:

$\int \sqrt{1+x^2} dx = \frac{1}{2}x \sqrt{1+x^2} + \frac{1}{2} ln|x + \sqrt{x^2+1}| + C$
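As a sanity check on the final formula, here is a minimal Python sketch (not part of the original derivation). It differentiates the candidate antiderivative numerically and compares it with $\sqrt{1+x^2}$, and it also checks that `math.asinh` agrees with the logarithm form derived above. The names `F`, `integrand` and `central_difference`, the sample points and the step size are all illustrative choices.

```python
import math

def F(x):
    """Candidate antiderivative obtained from the hyperbolic substitution."""
    return 0.5 * x * math.sqrt(1 + x * x) + 0.5 * math.asinh(x)

def integrand(x):
    return math.sqrt(1 + x * x)

def central_difference(func, x, h=1e-5):
    """Symmetric difference quotient, used here only as a numerical check."""
    return (func(x + h) - func(x - h)) / (2 * h)

sample_points = [-3.0, -1.0, 0.0, 0.5, 2.0, 10.0]

# Largest gap between the numerical derivative of F and the integrand.
max_gap = max(abs(central_difference(F, x) - integrand(x)) for x in sample_points)
print(max_gap)

# asinh(x) should match ln(x + sqrt(x^2 + 1)) at every sample point.
log_gap = max(abs(math.asinh(x) - math.log(x + math.sqrt(x * x + 1)))
              for x in sample_points)
print(log_gap)
```

Both printed gaps should be tiny, limited only by floating point and the finite step.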

I guess that this isn’t that much easier after all.

## July 12, 2020

### Logarithmic differentiation: do we not care about domains anymore?

Filed under: calculus, derivatives, elementary mathematics, pedagogy — collegemathteaching @ 11:29 pm

The introduction is for a student who might not have seen logarithmic differentiation before: (and yes, this technique is extensively used..for example it is used in the “maximum likelihood function” calculation frequently encountered in statistics)

Suppose you are given, say, $f(x) =sin(x)e^x(x-2)^3(x+1)$ and you are told to calculate the derivative?

Calculus texts often offer the technique of logarithmic differentiation: write $ln(f(x)) = ln(sin(x)e^x(x-2)^3(x+1)) = ln(sin(x)) + x + 3ln(x-2) + ln(x+1)$
Now differentiate both sides: $(ln(f(x)))' = \frac{f'(x)}{f(x)} = \frac{cos(x)}{sin(x)} + 1 + \frac{3}{x-2} + {1 \over x+1}$

Now multiply both sides by $f(x)$ to obtain

$f'(x) = f(x)(\frac{cos(x)}{sin(x)} + 1 + \frac{3}{x-2} + {1 \over x+1}) =$

$sin(x)e^x(x-2)^3(x+1)(\frac{cos(x)}{sin(x)} + 1 + \frac{3}{x-2} + {1 \over x+1})$

And this is correct…sort of. Why I say sort of: what happens, at say, $x = 0$? The derivative certainly exists there but what about that second factor? Yes, the sin(x) gets cancelled out by the first factor, but AS WRITTEN, there is an oh-so-subtle problem with domains.

You can substitute $x \in \{ 0, \pm k \pi \}$ only after simplifying, which one might see as a limit process.
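To make the domain issue concrete, here is a small Python sketch (an illustration, not from the original post). It evaluates the derivative two ways: exactly as written by logarithmic differentiation, and by the ordinary product rule. The function names are hypothetical; the point is only that the first form divides by zero at $x = 0$ while the second does not.

```python
import math

def f(x):
    return math.sin(x) * math.exp(x) * (x - 2) ** 3 * (x + 1)

def fprime_log_form(x):
    """f'(x) as written by logarithmic differentiation; undefined where a factor vanishes."""
    return f(x) * (math.cos(x) / math.sin(x) + 1 + 3 / (x - 2) + 1 / (x + 1))

def fprime_product_rule(x):
    """f'(x) from the ordinary product rule; defined for every x."""
    s, e, c, l = math.sin(x), math.exp(x), (x - 2) ** 3, (x + 1)
    return (math.cos(x) * e * c * l           # differentiate sin(x)
            + s * e * c * l                   # differentiate e^x
            + s * e * 3 * (x - 2) ** 2 * l    # differentiate (x-2)^3
            + s * e * c)                      # differentiate (x+1)

try:
    fprime_log_form(0.0)
    log_form_defined_at_zero = True
except ZeroDivisionError:
    log_form_defined_at_zero = False

print(log_form_defined_at_zero)       # the log form cannot be evaluated at 0
print(fprime_product_rule(0.0))       # the product rule can
print(fprime_log_form(1e-6))          # near 0 the log form approaches the same value
```

At $x = 0$ only the first product-rule term survives, which is $cos(0)e^0(-2)^3(1) = -8$.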

But let’s stop and take a closer look at the whole process: we started with $f(x) = g_1(x) g_2(x) ...g_n(x)$ and then took the log of both sides. Where is the log defined? And when does $ln(ab) = ln(a) + ln(b)$? You got it: this only works when $a > 0, b > 0$.

So, on the face of it, $ln(g_1 (x) g_2(x) ...g_n(x)) = ln(g_1(x) ) + ln(g_2(x) ) + ...ln(g_n(x))$ is justified only when each $g_i(x) > 0$.

Why can we get away with ignoring all of this, at least in this case?

Well, here is why:

1. If $f(x) \neq 0$ is a differentiable function then $\frac{d}{dx} ln(|f(x)|) = \frac{f'(x)}{f(x)}$
Yes, this is covered in the derivation of $\int {dx \over x}$ material but here goes: write

$|f(x)| = \begin{cases} f(x) ,& \text{if } f(x) > 0 \\ -f(x), & \text{otherwise} \end{cases}$

Now if $f(x) > 0$ we get ${ d \over dx} ln(f(x)) = {f'(x) \over f(x) }$ as usual. If $f(x) < 0$ then $|f(x)| = -f(x), |f(x)|' = (-f(x))' = -f'(x)$, so ${d \over dx} ln(|f(x)|) = {-f'(x) \over -f(x)}$, and so in either case:

$\frac{d}{dx} ln(|f(x)|) = \frac{f'(x)}{f(x)}$ as required.

THAT is the workaround for calculating ${d \over dx } ln(g_1(x)g_2(x)..g_n(x))$ where $g_1(x)g_2(x)..g_n(x) \neq 0$: just calculate ${d \over dx } ln(|g_1(x)g_2(x)..g_n(x)|)$, noting that $|g_1(x)g_2(x)..g_n(x)| = |g_1(x)| |g_2(x)|...|g_n(x)|$.

Yay! We are almost done! But, what about the cases where at least some of the factors are zero at, say $x= x_0$?

Here, we have to bite the bullet and admit that we cannot take the log of the product where any of the factors have a zero, at that point. But this is what we can prove:

Given that $g_1(x) g_2(x)...g_n(x)$ is a product of differentiable functions and $g_1(a) = g_2(a) = ... = g_k(a) = 0$, $k \leq n$, then
$(g_1 g_2...g_n)'(a) = lim_{x \rightarrow a} g_1(x)g_2(x)..g_n(x) ({g_1'(x) \over g_1(x)} + {g_2'(x) \over g_2(x)} + ...{g_n'(x) \over g_n(x)})$

This works out to what we want by cancellation of factors.

Here is one way to proceed with the proof:

1. Suppose $f, g$ are differentiable and $f(a) = g(a) = 0$. Then $(fg)'(a) = f'(a)g(a) + f(a)g'(a) = 0$ and $lim_{x \rightarrow a} f(x)g(x)({f'(x) \over f(x)} + {g'(x) \over g(x)}) = 0$
2. Now suppose $f, g$ are differentiable and $f(a) =0 , g(a) \neq 0$. Then $(fg)'(a) = f'(a)g(a) + f(a)g'(a) = f'(a)g(a)$ and $lim_{x \rightarrow a} f(x)g(x)({f'(x) \over f(x)} + {g'(x) \over g(x)}) = f'(a)g(a)$
3. Now apply the above to $g_1(x) g_2(x)...g_n(x)$, a product of differentiable functions with $g_1(a) = g_2(a) = ... = g_k(a) = 0$, $k \leq n$.
If $k = n$ then $(g_1 g_2...g_n)'(a) = lim_{x \rightarrow a} g_1(x)g_2(x)..g_n(x) ({g_1'(x) \over g_1(x)} + {g_2'(x) \over g_2(x)} + ...{g_n'(x) \over g_n(x)}) =0$ by inductive application of 1.

If $k < n$ then let $g_1...g_k = f, g_{k+1} ..g_n =g$ as in 2. Then by 2, we have $(fg)'(a) = f'(a)g(a)$. Now this quantity is zero unless $k = 1$ and $f'(a) \neq 0$. But in this case note that $lim_{x \rightarrow a} g_1(x)g_2(x)...g_n(x)({g_1'(x) \over g_1(x)} + {g_2'(x) \over g_2(x)} + ...+ {g_n'(x) \over g_n(x)}) = lim_{x \rightarrow a} g_2(x)...g_n(x)(g_1'(x)) =g(a)g_1'(a)$ (the remaining terms all carry the factor $g_1(x)$, which goes to zero).

So there it is. Yes, it works ..with appropriate precautions.
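The two base cases of the proof can be checked numerically. The sketch below is an illustration under my own choice of test functions ($sin(x) \cdot x$ for case 1 and $sin(x) \cdot e^x$ for case 2, both at $a = 0$); the helper `log_form` is a hypothetical name. It evaluates the logarithmic-differentiation expression at points approaching $a$.

```python
import math

def log_form(factors, derivs, x):
    """Product of the factors times the sum of g_i'/g_i, evaluated away from zeros."""
    product = math.prod(g(x) for g in factors)
    return product * sum(dg(x) / g(x) for g, dg in zip(factors, derivs))

steps = (1e-2, 1e-4, 1e-6)

# Case 1: both factors vanish at a = 0, so (fg)'(0) = 0.
both_zero = [log_form([math.sin, lambda x: x], [math.cos, lambda x: 1.0], h)
             for h in steps]

# Case 2: only the first factor vanishes at a = 0, so (fg)'(0) = f'(0) g(0) = 1.
one_zero = [log_form([math.sin, math.exp], [math.cos, math.exp], h)
            for h in steps]

print(both_zero)
print(one_zero)
```

The first list should shrink toward 0 and the second should approach 1, matching $(fg)'(a)$ in each case.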

## January 15, 2019

### Calculus series: derivatives

Filed under: calculus, derivatives — collegemathteaching @ 3:36 am

Reminder: this series is NOT for the student who is attempting to learn calculus for the first time.

Derivatives: this deals with differentiable functions $f: R^1 \rightarrow R^1$ and no, I will NOT be talking about maps between tangent bundles. Yes, my differential geometry and differential topology courses were on the order of 30 years ago or so. 🙂

In calculus 1, we typically use the following definitions for the derivative of a function at a point: $lim_{x \rightarrow a} \frac{f(x)-f(a)}{x-a} = lim_{h \rightarrow 0} \frac{f(a+h) - f(a)}{h} = f'(a)$. This is opposed to the derivative function which can be thought of as the one dimensional gradient of $f$.

The first definition is easier to use for some calculations, say, calculating the derivative of $f(x) = x ^{\frac{p}{q}}$ at a point. (hint, if you need one: use $u = x^{\frac{1}{q}}$ then it is easier to factor). It can be used for proving a special case of the chain rule as well (the case where we are evaluating $f$ at $x = a$ and $f(x) = f(a)$ for at most a finite number of points near $a$.)

When introducing this concept, the binomial expansion theorem is very handy to use for many of the calculations.

Now there is another definition for the derivative that is helpful when proving the chain rule (sans restrictions).

Note that as $h \rightarrow 0$ the quantity $\epsilon = \frac{f(a+h)-f(a)}{h} - f'(a)$ goes to zero. We can now view $\epsilon$ as a function of $h$ which goes to zero as $h$ does.

That is, $f(a+h) = f(a) + hf'(a) + h\epsilon$ where $\epsilon \rightarrow 0$ as $h \rightarrow 0$, and $f(a) + hf'(a)$ is the best linear approximation for $f$ at $x = a$.

We’ll talk about the chain rule a bit later.

But what about the derivative and examples?

It is common to develop intuition for the derivative as applied to nice, smooth..ok, analytic functions. And this might be a fine thing to do for beginning calculus students. But future math majors might benefit from being exposed to just a bit more so I’ll give some examples.

Now, of course, being differentiable at a point means being continuous there (the limit of the numerator of the difference quotient must go to zero for the derivative to exist). And we all know examples of a function being continuous at a point but not being differentiable there. Examples: $|x|, x^{\frac{1}{3}}, x^{\frac{2}{3}}$ are all continuous at zero but none are differentiable there; these give examples of a corner, vertical tangent and a cusp respectively.

But for many of the piecewise defined examples, say, $f(x) = x$ for $x < 0$ and $x^2 + 2x$ for $x \geq 0$, the derivative fails to exist because the respective derivative functions fail to be continuous at $x =0$ (here the one-sided values are $1$ and $2$); the same is true of the other stated examples.

And of course, we can show that $x^{\frac{3k +2}{3}}$ has $k$ continuous derivatives at the origin but not $k+1$ derivatives.

But what about a function with a discontinuous derivative? Try $f(x) = x^2 sin(\frac{1}{x})$ for $x \neq 0$ and zero at $x =0$. It is easy to see that the derivative exists for all $x$ but the first derivative fails to be continuous at the origin.

The derivative is $0$ at $x = 0$ and $2x sin(\frac{1}{x}) -cos(\frac{1}{x})$ for $x \neq 0$ which is not continuous at the origin.
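A quick numerical look at why this derivative cannot be continuous at the origin (a sketch; the sampling sequence $x_n = \frac{1}{n\pi}$ is my own choice): along that sequence the $cos(\frac{1}{x})$ term alternates between $+1$ and $-1$ while the $2x sin(\frac{1}{x})$ term is negligible.

```python
import math

def fprime(x):
    """Derivative of x^2 sin(1/x), with the value 0 at x = 0 from the difference quotient."""
    if x == 0:
        return 0.0
    return 2 * x * math.sin(1 / x) - math.cos(1 / x)

# Sample f' along x_n = 1/(n*pi), a sequence tending to 0.
values = [fprime(1 / (n * math.pi)) for n in range(1000, 1006)]
print(values)
```

The printed values keep flipping between roughly $-1$ and $+1$, so $f'(x)$ has no limit as $x \rightarrow 0$ even though $f'(0) = 0$.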

Ok, what about a function that is differentiable at a single point only? There are different constructions, but here is one: the function $f(x) = x^2$ for $x$ rational, $x^3$ for $x$ irrational is both continuous and, yes, differentiable at $x = 0$ (nice application of the Squeeze Theorem on the difference quotient).

Yes, there are everywhere continuous, nowhere differentiable functions.

## October 4, 2018

### When is it ok to lie to students? part I

Filed under: calculus, derivatives, pedagogy — collegemathteaching @ 9:32 pm

We’ve arrived at logarithms in our calculus class, and, of course, I explained that $ln(ab) = ln(a) + ln(b)$ only holds for $a, b > 0$. That is all well and good.
And yes, I explained that expressions like $f(x)^{g(x)}$ only make sense when $f(x) > 0$.

But then I went ahead and did a problem of the following type: given $f(x) = \frac{x^3 e^{x^2} cos(x)}{x^4 + 1}$ by using logarithmic differentiation,

$f'(x) = \frac{x^3 e^{x^2} cos(x)}{x^4 + 1} (\frac{3}{x} + 2x -tan(x) -\frac{4x^3}{x^4+ 1})$

And you KNOW exactly what I did. Right?

Note that $f$ is differentiable for all $x$ and, well, the derivative *should* be continuous for all $x$ but..is it? Well, up to inessential singularities, it is. You see: the second factor is not defined for $x = 0, x = \frac{\pi}{2} \pm k \pi$, etc.

Well, let’s multiply it out and obtain:
$f'(x) = \frac{3x^2 e^{x^2} cos(x)}{x^4 + 1} + \frac{2x^4 e^{x^2} cos(x)}{x^4 + 1} - \frac{x^3 e^{x^2} sin(x)}{x^4 + 1}-\frac{4x^6 e^{x^2} cos(x)}{(x^4 + 1)^2}$

So, there is that. We might induce inessential singularities.

And there is the following: in the process of finding the derivative to begin with we did:

$ln(\frac{x^3 e^{x^2} cos(x)}{x^4 + 1}) = ln(x^3) + ln(e^{x^2}) + ln(cos(x)) - ln(x^4 + 1)$ and that expansion is valid only for
$x \in (0, \frac{\pi}{2}) \cup (\frac{3\pi}{2}, \frac{5\pi}{2}) \cup ....$ because we need $x^3 > 0$ and $cos(x) > 0$.

But the derivative formula works anyway. So what is the formula?

It is: if $f = \prod_{j=1}^k f_j$ where $f_j$ is differentiable, then $f' = \sum_{i=1}^k f'_i \prod_{j =1, j \neq i}^k f_j$ and verifying this is an easy exercise in induction.
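Here is a minimal Python sketch of that general product rule (the helper names `product_rule` and `log_form`, and the choice to write the quotient as a fourth factor $\frac{1}{x^4+1}$, are mine). It is applied to the example above and compared with the logarithmic-differentiation expression at a point where both are defined.

```python
import math

def product_rule(factors, derivs, x):
    """Sum over i of f_i'(x) times the product of the other factors f_j(x), j != i."""
    total = 0.0
    for i, dfi in enumerate(derivs):
        others = math.prod(fj(x) for j, fj in enumerate(factors) if j != i)
        total += dfi(x) * others
    return total

# The four factors of x^3 e^{x^2} cos(x) / (x^4 + 1).
factors = [lambda x: x ** 3,
           lambda x: math.exp(x * x),
           math.cos,
           lambda x: 1 / (x ** 4 + 1)]
derivs = [lambda x: 3 * x ** 2,
          lambda x: 2 * x * math.exp(x * x),
          lambda x: -math.sin(x),
          lambda x: -4 * x ** 3 / (x ** 4 + 1) ** 2]

def log_form(x):
    """The logarithmic-differentiation expression; undefined at x = 0."""
    f = math.prod(g(x) for g in factors)
    return f * (3 / x + 2 * x - math.tan(x) - 4 * x ** 3 / (x ** 4 + 1))

print(product_rule(factors, derivs, 0.0))                   # defined at x = 0
print(product_rule(factors, derivs, 1.0), log_form(1.0))    # agree where both are defined
```

The product-rule version never divides by a factor, which is why it has no trouble at $x = 0$ or at the zeros of $cos(x)$.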

But the logarithmic differentiation is really just a motivating idea that works for positive functions.

To make this complete: we’ll now tackle $y = f(x)^{g(x)}$ where it is essential that $f(x) > 0$.

Rewrite $y = e^{ln(f(x)^{g(x)})} = e^{g(x)ln(f(x))}$

Then $y' = e^{g(x)ln(f(x))} (g'(x) ln(f(x)) + g(x) \frac{f'(x)}{f(x)}) = f(x)^{g(x)}(g'(x) ln(f(x)) + g(x) \frac{f'(x)}{f(x)})$

This formula is a bit of a universal one. Let’s examine two special cases.

Suppose $g(x) = k$ some constant. Then $g'(x) =0$ and the formula becomes $y' = f(x)^k(k \frac{f'(x)}{f(x)}) = kf(x)^{k-1}f'(x)$ which is just the usual constant power rule with the chain rule.

Now suppose $f(x) = a$ for some positive constant. Then $f'(x) = 0$ and the formula becomes $y' = a^{g(x)}(ln(a)g'(x))$ which is the usual exponential function differentiation formula combined with the chain rule.
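The general formula for $f(x)^{g(x)}$ is easy to test numerically. The sketch below uses hypothetical test functions $f(x) = x^2 + 1$ (always positive) and $g(x) = sin(x)$, compares the formula with a symmetric difference quotient, and then checks the constant-exponent special case.

```python
import math

def power_derivative(f, df, g, dg, x):
    """y' for y = f(x)^g(x) with f(x) > 0, obtained from y = exp(g ln f)."""
    return f(x) ** g(x) * (dg(x) * math.log(f(x)) + g(x) * df(x) / f(x))

# Illustrative choices: f(x) = x^2 + 1, g(x) = sin(x).
f, df = (lambda x: x * x + 1), (lambda x: 2 * x)
g, dg = math.sin, math.cos

def y(x):
    return f(x) ** g(x)

x0, h = 0.7, 1e-6
formula = power_derivative(f, df, g, dg, x0)
numeric = (y(x0 + h) - y(x0 - h)) / (2 * h)
print(formula, numeric)

# Special case g = k constant: the formula should reduce to k f^(k-1) f'.
k = 3.0
constant_power = power_derivative(f, df, lambda x: k, lambda x: 0.0, x0)
print(constant_power, k * f(x0) ** (k - 1) * df(x0))
```

Each printed pair should agree to within the accuracy of the difference quotient.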

## September 8, 2018

### Proving a differentiation formula for f(x) = x ^(p/q) with algebra

Filed under: calculus, derivatives, elementary mathematics, pedagogy — collegemathteaching @ 1:55 am

Yes, I know that the proper way to do this is to prove the derivative formula for $f(x) = x^n$ and then use, say, the implicit function theorem or perhaps the chain rule.

But an early question asked students to use the difference quotient method to find the derivative function (ok, the “gradient”) for $f(x) = x^{\frac{3}{2}}$. And yes, one way to do this is to simplify the difference quotient $\frac{t^{\frac{3}{2}} -x^{\frac{3}{2}} }{t-x}$ by factoring $t^{\frac{1}{2}} -x^{\frac{1}{2}}$ from both the numerator and the denominator of the difference quotient. But this is rather ad-hoc, I think.

So what would one do with, say, $f(x) = x^{\frac{p}{q}}$ where $p, q$ are positive integers?

One way: look at the difference quotient: $\frac{t^{\frac{p}{q}}-x^{\frac{p}{q}}}{t-x}$ and do the following (before attempting a limit, of course): let $u= t^{\frac{1}{q}}, v =x^{\frac{1}{q}}$ at which point our difference quotient becomes: $\frac{u^p-v^p}{u^q -v^q}$

Now it is clear that $u-v$ is a common factor..but HOW it factors is essential.

So let’s look at a little bit of elementary algebra: one can show:

$x^{n+1} - y^{n+1} = (x-y) (x^n + x^{n-1}y + x^{n-2}y^2 + ...+ xy^{n-1} + y^n)$

$= (x-y)\sum^{n}_{i=0} x^{n-i}y^i$ (hint: very much like the geometric sum proof).

Using this:

$\frac{u^p-v^p}{u^q -v^q} = \frac{(u-v)\sum^{p-1}_{i=0} u^{p-1-i}v^i}{(u-v)\sum^{q-1}_{i=0} u^{q-1-i}v^i}=\frac{\sum^{p-1}_{i=0} u^{p-1-i}v^i}{\sum^{q-1}_{i=0} u^{q-1-i}v^i}$

Now as $t \rightarrow x$ we have $u \rightarrow v$ (for the purposes of substitution) so we end up with:

$\frac{\sum^{p-1}_{i=0} v^{p-1-i}v^i}{\sum^{q-1}_{i=0} v^{q-1-i}v^i} = \frac{pv^{p-1}}{qv^{q-1}} = \frac{p}{q}v^{p-q}$ (the number of terms is easy to count).

Now back substitute to obtain $\frac{p}{q} x^{\frac{(p-q)}{q}} = \frac{p}{q} x^{\frac{p}{q}-1}$ which, of course, is the familiar formula.

Note that this algebraic identity could have been used for the old $f(x) = x^n$ case to begin with.
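The cancellation can be checked with exact rational arithmetic. This is a sketch with illustrative values $p = 3$, $q = 2$, $v = \frac{5}{4}$ (so $x = v^q$); the function name `factored_quotient` is hypothetical.

```python
from fractions import Fraction

def factored_quotient(u, v, p, q):
    """(u^p - v^p)/(u^q - v^q) after cancelling the common factor (u - v)."""
    top = sum(u ** (p - 1 - i) * v ** i for i in range(p))
    bottom = sum(u ** (q - 1 - i) * v ** i for i in range(q))
    return top / bottom

p, q = 3, 2
v = Fraction(5, 4)        # v = x^(1/q)

# For u != v the cancelled form agrees exactly with the raw quotient ...
u = Fraction(3, 2)
raw = (u ** p - v ** p) / (u ** q - v ** q)
print(raw == factored_quotient(u, v, p, q))

# ... and, unlike the raw quotient, it can be evaluated at u = v.
at_limit = factored_quotient(v, v, p, q)
print(at_limit == Fraction(p, q) * v ** (p - q))
```

Both comparisons should print `True`: the value at $u = v$ is exactly $\frac{p}{q}v^{p-q}$.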

## February 22, 2018

### What is going on here: sum of cos(nx)…

Filed under: analysis, derivatives, Fourier Series, pedagogy, sequences of functions, series, uniform convergence — collegemathteaching @ 9:58 pm

This started innocently enough; I was attempting to explain why we have to be so careful when we attempt to differentiate a power series term by term; that when one talks about infinite sums, the “sum of the derivatives” might fail to exist if the sum is infinite.

Anyone who is familiar with Fourier Series and the square wave understands this well:

$\frac{4}{\pi} \sum^{\infty}_{k=1} \frac{1}{2k-1}sin((2k-1)x) = (\frac{4}{\pi})( sin(x) + \frac{1}{3}sin(3x) + \frac{1}{5}sin(5x) +.....)$ yields the “square wave” function (plus zero at the jump discontinuities)

Here I graphed to $2k-1 = 21$

Now the resulting function fails to even be continuous. But the resulting function is differentiable except for the points at the jump discontinuities and the derivative is zero for all but a discrete set of points.

(recall: here we have pointwise convergence; to get a differentiable limit, we need other conditions such as uniform convergence together with uniform convergence of the derivatives).

But, just for the heck of it, let’s differentiate term by term and see what we get:

$(\frac{4}{\pi})\sum^{\infty}_{k=1} cos((2k-1)x) = (\frac{4}{\pi})(cos(x) + cos(3x) + cos(5x) + cos(7x) +.....)...$

It is easy to see that this result doesn’t even converge to a function of any sort.

Example: let’s see what happens at $x = \frac{\pi}{4}: cos(\frac{\pi}{4}) = \frac{1}{\sqrt{2}}$

$cos(\frac{\pi}{4}) + cos(3\frac{\pi}{4}) =0$

$cos(\frac{\pi}{4}) + cos(3\frac{\pi}{4}) + cos(5\frac{\pi}{4}) = -\frac{1}{\sqrt{2}}$

$cos(\frac{\pi}{4}) + cos(3\frac{\pi}{4}) + cos(5\frac{\pi}{4}) + cos(7\frac{\pi}{4}) = 0$

And this repeats over and over again; no limit is possible.
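The repeating pattern is easy to reproduce. A minimal Python sketch (the function name and the number of terms are illustrative) that accumulates the partial sums at $x = \frac{\pi}{4}$, without the $\frac{4}{\pi}$ factor:

```python
import math

def partial_sums(x, terms):
    """Running sums cos(x) + cos(3x) + ... + cos((2n-1)x) for n = 1..terms."""
    total, sums = 0.0, []
    for k in range(1, terms + 1):
        total += math.cos((2 * k - 1) * x)
        sums.append(total)
    return sums

sums = partial_sums(math.pi / 4, 12)
print([round(s, 6) for s in sums])
```

Up to rounding, the list cycles through $\frac{1}{\sqrt{2}}, 0, -\frac{1}{\sqrt{2}}, 0$ with period 4.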

Something similar happens for $x = \frac{p}{q}\pi$ where $p, q$ are relatively prime positive integers.

But something weird is going on with this sum. I plotted the terms with $2k-1 \in \{1, 3, ...35 \}$

(and yes, I am using $\frac{\pi}{4} csc(x)$ as a type of “envelope function”)

BUT…if one, say, looks at $cos(29x) + cos(31x) + cos(33x) + cos(35x)$

we really aren’t getting a convergence (even at irrational multiples of $\pi$). But SOMETHING is going on!

I decided to plot to $cos(61x)$

Something is going on, though it isn’t convergence. Note: by accident, I found that the pattern falls apart when I skipped one of the terms.

This is something to think about.

I wonder: for all $x \in (0, \pi), sup_{n \in \{1, 3, 5, 7....\}}|\sum^{n}_{k \in \{1,3,..\}}cos(kx)| \leq |csc(x)|$ and we can somehow get close to $csc(x)$ for given values of $x$ by allowing enough terms…but the value of $x$ is determined by how many terms we are using (not always the same value of $x$).
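One way to probe this guess is the standard closed form for the partial sum (a textbook identity, not something derived in this post): multiplying by $2sin(x)$ and telescoping gives $\sum^{n}_{k=1} cos((2k-1)x) = \frac{sin(2nx)}{2sin(x)}$ whenever $sin(x) \neq 0$. If that is right, the partial sums (without the $\frac{4}{\pi}$ factor) are bounded by $\frac{1}{2}|csc(x)|$. The sketch below checks both statements numerically on a grid of my own choosing.

```python
import math

def odd_cosine_sum(x, n):
    """cos(x) + cos(3x) + ... + cos((2n-1)x), summed term by term."""
    return sum(math.cos((2 * k - 1) * x) for k in range(1, n + 1))

def closed_form(x, n):
    """sin(2nx) / (2 sin x), valid when sin(x) != 0."""
    return math.sin(2 * n * x) / (2 * math.sin(x))

xs = [0.1 * j for j in range(1, 31)]        # sample points in (0, pi)
ns = range(1, 32)

# Largest disagreement between the term-by-term sum and the closed form.
worst_gap = max(abs(odd_cosine_sum(x, n) - closed_form(x, n)) for x in xs for n in ns)

# Largest value of |partial sum| * sin(x); the closed form predicts at most 1/2.
worst_ratio = max(abs(odd_cosine_sum(x, n)) * math.sin(x) for x in xs for n in ns)
print(worst_gap, worst_ratio)
```

The closed form would also explain the plots: the partial sums oscillate like $sin(2nx)$ inside a fixed $csc$-shaped envelope and never settle down.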

## March 25, 2014

### The error term and approximation of derivatives

I’ll go ahead and work with the common 3 point derivative formulas:

This is the three-point endpoint formula: (assuming that $f$ has 3 continuous derivatives on the appropriate interval)

$f'(x_0) = \frac{1}{2h}(-3f(x_0) + 4f(x_0+h) -f(x_0 + 2h)) + \frac{h^2}{3} f^{(3)}(\omega)$ where $\omega$ is some point in the interval.

The three point midpoint formula is:

$f'(x_0) = \frac{1}{2h}(f(x_0 + h) -f(x_0 -h)) -\frac{h^2}{6}f^{(3)}(\omega)$.

These formulas can be derived either by using the Taylor series centered at $x_0$ or by using the Lagrange polynomial through the given points and differentiating.

That isn’t the point of this note though.

The point: how can one demonstrate, by an example, the role the error term plays?

I suggest trying the following: let $x$ vary from, say, 0 to 3 and let $h = .25$. Now use the three point derivative estimates on the following functions:

1. $f(x) = e^x$.

2. $g(x) = e^x + 10sin(\frac{\pi x}{.25})$.

Note one: the three point estimates for the derivatives will be exactly the same for both $f(x)$ and $g(x)$. It is easy to see why.

Note two: the “errors” will be very, very different. It is easy to see why: look at the third derivative term: for $f(x)$ it is $e^x$ while for $g(x)$ it is $e^x -10(\frac{\pi}{.25})^3cos(\frac{\pi x}{.25})$.

The graphs show the story.

Clearly, the 3 point derivative estimates cannot distinguish these two functions for these “sample values” of $x$, but one can see how in the case of $g$, the degree that $g$ wanders away from $f$ is directly related to the higher order derivative of $g$.
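The experiment can be sketched in a few lines of Python (the original used graphs; the function names and the exact grid $x = 0, .25, ..., 3$ are my assumptions). It computes both three-point estimates for $f$ and $g$ at the sample points and compares the midpoint estimate with the exact derivative of each function.

```python
import math

H = 0.25

def f(x):
    return math.exp(x)

def g(x):
    return math.exp(x) + 10 * math.sin(math.pi * x / H)

def dg(x):
    """Exact derivative of g, for comparison."""
    return math.exp(x) + 10 * (math.pi / H) * math.cos(math.pi * x / H)

def midpoint(func, x, h=H):
    return (func(x + h) - func(x - h)) / (2 * h)

def endpoint(func, x, h=H):
    return (-3 * func(x) + 4 * func(x + h) - func(x + 2 * h)) / (2 * h)

xs = [k * H for k in range(0, 13)]          # 0, 0.25, ..., 3

# How far apart the estimates for f and g are at the sample points.
estimate_gap = max(max(abs(midpoint(f, x) - midpoint(g, x)),
                       abs(endpoint(f, x) - endpoint(g, x))) for x in xs)

# How far the midpoint estimate is from the true derivative of each function.
error_f = max(abs(midpoint(f, x) - math.exp(x)) for x in xs)
error_g = max(abs(midpoint(g, x) - dg(x)) for x in xs)
print(estimate_gap, error_f, error_g)
```

The estimates coincide up to rounding because the sine term vanishes at every multiple of $h$, yet the error for $g$ is larger by orders of magnitude.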

## March 14, 2014

### Approximating the derivative and round off error: class demonstration

In numerical analysis we are covering “approximate differentiation”. One of the formulas we are using: $f'(x_0) = \frac{f(x_0 + h) -f(x_0 -h)}{2h} - \frac{h^2}{6} f^{(3)}(\zeta)$ where $\zeta$ is some number in $[x_0 -h, x_0 + h]$; of course we assume that the third derivative is continuous in this interval.

The derivation can be done in a couple of ways: one can either use the degree 2 Lagrange polynomial through $x_0-h, x_0, x_0 + h$ and differentiate or one can use the degree 2 Taylor polynomial expanded about $x = x_0$ and use $x = x_0 \pm h$ and solve for $f'(x_0)$; of course one runs into some issues with the remainder term if one uses the Taylor method.

But that isn’t the issue that I want to talk about here.

The issue: “what should we use for $h$?” In theory, we should get a better approximation if we make $h$ as small as possible. But if we are using a computer to make a numerical evaluation, we have to concern ourselves with round off error. So what we actually calculate will NOT be $\frac{f(x_0 + h) -f(x_0 -h)}{2h}$ but rather $\frac{\hat{f}(x_0 + h) -\hat{f}(x_0 -h)}{2h}$ where $\hat{f}(x_0 \pm h) = f(x_0 \pm h) - e(x_0 \pm h)$ and $e(x_0 \pm h)$ is the round off error incurred in calculating the function at $x = x_0 \pm h$ (respectively).

So, it is an easy algebraic exercise to show that:

$f'(x_0) - \frac{\hat{f}(x_0 + h) -\hat{f}(x_0 -h)}{2h} = - \frac{h^2}{6} f^{(3)}(\zeta)+\frac{e(x_0 +h) -e(x_0 -h)}{2h}$ and the magnitude of the actual error is bounded by $\frac{h^2 M}{6} + \frac{\epsilon}{h}$ where $M = max\{|f^{(3)}(\eta)|\}$ on some small neighborhood of $x_0$ and $\epsilon$ is a bound on the round-off error of representing $f(x_0 \pm h)$.

It is an easy calculus exercise (“take the derivative and set equal to zero and check concavity” easy) to see that this error bound is a minimum when $h = (\frac{3\epsilon}{M})^{\frac{1}{3}}$.

Now, of course, it is helpful to get a “ball park” estimate for what $\epsilon$ is. Here is one way to demonstrate this to the students: solve for $\epsilon$ and obtain $\frac{M h^3}{3} = \epsilon$ and then do some experimentation to determine $\epsilon$.

That is: obtain an estimate of $h$ by using this “3 point midpoint” estimate for a known derivative near a value of $x_0$ for which $M$ (a bound for the 3’rd derivative) is easy to obtain, and then obtain an educated guess for $\epsilon$.

Here are a couple of examples: one uses Excel and one uses MATLAB. I used $f(x) = e^x$ at $x = 0$; of course $f'(0) = 1$ and $M = 1$ is reasonable here (just a tiny bit off). I did the 3-point estimation calculation for various values of $h$ and saw where the error started to increase again.

Here is the Excel output for $f(x) = e^x$ at $x =0$ and at $x = 1$ respectively. In the first case, use $M = 1$ and in the second $M = e$

In the $x = 0$ case, we see that the error starts to increase again at about $h = 10^{-5}$; the same sort of thing appears to happen for $x = 1$.

So, in the first case, $\epsilon$ is about $\frac{1}{3} \times (10^{-5})^3 = 3.333 \times 10^{-16}$; it is roughly $10^{-15}$ at $x =1$.

Note: one can also approach $h$ by using powers of $\frac{1}{2}$ instead; something interesting happens in the $x = 0$ case; the $x = 1$ case gives results similar to what we’ve shown. Reason (I think): 1 is easy to represent in base 2 and the powers of $\frac{1}{2}$ can be represented exactly.

Now we turn to MATLAB and here we do something slightly different: we graph the error for different values of $h$. Since the values of $h$ are very small, we use a $-log_{10}$ scale by doing the following (approximating $f'(0)$ for $f(x) = e^x$)

By design, $N = -log_{10}(H)$. The graph looks like:

Now, the small error scale makes things hard to read, so we turn to using the log scale, this time on the $y$ axis: let $LE = -log_{10}(E)$ and run plot(N, LE):

and sure enough, you can see where the peak is: about $10^{-5}$, which is the same as EXCEL.
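For readers without Excel or MATLAB at hand, here is a rough Python sketch of the same experiment (not the original spreadsheet or script, which are not reproduced here). It sweeps $h = 10^{-1}, ..., 10^{-12}$ for $f(x) = e^x$ at $x_0 = 0$ and compares the observed behavior with the prediction $h = (\frac{3\epsilon}{M})^{\frac{1}{3}}$, taking $M = 1$ and, as an assumption, $\epsilon$ equal to double-precision machine epsilon.

```python
import math

def central_difference(func, x0, h):
    return (func(x0 + h) - func(x0 - h)) / (2 * h)

# Error of the 3 point midpoint estimate of f'(0) = 1 for f(x) = e^x.
errors = {}
for n in range(1, 13):
    h = 10.0 ** (-n)
    errors[n] = abs(central_difference(math.exp, 0.0, h) - 1.0)
    print(n, errors[n])

# Predicted best step with M = 1 and eps taken to be machine epsilon
# (an assumption: the true round-off bound for exp may differ by a small factor).
eps = 2.0 ** -52
predicted_h = (3 * eps) ** (1 / 3)
print(predicted_h)
```

The error should fall as $h$ shrinks, bottom out somewhere around $h = 10^{-5}$ or $10^{-6}$, and then grow again as round-off takes over; the predicted step lands in the same range.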

## February 24, 2014

### A real valued function that is differentiable at an isolated point

A friend of mine is covering the Cauchy-Riemann equations in his complex variables class and wondered if there is a real variable function that is differentiable at precisely one point.

The answer is “yes”, of course, but the example I could whip up on the spot is rather pathological.

Here is one example:

Let $f$ be defined as follows:

$f(x) =\left\{ \begin{array}{c} 0, x = 0 \\ \frac{1}{q^2}, x = \frac{p}{q} \\ x^2, x \ne \frac{p}{q} \end{array}\right.$

That is, $f(x) = x^2$ if $x$ is irrational or zero, and $f(x)$ is $\frac{1}{q^2}$ if $x$ is rational and $x = \frac{p}{q}$ where $gcd(p,q) = 1$.

Now calculate $lim_{x \rightarrow 0+} \frac{f(x) - f(0)}{x-0} = lim_{x \rightarrow 0+} \frac{f(x)}{x}$

Let $\epsilon > 0$ be given and choose a positive integer $M$ so that $M > \frac{1}{\epsilon}$. Let $\delta < \frac{1}{M}$. Now if $0 < x < \delta$ and $x$ is irrational, then $\frac{f(x)}{x} = \frac{x^2}{x} = x < \frac{1}{M} < \epsilon$.

Now the fun starts: if $x$ is rational, then $x = \frac{p}{q} < \frac{1}{M}$, so $q > pM \geq M$, and $\frac{f(x)}{x} = \frac{\frac{1}{q^2}}{\frac{p}{q}} = \frac{1}{qp} \leq \frac{1}{q} < \frac{1}{M} < \epsilon$.

We looked at the right hand limit; the left hand limit works in the same manner.

Hence the derivative of $f$ exists at $x = 0$ and is equal to zero. But zero is the only place where this function is even continuous because for any open interval $I$, $inf \{|f(x)| : x \in I \} = 0$.

## August 4, 2012

### Day 2, Madison MAA Mathfest

The day started with a talk by Karen King from the National Council of Teachers of Mathematics.
I usually find math education talks to be dreadful, but this one was pretty good.

The talk was about the importance of future math teachers (K-12) actually having some math background. However, she pointed out that students just having passed math courses didn’t imply that they understood the mathematical issues that they would be teaching…and it didn’t imply that their students would do better.

She gave an example: about half of those seeking to teach high school math couldn’t explain why “division by zero” was undefined! They knew that it was undefined but couldn’t explain why. I found that astonishing since I knew that in high school.

Later, she pointed out that potential teachers with a math degree didn’t understand what the issues were in defining a number like $2^{\pi}$. Of course, a proper definition of this concept requires at least limits or at least a rigorous definition of the log function and she was well aware that the vast majority of high school students aren’t ready for such things. Still, the instructor should be; as she said “we all wave our hands from time to time, but WE should know when we are waving our hands.”

She stressed that we need to get future math teachers to get into the habit (she stressed the word: “habit”) of always asking themselves “why is this true” or “why is it defined in this manner”; too many of our math major courses are rule bound, and at times we write our exams in ways that reward memorization only.

Next, Bernd Sturmfels gave the second talk in his series; this was called Convex Algebraic Geometry.

You can see some of the material here. He also led this into the concept of “Semidefinite programming”.

The best I can tell: one looks at the objects studied by algebraic geometers (root sets of polynomials of several variables) and then takes an “affine slice” of these objects.

One example: the “m-ellipse” is the set of points on the plane that satisfy $\sum^m_{k=1} \sqrt{(x-u_k)^2 + (y-v_k)^2} = d$ where $(u_k, v_k)$ are points in the plane.

Questions: what is the degree of the polynomial that describes the ellipse? What happens if we let $d$ tend to zero? What is the smallest $d$ for which the ellipse is non-vanishing (Fermat-Weber point)? Note: the 1-ellipse is the circle, the 2-ellipse is what we usually think of as an ellipse, and the 3-ellipse has degree 8.

Note: these types of surfaces can be realized as the determinant of a symmetric matrix; these matrices have real eigenvalues. We can plot curves over which an eigenvalue goes to zero and then changes sign. This process leads to what is known as a spectrahedron; this is a type of shape in space. A polyhedron can be thought of as the spectrahedron of a diagonal matrix.

Then one can seek to optimize a linear function over a spectrahedron; this leads to semidefinite programming, which, in general, is roughly as difficult as linear programming.

One use: some global optimization problems can be reduced to a semidefinite programming problem (not all).

Shorter Talks
There was a talk by Bob Palais which discussed the role of Rodrigues in the discovery of the quaternions. The idea is that Rodrigues discovered the quaternions before Hamilton did; but he talked about these in terms of rotations in space.

There were a few talks about geometry and how to introduce concepts to students; of particular interest was the concept of a geodesic. Ruth Berger talked about the “fish swimming in jello” model: basically suppose you had a sea of jello where the jello’s density was determined by its depth with the most dense jello (turning to infinite density) at the bottom; and it took less energy for the fish to swim in the less dense regions. Then if a fish wanted to swim between two points, what path would it take? The geometry induced by these geodesics results in the upper half plane model for hyperbolic space.

Nick Scoville gave a talk about discrete Morse theory. Here is a user’s guide. The idea: take a simplicial complex and assign numbers (integers) to the points, segments, triangles, etc. The assignment has to follow rules; basically the boundary of a complex has to have a lower number than what it bounds (with one exception….) and such an assignment leads to a Morse function. Critical sets can be defined and the various Betti numbers can be calculated.

Christopher Frayer then talked about the geometry of cubic polynomials. This is more interesting than it sounds.
Think about this: remember Rolle’s Theorem from calculus? There is an analogue of this in complex variables called the Gauss-Lucas Theorem. Basically, the roots of the derivative lie in the convex hull of the roots of the polynomial. Then there is Marden’s Theorem for polynomials of degree 3. One can talk about polynomials that have a root of $z = 1$ and two other roots in the unit circle; then one can study where the roots of the derivative lie. For a certain class of these polynomials, there is a dead circle tangent to the unit circle at 1 which encloses no roots of the derivative.
