College Math Teaching

March 10, 2023

Normal distribution: annoying calculations

I am thinking about doing a series of videos on the annoying but necessary calculations one encounters in basic calculus-based statistics classes. I made the first video, which I am posting here. Below I post the whiteboards (too lazy to typeset).

The video will be posted as soon as it is ready.

August 1, 2014

Yes, engineers DO care about that stuff…..

I took a break and watched a 45-minute video on Fourier transforms.

A few take away points for college mathematics instructors:

1. When one talks about the Laplace transform, one should distinguish between the one-sided and two-sided transforms (i.e., the latter integrates over the full real line rather than from 0 to \infty ; both conventions are written out after this list).

2. Engineers care about being able to take limits (e.g., using L'Hôpital's rule) and about problems such as \lim_{x \rightarrow 0} \frac{\sin(2x)}{x} (worked out after this list).

3. Engineers care about DOMAINS; they matter a great deal.

4. Sometimes they dabble in taking limits of sequences of functions (in an informal sense); here the Dirac delta (a generalized function, or distribution) is developed (informally) as a limit of Fourier transforms of a pulse function of height 1 and increasing width.

5. Even students at MIT have to be goaded into offering answers.

6. They care about doing algebra, especially in the case of a change of variable.
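For reference (these are standard facts, not taken from the video): the two transform conventions from item 1 are \int_{-\infty}^{\infty} e^{-st}f(t) dt (two-sided) versus \int_{0}^{\infty} e^{-st}f(t) dt (one-sided), and the limit from item 2 can be computed by L'Hôpital's rule: \lim_{x \rightarrow 0} \frac{\sin(2x)}{x} = \lim_{x \rightarrow 0} \frac{2\cos(2x)}{1} = 2 .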

So, I am teaching two sections of first semester calculus. I will emphasize things that students (and sometimes, faculty members of other departments) complain about.

August 11, 2011

Quantum Mechanics and Undergraduate Mathematics XIII: simplifications and wave-particle duality

In an effort to make the subject a bit more accessible to undergraduate mathematics students who haven’t had much physics training, we’ve made some simplifications. We’ve dealt with the “one dimensional, non-relativistic situation” which is fine. But we’ve also limited ourselves to the case where:
1. state vectors are actual functions (like those we learn about in calculus)
2. eigenvalues are discretely distributed (i.e., the set of eigenvalues has no limit points in the usual topology of the real line)
3. each eigenvalue corresponds to a unique eigenvector.

In this post we will see what trouble simplifications 1 and 2 cause and why they cannot be lived with. Hey, quantum mechanics is hard!

Finding Eigenvectors for the Position Operator
Let X denote the “position” operator and let us seek out the eigenvectors for this operator.
So X\delta = x_0 \delta where \delta is the eigenvector and x_0 is the associated eigenvalue.
This means x\delta = x_0\delta which implies (x-x_0)\delta = 0 .
This means that for x \neq x_0, \delta = 0 , and \delta can be anything for x = x_0 . This would appear to allow the eigenvector to be the “everywhere zero except at x_0 ” function. So let \delta be such a function. But then if \psi is any state vector, \int_{-\infty}^{\infty} \overline{\delta}\psi dx = 0 and \int_{-\infty}^{\infty} \overline{\delta}\delta dx = 0 . Clearly this is unacceptable; we need (at least up to a constant multiple) \int_{-\infty}^{\infty} \overline{\delta}\delta dx = 1 .

The problem is that restricting our eigenvectors to the class of functions is just too restrictive to give us results; we have to broaden the class of eigenvectors. One way to do that is to allow distributions to be eigenvectors; the distribution we need here is the Dirac delta. In the reference I linked to, one can see how the Dirac delta can be thought of as a sort of limit of valid probability density functions. Note: \overline{\delta} = \delta .

So if we let \delta_0 denote the Dirac delta concentrated at x = x_0 , we recall that \int_{-\infty}^{\infty} \delta_0 \psi dx = \psi(x_0) . This means that the probability density function associated with the position operator is P(X = x_0) = |\psi(x_0)|^2 .
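As a quick sanity check of the sifting property, here is a rough numerical sketch (my own, not part of the original argument; the test function \psi , the point x_0 and the widths are arbitrary choices, and narrow normal densities stand in for \delta_0 ):

import numpy as np

# Approximate the Dirac delta centered at x0 by narrow normal densities
# and check the sifting property: the integral of delta_eps * psi -> psi(x0).
x0 = 1.0
psi = lambda x: np.exp(-x**2) * np.cos(x)   # an arbitrary smooth test function

x = np.linspace(-10.0, 10.0, 200001)
dx = x[1] - x[0]
for eps in (0.5, 0.1, 0.01):
    delta_eps = np.exp(-(x - x0)**2 / (2 * eps**2)) / (eps * np.sqrt(2 * np.pi))
    approx = np.sum(delta_eps * psi(x)) * dx   # Riemann-sum approximation of the integral
    print(eps, approx, psi(x0))                # approx -> psi(x0) as eps -> 0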

This has an interesting consequence: if we measure the particle’s position at x = x_0 then the state vector becomes \delta_0 . So the new density function based on an immediate measurement of position would be P( X = x_0) = |\langle \delta_0, \delta_0 \rangle|^2 = 1 and P(X = x) = 0 elsewhere. The particle behaves like a particle with a definite “point” position.

Momentum: a different sort of problem

At first the momentum operator P\psi = -i \hbar \frac{d\psi}{dx} seems less problematic. Finding the eigenvectors and eigenvalues is a breeze: if \theta_0 is the eigenvector with eigenvalue p_0 then:
\frac{d}{dx} \theta_0 = \frac{i}{\hbar}p_0\theta_0 , which has solution \theta_0 = \exp(i p_0 \frac{x}{\hbar}) .
Do you see the problem?

There are a couple of them: first, this provides no restriction on the eigenvalues; in fact the eigenvalues can be any real number. This violates simplification number 2. Secondly, |\theta_0|^2 = 1 everywhere, therefore \langle \theta_0, \theta_0 \rangle = \int_{-\infty}^{\infty} |\theta_0|^2 dx = \infty . Our function is far from square integrable and therefore not a valid “state vector” in its present form. This is where the famous “normalization” comes into play.

Mathematically, one way to do this is to restrict the domain (say, limit the non-zero part to x_0 < x < x_1 ) and multiply by an appropriate constant.
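For instance (a sketch of the standard box normalization; the endpoints x_0 < x_1 are arbitrary): keep \theta_0 only on x_0 < x < x_1 and multiply by \frac{1}{\sqrt{x_1 - x_0}} . Since |\theta_0|^2 = 1 , the normalized function satisfies \int_{x_0}^{x_1} \frac{1}{x_1 - x_0} |\theta_0|^2 dx = 1 , as a state vector must.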

Getting back to our state vector: \exp(ip_0 \frac{x}{\hbar}) = \cos(\frac{p_0 x}{\hbar}) + i \sin(\frac{p_0 x}{\hbar}) . So if we measure momentum, we have basically given the particle a wave characteristic with wavelength \frac{2\pi \hbar}{p_0} = \frac{h}{p_0} .

Now what about the duality? Suppose we start by measuring a particle’s position, thereby putting the state vector into \psi = \delta_0 . Now what would be the expectation of momentum? We know that the formula is E(P) = -i\hbar \int_{-\infty}^{\infty} \delta_0 \frac{\partial \delta_0}{\partial x} dx . But this quantity is undefined because \frac{\partial \delta_0}{\partial x} is undefined.

If we start in a momentum eigenvector and then wish to calculate the position density function (the expectation will be undefined), we see that |\theta_0|^2 = 1 which can be interpreted to mean that any position measurement is equally likely.

Clearly, momentum and position are not compatible operators. So let’s calculate XP - PX :
XP \phi = x(-i\hbar \frac{d}{dx} \phi) = -i\hbar x \frac{d}{dx} \phi and PX\phi = -i \hbar\frac{d}{dx} (x \phi) = -i \hbar (\phi + x \frac{d}{dx}\phi) , hence (XP - PX)\phi = i\hbar \phi . Therefore XP-PX = i\hbar . Therefore our generalized uncertainty relation tells us \Delta X \Delta P \geq \frac{1}{2}\hbar
(yes, one might object that \Delta X really shouldn’t be defined….) but this uncertainty relation does hold up. So if one uncertainty is zero, then the other must be infinite; exact position means no defined momentum and vice versa.
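One can verify the commutator computation symbolically; here is a small sketch using the sympy library (the symbol names are my own):

import sympy as sp

x, hbar = sp.symbols('x hbar', real=True)
phi = sp.Function('phi')(x)

# X is multiplication by x; P is -i*hbar*(d/dx)
XPphi = x * (-sp.I * hbar * sp.diff(phi, x))
PXphi = -sp.I * hbar * sp.diff(x * phi, x)

print(sp.simplify(XPphi - PXphi))   # I*hbar*phi(x), i.e. (XP - PX)phi = i*hbar*phi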

So: an exact, point-like position means no defined momentum is possible (hence no wave-like behavior), while an exact momentum (pure wave) means no exact point-like position is possible. Also, remember that a measurement of position leaves the system in the point-like state vector \delta_0 , which destroys the wave-like property; a measurement of momentum leaves it in the wave-like state vector \theta_0 and therefore destroys any point-like behavior (any location is equally likely to be observed).

January 7, 2011

The Dirac Delta Function in an Elementary Differential Equations Course

The Dirac Delta Function in Differential Equations

The delta ”function” is often introduced into differential equations courses during the section on Laplace transforms. Of course the delta
”function” isn’t a function at all but rather what is known as a ”distribution” (more on this later).

A typical introduction is as follows: if one is working in classical mechanics and one applies a force F(t) to a constant mass m, then one can define the impulse I of F over an interval [a,b] by I=\int_{a}^{b}F(t)dt=m(v(b)-v(a)) where v is the velocity. So we can do a translation to set a=0 and then consider a unit impulse and vary F(t)
according to where b is; that is, define
\delta ^{\varepsilon }(t)=\begin{cases}\frac{1}{\varepsilon }, & 0\leq t\leq \varepsilon \\ 0, & \text{elsewhere}\end{cases} .

Then F(t)=\delta ^{\varepsilon }(t) is the force function that produces unit impulse for a given \varepsilon >0.

Then we wave our hands and say \delta (t)=\lim _{\varepsilon \rightarrow 0}\delta ^{\varepsilon }(t) (this is a great reason to introduce the concept of the limit of functions in a later course) and then argue that for all functions f that are continuous over an interval containing 0,
\int_{0}^{\infty }\delta (t)f(t)dt=f(0).

The (hand waving) argument at this stage goes something like: ”the mean value theorem for integrals says that there is a c_{\varepsilon }
between 0 and \varepsilon such that \int_{0}^{\varepsilon }\delta^{\varepsilon }(t)f(t)dt=\frac{1}{\varepsilon}f(c_{\varepsilon})(\varepsilon -0)=f(c_{\varepsilon }) . Therefore as \varepsilon\rightarrow 0, \int_{0}^{\varepsilon }\delta^{\varepsilon}(t)f(t)dt=f(c_{\varepsilon })\rightarrow f(0) by continuity.” Therefore we can define the Laplace transform L(\delta (t))=\int_{0}^{\infty }\delta (t)e^{-st}dt=e^{-s\cdot 0}=1.
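One can also get the value 1 from the approximating pulses directly (a quick supplementary computation, not in the original argument): L(\delta ^{\varepsilon })(s)=\int_{0}^{\varepsilon }\frac{1}{\varepsilon }e^{-st}dt=\frac{1-e^{-s\varepsilon }}{s\varepsilon }\rightarrow 1 as \varepsilon \rightarrow 0 (for instance by L'Hôpital's rule in \varepsilon ), which is consistent with defining L(\delta (t))=1.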

Illustrating what the delta ”function” does.

I came across this example by accident; I was holding a review session for students and asked for them to give me a problem to solve.

They chose y^{\prime \prime }+ay^{\prime }+by=\delta (I can't remember what a and b were, but they aren't important here, as we will see) with initial conditions y(0)=0,y^{\prime }(0)=-1 .

So using the Laplace transform, we obtained:

(s^{2}+as+b)Y-sy(0)-y^{\prime }(0)-ay(0)=1

But with y(0)=0,y^{\prime }(0)=-1 this reduces to (s^{2}+as+b)Y+1=1\rightarrow Y=0

In other words, we have the ”same solution” as if we had y^{\prime\prime }+ay^{\prime }+by=0 with y(0)=0,y^{\prime }(0)=0.

So that might be a way to talk about the delta ”function”: it is exactly the ”impulse” one needs to ”cancel out” an initial velocity of -1 or,
equivalently, to impart an initial velocity of +1, and to do so instantly.
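Here is a rough numerical illustration of that interpretation (my own sketch, not from the original discussion; the values a = 2, b = 5 and the pulse widths are arbitrary choices). Approximate \delta by the pulse \delta ^{\varepsilon } and solve y^{\prime \prime }+ay^{\prime }+by=\delta ^{\varepsilon }(t) with y(0)=0, y^{\prime }(0)=-1 ; as \varepsilon shrinks, the computed solution hugs y=0 :

import numpy as np
from scipy.integrate import solve_ivp

a, b = 2.0, 5.0   # arbitrary coefficients; the point does not depend on them

def rhs(t, y, eps):
    pulse = 1.0 / eps if 0.0 <= t <= eps else 0.0   # the pulse delta^eps
    return [y[1], pulse - a * y[1] - b * y[0]]

for eps in (0.1, 0.01, 0.001):
    sol = solve_ivp(rhs, (0.0, 5.0), [0.0, -1.0], args=(eps,),
                    max_step=eps / 10, rtol=1e-8)
    print(eps, np.max(np.abs(sol.y[0])))   # max |y(t)| shrinks toward 0 with eps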

Another approach to the delta function

Though it is true that \int_{-\infty }^{\infty }\delta^{\varepsilon }(t)dt=1 for all \varepsilon and
\int_{-\infty}^{\infty }\delta (t)dt=1 by design, note that \delta ^{\varepsilon }(t) fails to be continuous at 0 and at \varepsilon .

So, can we obtain the delta ”function” as a limit of other functions that are everywhere continuous and differentiable?

In an attempt to find such a family of functions, it is a fun exercise to look at a limit of normal density functions with mean zero:

f_{\sigma }(t)=\frac{1}{\sigma \sqrt{2\pi }}\exp (-\frac{1}{2\sigma ^{2}}t^{2}). Clearly for all
\sigma >0,\int_{-\infty }^{\infty }f_{\sigma}(t)dt=1 and \int_{0}^{\infty }f_{\sigma }(t)dt=\frac{1}{2}.

Here are the graphs of some of these functions; we use \sigma = .5 , \sigma = .25 and \sigma = .1 respectively.

[Figure: the three density functions f_{\sigma } plotted together]

Calculating the Laplace transform

L(\frac{1}{\sigma \sqrt{2\pi }}\exp (-\frac{1}{2\sigma ^{2}}t^{2}))= \frac{1}{\sigma \sqrt{2\pi }}\int_{0}^{\infty }\exp (-\frac{1}{2\sigma^{2}}t^{2})\exp (-st)dt .

Combine the exponentials, complete the square, and do a little algebra to obtain:

\frac{1}{\sigma \sqrt{2\pi }}\int_{0}^{\infty }\exp (-\frac{1}{2\sigma ^{2}}(t+\sigma ^{2}s)^{2})\exp (\frac{s^{2}\sigma^{2}}{2})dt=\exp (\frac{s^{2}\sigma ^{2}}{2})[\frac{1}{\sigma \sqrt{2\pi }}\int_{0}^{\infty }\exp (-\frac{1}{2\sigma ^{2}}(t+\sigma^{2}s)^{2})dt]

Now do the usual transformation to the standard normal random variable via z=\dfrac{t+\sigma ^{2}s}{\sigma }

And we obtain:

L(f_{\sigma }(t))=\exp (\frac{s^{2}\sigma ^{2}}{2})P(Z>\sigma s) for all \sigma >0. Note: assume s>0 ; here Z denotes a standard normal random variable, so P(Z>\sigma s) is the usual tail probability.

Now if we take a limit as \sigma \rightarrow 0 we get \frac{1}{2} on the right hand side.
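A quick numerical check of this closed form and of the limit (my own sketch; s = 1 is an arbitrary positive choice):

import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

s = 1.0   # any fixed s > 0 works
for sigma in (1.0, 0.5, 0.1, 0.01):
    f = lambda t: np.exp(-t**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))
    # numerically compute the one-sided Laplace transform of f_sigma at s
    numeric, _ = quad(lambda t: f(t) * np.exp(-s * t), 0.0, 50.0, points=[5 * sigma])
    closed_form = np.exp(s**2 * sigma**2 / 2) * norm.sf(sigma * s)   # exp(s^2 sigma^2 / 2) * P(Z > sigma s)
    print(sigma, numeric, closed_form)   # both approach 1/2 as sigma -> 0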

Hence, one way to define \delta is as 2\lim _{\sigma \rightarrow0}f_{\sigma }(t) . This means that while
\lim_{\sigma \rightarrow0}\int_{-\infty }^{\infty }2f_{\sigma }(t)dt is off by a factor of 2,
\lim_{\sigma \rightarrow 0}\int_{0}^{\infty }2f_{\sigma }(t)dt=1 as desired.

Since we now have derivatives of the functions to examine, why don’t we?

\frac{d}{dt}2f_{\sigma }(t)=-\frac{2t}{\sigma ^{3}\sqrt{2\pi }}\exp (-\frac{1}{2\sigma ^{2}}t^{2}) , which is zero at t=0 for all \sigma >0. But the behavior of the derivative is interesting: it attains its minimum at t=\sigma and its maximum at t=-\sigma (as we tell our probability students: the standard deviation is the distance from the origin to the inflection points). As \sigma \rightarrow 0 , the inflection points move closer together and the second derivative at the
origin approaches -\infty , which can be thought of as an instant drop from a positive velocity at t=0 .
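To make that last claim concrete (a short supplementary computation): \frac{d^{2}}{dt^{2}}2f_{\sigma }(t)=\frac{2}{\sigma ^{3}\sqrt{2\pi }}(\frac{t^{2}}{\sigma ^{2}}-1)\exp (-\frac{1}{2\sigma ^{2}}t^{2}) , which vanishes at t=\pm \sigma (the inflection points) and equals -\frac{2}{\sigma ^{3}\sqrt{2\pi }} at t=0 , a value that indeed goes to -\infty as \sigma \rightarrow 0 .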

Here are the graphs of the derivatives of the density functions that were plotted above; note how the part of the graph through the origin becomes more vertical as the standard deviation approaches zero.

[Figure: the derivatives of the three density functions plotted together]
