College Math Teaching

August 11, 2011

Quantum Mechanics and Undergraduate Mathematics XIII: simplifications and wave-particle duality

In an effort to make the subject a bit more accessible to undergraduate mathematics students who haven’t had much physics training, we’ve made some simplifications. We’ve dealt with the “one dimensional, non-relativistic situation” which is fine. But we’ve also limited ourselves to the case where:
1. state vectors are actual functions (like those we learn about in calculus)
2. eigenvalues are discretely distributed (i. e., the set of eigenvalues has no limit points in the usual topology of the real line)
3. each eigenvalue corresponds to a unique eigenvector.

In this post we will see what trouble simplifications 1 and 2 cause and why they cannot be lived with. Hey, quantum mechanics is hard!

Finding Eigenvectors for the Position Operator
Let X denote the “position” operator and let us seek out the eigenvectors for this operator.
So X\delta = x_0 \delta where \delta is the eigenvector and x_0 is the associated eigenvalue.
This means x\delta = x_0\delta which implies (x-x_0)\delta = 0 .
This means that for x \neq x_0, \delta = 0 and \delta can be anything for x = x_0 . This would appear to allow the eigenvector to be the “everywhere zero except for x_0 ” function. So let \delta be such a function. But then if \psi is any state vector, \int_{-\infty}^{\infty} \overline{\delta}\psi dx = 0 and \int_{-\infty}^{\infty} \overline{\delta}\delta dx = 0 . Clearly this is unacceptable; we need (at least up to a constant multiple) \int_{-\infty}^{\infty} \overline{\delta}\delta dx = 1 .

The problem is that restricting our eigenvectors to the class of functions is just too restrictive to give us results; we have to broaden the class of eigenvectors. One way to do that is to allow for distributions to be eigenvectors; the distribution we need here is the Dirac delta. In the reference I linked to, one can see how the Dirac delta can be thought of as a sort of limit of valid probability density functions. Note: \overline{\delta} = \delta .

So if we let \delta_0 denote the Dirac delta concentrated at x = x_0 , we recall that \int_{-\infty}^{\infty} \delta_0 \psi dx = \psi(x_0) . This means that the probability density function associated with the position operator is P(X = x_0) = |\psi(x_0)|^2 .
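A quick numerical illustration (a sketch I am adding here, with an arbitrary decaying sample function standing in for \psi ): approximate \delta_0 by increasingly narrow normalized Gaussians and watch the integral against \psi converge to \psi(x_0) .

```python
# Sketch (added): approximate the Dirac delta at x0 by narrow Gaussians and
# check the sifting property against an arbitrary sample function.
import numpy as np

x0 = 1.5
x = np.linspace(-20.0, 20.0, 400001)
dx = x[1] - x[0]

def psi(x):
    # arbitrary, rapidly decaying sample function standing in for a state vector
    return np.exp(-x**2) * np.cos(x)

for eps in [1.0, 0.3, 0.1, 0.01]:
    delta_eps = np.exp(-(x - x0)**2 / (2.0 * eps**2)) / (eps * np.sqrt(2.0 * np.pi))
    print(f"eps = {eps:5.2f}   integral = {np.sum(delta_eps * psi(x)) * dx:.6f}")

print(f"psi(x0)             = {psi(x0):.6f}")
```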

This has an interesting consequence: if we measure the particle’s position at x = x_0 then the state vector becomes \delta_0 . So the new density function based on an immediate measurement of position would be P( X = x_0) = |\langle \delta_0, \delta_0 \rangle|^2 = 1 and P(X = x) = 0 elsewhere. The particle behaves like a particle with a definite “point” position.

Momentum: a different sort of problem

At first the momentum operator P\psi = -i \hbar \frac{d\psi}{dx} seems less problematic. Finding the eigenvectors and eigenvalues is a breeze: if \theta_0 is the eigenvector with eigenvalue p_0 then:
\frac{d}{dx} \theta_0 = \frac{i}{\hbar}p_0\theta_0 has solution \theta_0 = exp(i p_0 \frac{x}{\hbar}) .
Do you see the problem?

There are a couple of them: first, this provides no restriction on the eigenvalues; in fact the eigenvalues can be any real number. This violates simplification number 2. Secondly, |\theta_0|^2 = 1 , therefore \langle \theta_0, \theta_0 \rangle = \int_{-\infty}^{\infty} |\theta_0|^2 dx = \infty . Our function is far from square integrable and therefore not a valid “state vector” in its present form. This is where the famous “normalization” comes into play.

Mathematically, one way to do this is to restrict the domain (say, limit the non-zero part to x_0 < x < x_1 ) and multiply by an appropriate constant.

Getting back to our state vector: exp(ip_0 \frac{x}{\hbar}) = cos(\frac{p_0 x}{\hbar}) + i sin(\frac{p_0 x}{\hbar}) . So if we measure momentum, we have basically given a particle a wave characteristic with wavelength \frac{2 \pi \hbar}{p_0} = \frac{h}{p_0} .

Now what about the duality? Suppose we start by measuring a particle’s position thereby putting the state vector into \psi = \delta_0 . Now what would be the expectation of momentum? We know that the formula is E(P) = -i\hbar \int_{-\infty}^{\infty} \delta_0 \frac{\partial \delta_0}{\partial x} dx . But this quantity is undefined because \frac{\partial \delta_0}{\partial x} is undefined.

If we start in a momentum eigenvector and then wish to calculate the position density function (the expectation will be undefined), we see that |\theta_0|^2 = 1 which can be interpreted to mean that any position measurement is equally likely.

Clearly, momentum and position are not compatible operators. So let’s calculate XP - PX
XP \phi = x(-i\hbar \frac{d}{dx} \phi) = -xi\hbar \frac{d}{dx} \phi and PX\phi = -i \hbar\frac{d}{dx} (x \phi) = -i \hbar (\phi + x \frac{d}{dx}\phi) hence (XP - PX)\phi = i\hbar \phi . Therefore XP-PX = i\hbar . Therefore our generalized uncertainty relation tells us \Delta X \Delta P \geq \frac{\hbar}{2}
(yes, one might object that \Delta X really shouldn’t be defined….) but this uncertainty relation does hold up. So if one uncertainty is zero, then the other must be infinite; exact position means no defined momentum and vice versa.
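A quick symbolic check (added) of the commutator computation above, using sympy:

```python
# Sketch (added): verify symbolically that (XP - PX) acts as multiplication by i*hbar.
import sympy as sp

x = sp.symbols('x', real=True)
hbar = sp.symbols('hbar', positive=True)
phi = sp.Function('phi')

X = lambda f: x * f                            # position operator
P = lambda f: -sp.I * hbar * sp.diff(f, x)     # momentum operator

commutator = X(P(phi(x))) - P(X(phi(x)))
print(sp.simplify(commutator))                 # prints I*hbar*phi(x)
```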

So: exact, pointlike position means no defined momentum is possible (hence no wave like behavior) but an exact momentum (pure wave) means no exact pointlike position is possible. Also, remember that measurement of position leaves the system in the point like state vector \delta_0 , which destroys the wave like property; measurement of momentum leaves it in the wave like state vector \theta_0 and therefore destroys any point like behavior (any location is equally likely to be observed).

Quantum Mechanics and Undergraduate Mathematics XII: position and momentum operators


Recall that the position operator is X \psi = x\psi and the momentum operator P \psi = -i\hbar \frac{d}{dx} \psi .

Recalling our abuse of notation that said that the expected value E = \langle \psi, A \psi \rangle , we find that the expected value of position is E(X) = \int_{-\infty}^{\infty} x |\psi|^2 dx . Note: since \int_{-\infty}^{\infty}  |\psi|^2 dx = 1, we can view |\psi|^2 as a probability density function; hence if f is any “reasonable” function of x , then E(f(X)) = \int_{-\infty}^{\infty} f(x) |\psi|^2 dx . Of course we can calculate the variance and other probability moments in a similar way; e. g. E(X^2) =  \int_{-\infty}^{\infty} x^2 |\psi|^2 dx .

Now we turn to momentum; E(P) = \langle \psi, -i\hbar \frac{d}{dx} \psi \rangle = -i\hbar \int_{-\infty}^{\infty} \overline{\psi}\frac{d}{dx}\psi dx and E(P^2) = \langle \psi, P^2\psi \rangle = \langle P\psi, P\psi \rangle = \hbar^2 \int_{-\infty}^{\infty} |\frac{d}{dx}\psi|^2 dx

So, back to position: we can now use the fact that |\psi|^2 is a valid density function associated with finding the expected value of position and call this the position probability density function. Hence P(x_1 < x < x_2) = \int_{x_1}^{x_2} |\psi|^2 dx . But we saw that this can change with time so: P(x_1 < x < x_2; t) = \int_{x_1}^{x_2} |\psi(x,t)|^2 dx

This is a great chance to practice putting together: differentiation under the integral sign, Schrödinger’s equation and integration by parts. I recommend that the reader try to show:

\frac{d}{dt} \int_{x_1}^{x_2} \overline{\psi}\psi dx = \frac{i\hbar}{2m}(\overline{\psi}\frac{d \psi}{dx}-\psi \frac{d \overline{\psi}}{dx})\Big|_{x_1}^{x_2}

The details for the above calculation (students: try this yourself first! 🙂 )

Differentiation under the integral sign:
\frac{d}{dt} \int_{x_1}^{x_2} \overline{\psi} \psi dx = \int_{x_1}^{x_2}\overline{\psi} \frac{\partial \psi}{\partial t} + \psi \frac{\partial \overline{ \psi}}{\partial t} dx

Schrödinger’s equation (time dependent version) with a little bit of algebra:
\frac{\partial \psi}{\partial t} = \frac{i \hbar}{2m} \frac{\partial^2 \psi}{\partial x^2} - \frac{i}{\hbar}V \psi
\frac{\partial \overline{\psi}}{\partial t} = -\frac{i \hbar}{2m} \frac{\partial^2 \overline{\psi}}{\partial x^2} + \frac{i}{\hbar}V \overline{\psi}

Note: V is real.

Algebra: multiply the top equation by \overline{\psi} and the second by \psi ; adding the two eliminates the potential terms and we obtain:
\overline{\psi} \frac{\partial \psi}{\partial t} + \psi \frac{\partial \overline{ \psi}}{\partial t} = \frac{i \hbar}{2m}(\overline{\psi} \frac{\partial^2 \psi}{\partial x^2} - \psi \frac{\partial^2 \overline{ \psi}}{\partial x^2})

Now integrate by parts:
\frac{i \hbar}{2m} \int_{x_1}^{x_2} (\overline{\psi} \frac{\partial^2 \psi}{\partial x^2} - \psi \frac{\partial^2 \overline{ \psi}}{\partial x^2}) dx =

\frac{i \hbar}{2m} ((\overline{\psi} \frac{\partial \psi}{\partial x})\Big|_{x_1}^{x_2} - \int_{x_1}^{x_2} \frac{\partial \overline{\psi}}{\partial x} \frac{\partial \psi}{\partial x} dx - (\psi \frac{\partial \overline{\psi}}{\partial x})\Big|_{x_1}^{x_2}  + \int_{x_1}^{x_2}\frac{\partial \psi}{\partial x}\frac{\partial \overline{\psi}}{\partial x}dx)

Now the integrals cancel each other and we obtain our result.

It is common to denote -\frac{i\hbar}{2m}(\overline{\psi}\frac{d \psi}{dx}-\psi \frac{d \overline{\psi}}{dx}) by S(x,t) (note the minus sign) and to say \frac{d}{dt}P(x_1 < x < x_2 ; t) = S(x_1,t) - S(x_2,t) (see the reason for the minus sign?)

S(x,t) is called the position probability current at the point x at time t . One can think of this as a “probability flow rate” over the point x at time t ; the quantity S(x_1, t) - S(x_2, t) will tell you if the probability of finding the particle between position x_1 and x_2 is going up (positive sign) or down, and by what rate. But it is important that this is a position PROBABILITY current and not a PARTICLE current; same for |\psi |^2 ; this is the position probability density function, not the particle density function.
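A quick symbolic check (added): for the momentum eigenfunction exp(i p_0 \frac{x}{\hbar}) discussed in the post above, the probability current works out to the classical velocity \frac{p_0}{m} .

```python
# Sketch (added): compute the probability current S for the (non-normalizable)
# momentum eigenfunction exp(i p0 x / hbar); the result is p0/m.
import sympy as sp

x = sp.symbols('x', real=True)
p0, hbar, m = sp.symbols('p0 hbar m', positive=True)

psi = sp.exp(sp.I * p0 * x / hbar)             # momentum eigenfunction theta_0

S = -sp.I * hbar / (2 * m) * (sp.conjugate(psi) * sp.diff(psi, x)
                              - psi * sp.diff(sp.conjugate(psi), x))
print(sp.simplify(S))                          # prints p0/m, the classical velocity
```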

NOTE I haven’t talked about the position and momentum eigenvalues or eigenfunctions. We’ll do that in our next post; we’ll run into some mathematical trouble here. No, it won’t be with the position because we already know what a distribution is; the problem is that we’ll find the momentum eigenvector really isn’t square integrable….or even close.

August 10, 2011

Quantum Mechanics and Undergraduate Mathematics XI: an example (potential operator)


Recall the Schrödinger equations:
-\frac{\hbar^2}{2m} \frac{d^2}{dx^2} \eta_k + V(x) \eta_k = e_k \eta_k and
-\frac{\hbar^2}{2m} \frac{\partial^2}{\partial x^2} \phi + V(x) \phi = i\hbar \frac{\partial}{\partial t}\phi

The first is the time-independent equation which uses the eigenfunctions for the energy operator (Hamiltonian) and the second is the time-dependent state vector equation.

Now suppose that we have a specific energy potential V(x) ; say V(x) = \frac{1}{2}kx^2 . Note: in classical mechanics this follows from Hooke’s law: F(x) = -kx = -\frac{dV}{dx} . In classical mechanics this leads to the following differential equation: ma = m \frac{d^2x}{dt^2} = -kx which leads to \frac{d^2x}{dt^2} + (\frac{k}{m})x = 0 which has general solution x = C_1 sin(wt) + C_2 cos(wt) where w = \sqrt{\frac{k}{m}} . The energy of the system is given by E = \frac{1}{2}mw^2A^2 where A is the maximum value of x which, of course, is determined by the initial conditions (velocity and displacement at t = 0 ).

Note that there are no a priori restrictions on A . Notation note: A stands for a real number here, not an operator as it has previously.

So what happens in the quantum world? We can look at the stationary states associated with this operator; that means turning to the first Schrödinger equation and substituting V(x) = \frac{1}{2}kx^2 (note k > 0 ):

-\frac{\hbar^2}{2m} \frac{d^2}{dx^2} \eta_k +  \frac{1}{2}kx^2\eta_k = e_k \eta_k

Now let’s do a little algebra to make things easier to see: divide by the leading coefficient and move the right hand side of the equation to the left side to obtain:

\frac{d^2}{dx^2} \eta_k + (\frac{2 e_k m}{(\hbar)^2} - \frac{km}{(\hbar)^2}x^2) \eta_k = 0

Now let’s do a change of variable: let x = rz . Now we can use the chain rule to calculate: \frac{d^2}{dx^2} = \frac{1}{r^2}\frac{d^2}{dz^2} . Substitution into our equation in x and multiplication on both sides by r^2 yields:
\frac{d^2}{dz^2} \eta_k + (r^2 \frac{2 e_k m}{\hbar^2} - r^4\frac{km}{(\hbar)^2}z^2) \eta_k = 0
Since r is just a real valued constant, we can choose r = (\frac{km}{\hbar^2})^{-1/4} .
This means that r^2 \frac{2 e_k m}{\hbar^2} = \sqrt{\frac{\hbar^2}{km}}\frac{2 e_k m}{(\hbar)^2} = 2 \frac{e_k}{\hbar}\sqrt{\frac{m}{k}}

So our differential equation has been transformed to:
\frac{d^2}{dz^2} \eta_k + (2 \frac{e_k}{\hbar}\sqrt{\frac{m}{k}} - z^2) \eta_k = 0

We are now going to attempt to solve the eigenvalue problem, which means that we will seek values for e_k that yield solutions to this differential equation; a solution to the differential equation with a set eigenvalue will be an eigenvector.

If we were starting from scratch, this would require quite a bit of effort. But since we have some ready made functions in our toolbox, we note 🙂 that setting e_k = (2k+1) \frac{\hbar}{2} \sqrt{\frac{k}{m}} (here the k in 2k+1 is the index, not the spring constant) gives us:
\frac{d^2}{dz^2} \eta_k + (2k+1 - z^2) \eta_k = 0

This is the famous Hermite differential equation.

One can use techniques of ordinary differential equations (say, series techniques) to solve this for various values of k .
It turns out that the solutions are:

\eta_k = (-1)^k (2^k k! \sqrt{\pi})^{-1/2} exp(\frac{z^2}{2})\frac{d^k}{dz^k}exp(-z^2) = (2^k k! \sqrt{\pi})^{-1/2} exp(-\frac{z^2}{2})H_k(z) where here H_k(z) is the k'th Hermite polynomial. Here are a few of these:
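A short sketch (added) that lists the first few Hermite polynomials and verifies that exp(-\frac{z^2}{2})H_k(z) (the normalization constant is omitted) solves the transformed equation \frac{d^2}{dz^2} \eta_k + (2k+1 - z^2) \eta_k = 0 :

```python
# Sketch (added): print the first few Hermite polynomials and verify that
# eta_k(z) = exp(-z**2/2) * H_k(z) satisfies eta'' + (2k + 1 - z**2) * eta = 0.
import sympy as sp

z = sp.symbols('z', real=True)
for k in range(4):
    H_k = sp.hermite(k, z)                     # H_0 = 1, H_1 = 2z, H_2 = 4z**2 - 2, ...
    eta_k = sp.exp(-z**2 / 2) * H_k            # normalization constant omitted
    residual = sp.diff(eta_k, z, 2) + (2 * k + 1 - z**2) * eta_k
    print(f"H_{k}(z) = {sp.expand(H_k)},   ODE residual = {sp.simplify(residual)}")
```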

Graphs of the eigenvectors (in z ) are here:

(graphs from here)

Of importance is the fact that the allowed eigenvalues are the only values that can be observed by a measurement, and that these form a discrete set.

Ok, what about other operators? We will study both the position and the momentum operators, but these deserve their own post as this is where the fun begins! 🙂

Quantum Mechanics and Undergraduate Mathematics X: Schrödinger’s Equations


Recall from classical mechanics: E = \frac{1}{2}mv^2 + V(x) where E is energy and V(x) is potential energy. We also have position x and momentum p = mv . Note that we can then write E = \frac{p^2}{2m} + V(x) . Analogues exist in quantum mechanics and this is the subject of:

Postulate 6. Momentum and position (one dimensional motion) are represented by the operators:
X = x and P = -i\hbar \frac{d}{dx} respectively. If f is any “well behaved” function of two variables (say, locally analytic?) then A = f(X, P) = f(x, -i\hbar \frac{d}{dx} ) .

To see how this works: let \phi(x) = (2 \pi)^{-\frac{1}{4}}exp(-\frac{x^2}{4})
Then X \phi = (2 \pi)^{-\frac{1}{4}}x exp(-\frac{x^2}{4}) and P \phi = \frac{i\hbar}{2} (2 \pi)^{-\frac{1}{4}} x exp(-\frac{x^2}{4})

Associated with these is energy Hamiltonian operator H = \frac{1}{2m} P^2 + V(X) where P^2 means “do P twice”. So H = -\frac{\hbar^2}{2m}\frac{d^2}{dx^2} + V(x) .

Note We are going to show that these two operators are Hermitian…sort of. Why sort of: these operators A might not be “closed” in the sense that \langle \phi_1, \phi_2 \rangle exists but \langle \phi_1, A \phi_2 \rangle might not exist. Here is a simple example: let \phi_1 = \phi_2 = \sqrt{\frac{2}{\pi}} \frac{1}{x^2 + 1} . Then \langle \phi_1, \phi_2 \rangle = 1 but \int_{-\infty}^{\infty} x \phi_1 dx fails to exist.

So the unstated assumption is that when we are proving that various operators are Hermitian, we mean that they are Hermitian for state vectors which are transformed into functions for which the given inner product is defined.

So, with this caveat in mind, let’s show that these operators are Hermitian.

X clearly is because \langle \phi_1, x \phi_2 \rangle = \langle x \phi_1, \phi_2 \rangle . If this statement is confusing, remember that x is a real variable and therefore \overline{x} = x . Clearly, any well behaved real valued function of x is also a Hermitian operator. If we assume that P is a Hermitian operator, then \langle \phi_1, P^2 \phi_2 \rangle = \langle P\phi_1, P\phi_2 \rangle = \langle P^2 \phi_1, \phi_2 \rangle . So we must show that P is Hermitian.

This is a nice exercise in integration by parts:
\langle \phi_1, P\phi_2 \rangle = -i\hbar\langle \phi_1, \frac{d}{dx} \phi_2 \rangle = -i\hbar \int_{-\infty}^{\infty} \overline{\phi_1} \frac{d}{dx} \phi_2 dx . Now we note that \overline{\phi_1} \phi_2 |_{-\infty}^{\infty} = 0 (else the improper integrals would fail to converge; this is a property assumed for state vectors, though mathematically it is possible that the limit as x \rightarrow \infty doesn’t exist but the integral still converges) and so by the integration by parts formula we get i\hbar\int_{-\infty}^{\infty} \overline{\frac{d}{dx}\phi_1}  \phi_2 dx =\int_{-\infty}^{\infty} \overline{-i\hbar\frac{d}{dx}\phi_1}  \phi_2 dx = \langle P\phi_1, \phi_2 \rangle .
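A quick numerical sanity check (added, with two arbitrary rapidly decaying test functions and \hbar set to 1): the two sides of \langle \phi_1, P\phi_2 \rangle = \langle P\phi_1, \phi_2 \rangle agree to within discretization error.

```python
# Sketch (added): numerically check <phi1, P phi2> = <P phi1, phi2> for two
# rapidly decaying test functions, with hbar set to 1.
import numpy as np

hbar = 1.0
x = np.linspace(-15.0, 15.0, 200001)
dx = x[1] - x[0]

phi1 = np.exp(-x**2) * np.exp(1j * x)          # arbitrary test functions
phi2 = np.exp(-(x - 1)**2 / 2)                 # (normalization doesn't matter here)

def P(f):
    return -1j * hbar * np.gradient(f, dx)     # momentum operator via finite differences

lhs = np.sum(np.conj(phi1) * P(phi2)) * dx
rhs = np.sum(np.conj(P(phi1)) * phi2) * dx
print(lhs, rhs)                                # agree to within discretization error
```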

Note that potential energy is a function of x so it too is Hermitian. So our Hamiltonian H(p,x) = \frac{1}{2m}P^2 + V(X) = -\frac{\hbar^2}{2m}\frac{d^2}{dx^2} + V(x) is also Hermitian. That has some consequences:

1. H \eta_k = e_k \eta_k
2. H \psi = i\hbar\frac{\partial}{\partial t} \psi

Now we substitute for H and obtain:

1. -\frac{\hbar^2}{2m} \frac{d^2}{dx^2} \eta_k + V(x)\eta_k = e_k \eta_k

2. -\frac{\hbar^2}{2m} \frac{\partial^2}{\partial x^2} \psi + V(x)\psi = i\hbar \frac{\partial}{\partial t} \psi

These are the Schrödinger equations; the first one is the time independent equation. It is about each Hamiltonian energy eigenvector…or you might say each stationary state vector. This holds for each k . The second one is the time dependent one and applies to the state vector in general (not just the stationary states). It is called the fundamental time evolution equation for the state vector.

Special note: if one adjusts the Hamiltonian by adding a constant C , the eigenvectors remain the same but the eigenvalues are shifted by that constant. The time-evolved state vector then picks up a factor of exp(-iC \frac{t}{\hbar}) , which has a modulus of 1. So the new state vector describes the same state as the old one.

Next post: we’ll give an example and then derive the eigenvalues and eigenvectors for the position and momentum operators. Yes, this means dusting off the Dirac delta distribution.

August 9, 2011

Quantum Mechanics and Undergraduate Mathematics IX: Time evolution of an Observable Density Function

We’ll assume a state function \psi and an observable whose Hermitian operator is denoted by A with eigenvectors \alpha_k and eigenvalues a_k . If we take an observation (say, at time t = 0 ) we obtain the probability density function P(Y = a_k) = | \langle \alpha_k, \psi \rangle |^2 (we make the assumption that there is only one eigenvector per eigenvalue).

We saw how the expectation (the expected value of the associated density function) changes with time. What about the time evolution of the density function itself?

Since \langle \alpha_k, \psi \rangle completely determines the density function and because \psi can be expanded as \psi = \sum_{k=1} \langle \alpha_k, \psi \rangle \alpha_k it makes sense to determine \frac{d}{dt} \langle \alpha_k, \psi \rangle . Note that the eigenvectors \alpha_k and eigenvalues a_k do not change with time and therefore can be regarded as constants.

\frac{d}{dt} \langle \alpha_k, \psi \rangle =   \langle \alpha_k, \frac{\partial}{\partial t}\psi \rangle = \langle \alpha_k, \frac{-i}{\hbar}H\psi \rangle = \frac{-i}{\hbar}\langle \alpha_k, H\psi \rangle

We can take this further: we now write H\psi = H\sum_j \langle \alpha_j, \psi \rangle \alpha_j = \sum_j \langle \alpha_j, \psi \rangle H \alpha_j We now substitute into the previous equation to obtain:
\frac{d}{dt} \langle \alpha_k, \psi \rangle = \frac{-i}{\hbar}\langle \alpha_k, \sum_j \langle \alpha_j, \psi \rangle H \alpha_j   \rangle = \frac{-i}{\hbar}\sum_j \langle \alpha_k, H\alpha_j \rangle \langle \alpha_j, \psi \rangle

Denote \langle \alpha_j, \psi \rangle by c_j (these are the expansion coefficients, not to be confused with the eigenvalues a_j ). Then we see that we have the infinite set of coupled differential equations: \frac{d}{dt} c_k = \frac{-i}{\hbar} \sum_j c_j \langle \alpha_k, H\alpha_j \rangle . That is, the rate of change of one of the c_k depends on all of the c_j , which really isn’t a surprise.

We can see this another way: because we have a density function, \sum_j |\langle \alpha_j, \psi \rangle |^2 =1 . Now rewrite: \sum_j |\langle \alpha_j, \psi \rangle |^2 =  \sum_j \langle \alpha_j, \psi \rangle \overline{\langle \alpha_j, \psi \rangle } =  \sum_j c_j \overline{ c_j} = 1 . Now differentiate with respect to t and use the product rule: \sum_j (\frac{d}{dt}c_j) \overline{ c_j} + c_j  \frac{d}{dt} \overline{ c_j} = 0

Things get a bit easier if the original operator A is compatible with the Hamiltonian H ; in this case the operators share common eigenvectors. We denote the eigenvectors for H by \eta and then
\frac{d}{dt} c_k = \frac{-i}{\hbar} \sum_j c_j \langle \alpha_k, H\alpha_j \rangle becomes:
\frac{d}{dt} \langle \eta_k, \psi \rangle = \frac{-i}{\hbar} \sum_j \langle \eta_j, \psi \rangle \langle \eta_k, H\eta_j \rangle . Now use the fact that the \eta_j are eigenvectors for H and are orthogonal to each other to obtain:
\frac{d}{dt} \langle \eta_k, \psi \rangle = \frac{-i}{\hbar} e_k \langle \eta_k, \psi \rangle where e_k is the eigenvalue for H associated with \eta_k .

Now we use differential equations (along with existence and uniqueness conditions) to obtain:
\langle \eta_k, \psi \rangle  = \langle \eta_k, \psi_0 \rangle exp(-ie_k \frac{t}{\hbar}) where \psi_0 is the initial state vector (before it had time to evolve).

This has two immediate consequences:

1. \psi(x,t) = \sum_j \langle \eta_j, \psi_0 \rangle  exp(-ie_j \frac{t}{\hbar}) \eta_j
That is the general solution to the time-evolution equation. The reader might be reminded that exp(ib) = cos(b) + i sin(b) .

2. Returning to the probability distribution: P(Y = e_k) = |\langle \eta_k, \psi \rangle |^2 = |\langle \eta_k, \psi_0 \rangle |^2 |exp(-ie_k \frac{t}{\hbar})|^2 = |\langle \eta_k, \psi_0 \rangle |^2 . But since A is compatible with H , we have the same eigenvectors, hence we see that the probability density function does not change AT ALL. So such an observable really is a “constant of motion”.

Stationary States
Since H is an observable, we can always write \psi(x,t) = \sum_j \langle \eta_j, \psi(x,t) \rangle \eta_j . Then we have \psi(x,t)= \sum_j \langle \eta_j, \psi_0 \rangle exp(-ie_j \frac{t}{\hbar}) \eta_j

Now suppose \psi_0 is precisely one of the eigenvectors for the Hamiltonian; say \psi_0 = \eta_k for some k . Then:

1. \psi(x,t) = exp(-ie_k \frac{t}{\hbar}) \eta_k
2. For any t \geq 0 , P(Y = e_k) = 1, P(Y \neq  e_k) = 0

Note: no other operator has made an appearance.
Now recall our first postulate: states are determined only up to scalar multiples of unity modulus. Hence the state undergoes NO time evolution, no matter what observable is being observed.

We can see this directly: let A be an operator corresponding to any observable. Then \langle \alpha_k, A \psi_k \rangle = \langle \alpha_k, A exp(-i e_k \frac{t}{\hbar})\eta_k \rangle = exp(-i e_k \frac{t}{\hbar})\langle \alpha_k, A \eta_k \rangle . Then because the probability distribution is completely determined by the eigenvalues e_k and |\langle \alpha_k, A \eta_k \rangle | and |exp(-i e_k \frac{t}{\hbar})| = 1 , the distribution does NOT change with time. This motivates us to define the stationary states of a system: \psi_{(k)} = exp(-i e_k \frac{t}{\hbar})\eta_k .

Gillespie notes that much of the problem solving in quantum mechanics is solving the eigenvalue problem: H \eta_k = e_k \eta_k which is often difficult to do. But if one can do that, one can determine the stationary states of the system.
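To make that concrete, here is a sketch (added; units chosen so that \hbar = m = k = 1 ) that solves H \eta_k = e_k \eta_k numerically for the harmonic oscillator potential V(x) = \frac{1}{2}x^2 by discretizing H with finite differences; the lowest eigenvalues come out near (n + \frac{1}{2}) , matching the discrete spectrum found for this potential elsewhere in the series.

```python
# Sketch (added): solve H eta = e eta numerically for V(x) = x**2 / 2
# (units with hbar = m = 1) using a finite-difference Hamiltonian.
import numpy as np

N, L = 1000, 20.0
x = np.linspace(-L / 2, L / 2, N)
dx = x[1] - x[0]

# -1/2 d^2/dx^2 via the standard three-point stencil, plus the potential.
main = 1.0 / dx**2 + 0.5 * x**2
off = -0.5 / dx**2 * np.ones(N - 1)
H = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)

eigenvalues = np.linalg.eigvalsh(H)
print(eigenvalues[:5])      # approximately 0.5, 1.5, 2.5, 3.5, 4.5
```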

August 8, 2011

Quantum Mechanics and Undergraduate Mathematics VIII: Time Evolution of Expectation of an Observable


Back to our series on QM: one thing to remember about observables: they are operators with a set collection of eigenvectors and eigenvalues (allowable values that can be observed; “quantum levels” if you will). These do not change with time. So \frac{d}{dt} (A (\psi)) = A (\frac{\partial}{\partial t} \psi) . One can work this out by expanding A \psi if one wants to.

So with this fact, let’s see how the expectation of an observable evolves with time (given a certain initial state):
\frac{d}{dt} E(A) = \frac{d}{dt} \langle \psi, A \psi \rangle = \langle \frac{\partial}{\partial t} \psi, A \psi \rangle + \langle \psi, A \frac{\partial}{\partial t} \psi \rangle

Now apply the Hamiltonian to account for the time change of the state vector; we obtain:
\langle -\frac{i}{\hbar}H \psi, A \psi \rangle + \langle \psi, -\frac{i}{\hbar}AH \psi \rangle = \overline{-\frac{i}{\hbar}} \langle H \psi, A \psi \rangle - \frac{i}{\hbar} \langle \psi, AH \psi \rangle = \frac{i}{\hbar} \langle H \psi, A \psi \rangle - \frac{i}{\hbar} \langle \psi, AH \psi \rangle

Now use the fact that both H and A are Hermitian to obtain:
\frac{d}{dt} E(A) = \frac{i}{\hbar} \langle \psi, (HA - AH) \psi \rangle .
So, we see the operator HA - AH once again; note that if A, H commute then the expectation of the observable (or the standard deviation for that matter) does not evolve with time. This is certainly true for H itself. Note: an operator that commutes with H is sometimes called a “constant of motion” (think: “total energy of a system” in classical mechanics).
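A finite-dimensional sketch (added, with small randomly generated Hermitian matrices standing in for H and A , and \hbar = 1 ): evolve \psi via exp(-iHt/\hbar) and compare a numerical derivative of E(A) with the commutator formula above.

```python
# Sketch (added): finite-dimensional check that d/dt E(A) = (i/hbar) <psi, (HA - AH) psi>.
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)
hbar = 1.0

def random_hermitian(n):
    M = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (M + M.conj().T) / 2

H, A = random_hermitian(4), random_hermitian(4)
psi0 = rng.normal(size=4) + 1j * rng.normal(size=4)
psi0 /= np.linalg.norm(psi0)

def expectation(t):
    psi_t = expm(-1j * H * t / hbar) @ psi0
    return np.real(psi_t.conj() @ A @ psi_t)

t, dt = 0.7, 1e-6
numerical = (expectation(t + dt) - expectation(t - dt)) / (2 * dt)

psi_t = expm(-1j * H * t / hbar) @ psi0
formula = np.real(1j / hbar * psi_t.conj() @ (H @ A - A @ H) @ psi_t)
print(numerical, formula)    # agree to within finite-difference error
```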

Note also that |\frac{d}{dt} E(A) | = \frac{1}{\hbar}|\langle \psi, (HA - AH) \psi \rangle | \leq \frac{2}{\hbar} \Delta A \Delta H

If A does NOT correspond with a constant of motion, then it is useful to define an evolution time T_A = \frac{\Delta A}{|\frac{d}{dt}E(A)|} where \Delta A = (V(A))^{1/2} . This gives an estimate of how much time must elapse before the state changes enough to equal the uncertainty in the observable.

Note: we can apply this to H and A to obtain T_A \Delta H \ge \frac{\hbar}{2}

Consequences: if T_A is small (i. e., the state changes rapidly) then the uncertainty in the energy is large; the energy cannot be well defined (as a numerical value). If the energy has low uncertainty then T_A must be large; that is, the state is very slowly changing. This is called the time-energy uncertainty relation.

MathFest Day Three (Lexington 2011)

I left after the second large lecture and didn’t get a chance to blog about them before now.

But what I saw was very good.

The early lecture was by Lauren Ancel Meyers (Texas-Austin) on Mathematical Approaches to Infectious Disease and Control. This is one of those talks where I wish I had access to the slides; they were very useful.

She started out by giving a brief review of the classical SIR model of the spread of a disease which uses the mass action principle (from science) that says that the rate of change of those infected with a disease is proportional to the product of those who are susceptible to the disease and those who can transmit the disease: \frac{dI}{dt}=\beta S I . (this actually came from chemistry). Of course, those who are infected either recover or die; this action reduces the number infected. Of course, the number of susceptible also drops.

This leads to a system of differential equations. The basic reproduction number is significant:
R_0 = \frac{\beta S}{\nu + \delta} where \nu is the recovery rate and \delta is the death rate. Note: if R_0 < 1 then the disease will die off; if it is greater than 1 we have a pandemic. We can reduce this by reducing S (vaccination or quarantine), increasing recovery or, yes, increasing the death rate (as we do with livestock; remember the massive poultry slaughters to stop the spread of flu).
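For readers who want to experiment, here is a sketch (added; the values of \beta , the recovery rate and the initial conditions are made up purely for illustration) that integrates the SIR system numerically.

```python
# Sketch (added): integrate a basic SIR model; beta, nu (recovery rate) and the
# initial conditions below are illustrative values only.
import numpy as np
from scipy.integrate import solve_ivp

def sir(t, y, beta, nu):
    S, I, R = y
    dS = -beta * S * I
    dI = beta * S * I - nu * I
    dR = nu * I
    return [dS, dI, dR]

beta, nu = 0.3, 0.1          # R_0 = beta * S0 / nu = 3 with S0 close to 1
y0 = [0.999, 0.001, 0.0]     # fractions of the population

sol = solve_ivp(sir, (0, 160), y0, args=(beta, nu), t_eval=np.linspace(0, 160, 9))
for t, I in zip(sol.t, sol.y[1]):
    print(f"t = {t:5.1f}   infected fraction = {I:.4f}")
```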

Of course, this model assumes that the infected organisms contact others at random and have equal probabilities of spreading, that the virus doesn’t evolve, etc.

So this model had to be improved on; methods from percolation theory were developed.

So many factors had to be taken into account such as: how much vaccine is there to spread? How far along is the outbreak? (at first children get it; then adults). How severe are the consequences? (we don’t want the virus to evolve to a more dangerous, more resistant form).

Note that the graph model of transmission is dynamic; it can actually change with time.

Of special interest: one can recover the rate of infections of the various strains (and the strains vary from season to season) by looking at the number of times flu related words were searched for on Google. The graph overlap (search rate versus reported cases) was stunning; the only exception is when a scare occurred; then the word search rate led the actual cases, but that happened only once (2009). Note also that predictions of what will happen get better with a shorter time window (not a surprise).

There was much more in the talk; for example the role of the location of the providers of vaccines was discussed (what is the optimal way to spread out the availability of a given vaccine?)

Manjul Bhargava, Lecture III
First, he noted that in the case where f(x,y) was cubic, that there is always a rational change of variable to put the curve into the following form: y^2 = x^3 + Ax + B where A, B are integers that have the following property: if p is any prime where p^4 divides A then p^6 does NOT divide B . So this curve can be denoted as E_{A,B} .

Also, there are two “generic” cases of curves depending on whether the cubic in x has only one real root or three real roots.

This is a catalog of elliptic algebraic curves of the form y^2 = x^3 + ax + b taken from here. The everywhere smooth curves are considered; the ones with a disconnected graph are said to have “an egg”; those are the ones in which the cubic in x has three real roots. In the connected case, the cubic has only one; remember that these are genus one curves; we are seeing a slice of a torus in 4-space (a space with two complex dimensions) in the plane.

Also recall that the rational points on the curve may be finite or infinite. It turns out that the rational points (both coordinates rational) have a group structure (this is called the “divisor class group” in algebraic geometry). This group has a structure that can be understood by a simple geometric construction in the plane, though checking that the operation is associative can be very tedious.

I’ll give a description of the group operation and provide an elementary example:

First, note that if (x,y) is a point on an elliptic curve, then so is (x, -y) (note: the y^2 on the left hand side of the defining equation). That is important. Also note that we will restrict ourselves to smooth curves (that have a well defined tangent line).

The elements of our group will be the rational points of the curve (if any?) along with the point at infinity. If P = (x_1, y_1) I will denote (x_1, -y_1) = P' .

The operation: if P, Q are rational points on the curve, construct the line l through them with equation y = m(x-x_1)+ y_1 . Substitute this into y^2 = x^3 + Ax + B and note that we now have a cubic equation in x that has two rational solutions; hence there must be a third rational solution x_r . The line then determines the corresponding y value (the root is a double one if that y value is zero). Call that point on the curve R , then define P + Q = R' where R' is the reflection of R about the x axis.

Note the following: that this operation commutes is immediate. If one adds a point to itself, one uses the tangent line at that point as the line through two points; note that such a line might not hit the curve a third time. If such a line is vertical (parallel to the y axis) the result is said to be “0” (the point at infinity); if the line is not vertical but still misses the rest of the curve, the point of tangency is counted three times; that is: P + P = P' . Here are the situations:

Of course, \infty is the group identity. Associativity is difficult to check directly (elementary algebra but very tedious; perhaps 3-4 pages of it?).

Since the group is a finitely generated Abelian group (this is the Mordell-Weil theorem), it must be isomorphic to \oplus_{i = 1}^r Z \oplus \frac{Z}{n_1 Z} \oplus \frac{Z}{n_2 Z}....\frac{Z}{n_k Z} where the second part is the torsion part and the number of infinite cyclic factors is the rank. The rank turns out to be the geometric rank; that is, the minimum number of points required to obtain all of the rational points (infinite in number) of the curve. Let T be the torsion subgroup; Mazur proved that |T|\le 16 .

Let’s look at an example of a subgroup of such a curve: let the curve be given by y^2 = x^3 + 1 . It is easy to see that (0,1), (0, -1), (2, 3), (2, -3), (-1, 0) are all rational points. Let’s see how these work: (-1, 0) + (-1, 0) = 0 so this point has order 2. But there is also some interesting behavior: note that \frac{d}{dx} (y^2) = \frac{d}{dx}(x^3 + 1) which implies that \frac{dy}{dx} = \frac{3x^2}{2y} . So the tangent lines at (0, 1) and (0, -1) are both horizontal; that means that both of these points have order 3. Note also that (2, 3) + (2,3) = (0,1) as the tangent line runs through the point (0, -1) . Similarly (2, 3) + (0, -1) = (2, -3) . So, we can see that (2,3), (2, -3) have order 6, (0, 1), (0, -1) have order 3 and (-1, 0) has order 2. So there is an isomorphism \theta where \theta(2,3) = 1, \theta(2,-3) = 5, \theta(0, 1) = 2, \theta(0, -1) = 4, \theta(-1, 0) = 3 where the integers are mod 6.
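Here is a sketch (added) of the chord-and-tangent addition law in exact rational arithmetic; it reproduces the computations above on y^2 = x^3 + 1 , e.g. (2,3) + (2,3) = (0,1) and (2,3) + (0,-1) = (2,-3) , and confirms that (2,3) has order 6.

```python
# Sketch (added): chord-and-tangent addition on y^2 = x^3 + A*x + B over the
# rationals, using exact arithmetic.  O denotes the point at infinity.
from fractions import Fraction as F

A, B = F(0), F(1)            # the curve y^2 = x^3 + 1 from the example above
O = None                     # point at infinity (group identity)

def add(P, Q):
    if P is O:
        return Q
    if Q is O:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and y1 == -y2:
        return O                              # P + P' = O
    if P == Q:
        slope = (3 * x1 * x1 + A) / (2 * y1)  # tangent line
    else:
        slope = (y2 - y1) / (x2 - x1)         # chord through P and Q
    x3 = slope * slope - x1 - x2
    y3 = slope * (x1 - x3) - y1               # reflect the third intersection point
    return (x3, y3)

P = (F(2), F(3))
print(add(P, P))                 # (0, 1)
print(add(P, (F(0), F(-1))))     # (2, -3)

# P generates a cyclic group of order 6:
Q, n = P, 1
while Q is not O:
    Q = add(Q, P)
    n += 1
print("order of (2,3):", n)      # 6
```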

So, we’ve shown a finite Abelian subgroup of the group of rationals of this curve. It turns out that these are the only rational points; here all we get is the torsion group. This curve has rank zero (not obvious).

Note: the group of rationals for y^2 = x^3 + 2x + 3 is isomorphic to Z \oplus \frac{Z}{2Z} though this isn’t obvious.

The generator of the Z term is (3,6) and (-1,0) generates the torsion term.

History note Some of this was tackled by computers many years ago (Birch, Swinnerton-Dyer). Because computers were so limited in those days, the code had to be very efficient and therefore people had to do quite a bit of work prior to putting it into code; evidently this led to progress. The speaker joked that such progress might not have come so quickly today due to better computers!

If one looks at y^2 = x^3 + Ax + B mod p where p is prime, we should have about p points on the curve. So we’d expect that \frac{N_p}{p} \approx 1 . If there are a lot of rational points on the curve, most of these points would correspond to mod p points. So there is a conjecture by Birch, Swinnerton-Dyer:
\prod_{p \le X} \frac{N_p}{p} \approx c (log(X))^r where r is the rank.

Yes, this is hard; win one million US dollars if you prove it. 🙂

Back to the curves: there are ways of assigning “heights” to these curves; some include:
H(E_{(A,B)}) = max(4|A|^3, 27B^2) or the discriminant \Delta(E_{(A,B)}) = -4A^3 - 27B^2

Given this ordering, what are average sizes of ranks?
Katz-Sarnak: half have rank 0, half have rank 1. It was known that average ranks are bounded; previous results put the bound at 2.3, 2, and 1.79, assuming the Generalized Riemann Hypothesis and the Birch and Swinnerton-Dyer conjecture.

The speaker and his students got some results without making these large assumptions:

Result 1: when E/Q is ordered by height, the average rank is less than 1.
Result 2: A positive portion (10 percent, at least) have rank 0.
Result 3: at least 80 percent have rank 0 or 1.
Corollary: the BSD conjecture is true for a positive proportion of elliptic curves.

The speaker (with his student) proved results 1, 2, and 3 and then worked backwards on the existing “BSD true implies X” results to show that BSD was true for a positive proportion of the elliptic curves.

July 28, 2011

Quantum Mechanics and Undergraduate Mathematics VII: Time Evolution of the State Vector


Of course the state vector \psi changes with time. The question is how does it change with time and how does the probability density function associated with an observable change with time?

Note: we will write \psi_t for \psi(x,t) . Now let A be an observable. Note that the eigenvectors and the eigenvalues associated with A do NOT change with time, so if we expand \psi_t in terms of the eigenbasis for A we have \psi_t = \sum_k \langle \alpha_k, \psi_t \rangle \alpha_k hence \frac{\partial \psi_t}{\partial t} = \sum_k \langle \alpha_k, \frac{\partial \psi_t}{\partial t} \rangle \alpha_k

Of course, we need the state vector to “stay in the class of state vectors” when it evolves with respect to time, which means that the norm cannot change; or \frac{d}{dt} \langle \psi_t, \psi_t \rangle = 0 .

Needless to say there has to be some restriction on how the state vector can change with time. So we have another postulate:

Postulate 5
For every physical system there exists a linear Hermitian operator H called the Hamiltonian operator such that:
1. i\hbar \frac{\partial}{\partial t} \psi(x,t) = H\psi(x,t) and
2. H corresponds to the total energy of the system and possesses a complete set of eigenvectors \eta_k and eigenvalues e_k where the eigenvalues are the “allowed values” of the total energy of the system.

Note: \hbar is the constant \frac{h}{2\pi} where h is Planck’s constant.

Note: H is not specified; it is something that the physicists have to come up with by observing the system. That is, there is a partial differential equation to be solved!

But does this give us what we want, at least in terms of \psi_t staying at unit norm for all times t ? (note: again we write \psi_t for \psi(x,t) ).

The answer is yes; first note that \frac{d}{dt} \langle \psi_t, \psi_t \rangle = \langle \frac{\partial \psi_t}{\partial t}, \psi_t \rangle + \langle \psi_t,  \frac{\partial \psi_t}{\partial t}\rangle (this is an easy exercise in using the definition of our inner product, differentiating under the integral sign and noting that the partial derivative operation and the conjugate operation commute).

Now note: \langle \frac{\partial \psi_t}{\partial t}, \psi_t \rangle + \langle \psi_t,  \frac{\partial \psi_t}{\partial t}\rangle = \overline{-\frac{i}{\hbar}}\langle H\psi_t, \psi_t \rangle + -\frac{i}{\hbar}\langle \psi_t, H \psi_t \rangle = \frac{i}{\hbar}(\langle H\psi_t, \psi_t \rangle - \langle \psi_t, H\psi_t \rangle) = 0
because H is Hermitian.

Note: at this point Gillespie takes an aside and notes that if one denotes the state vector at time t = 0 by \psi_0 then one can attempt to find an operator U(t) where \psi_t = U(t)\psi_0 . This leads to i \hbar \frac{\partial U(t)}{\partial t}\psi_0 = HU(t)\psi_0 which must be true for all \psi_0 . This leads to i\hbar \frac{\partial U(t)}{\partial t} = HU(t) with initial condition U(0) = 1 . This leads to the solution U(t) = exp(\frac{-iHt}{\hbar}) where the exponential is defined in the sense of linear algebra (use the power series expansion of exp(A) ). Note: U is not Hermitian and therefore NOT an observable. But it has a special place in quantum mechanics and is called the time-evolution operator.
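A small sketch (added): for a 3×3 Hermitian matrix standing in for H (and \hbar = 1 ), compute U(t) = exp(\frac{-iHt}{\hbar}) with a matrix exponential and check that it is unitary, so norms (and hence probabilities) are preserved.

```python
# Sketch (added): U(t) = exp(-i H t / hbar) for a small Hermitian matrix H,
# computed with the matrix exponential; check that U is unitary.
import numpy as np
from scipy.linalg import expm

hbar = 1.0
rng = np.random.default_rng(1)
M = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
H = (M + M.conj().T) / 2                       # Hermitian "Hamiltonian"

t = 2.0
U = expm(-1j * H * t / hbar)
print(np.allclose(U.conj().T @ U, np.eye(3)))  # True: U is unitary, norms are preserved
```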

Next: we’ll deal with the time energy relation for the expected value of a specific observable.

July 25, 2011

Quantum Mechanics and Undergraduate Mathematics VI: Heisenberg Uncertainty Principle


Here we use the Cauchy-Schwarz inequality, other facts about inner products and basic probability to derive the Heisenberg Uncertainty Principle for incompatible observables A and B . We assume some state vector \psi which has not been given time to evolve between measurements and we will abuse notation by viewing A and B as random variables for their given eigenvalues a_k, b_k given state vector \psi .

What we are after is the following: V(A)V(B) \geq (1/4)|\langle \psi, (AB-BA) \psi \rangle|^2.
When AB-BA = c we get: V(A)V(B) \geq (1/4)|c|^2 which is how it is often stated.

The proof is a bit easier when we make the expected values of A and B equal to zero; we do this by introducing a new linear operator A' = A -E(A) and B' = B - E(B) ; note that (A - E(A))\psi = A\psi - E(A)\psi . The following are routine exercises:
1. A' and B' are Hermitian
2. A'B' - B'A' = AB-BA
3. V(A') = V(A) .

If one is too lazy to work out 3:
V(A') = E((A-E(A))^2) - (E(A -E(A)))^2 = E((A-E(A))^2) (since E(A - E(A)) = 0 ) = E(A^2 - 2AE(A) + E(A)E(A)) = E(A^2) -2E(A)E(A) + (E(A))^2 = V(A)

Now we have everything in place:
\langle \psi, (AB-BA) \psi \rangle = \langle \psi, (A'B'-B'A') \psi \rangle = \langle A'\psi, B' \psi \rangle - \langle B'\psi, A' \psi \rangle = \langle A'\psi, B' \psi \rangle - \overline{\langle A'\psi, B' \psi \rangle} = 2iIm\langle A'\psi, B'\psi \rangle
We now can take the modulus of both sides:
|\langle \psi, (AB-BA)\psi \rangle | = 2 |Im \langle A'\psi, B'\psi \rangle| \leq 2|\langle A'\psi, B'\psi\rangle | \leq 2 \sqrt{\langle A'\psi,A'\psi\rangle}\sqrt{\langle B'\psi, B'\psi\rangle} = 2 \sqrt{V(A')}\sqrt{V(B')} = 2\sqrt{V(A)}\sqrt{V(B)}

This means that, unless A and B are compatible observables, there can be a lower bound on the product of their standard deviations that cannot be done away with by more careful measurement. In particular, when AB - BA is a nonzero constant (as with position and momentum), it is physically impossible to drive this product to zero, and one of the standard deviations cannot be zero unless the other is infinite.
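A numerical sanity check (added): draw random Hermitian matrices standing in for A and B and a random unit vector \psi , and verify V(A)V(B) \geq (1/4)|\langle \psi, (AB-BA) \psi \rangle|^2 .

```python
# Sketch (added): numerical check of V(A) V(B) >= (1/4) |<psi, (AB - BA) psi>|^2
# for random Hermitian matrices and a random unit state vector.
import numpy as np

rng = np.random.default_rng(3)

def random_hermitian(n):
    M = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (M + M.conj().T) / 2

def variance(Op, psi):
    mean = np.real(psi.conj() @ Op @ psi)
    return np.real(psi.conj() @ (Op @ Op) @ psi) - mean**2

for _ in range(5):
    A, B = random_hermitian(5), random_hermitian(5)
    psi = rng.normal(size=5) + 1j * rng.normal(size=5)
    psi /= np.linalg.norm(psi)

    lhs = variance(A, psi) * variance(B, psi)
    rhs = 0.25 * abs(psi.conj() @ (A @ B - B @ A) @ psi)**2
    print(lhs >= rhs, lhs, rhs)     # the inequality holds every time
```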

Quantum Mechanics and Undergraduate Mathematics V: compatible observables

This builds on our previous example. We start with a state \psi and we will make three successive observations of observables which have operators A and B in the following order: A, B, A . The assumption is that these observations are made so quickly that no time evolution of the state vector can take place; all of the change to the state vector will be due to the effect of the observations.

A simplifying assumption will be that the observation operators have the following property: no two different eigenvectors have the same eigenvalues (i. e., the eigenvalue uniquely determines the eigenvector up to multiplication by a constant of unit modulus).

First of all, this is what “compatible observables” means: two observables A, B are compatible if, upon three successive measurements A, B, A the first measurement of A is guaranteed to agree with the second measurement of A . That is, the state vector after the first measurement of A is the same as the state vector after the second measurement of A .

So here is what the compatibility theorem says (I am freely abusing notation by calling the observable by the name of its associated operator):

Compatibility Theorem
The following are equivalent:

1. A, B are compatible observables.
2. A, B have a common eigenbasis.
3. A, B commute (as operators)

Note: for this discussion, we’ll assume an eigenbasis of \alpha_i for A and \beta_i for B .

1 implies 2: Suppose the state of the system is \alpha_k just prior to the first measurement. Then the first measurement is a_k . The second measurement yields b_j which means the system is in state \beta_j , in which case the third measurement is guaranteed to be a_k (it is never anything else by the compatible observable assumption). Hence the state vector must have been \alpha_k which is the same as \beta_j . So, by some reindexing we can assume that \alpha_1 = \beta_1 . An argument about completeness and orthogonality finishes the proof of this implication.

2 implies 1: after the first measurement, the state of the system is \alpha_k which, being a basis vector for observable B means that the system after the measurement of B stays in the same state, which implies that the state of the system will remain \alpha_k after the second measurement of A . Since this is true for all basis vectors, we can extend this to all state vectors, hence the observables are compatible.

2 implies 3: a common eigenbasis implies that the operators commute on basis elements so the result follows (by some routine linear-algebra type calculations)

3 implies 2: given any eigenvector \alpha_k we have AB \alpha_k = BA \alpha_k = a_k B \alpha_k which implies that B \alpha_k is an eigenvector for A with eigenvalue a_k . By the uniqueness assumption this means that B \alpha_k = c \alpha_k for some scalar c ; hence \alpha_k must be an eigenvector of B (with eigenvalue c ). In this way, we establish a correspondence between the eigenbasis of B with the eigenbasis of A .

Ok, what happens when the observables are NOT compatible?

Here is a lovely application of conditional probability. It works this way: suppose on the first measurement, a_k is observed. This puts us in state vector \alpha_k . Now we measure the observable B which means that there is a probability |\langle \alpha_k, \beta_i \rangle|^2 of observing eigenvalue b_i . Now \beta_i is the new state vector and when observable A is measured, we have a probability |\langle \alpha_j, \beta_i \rangle|^2 of observing eigenvalue a_j in the second measurement of observable A .

Therefore given the initial measurement we can construct a conditional probability density function p(a_j|a_k) = \sum_i p(b_i|a_k)p(a_j|b_i)= \sum_i |\langle \alpha_k, \beta_i \rangle |^2 |\langle \beta_i, \alpha_j \rangle |^2

Again, this makes sense only if the observations were taken so close together as to not allow the state vector to undergo time evolution; ONLY the measurements change the state vector.
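A small sketch (added): with the columns of two random unitary matrices standing in for the eigenbases \alpha and \beta , compute p(a_j|a_k) from the formula above and check that each conditional distribution sums to 1.

```python
# Sketch (added): conditional probabilities p(a_j | a_k) for two incompatible
# observables, using random orthonormal bases as stand-ins for the eigenbases.
import numpy as np

rng = np.random.default_rng(4)

def random_orthonormal_basis(n):
    # columns of Q form an orthonormal basis of C^n
    Q, _ = np.linalg.qr(rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n)))
    return Q

alpha = random_orthonormal_basis(3)      # eigenbasis of A (columns)
beta = random_orthonormal_basis(3)       # eigenbasis of B (columns)

overlap = np.abs(alpha.conj().T @ beta)**2         # |<alpha_k, beta_i>|^2
p = overlap @ overlap.T                            # p[k, j] = sum_i |<a_k,b_i>|^2 |<b_i,a_j>|^2
print(p.sum(axis=1))                               # each row sums to 1
```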

Next: we move to the famous Heisenberg Uncertainty Principle, which states that, if we view the interaction of the observables A and B with a set state vector and abuse notation a bit and regard the associated density functions (for the eigenvalues) by the same letters, then V(A)V(B) \geq (1/4)|\langle \psi, [AB-BA]\psi \rangle |^2.

Of course, if the observables are compatible, then the right side becomes zero and if AB-BA = c for some non-zero scalar c (that is, (AB-BA) \psi = c\psi for all possible state vectors \psi ), then we get V(A)V(B) \geq (1/4)|c|^2 which is how it is often stated.

