College Math Teaching

August 11, 2011

Quantum Mechanics and Undergraduate Mathematics XIII: simplifications and wave-particle duality

In an effort to make the subject a bit more accessible to undergraduate mathematics students who haven’t had much physics training, we’ve made some simplifications. We’ve dealt with the “one dimensional, non-relativistic situation” which is fine. But we’ve also limited ourselves to the case where:
1. state vectors are actual functions (like those we learn about in calculus)
2. eigenvalues are discretely distributed (i.e., the set of eigenvalues has no limit points in the usual topology of the real line)
3. each eigenvalue corresponds to a unique eigenvector.

In this post we will see what trouble simplifications 1 and 2 cause and why they cannot be lived with. Hey, quantum mechanics is hard!

Finding Eigenvectors for the Position Operator
Let X denote the “position” operator and let us seek out the eigenvectors for this operator.
So X\delta = x_0 \delta where \delta is the eigenvector and x_0 is the associated eigenvalue.
This means x\delta = x_0\delta which implies (x-x_0)\delta = 0 .
This means that for x \neq x_0, \delta = 0 and \delta can be anything for x = x_0 . This would appear to allow the eigenvector to be the “everywhere zero except for x_0 ” function. So let \delta be such a function. But then if \psi is any state vector, \int_{-\infty}^{\infty} \overline{\delta}\psi dx = 0 and \int_{-\infty}^{\infty} \overline{\delta}\delta dx = 0 . Clearly this is unacceptable; we need (at least up to a constant multiple) for \int_{-\infty}^{\infty} \overline{\delta}\delta dx = 1

The problem is that restricting our eigenvectors to the class of functions is just too restrictive to give us results; we have to broaden the class of eigenvectors. One way to do that is to allow for distributions to be eigenvectors; the distribution we need here is the Dirac delta. In the reference I linked to, one can see how the Dirac delta can be thought of as a sort of limit of valid probability density functions. Note: \overline{\delta} = \delta .

So if we let \delta_0 denote the Dirac delta that is zero except at x = x_0 , we recall that \int_{-\infty}^{\infty} \delta_0 \psi dx = \psi(x_0) . This means that the probability density function associated with the position operator is P(X = x_0) = |\psi(x_0)|^2 .

This has an interesting consequence: if we measure the particle’s position at x = x_0 then the state vector becomes \delta_0 . So the new density function based on an immediate measurement of position would be P( X = x_0) = |\langle \delta_0, \delta_0 \rangle|^2 = 1 and P(X = x) = 0 elsewhere. The particle behaves like a particle with a definite “point” position.
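The sifting property \int_{-\infty}^{\infty} \delta_0 \psi dx = \psi(x_0) used above can be checked numerically by replacing the Dirac delta with a narrow normal density. The following is only a sketch: the sample state vector psi and the width eps are illustrative assumptions, not anything fixed by the discussion.

```python
import math

def gaussian_bump(x, x0, eps):
    """Normal density centered at x0 with standard deviation eps (stand-in for the delta)."""
    return math.exp(-((x - x0) ** 2) / (2 * eps ** 2)) / (eps * math.sqrt(2 * math.pi))

def psi(x):
    """Sample normalized state vector (an assumption made for this illustration)."""
    return (2 * math.pi) ** -0.25 * math.exp(-x ** 2 / 4)

def sift(x0, eps, n=20000):
    """Midpoint-rule approximation of the integral of bump * psi near x0."""
    a, b = x0 - 10 * eps, x0 + 10 * eps   # the bump is negligible outside this window
    dx = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * dx
        total += gaussian_bump(x, x0, eps) * psi(x)
    return total * dx

# As eps shrinks, the integral approaches psi(x0).
for eps in (0.5, 0.1, 0.001):
    print(eps, sift(1.0, eps), psi(1.0))
```

The printed values should move toward \psi(1) as eps decreases, which is the sense in which the delta is a limit of valid density functions.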

Momentum: a different sort of problem

At first the momentum operator P\psi = -i \hbar \frac{d\psi}{dx} seems less problematic. Finding the eigenvalues and eigenfunctions is a breeze: if \theta_0 is the eigenvector with eigenvalue p_0 then:
\frac{d}{dx} \theta_0 = \frac{i}{\hbar}p_0\theta_0 has solution \theta_0 = exp(i p_0 \frac{x}{\hbar}) .
Do you see the problem?

There are a couple of them: first, this provides no restriction on the eigenvalues; in fact the eigenvalues can be any real number. This violates simplification number 2. Secondly, |\theta_0|^2 = 1 therefore |\langle \theta_0, \theta_0 \rangle |^2 = \infty . Our function is far from square integrable and therefore not a valid “state vector” in its present form. This is where the famous “normalization” comes into play.

Mathematically, one way to do this is to restrict the domain (say, limit the non-zero part to x_0 < x < x_1 ) and multiply by an appropriate constant.

Getting back to our state vector: exp(ip_0 \frac{x}{\hbar}) = cos(\frac{p_0 x}{\hbar}) + i sin(\frac{p_0 x}{\hbar}) . So if we measure momentum, we have basically given a particle a wave characteristic with wavelength \frac{2\pi\hbar}{p_0} = \frac{h}{p_0} .

Now what about the duality? Suppose we start by measuring a particle’s position thereby putting the state vector into \psi = \delta_0 . Now what would be the expectation of momentum? We know that the formula is E(P) = -i\hbar \int_{-\infty}^{\infty} \delta_0 \frac{\partial \delta_0}{\partial x} dx . But this quantity is undefined because \frac{\partial \delta_0}{\partial x} is undefined.

If we start in a momentum eigenvector and then wish to calculate the position density function (the expectation will be undefined), we see that |\theta_0|^2 = 1 which can be interpreted to mean that any position measurement is equally likely.

Clearly, momentum and position are not compatible operators. So let’s calculate XP - PX
XP \phi = x(-i\hbar \frac{d}{dx} \phi) = -i\hbar x \frac{d}{dx} \phi and PX\phi = -i \hbar\frac{d}{dx} (x \phi) = -i \hbar (\phi + x \frac{d}{dx}\phi) hence (XP - PX)\phi = i\hbar \phi . Therefore XP-PX = i\hbar . Therefore our generalized uncertainty relation tells us \Delta X \Delta P \geq \frac{1}{2}\hbar
(yes, one might object that \Delta X really shouldn’t be defined….) but this uncertainty relation does hold up. So if one uncertainty is zero, then the other must be infinite; exact position means no defined momentum and vice versa.
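The commutator calculation can be spot-checked numerically. This is a minimal sketch under assumptions of my own choosing: units with \hbar = 1 , a Gaussian test function, and central differences standing in for the derivative.

```python
import math

HBAR = 1.0  # assumption: units in which hbar = 1

def phi(x):
    """Test function (an arbitrary smooth choice for this check)."""
    return (2 * math.pi) ** -0.25 * math.exp(-x ** 2 / 4)

def derivative(f, x, h=1e-5):
    """Central-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def P(f):
    """Momentum operator: maps f to the function -i*hbar*f'."""
    return lambda x: -1j * HBAR * derivative(f, x)

def X(f):
    """Position operator: maps f to the function x*f(x)."""
    return lambda x: x * f(x)

def commutator_XP(f, x):
    """Value of (XP - PX)f at the point x."""
    return X(P(f))(x) - P(X(f))(x)

x = 0.7
print(commutator_XP(phi, x), 1j * HBAR * phi(x))  # the two values should agree closely
```

Any other smooth test function can be swapped in for phi; the agreement with i\hbar \phi should not depend on the choice.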

So: exact, pointlike position means no defined momentum is possible (hence no wave like behavior) but an exact momentum (pure wave) means no exact pointlike position is possible. Also, remember that measurement of position endows a point like state vector of \delta_0 which destroys the wave like property; measurement of momentum endows a wave like state vector \theta_0 and therefore destroys any point like behavior (any location is equally likely to be observed).


Quantum Mechanics and Undergraduate Mathematics XII: position and momentum operators

Filed under: advanced mathematics, applied mathematics, physics, probability, quantum mechanics, science — collegemathteaching @ 1:52 am

Recall that the position operator is X \psi = x\psi and the momentum operator P \psi = -i\hbar \frac{d}{dx} \psi .

Recalling our abuse of notation that said that the expected value E(A) = \langle \psi, A \psi \rangle , we find that the expected value of position is E(X) = \int_{-\infty}^{\infty} x |\psi|^2 dx . Note: since \int_{-\infty}^{\infty}  |\psi|^2 dx = 1, we can view |\psi|^2 as a probability density function; hence if f is any “reasonable” function of x , then E(f(X)) = \int_{-\infty}^{\infty} f(x) |\psi|^2 dx . Of course we can calculate the variance and other probability moments in a similar way; e.g. E(X^2) =  \int_{-\infty}^{\infty} x^2 |\psi|^2 dx .

Now we turn to momentum; E(P) = \langle \psi, -i\hbar \frac{d}{dx} \psi \rangle = -i\hbar \int_{-\infty}^{\infty} \overline{\psi}\frac{d}{dx}\psi dx and E(P^2) = \langle \psi, P^2\psi \rangle = \langle P\psi, P\psi \rangle = \hbar^2 \int_{-\infty}^{\infty} |\frac{d}{dx}\psi|^2 dx .

So, back to position: we can now use the fact that |\psi|^2 is a valid density function associated with finding the expected value of position and call this the position probability density function. Hence P(x_1 < x < x_2) = \int_{x_1}^{x_2} |\psi|^2 dx . But we saw that this can change with time so: P(x_1 < x < x_2; t) = \int_{x_1}^{x_2} |\psi(x,t)|^2 dx
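To make the density-function point of view concrete, here is a small numerical sketch. The state vector psi is an assumed example (a Gaussian whose |\psi|^2 is the standard normal density), and the integrals over the real line are truncated to [-12, 12] where the tails are negligible.

```python
import math

def psi(x):
    """Assumed sample state vector; |psi|^2 is the standard normal density."""
    return (2 * math.pi) ** -0.25 * math.exp(-x ** 2 / 4)

def density(x):
    """Position probability density |psi|^2."""
    return abs(psi(x)) ** 2

def integrate(f, a, b, n=100_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx

total = integrate(density, -12, 12)                          # should be close to 1
prob = integrate(density, -1, 1)                             # P(-1 < x < 1)
mean = integrate(lambda x: x * density(x), -12, 12)          # E(X)
second = integrate(lambda x: x ** 2 * density(x), -12, 12)   # E(X^2)

print(total, prob, mean, second)
```

For this particular psi the probability of finding the particle in (-1, 1) is the familiar "one standard deviation" probability of a normal distribution.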

This is a great chance to practice putting together: differentiation under the integral sign, Schrödinger’s equation and integration by parts. I recommend that the reader try to show:

\frac{d}{dt} \int_{x_1}^{x_2} \overline{\psi}\psi dx = \frac{i\hbar}{2m}(\overline{\psi}\frac{d \psi}{dx}-\psi \frac{d \overline{\psi}}{dx})_{x_1}^{x_2}

The details for the above calculation (students: try this yourself first! 🙂 )

Differentiation under the integral sign:
\frac{d}{dt} \int_{x_1}^{x_2} \overline{\psi} \psi dx = \int_{x_1}^{x_2} (\overline{\psi} \frac{\partial \psi}{\partial t} + \psi \frac{\partial \overline{ \psi}}{\partial t}) dx

Schrödinger’s equation (time dependent version) with a little bit of algebra:
\frac{\partial \psi}{\partial t} = \frac{i \hbar}{2m} \frac{\partial^2 \psi}{\partial x^2} - \frac{i}{\hbar}V \psi
\frac{\partial \overline{\psi}}{\partial t} = -\frac{i \hbar}{2m} \frac{\partial^2 \overline{\psi}}{\partial x^2} + \frac{i}{\hbar}V \overline{\psi}

Note: V is real, so the second equation is just the complex conjugate of the first.

Algebra: eliminate the partial with respect to time terms; multiply the top equation by \overline{\psi} and the second by \psi . Then add the two (the V terms cancel) to obtain:
\overline{\psi} \frac{\partial \psi}{\partial t} + \psi \frac{\partial \overline{ \psi}}{\partial t} = \frac{i \hbar}{2m}(\overline{\psi} \frac{\partial^2 \psi}{\partial x^2} - \psi \frac{\partial^2 \overline{ \psi}}{\partial x^2})

Now integrate by parts:
\frac{i \hbar}{2m} \int_{x_1}^{x_2} (\overline{\psi} \frac{\partial^2 \psi}{\partial x^2} - \psi \frac{\partial^2 \overline{ \psi}}{\partial x^2}) dx =

\frac{i\hbar}{2m} ((\overline{\psi} \frac{\partial \psi}{\partial x})_{x_1}^{x_2} - \int_{x_1}^{x_2} \frac{\partial \overline{\psi}}{\partial x} \frac{\partial \psi}{\partial x} dx - ( (\psi \frac{\partial \overline{\psi}}{\partial x})_{x_1}^{x_2}  - \int_{x_1}^{x_2}\frac{\partial \psi}{\partial x}\frac{\partial \overline{\psi}}{\partial x}dx))

Now the integrals cancel each other and we obtain our result.

It is common to denote -\frac{i\hbar}{2m}(\overline{\psi}\frac{d \psi}{dx}-\psi \frac{d \overline{\psi}}{dx}) by S(x,t) (note the minus sign) and to say \frac{d}{dt}P(x_1 < x < x_2 ; t) = S(x_1,t) - S(x_2,t) (see the reason for the minus sign?)

S(x,t) is called the position probability current at the point x at time t . One can think of this as a "probability flow rate" over the point x at time t ; the quantity S(x_1, t) - S(x_2, t) will tell you if the probability of finding the particle between position x_1 and x_2 is going up (positive sign) or down, and at what rate. But it is important to remember that this is a position PROBABILITY current and not a PARTICLE current; the same goes for |\psi |^2 ; this is the position probability density function, not the particle density function.
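As a quick illustration of S(x,t) = -\frac{i\hbar}{2m}(\overline{\psi}\frac{d \psi}{dx}-\psi \frac{d \overline{\psi}}{dx}) , the sketch below evaluates it at a fixed time for two sample functions: a plane wave exp(i p_0 x/\hbar) , for which the formula works out by hand to p_0/m , and a real-valued function, for which the current vanishes. The numerical values of \hbar , m and p_0 are arbitrary assumptions.

```python
import cmath
import math

HBAR = 1.0   # assumption: units with hbar = 1
M = 2.0      # assumed mass
P0 = 3.0     # assumed momentum of the plane wave

def current(psi, x, h=1e-5):
    """Probability current S at x, using a central difference for d(psi)/dx."""
    dpsi = (psi(x + h) - psi(x - h)) / (2 * h)
    # For real x, the derivative of conj(psi) is the conjugate of d(psi)/dx.
    value = -1j * HBAR / (2 * M) * (psi(x).conjugate() * dpsi - psi(x) * dpsi.conjugate())
    return value.real  # S is real; any imaginary part is rounding noise

def plane_wave(x):
    return cmath.exp(1j * P0 * x / HBAR)

def real_gaussian(x):
    return math.exp(-x ** 2 / 4)

print(current(plane_wave, 0.3), P0 / M)   # should agree closely
print(current(real_gaussian, 0.5))        # zero for a real-valued function
```

So a plane wave carries probability at the classical velocity p_0/m , while a real-valued wave function has no net probability flow.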

NOTE: I haven’t talked about the position and momentum eigenvalues or eigenfunctions. We’ll do that in our next post; we’ll run into some mathematical trouble here. No, it won’t be with the position because we already know what a distribution is; the problem is that we’ll find the momentum eigenvector really isn’t square integrable….or even close.

August 10, 2011

Quantum Mechanics and Undergraduate Mathematics X: Schrödinger’s Equations

Filed under: advanced mathematics, applied mathematics, calculus, physics, quantum mechanics, science — collegemathteaching @ 1:19 am

Recall from classical mechanics: E = \frac{1}{2}mv^2 + V(x) where E is energy and V(x) is potential energy. We also have position x and momentum p = mv . Note that we can then write E = \frac{p^2}{2m} + V(x) . Analogues exist in quantum mechanics and this is the subject of:

Postulate 6. Momentum and position (one dimensional motion) are represented by the operators:
X = x and P = -i\hbar \frac{d}{dx} respectively. If f is any “well behaved” function of two variables (say, locally analytic?) then A = f(X, P) = f(x, -i\hbar \frac{d}{dx} ) .

To see how this works: let \phi(x) = (2 \pi)^{-\frac{1}{4}}exp(-\frac{x^2}{4})
Then X \phi = (2 \pi)^{-\frac{1}{4}}x exp(-\frac{x^2}{4}) and P \phi = i\hbar (2 \pi)^{-\frac{1}{4}} \frac{x}{2} exp(-\frac{x^2}{4})

Associated with these is the energy (Hamiltonian) operator H = \frac{1}{2m} P^2 + V(X) where P^2 means “do P twice”. So H = -\frac{\hbar^2}{2m}\frac{d^2}{dx^2} + V(x) .

Note: We are going to show that these two operators are Hermitian…sort of. Why sort of: these operators A might not be “closed” in the sense that \langle \phi_1, \phi_2 \rangle exists but \langle \phi_1, A \phi_2 \rangle might not exist. Here is a simple example: let \phi_1 = \phi_2 = \sqrt{\frac{1}{\pi} \frac{1}{x^2 + 1}} . Then \langle \phi_1, \phi_2 \rangle = 1 but \langle \phi_1, X \phi_2 \rangle = \int_{-\infty}^{\infty} \frac{x}{\pi(x^2+1)} dx fails to exist.

So the unstated assumption is that when we are proving that various operators are Hermitian, we mean that they are Hermitian for state vectors which are transformed into functions for which the given inner product is defined.

So, with this caveat in mind, let’s show that these operators are Hermitian.

X clearly is because \langle \phi_1, x \phi_2 \rangle = \langle x \phi_1, \phi_2 \rangle . If this statement is confusing, remember that x is a real variable and therefore \overline{x} = x . Clearly, any well behaved real valued function of x is also a Hermitian operator. If we assume that P is a Hermitian operator, then \langle \phi_1, P^2 \phi_2 \rangle = \langle P\phi_1, P\phi_2 \rangle = \langle P^2 \phi_1, \phi_2 \rangle . So we must show that P is Hermitian.

This is a nice exercise in integration by parts:
\langle \phi_1, P\phi_2 \rangle = -i\hbar\langle \phi_1, \frac{d}{dx} \phi_2 \rangle = -i\hbar \int_{-\infty}^{\infty} \overline{\phi_1} \frac{d}{dx} \phi_2 dx . Now we note that \overline{\phi_1} \phi_2 |_{-\infty}^{\infty} = 0 (else the improper integrals would fail to converge; this is a property assumed for state vectors, since mathematically it is possible that the limit as x \rightarrow \infty doesn’t exist but the integral still converges) and so by the integration by parts formula we get i\hbar\int_{-\infty}^{\infty} \overline{\frac{d}{dx}\phi_1}  \phi_2 dx =\int_{-\infty}^{\infty} \overline{-i\hbar\frac{d}{dx}\phi_1}  \phi_2 dx = \langle P\phi_1, \phi_2 \rangle .
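The identity \langle \phi_1, P\phi_2 \rangle = \langle P\phi_1, \phi_2 \rangle can also be checked numerically for particular functions. In this sketch the two complex-valued, rapidly decaying test functions are arbitrary choices, \hbar is set to 1, the derivative is a central difference, and the real line is truncated to [-15, 15].

```python
import cmath
import math

HBAR = 1.0  # assumption: units with hbar = 1

def phi1(x):
    """Arbitrary complex-valued test function that vanishes at infinity."""
    return math.exp(-x * x / 4) * cmath.exp(1j * x)

def phi2(x):
    """A second arbitrary test function."""
    return (x + 0.5j) * math.exp(-x * x / 2)

def P(f, h=1e-5):
    """Momentum operator -i*hbar*d/dx, via a central difference."""
    return lambda x: -1j * HBAR * (f(x + h) - f(x - h)) / (2 * h)

def inner(f, g, a=-15.0, b=15.0, n=60000):
    """<f, g> = integral of conj(f)*g, approximated by the midpoint rule."""
    dx = (b - a) / n
    total = 0j
    for i in range(n):
        x = a + (i + 0.5) * dx
        total += f(x).conjugate() * g(x)
    return total * dx

lhs = inner(phi1, P(phi2))
rhs = inner(P(phi1), phi2)
print(lhs, rhs)  # the two complex numbers should agree closely
```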

Note that potential energy is a function of x so it too is Hermitian. So our Hamiltonian H(p,x) = \frac{1}{2m}P^2 + V(X) = -\frac{\hbar^2}{2m}\frac{d^2}{dx^2} + V(x) is also Hermitian. That has some consequences:

1. H \eta_k = e_k \eta_k
2. H \psi = i\hbar\frac{\partial}{\partial t} \psi

Now we substitute for H and obtain:

1. -\frac{\hbar^2}{2m} \frac{d^2}{dx^2} \eta_k + V(x)\eta_k = e_k \eta_k

2. -\frac{\hbar^2}{2m} \frac{\partial^2}{\partial x^2} \psi + V(x)\psi = i\hbar \frac{\partial}{\partial t} \psi

These are the Schrödinger equations; the first one is the time independent equation. It is about each Hamiltonian energy eigenvector…or you might say each stationary state vector. This holds for each k . The second one is the time dependent one and applies to the state vector in general (not just the stationary states). It is called the fundamental time evolution equation for the state vector.

Special note: if one adjusts the Hamiltonian by adding a constant C , the eigenvectors remain the same but the eigenvalues are adjusted by adding that constant. So the time-dependent state vector gets adjusted by a factor of exp(-iC \frac{t}{\hbar}) which has a modulus of 1. So the new state vector describes the same state as the old one.

Next post: we’ll give an example and then derive the eigenvalues and eigenvectors for the position and momentum operators. Yes, this means dusting off the Dirac delta distribution.

August 8, 2011

Quantum Mechanics and Undergraduate Mathematics VIII: Time Evolution of Expectation of an Observable

Filed under: advanced mathematics, applied mathematics, physics, probability, quantum mechanics, science — collegemathteaching @ 3:12 pm

Back to our series on QM: one thing to remember about observables: they are operators with a set collection of eigenvectors and eigenvalues (allowable values that can be observed; “quantum levels” if you will). These do not change with time. So \frac{d}{dt} (A (\psi)) = A (\frac{\partial}{\partial t} \psi) . One can work this out by expanding A \psi if one wants to.

So with this fact, let’s see how the expectation of an observable evolves with time (given a certain initial state):
\frac{d}{dt} E(A) = \frac{d}{dt} \langle \psi, A \psi \rangle = \langle \frac{\partial}{\partial t} \psi, A \psi \rangle + \langle \psi, A \frac{\partial}{\partial t} \psi \rangle

Now apply the Hamiltonian to account for the time change of the state vector; we obtain:
\langle -\frac{i}{\hbar}H \psi, A \psi \rangle + \langle \psi, -\frac{i}{\hbar}AH \psi \rangle = \overline{-\frac{i}{\hbar}} \langle H \psi, A \psi \rangle + -\frac{i}{\hbar} \langle \psi, AH \psi \rangle

Now use the fact that both H and A are Hermitian to obtain:
\frac{d}{dt} E(A) = \frac{i}{\hbar} \langle \psi, (HA - AH) \psi \rangle .
So, we see the operator HA - AH once again; note that if A, H commute then the expectation of the observable (or the standard deviation for that matter) does not evolve with time. This is certainly true for H itself. Note: an operator that commutes with H is sometimes called a “constant of motion” (think: “total energy of a system” in classical mechanics).
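Here is a small finite-dimensional sketch of the formula for the time derivative of the expectation, \frac{i}{\hbar} \langle \psi, (HA - AH) \psi \rangle . Everything specific in it is an assumption made for illustration: a two-level system with energy eigenvalues 1 and 3, expansion coefficients 0.6 and 0.8, \hbar = 1 , and an observable A that swaps the two energy eigenvectors.

```python
import cmath

HBAR = 1.0            # assumption: units with hbar = 1
E1, E2 = 1.0, 3.0     # hypothetical energy eigenvalues
C1, C2 = 0.6, 0.8     # hypothetical expansion coefficients (norm 1)
H = [[E1, 0], [0, E2]]   # Hamiltonian written in its own eigenbasis
A = [[0, 1], [1, 0]]     # a Hermitian observable that does not commute with H

def psi(t):
    """State vector at time t: each energy component picks up a phase."""
    return [C1 * cmath.exp(-1j * E1 * t / HBAR), C2 * cmath.exp(-1j * E2 * t / HBAR)]

def apply(m, v):
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]

def inner(u, v):
    return sum(a.conjugate() * b for a, b in zip(u, v))

def expectation(op, t):
    v = psi(t)
    return inner(v, apply(op, v)).real

def rate_from_commutator(t):
    """(i/hbar) <psi, (HA - AH) psi> at time t."""
    v = psi(t)
    ha = apply(H, apply(A, v))
    ah = apply(A, apply(H, v))
    return (1j / HBAR * inner(v, [x - y for x, y in zip(ha, ah)])).real

def rate_numeric(t, h=1e-6):
    """Central-difference derivative of E(A) with respect to t."""
    return (expectation(A, t + h) - expectation(A, t - h)) / (2 * h)

print(rate_from_commutator(0.4), rate_numeric(0.4))  # should agree closely
```

Replacing A by H in the two rate functions gives zero, in line with the remark that the expectation of H does not evolve.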

Note also that |\frac{d}{dt} E(A) | = |\frac{i}{\hbar} \langle \psi, (HA - AH) \psi \rangle | \leq \frac{2}{\hbar} \Delta A \Delta H

If A does NOT correspond with a constant of motion, then it is useful to define an evolution time T_A = \frac{\Delta A}{|\frac{d}{dt}E(A)|} where \Delta A = (V(A))^{1/2} . This gives an estimate of how much time must elapse before the expectation of A changes by an amount equal to the uncertainty in the observable.

Note: we can apply this to H and A to obtain T_A \Delta H \ge \frac{\hbar}{2}

Consequences: if T_A is small (i.e., the state changes rapidly) then the uncertainty \Delta H is large; hence the energy cannot be well defined (as a numerical value). If the energy has low uncertainty then T_A must be large; that is, the state is very slowly changing. This is called the time-energy uncertainty relation.

July 28, 2011

Quantum Mechanics and Undergraduate Mathematics VII: Time Evolution of the State Vector

Filed under: advanced mathematics, applied mathematics, physics, quantum mechanics, science — collegemathteaching @ 2:38 pm

Of course the state vector \psi changes with time. The question is how does it change with time and how does the probability density function associated with an observable change with time?

Note: we will write \psi_t for \psi(x,t) . Now let A be an observable. Note that the eigenvectors and the eigenvalues associated with A do NOT change with time, so if we expand \psi_t in terms of the eigenbasis for A we have \psi_t = \sum_k \langle \alpha_k, \psi_t \rangle \alpha_k hence \frac{\partial \psi_t}{\partial t} = \sum_k \langle \alpha_k, \frac{\partial \psi_t}{\partial t} \rangle \alpha_k

Of course, we need the state vector to “stay in the class of state vectors” when it evolves with respect to time, which means that the norm cannot change; or \frac{d}{dt} \langle \psi_t, \psi_t \rangle = 0 .

Needless to say there has to be some restriction on how the state vector can change with time. So we have another postulate:

Postulate 5
For every physical system there exists a linear Hermitian operator H called the Hamiltonian operator such that:
1. i\hbar \frac{\partial}{\partial t} \psi(x,t) = H\psi(x,t) and
2. H corresponds to the total energy of the system and possesses a complete set of eigenvectors \eta_k and eigenvalues e_k where the eigenvalues are the “allowed values” of the total energy of the system.

Note: \hbar is the constant \frac{h}{2\pi} where h is Planck’s constant.

Note: H is not specified; it is something that the physicists have to come up with by observing the system. That is, there is a partial differential equation to be solved!

But does this give us what we want, at least in terms of \psi_t staying at unit norm for all times t ? (note: again we write \psi_t for \psi(x,t) ).

The answer is yes; first note that \frac{d}{dt} \langle \psi_t, \psi_t \rangle = \langle \frac{\partial \psi_t}{\partial t}, \psi_t \rangle + \langle \psi_t,  \frac{\partial \psi_t}{\partial t}\rangle (this is an easy exercise in using the definition of our inner product, differentiating under the integral sign and noting that the partial derivative operation and the conjugate operation commute).

Now note: \langle \frac{\partial \psi_t}{\partial t}, \psi_t \rangle + \langle \psi_t,  \frac{\partial \psi_t}{\partial t}\rangle = \overline{-\frac{i}{\hbar}}\langle H\psi_t, \psi_t \rangle + -\frac{i}{\hbar}\langle \psi_t, H \psi_t \rangle = \frac{i}{\hbar}(\langle H\psi_t, \psi_t \rangle - \langle \psi_t, H\psi_t \rangle) = 0
because H is Hermitian.

Note: at this point Gillespie takes an aside and notes that if one denotes the state vector at time t = 0 by \psi_0 then one can attempt to find an operator U(t) where \psi_t = U(t)\psi_0 . This leads to i \hbar \frac{\partial U(t)}{\partial t}\psi_0 = HU(t)\psi_0 which must be true for all \psi_0 . This leads to i\hbar \frac{\partial U(t)}{\partial t} = HU(t) with initial condition U(0) = 1 . This leads to the solution U(t) = exp(\frac{-iHt}{\hbar}) where the exponential is defined in the sense of linear algebra (use the power series expansion of exp(A) ). Note: U is not Hermitian and therefore NOT an observable. But it has a special place in quantum mechanics and is called the time-evolution operator.
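For a finite-dimensional stand-in, the power series for U(t) = exp(\frac{-iHt}{\hbar}) can be summed directly. The sketch below assumes a hypothetical 2x2 Hermitian matrix in place of the Hamiltonian, \hbar = 1 , and a truncation of the series at 40 terms; it checks that the norm of the state vector is preserved and that U itself is not Hermitian.

```python
HBAR = 1.0  # assumption: units with hbar = 1

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def expm(a, terms=40):
    """exp(a) via the truncated power series: sum of a^k / k! for k < terms."""
    n = len(a)
    result = [[1 + 0j if i == j else 0j for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]          # current term a^k / k!, starting at k = 0
    for k in range(1, terms):
        term = [[entry / k for entry in row] for row in matmul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

def U(h, t):
    """Time-evolution operator exp(-i*h*t/hbar) for a matrix Hamiltonian h."""
    return expm([[-1j * t * entry / HBAR for entry in row] for row in h])

def apply(m, v):
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]

def norm_sq(v):
    return sum(abs(c) ** 2 for c in v)

H = [[1.0, 0.5 - 0.5j], [0.5 + 0.5j, -1.0]]   # hypothetical Hermitian "Hamiltonian"
psi0 = [0.6, 0.8j]                             # initial state vector with norm 1
Ut = U(H, 2.0)
psi_t = apply(Ut, psi0)

print(norm_sq(psi0), norm_sq(psi_t))           # both should be 1 up to rounding
print(Ut[0][1], Ut[1][0].conjugate())          # these differ, so Ut is not Hermitian
```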

Next: we’ll deal with the time energy relation for the expected value of a specific observable.

July 25, 2011

Quantum Mechanics and Undergraduate Mathematics VI: Heisenberg Uncertainty Principle

Filed under: advanced mathematics, applied mathematics, physics, probability, quantum mechanics, science — collegemathteaching @ 10:05 pm

Here we use the Cauchy-Schwarz inequality, other facts about inner products and basic probability to derive the Heisenberg Uncertainty Principle for incompatible observables A and B . We assume some state vector \psi which has not been given time to evolve between measurements and we will abuse notation by viewing A and B as random variables for their given eigenvalues a_k, b_k given state vector \psi .

What we are after is the following: V(A)V(B) \geq (1/4)|\langle \psi, (AB-BA) \psi \rangle|^2.
When AB-BA = c we get: V(A)V(B) \geq (1/4)|c|^2 which is how it is often stated.

The proof is a bit easier when we make the expected values of A and B equal to zero; we do this by introducing new linear operators A' = A -E(A) and B' = B - E(B) ; note that (A - E(A))\psi = A\psi - E(A)\psi . The following are routine exercises:
1. A' and B' are Hermitian
2. A'B' - B'A' = AB-BA
3. V(A') = V(A) .

If one is too lazy to work out 3:
V(A') = E((A-E(A))^2) - (E(A -E(A)))^2 = E(A^2 - 2AE(A) + E(A)E(A)) - 0 = E(A^2) -2E(A)E(A) + (E(A))^2 = E(A^2) - (E(A))^2 = V(A)

Now we have everything in place:
\langle \psi, (AB-BA) \psi \rangle = \langle \psi, (A'B'-B'A') \psi \rangle = \langle A'\psi, B' \psi \rangle - \langle B'\psi, A' \psi \rangle = \langle A'\psi, B' \psi \rangle - \overline{\langle A'\psi, B' \psi \rangle} = 2iIm\langle A'\psi, B'\psi \rangle
We now can take the modulus of both sides:
|\langle \psi, (AB-BA)\psi \rangle | = 2 |Im \langle A'\psi, B'\psi \rangle | \leq 2|\langle A'\psi, B'\psi\rangle | \leq 2 \sqrt{\langle A'\psi,A'\psi\rangle}\sqrt{\langle B'\psi, B'\psi\rangle} = 2 \sqrt{V(A')}\sqrt{V(B')} = 2\sqrt{V(A)}\sqrt{V(B)}
Here \langle A'\psi, A'\psi \rangle = \langle \psi, A'^2 \psi \rangle = E(A'^2) = V(A') because E(A') = 0 . Squaring both sides and dividing by 4 gives the inequality we were after.

This means that, unless A and B are compatible observables, there is a lower bound on the product of their standard deviations that cannot be done away with by more careful measurement. It is physically impossible to drive this product to zero. This also means that one of the standard deviations cannot be zero unless the other is infinite.
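The inequality V(A)V(B) \geq (1/4)|\langle \psi, (AB-BA) \psi \rangle|^2 is easy to test in a finite-dimensional setting. The sketch below uses the 2x2 Pauli matrices as the two Hermitian operators and two sample state vectors; these choices are assumptions made only for illustration.

```python
import cmath
import math

SX = [[0, 1], [1, 0]]        # Pauli matrix sigma_x (Hermitian)
SY = [[0, -1j], [1j, 0]]     # Pauli matrix sigma_y (Hermitian)

def apply(m, v):
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]

def inner(u, v):
    return sum(a.conjugate() * b for a, b in zip(u, v))

def expectation(op, v):
    return inner(v, apply(op, v)).real

def variance(op, v):
    """V(A) = E(A^2) - E(A)^2, using <A psi, A psi> = E(A^2) for Hermitian A."""
    av = apply(op, v)
    return inner(av, av).real - expectation(op, v) ** 2

def uncertainty_sides(a, b, v):
    """Return (V(A)V(B), (1/4)|<psi, (AB - BA) psi>|^2)."""
    ab = apply(a, apply(b, v))
    ba = apply(b, apply(a, v))
    comm = inner(v, [x - y for x, y in zip(ab, ba)])
    return variance(a, v) * variance(b, v), 0.25 * abs(comm) ** 2

generic = [math.cos(0.3), cmath.exp(0.5j) * math.sin(0.3)]   # sample unit vector
spin_up = [1, 0]                                             # here equality holds

print(uncertainty_sides(SX, SY, generic))
print(uncertainty_sides(SX, SY, spin_up))
```

For the second state both sides equal 1, so the bound cannot be improved in general.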

Quantum Mechanics and Undergraduate Mathematics V: compatible observables

This builds on our previous example. We start with a state \psi and we will make three successive observations of observables which have operators A and B in the following order: A, B, A . The assumption is that these observations are made so quickly that no time evolution of the state vector can take place; all of the change to the state vector will be due to the effect of the observations.

A simplifying assumption will be that the observation operators have the following property: no two different eigenvectors have the same eigenvalues (i.e., the eigenvalue uniquely determines the eigenvector up to multiplication by a constant of unit modulus).

First of all, this is what “compatible observables” means: two observables A, B are compatible if, upon three successive measurements A, B, A the first measurement of A is guaranteed to agree with the second measurement of A . That is, the state vector after the first measurement of A is the same state vector after the second measurement of A .

So here is what the compatibility theorem says (I am freely abusing notation by calling the observable by the name of its associated operator):

Compatibility Theorem
The following are equivalent:

1. A, B are compatible observables.
2. A, B have a common eigenbasis.
3. A, B commute (as operators)

Note: for this discussion, we’ll assume an eigenbasis of \alpha_i for A and \beta_i for B .

1 implies 2: Suppose the state of the system is \alpha_k just prior to the first measurement. Then the first measurement is a_k . The second measurement yields b_j which means the system is in state \beta_j , in which case the third measurement is guaranteed to be a_k (it is never anything else by the compatible observable assumption). Hence the state vector must have been \alpha_k which is the same as \beta_j . So, by some reindexing we can assume that \alpha_1 = \beta_1 . An argument about completeness and orthogonality finishes the proof of this implication.

2 implies 1: after the first measurement, the state of the system is \alpha_k which, being a basis vector for observable B means that the system after the measurement of B stays in the same state, which implies that the state of the system will remain \alpha_k after the second measurement of A . Since this is true for all basis vectors, we can extend this to all state vectors, hence the observables are compatible.

2 implies 3: a common eigenbasis implies that the operators commute on basis elements so the result follows (by some routine linear-algebra type calculations)

3 implies 2: given any eigenvector \alpha_k we have AB \alpha_k = BA \alpha_k = a_k B \alpha_k which implies that B \alpha_k is an eigenvector for A with eigenvalue a_k (or is the zero vector). By our simplifying assumption this means that B \alpha_k = c \alpha_k for some scalar c ; hence \alpha_k must be an eigenvector of B . In this way, we establish a correspondence between the eigenbasis of B with the eigenbasis of A .

Ok, what happens when the observables are NOT compatible?

Here is a lovely application of conditional probability. It works this way: suppose on the first measurement, a_k is observed. This puts us in state vector \alpha_k . Now we measure the observable B which means that there is a probability |\langle \alpha_k, \beta_i \rangle|^2 of observing eigenvalue b_i . Now \beta_i is the new state vector and when observable A is measured, we have a probability |\langle \alpha_j, \beta_i \rangle|^2 of observing eigenvalue a_j in the second measurement of observable A .

Therefore given the initial measurement we can construct a conditional probability density function p(a_j|a_k) = \sum_i p(b_i|a_k)p(a_j|b_i)= \sum_i |\langle \alpha_k, \beta_i \rangle |^2 |\langle \beta_i, \alpha_j \rangle |^2

Again, this makes sense only if the observations were taken so close together so as to not allow the state vector to undergo time evolution; ONLY the measurements change the state vector.
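The conditional probability formula p(a_j|a_k) = \sum_i |\langle \alpha_k, \beta_i \rangle |^2 |\langle \beta_i, \alpha_j \rangle |^2 can be tried out in two dimensions. In this sketch (all choices are assumptions for illustration) the eigenbasis of A is the standard basis and the eigenbasis of B is the standard basis rotated by an angle theta.

```python
import math

def inner(u, v):
    """Inner product, conjugating the first argument."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def rotated_basis(theta):
    """Orthonormal basis obtained by rotating the standard basis by theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [-s, c]]

ALPHA = [[1.0, 0.0], [0.0, 1.0]]   # eigenbasis of A (assumed)

def p_second_A(j, k, beta):
    """p(a_j | a_k): measure A (get a_k), then B, then A again (get a_j)."""
    return sum(abs(inner(ALPHA[k], b)) ** 2 * abs(inner(b, ALPHA[j])) ** 2 for b in beta)

beta = rotated_basis(math.pi / 6)
print(p_second_A(0, 0, beta), p_second_A(1, 0, beta))   # 0.625 and 0.375, up to rounding

# With theta = 0 the bases coincide (compatible observables) and the
# second measurement of A is certain to repeat the first.
print(p_second_A(0, 0, rotated_basis(0.0)))
```

For theta = \pi/6 the first probability is cos^4(\theta) + sin^4(\theta) = 5/8 , so the intermediate measurement of B really can change the outcome of the second measurement of A .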

Next: we move to the famous Heisenberg Uncertainty Principle, which states that, if we view the interaction of the observables A and B with a set state vector and abuse notation a bit and regard the associated density functions (for the eigenvalues) by the same letters, then V(A)V(B) \geq (1/4)|\langle \psi, [AB-BA]\psi \rangle |^2.

Of course, if the observables are compatible, then the right side becomes zero and if AB-BA = c for some non-zero scalar c (that is, (AB-BA) \psi = c\psi for all possible state vectors \psi ), then we get V(A)V(B) \geq (1/4)|c|^2 which is how it is often stated.

July 19, 2011

Quantum Mechanics and Undergraduate Mathematics IV: measuring an observable (example)

Ok, we have to relate the observables to the state of the system. We know that the only possible “values” of the observable are the eigenvalues of the operator and the relation of the operator to the state vector provides the density function. But what does this measurement do to the state? That is, immediately after a measurement is taken, what is the state?

True, the system undergoes a "time evolution" but once an observable is measured, an immediate (termed "successive") measurement will yield the same value; a "repeated" measurement (one made after giving the system time to undergo a time evolution) might give a different value.

So we get:

Postulate 4 A measurement of an observable generally (?) causes a drastic, uncontrollable alteration in the state vector of the system; immediately after the measurement it will coincide with the eigenvector corresponding to the eigenvalue obtained in the measurement.

Note: we assume that our observable operators have distinct eigenvalues; that is, no two distinct eigenvectors have the same eigenvalue.

That is, if we measure an observable with operator A and obtain measurement a_i then the new state vector is the eigenvector \alpha_i regardless of what \psi was prior to measurement. Of course, this state vector can (and usually will) evolve with time.

Roughly speaking, here is what is going on:
Say the system is in state \psi . We measure an observable with operator A . We can only obtain one of the eigenvalues a_k as a measurement. Recall: remember all of those “orbitals” from chemistry class? Those were the energy levels of the electrons and the orbital level was a permissible energy state that we could obtain by a measurement.

Now if we get a_k as a measurement, the new state vector is \alpha_k . One might say that we started with a probability density function (given the state and the observable), we made a measurement, and now, for a brief instant anyway, our density function “collapsed” to the density function P(A = a_k)  = 1 .

This situation (brief) coincides with our classical intuition of an observable “having a value”.

Example (based on our calculation in the previous post):

For the purposes of this example, we’ll set our Hilbert space to be the square integrable piecewise smooth functions on [-\pi, \pi] and let our “state vector” \psi(x) =\left\{ \begin{array}{c}1/\sqrt{\pi}, 0 < x \leq \pi \\ 0,-\pi \leq x \leq 0  \end{array}\right.

Now suppose our observable corresponds to the eigenfunctions mentioned in this post, and we measure “-4” for our observable. This is the eigenvalue for (1/\sqrt{\pi})sin(2x) so our new state vector is (1/\sqrt{\pi})sin(2x) .

So what happens if a different observable is measured IMMEDIATELY (i.e., no chance for a time evolution to take place)?

Example We’ll still use the space of square integrable functions over [-\pi, \pi]
One might recall the Legendre polynomials which are eigenfunctions of the following operator:
d/dt((1-t^2) dP_n/dt) = -(n)(n+1) P_n(t) . These polynomials obey the orthogonality relation \int^{1}_{-1} P_m(t)P_n(t)dt = 2/(2n+1) \delta_{m,n} hence \int^{1}_{-1} P_m(t)P_m(t)dt = 2/(2m+1) .
The first few of these are P_0 = 1, P_1  =t, P_2 = (1/2)(3t^2-1), P_3 = (1/2)(5t^3 - 3t), ..

We can adjust these polynomials by the change of variable t =x/\pi and multiply each polynomial P_m by the factor \sqrt{(2m+1)/(2\pi)} to obtain an orthonormal eigenbasis (the change of variable multiplies each integral by \pi , so \int^{\pi}_{-\pi} P_m(x/\pi)^2 dx = 2\pi/(2m+1) ). Of course, one has to adjust the operator by the chain rule.

So for this example, let P_n denote the adjusted Legendre polynomial with eigenvalue -n(n+1) .

Now back to our original state vector which was changed to state function (1/\sqrt{\pi})sin(2x) .

Now suppose eigenvalue -6 = -2(3) is observed as an observable with the Legendre operator; this corresponds to eigenvector \sqrt{5/(2\pi)}(1/2)(3(x/\pi)^2 -1) which is now the new state vector.
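The normalizing factor for the rescaled Legendre polynomials can be recovered numerically rather than taken on faith. This sketch integrates P_m(x/\pi)P_n(x/\pi) over [-\pi, \pi] with a simple midpoint rule (the rule and the number of subintervals are my own choices) and takes the reciprocal square root of the squared norm.

```python
import math

def legendre(m, t):
    """The first few Legendre polynomials, as listed in the text."""
    if m == 0:
        return 1.0
    if m == 1:
        return t
    if m == 2:
        return 0.5 * (3 * t ** 2 - 1)
    if m == 3:
        return 0.5 * (5 * t ** 3 - 3 * t)
    raise ValueError("only P_0 through P_3 are implemented in this sketch")

def integrate(f, a, b, n=20000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx

def overlap(m, n):
    """Integral over [-pi, pi] of P_m(x/pi) * P_n(x/pi), before any normalization."""
    return integrate(lambda x: legendre(m, x / math.pi) * legendre(n, x / math.pi),
                     -math.pi, math.pi)

def scale(m):
    """Factor that gives the rescaled P_m unit norm on [-pi, pi]."""
    return 1 / math.sqrt(overlap(m, m))

for m in range(4):
    print(m, overlap(m, m), 2 * math.pi / (2 * m + 1), scale(m))
print(overlap(0, 2), overlap(1, 3))   # both should be essentially zero
```

The squared norms come out as 2\pi/(2m+1) , which is the usual 2/(2m+1) multiplied by the factor \pi from the change of variable.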

Now if we were to do an immediate measurement of the first observable, we’d have to do a Fourier-like expansion of our new state vector; hence the probability density function for the observables changes from the initial measurement. Bottom line: the order in which the observations are taken matters….in general.

The case in which the order wouldn’t matter: if the second observable had the state vector (from the first measurement) as an element of its eigenbasis.

We will state this as a general principle in our next post.

July 15, 2011

Quantum Mechanics and Undergraduate Mathematics III: an example of a state function

I feel bad that I haven’t given a demonstrative example, so I’ll “cheat” a bit and give one:

For the purposes of this example, we’ll set our Hilbert space to be the square integrable piecewise smooth functions on [-\pi, \pi] and let our “state vector” be \psi(x) =\left\{ \begin{array}{c}1/\sqrt{\pi}, 0 < x \leq \pi \\ 0,-\pi \leq x \leq 0  \end{array}\right.

Now consider a (bogus) state operator d^2/dx^2 which has an eigenbasis consisting of 1/\sqrt{2\pi} with eigenvalue 0 together with (1/\sqrt{\pi})cos(kx), (1/\sqrt{\pi})sin(kx), k \in \{1, 2, 3, ...\} with eigenvalues -k^2 ; that is, the eigenvalues are 0, -1, -4, -9, ... (note: I know that this is a degenerate case in which some eigenvalues share two eigenfunctions).

Note also that the eigenfunctions are almost the functions used in the usual Fourier expansion; the difference is that I have scaled the functions so that \int^{\pi}_{-\pi} (sin(kx)/\sqrt{\pi})^2 dx = 1 as required for an orthonormal basis with this inner product.

Now we can write \psi = 1/(2 \sqrt{\pi}) + 2/(\pi^{3/2})(sin(x) + (1/3)sin(3x) + (1/5)sin(5x) + ...)
(yes, I am abusing the equal sign here)
This means that b_0 = 1/\sqrt{2}, b_k = 2/(k \pi), k \in \{1,3,5,7,...\} , where b_0 is the coefficient of 1/\sqrt{2\pi} and b_k is the coefficient of (1/\sqrt{\pi})sin(kx) ; all of the other coefficients are zero.

Now the only possible measurements of the operator are 0, -1, -4, -9, ... and the probability density function is: P(A = 0) = 1/2, P(A = -1) = 4/(\pi^2), P(A = -9) = 4/(9 \pi^2), ..., P(A = -(2k-1)^2) = 4/(((2k-1)\pi)^2), ...

One can check that 1/2 + (4/(\pi^2))(1 + 1/9 + 1/25 + 1/49 + 1/81....) = 1.
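For readers who would rather check this on a computer, here is a rough sketch under stated assumptions: the inner products are approximated with a midpoint rule on a grid of my choosing, only the first seven sine and cosine coefficients are computed numerically, and the infinite sum is truncated at an arbitrary cutoff. The names coeff, b_sin, b_cos and total are made up for the example.

```python
import numpy as np

# Midpoint grid on [-pi, pi]; with N even, the jump at x = 0 sits on a cell edge.
N = 200_000
dx = 2 * np.pi / N
x = -np.pi + (np.arange(N) + 0.5) * dx

# The state vector: 1/sqrt(pi) on (0, pi], 0 on [-pi, 0]
psi = np.where(x > 0, 1 / np.sqrt(np.pi), 0.0)

def coeff(basis_fn):
    """Approximate <basis_fn, psi> for a real-valued basis function."""
    return np.sum(basis_fn * psi) * dx

b0 = coeff(np.full_like(x, 1 / np.sqrt(2 * np.pi)))
b_sin = [coeff(np.sin(k * x) / np.sqrt(np.pi)) for k in range(1, 8)]
b_cos = [coeff(np.cos(k * x) / np.sqrt(np.pi)) for k in range(1, 8)]

# Partial sum of the probabilities, using b_k = 2/(k pi) for odd k
odd = np.arange(1, 200_001, 2)
total = 0.5 + np.sum(4 / (odd * np.pi) ** 2)

print(b0, b_sin[:3], total)
```

The numerical b_sin entries should agree with 2/(k \pi) for odd k and vanish for even k , the cosine coefficients should vanish, and total should creep up toward 1 as the cutoff grows.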

Here is a plot of the state function (blue line at the top) along with some of the eigenfunctions multiplied by their respective b_k .

July 13, 2011

Quantum Mechanics and Undergraduate Mathematics II

In the first part of this series, we reviewed some of the mathematical background that we’ll use. Now we get into a bit of the physics.

For simplification, we’ll assume one dimensional, non-relativistic motion. No, nature isn’t that simple; that is why particle physics is hard! 🙂

What we will do is to describe a state of a system and the observables. The state of the system is hard to describe; in the classical case (say the damped mass-spring system in harmonic motion), the state of the system is determined by the system parameters (mass, damping constant, spring constant) and the position and velocity at a set time.

An observable is, roughly speaking, something that can give us information about the state of the system. In classical mechanics, one observable might be H(x, p) = p^2/(2m) + V(x) where p is the system’s momentum and V(x) represents the potential energy at position x . If this seems strange, remember that p = mv , so the kinetic energy mv^2/2 becomes p^2/(2m) when we substitute v = p/m . We bring this up because something similar will appear later.

In quantum mechanics, certain postulates are assumed. I’ll present the ones that Gillespie uses:

Postulate 1: Every possible physical state of a given system corresponds to a Hilbert space vector \psi of unit norm (using the inner product that we talked about) and every such vector corresponds to a possible state of a system. The correspondence of states to the vectors is well defined up to multiplication of a vector by a complex number of unit modulus.

Note: this state vector, while containing all of the knowable information of the system, says nothing about what could be known or how such knowledge might be observed. Of course, this state vector might evolve with time and sometimes it is written as \psi_{t} for this reason.

Postulate 2: There is a one to one correspondence between physical observables and linear Hermitian operators A , each of which possesses a complete, orthonormal set of eigenvectors \alpha_{i} and a corresponding set of real eigenvalues a_i , and the only possible value of any measurement of this observable is one of these eigenvalues.

Note: in the cases when the eigenvalues are discretely distributed (i. e., the set of eigenvalues has no limit point), we get “quantized” behavior from this observable.

We’ll use observables with discrete eigenvalues unless we say otherwise.

Now: is a function of an observable itself an observable? The answer is “yes” if the function is real analytic and we define powers by repeated application: A^n(\psi) = A(A(...A(\psi))) . To see this, assume that f(z) = \sum_i c_i z^i with real coefficients c_i and note that if A is an observable operator then so is cA^n for every real c and every n . One can show this by checking that the eigenvectors of A do not change and that each eigenvalue a merely becomes ca^n . The completeness of the eigenvectors implies convergence when we pass to f .
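A finite-dimensional stand-in may make this concrete. The sketch below is only an illustration: the 2 by 2 matrix A and the polynomial f(z) = 4z^3 - z + 2 are arbitrary choices of mine, and a matrix is of course a much tamer object than an operator on our Hilbert space. It checks that f(A) is still Hermitian, keeps the eigenvectors of A , and has eigenvalues f(a_i) .

```python
import numpy as np

# An arbitrary real symmetric (hence Hermitian) matrix standing in for A
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])

# eigh returns real eigenvalues and orthonormal eigenvectors (as columns)
a, alpha = np.linalg.eigh(A)

def f(z):
    """A polynomial with real coefficients: f(z) = 4 z^3 - z + 2."""
    return 4 * z**3 - z + 2

# The same polynomial applied to the matrix, with A^3 meaning A applied 3 times
fA = 4 * np.linalg.matrix_power(A, 3) - A + 2 * np.eye(2)

# Column i of the residual is f(A) alpha_i - f(a_i) alpha_i; it should vanish
residual = fA @ alpha - alpha * f(a)

print(np.allclose(fA, fA.conj().T), np.allclose(residual, 0))
```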

Now we have states and observables. But how do they interact?
Remember that we showed the following:

Let A be a linear operator with a complete orthonormal eigenbasis \alpha_i and corresponding real eigenvalues a_i . Let \psi be an element of the Hilbert space with unit norm and let \psi = \sum_j b_j \alpha_j .

Then the function P(y = a_i) = (|b_i|)^2 is a probability density function. (note: b_i = \langle \alpha_i , \psi \rangle ).

This will give us exactly what we need! Basically, if the observable has operator A and the system is in state \psi , then the probability of a measurement yielding a result of a_i is (|\langle \alpha_i , \psi \rangle|)^2 . Note: it follows that if the state \psi = \alpha_i then the probability of obtaining a_i is exactly one.

We sum this up in Postulate 3 (page 49 of Gillespie, stated for the “scattered eigenvalues” case):

Postulate 3: If an observable operator A has eigenbasis \alpha_i with eigenvalues a_i and if the corresponding observable is measured on a system which, immediately prior to the measurement, is in state \psi , then the strongest predictive statement that can be made concerning the result of this measurement is as follows: the probability that the measurement will yield a_k is (|\langle \alpha_k , \psi \rangle|)^2 .
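Here is a small sketch of the postulate in a three dimensional toy setting. Everything in it is assumed for illustration: the Hermitian matrix A (chosen to have three distinct eigenvalues), the unit vector psi , and the use of the standard inner product on C^3 in place of the integral inner product.

```python
import numpy as np

# A Hermitian matrix with distinct eigenvalues (3 - sqrt(5))/2, (3 + sqrt(5))/2, 5
A = np.array([[1.0, 1j, 0.0],
              [-1j, 2.0, 0.0],
              [0.0, 0.0, 5.0]])

# Real eigenvalues a_i and orthonormal eigenvectors alpha_i (columns)
a, alpha = np.linalg.eigh(A)

# A state vector of unit norm
psi = np.array([1.0, 1.0, 1j]) / np.sqrt(3)

# b_i = <alpha_i, psi>, conjugating the first slot
b = alpha.conj().T @ psi
probs = np.abs(b) ** 2

# If the state is itself an eigenvector, its eigenvalue is certain
certain = np.abs(alpha.conj().T @ alpha[:, 0]) ** 2

for eigenvalue, p in zip(a, probs):
    print(eigenvalue, p)
```

The entries of probs are the chances of observing each a_i and add up to one; certain puts all of the probability on the first eigenvalue.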

Note: for simplicity, we are restricting ourselves to observables which have distinct eigenvalues (i. e., no two linearly independent eigenvectors have the same eigenvalue). In real life, some observables DO have different eigenvectors with the same eigenvalue. (An example from calculus, though these are NOT Hilbert space vectors: if the operator is d^2/dx^2 then sin(x) and cos(x) both have eigenvalue -1.)

Where we are now: we have a probability distribution to work with which means that we can calculate an expected value and a variance. These values will be fundamental when we tackle uncertainty principles!

Just a reminder from our courses in probability theory: if Y is a random variable with density function P

E(Y) = \sum_i y_i P(y_i) and V(Y)  = E(Y^2) -(E(Y))^2 .

So with our density function P(y = a_i) = (|b_i|)^2 (we use b_i = \langle \alpha_i , \psi \rangle to save space), then if E(A) is the expected observed value of the observable (the expected value of the eigenvalues):
E(A) = \sum_i a_i |b_i|^2 . But this quantity can be calculated in another way:

\langle \psi , A(\psi) \rangle = \langle \sum_i b_i \alpha_i , A(\sum_j b_j \alpha_j) \rangle = \langle \sum_i b_i \alpha_i , \sum_j a_j b_j \alpha_j \rangle = \sum_i \sum_j \overline{b_i} b_j a_j \langle \alpha_i, \alpha_j \rangle = \sum_i \overline{b_i} b_i a_i = \sum_i |b_i|^2 a_i = E(A) , where orthonormality kills the terms with i \neq j . Yes, I skipped some easy steps.

Using this we find V(A) = \langle \psi, A^2(\psi) \rangle - (\langle \psi, A(\psi) \rangle )^2 and it is customary to denote the standard deviation \sqrt{V(A)} = \Delta(A)
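As a sanity check on these two formulas, here is a minimal sketch with a 2 by 2 example. The matrix A (eigenvalues -1 and 1), the angle theta and the variable names are all my own choices for illustration. It computes the expected value and variance once from the probabilities and once from the inner products.

```python
import numpy as np

# A Hermitian matrix with eigenvalues -1 and 1
A = np.array([[0.0, 1.0],
              [1.0, 0.0]])

# A unit state vector; theta = 0.3 is an arbitrary choice
theta = 0.3
psi = np.array([np.cos(theta), np.sin(theta)])

# Route 1: probabilities |<alpha_i, psi>|^2, then E and V as in probability theory
a, alpha = np.linalg.eigh(A)
p = np.abs(alpha.conj().T @ psi) ** 2
E_prob = np.sum(a * p)
V_prob = np.sum(a**2 * p) - E_prob**2

# Route 2: inner products <psi, A psi> and <psi, A^2 psi>
# (np.vdot conjugates its first argument)
E_op = np.vdot(psi, A @ psi).real
V_op = np.vdot(psi, A @ A @ psi).real - E_op**2

delta = np.sqrt(V_op)  # the standard deviation Delta(A)
print(E_prob, E_op, V_prob, V_op, delta)
```

For this particular state one can also work it out by hand: E(A) = sin(2\theta) and V(A) = cos^2(2\theta) , and both routes should agree with those values.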

In our next installment, I give an illustrative example.

In a subsequent installment, we’ll show how a measurement of an observable affects the state and later how the distribution of the observable changes with time.
