College Math Teaching

August 10, 2011

Quantum Mechanics and Undergraduate Mathematics X: Schrödinger’s Equations

Filed under: advanced mathematics, applied mathematics, calculus, physics, quantum mechanics, science — collegemathteaching @ 1:19 am

Recall from classical mechanics: $E = \frac{1}{2}mv^2 + V(x)$ where $E$ is energy and $V(x)$ is potential energy. We also have position $x$ and momentum $p = mv$. Note that we can then write $E = \frac{p^2}{2m} + V(x)$. Analogues exist in quantum mechanics, and this is the subject of:

Postulate 6. Momentum and position (one dimensional motion) are represented by the operators:
$X = x$ and $P = -i\hbar \frac{d}{dx}$ respectively. If $f$ is any “well behaved” function of two variables (say, locally analytic?) then $A = f(X, P) = f(x, -i\hbar \frac{d}{dx} )$.

To see how this works, let $\phi(x) = (2 \pi)^{-\frac{1}{4}}exp(-\frac{x^2}{4})$.
Then $X \phi = (2 \pi)^{-\frac{1}{4}}x exp(-\frac{x^2}{4})$ and $P \phi = \frac{i\hbar}{2} (2 \pi)^{-\frac{1}{4}} x exp(-\frac{x^2}{4})$, since differentiating $exp(-\frac{x^2}{4})$ brings down a factor of $-\frac{x}{2}$.
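A quick numerical sanity check of this computation, in Python with numpy; setting $\hbar = 1$ and approximating $\frac{d}{dx}$ by a centered difference are choices made just for this sketch:

```python
import numpy as np

hbar = 1.0  # natural units, an assumption for this sketch
x = np.linspace(-8, 8, 4001)
dx = x[1] - x[0]

# the test state phi(x) = (2 pi)^(-1/4) exp(-x^2/4)
phi = (2 * np.pi) ** (-0.25) * np.exp(-x**2 / 4)

# X phi = x phi
X_phi = x * phi

# P phi = -i hbar d/dx phi, approximated by a centered difference
P_phi = -1j * hbar * np.gradient(phi, dx)

# analytically d/dx exp(-x^2/4) = -(x/2) exp(-x^2/4),
# so P phi should equal (i hbar / 2) x phi
expected = 0.5j * hbar * x * phi
err = np.max(np.abs(P_phi - expected))
```

Note that this $\phi$ also has unit norm, as a state vector should.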

Associated with these is the energy (Hamiltonian) operator $H = \frac{1}{2m} P^2 + V(X)$ where $P^2$ means “do $P$ twice”. So $H = -\frac{\hbar^2}{2m}\frac{d^2}{dx^2} + V(x)$.

Note: we are going to show that these two operators are Hermitian…sort of. Why “sort of”: these operators $A$ might not be “closed” in the sense that $\langle \phi_1, \phi_2 \rangle$ exists but $\langle \phi_1, A \phi_2 \rangle$ might not exist. Here is a simple example: let $\phi_1 = \phi_2 = \sqrt{\frac{1}{\pi (x^2 + 1)}}$. Then $\langle \phi_1, \phi_2 \rangle = 1$ but $\langle \phi_1, X \phi_2 \rangle = \frac{1}{\pi}\int_{-\infty}^{\infty} \frac{x}{x^2+1} dx$ fails to exist.

So the unstated assumption is that when we prove that various operators are Hermitian, we mean that they are Hermitian for state vectors which are transformed into functions for which the given inner product is defined.

So, with this caveat in mind, let’s show that these operators are Hermitian.

$X$ clearly is because $\langle \phi_1, x \phi_2 \rangle = \langle x \phi_1, \phi_2 \rangle$. If this statement is confusing, remember that $x$ is a real variable and therefore $\overline{x} = x$. Clearly, any well behaved real valued function of $x$ is also a Hermitian operator. If we assume that $P$ is a Hermitian operator, then $\langle \phi_1, P^2 \phi_2 \rangle = \langle P\phi_1, P\phi_2 \rangle = \langle P^2 \phi_1, \phi_2 \rangle$. So we must show that $P$ is Hermitian.

This is a nice exercise in integration by parts:
$\langle \phi_1, P\phi_2 \rangle = -i\hbar\langle \phi_1, \frac{d}{dx} \phi_2 \rangle = -i\hbar \int_{-\infty}^{\infty} \overline{\phi_1} \frac{d}{dx} \phi_2 dx$. Now we note that $\overline{\phi_1} \phi_2 |_{-\infty}^{\infty} = 0$ (a property assumed for state vectors; mathematically it is possible for the limit as $x \rightarrow \pm\infty$ to fail to exist while the integral still converges) and so by the integration by parts formula we get $i\hbar\int_{-\infty}^{\infty} \overline{\frac{d}{dx}\phi_1} \phi_2 dx =\int_{-\infty}^{\infty} \overline{-i\hbar\frac{d}{dx}\phi_1} \phi_2 dx = \langle P\phi_1, \phi_2 \rangle$.
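The boundary-term argument can be spot-checked numerically; the two rapidly decaying test functions below are arbitrary choices made just for this check:

```python
import numpy as np

hbar = 1.0  # natural units
x = np.linspace(-10, 10, 20001)
dx = x[1] - x[0]

# two arbitrary smooth "state-like" functions that vanish at +-infinity
phi1 = np.exp(-x**2 / 2) * (1 + 1j * x)
phi2 = np.exp(-x**2 / 3) * np.cos(x)

def P(f):
    # P f = -i hbar df/dx, via a centered difference
    return -1j * hbar * np.gradient(f, dx)

def inner(f, g):
    # <f, g> = integral of conj(f) g, by a Riemann sum
    return dx * np.sum(np.conj(f) * g)

lhs = inner(phi1, P(phi2))  # <phi1, P phi2>
rhs = inner(P(phi1), phi2)  # <P phi1, phi2>
diff = abs(lhs - rhs)
```

The two inner products agree to within the discretization error, as the integration-by-parts argument predicts.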

Note that potential energy is a function of $x$ so it too is Hermitian. So our Hamiltonian $H(p,x) = \frac{1}{2m}P^2 + V(X) = -\frac{\hbar^2}{2m}\frac{d^2}{dx^2} + V(x)$ is also Hermitian. Now recall the eigenvalue equation for $H$ and the time evolution postulate:

1. $H \eta_k = e_k \eta_k$
2. $H \psi = i\hbar\frac{\partial}{\partial t} \psi$

Now we substitute for $H$ and obtain:

1. $-\frac{\hbar^2}{2m} \frac{d^2}{dx^2} \eta_k + V(x)\eta_k = e_k \eta_k$

2. $-\frac{\hbar^2}{2m} \frac{\partial^2}{\partial x^2} \psi + V(x)\psi = i\hbar \frac{\partial}{\partial t} \psi$

These are the Schrödinger equations; the first one is the time independent equation. It is about each Hamiltonian energy eigenvector…or, you might say, each stationary state vector; it holds for each $k$. The second one is the time dependent one and applies to the state vector in general (not just the stationary states). It is called the fundamental time evolution equation for the state vector.

Special note: if one adjusts the Hamiltonian by adding a constant $C$, the eigenvectors remain the same but each eigenvalue is adjusted by adding $C$. So the time-evolved state vector gets multiplied by a factor of $exp(-iC \frac{t}{\hbar})$, which has modulus 1. So the new state vector describes the same state as the old one.
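A finite-dimensional sketch of this fact, with a random Hermitian matrix standing in for $H$ (purely an assumption for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# a small random Hermitian matrix standing in for the Hamiltonian
M = rng.normal(size=(5, 5)) + 1j * rng.normal(size=(5, 5))
H = (M + M.conj().T) / 2

C = 3.7  # the added constant
evals, evecs = np.linalg.eigh(H)
evals2, evecs2 = np.linalg.eigh(H + C * np.eye(5))

# eigenvalues shift by exactly C...
shift_err = np.max(np.abs(evals2 - (evals + C)))

# ...and eigenvectors agree up to a unit-modulus phase: |<v_k, v'_k>| = 1
overlaps = np.abs(np.sum(np.conj(evecs) * evecs2, axis=0))
vec_err = np.max(np.abs(overlaps - 1))
```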

Next post: we’ll give an example and then derive the eigenvalues and eigenvectors for the position and momentum operators. Yes, this means dusting off the Dirac delta distribution.

August 9, 2011

Quantum Mechanics and Undergraduate Mathematics IX: Time evolution of an Observable Density Function

We’ll assume a state function $\psi$ and an observable whose Hermitian operator is denoted by $A$ with eigenvectors $\alpha_k$ and eigenvalues $a_k$. If we take an observation (say, at time $t = 0$ ) we obtain the probability density function $p(Y = a_k) = | \langle \alpha_k, \psi \rangle |^2$ (we make the assumption that there is only one eigenvector per eigenvalue).

We saw how the expectation (the expected value of the associated density function) changes with time. What about the time evolution of the density function itself?

Since $\langle \alpha_k, \psi \rangle$ completely determines the density function and because $\psi$ can be expanded as $\psi = \sum_{k} \langle \alpha_k, \psi \rangle \alpha_k$ it makes sense to determine $\frac{d}{dt} \langle \alpha_k, \psi \rangle$. Note that the eigenvectors $\alpha_k$ and eigenvalues $a_k$ do not change with time and therefore can be regarded as constants.

$\frac{d}{dt} \langle \alpha_k, \psi \rangle = \langle \alpha_k, \frac{\partial}{\partial t}\psi \rangle = \langle \alpha_k, \frac{-i}{\hbar}H\psi \rangle = \frac{-i}{\hbar}\langle \alpha_k, H\psi \rangle$

We can take this further: we now write $H\psi = H\sum_j \langle \alpha_j, \psi \rangle \alpha_j = \sum_j \langle \alpha_j, \psi \rangle H \alpha_j$. We now substitute into the previous equation to obtain:
$\frac{d}{dt} \langle \alpha_k, \psi \rangle = \frac{-i}{\hbar}\langle \alpha_k, \sum_j \langle \alpha_j, \psi \rangle H \alpha_j \rangle = \frac{-i}{\hbar}\sum_j \langle \alpha_k, H\alpha_j \rangle \langle \alpha_j, \psi \rangle$

Denote $\langle \alpha_j, \psi \rangle$ by $a_j$. Then we see that we have the infinite coupled differential equations: $\frac{d}{dt} a_k = \frac{-i}{\hbar} \sum_j a_j \langle \alpha_k, H\alpha_j \rangle$. That is, the rate of change of one of the $a_k$ depends on all of the $a_j$ which really isn’t a surprise.

We can see this another way: because we have a density function, $\sum_j |\langle \alpha_j, \psi \rangle |^2 =1$. Now rewrite: $\sum_j |\langle \alpha_j, \psi \rangle |^2 = \sum_j \langle \alpha_j, \psi \rangle \overline{\langle \alpha_j, \psi \rangle } = \sum_j a_j \overline{ a_j} = 1$. Now differentiate with respect to $t$ and use the product rule: $\sum_j \left( \overline{a_j}\frac{d}{dt}a_j + a_j \frac{d}{dt} \overline{a_j} \right) = 0$

Things get a bit easier if the original operator $A$ is compatible with the Hamiltonian $H$; in this case the operators share common eigenvectors. We denote the eigenvectors for $H$ by $\eta$ and then
$\frac{d}{dt} a_k = \frac{-i}{\hbar} \sum_j a_j \langle \alpha_k, H\alpha_j \rangle$ becomes:
$\frac{d}{dt} \langle \eta_k, \psi \rangle = \frac{-i}{\hbar} \sum_j \langle \eta_j, \psi \rangle \langle \eta_k, H\eta_j \rangle$ Now use the fact that the $\eta_j$ are eigenvectors for $H$ and are orthogonal to each other to obtain:
$\frac{d}{dt} \langle \eta_k, \psi \rangle = \frac{-i}{\hbar} e_k \langle \eta_k, \psi \rangle$ where $e_k$ is the eigenvalue for $H$ associated with $\eta_k$.

Now we use differential equations (along with existence and uniqueness conditions) to obtain:
$\langle \eta_k, \psi \rangle = \langle \eta_k, \psi_0 \rangle exp(-ie_k \frac{t}{\hbar})$ where $\psi_0$ is the initial state vector (before it had time to evolve).

This has two immediate consequences:

1. $\psi(x,t) = \sum_j \langle \eta_j, \psi_0 \rangle exp(-ie_j \frac{t}{\hbar}) \eta_j$
That is the general solution to the time-evolution equation. The reader might be reminded that $exp(ib) = cos(b) + i sin (b)$

2. Returning to the probability distribution: $P(Y = e_k) = |\langle \eta_k, \psi \rangle |^2 = |\langle \eta_k, \psi_0 \rangle |^2 |exp(-ie_k \frac{t}{\hbar})|^2 = |\langle \eta_k, \psi_0 \rangle |^2$. But since $A$ is compatible with $H$, we have the same eigenvectors, hence we see that the probability density function does not change AT ALL. So such an observable really is a “constant of motion”.
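A tiny numerical illustration of consequence 2, with made-up eigenvalues $e_k$ and random coefficients $\langle \eta_k, \psi_0 \rangle$ (both just assumptions for the sketch): each coefficient picks up a unit-modulus phase, so the probabilities are frozen.

```python
import numpy as np

hbar = 1.0
rng = np.random.default_rng(1)

e = np.array([1.0, 2.5, 4.0])  # made-up eigenvalues e_k of H
c0 = rng.normal(size=3) + 1j * rng.normal(size=3)
c0 = c0 / np.linalg.norm(c0)   # coefficients <eta_k, psi_0>, normalized

t = 2.0
ct = c0 * np.exp(-1j * e * t / hbar)  # <eta_k, psi_t>

p0 = np.abs(c0) ** 2   # probabilities at t = 0
pt = np.abs(ct) ** 2   # probabilities at time t
max_change = np.max(np.abs(pt - p0))
```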

Stationary States
Since $H$ is an observable, we can always write $\psi(x,t) = \sum_j \langle \eta_j, \psi(x,t) \rangle \eta_j$. Then we have $\psi(x,t)= \sum_j \langle \eta_j, \psi_0 \rangle exp(-ie_j \frac{t}{\hbar}) \eta_j$

Now suppose $\psi_0$ is precisely one of the eigenvectors for the Hamiltonian; say $\psi_0 = \eta_k$ for some $k$. Then:

1. $\psi(x,t) = exp(-ie_k \frac{t}{\hbar}) \eta_k$
2. For any $t \geq 0$, $P(Y = e_k) = 1$ and $P(Y \neq e_k) = 0$

Note: no other operator has made an appearance.
Now recall our first postulate: states are determined only up to scalar multiples of unity modulus. Hence the state undergoes NO time evolution, no matter what observable is being observed.

We can see this directly: let $A$ be an operator corresponding to any observable. Then $\langle \alpha_k, A \psi \rangle = \langle \alpha_k, A exp(-i e_k \frac{t}{\hbar})\eta_k \rangle = exp(-i e_k \frac{t}{\hbar})\langle \alpha_k, A \eta_k \rangle$. Then because the probability distribution is completely determined by the eigenvalues $e_k$ and $|\langle \alpha_k, A \eta_k \rangle |$, and $|exp(-i e_k \frac{t}{\hbar})| = 1$, the distribution does NOT change with time. This motivates us to define the stationary states of a system: $\psi_{(k)} = exp(-i e_k \frac{t}{\hbar})\eta_k$.

Gillespie notes that much of the problem solving in quantum mechanics is solving the Eigenvalue problem: $H \eta_k = e_k \eta_k$ which is often difficult to do. But if one can do that, one can determine the stationary states of the system.
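To illustrate Gillespie’s point, here is a sketch that solves $H \eta_k = e_k \eta_k$ numerically for one standard choice: the harmonic oscillator potential $V(x) = x^2/2$ with $\hbar = m = 1$ (both assumptions of the sketch), discretizing $H$ on a grid.

```python
import numpy as np

# discretize H = -(hbar^2 / 2m) d^2/dx^2 + V(x) on a grid, with the
# harmonic oscillator V(x) = x^2/2 and hbar = m = 1 (choices for this sketch)
hbar = m = 1.0
N = 1000
x = np.linspace(-10, 10, N)
dx = x[1] - x[0]

# central-difference second-derivative matrix
D2 = (np.diag(np.full(N, -2.0)) + np.diag(np.ones(N - 1), 1)
      + np.diag(np.ones(N - 1), -1)) / dx**2

H = -(hbar**2 / (2 * m)) * D2 + np.diag(0.5 * x**2)
e, eta = np.linalg.eigh(H)   # H eta_k = e_k eta_k

# for this potential the exact eigenvalues are e_k = k + 1/2
err = np.max(np.abs(e[:4] - (np.arange(4) + 0.5)))
```

The discretized operator reproduces the first few exact eigenvalues $k + \frac{1}{2}$ to several decimal places, and the columns of `eta` are the stationary states.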

August 8, 2011

Quantum Mechanics and Undergraduate Mathematics VIII: Time Evolution of Expectation of an Observable

Filed under: advanced mathematics, applied mathematics, physics, probability, quantum mechanics, science — collegemathteaching @ 3:12 pm

Back to our series on QM: one thing to remember about observables: they are operators with a set collection of eigenvectors and eigenvalues (allowable values that can be observed; “quantum levels” if you will). These do not change with time. So $\frac{d}{dt} (A (\psi)) = A (\frac{\partial}{\partial t} \psi)$. One can work this out by expanding $A \psi$ if one wants to.

So with this fact, let’s see how the expectation of an observable evolves with time (given a certain initial state):
$\frac{d}{dt} E(A) = \frac{d}{dt} \langle \psi, A \psi \rangle = \langle \frac{\partial}{\partial t} \psi, A \psi \rangle + \langle \psi, A \frac{\partial}{\partial t} \psi \rangle$

Now apply the Hamiltonian to account for the time change of the state vector; we obtain:
$\langle -\frac{i}{\hbar}H \psi, A \psi \rangle + \langle \psi, -\frac{i}{\hbar}AH \psi \rangle = \overline{-\frac{i}{\hbar}} \langle H \psi, A \psi \rangle - \frac{i}{\hbar} \langle \psi, AH \psi \rangle = \frac{i}{\hbar} \langle H \psi, A \psi \rangle - \frac{i}{\hbar} \langle \psi, AH \psi \rangle$

Now use the fact that both $H$ and $A$ are Hermitian to obtain:
$\frac{d}{dt} E(A) = \frac{i}{\hbar} \langle \psi, (HA - AH) \psi \rangle$.
So, we see the operator $HA - AH$ once again; note that if $A$ and $H$ commute then the expectation of the observable (or the standard deviation for that matter) does not evolve with time. This is certainly true for $H$ itself. Note: an operator that commutes with $H$ is sometimes called a “constant of motion” (think: “total energy of a system” in classical mechanics).

Note also that $|\frac{d}{dt} E(A) | = \frac{1}{\hbar}|\langle \psi, (HA - AH) \psi \rangle | \leq \frac{2}{\hbar} \Delta A \Delta H$

If $A$ does NOT correspond to a constant of motion, then it is useful to define an evolution time $T_A = \frac{\Delta A}{|\frac{d}{dt}E(A)|}$ where $\Delta A = (V(A))^{1/2}$. This gives an estimate of how much time must elapse before the state changes enough to equal the uncertainty in the observable.

Note: we can apply this to $H$ and $A$ to obtain $T_A \Delta H \ge \frac{\hbar}{2}$

Consequences: if $T_A$ is small (i. e., the state changes rapidly) then the uncertainty is large; hence energy is impossible to be well defined (as a numerical value). If the energy has low uncertainty then $T_A$ must be large; that is, the state is very slowly changing. This is called the time-energy uncertainty relation.

July 28, 2011

Quantum Mechanics and Undergraduate Mathematics VII: Time Evolution of the State Vector

Filed under: advanced mathematics, applied mathematics, physics, quantum mechanics, science — collegemathteaching @ 2:38 pm

Of course the state vector $\psi$ changes with time. The question is how does it change with time and how does the probability density function associated with an observable change with time?

Note: we will write $\psi_t$ for $\psi(x,t)$. Now let $A$ be an observable. Note that the eigenvectors and the eigenvalues associated with $A$ do NOT change with time, so if we expand $\psi_t$ in terms of the eigenbasis for $A$ we have $\psi_t = \sum_k \langle \alpha_k, \psi_t \rangle \alpha_k$ hence $\frac{\partial \psi_t}{\partial t} = \sum_k \langle \alpha_k, \frac{\partial \psi_t}{\partial t} \rangle \alpha_k$

Of course, we need the state vector to “stay in the class of state vectors” when it evolves with respect to time, which means that the norm cannot change; or $\frac{d}{dt} \langle \psi_t, \psi_t \rangle = 0$.

Needless to say there has to be some restriction on how the state vector can change with time. So we have another postulate:

Postulate 5
For every physical system there exists a linear Hermitian operator $H$ called the Hamiltonian operator such that:
1. $i\hbar \frac{\partial}{\partial t} \psi(x,t) = H\psi(x,t)$ and
2. $H$ corresponds to the total energy of the system and possesses a complete set of eigenvectors $\eta_k$ and eigenvalues $e_k$ where the eigenvalues are the “allowed values” of the total energy of the system.

Note: $\hbar$ is the constant $\frac{h}{2\pi}$ where $h$ is Planck’s constant.

Note: $H$ is not specified; it is something that the physicists have to come up with by observing the system. That is, there is a partial differential equation to be solved!

But does this give us what we want, at least in terms of $\psi_t$ staying at unit norm for all times $t$? (note: again we write $\psi_t$ for $\psi(x,t)$ ).

The answer is yes; first note that $\frac{d}{dt} \langle \psi_t, \psi_t \rangle = \langle \frac{\partial \psi_t}{\partial t}, \psi_t \rangle + \langle \psi_t, \frac{\partial \psi_t}{\partial t}\rangle$; this is an easy exercise in using the definition of our inner product, differentiating under the integral sign, and noting that the partial derivative operation and the conjugate operation commute.

Now note: $\langle \frac{\partial \psi_t}{\partial t}, \psi_t \rangle + \langle \psi_t, \frac{\partial \psi_t}{\partial t}\rangle = \overline{-\frac{i}{\hbar}}\langle H\psi_t, \psi_t \rangle - \frac{i}{\hbar}\langle \psi_t, H \psi_t \rangle = \frac{i}{\hbar}(\langle H\psi_t, \psi_t \rangle - \langle \psi_t, H\psi_t \rangle) = 0$
because $H$ is Hermitian.

Note: at this point Gillespie takes an aside and notes that if one denotes the state vector at time $t = 0$ by $\psi_0$ then one can attempt to find an operator $U(t)$ where $\psi_t = U(t)\psi_0$. This leads to $i \hbar \frac{\partial U(t)}{\partial t} \psi_0 = HU(t)\psi_0$ which must be true for all $\psi_0$. This leads to $i\hbar \frac{\partial U(t)}{\partial t} = HU(t)$ with initial condition $U(0) = 1$. This leads to the solution $U(t) = exp(\frac{-iHt}{\hbar})$ where the exponential is defined in the sense of linear algebra (use the power series expansion of $exp(A)$). Note: $U$ is not Hermitian and therefore NOT an observable. But it has a special place in quantum mechanics and is called the time-evolution operator.
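A finite-dimensional sketch of the time-evolution operator (the Hermitian matrix standing in for $H$ is random, an assumption of the sketch); here the matrix exponential is computed from the spectral decomposition of $H$, and we check both unitarity and the differential equation:

```python
import numpy as np

hbar = 1.0
rng = np.random.default_rng(2)

# random Hermitian matrix standing in for H
M = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
H = (M + M.conj().T) / 2

def U(t):
    # exp(-iHt/hbar) computed from the spectral decomposition of H
    e, V = np.linalg.eigh(H)
    return V @ np.diag(np.exp(-1j * e * t / hbar)) @ V.conj().T

t = 1.3
Ut = U(t)

# U(t) is unitary, so evolution preserves the norm of the state vector
unitarity_err = np.max(np.abs(Ut @ Ut.conj().T - np.eye(4)))

# and U(t) satisfies i hbar dU/dt = H U(t) (checked by a centered difference)
h = 1e-6
dU = (U(t + h) - U(t - h)) / (2 * h)
ode_err = np.max(np.abs(1j * hbar * dU - H @ Ut))
```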

Next: we’ll deal with the time energy relation for the expected value of a specific observable.

July 25, 2011

Quantum Mechanics and Undergraduate Mathematics VI: Heisenberg Uncertainty Principle

Filed under: advanced mathematics, applied mathematics, physics, probability, quantum mechanics, science — collegemathteaching @ 10:05 pm

Here we use the Cauchy-Schwarz inequality, other facts about inner products, and basic probability to derive the Heisenberg Uncertainty Principle for incompatible observables $A$ and $B$. We assume some state vector $\psi$ which has not been given time to evolve between measurements and we will abuse notation by viewing $A$ and $B$ as random variables for their given eigenvalues $a_k, b_k$ given state vector $\psi$.

What we are after is the following: $V(A)V(B) \geq (1/4)|\langle \psi, (AB-BA) \psi \rangle|^2.$
When $AB-BA = c$ (a constant multiple of the identity) we get: $V(A)V(B) \geq (1/4)|c|^2$ which is how it is often stated.

The proof is a bit easier when we make the expected values of $A$ and $B$ equal to zero; we do this by introducing new linear operators $A' = A -E(A)$ and $B' = B - E(B)$; note that $(A - E(A))\psi = A\psi - E(A)\psi$. The following are routine exercises:
1. $A'$ and $B'$ are Hermitian
2. $A'B' - B'A' = AB-BA$
3. $V(A') = V(A)$.

If one is too lazy to work out 3:
$V(A') = E((A-E(A))^2) - (E(A -E(A)))^2 = E(A^2 - 2AE(A) + (E(A))^2) - 0 = E(A^2) -2(E(A))^2 + (E(A))^2 = E(A^2) - (E(A))^2 = V(A)$

Now we have everything in place:
$\langle \psi, (AB-BA) \psi \rangle = \langle \psi, (A'B'-B'A') \psi \rangle = \langle A'\psi, B' \psi \rangle - \langle B'\psi, A' \psi \rangle = \langle A'\psi, B' \psi \rangle - \overline{\langle A'\psi, B' \psi \rangle} = 2iIm\langle A'\psi, B'\psi \rangle$
We now can take the modulus of both sides:
$|\langle \psi, (AB-BA)\psi \rangle | = 2 |Im \langle A'\psi, B'\psi \rangle| \leq 2|\langle A'\psi, B'\psi\rangle | \leq 2 \sqrt{\langle A'\psi,A'\psi\rangle}\sqrt{\langle B'\psi, B'\psi\rangle} = 2 \sqrt{V(A')}\sqrt{V(B')} = 2\sqrt{V(A)}\sqrt{V(B)}$ (here $\langle A'\psi, A'\psi \rangle = E(A'^2) = V(A')$ since $E(A') = 0$).
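One can spot-check the resulting inequality $V(A)V(B) \geq \frac{1}{4}|\langle \psi, (AB-BA)\psi \rangle|^2$ numerically, with random Hermitian matrices standing in for the observables and a random unit state (all assumptions of the sketch):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 6

def rand_hermitian(n):
    M = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (M + M.conj().T) / 2

A = rand_hermitian(n)  # stand-ins for two (generically incompatible) observables
B = rand_hermitian(n)

psi = rng.normal(size=n) + 1j * rng.normal(size=n)
psi = psi / np.linalg.norm(psi)  # a random unit state vector

def expect(Op):
    # E(Op) = <psi, Op psi> (real, since Op is Hermitian)
    return np.real(np.vdot(psi, Op @ psi))

def variance(Op):
    return expect(Op @ Op) - expect(Op) ** 2

comm = A @ B - B @ A
lhs = variance(A) * variance(B)
rhs = 0.25 * np.abs(np.vdot(psi, comm @ psi)) ** 2
```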

This means that, unless $A$ and $B$ are compatible observables, there is a lower bound on the product of their standard deviations that cannot be done away with by more careful measurement. It is physically impossible to drive this product to zero. This also means that one of the standard deviations cannot be zero unless the other is infinite.

Quantum Mechanics and Undergraduate Mathematics V: compatible observables

This builds on our previous example. We start with a state $\psi$ and we will make three successive observations of observables which have operators $A$ and $B$ in the following order: $A, B, A$. The assumption is that these observations are made so quickly that no time evolution of the state vector can take place; all of the change to the state vector will be due to the effect of the observations.

A simplifying assumption will be that the observation operators have the following property: no two different eigenvectors have the same eigenvalues (i.e., the eigenvalue uniquely determines the eigenvector up to multiplication by a constant of unit modulus).

First of all, this is what “compatible observables” means: two observables $A, B$ are compatible if, upon three successive measurements $A, B, A$, the first measurement of $A$ is guaranteed to equal the second measurement of $A$. That is, the state vector after the first measurement of $A$ is the same state vector after the second measurement of $A$.

So here is what the compatibility theorem says (I am freely abusing notation by calling the observable by the name of its associated operator):

Compatibility Theorem
The following are equivalent:

1. $A, B$ are compatible observables.
2. $A, B$ have a common eigenbasis.
3. $A, B$ commute (as operators)

Note: for this discussion, we’ll assume an eigenbasis of $\alpha_i$ for $A$ and $\beta_i$ for $B$.

1 implies 2: Suppose the state of the system is $\alpha_k$ just prior to the first measurement. Then the first measurement is $a_k$. The second measurement yields some $b_j$, which means the system is now in state $\beta_j$, in which case the third measurement is guaranteed to be $a_k$ (it is never anything else, by the compatible observable assumption). Hence $\beta_j$ must be $\alpha_k$ (up to a scalar multiple of unit modulus). So, by some reindexing, we can assume that $\alpha_1 = \beta_1$ and so on. An argument about completeness and orthogonality finishes the proof of this implication.

2 implies 1: after the first measurement, the state of the system is $\alpha_k$ which, being a basis vector for observable $B$ means that the system after the measurement of $B$ stays in the same state, which implies that the state of the system will remain $\alpha_k$ after the second measurement of $A$. Since this is true for all basis vectors, we can extend this to all state vectors, hence the observables are compatible.

2 implies 3: a common eigenbasis implies that the operators commute on basis elements so the result follows (by some routine linear-algebra type calculations)

3 implies 2: given any eigenvector $\alpha_k$ we have $AB \alpha_k = BA \alpha_k = a_k B \alpha_k$, which implies that $B \alpha_k$ is an eigenvector for $A$ with eigenvalue $a_k$. Because the eigenvalue determines the eigenvector up to a scalar multiple, this means that $B \alpha_k = c \alpha_k$ for some scalar $c$; hence $\alpha_k$ is also an eigenvector of $B$ (with eigenvalue $c$). In this way, we establish a correspondence between the eigenbasis of $B$ and the eigenbasis of $A$.
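A finite-dimensional illustration of the equivalence of 2 and 3 (the shared eigenbasis here is a random unitary matrix, an assumption of the sketch): build $A$ and $B$ from a common eigenbasis, check that they commute, and check that the eigenvectors found for $A$ also diagonalize $B$.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 4

# build commuting Hermitian A, B from a shared (random) orthonormal eigenbasis Q
Q, _ = np.linalg.qr(rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n)))
A = Q @ np.diag(np.array([1.0, 2.0, 3.0, 4.0])) @ Q.conj().T
B = Q @ np.diag(np.array([5.0, -1.0, 0.5, 2.0])) @ Q.conj().T

commutator_norm = np.max(np.abs(A @ B - B @ A))

# conversely, the eigenvectors found for A also diagonalize B
_, V = np.linalg.eigh(A)
B_in_A_basis = V.conj().T @ B @ V
off_diag = B_in_A_basis - np.diag(np.diag(B_in_A_basis))
off_diag_norm = np.max(np.abs(off_diag))
```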

Ok, what happens when the observables are NOT compatible?

Here is a lovely application of conditional probability. It works this way: suppose on the first measurement, $a_k$ is observed. This puts us in state vector $\alpha_k$. Now we measure the observable $B$ which means that there is a probability $|\langle \alpha_k, \beta_i \rangle|^2$ of observing eigenvalue $b_i$. Now $\beta_i$ is the new state vector and when observable $A$ is measured, we have a probability $|\langle \alpha_j, \beta_i \rangle|^2$ of observing eigenvalue $a_j$ in the second measurement of observable $A$.

Therefore, given the initial measurement, we can construct a conditional probability density function $p(a_j|a_k) = \sum_i p(b_i|a_k)p(a_j|b_i)= \sum_i |\langle \alpha_k, \beta_i \rangle|^2 |\langle \beta_i, \alpha_j \rangle|^2$
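A sketch of this computation with two random orthonormal eigenbases standing in for the $\alpha_i$ and $\beta_i$ (random choices play the role of genuinely incompatible observables); note that the conditional probabilities over $j$ sum to 1:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 4

# two random orthonormal eigenbases (columns are the alpha_i and beta_i)
alpha, _ = np.linalg.qr(rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n)))
beta, _ = np.linalg.qr(rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n)))

# T[k, i] = |<alpha_k, beta_i>|^2 = p(b_i | a_k)
T = np.abs(alpha.conj().T @ beta) ** 2

k = 0  # suppose a_k was the first measurement
# p(a_j | a_k) = sum_i p(b_i | a_k) p(a_j | b_i)
p = np.array([np.sum(T[k, :] * T[j, :]) for j in range(n)])
total = p.sum()
```

Since the bases are incompatible, $p(a_k|a_k) < 1$: the second measurement of $A$ is no longer guaranteed to repeat the first.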

Again, this makes sense only if the observations were taken so close together that the state vector has no chance to undergo time evolution; ONLY the measurements change the state vector.

Next: we move to the famous Heisenberg Uncertainty Principle, which states that, if we view the interaction of the observables $A$ and $B$ with a set state vector and abuse notation a bit and regard the associated density functions (for the eigenvalues) by the same letters, then $V(A)V(B) \geq (1/4)|\langle \psi, [AB-BA]\psi \rangle |^2.$

Of course, if the observables are compatible, then the right side becomes zero and if $AB-BA = c$ for some non-zero scalar $c$ (that is, $(AB-BA) \psi = c\psi$ for all possible state vectors $\psi$ ), then we get $V(A)V(B) \geq (1/4)|c|^2$ which is how it is often stated.

July 19, 2011

Quantum Mechanics and Undergraduate Mathematics IV: measuring an observable (example)

Ok, we have to relate the observables to the state of the system. We know that the only possible “values” of the observable are the eigenvalues of the operator and the relation of the operator to the state vector provides the density function. But what does this measurement do to the state? That is, immediately after a measurement is taken, what is the state?

True, the system undergoes a “time evolution”, but once an observable is measured, an immediate (termed “successive”) measurement will yield the same value; a “repeated” measurement (one made after giving the system time to undergo a time evolution) might give a different value.

So we get:

Postulate 4 A measurement of an observable generally (?) causes a drastic, uncontrollable alteration in the state vector of the system; immediately after the measurement it will coincide with the eigenvector corresponding to the eigenvalue obtained in the measurement.

Note: we assume that our observable operators have distinct eigenvalues; that is, no two distinct eigenvectors have the same eigenvalue.

That is, if we measure an observable with operator $A$ and obtain measurement $a_i$ then the new system eigenvector is $\alpha_i$ regardless of what $\psi$ was prior to measurement. Of course, this eigenvector can (and usually will) evolve with time.

Roughly speaking, here is what is going on:
Say the system is in state $\psi$. We measure an observable with operator $A$. We can only obtain one of the eigenvalues $a_k$ as a measurement. Recall: remember all of those “orbitals” from chemistry class? Those were the energy levels of the electrons and the orbital level was a permissible energy state that we could obtain by a measurement.

Now if we get $a_k$ as a measurement, the new state vector is $\alpha_k$. One might say that we started with a probability density function (given the state and the observable), we made a measurement, and now, for a brief instant anyway, our density function “collapsed” to the density function $P(A = a_k) = 1$.

This situation (brief) coincides with our classical intuition of an observable “having a value”.

For the purposes of this example, we’ll set our Hilbert space to the square integrable piecewise smooth functions on $[-\pi, \pi]$ and let our “state vector” $\psi(x) =\left\{ \begin{array}{c}1/\sqrt{\pi}, 0 < x \leq \pi \\ 0,-\pi \leq x \leq 0 \end{array}\right.$

Now suppose our observable corresponds to the eigenfunctions mentioned in this post, and we measure “-4” for our observable. This is the eigenvalue for $(1/\sqrt{\pi})sin(2x)$ so our new state vector is $(1/\sqrt{\pi})sin(2x)$.

So what happens if a different observable is measured IMMEDIATELY (i.e., with no chance for a time evolution to take place)?

Example We’ll still use the space of square integrable functions over $[-\pi, \pi]$.
One might recall the Legendre polynomials, which are eigenfunctions of the following operator:
$\frac{d}{dt}\left((1-t^2) \frac{dP_n}{dt}\right) = -n(n+1) P_n(t)$. These polynomials obey the orthogonality relation $\int^{1}_{-1} P_m(t)P_n(t)dt = \frac{2}{2n+1} \delta_{m,n}$ hence $\int^{1}_{-1} P_m(t)P_m(t)dt = \frac{2}{2m+1}$.
The first few of these are $P_0 = 1, P_1 =t, P_2 = (1/2)(3t^2-1), P_3 = (1/2)(5t^3 - 3t), \ldots$

We can adjust these polynomials by the change of variable $t =x/\pi$ and multiply each polynomial $P_m$ by the factor $\sqrt{\frac{2m+1}{2\pi}}$ (since $\int^{\pi}_{-\pi} P_m(x/\pi)^2 dx = \frac{2\pi}{2m+1}$) to obtain an orthonormal eigenbasis. Of course, one has to adjust the operator by the chain rule.
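A numerical check that the factor $\sqrt{\frac{2m+1}{2\pi}}$ makes the adjusted polynomials unit norm on $[-\pi, \pi]$, using numpy’s built-in Legendre polynomials (the trapezoid-rule integration is just a convenience):

```python
import numpy as np
from numpy.polynomial.legendre import Legendre

x = np.linspace(-np.pi, np.pi, 20001)
dx = x[1] - x[0]

norms = []
for m in range(4):
    c = np.sqrt((2 * m + 1) / (2 * np.pi))      # proposed normalizing factor
    vals = (c * Legendre.basis(m)(x / np.pi)) ** 2
    # trapezoid rule for the integral of the squared function over [-pi, pi]
    norms.append(dx * (np.sum(vals) - (vals[0] + vals[-1]) / 2))
err = max(abs(v - 1) for v in norms)
```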

So for this example, let $P_n$ denote the adjusted Legendre polynomial with eigenvalue $-n(n+1)$.

Now back to our original state vector which was changed to state function $(1/\sqrt{\pi})sin(2x)$.

Now suppose eigenvalue $-6 = -2(3)$ is observed as an observable with the Legendre operator; this corresponds to eigenvector $\sqrt{\frac{5}{2\pi}}(1/2)(3(x/\pi)^2 -1)$ which is now the new state vector.

Now if we were to do an immediate measurement of the first observable, we’d have to do a Fourier-like expansion of our new state vector; hence the probability density function for the observable changes from the initial measurement. Bottom line: the order in which the observations are taken matters…in general.

The case in which the order wouldn’t matter: if the second observable had the state vector (from the first measurement) as an element of its eigenbasis.

We will state this as a general principle in our next post.

July 15, 2011

Quantum Mechanics and Undergraduate Mathematics III: an example of a state function

I feel bad that I haven’t given a demonstrative example, so I’ll “cheat” a bit and give one:

For the purposes of this example, we’ll set our Hilbert space to the square integrable piecewise smooth functions on $[-\pi, \pi]$ and let our “state vector” $\psi(x) =\left\{ \begin{array}{c}1/\sqrt{\pi}, 0 < x \leq \pi \\ 0,-\pi \leq x \leq 0 \end{array}\right.$

Now consider a (bogus) state operator $d^2/dx^2$ which has an eigenbasis $(1/\sqrt{\pi})cos(kx), (1/\sqrt{\pi})sin(kx), k \in \{1, 2, 3,...\}$ together with $1/\sqrt{2\pi}$, with eigenvalues $0, -1, -4, -9, \ldots$ (note: I know that this is a degenerate case in which some eigenvalues share two eigenfunctions).

Note also that the eigenfunctions are almost the functions used in the usual Fourier expansion; the difference is that I have scaled the functions so that $\int^{\pi}_{-\pi} (sin(kx)/\sqrt{\pi})^2 dx = 1$ as required for an orthonormal basis with this inner product.

Now we can write $\psi = \frac{1}{2 \sqrt{\pi}} + \frac{2}{\pi^{3/2}}(sin(x) + (1/3)sin(3x) + (1/5)sin(5x) +...)$
(yes, I am abusing the equal sign here)
This means that $b_0 = 1/\sqrt{2}$ and $b_k = \frac{2}{k \pi}$ for $k \in \{1,3,5,7,...\}$ (the coefficients for the cosine terms and the even sine terms are zero).

Now the only possible measurements of the operator are $0, -1, -4, -9, \ldots$ and the probability density function is: $P(A = 0) = 1/2, P(A = -1) = \frac{4}{\pi^2}, P(A = -9) = \frac{4}{9 \pi^2}, \ldots, P(A = -(2k-1)^2)= \frac{4}{((2k-1)\pi)^2}, \ldots$

One can check that $1/2 + (4/(\pi^2))(1 + 1/9 + 1/25 + 1/49 + 1/81....) = 1.$
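One can verify the coefficients and the probability sum numerically; the grid resolution here is an arbitrary choice for the sketch:

```python
import numpy as np

x = np.linspace(-np.pi, np.pi, 200001)
dx = x[1] - x[0]
psi = np.where(x > 0, 1 / np.sqrt(np.pi), 0.0)  # the state function above

def inner(f, g):
    # Riemann sum for the (real) inner product on [-pi, pi]
    return dx * np.sum(f * g)

b0 = inner(np.full_like(x, 1 / np.sqrt(2 * np.pi)), psi)
b = {k: inner(np.sin(k * x) / np.sqrt(np.pi), psi) for k in range(1, 8)}

# partial probability sum: |b0|^2 + sum of |b_k|^2 for k = 1..7
p_total = b0**2 + sum(bk**2 for bk in b.values())
```

The computed coefficients match $b_0 = 1/\sqrt{2}$ and $b_k = 2/(k\pi)$ for odd $k$ (with the even ones vanishing), and the truncated probability sum is already close to 1.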

Here is a plot of the state function (blue line at the top) along with some of the eigenfunctions multiplied by their respective $b_k$.

July 13, 2011

Quantum Mechanics and Undergraduate Mathematics II

In the first part of this series, we reviewed some of the mathematical background that we’ll use. Now we get into a bit of the physics.

For simplification, we’ll assume one dimensional, non-relativistic motion. No, nature isn’t that simple; that is why particle physics is hard! 🙂

What we will do is to describe a state of a system and the observables. The state of the system is hard to describe; in the classical case (say the damped mass-spring system in harmonic motion), the state of the system is determined by the system parameters (mass, damping constant, spring constant) and the position and velocity at a set time.

An observable is, roughly speaking, something that can give us information about the state of the system. In classical mechanics, one observable might be $H(x, p) = \frac{p^2}{2m} + V(x)$ where $p$ is the system’s momentum and $V(x)$ represents the potential energy at position $x$. If this seems strange, remember that $p = mv$, so kinetic energy is $\frac{1}{2}mv^2 = \frac{p^2}{2m}$. We bring this up because something similar will appear later.

In quantum mechanics, certain postulates are assumed. I’ll present the ones that Gillespie uses:

Postulate 1: Every possible physical state of a given system corresponds to a Hilbert space vector $\psi$ of unit norm (using the inner product that we talked about) and every such vector corresponds to a possible state of a system. The correspondence of states to the vectors is well defined up to multiplication of a vector by a complex number of unit modulus.

Note: this state vector, while containing all of the knowable information of the system, says nothing about what could be known or how such knowledge might be observed. Of course, this state vector might evolve with time and sometimes it is written as $\psi_{t}$ for this reason.

Postulate 2 There is a one to one correspondence between physical observables and linear Hermitian operators $A$, each of which possesses a complete, orthonormal set of eigenvectors $\alpha_{i}$ and a corresponding set of real eigenvalues $a_i$ and the only possible values of any measurement of this observable is one of these eigenvalues.

Note: in the cases when the eigenvalues are discretely distributed (e. g., the eigenvalues fail to have a limit point), we get “quantized” behavior from this observable.

We’ll use observables with discrete eigenvalues unless we say otherwise.

Now: is a function of an observable itself an observable? The answer is “yes” if the function is real analytic and we define $A^n(\psi) = A(A(A \cdots A(\psi)))$. To see this: assume that $f(z) = \sum_i c_i z^i$ and note that if $A$ is an observable operator then so is $cA^n$ for all $n$. Note: one can do this by showing that the eigenvectors for $A$ do not change and that each eigenvalue is raised to the corresponding power. The completeness of the eigenvectors implies convergence when we pass to $f$.
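A finite-dimensional sketch of this, with $f = exp$ and a random Hermitian matrix standing in for $A$ (assumptions of the sketch): the power series agrees with the spectral calculus, the eigenvectors are unchanged, and each eigenvalue $a_i$ becomes $f(a_i)$.

```python
import numpy as np

rng = np.random.default_rng(6)
M = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
A = (M + M.conj().T) / 2   # a stand-in observable

# f(A) for f = exp, computed two ways: by the spectral decomposition...
a, V = np.linalg.eigh(A)
fA = V @ np.diag(np.exp(a)) @ V.conj().T

# ...and by the power series sum_n A^n / n!
series = np.zeros((4, 4), dtype=complex)
term = np.eye(4, dtype=complex)
for n in range(1, 40):
    series += term          # adds A^(n-1) / (n-1)!
    term = term @ A / n
series_err = np.max(np.abs(series - fA))

# f(A) keeps the eigenvectors of A, and each eigenvalue a_i becomes exp(a_i)
fa, Vf = np.linalg.eigh(fA)
eig_err = np.max(np.abs(fa - np.exp(a)))
overlaps = np.abs(np.sum(np.conj(V) * Vf, axis=0))
vec_err = np.max(np.abs(overlaps - 1))
```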

Now we have states and observables. But how do they interact?
Remember that we showed the following:

Let $A$ be a linear operator with a complete orthonormal eigenbasis $\alpha_i$ and corresponding real eigenvalues $a_i$. Let $\psi$ be an element of the Hilbert space with unit norm and let $\psi = \sum_j b_j \alpha_j$.

Then the function $P(y = a_i) = (|b_i|)^2$ is a probability density function. (note: $b_i = \langle \alpha_i , \psi \rangle$).

This will give us exactly what we need! Basically, if the observable has operator $A$ and the system is in state $\psi$, then the probability of a measurement yielding the result $a_i$ is $(|\langle \alpha_i , \psi \rangle|)^2$. Note: it follows that if the state $\psi = \alpha_i$ then the probability of obtaining $a_i$ is exactly one.
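Here is a small numerical illustration of that probability rule, assuming numpy; the three eigenvalues and the state coefficients are hypothetical:

```python
import numpy as np

# Toy 3-dimensional state space: the standard basis plays the role of the
# orthonormal eigenbasis alpha_i, with hypothetical eigenvalues a_i.
a = np.array([1.0, 2.0, 3.0])          # eigenvalues a_i
alpha = np.eye(3)                       # columns are the eigenvectors alpha_i

# A unit-norm state psi = sum_i b_i alpha_i with complex coefficients.
b = np.array([1 + 1j, 2.0, 1j])
psi = b / np.linalg.norm(b)

# Probability of measuring a_i is |<alpha_i, psi>|^2.
probs = np.abs(alpha.conj().T @ psi) ** 2
```

Here $|b|^2 = (2, 4, 1)$ before normalization, so the probabilities come out to $2/7$, $4/7$, $1/7$, which sum to one as the claim requires.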

We summarize this with Postulate 3 (page 49 of Gillespie, stated for the “scattered eigenvalues” case):

Postulate 3: If an observable operator $A$ has eigenbasis $\alpha_i$ with eigenvalues $a_i$ and if the corresponding observable is measured on a system which, immediately prior to the measurement, is in state $\psi$, then the strongest predictive statement that can be made concerning the result of this measurement is as follows: the probability that the measurement will yield $a_k$ is $(|\langle \alpha_k , \psi \rangle|)^2$.

Note: for simplicity, we are restricting ourselves to observables which have distinct eigenvalues (i.e., no two linearly independent eigenvectors share an eigenvalue). In real life, some observables DO have different eigenvectors with the same eigenvalue. (Example from calculus: these are NOT Hilbert space vectors, but if the operator is $d^2/dx^2$ then $\sin(x)$ and $\cos(x)$ both have eigenvalue $-1$.)

Where we are now: we have a probability distribution to work with which means that we can calculate an expected value and a variance. These values will be fundamental when we tackle uncertainty principles!

Just a reminder from our courses in probability theory: if $Y$ is a random variable with density function $P$

$E(Y) = \sum_i y_i P(y_i)$ and $V(Y) = E(Y^2) -(E(Y))^2$.

So with our density function $P(y = a_i) = |b_i|^2$ (we use $b_i = \langle \alpha_i , \psi \rangle$ to save space), if $E(A)$ is the expected observed value of the observable (the expected value of the eigenvalues):
$E(A) = \sum_i a_i |b_i|^2$. But this quantity can be calculated in another way:

$\langle \psi , A(\psi) \rangle = \langle \sum_i b_i \alpha_i , A(\sum_i b_i \alpha_i) \rangle = \langle \sum_i b_i \alpha_i , \sum_i a_i b_i \alpha_i \rangle = \sum_i \overline{b_i} b_i a_i \langle \alpha_i, \alpha_i \rangle = \sum_i \overline{b_i} b_i a_i = \sum_i |b_i|^2 a_i = E(A)$. Yes, I skipped some easy steps (the cross terms vanish because the $\alpha_i$ are mutually orthogonal).

Using this we find $V(A) = \langle \psi, A^2(\psi) \rangle - (\langle \psi, A(\psi) \rangle )^2$, and it is customary to denote the standard deviation by $\sqrt{V(A)} = \Delta(A)$.
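Both routes to $E(A)$, and the variance formula, can be checked numerically in finite dimensions. A sketch assuming numpy; the Hermitian matrix and the state are random stand-ins:

```python
import numpy as np

rng = np.random.default_rng(0)

# Random Hermitian matrix as a finite-dimensional observable.
M = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
A = (M + M.conj().T) / 2

# Unit-norm state psi; its eigenbasis coefficients are b_i = <alpha_i, psi>.
psi = rng.normal(size=4) + 1j * rng.normal(size=4)
psi = psi / np.linalg.norm(psi)
a, U = np.linalg.eigh(A)               # eigenvalues a_i, eigenvectors as columns
b = U.conj().T @ psi

# Expected value two ways: sum_i a_i |b_i|^2 versus <psi, A psi>.
# (np.vdot conjugates its first argument, matching our inner product.)
E_sum = np.sum(a * np.abs(b) ** 2)
E_inner = np.vdot(psi, A @ psi).real
assert np.isclose(E_sum, E_inner)

# Variance V(A) = <psi, A^2 psi> - <psi, A psi>^2 and Delta(A) = sqrt(V(A)).
V = np.vdot(psi, A @ (A @ psi)).real - E_inner ** 2
Delta = np.sqrt(V)
```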

In our next installment, I give an illustrative example.

In a subsequent installment, we’ll show how a measurement of an observable affects the state and later how the distribution of the observable changes with time.

July 11, 2011

Quantum Mechanics for teachers of undergraduate mathematics I

I am planning on writing up a series of notes from the out of print book A Quantum Mechanics Primer by Daniel Gillespie.

My background: mathematics instructor (Ph.D. research area: geometric topology) whose last physics course (at the Naval Nuclear Power School) was almost 30 years ago; sophomore physics was 33 years ago.

Your background: you teach undergraduate mathematics for a living and haven’t had a course in quantum mechanics; those who have the time to study a book such as Quantum Mechanics and the Particles of Nature by Anthony Sudbery would be better off studying that. Those who have had a course in quantum mechanics would be bored stiff.

Topics the reader should know: probability density functions, square integrability, linear algebra (abstract inner products (Hermitian), eigenbases, orthonormal bases), basic analysis (convergence of a series of functions), differential equations, and the Dirac delta distribution.

My purpose: to present some opportunities to show applications to undergraduate students, e.g., “the Dirac delta ‘function’ (really a distribution) can be thought of as an eigenvector for this linear transformation”, or “here is an application of non-standard inner products and an abstract vector space”, or “here is a non-data application of the idea of the expected value and variance of a probability density function”, etc.

Basic mathematical objects
Our vector space will consist of functions $\psi : R \rightarrow C$ (complex valued functions of a real variable) for which $\int^{\infty}_{-\infty} \overline{\psi} \psi dx$ is finite. Note: the square root of a probability density function is a vector in this vector space. Scalars are complex numbers, and the operations are the usual function addition and scalar multiplication.

Our inner product $\langle \psi , \phi \rangle = \int^{\infty}_{-\infty} \overline{\psi} \phi dx$ has the following type of symmetry: $\langle \psi , \phi \rangle= \overline{\langle \phi , \psi \rangle}$ and $\langle c\psi , \phi \rangle = \langle \psi , \overline{c} \phi \rangle = \overline{c}\langle \psi , \phi \rangle$.
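This inner product can be evaluated numerically for concrete functions. A sketch assuming numpy and scipy; the two unit-norm functions below (a Gaussian and a rational function) are hypothetical examples, not taken from Gillespie:

```python
import numpy as np
from scipy.integrate import quad

# <psi, phi> = integral of conj(psi) * phi over the real line.
def psi(x):
    return np.pi ** -0.25 * np.exp(-x ** 2 / 2)   # unit-norm Gaussian

def phi(x):
    return np.sqrt(2 / np.pi) / (x ** 2 + 1)      # also unit norm

def inner(f, g):
    # Our examples are real valued, so conj(f) = f here.
    val, _ = quad(lambda x: np.conj(f(x)) * g(x), -np.inf, np.inf)
    return val
```

Both example functions integrate (squared) to 1, so each is a legitimate state vector under Postulate 1.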

Note: Our vector space will have a metric that is compatible with the inner product; such spaces are called Hilbert spaces. This means that we will allow for infinite sums of functions with some notion of convergence; one might think of “convergence in the mean,” which uses the norm coming from our inner product in the usual way.

Of interest to us will be the Hermitian linear transformations $H$ where $\langle H(\psi ), \phi \rangle = \langle \psi ,H(\phi) \rangle .$ It is an easy exercise to see that such a linear transformation can only have real eigenvalues. We will also be interested in the subset (NOT a vector subspace) of vectors $\psi$ for which $||\psi||^2 = \langle \psi , \psi \rangle = 1$.
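The real-eigenvalue fact is easy to see numerically in finite dimensions. A minimal illustration, assuming numpy; the matrix is a random stand-in for a Hermitian operator:

```python
import numpy as np

rng = np.random.default_rng(1)

# Build a Hermitian matrix H = (M + M^H)/2 from a random complex matrix M.
M = rng.normal(size=(5, 5)) + 1j * rng.normal(size=(5, 5))
H = (M + M.conj().T) / 2

# Use the general (non-symmetric) solver, which returns complex numbers...
eigvals = np.linalg.eigvals(H)

# ...yet every eigenvalue is real up to floating-point error.
assert np.allclose(eigvals.imag, 0)
```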

Eigenvalues and eigenvectors will be defined in the usual way: if $H(\psi) = \alpha \psi$ then we say that $\psi$ is an eigenvector for $H$ with associated eigenvalue $\alpha$. If there is a countable number of orthonormal eigenvectors whose “span” (allowing for infinite sums) includes every element of the vector space, then we say that $H$ has a complete orthonormal eigenbasis.

It is a good warm up exercise to show that if $H$ has a complete orthonormal eigenbasis with real eigenvalues then $H$ is Hermitian.

Hint: start with $\langle H(\psi ), \phi \rangle$ and expand $\psi$ and $\phi$ in terms of the eigenbasis; of course the linear operator $H$ has to commute with the infinite sum so there are convergence issues to be concerned about.

The outline goes something like this: suppose $\epsilon_i$ is the complete set of eigenvectors for $H$ with eigenvalues $a_i$, and $\psi = \sum_i b_i \epsilon_i$, $\phi = \sum_i c_i \epsilon_i$. Then
$\langle H(\psi ), \phi \rangle =\langle H(\sum_i b_i \epsilon_i ), \phi \rangle = \langle \sum_i H(b_i \epsilon_i ), \phi \rangle = \langle \sum_i b_i a_i \epsilon_i , \phi \rangle = \sum_i a_i\langle b_i \epsilon_i , \phi \rangle .$

Now do the same expansion on the other slot of the inner product (expand $\phi$ in $\langle \psi, H(\phi) \rangle$) and use the fact that the basis vectors are mutually orthogonal. Note: there are convergence issues here; those that relate to moving the infinite sum outside of the inner product can be handled with a dominated convergence theorem for integrals. But the intuition taken from finite vector spaces works here.
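In finite dimensions, where none of those convergence issues arise, the symmetry $\langle H(\psi), \phi \rangle = \langle \psi, H(\phi) \rangle$ can be verified directly. A sketch assuming numpy; the matrix and vectors are random stand-ins, and `np.vdot` conjugates its first argument, matching our inner product convention:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hermitian matrix H and two arbitrary complex vectors psi, phi.
M = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
H = (M + M.conj().T) / 2

psi = rng.normal(size=4) + 1j * rng.normal(size=4)
phi = rng.normal(size=4) + 1j * rng.normal(size=4)

# <H psi, phi> = <psi, H phi> (np.vdot conjugates the first slot).
lhs = np.vdot(H @ psi, phi)
rhs = np.vdot(psi, H @ phi)
assert np.isclose(lhs, rhs)
```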

The other thing to note is that not every Hermitian operator is “closed”; that is, it is possible for $\psi$ to be square integrable but for $H(\psi) = x \psi$ to fail to be square integrable.

Probability Density Functions

Let $H$ be a linear operator with a complete orthonormal eigenbasis $\epsilon_i$ and corresponding real eigenvalues $a_i$. Let $\psi$ be an element of the Hilbert space with unit norm and let $\psi = \sum_j b_j \epsilon_j$.

Claim: the function $P(y = a_i) = (|b_i|)^2$ is a probability density function. (note: $b_i = \langle \epsilon_i , \psi \rangle$).

The fact that $|b_i|^2 \leq 1$ follows easily from the Cauchy-Schwarz inequality. Also note that $1 = \langle \psi, \psi \rangle = \langle \sum_i b_i \epsilon_i, \sum_j b_j \epsilon_j \rangle = \sum_i \overline{b_i} b_i \langle \epsilon_i, \epsilon_i \rangle = \sum_i |b_i|^2$
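The claim that the $|b_i|^2$ sum to one can be checked numerically in finite dimensions. A sketch assuming numpy; the orthonormal basis is generated at random via a QR factorization:

```python
import numpy as np

rng = np.random.default_rng(3)

# Columns of Q form a random orthonormal basis epsilon_i (QR of a random
# complex matrix yields a unitary Q).
Q, _ = np.linalg.qr(rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4)))

# A unit-norm state psi.
psi = rng.normal(size=4) + 1j * rng.normal(size=4)
psi = psi / np.linalg.norm(psi)

# Coefficients b_i = <epsilon_i, psi>; their squared moduli sum to 1.
b = Q.conj().T @ psi
assert np.isclose(np.sum(np.abs(b) ** 2), 1.0)
```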

Yes, I skipped some steps that are easy to fill in. But the bottom line is that this density function now has a (sometimes) finite expected value and a (sometimes) finite variance.

With the mathematical preliminaries (mostly) out of the way, we are ready to see how this applies to physics.