# College Math Teaching

## July 28, 2011

### Quantum Mechanics and Undergraduate Mathematics VII: Time Evolution of the State Vector

Filed under: advanced mathematics, applied mathematics, physics, quantum mechanics, science — collegemathteaching @ 2:38 pm

Of course the state vector $\psi$ changes with time. The question is how does it change with time and how does the probability density function associated with an observable change with time?

Note: we will write $\psi_t$ for $\psi(x,t)$. Now let $A$ be an observable. Note that the eigenvectors and the eigenvalues associated with $A$ do NOT change with time, so if we expand $\psi_t$ in terms of the eigenbasis for $A$ we have $\psi_t = \sum_k \langle \alpha_k, \psi_t \rangle \alpha_k$ hence $\frac{\partial \psi_t}{\partial t} = \sum_k \langle \alpha_k, \frac{\partial \psi_t}{\partial t} \rangle \alpha_k$

Of course, we need the state vector to “stay in the class of state vectors” when it evolves with respect to time, which means that the norm cannot change; or $\frac{d}{dt} \langle \psi_t, \psi_t \rangle = 0$.

Needless to say there has to be some restriction on how the state vector can change with time. So we have another postulate:

Postulate 5
For every physical system there exists a linear Hermitian operator $H$ called the Hamiltonian operator such that:
1. $i\hbar \frac{\partial}{\partial t} \psi(x,t) = H\psi(x,t)$ and
2. $H$ corresponds to the total energy of the system and possesses a complete set of eigenvectors $\eta_k$ and eigenvalues $e_k$ where the eigenvalues are the “allowed values” of the total energy of the system.

Note: $\hbar$ is the constant $\frac{h}{2\pi}$ where $h$ is Planck’s constant.

Note: $H$ is not specified; it is something that the physicists have to come up with by observing the system. That is, there is a partial differential equation to be solved!

But does this give us what we want, at least in terms of $\psi_t$ staying at unit norm for all times $t$? (note: again we write $\psi_t$ for $\psi(x,t)$ ).

The answer is yes; first note that $\frac{d}{dt} \langle \psi_t, \psi_t \rangle = \langle \frac{\partial \psi_t}{\partial t}, \psi_t \rangle + \langle \psi_t, \frac{\partial \psi_t}{\partial t}\rangle$ (this is an easy exercise in using the definition of our inner product, differentiating under the integral sign, and noting that the partial derivative operation and the conjugate operation commute).

Now use Postulate 5 in the form $\frac{\partial \psi_t}{\partial t} = -\frac{i}{\hbar}H\psi_t$ and note: $\langle \frac{\partial \psi_t}{\partial t}, \psi_t \rangle + \langle \psi_t, \frac{\partial \psi_t}{\partial t}\rangle = \overline{\left(-\frac{i}{\hbar}\right)}\langle H\psi_t, \psi_t \rangle - \frac{i}{\hbar}\langle \psi_t, H \psi_t \rangle = \frac{i}{\hbar}(\langle H\psi_t, \psi_t \rangle - \langle \psi_t, H\psi_t \rangle) = 0$
because $H$ is Hermitian.

Note: at this point Gillespie takes an aside and notes that if one denotes the state vector at time $t = 0$ by $\psi_0$ then one can attempt to find an operator $U(t)$ where $\psi_t = U(t)\psi_0$. Substituting this into Postulate 5 gives $i \hbar \frac{\partial U(t)}{\partial t}\psi_0 = HU(t)\psi_0$ which must be true for all $\psi_0$. This leads to $i\hbar \frac{\partial U(t)}{\partial t} = HU(t)$ with initial condition $U(0) = 1$. When $H$ does not depend on $t$, this has the solution $U(t) = \exp(\frac{-iHt}{\hbar})$ where the exponential is defined in the sense of linear algebra (use the power series expansion of $\exp(A)$). Note: $U$ is not Hermitian and therefore NOT an observable. But it has a special place in quantum mechanics and is called the time-evolution operator.
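Here is a small numerical sketch of the time-evolution operator, not taken from Gillespie. It assumes a finite-dimensional stand-in for the Hilbert space, a made-up random Hermitian matrix playing the role of $H$, and units in which $\hbar = 1$; the helper name `evolution_operator` is hypothetical. The exponential is computed from the eigendecomposition of $H$ rather than from the power series.

```python
import numpy as np

def evolution_operator(H, t, hbar=1.0):
    """Return U(t) = exp(-i H t / hbar) for a Hermitian matrix H.

    Uses H = V diag(e) V^dagger, so U(t) = V diag(exp(-i e t / hbar)) V^dagger.
    """
    e, V = np.linalg.eigh(H)
    phases = np.exp(-1j * e * t / hbar)
    return (V * phases) @ V.conj().T   # V * phases scales column j by phases[j]

rng = np.random.default_rng(0)
M = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
H = (M + M.conj().T) / 2               # made-up Hermitian "Hamiltonian"

psi0 = rng.normal(size=4) + 1j * rng.normal(size=4)
psi0 /= np.linalg.norm(psi0)           # unit-norm state at t = 0

U = evolution_operator(H, t=0.7)
psi_t = U @ psi0

norm_t = np.linalg.norm(psi_t)                        # should stay at 1
is_unitary = np.allclose(U.conj().T @ U, np.eye(4))   # U^dagger U = identity
is_hermitian = np.allclose(U, U.conj().T)             # generically False
print(norm_t, is_unitary, is_hermitian)
```

In this sketch the norm stays at 1 because $U(t)$ is unitary, which is the matrix version of the norm calculation above.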

Next: we’ll deal with the time energy relation for the expected value of a specific observable.

## July 25, 2011

### Quantum Mechanics and Undergraduate Mathematics VI: Heisenberg Uncertainty Principle

Filed under: advanced mathematics, applied mathematics, physics, probability, quantum mechanics, science — collegemathteaching @ 10:05 pm

Here we use the Cauchy-Schwarz inequality, other facts about inner products and basic probability to derive the Heisenberg Uncertainty Principle for incompatible observables $A$ and $B$. We assume some state vector $\psi$ which has not been given time to evolve between measurements and we will abuse notation by viewing $A$ and $B$ as random variables for their given eigenvalues $a_k, b_k$ given state vector $\psi$.

What we are after is the following: $V(A)V(B) \geq (1/4)|\langle \psi, (AB-BA) \psi \rangle|^2.$
When $AB-BA = c$ we get: $V(A)V(B) \geq (1/4)|c|^2$ which is how it is often stated.

The proof is a bit easier when we make the expected values of $A$ and $B$ equal to zero; we do this by introducing the new linear operators $A' = A -E(A)$ and $B' = B - E(B)$; note that $(A - E(A))\psi = A\psi - E(A)\psi$. The following are routine exercises:
1. $A'$ and $B'$ are Hermitian
2. $A'B' - B'A' = AB-BA$
3. $V(A') = V(A)$.

If one is too lazy to work out 3:
$V(A') = E((A-E(A))^2) - (E(A -E(A)))^2 = E(A^2 - 2AE(A) + E(A)E(A)) - 0 = E(A^2) -2E(A)E(A) + (E(A))^2 = E(A^2) - (E(A))^2 = V(A)$

Now we have everything in place:
$\langle \psi, (AB-BA) \psi \rangle = \langle \psi, (A'B'-B'A') \psi \rangle = \langle A'\psi, B' \psi \rangle - \langle B'\psi, A' \psi \rangle = \langle A'\psi, B' \psi \rangle - \overline{\langle A'\psi, B' \psi \rangle} = 2iIm\langle A'\psi, B'\psi \rangle$
We now can take the modulus of both sides:
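$|\langle \psi, (AB-BA)\psi \rangle | = 2 |Im \langle A'\psi, B'\psi \rangle| \leq 2|\langle A'\psi, B'\psi\rangle | \leq 2 \sqrt{\langle A'\psi,A'\psi\rangle}\sqrt{\langle B'\psi, B'\psi\rangle} = 2 \sqrt{V(A')}\sqrt{V(B')} = 2\sqrt{V(A)}\sqrt{V(B)}$
Here we used $\langle A'\psi, A'\psi \rangle = \langle \psi, (A')^2 \psi \rangle = E((A')^2) = V(A')$, which holds because $A'$ is Hermitian and $E(A') = 0$. Squaring both sides and dividing by 4 gives the inequality we were after.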

This means that whenever $\langle \psi, (AB-BA) \psi \rangle \neq 0$ (which requires $A$ and $B$ to be incompatible observables), there is a positive lower bound on the product of their standard deviations that cannot be done away with by more careful measurement. In the case $AB-BA = c \neq 0$ it is physically impossible to drive this product to zero in any state. This also means that one of the standard deviations cannot be zero unless the other is infinite.
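The inequality can be spot-checked numerically. The following is only a sketch under stated assumptions: random Hermitian matrices stand in for the operators $A$ and $B$, a random unit vector stands in for $\psi$, and the helper names `expect`, `variance` and `random_hermitian` are made up for this illustration. (For finite matrices the commutator has trace zero, so the special case $AB-BA = c \neq 0$ cannot occur here; only the general form is tested.)

```python
import numpy as np

def expect(op, psi):
    """E(op) = <psi, op psi>; np.vdot conjugates its first argument."""
    return np.vdot(psi, op @ psi).real

def variance(op, psi):
    """V(op) = <psi, op^2 psi> - <psi, op psi>^2."""
    return expect(op @ op, psi) - expect(op, psi) ** 2

def random_hermitian(n, rng):
    M = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (M + M.conj().T) / 2

rng = np.random.default_rng(1)
n = 5
gaps = []
for _ in range(200):
    A, B = random_hermitian(n, rng), random_hermitian(n, rng)
    psi = rng.normal(size=n) + 1j * rng.normal(size=n)
    psi /= np.linalg.norm(psi)
    lhs = variance(A, psi) * variance(B, psi)
    rhs = 0.25 * abs(np.vdot(psi, (A @ B - B @ A) @ psi)) ** 2
    gaps.append(lhs - rhs)          # should never be negative

min_gap = min(gaps)
print(min_gap)
```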

### Quantum Mechanics and Undergraduate Mathematics V: compatible observables

This builds on our previous example. We start with a state $\psi$ and we will make three successive observations of observables which have operators $A$ and $B$ in the following order: $A, B, A$. The assumption is that these observations are made so quickly that no time evolution of the state vector can take place; all of the change to the state vector will be due to the effect of the observations.

A simplifying assumption will be that the observation operators have the following property: no two linearly independent eigenvectors have the same eigenvalue (i. e., the eigenvalue uniquely determines the eigenvector up to multiplication by a constant of unit modulus).

First of all, this is what “compatible observables” means: two observables $A, B$ are compatible if, upon three successive measurements $A, B, A$ the result of the first measurement of $A$ is guaranteed to equal the result of the second measurement of $A$. That is, the state vector after the first measurement of $A$ is the same state vector after the second measurement of $A$.

So here is what the compatibility theorem says (I am freely abusing notation by calling the observable by the name of its associated operator):

Compatibility Theorem
The following are equivalent:

1. $A, B$ are compatible observables.
2. $A, B$ have a common eigenbasis.
3. $A, B$ commute (as operators)

Note: for this discussion, we’ll assume an eigenbasis of $\alpha_i$ for $A$ and $\beta_i$ for $B$.

1 implies 2: Suppose the state of the system is $\alpha_k$ just prior to the first measurement. Then the first measurement is $a_k$. The second measurement yields $b_j$ which means the system is in state $\beta_j$, in which case the third measurement is guaranteed to be $a_k$ (it is never anything else by the compatible observable assumption). Hence the state vector must have been $\alpha_k$ which is the same as $\beta_j$. So, by some reindexing we can assume that $\alpha_1 = \beta_1$. An argument about completeness and orthogonality finishes the proof of this implication.

2 implies 1: after the first measurement, the state of the system is $\alpha_k$ which, being a basis vector for observable $B$ means that the system after the measurement of $B$ stays in the same state, which implies that the state of the system will remain $\alpha_k$ after the second measurement of $A$. Since this is true for all basis vectors, we can extend this to all state vectors, hence the observables are compatible.

2 implies 3: a common eigenbasis implies that the operators commute on basis elements so the result follows (by some routine linear-algebra type calculations)

3 implies 2: given any eigenvector $\alpha_k$ we have $AB \alpha_k = BA \alpha_k = a_k B \alpha_k$ which implies that $B \alpha_k$ (if it is not zero) is an eigenvector for $A$ with eigenvalue $a_k$. Since the eigenvalue $a_k$ determines its eigenvector up to a scalar multiple, this means that $B \alpha_k = c \alpha_k$ for some scalar $c$; hence $\alpha_k$ must be an eigenvector of $B$. In this way, we establish a correspondence between the eigenbasis of $B$ with the eigenbasis of $A$.
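For a finite-dimensional illustration of "3 implies 2", here is a minimal sketch. It assumes two made-up $3 \times 3$ Hermitian matrices built from one shared orthonormal basis (so they commute by construction), then recomputes the eigenvectors of $A$ independently and checks that they also diagonalize $B$.

```python
import numpy as np

rng = np.random.default_rng(2)
M = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
Q, _ = np.linalg.qr(M)                  # columns: an orthonormal basis

# Two observables with distinct eigenvalues and the same eigenvectors.
A = Q @ np.diag([1.0, 2.0, 3.0]) @ Q.conj().T
B = Q @ np.diag([5.0, -1.0, 2.0]) @ Q.conj().T

commute = np.allclose(A @ B, B @ A)

# Eigenvectors of A, computed without reference to Q.
_, alphas = np.linalg.eigh(A)
B_in_alpha_basis = alphas.conj().T @ B @ alphas
off_diagonal = B_in_alpha_basis - np.diag(np.diag(B_in_alpha_basis))
shares_eigenbasis = np.allclose(off_diagonal, 0)   # B is diagonal in A's eigenbasis
print(commute, shares_eigenbasis)
```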

Ok, what happens when the observables are NOT compatible?

Here is a lovely application of conditional probability. It works this way: suppose on the first measurement, $a_k$ is observed. This puts us in state vector $\alpha_k$. Now we measure the observable $B$ which means that there is a probability $|\langle \alpha_k, \beta_i \rangle|^2$ of observing eigenvalue $b_i$. Now $\beta_i$ is the new state vector and when observable $A$ is measured, we have a probability $|\langle \alpha_j, \beta_i \rangle|^2$ of observing eigenvalue $a_j$ in the second measurement of observable $A$.

Therefore given the initial measurement we can construct a conditional probability density function $p(a_j|a_k) = \sum_i p(b_i|a_k)p(a_j|b_i)= \sum_i |\langle \alpha_k, \beta_i \rangle|^2 |\langle \beta_i, \alpha_j \rangle|^2$
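The conditional probabilities can be tabulated for a toy case. This sketch assumes two made-up random orthonormal bases in a 4-dimensional space standing in for the eigenbases $\alpha_i$ and $\beta_i$; the helper name `random_orthonormal_basis` is hypothetical.

```python
import numpy as np

def random_orthonormal_basis(n, rng):
    """Columns of the returned matrix form an orthonormal basis of C^n."""
    M = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    Q, _ = np.linalg.qr(M)
    return Q

rng = np.random.default_rng(3)
n = 4
alpha = random_orthonormal_basis(n, rng)
beta = random_orthonormal_basis(n, rng)

# T[k, i] = |<alpha_k, beta_i>|^2 = p(b_i | a_k)
T = np.abs(alpha.conj().T @ beta) ** 2

# P[k, j] = p(a_j | a_k) = sum_i p(b_i | a_k) p(a_j | b_i) = sum_i T[k, i] T[j, i]
P = T @ T.T
row_sums = P.sum(axis=1)        # each row is a probability distribution

# Compatible case: if B has the same eigenbasis as A, the table is the identity.
T_same = np.abs(alpha.conj().T @ alpha) ** 2
P_same = T_same @ T_same.T
print(np.round(P, 3))
print(row_sums)
```

In the compatible case the second measurement of $A$ reproduces the first with probability one, which is what the identity table says.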

Again, this makes sense only if the observations were taken so close together so as to not allow the state vector to undergo time evolution; ONLY the measurements change the state vector.

Next: we move to the famous Heisenberg Uncertainty Principle, which states that, if we view the interaction of the observables $A$ and $B$ with a set state vector and abuse notation a bit and regard the associated density functions (for the eigenvalues) by the same letters, then $V(A)V(B) \geq (1/4)|\langle \psi, [AB-BA]\psi \rangle |^2.$

Of course, if the observables are compatible, then the right side becomes zero and if $AB-BA = c$ for some non-zero scalar $c$ (that is, $(AB-BA) \psi = c\psi$ for all possible state vectors $\psi$ ), then we get $V(A)V(B) \geq (1/4)|c|^2$ which is how it is often stated.

## July 19, 2011

### Quantum Mechanics and Undergraduate Mathematics IV: measuring an observable (example)

Ok, we have to relate the observables to the state of the system. We know that the only possible “values” of the observable are the eigenvalues of the operator and the relation of the operator to the state vector provides the density function. But what does this measurement do to the state? That is, immediately after a measurement is taken, what is the state?

True, the system undergoes a "time evolution" but once an observable is measured, an immediate (termed "successive") measurement will yield the same value; a "repeated" measurement (one made after giving the system time to undergo a time evolution) might give a different value.

So we get:

Postulate 4 A measurement of an observable generally (?) causes a drastic, uncontrollable alteration in the state vector of the system; immediately after the measurement it will coincide with the eigenvector corresponding to the eigenvalue obtained in the measurement.

Note: we assume that our observable operators have distinct eigenvalues; that is, no two distinct eigenvectors have the same eigenvalue.

That is, if we measure an observable with operator $A$ and obtain measurement $a_i$ then the new state vector of the system is the eigenvector $\alpha_i$ regardless of what $\psi$ was prior to measurement. Of course, this eigenvector can (and usually will) evolve with time.

Roughly speaking, here is what is going on:
Say the system is in state $\psi$. We measure an observable with operator $A$. We can only obtain one of the eigenvalues $a_k$ as a measurement. Recall: remember all of those “orbitals” from chemistry class? Those were the energy levels of the electrons and the orbital level was a permissible energy state that we could obtain by a measurement.

Now if we get $a_k$ as a measurement, the new state vector is $\alpha_k$. One might say that we started with a probability density function (given the state and the observable), we made a measurement, and now, for a brief instant anyway, our density function “collapsed” to the density function $P(A = a_k) = 1$.

This situation (brief) coincides with our classical intuition of an observable “having a value”.
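Here is a toy simulation of that collapse, offered only as a sketch: it assumes a made-up $4 \times 4$ Hermitian matrix as the observable, a random unit vector as the state, and a hypothetical helper `measure` that samples an eigenvalue with the probabilities from Postulate 3 and returns the corresponding eigenvector as the new state.

```python
import numpy as np

def measure(eigvals, eigvecs, psi, rng):
    """Pick eigenvalue a_k with probability |<alpha_k, psi>|^2.

    Returns (a_k, alpha_k): the observed value and the collapsed state.
    """
    probs = np.abs(eigvecs.conj().T @ psi) ** 2
    k = rng.choice(len(eigvals), p=probs / probs.sum())
    return eigvals[k], eigvecs[:, k]

rng = np.random.default_rng(7)
M = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
A = (M + M.conj().T) / 2                  # made-up observable
a, alphas = np.linalg.eigh(A)             # eigenvalues a_k, eigenvectors alpha_k

psi = rng.normal(size=4) + 1j * rng.normal(size=4)
psi /= np.linalg.norm(psi)

first, state = measure(a, alphas, psi, rng)

# Immediate ("successive") measurements, with no time evolution in between.
repeats = [measure(a, alphas, state, rng)[0] for _ in range(20)]
all_same = all(r == first for r in repeats)
print(first, all_same)
```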

For the purposes of this example, we’ll set our Hilbert space to be the square integrable piecewise smooth functions on $[-\pi, \pi]$ and let our “state vector” $\psi(x) =\left\{ \begin{array}{c}1/\sqrt{\pi}, 0 < x \leq \pi \\ 0,-\pi \leq x \leq 0 \end{array}\right.$

Now suppose our observable corresponds to the eigenfunctions mentioned in this post, and we measure “-4” for our observable. This is the eigenvalue for $(1/\sqrt{\pi})sin(2x)$ so our new state vector is $(1/\sqrt{\pi})sin(2x)$.

So what happens if a different observable is measured IMMEDIATELY (i. e., no chance for a time evolution to take place)?

Example We’ll still use the space of square integrable functions over $[-\pi, \pi]$
One might recall the Legendre polynomials which are eigenfunctions of the following operator:
$d/dt((1-t^2) dP_n/dt) = -(n)(n+1) P_n(t)$. These polynomials obey the orthogonality relation $\int^{1}_{-1} P_m(t)P_n(t)dt = 2/(2n+1) \delta_{m,n}$ hence $\int^{1}_{-1} P_m(t)P_m(t)dt = 2/(2m+1)$.
The first few of these are $P_0 = 1, P_1 =t, P_2 = (1/2)(3t^2-1), P_3 = (1/2)(5t^3 - 3t), ..$

We can adjust these polynomials by the change of variable $t =x/\pi$ and multiply each polynomial $P_m$ by the factor $\sqrt{(2m+1)/(2\pi)}$ to obtain an orthonormal eigenbasis (the change of variable gives $\int^{\pi}_{-\pi} (P_m(x/\pi))^2 dx = 2\pi/(2m+1)$). Of course, one has to adjust the operator by the chain rule.

So for this example, let $P_n$ denote the adjusted Legendre polynomial with eigenvalue $-n(n+1)$.

Now back to our original state vector which was changed to state function $(1/\sqrt{\pi})sin(2x)$.

Now suppose eigenvalue $-6 = -2(3)$ is observed for the observable with the Legendre operator; this corresponds to eigenvector $\sqrt{5/(2\pi)}(1/2)(3(x/\pi)^2 -1)$ which is now the new state vector.
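The rescaling of the Legendre polynomials to $[-\pi, \pi]$ can be checked numerically. This is a sketch, assuming NumPy's `numpy.polynomial.legendre` module and Gauss-Legendre quadrature (which is exact for the polynomial integrands used here); the helper names `inner`, `rescaled` and `normalized` are made up. The printed squared norms can be compared with $2\pi/(2m+1)$, the value that the substitution $t = x/\pi$ predicts from the orthogonality relation above.

```python
import numpy as np
from numpy.polynomial.legendre import Legendre, leggauss

nodes, weights = leggauss(20)       # Gauss-Legendre rule on [-1, 1]

def inner(f, g):
    """Integral of f(x) g(x) over [-pi, pi] for real f, g, via x = pi * t."""
    x = np.pi * nodes
    return np.pi * np.sum(weights * f(x) * g(x))

def rescaled(m):
    """x -> P_m(x / pi): the m-th Legendre polynomial moved to [-pi, pi]."""
    P = Legendre.basis(m)
    return lambda x: P(x / np.pi)

# Squared norms of the rescaled (not yet normalized) polynomials.
sq_norms = [inner(rescaled(m), rescaled(m)) for m in range(5)]

def normalized(m):
    """Rescaled polynomial divided by its norm."""
    f = rescaled(m)
    c = 1.0 / np.sqrt(sq_norms[m])
    return lambda x: c * f(x)

# Gram matrix of the normalized family; should be the identity.
gram = np.array([[inner(normalized(m), normalized(n)) for n in range(5)]
                 for m in range(5)])
print(sq_norms)
print(np.round(gram, 12))
```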

Now if we were to do an immediate measurement of the first observable, we’d have to do a Fourier-like expansion of our new state vector; hence the probability density function for the observables changes from the initial measurement. Bottom line: the order in which the observations are taken matters….in general.

The case in which the order wouldn’t matter: if the second observable had the state vector (from the first measurement) as an element of its eigenbasis.

We will state this as a general principle in our next post.

## July 15, 2011

### Quantum Mechanics and Undergraduate Mathematics III: an example of a state function

I feel bad that I haven’t given a demonstrative example, so I’ll “cheat” a bit and give one:

For the purposes of this example, we’ll set our Hilbert space to be the square integrable piecewise smooth functions on $[-\pi, \pi]$ and let our “state vector” $\psi(x) =\left\{ \begin{array}{c}1/\sqrt{\pi}, 0 < x \leq \pi \\ 0,-\pi \leq x \leq 0 \end{array}\right.$

Now consider a (bogus) state operator $d^2/dx^2$ which has an eigenbasis $(1/\sqrt{\pi})cos(kx), (1/\sqrt{\pi})sin(kx), k \in \{1, 2, 3,...\}$ and $1/\sqrt{2\pi}$ with eigenvalues $0, -1, -4, -9,......$ (note: I know that this is a degenerate case in which some eigenvalues share two eigenfunctions).

Note also that the eigenfunctions are almost the functions used in the usual Fourier expansion; the difference is that I have scaled the functions so that $\int^{\pi}_{-\pi} (sin(kx)/\sqrt{\pi})^2 dx = 1$ as required for an orthonormal basis with this inner product.

Now we can write $\psi = 1/(2 \sqrt{\pi}) + 2/(\pi^{3/2})(sin(x) + (1/3)sin(3x) + (1/5)sin(5x) +..)$
(yes, I am abusing the equal sign here)
This means that $b_0 = 1/\sqrt{2}, b_k = 2/(k \pi), k \in \{1,3,5,7...\}$

Now the only possible measurements of the operator are 0, -1, -4, -9, …. and the probability density function is: $P(A = 0) = 1/2, P(A = -1) = 4/(\pi^2), P(A = -9) = 4/(9 \pi^2),...P(A = -(2k-1)^2)= 4/(((2k-1)\pi)^2)..$ (the eigenvalues $-(2k)^2$ have probability zero).

One can check that $1/2 + (4/(\pi^2))(1 + 1/9 + 1/25 + 1/49 + 1/81....) = 1.$
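Here is a short numerical sketch of this example. It assumes the closed form of the sine coefficients, $b_k = \frac{1}{\pi}\int_0^{\pi} sin(kx) dx$ (the cosine coefficients vanish), cross-checks it against a midpoint Riemann sum, and adds up the probabilities over the first 200000 values of $k$. The helper names `b_sin` and `b_sin_numeric` are made up.

```python
import numpy as np

def b_sin(k):
    """<sin(kx)/sqrt(pi), psi> = (1/pi) * integral_0^pi sin(kx) dx."""
    return (1 - np.cos(k * np.pi)) / (k * np.pi)

# Cross-check with a midpoint rule on (0, pi), where psi = 1/sqrt(pi).
N = 100000
dx = np.pi / N
x_mid = (np.arange(N) + 0.5) * dx

def b_sin_numeric(k):
    integrand = (np.sin(k * x_mid) / np.sqrt(np.pi)) * (1 / np.sqrt(np.pi))
    return np.sum(integrand) * dx

b0 = 1 / np.sqrt(2)                     # <1/sqrt(2 pi), psi>
ks = np.arange(1, 200001)
probs = b_sin(ks) ** 2                  # zero (up to rounding) for even k
total = b0 ** 2 + probs.sum()           # partial sum; approaches 1 from below

print(b_sin(1), b_sin_numeric(1))       # both close to 2/pi
print(b_sin(3), b_sin_numeric(3))       # both close to 2/(3 pi)
print(total)
```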

Here is a plot of the state function (blue line at the top) along with some of the eigenfunctions multiplied by their respective $b_k$.

## July 13, 2011

### Quantum Mechanics and Undergraduate Mathematics II

In the first part of this series, we reviewed some of the mathematical background that we’ll use. Now we get into a bit of the physics.

For simplification, we’ll assume one dimensional, non-relativistic motion. No, nature isn’t that simple; that is why particle physics is hard! 🙂

What we will do is to describe a state of a system and the observables. The state of the system is hard to describe; in the classical case (say the damped mass-spring system in harmonic motion), the state of the system is determined by the system parameters (mass, damping constant, spring constant) and the position and velocity at a set time.

An observable is, roughly speaking, something that can give us information about the state of the system. In classical mechanics, one observable might be $H(x, p) = p^2/(2m) + V(x)$ where $p$ is the system’s momentum and $V(x)$ represents the potential energy at position $x$. If this seems strange, remember that $p = mv$, so the kinetic energy $mv^2/2$ becomes $p^2/(2m)$ when written in terms of the momentum $p$. We bring this up because something similar will appear later.

In quantum mechanics, certain postulates are assumed. I’ll present the ones that Gillespie uses:

Postulate 1: Every possible physical state of a given system corresponds to a Hilbert space vector $\psi$ of unit norm (using the inner product that we talked about) and every such vector corresponds to a possible state of a system. The correspondence of states to the vectors is well defined up to multiplication of a vector by a complex number of unit modulus.

Note: this state vector, while containing all of the knowable information of the system, says nothing about what could be known or how such knowledge might be observed. Of course, this state vector might evolve with time and sometimes it is written as $\psi_{t}$ for this reason.

Postulate 2 There is a one to one correspondence between physical observables and linear Hermitian operators $A$, each of which possesses a complete, orthonormal set of eigenvectors $\alpha_{i}$ and a corresponding set of real eigenvalues $a_i$, and the only possible result of any measurement of this observable is one of these eigenvalues.

Note: in the cases when the eigenvalues are discretely distributed (i. e., the eigenvalues fail to have a limit point), we get “quantized” behavior from this observable.

We’ll use observables with discrete eigenvalues unless we say otherwise.

Now: is a function of an observable itself an observable? The answer is “yes” if the function is real analytic and we assume that $(A)^n(\psi) = A(A(A....A(\psi)))$. To see this: assume that $f(z) = \sum_i c_i z^i$ with real coefficients $c_i$ and note that if $A$ is an observable operator then so is $cA^n$ for all $n$ and all real $c$. Note: one can do this by showing that the eigenvectors for $A$ do not change and that the eigenvalues are merely raised to the $n$-th power. The completeness of the eigenvectors implies convergence when we pass to $f$.
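A quick finite-dimensional sketch of this claim, under stated assumptions: a made-up random Hermitian matrix plays the role of $A$, and the polynomial $f(z) = 2z^3 - z + 4$ (chosen arbitrarily, with real coefficients) plays the role of the real analytic function.

```python
import numpy as np

rng = np.random.default_rng(4)
M = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
A = (M + M.conj().T) / 2                   # made-up observable operator
a, alphas = np.linalg.eigh(A)              # eigenvalues a_i, eigenvectors alpha_i

def f_scalar(z):
    """An arbitrary polynomial with real coefficients."""
    return 2 * z ** 3 - z + 4

# The same polynomial applied to the operator.
f_A = 2 * np.linalg.matrix_power(A, 3) - A + 4 * np.eye(4)

is_hermitian = np.allclose(f_A, f_A.conj().T)

# f(A) alpha_i = f(a_i) alpha_i: same eigenvectors, eigenvalues f(a_i).
# (alphas * f_scalar(a) scales column i of alphas by f(a_i).)
same_eigenvectors = np.allclose(f_A @ alphas, alphas * f_scalar(a))
print(is_hermitian, same_eigenvectors)
```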

Now we have states and observables. But how do they interact?
Remember that we showed the following:

Let $A$ be a linear operator with a complete orthonormal eigenbasis $\alpha_i$ and corresponding real eigenvalues $a_i$. Let $\psi$ be an element of the Hilbert space with unit norm and let $\psi = \sum_j b_j \alpha_j$.

Then the function $P(y = a_i) = (|b_i|)^2$ is a probability density function. (note: $b_i = \langle \alpha_i , \psi \rangle$).

This will give us exactly what we need! Basically, if the observable has operator $A$ and the system is in state $\psi$, then the probability of a measurement yielding a result of $a_i$ is $(|\langle \alpha_i , \psi \rangle|)^2$. Note: it follows that if the state $\psi = \alpha_i$ then the probability of obtaining $a_i$ is exactly one.

We sum this up in Postulate 3 (page 49 of Gillespie, stated for the “scattered eigenvalues” case):

Postulate 3: If an observable operator $A$ has eigenbasis $\alpha_i$ with eigenvalues $a_i$ and if the corresponding observable is measured on a system which, immediately prior to the measurement is in state $\psi$ then the strongest predictive statement that can be made concerning the result of this measurement is as follows: the probability that the measurement will yield $a_k$ is $(|\langle \alpha_k , \psi \rangle|)^2$.

Note: for simplicity, we are restricting ourselves to observables which have distinct eigenvalues (i. e., no two linearly independent eigenvectors have the same eigenvalues). In real life, some observables DO have different eigenvectors with the same eigenvalue (example from calculus; these are NOT Hilbert Space vectors, but if the operator is $d^2/dx^2$ then $sin(x)$ and $cos(x)$ both have eigenvalue -1. )

Where we are now: we have a probability distribution to work with which means that we can calculate an expected value and a variance. These values will be fundamental when we tackle uncertainty principles!

Just a reminder from our courses in probability theory: if $Y$ is a random variable with density function $P$

$E(Y) = \sum_i y_i P(y_i)$ and $V(Y) = E(Y^2) -(E(Y))^2$.

So with our density function $P(y = a_i) = (|b_i|)^2$ (we use $b_i = \langle \alpha_i , \psi \rangle$ to save space), then if $E(A)$ is the expected observed value of the observable (the expected value of the eigenvalues):
$E(A) = \sum_i a_i |b_i|^2$. But this quantity can be calculated in another way:

$\langle \psi , A(\psi) \rangle = \langle \sum_i b_i \alpha_i , A(\sum_j b_j \alpha_j) \rangle = \langle \sum_i b_i \alpha_i , \sum_j a_j b_j \alpha_j \rangle = \sum_i \overline{b_i} b_i a_i \langle \alpha_i, \alpha_i \rangle = \sum_i \overline{b_i} b_i a_i = \sum_i |b_i|^2 a_i = E(A)$. Yes, I skipped some easy steps.

Using this we find $V(A) = \langle \psi, A^2(\psi) \rangle - (\langle \psi, A(\psi) \rangle )^2$ and it is customary to denote the standard deviation $\sqrt{V(A)} = \Delta(A)$
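The two ways of computing $E(A)$ and $V(A)$ can be compared numerically. This sketch assumes a made-up $6 \times 6$ Hermitian matrix for $A$ and a random unit vector for $\psi$; the variable names are chosen for this illustration only.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 6
M = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
A = (M + M.conj().T) / 2                 # made-up observable operator
a, alphas = np.linalg.eigh(A)            # eigenvalues a_i, eigenvectors alpha_i

psi = rng.normal(size=n) + 1j * rng.normal(size=n)
psi /= np.linalg.norm(psi)               # unit-norm state

b = alphas.conj().T @ psi                # b_i = <alpha_i, psi>
p = np.abs(b) ** 2                       # P(A = a_i) = |b_i|^2

# Probabilistic definitions.
E_prob = np.sum(a * p)
V_prob = np.sum(a ** 2 * p) - E_prob ** 2

# Inner-product formulas (np.vdot conjugates its first argument).
E_inner = np.vdot(psi, A @ psi).real
V_inner = np.vdot(psi, A @ A @ psi).real - E_inner ** 2

print(p.sum(), E_prob, E_inner, V_prob, V_inner)
```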

In our next installment, I give an illustrative example.

In a subsequent installment, we’ll show how a measurement of an observable affects the state and later how the distribution of the observable changes with time.

## July 11, 2011

### Quantum Mechanics for teachers of undergraduate mathematics I

I am planning on writing up a series of notes from the out of print book A Quantum Mechanics Primer by Daniel Gillespie.

My background: mathematics instructor (Ph.D. research area: geometric topology) whose last physics course (at the Naval Nuclear Power School) was almost 30 years ago; sophomore physics was 33 years ago.

Your background: you teach undergraduate mathematics for a living and haven’t had a course in quantum mechanics; those who have the time to study a book such as Quantum Mechanics and the Particles of Nature by Anthony Sudbery would be better off studying that. Those who have had a course in quantum mechanics would be bored stiff.

Topics the reader should know: probability density functions, square integrability, linear algebra (abstract Hermitian inner products, eigenbasis, orthonormal basis), basic analysis (convergence of a series of functions), differential equations, and the Dirac delta distribution.

My purpose: point out some opportunities to present applications to undergraduate students, e. g., “the Dirac delta “function” (distribution really) can be thought of as an eigenvector for this linear transformation”, or “here is an application of non-standard inner products and an abstract vector space”, or “here is a non-data application to the idea of the expected value and variance of a probability density function”, etc.

Basic mathematical objects
Our vector space will consist of functions $\psi : R \rightarrow C$ (complex valued functions of a real variable) for which $\int^{\infty}_{-\infty} \overline{\psi} \psi dx$ is finite. Note: the square root of a probability density function is a vector of this vector space. Scalars are complex numbers and the operation is the usual function addition.

Our inner product $\langle \psi , \phi \rangle = \int^{\infty}_{-\infty} \overline{\psi} \phi dx$ has the following type of symmetry: $\langle \psi , \phi \rangle= \overline{\langle \phi , \psi \rangle}$ and $\langle c\psi , \phi \rangle = \langle \psi , \overline{c} \phi \rangle = \overline{c}\langle \psi , \phi \rangle$.

Note: Our vector space will have a metric that is compatible with the inner product; such spaces are called Hilbert spaces. This means that we will allow for infinite sums of functions with some convergence; one might think of “convergence in the mean” which uses our inner product in the usual way to define the mean.

Of interest to us will be the Hermitian linear transformations $H$ where $\langle H(\psi ), \phi \rangle = \langle \psi ,H(\phi) \rangle .$ It is an easy exercise to see that such a linear transformation can only have real eigenvalues. We will also be interested in the subset (NOT a vector subspace) of vectors $\psi$ of unit norm, that is, those for which $||\psi||^2 = \langle \psi , \psi \rangle = 1$.

Eigenvalues and eigenvectors will be defined in the usual way: if $H(\psi) = \alpha \psi$ then we say that $\psi$ is an eigenvector for $H$ with associated eigenvalue $\alpha$. If there is a countable number of orthonormal eigenvectors whose “span” (allowing for infinite sums) includes every element of the vector space, then we say that $H$ has a complete orthonormal eigenbasis.

It is a good warm up exercise to show that if $H$ has a complete orthonormal eigenbasis with real eigenvalues then $H$ is Hermitian.

Hint: start with $\langle H(\psi ), \phi \rangle$ and expand $\psi$ and $\phi$ in terms of the eigenbasis; of course the linear operator $H$ has to commute with the infinite sum so there are convergence issues to be concerned about.

The outline goes something like this: suppose $\epsilon_i$ is the complete set of eigenvectors for $H$ with eigenvalues $a_i$ and $\psi = \sum_i b_i \epsilon_i$ and $\phi = \sum_i c_i \epsilon_i$
$\langle H(\psi ), \phi \rangle =\langle H(\sum_i b_i \epsilon_i ), \phi \rangle = \langle \sum_i H(b_i \epsilon_i ), \phi \rangle = \langle \sum_i b_i a_i \epsilon_i , \phi \rangle = \sum_i a_i\langle b_i \epsilon_i , \phi \rangle$

Now expand $\phi$ in the same way in the other slot of the inner product and use the fact that the basis vectors are mutually orthogonal and that the eigenvalues $a_i$ are real. Note: there are convergence issues here; those that relate the switching of the infinite sum notation outside of the inner product can be handled with a dominated convergence theorem for integrals. But the intuition taken from finite dimensional vector spaces works here.

The other thing to note is that not every Hermitian operator is “closed”; that is, it is possible for $\psi$ to be square integrable but for $H(\psi) = x \psi$ to not be square integrable.
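In finite dimensions the warm up exercise can be tried out directly. This is a sketch with made-up data: one random orthonormal basis, one set of real eigenvalues and, for contrast, one set containing a non-real eigenvalue. The helper names `from_eigen` and `defect` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 4
M = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
Q, _ = np.linalg.qr(M)                    # columns: orthonormal eigenvectors

def from_eigen(eigs):
    """Operator with eigenvectors Q[:, i] and eigenvalues eigs[i]."""
    return Q @ np.diag(eigs) @ Q.conj().T

H_real = from_eigen(np.array([1.0, -2.0, 0.5, 3.0]))
H_complex = from_eigen(np.array([1.0, 1j, 0.5, 3.0]))   # one non-real eigenvalue

psi = rng.normal(size=n) + 1j * rng.normal(size=n)
phi = rng.normal(size=n) + 1j * rng.normal(size=n)

def defect(H):
    """|<H psi, phi> - <psi, H phi>|, which is zero for a Hermitian H."""
    return abs(np.vdot(H @ psi, phi) - np.vdot(psi, H @ phi))

print(defect(H_real), defect(H_complex))
```

The first number is zero up to rounding; the second is not, which is why the eigenvalues have to be real for the exercise to work.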

Probability Density Functions

Let $H$ be a linear operator with a complete orthonormal eigenbasis $\epsilon_i$ and corresponding real eigenvalues $a_i$. Let $\psi$ be an element of the Hilbert space with unit norm and let $\psi = \sum_j b_j \epsilon_j$.

Claim: the function $P(y = a_i) = (|b_i|)^2$ is a probability density function. (note: $b_i = \langle \epsilon_i , \psi \rangle$).

The fact that $(|b_i|)^2 \leq 1$ follows easily from the Cauchy-Schwarz inequality. Also note that $1 = \langle \psi, \psi \rangle = \langle \sum_i b_i \epsilon_i,\sum_j b_j \epsilon_j \rangle = \sum_i \overline{b_i} b_i \langle \epsilon_i, \epsilon_i \rangle = \sum_i |b_i|^2$

Yes, I skipped some steps that are easy to fill in. But the bottom line is that this density function now has a (sometimes) finite expected value and a (sometimes) finite variance.

With the mathematical preliminaries (mostly) out of the way, we are ready to see how this applies to physics.