This blog isn’t about cosmology or about arguments over religion. But it is unusual to hear “on all but a set of measure zero” in the middle of a pop-science talk: (2:40-2:50)
One thing that surprised me about the professor’s job (at a non-research intensive school; we have a modest but real research requirement, but mostly we teach): I never knew how much time I’d spend doing tasks that have nothing to do with teaching and scholarship. Groan….how much of this do I tell our applicants who arrive on campus to interview? 🙂
But there is something mathematical that I want to talk about; it is a follow-up to this post. It has to do with what string theorists tell us: $1 + 2 + 3 + 4 + \dots = -\frac{1}{12}$. Needless to say, they are using a non-standard definition of “value of a series”.
Where I think the problem is: when we hear “series” we think of something related to the usual process of addition. Clearly, this non-standard assignment doesn’t relate to addition in the way we usually think about it.
So, it might make more sense to think of a “generalized series” as a map from the set of sequences of real numbers (that is, the infinite dimensional real vector space $\mathbb{R}^{\infty}$) to the real numbers; the usual “limit of partial sums” definition has some nice properties with respect to sequence addition, scalar multiplication and a “shift operation”, provided we restrict ourselves to a suitable collection of sequences (say, those whose associated series are absolutely convergent).
So, this “non-standard sum” can be thought of as a map $\phi: \mathbb{R}^{\infty} \rightarrow \mathbb{R}$ where $\phi(1, 2, 3, 4, \dots) = -\frac{1}{12}$. That is a bit less offensive than calling it a “sum”. 🙂
One “fun” math book is Knopp’s book Theory and Application of Infinite Series. I highly recommend it to anyone who frequently teaches calculus, or to talented, motivated calculus students.
One of the more interesting chapters in the book is on “divergent series”. If that sounds boring consider the following:
we all know that $\sum_{k=0}^{\infty} x^k = \frac{1}{1-x}$ when $|x| < 1$ and diverges elsewhere, PROVIDED one uses the “sequence of partial sums” definition of convergence of series. But, as Knopp points out, there are other definitions of convergence which leave all the convergent (by the usual definition) series convergent (to the same value) but also allow one to declare a larger set of series to be convergent.
Consider $1 - 1 + 1 - 1 + 1 - \dots$; of course this is a divergent geometric series by the usual definition. But note that if one uses the geometric series formula: $\sum_{k=0}^{\infty} x^k = \frac{1}{1-x}$
and substitutes $x = -1$, which IS in the domain of the right hand side (but NOT in the interval of convergence of the left hand side), one obtains $1 - 1 + 1 - 1 + \dots = \frac{1}{2}$.
Now this is nonsense unless we use a different definition of sum convergence, such as Cesàro summation: if $s_n = \sum_{k=0}^{n} a_k$ is the usual partial sum of the first $n + 1$ terms, then one declares the Cesàro sum of the series to be $\lim_{n \rightarrow \infty} \frac{1}{n+1} \sum_{k=0}^{n} s_k$, provided this limit exists (this is the arithmetic average of the partial sums).
So for our $1 - 1 + 1 - 1 + \dots$ we easily see that $s_n = 1$ for even $n$ and $s_n = 0$ for odd $n$, so for odd $n$ the average of $s_0, \dots, s_n$ is exactly $\frac{1}{2}$ and for even $n$ we get $\frac{\frac{n}{2}+1}{n+1}$, which tends to $\frac{1}{2}$ as $n$ tends to infinity.
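To see the averaging at work numerically, here is a quick sketch in plain Python (the function name is mine):

```python
# Cesàro summation: average the partial sums of 1 - 1 + 1 - 1 + ...
# The partial sums alternate 1, 0, 1, 0, ..., so their running average
# tends to 1/2 even though the series diverges in the usual sense.

def cesaro_mean(terms):
    """Average of the partial sums of the given list of terms."""
    partial, total = 0.0, 0.0
    for a in terms:
        partial += a           # s_n
        total += partial       # accumulate s_0 + ... + s_n
    return total / len(terms)

grandi = [(-1) ** k for k in range(100001)]   # 1, -1, 1, -1, ...
print(cesaro_mean(grandi))                    # close to 0.5
```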
Now, via Cesàro summation, we have this weird type of assignment: $1 - 1 + 1 - 1 + \dots \rightarrow \frac{1}{2}$.
But that won’t help with $1 + 2 + 3 + 4 + \dots$. But weirdly enough, string theorists find a way to assign this particular series a number! In fact, the number that they assign to this makes no sense at all: $-\frac{1}{12}$.
What the heck? Well, one way this is done is explained here:
Consider $\frac{1}{1-x} = \sum_{k=0}^{\infty} x^k$. Now differentiate term by term to get $\frac{1}{(1-x)^2} = \sum_{k=1}^{\infty} k x^{k-1}$ and now multiply both sides by $x$ to obtain $\frac{x}{(1-x)^2} = \sum_{k=1}^{\infty} k x^{k}$. This has a pole of order 2 at $x = 1$. But now substitute $x = e^{-z}$ and calculate the Laurent series about $z = 0$; the 0 order term turns out to be $-\frac{1}{12}$. Yes, this has applications in string theory!
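A numeric sanity check of that Laurent-series claim (plain Python; the function name is mine): with $x = e^{-z}$ the series sums to $\frac{e^{-z}}{(1-e^{-z})^2}$, and subtracting off the double pole $\frac{1}{z^2}$ should leave approximately $-\frac{1}{12}$ near $z = 0$.

```python
import math

# sum_{k>=1} k e^{-kz} = e^{-z} / (1 - e^{-z})^2 has a double pole at z = 0;
# removing the 1/z^2 pole exposes the constant term of the Laurent series.
def pole_removed(z):
    return math.exp(-z) / (1.0 - math.exp(-z)) ** 2 - 1.0 / z ** 2

print(pole_removed(0.01))   # approximately -1/12 = -0.08333...
```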
Now of course, if one uses the usual definitions of convergence, I played fast and loose with the usual intervals of convergence and when I could differentiate term by term. This theory is NOT the usual calculus theory.
Now if you want to see some “fun nonsense” applied to this (spot how many “errors” are made….it is a nice exercise):
What is going on: when one sums a series, one is really “assigning a value” to an object; think of this as a type of morphism of the set of series to the set of numbers. The usual definition of “sum of a series” is an especially nice morphism as it allows, WITH PRECAUTIONS, some nice algebraic operations in the domain (the set of series) to be carried over into the range. I say “with precautions” because of things like the following:
1. If one is talking about series of numbers, then one must have an absolutely convergent series for derangements of a given series to be assigned the same number. Example: it is well known that a conditionally convergent alternating series can be arranged to converge to any value of choice.
2. If one is talking about a series of functions (say, power series, where one sums things like $\sum_{k=0}^{\infty} a_k x^k$) one has to be in the OPEN interval of absolute convergence to justify term by term differentiation and integration; then of course a series is assigned a function rather than a number.
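Point 1 above can be seen computationally: a greedy rearrangement of the conditionally convergent alternating harmonic series (which normally sums to $\ln 2 \approx 0.693$) can be steered toward any target. A sketch, with the function name and the target value being my choices:

```python
# Rearrange 1 - 1/2 + 1/3 - 1/4 + ... to converge to an arbitrary target:
# add unused positive terms while below the target, negative terms while above.
def rearranged_sum(target, n_terms=200000):
    pos, neg, total = 1, 2, 0.0
    for _ in range(n_terms):
        if total <= target:
            total += 1.0 / pos   # next positive term: 1/1, 1/3, 1/5, ...
            pos += 2
        else:
            total -= 1.0 / neg   # next negative term: 1/2, 1/4, 1/6, ...
            neg += 2
    return total

print(rearranged_sum(0.9))   # close to 0.9, not ln 2
```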
So when one tries to go with a different notion of convergence, one must be extra cautious as to which operations in the domain space carry through under the “assignment morphism” and what the “equivalence classes” of a given series are (e. g. can a series be deranged and keep the same sum?)
I was blogging about the topic of how “classroom knowledge” turns into “walking around knowledge” and came across an “elementary physics misconceptions” webpage at the University of Montana. It is fun, but it helped me realize how easy things can be when one thinks mathematically.
This becomes very easy if one does a bit of mathematics. Let $m$ represent the mass of the object; $F = ma$ implies that $a = \frac{F}{m}$, which isn’t that important; we’ll just use the constant $a$. Now putting into vector form (taking the constant force to act in the $x$ direction, perpendicular to the initial velocity $v$ in the $y$ direction) we have $\vec{a}(t) = \langle a, 0 \rangle$. By elementary integration, obtain $\vec{v}(t) = \langle at, v \rangle$ and integrate again to obtain $\vec{r}(t) = \langle \frac{1}{2}at^2, vt \rangle$, which has parametric equations $x = \frac{1}{2}at^2, y = vt$, which has a “sideways parabola” $x = \frac{a}{2v^2}y^2$ as a graph.
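Assuming the scenario of a constant acceleration $a$ applied perpendicular to an initial velocity $v$ (my reading of the elided equations), integrating twice gives $x = \frac{1}{2}at^2$, $y = vt$, and the path is the sideways parabola $x = \frac{a}{2v^2}y^2$. A quick numerical check, with illustrative values of my choosing:

```python
# Constant acceleration a in x, initial speed v in y:
# x(t) = a t^2 / 2, y(t) = v t  =>  x = (a / (2 v^2)) y^2.
a, v = 2.0, 3.0   # illustrative values

def position(t):
    return 0.5 * a * t * t, v * t   # (x, y) at time t

for t in [0.5, 1.0, 2.0]:
    x, y = position(t)
    print(x, a * y * y / (2 * v * v))   # the two columns agree
```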
Let’s look at another example:
So what is going on? Force $= ma$: here the forces on the rocket are the (constant) thrust $T$ from the engine, the first term below, and the (constant) weight $mg$, which acts against the direction of the acceleration. So we have: $m\frac{dv}{dt} = T - mg$ which, upon integration, implies that $v(t) = v_0 + \left(\frac{T}{m} - g\right)t$, and so we see that the rocket continues to speed up at a constant acceleration.
These problems are easier with mathematics, aren’t they? 🙂
Via Jerry Coyne’s website; you’ll see some great comments there.
Watch standing waves in action:
Here is what is going on; the particles collect at the “stationary” points (the nodes of the standing wave).
This is an excellent reason to take a course that deals with Fourier Series!
Here is an example of a projection, and what happens when you take the image and move it a little.
This is a fun little post about the interplay between physics, mathematics and statistics (Brownian Motion).
Here is a teaser video:
The article itself has a nice animation showing the effects of a Poisson process: one will get some statistical clumping in areas rather than uniform spreading.
Treat yourself to the whole article; it is entertaining.
The purpose of this note is to give a bit of direction to the perplexed student.
I am not going to go into all the possible uses of eigenvalues, eigenvectors, eigenfunctions and the like; I will say that these are essential concepts in areas such as partial differential equations, advanced geometry and quantum mechanics:
Quantum mechanics, in particular, is a specific yet very versatile implementation of this scheme. (And quantum field theory is just a particular example of quantum mechanics, not an entirely new way of thinking.) The states are “wave functions,” and the collection of every possible wave function for some given system is “Hilbert space.” The nice thing about Hilbert space is that it’s a very restrictive set of possibilities (because it’s a vector space, for you experts); once you tell me how big it is (how many dimensions), you’ve specified your Hilbert space completely. This is in stark contrast with classical mechanics, where the space of states can get extraordinarily complicated. And then there is a little machine — “the Hamiltonian” — that tells you how to evolve from one state to another as time passes. Again, there aren’t really that many kinds of Hamiltonians you can have; once you write down a certain list of numbers (the energy eigenvalues, for you pesky experts) you are completely done.
So it is worth understanding the eigenvector/eigenfunction and eigenvalue concept.
First note: “eigen” is German for “self”; one should keep that in mind. That is part of the concept as we will see.
The next note: “eigenfunctions” really are a type of “eigenvector” so if you understand the latter concept at an abstract level, you’ll understand the former one.
The third note: if you are reading this, you are probably already familiar with some famous eigenfunctions! We’ll talk about some examples prior to giving the formal definition. This remark might sound cryptic at first (but hang in there), but remember when you learned $\frac{d}{dx} e^{ax} = a e^{ax}$? That is, you learned that the derivative of $e^{ax}$ is a scalar multiple of itself? (emphasis on SELF). So you already know that the function $e^{ax}$ is an eigenfunction of the “operator” $\frac{d}{dx}$ with eigenvalue $a$, because $a$ is that scalar multiple.
The basic concept of eigenvectors (eigenfunctions) and eigenvalues is really no more complicated than that. Let’s do another one from calculus:
the function $\sin(x)$ is an eigenfunction of the operator $\frac{d^2}{dx^2}$ with eigenvalue $-1$ because $\frac{d^2}{dx^2} \sin(x) = -\sin(x)$. That is, the function $\sin(x)$ is a scalar multiple of its second derivative. Can you think of more eigenfunctions for the operator $\frac{d^2}{dx^2}$?
Answer: $\cos(x)$ and $e^{x}$ are two others, if we only allow for non-zero eigenvalues (scalar multiples).
So hopefully you are seeing the basic idea: we have a collection of objects called vectors (can be traditional vectors or abstract ones such as differentiable functions) and an operator (linear transformation) that acts on these objects to yield a new object. In our example, the vectors were differentiable functions, and the operators were the derivative operators (the thing that “takes the derivative of” the function). An eigenvector (eigenfunction)-eigenvalue pair for that operator is a vector (function) that is transformed to a scalar multiple of itself by the operator; e. g., the derivative operator takes $e^{ax}$ to $a e^{ax}$, which is a scalar multiple of the original function.
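The “scalar multiple of itself” property can even be checked numerically with a finite difference (a small sketch; the names are mine):

```python
import math

# f(x) = e^{2x} should be an eigenfunction of d/dx with eigenvalue 2:
# its numerical derivative divided by f itself should be ~2 everywhere.
def derivative(f, x, h=1e-6):
    return (f(x + h) - f(x - h)) / (2 * h)   # central difference

f = lambda x: math.exp(2 * x)
for x0 in [-1.0, 0.0, 0.7]:
    print(derivative(f, x0) / f(x0))   # approximately 2 each time
```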
We will give the abstract, formal definition. Then we will follow it with some examples and hints on how to calculate.
First we need the setting. We start with a set of objects called “vectors” and “scalars”; the usual rules of arithmetic (addition, multiplication, subtraction, division, distributive property) hold for the scalars, there is a type of addition for the vectors, and the scalars and the vectors “work together” in the intuitive way. Example: in the set of, say, differentiable functions, the scalars will be real numbers and we have rules such as $a(f + g) = af + ag$, $(a + b)f = af + bf$, etc. We could also use real numbers for scalars and, say, three dimensional vectors such as $\langle a_1, a_2, a_3 \rangle$. More formally, we start with a vector space (sometimes called a linear space) which is defined as a set of vectors and scalars which obey the vector space axioms.
Now, we need a linear transformation, which is sometimes called a linear operator. A linear transformation (or operator) is a function $T$ that obeys the following laws: $T(\vec{v} + \vec{w}) = T(\vec{v}) + T(\vec{w})$ and $T(a\vec{v}) = aT(\vec{v})$. Note that I am using $\vec{v}, \vec{w}$ to denote the vectors and the undecorated variable $a$ to denote the scalars. Also note that this linear transformation might take one vector space to a different vector space.
Common linear transformations (and there are many others!) and their eigenvectors and eigenvalues.
Consider the vector space of two-dimensional vectors with real numbers as scalars. We can create a linear transformation by matrix multiplication: $T(\vec{v}) = A\vec{v}$ for a fixed 2 by 2 matrix $A$.
(note: $\begin{pmatrix} x \\ y \end{pmatrix}$ is the transpose of the row vector $(x, y)$; we need to use a column vector for the usual rules of matrix multiplication to apply).
It is easy to check that the operation of matrix multiplying a vector on the left by an appropriate matrix yields a linear transformation.
Here is a concrete example: let $A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$, so $T\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 2x + y \\ x + 2y \end{pmatrix}$.
So, does this linear transformation HAVE non-zero eigenvectors and eigenvalues? (not every one does).
Let’s see if we can find the eigenvectors and eigenvalues, provided they exist at all.
For $\vec{v}$ to be an eigenvector for $T$, remember that $A\vec{v} = \lambda \vec{v}$ for some real number $\lambda$.
So, using the matrix we get: $\begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix} \vec{v} = \lambda \vec{v}$. So doing some algebra (subtracting the vector on the right hand side from both sides) we obtain $\begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix} \vec{v} - \lambda \vec{v} = \vec{0}$
At this point it is tempting to try to use a distributive law to factor out $\vec{v}$ from the left side. But, while the expression makes sense prior to factoring, it wouldn’t AFTER factoring, as we’d be subtracting a scalar number $\lambda$ from a 2 by 2 matrix! But there is a way out of this: one can insert the 2 x 2 identity matrix $I$ to the left of the second term of the left hand side: $\begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix} \vec{v} - \lambda I \vec{v} = \vec{0}$
Notice that by doing this, we haven’t changed anything except that now we can factor out that vector; this would leave: $\left( \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix} - \lambda I \right) \vec{v} = \vec{0}$
Which leads to: $\begin{pmatrix} 2 - \lambda & 1 \\ 1 & 2 - \lambda \end{pmatrix} \vec{v} = \vec{0}$
Now we use a fact from linear algebra: if $\vec{v}$ is not the zero vector, we have a non-zero matrix times a non-zero vector yielding the zero vector. This means that the matrix is singular. In linear algebra class, you learn that singular matrices have determinant equal to zero. This means that $(2 - \lambda)^2 - 1 = (\lambda - 1)(\lambda - 3) = 0$, which means that $\lambda = 1, \lambda = 3$ are the respective eigenvalues. Note: when we do this procedure with any 2 by 2 matrix, we always end up with a quadratic with $\lambda$ as the variable; if this quadratic has real roots then the linear transformation (or matrix) has real eigenvalues. If it doesn’t have real roots, the linear transformation (or matrix) doesn’t have non-zero real eigenvalues.
Now to find the associated eigenvectors: if we start with $\lambda = 1$ we get $\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} \vec{v} = \vec{0}$
which has solution $\vec{v} = \begin{pmatrix} 1 \\ -1 \end{pmatrix}$ (up to scalar multiple). So that is the eigenvector associated with eigenvalue 1.
If we next try $\lambda = 3$ we get $\begin{pmatrix} -1 & 1 \\ 1 & -1 \end{pmatrix} \vec{v} = \vec{0}$
which has solution $\vec{v} = \begin{pmatrix} 1 \\ 1 \end{pmatrix}$ (up to scalar multiple). So that is the eigenvector associated with the eigenvalue 3.
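If you want to check such a computation by machine, `numpy` will do it; here I use the symmetric matrix $\begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$, a standard example whose eigenvalues are 1 and 3 (the specific matrix is my illustrative choice):

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 2.0]])
vals, vecs = np.linalg.eig(A)            # eigenvalues and column eigenvectors
order = np.argsort(vals)                 # sort, since eig fixes no order
vals, vecs = vals[order], vecs[:, order]

print(vals)                              # [1. 3.]
for lam, v in zip(vals, vecs.T):
    print(np.allclose(A @ v, lam * v))   # A v = lambda v holds for each pair
```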
In the general “k-dimensional vector space” case, the recipe for finding the eigenvectors and eigenvalues is the same.
1. Find the matrix for the linear transformation.
2. Form the matrix $A - \lambda I$, which is the same as matrix $A$ except that you have subtracted $\lambda$ from each diagonal entry.
3. Note that $\det(A - \lambda I)$ is a polynomial in the variable $\lambda$; find its roots $\lambda_1, \lambda_2, \dots, \lambda_k$. These will be the eigenvalues.
4. Start with $\lambda_1$. Substitute this into the matrix-vector equation $(A - \lambda_1 I)\vec{v} = \vec{0}$ and solve for $\vec{v}$. That will be the eigenvector associated with the first eigenvalue. Do this for each eigenvalue, one at a time. Note: you can get up to $k$ “linearly independent” eigenvectors in this manner; that will be all of them.
Yes, this should work “in theory” but practically speaking, there are many challenges. For one: for equations of degree 5 or higher, it is known that there is no formula that will find the roots for every equation of that degree (Galois proved this; this is a good reason to take an abstract algebra course!). Hence one must use a numerical method of some sort. Also, calculation of the determinant involves many round-off error-inducing calculations; hence sometimes one must use sophisticated numerical techniques to get the eigenvalues (a good reason to take a numerical analysis course!)
Consider a calculus/differential equation related case of eigenvectors (eigenfunctions) and eigenvalues.
Our vectors will be, say, infinitely differentiable functions and our scalars will be real numbers. We will define the operator (linear transformation) $D^n = \frac{d^n}{dx^n}$, that is, the process that takes the n’th derivative of a function. You learned that the sum of the derivatives is the derivative of the sums and that you can pull out a constant when you differentiate. Hence $D^n$ is a linear operator (transformation); we use the term “operator” when we talk about the vector space of functions, but it is really just a type of linear transformation.
We can also use these operators to form new operators; that is, things like $a_2 D^2 + a_1 D + a_0$. We see that such “linear combinations” of linear operators are linear operators.
So, what does it mean to find eigenvectors and eigenvalues of such beasts?
Suppose we wish to find the eigenvectors and eigenvalues of $D^2$. An eigenvector is a twice differentiable function $f$ (ok, we said “infinitely differentiable”) such that $D^2 f = \lambda f$, or $f'' = \lambda f$, which means $f'' - \lambda f = 0$. You might recognize this from your differential equations class; the only “tweak” is that we don’t know what $\lambda$ is. But if you had a differential equations class, you’d recognize that the solution to this differential equation depends on the roots of the characteristic equation $r^2 - \lambda = 0$, which has solutions $r = \pm \sqrt{\lambda}$, and the solution takes the form $f = c_1 e^{\sqrt{\lambda} x} + c_2 e^{-\sqrt{\lambda} x}$ if the roots are real and distinct ($\lambda > 0$), $f = c_1 \cos(\sqrt{-\lambda}\, x) + c_2 \sin(\sqrt{-\lambda}\, x)$ if the roots are complex conjugates ($\lambda < 0$) and $f = c_1 + c_2 x$ if there is a real, repeated root ($\lambda = 0$). In any event, those functions are the eigenfunctions and these very much depend on the eigenvalues.
Of course, reading this little note won’t make you an expert, but it should get you started on studying.
Suppose we are trying to solve the following partial differential equation: $\frac{\partial u}{\partial t} = k \frac{\partial^2 u}{\partial x^2}$
subject to boundary conditions: $u(0, t) = u(L, t) = 0$ and $u(x, 0) = f(x)$.
It turns out that we will be using techniques from ordinary differential equations and concepts from linear algebra; these might be confusing at first.
The first thing to note is that this differential equation (the so-called heat equation) is known to satisfy a “uniqueness property” in that if one obtains a solution that meets the boundary criteria, the solution is unique. Hence we can attempt to find a solution in any way we choose; if we find it, we don’t have to wonder if there is another one lurking out there.
So one technique that is often useful is to try: let $u(x, t) = X(x)T(t)$ where $X$ is a function of $x$ alone and $T$ is a function of $t$ alone. Then when we substitute into the partial differential equation we obtain: $X(x)T'(t) = k X''(x)T(t)$
which leads to $\frac{T'(t)}{k T(t)} = \frac{X''(x)}{X(x)}$
The next step is to note that the left hand side does NOT depend on $x$; it is a function of $t$ alone. The right hand side does not depend on $t$ as it is a function of $x$ alone. But the two sides are equal; hence neither side can depend on $x$ or $t$; they must be constant.
Hence we have $\frac{T'}{k T} = \frac{X''}{X} = \lambda$ for some constant $\lambda$.
So far, so good. But then you are told that $\lambda$ is an eigenvalue. What is that about?
The thing to notice is that $X'' = \lambda X$ and $T' = \lambda k T$
First, the equation in $T$ can be written as $DT = \lambda k T$, with the operator $D$ denoting the first derivative. Then the second can be written as $D^2 X = \lambda X$, where $D^2$ denotes the second derivative operator. Recall from linear algebra that these operators meet the requirements for a linear transformation if the vector space is the set of all functions that are “differentiable enough”. So what we are doing, in effect, is trying to find eigenvectors for these operators.
So in this sense, solving a homogeneous differential equation is really solving an eigenvector problem; often this is termed the “eigenfunction” problem.
Note that the differential equations are not difficult to solve:
$T = Ce^{\lambda k t}$; the real valued form of the solution of the equation in $X$ depends on whether $\lambda$ is positive, zero or negative.
But the point is that we are merely solving a constant coefficient differential equation just as we did in our elementary differential equations course with one important difference: we don’t know what the constant (the eigenvalue) is.
Now if we turn to the boundary conditions on $X$ (namely $X(0) = X(L) = 0$), we see that a solution of the form $X = c_1 e^{\sqrt{\lambda} x} + c_2 e^{-\sqrt{\lambda} x}$ (the $\lambda > 0$ case) cannot meet the zero at the boundaries conditions; we can rule out the $\lambda = 0$ condition ($X = c_1 + c_2 x$) as well.
Hence we know that $\lambda$ is negative and we get the solution $X = c_1 \cos(\sqrt{-\lambda}\, x) + c_2 \sin(\sqrt{-\lambda}\, x)$ and then the $T = Ce^{\lambda k t}$ solution.
But now we notice that these solutions have a $\lambda$ in them; this is what makes these ordinary differential equations into an “eigenvalue/eigenfunction” problem.
So what values of $\lambda$ will work? We know it is negative so we say $\lambda = -w^2$. If we look at the end conditions and note that $e^{\lambda k t}$ is never zero, we see that the cosine term must vanish ($c_1 = 0$) and we can ensure that $\sin(wL) = 0$, which implies that $w = \frac{n\pi}{L}$. So we get a whole host of functions: $X_n = \sin\left(\frac{n \pi x}{L}\right)$.
Now we still need to meet the last condition (the one set at $t = 0$: $u(x, 0) = f(x)$) and that is where Fourier analysis comes in. Because the equation was linear, we can add the solutions and get another solution; hence the remaining condition is met by taking the Fourier expansion for the function $f(x)$ in terms of sines.
The coefficients are $b_n = \frac{2}{L} \int_0^L f(x) \sin\left(\frac{n \pi x}{L}\right) dx$ and the solution is: $u(x, t) = \sum_{n=1}^{\infty} b_n \sin\left(\frac{n \pi x}{L}\right) e^{-k \left(\frac{n \pi}{L}\right)^2 t}$
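As a numerical sanity check, here is a sketch in Python (the initial condition $f(x) = x(L - x)$ and all constants are illustrative choices of mine): compute the sine coefficients by a simple midpoint-rule integral and verify that the truncated series reproduces $f$ at $t = 0$ and vanishes at the boundary.

```python
import math

L, k = math.pi, 1.0
f = lambda x: x * (L - x)     # illustrative initial temperature profile

def b(n, steps=2000):
    """b_n = (2/L) * integral_0^L f(x) sin(n pi x / L) dx, midpoint rule."""
    h = L / steps
    return (2.0 / L) * h * sum(
        f((i + 0.5) * h) * math.sin(n * math.pi * (i + 0.5) * h / L)
        for i in range(steps))

coeffs = [b(n) for n in range(1, 40)]

def u(x, t):
    """Truncated separated-variables series solution."""
    return sum(bn * math.sin(n * math.pi * x / L)
               * math.exp(-k * (n * math.pi / L) ** 2 * t)
               for n, bn in enumerate(coeffs, start=1))

print(u(1.0, 0.0), f(1.0))    # truncated series matches the initial condition
print(u(0.0, 0.3))            # stays 0 at the boundary
```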
I feel a bit guilty as I haven’t gone over an example of how one might work out a problem. So here goes:
Suppose our potential function is some sort of energy well: $V(x) = 0$ for $0 \le x \le \pi$ and $V(x) = \infty$ elsewhere.
Note: I am too lazy to keep writing $\hbar = \frac{h}{2\pi}$ so I am going with $h$ for now.
So, we have the two Schrödinger equations, with $\Psi$ being the state vector and $\psi_k$ being one of the stationary states: I. $ih \frac{\partial \Psi}{\partial t} = -\frac{h^2}{2m} \frac{\partial^2 \Psi}{\partial x^2} + V(x) \Psi$, II. $-\frac{h^2}{2m} \frac{d^2 \psi_k}{dx^2} + V(x) \psi_k = e_k \psi_k$
Where the $e_k$ are the eigenvalues for the stationary states $\psi_k$.
Now apply the potential for $0 < x < \pi$ and the equations become: I. $ih \frac{\partial \Psi}{\partial t} = -\frac{h^2}{2m} \frac{\partial^2 \Psi}{\partial x^2}$, II. $-\frac{h^2}{2m} \frac{d^2 \psi_k}{dx^2} = e_k \psi_k$
Yes, I know that equation II is a consequence of equation I.
Now we use a fact from partial differential equations: the first equation is really a form of the “diffusion” or “heat” equation; it has been shown that once one takes boundary conditions into account, the equation possesses a unique solution. Hence if we find a solution by any means necessary, we don’t have to worry about other solutions being out there.
So attempt a solution of the form $\psi_k = T(t) \phi_k(x)$ where the first factor is a function of $t$ alone and the second is of $x$ alone.
Now put $\psi_k = T \phi_k$ into the second equation: $-\frac{h^2}{2m} T \phi_k'' = e_k T \phi_k$
Now assume $T \neq 0$ and divide both sides by $T \phi_k$ and do a little algebra to obtain: $-\frac{h^2}{2m} \frac{\phi_k''}{\phi_k} = e_k$
The $e_k$ are the eigenvalues for the stationary states; assume that these are positive and we obtain: $\phi_k = a \cos\left(\frac{\sqrt{2 m e_k}}{h} x\right) + b \sin\left(\frac{\sqrt{2 m e_k}}{h} x\right)$
from our knowledge of elementary differential equations.
Now for $x = 0$ we have $\phi_k(0) = a$. Our particle is in our well and we can’t have it at values of $x$ below 0; hence $a = 0$. Now $\phi_k = b \sin\left(\frac{\sqrt{2 m e_k}}{h} x\right)$.
We want zero at $x = \pi$ so $\sin\left(\frac{\sqrt{2 m e_k}}{h} \pi\right) = 0$, which means $\frac{\sqrt{2 m e_k}}{h} = k$, that is, $e_k = \frac{k^2 h^2}{2m}$.
Now let’s look at the first Schrödinger equation: $ih \frac{\partial \Psi}{\partial t} = -\frac{h^2}{2m} \frac{\partial^2 \Psi}{\partial x^2}$
This gives the equation: $ih T' \phi_k = -\frac{h^2}{2m} T \phi_k''$
Note: in partial differential equations, it is customary to note that the left side of the equation is a function of $t$ alone and therefore independent of $x$, and that the right hand side is a function of $x$ alone and therefore independent of $t$; since these sides are equal they must be independent of both $x$ and $t$ and therefore constant. But in our case, we already know that $-\frac{h^2}{2m} \frac{\phi_k''}{\phi_k} = e_k$. So our equation involving $T$ becomes $ih \frac{T'}{T} = e_k$ so our differential equation becomes $T' = -\frac{i e_k}{h} T$
which has the solution $T = C e^{-i \frac{e_k}{h} t}$
So our solution is $\psi_k = b \sin\left(\frac{\sqrt{2 m e_k}}{h} x\right) e^{-i \frac{e_k}{h} t}$ where $e_k = \frac{k^2 h^2}{2m}$.
This becomes $\psi_k = b \sin(k x)\, e^{-i \frac{k^2 h}{2m} t}$ which, written in rectangular complex coordinates, is $\psi_k = b \sin(k x)\left(\cos\left(\frac{k^2 h}{2m} t\right) - i \sin\left(\frac{k^2 h}{2m} t\right)\right)$
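One property worth verifying: the spatial parts $\sin(kx)$ of distinct stationary states are orthogonal on $[0, \pi]$, which is what lets one expand a general state in these eigenfunctions. A quick midpoint-rule check in Python (names mine):

```python
import math

# integral_0^pi sin(j x) sin(k x) dx = 0 for j != k, and pi/2 for j == k.
def overlap(j, k, steps=4000):
    h = math.pi / steps
    return h * sum(math.sin(j * (i + 0.5) * h) * math.sin(k * (i + 0.5) * h)
                   for i in range(steps))

print(overlap(1, 2))   # approximately 0
print(overlap(3, 3))   # approximately pi/2 = 1.5707...
```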
Here are some graphs: we plot the real part of the stationary state vector, $b \sin(k x) \cos\left(\frac{k^2 h}{2m} t\right)$, for a few values of $k$ and $t$.
Up to now, I’ve used mathematical notation for state vectors, inner products and operators. However, physicists use something called “Dirac” notation (“bras” and “kets”) which we will now discuss.
Recall: our vectors are square integrable functions $\psi$, that is, functions where $\int_{-\infty}^{\infty} |\psi|^2 \, dx$ converges.
Our inner product is: $\langle \psi, \phi \rangle = \int_{-\infty}^{\infty} \overline{\psi} \, \phi \, dx$
Here is the Dirac notation version of this:
A “ket” can be thought of as the vector $\psi$ itself. Of course, there is an easy vector space isomorphism (Hilbert space isomorphism really) between the vector space of state vectors and kets given by $\psi \rightarrow | \psi \rangle$. The kets are denoted by $| \psi \rangle$.
Similarly there are the “bra” vectors which are “dual” to the “kets”; these are denoted by $\langle \psi |$ and the vector space isomorphism is given by $\psi \rightarrow \langle \overline{\psi} |$. I chose this isomorphism because in the bra vector space, $a \langle \overline{\psi} | = \langle \overline{a \psi} |$. Then there is a vector space isomorphism between the bras and the kets given by $| \psi \rangle \rightarrow \langle \overline{\psi} |$.
Now $\langle \psi | \phi \rangle$ is the inner product; that is, $\langle \psi | \phi \rangle = \int_{-\infty}^{\infty} \overline{\psi} \, \phi \, dx$
By convention: if $A$ is a linear operator, $A | \phi \rangle = | A \phi \rangle$ and $\langle \psi | A | \phi \rangle = \langle \psi, A \phi \rangle$. Now if $A$ is a Hermitian operator (the ones that correspond to observables are), then $\langle A \psi, \phi \rangle = \langle \psi, A \phi \rangle$, so there is no ambiguity in writing $\langle \psi | A | \phi \rangle$.
This leads to the following: let $A$ be an operator corresponding to an observable with eigenvectors $| \psi_k \rangle$ and eigenvalues $a_k$. Let $| \phi \rangle$ be a state vector.
Then $| \phi \rangle = \sum_k \langle \psi_k | \phi \rangle | \psi_k \rangle$, and if $Y$ is a random variable corresponding to the observed value of the observable, then $P(Y = a_k) = |\langle \psi_k | \phi \rangle|^2$ and the expectation is $E(Y) = \langle \phi | A | \phi \rangle = \sum_k a_k |\langle \psi_k | \phi \rangle|^2$.
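A finite-dimensional sketch ties this together (Python with numpy; the observable matrix and state are illustrative choices of mine): the probabilities $|\langle \psi_k | \phi \rangle|^2$ sum to 1, and the expectation computed as $\langle \phi | A | \phi \rangle$ agrees with $\sum_k a_k P(Y = a_k)$.

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 2.0]])   # a Hermitian "observable" (my choice)
vals, vecs = np.linalg.eigh(A)           # eigenvalues a_k, orthonormal |psi_k>

phi = np.array([1.0, 2.0])
phi = phi / np.linalg.norm(phi)          # normalized state |phi>

amps = vecs.T @ phi                      # amplitudes <psi_k | phi>
probs = np.abs(amps) ** 2                # P(Y = a_k)
print(probs.sum())                       # probabilities sum to 1

expectation = phi @ A @ phi              # <phi| A |phi>
print(expectation, (probs * vals).sum()) # the two computations agree
```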