College Math Teaching

May 26, 2012

Eigenvalues, Eigenvectors, Eigenfunctions and all that….

The purpose of this note is to give a bit of direction to the perplexed student.

I am not going to go into all the possible uses of eigenvalues, eigenvectors, eigenfunctions and the like; I will say that these are essential concepts in areas such as partial differential equations, advanced geometry and quantum mechanics:

Quantum mechanics, in particular, is a specific yet very versatile implementation of this scheme. (And quantum field theory is just a particular example of quantum mechanics, not an entirely new way of thinking.) The states are “wave functions,” and the collection of every possible wave function for some given system is “Hilbert space.” The nice thing about Hilbert space is that it’s a very restrictive set of possibilities (because it’s a vector space, for you experts); once you tell me how big it is (how many dimensions), you’ve specified your Hilbert space completely. This is in stark contrast with classical mechanics, where the space of states can get extraordinarily complicated. And then there is a little machine — “the Hamiltonian” — that tells you how to evolve from one state to another as time passes. Again, there aren’t really that many kinds of Hamiltonians you can have; once you write down a certain list of numbers (the energy eigenvalues, for you pesky experts) you are completely done.

(emphasis mine).

So it is worth understanding the eigenvector/eigenfunction and eigenvalue concept.

First note: “eigen” is German for “self”; one should keep that in mind. That is part of the concept as we will see.

The next note: “eigenfunctions” really are a type of “eigenvector” so if you understand the latter concept at an abstract level, you’ll understand the former one.

The third note: if you are reading this, you are probably already familiar with some famous eigenfunctions! We’ll talk about some examples prior to giving the formal definition. This remark might sound cryptic at first (but hang in there), but remember when you learned $\frac{d}{dx} e^{ax} = ae^{ax}$? That is, you learned that the derivative of $e^{ax}$ is a scalar multiple of itself? (emphasis on SELF). So you already know that the function $e^{ax}$ is an eigenfunction of the “operator” $\frac{d}{dx}$ with eigenvalue $a$ because that is the scalar multiple.

The basic concept of eigenvectors (eigenfunctions) and eigenvalues is really no more complicated than that. Let’s do another one from calculus:
the function $\sin(wx)$ is an eigenfunction of the operator $\frac{d^2}{dx^2}$ with eigenvalue $-w^2$ because $\frac{d^2}{dx^2} \sin(wx) = -w^2\sin(wx)$. That is, the second derivative of $\sin(wx)$ is a scalar multiple of $\sin(wx)$ itself. Can you think of more eigenfunctions for the operator $\frac{d^2}{dx^2}$?

Answer: $\cos(wx)$ (with eigenvalue $-w^2$) and $e^{ax}$ (with eigenvalue $a^2$) are two others, if we only allow for non-zero eigenvalues (scalar multiples).

So hopefully you are seeing the basic idea: we have a collection of objects called vectors (can be traditional vectors or abstract ones such as differentiable functions) and an operator (linear transformation) that acts on these objects to yield a new object. In our example, the vectors were differentiable functions, and the operators were the derivative operators (the thing that “takes the derivative of” the function). An eigenvector (eigenfunction) for that operator is a non-zero vector (function) that the operator transforms into a scalar multiple of itself, and that scalar is the associated eigenvalue; e.g., the derivative operator takes $e^{ax}$ to $ae^{ax}$, which is a scalar multiple of the original function.
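If you would like to see the two calculus examples checked numerically, here is a minimal sketch. It is only an illustration: the helper names `derivative` and `second_derivative`, the step sizes and the sample values of $a$, $w$ and the test point are my own assumptions, and the derivatives are approximated by central differences rather than computed exactly.

```python
import math

def derivative(f, x, h=1e-5):
    """Central-difference approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def second_derivative(f, x, h=1e-4):
    """Central-difference approximation to f''(x)."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / (h * h)

# Arbitrary illustrative values (not from the text).
a, w, x0 = 2.0, 3.0, 0.7

def exp_ax(x):
    return math.exp(a * x)

def sin_wx(x):
    return math.sin(w * x)

# If d/dx e^{ax} = a e^{ax}, then (derivative / function) recovers a.
ratio_first = derivative(exp_ax, x0) / exp_ax(x0)

# If d^2/dx^2 sin(wx) = -w^2 sin(wx), the same ratio recovers -w^2.
ratio_second = second_derivative(sin_wx, x0) / sin_wx(x0)

print(ratio_first)   # approximately a
print(ratio_second)  # approximately -w**2
```

The point of dividing by the function value is that an eigenfunction gives the same ratio (the eigenvalue) at every test point where the function is non-zero; try changing `x0` to see this.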

Formal Definition
We will give the abstract, formal definition. Then we will follow it with some examples and hints on how to calculate.

First we need the setting. We start with a set of objects called “vectors” and a set of “scalars”; the usual rules of arithmetic (addition, multiplication, subtraction, division, distributive property) hold for the scalars, there is a type of addition for the vectors, and the scalars and the vectors “work together” in the intuitive way. Example: in the set of, say, differentiable functions, the scalars will be real numbers and we have rules such as $a (f + g) =af + ag$, etc. We could also use real numbers for scalars and, say, three dimensional vectors such as $[a, b, c]$. More formally, we start with a vector space (sometimes called a linear space), which is defined as a set of vectors and scalars which obey the vector space axioms.

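Now, we need a linear transformation, which is sometimes called a linear operator. A linear transformation (or operator) is a function $L$ that obeys the following laws: $L(\vec{v} + \vec{w}) = L(\vec{v}) + L(\vec{w} )$ and $L(a\vec{v}) = aL(\vec{v})$. Note that I am using $\vec{v}$ to denote the vectors and the undecorated variable to denote the scalars. Also note that, in general, a linear transformation $L$ might take one vector space to a different vector space; to talk about eigenvectors, we need $L$ to take a vector space to itself. A non-zero vector $\vec{v}$ is then called an eigenvector of $L$ with eigenvalue $\lambda$ if $L(\vec{v}) = \lambda \vec{v}$ for some scalar $\lambda$.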

Common linear transformations (and there are many others!) and their eigenvectors and eigenvalues.
Consider the vector space of two-dimensional vectors with real numbers as scalars. We can create a linear transformation by matrix multiplication:

$L([x,y]^T) = \left[ \begin{array}{cc} a & b \\ c & d \end{array} \right] \left[ \begin{array}{c} x \\ y \end{array} \right]=\left[ \begin{array}{c} ax+ by \\ cx+dy \end{array} \right]$ (note: $[x,y]^T$ is the transpose of the row vector; we need to use a column vector for the usual rules of matrix multiplication to apply).

It is easy to check that multiplying a vector on the left by an appropriate matrix yields a linear transformation.
Here is a concrete example: $L([x,y]^T) = \left[ \begin{array}{cc} 1 & 2 \\ 0 & 3 \end{array} \right] \left[ \begin{array}{c} x \\ y \end{array} \right]=\left[ \begin{array}{c} x+ 2y \\ 3y \end{array} \right]$
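As a quick sanity check, the two linearity laws can be spot-checked for this concrete $L$. This is a minimal sketch: the helper names `add` and `scale` and the test values are my own choices, and passing a spot check at a few values is of course not a proof of linearity.

```python
def L(v):
    """Apply the matrix [[1, 2], [0, 3]] to the column vector v = (x, y)."""
    x, y = v
    return (x + 2 * y, 3 * y)

def add(v, w):
    """Vector addition in two dimensions."""
    return (v[0] + w[0], v[1] + w[1])

def scale(a, v):
    """Multiply the vector v by the scalar a."""
    return (a * v[0], a * v[1])

# Arbitrary integer test values, so the comparisons below are exact.
v, w, a = (1, 4), (-2, 5), 7

# Additivity: L(v + w) = L(v) + L(w)
additive = L(add(v, w)) == add(L(v), L(w))

# Homogeneity: L(a v) = a L(v)
homogeneous = L(scale(a, v)) == scale(a, L(v))

print(additive, homogeneous)
```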

So, does this linear transformation HAVE non-zero eigenvectors and eigenvalues? (not every one does).
Let’s see if we can find the eigenvectors and eigenvalues, provided they exist at all.

For $[x,y]^T$ to be an eigenvector for $L$, we need $L([x,y]^T) = \lambda [x,y]^T$ for some real number $\lambda$.

So, using the matrix we get: $L([x,y]^T) = \left[ \begin{array}{cc} 1 & 2 \\ 0 & 3 \end{array} \right] \left[ \begin{array}{c} x \\ y \end{array} \right]= \lambda \left[ \begin{array}{c} x \\ y \end{array} \right]$. So doing some algebra (subtracting the vector on the right hand side from both sides) we obtain $\left[ \begin{array}{cc} 1 & 2 \\ 0 & 3 \end{array} \right] \left[ \begin{array}{c} x \\ y \end{array} \right] - \lambda \left[ \begin{array}{c} x \\ y \end{array} \right] = \left[ \begin{array}{c} 0 \\ 0 \end{array} \right]$

At this point it is tempting to try to use a distributive law to factor out $\left[ \begin{array}{c} x \\ y \end{array} \right]$ from the left side. But, while the expression makes sense prior to factoring, it wouldn’t AFTER factoring as we’d be subtracting a scalar number from a 2 by 2 matrix! But there is a way out of this: one can then insert the 2 x 2 identity matrix to the left of the second term of the left hand side:
$\left[ \begin{array}{cc} 1 & 2 \\ 0 & 3 \end{array} \right] \left[ \begin{array}{c} x \\ y \end{array} \right] - \lambda\left[ \begin{array}{cc} 1 & 0 \\ 0 & 1 \end{array} \right] \left[ \begin{array}{c} x \\ y \end{array} \right] = \left[ \begin{array}{c} 0 \\ 0 \end{array} \right]$

Notice that by doing this, we haven’t changed anything except now we can factor out that vector; this would leave:
$(\left[ \begin{array}{cc} 1 & 2 \\ 0 & 3 \end{array} \right] - \lambda\left[ \begin{array}{cc} 1 & 0 \\ 0 & 1 \end{array} \right] )\left[ \begin{array}{c} x \\ y \end{array} \right] = \left[ \begin{array}{c} 0 \\ 0 \end{array} \right]$

$(\left[ \begin{array}{cc} 1-\lambda & 2 \\ 0 & 3-\lambda \end{array} \right] ) \left[ \begin{array}{c} x \\ y \end{array} \right] = \left[ \begin{array}{c} 0 \\ 0 \end{array} \right]$

Now we use a fact from linear algebra: if $[x,y]^T$ is not the zero vector, we have a matrix times a non-zero vector yielding the zero vector. This means that the matrix is singular. In linear algebra class, you learn that singular matrices have determinant equal to zero. This means that $(1-\lambda)(3-\lambda) = 0$, which means that $\lambda = 1$ and $\lambda = 3$ are the eigenvalues. Note: when we do this procedure with any 2 by 2 matrix, we always end up with a quadratic with $\lambda$ as the variable; if this quadratic has real roots then the linear transformation (or matrix) has real eigenvalues. If it doesn’t have real roots, the linear transformation (or matrix) has no real eigenvalues.
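To make the remark about the quadratic concrete, here is the computation for a general 2 by 2 matrix (a worked sketch, using the same entry names $a, b, c, d$ as in the general matrix earlier):

$\det \left[ \begin{array}{cc} a-\lambda & b \\ c & d-\lambda \end{array} \right] = (a-\lambda)(d-\lambda) - bc = \lambda^2 - (a+d)\lambda + (ad-bc)$

By the quadratic formula, the roots are real exactly when $(a+d)^2 - 4(ad-bc) \geq 0$. For our example ($a=1, b=2, c=0, d=3$) the quadratic is $\lambda^2 - 4\lambda + 3 = (\lambda-1)(\lambda-3)$, which agrees with the factored form we found. By contrast, the matrix with $a=0, b=-1, c=1, d=0$ (rotation by 90 degrees) gives $\lambda^2 + 1$, which has no real roots.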

Now to find the associated eigenvectors: if we start with $\lambda = 1$ we get
$\left[ \begin{array}{cc} 0 & 2 \\ 0 & 2 \end{array} \right] \left[ \begin{array}{c} x \\ y \end{array} \right] = \left[ \begin{array}{c} 0 \\ 0 \end{array} \right]$ which has the non-zero solution $\left[ \begin{array}{c} x \\ y \end{array} \right] = \left[ \begin{array}{c} 1 \\ 0 \end{array} \right]$ (any non-zero scalar multiple works as well). So that is an eigenvector associated with eigenvalue 1.
If we next try $\lambda = 3$ we get
$\left[ \begin{array}{cc} -2 & 2 \\ 0 & 0 \end{array} \right] \left[ \begin{array}{c} x \\ y \end{array} \right] = \left[ \begin{array}{c} 0 \\ 0 \end{array} \right]$ which has the non-zero solution $\left[ \begin{array}{c} x \\ y \end{array} \right] = \left[ \begin{array}{c} 1 \\ 1 \end{array} \right]$ (again, any non-zero scalar multiple works as well). So that is an eigenvector associated with the eigenvalue 3.
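It never hurts to check an answer by plugging it back in. Here is a minimal sketch that multiplies out $A\vec{v}$ and compares it with $\lambda \vec{v}$ for the two pairs found above; the helper names `matvec` and `is_eigenpair` are hypothetical, chosen only for this illustration.

```python
A = [[1, 2], [0, 3]]

def matvec(M, v):
    """Multiply a 2x2 matrix (list of rows) by a column vector (list)."""
    return [M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1]]

def is_eigenpair(M, lam, v):
    """True if v is non-zero and M v equals lam * v (exact for integers)."""
    return any(v) and matvec(M, v) == [lam * c for c in v]

print(is_eigenpair(A, 1, [1, 0]))  # the pair found for lambda = 1
print(is_eigenpair(A, 3, [1, 1]))  # the pair found for lambda = 3
print(is_eigenpair(A, 3, [5, 5]))  # a scalar multiple of [1, 1]
print(is_eigenpair(A, 2, [1, 1]))  # 2 is not an eigenvalue of A
```

The third check illustrates that an eigenvector is only determined up to a non-zero scalar multiple, and the `any(v)` test excludes the zero vector, which would otherwise “work” for every $\lambda$.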

In the general “k-dimensional vector space” case, the recipe for finding the eigenvectors and eigenvalues is the same.
1. Find the matrix $A$ for the linear transformation.
2. Form the matrix $A - \lambda I$ which is the same as matrix $A$ except that you have subtracted $\lambda$ from each diagonal entry.
3. Note that $\det(A - \lambda I)$ is a polynomial in the variable $\lambda$; find its roots $\lambda_1, \lambda_2, ..., \lambda_k$ (some of which may be repeated or complex). These will be the eigenvalues.
4. Start with $\lambda = \lambda_1$. Substitute this into the matrix-vector equation $(A - \lambda_1 I) \vec{v_1} = \vec{0}$ and solve for a non-zero $\vec{v_1}$. That will be an eigenvector associated with the first eigenvalue. Do this for each eigenvalue, one at a time. Note: you can get up to $k$ “linearly independent” eigenvectors in this manner; that will be all of them.
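For the 2 by 2 case the recipe is short enough to write out in full. What follows is a sketch under stated assumptions, not a general-purpose routine: the function name `eigen_2x2` and the tolerance `tol` are my own, it only reports real eigenvalues, and it scales each eigenvector so that its largest entry is 1 (any non-zero scalar multiple would do equally well).

```python
import math

def eigen_2x2(A, tol=1e-12):
    """Real eigenpairs of a real 2x2 matrix A = [[a, b], [c, d]].

    Returns a list of (eigenvalue, eigenvector) tuples; the list is
    empty when the characteristic polynomial has no real roots.
    """
    (a, b), (c, d) = A
    # Steps 2-3: det(A - lam I) = lam^2 - (a + d) lam + (ad - bc).
    trace, det = a + d, a * d - b * c
    disc = trace * trace - 4 * det
    if disc < 0:
        return []
    root = math.sqrt(disc)
    eigenvalues = sorted({(trace - root) / 2, (trace + root) / 2})

    pairs = []
    for lam in eigenvalues:
        # Step 4: solve (A - lam I) v = 0, using the larger of the two rows.
        rows = [(a - lam, b), (c, d - lam)]
        p, q = max(rows, key=lambda r: abs(r[0]) + abs(r[1]))
        if abs(p) + abs(q) <= tol:
            # A - lam I is the zero matrix: every non-zero vector works,
            # so report two linearly independent ones.
            pairs.append((lam, (1.0, 0.0)))
            pairs.append((lam, (0.0, 1.0)))
            continue
        # (q, -p) is sent to 0 by the row (p, q); the other row is a
        # multiple of it (up to rounding) because A - lam I is singular.
        v = (q, -p)
        biggest = max(v, key=abs)
        pairs.append((lam, (v[0] / biggest, v[1] / biggest)))
    return pairs

print(eigen_2x2([[1, 2], [0, 3]]))   # the example worked by hand above
print(eigen_2x2([[0, -1], [1, 0]]))  # rotation by 90 degrees: no real roots
print(eigen_2x2([[1, 1], [0, 1]]))   # repeated root, only one eigenvector
```

The last call shows why the note in step 4 says “up to” $k$ eigenvectors: a repeated eigenvalue does not always come with a full set of linearly independent eigenvectors.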

Practical note
Yes, this should work “in theory” but practically speaking, there are many challenges. For one: for polynomial equations of degree 5 or higher, it is known that there is no formula in radicals that will find the roots for every equation of that degree (this follows from the work of Abel and Galois; this is a good reason to take an abstract algebra course!). Hence one must use a numerical method of some sort. Also, calculation of the determinant involves many round-off error-inducing calculations; hence sometimes one must use sophisticated numerical techniques to get the eigenvalues (a good reason to take a numerical analysis course!).
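To give a flavor of what “a numerical method” can look like, here is a sketch of power iteration, one of the simplest such techniques. I picked it purely for illustration; it is not the only method, and serious work would use a tested numerical library. The sketch assumes that the matrix has a real eigenvalue that is strictly largest in absolute value and that the starting vector is not unlucky; the function name `power_iteration`, the starting vector and the iteration count are all my own choices.

```python
def power_iteration(A, iterations=200):
    """Estimate the dominant eigenvalue and an eigenvector of a square matrix.

    Repeatedly applies A and rescales so the largest entry of the vector
    is 1. No determinant or polynomial root-finding is involved.
    """
    n = len(A)
    # Arbitrary non-zero starting vector (1, 2, ..., n).
    v = [float(i + 1) for i in range(n)]
    lam = 0.0
    for _ in range(iterations):
        # w = A v
        w = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
        # The signed entry of largest magnitude estimates the eigenvalue,
        # because the largest entry of v was scaled to 1.
        lam = max(w, key=abs)
        if lam == 0:
            break  # v was sent to the zero vector; nothing more to learn
        v = [c / lam for c in w]
    return lam, v

lam, v = power_iteration([[1, 2], [0, 3]])
print(lam, v)  # lam should be close to 3, v close to a multiple of [1, 1]
```

Note that this only finds the eigenvalue of largest absolute value; finding the others takes more work, which is part of what a numerical analysis course covers.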

Consider a calculus/differential equation related case of eigenvectors (eigenfunctions) and eigenvalues.
Our vectors will be, say, infinitely differentiable functions and our scalars will be real numbers. We will define the operator (linear transformation) $D^n = \frac{d^n}{dx^n}$, that is, the process that takes the n’th derivative of a function. You learned that the sum of the derivatives is the derivative of the sums and that you can pull out a constant when you differentiate. Hence $D^n$ is a linear operator (transformation); we use the term “operator” when we talk about the vector space of functions, but it is really just a type of linear transformation.

We can also use these operators to form new operators; that is, $(D^2 + 3D)(y) = D^2(y) + 3D(y) = \frac{d^2y}{dx^2} + 3\frac{dy}{dx}$. We see that such a “linear combination” of linear operators is again a linear operator.

So, what does it mean to find eigenvectors and eigenvalues of such beasts?

Suppose we wish to find the eigenvectors and eigenvalues of $(D^2 + 3D)$. An eigenvector is a twice differentiable function $y$ (ok, we said “infinitely differentiable”) such that $(D^2 + 3D)(y) = \lambda y$ or $\frac{d^2y}{dx^2} + 3\frac{dy}{dx} = \lambda y$ which means $\frac{d^2y}{dx^2} + 3\frac{dy}{dx} - \lambda y = 0$. You might recognize this from your differential equations class; the only “tweak” is that we don’t know what $\lambda$ is. But if you had a differential equations class, you’d recognize that the solution to this differential equation depends on the roots of the characteristic equation $m^2 + 3m - \lambda = 0$ which has solutions: $m = -\frac{3}{2} \pm \frac{\sqrt{9+4\lambda}}{2}$ and the solutions take the form $e^{m_1 x}, e^{m_2 x}$ if the roots $m_1, m_2$ are real and distinct, $e^{ax}\sin(bx), e^{ax}\cos(bx)$ if the roots are complex conjugates $a \pm bi$ and $e^{mx}, xe^{mx}$ if there is a real, repeated root $m$. In any event, those functions are the eigenfunctions and these very much depend on the eigenvalues.
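Here is a small numerical check of the real, distinct root case. It is a sketch under assumptions of my own: I picked $\lambda = 4$ only because the roots of $m^2 + 3m - \lambda = 0$ then come out to be whole numbers, the helper name `apply_operator` is hypothetical, and the derivatives are approximated by central differences.

```python
import math

def apply_operator(y, x, h=1e-4):
    """Approximate ((D^2 + 3D) y)(x) using central differences."""
    d1 = (y(x + h) - y(x - h)) / (2 * h)
    d2 = (y(x + h) - 2 * y(x) + y(x - h)) / (h * h)
    return d2 + 3 * d1

lam = 4.0  # illustrative eigenvalue, chosen so the roots are integers

# Roots of m^2 + 3m - lam = 0 by the quadratic formula
# (b^2 - 4ac with a = 1, b = 3, c = -lam).
disc = 3 ** 2 - 4 * 1 * (-lam)
m1 = (-3 + math.sqrt(disc)) / 2
m2 = (-3 - math.sqrt(disc)) / 2

x0 = 0.3  # arbitrary test point
ratios = []
for m in (m1, m2):
    def y(x, m=m):
        return math.exp(m * x)
    # For an eigenfunction, (operator applied to y) / y is the eigenvalue.
    ratios.append(apply_operator(y, x0) / y(x0))

print(m1, m2)
print(ratios)  # both entries should be close to lam
```

Both exponentials return (approximately) the same scalar multiple, namely $\lambda$, which is exactly what it means for them to be eigenfunctions of $(D^2 + 3D)$ for that eigenvalue.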

Of course, reading this little note won’t make you an expert, but it should get you started on studying.

2. Hi,
Would the eigenfunction be like a photocopier that makes multiple copies (eigenvalues) of the document ( the original function) ?
Thanks
Vidya Dean

Comment by Vidya Dean — January 16, 2018 @ 12:32 pm

• The photo copier would be the linear operator and the eigenfunction the document.

Comment by blueollie — January 16, 2018 @ 12:53 pm

3. PLEASE KINDLY HELP WITH THIS;WHAT IS THE EIGIN VECTORS B=e[4,
1]

Comment by DOZIE — February 7, 2018 @ 6:40 am