College Math Teaching

September 9, 2014

Chebyshev polynomials: a topological viewpoint

Chebyshev (or Tchebycheff) polynomials are a class of mutually orthogonal polynomials (with respect to the inner product: f \cdot g  = \int^1_{-1} \frac{1}{\sqrt{1 - x^2}} f(x)g(x) dx ) defined on the interval [-1, 1] . Yes, I realize that this is an improper integral, but it does converge in our setting.

These are used in approximation theory; here are a couple of uses:

1. The roots of the Chebyshev polynomial can be used to find the values of x_0, x_1, x_2, ...x_k \in [-1,1] that minimize the maximum of |(x-x_0)(x-x_1)(x-x_2)...(x-x_k)| over the interval [-1,1] . This is important in minimizing the error of the Lagrange interpolation polynomial.

2. The Chebyshev polynomial can be used to adjust an approximating Taylor polynomial P_n to increase its accuracy (away from the center of expansion) without increasing its degree.

The purpose of this note isn’t to discuss the utility but rather to discuss an interesting property that these polynomials have. The Wiki article on these polynomials is reasonably good for that purpose.

Let’s discuss the polynomials themselves. They are defined for all positive integers n as follows:

T_n = cos(n acos(x)) . Now, it is an interesting exercise in trig identities to discover that these ARE polynomials to begin with; one shows this to be true for, say, n \in \{0, 1, 2\} by using angle addition formulas and the standard calculus resolution of things like sin(acos(x)) . Then one discovers a relation: T_{n+1} =2xT_n - T_{n-1} to calculate the rest.

The cos(n acos(x)) definition allows for some properties to be calculated with ease: the zeros occur when acos(x) = \frac{\pi}{2n} + \frac{k \pi}{n} and the first derivative has zeros where arcos(x) = \frac{k \pi}{n} ; these ALL correspond to either an endpoint max/min at x=1, x = -1 or local max and mins whose y values are also \pm 1 . Here are the graphs of T_4(x), T_5 (x)



Now here is a key observation: the graph of a T_n forms n spanning arcs in the square [-1, 1] \times [-1,1] and separates the square into n+1 regions. So, if there is some other function f whose graph is a connected, piecewise smooth arc that is transverse to the graph of T_n that both spans the square from x = -1 to x = 1 and that stays within the square, that graph must have n points of intersection with the graph of T_n .

Now suppose that f is the graph of a polynomial of degree n whose leading coefficient is 2^{n-1} and whose graph stays completely in the square [-1, 1] \times [-1,1] . Then the polynomial Q(x) = T_n(x) - f(x) has degree n-1 (because the leading terms cancel via the subtraction) but has n roots (the places where the graphs cross). That is clearly impossible; hence the only such polynomial is f(x) = T_n(x) .

This result is usually stated in the following way: T_n(x) is normalized to be monic (have leading coefficient 1) by dividing the polynomial by 2^{n-1} and then it is pointed out that the normalized T_n(x) is the unique monic polynomial over [-1,1] that stays within [-\frac{1}{2^{n-1}}, \frac{1}{2^{n-1}}] for all x \in [-1,1] . All other monic polynomials have a graph that leaves that box at some point over [-1,1] .

Of course, one can easily cook up analytic functions which don’t leave the box but these are not monic polynomials of degree n .

March 9, 2014

Bézier Curves

I am currently teaching Numerical Analysis and using Burden-Faires. The book covers the topics we like, but I feel that the section on splines and parametrized curves is a bit weak; in particular the discussion on Bézier curves is a bit lacking. The pity: the discussion need not be all that deep, and the standard equation for Bézier curves is actually easy to remember.

Also: where the text talks about how the Bézier curve equations differs from the “bare handed parametric cubic spline” that they derive, they don’t explain the reason for the difference.

So, I decided to write these notes. I will have to explain some basic concepts.

The setting: R^n with the usual geometry induced by the usual “dot product”.

Convex Sets in R^n

A set X \subset R^n is said to be convex if for any two points x, y \in X , the straight line segment connecting x to y is also in X ; that is, the set tx + (1-t)y \in X for all t \in [0,1] .


(from here)

Convex Hull for a set of points

Now suppose one is given a collection of points C= x_0, x_1, x_2, x_3,.... \in R^n . The convex hull H for C is the smallest convex set which contains all of C . That is, if Y is any convex set that contains C , then H \subseteq Y. In the case where the set of points is finite (say, C = \{x_0, x_1, x_2, ....x_n \} ) then H consists the set of all \sum^{n}_{i = 0} \alpha_i x_i where \alpha_i \ge 0 and \sum^{n}_{i=0} \alpha_i = 1 .

Note: the convex hull for a set of points is, in general, an example of a vector subset that is NOT a vector subspace.

Binomial Theorem and the Bernstein coefficient polynomials

Recall from algebra: if n is a positive integer and a, b numbers (real, complex, or even arbitrary field elements), (a+b)^n = \sum^{n}_{j =0} { n \choose j} a^{n-j} b^{j} , where {n \choose j} = \frac{n!}{(n-j)! j !} . For example, (a+b)^3 = a^3 + 3a^2b + 3ab^2 + b^3 .

Now consider the rather silly looking: 1^n = ((1-t) + t)^n = \sum^n_{j=0}{ n \choose j} (1-t)^{n-j} t^{j} Note that this expression is equal to 1 for ALL values of t and that for t \in [0,1] , each summand { n \choose j} (1-t)^{n-j} t^{j} is positive or zero.

These “coefficient polynomials” { n \choose j} (1-t)^{n-j} t^{j} are called the Bernstein polynomials (or Bernstein basis polynomials) and we denote them as follows: b_{j,n}(t) = { n \choose j} (1-t)^{n-j} t^{j} . We now see that for all t \in [0,1], 0 \le b_{j,n}(t) \le 1 and \sum^n_{j=0}b_{j,n}(t) = ((1-t)+t)^n =1^n =1

Definition of a Bézier curve and some of its properties

Now let P_0, P_1, P_2, ...P_n be a collection of distinct points in R^k . One can think of these points as vectors.
The Bézier curve with control points P_0, P_1, P_2, ...P_n is defined to be B(t)=  \sum^n_{j=0}b_{j,n}(t)P_j, t \in [0,1] .


B(0) = P_0, B(1) =P_n . This is clear because b_{0,n}(0) = 1, b_{n,n}(1) =1 and for i \notin \{0,1\}, b_{i,n}(0)=b_{i,n}(1) = 0 .

The polygon formed by P_0, P_1, ....P_n is called the control polygon for the Bézier curve.

For all t \in [0,1], B(t) is in the convex hull of P_0, P_1, ...P_n . This is clear because \sum^n_{j=0}b_{j,n}(t) = ((1-t)+t)^n =1^n =1 and each b_{i,n}(t) is positive.

“Guideposts”: the text talks about the “guideposts”: the text looks at a cubic Bézier curve in the plane and uses (x_0, y_0) =P_0, (x_0+ \alpha_0, y_0 + \beta_0) = P_1,  (x_1 - \alpha_1, y_1 - \beta_1)= P_2, (x_1, y_1) =P_3

Now P_1 and P_{n-1} directly affect the (one sided) tangent to the Bézier curve at t=0, t=1 . In fact we will show that if we use the one-sided parametric curve derivative, we see that B'(0) = n(P_1 - P_0), B'(1) = n(P_n - P_{n-1}) . The text calls n the scaling factor and notes that the scaling factor is 3 when n = 3 .

We’ll do the calculations for B'(0), B'(1) for the general degree n Bézier curve using elementary calculus (product rule):

First write B(t) = (1-t)^nP_0 + n(1-t)^{n-1}tP_1 + \sum^{n-2}_{j=2} b_{j,n}(t) P_j + n(1-t)t^{n-1}P_{n-1} + t^n P_n . Now take the derivative and we see:
B'(t) = -n(1-t)^{n-1}P_0 + (n(1-t)^{n-1} - n(n-1)(1-t)^{n-2}t)P_1 + \frac{d}{dt} (\sum^{n-2}_{j=2} b_{j,n}(t) P_j) +(n(n-1)(1-t)t^{n-2}-nt^{n-1})P_{n-1} + nt^{n-1}P_n

Key observation: every term of \frac{d}{dt} (\sum^{n-2}_{j=2} b_{j,n}(t) P_j) has both a factor of t and (1-t) in it; hence this middle term evaluates to zero when t \in {0,1} and is therefor irrelevant to the calculation of B'(0) and B'(1) .

So B'(0) = -nP_0 + nP_1 = n(P_1 - P_0) (the last two terms are zero at t =0 and B'(1) = -nP_{n-1} + nP_n = n(P_n - P_{n-1}) (the first two terms are zero at t = 1 ).

It follows that the DIRECTION of the (one sided) tangents at the ends of the Bézier curve depends only on the unit tangent vectors in the direction of P_1 - P_0, P_n - P_{n-1} respectively. Of course, the tangent vector has a magnitude (norm) as well, and that certainly affects the curve.

(graphic from here)


Here are some examples of Bézier cubic curves: the points with the open circles are P_0, P_3 and the points that are filled in with gray are the control points P_1, P_2 . The last curve is two Bézier cubics joined together.

The software that I provided writes the cubic Bézier curve as a “conventional” cubic in x, y coordinates: B_{x}(t) = a_3t^3 + a_2t^2 + a_1t + a_0 and B_{y} = b_3t^3 + b_2t^2 + b_1t + b_0 .

Create a free website or blog at