I am writing this post because I’ve seen that there is some misunderstanding of what mathematical induction is and why it works.
- What is mathematical induction? It is a common proof technique. Basically, if one wants to show that a statement is true in general, and the individual cases of the statement can be indexed by the positive integers (or by some other appropriate index set), then one can use induction.
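Here is a common example: suppose one wants to show that
$1 + 2 + \cdots + n = \frac{n(n+1)}{2}$
for all positive integers $n$
(for example, $1 + 2 + 3 + 4 + 5 = 15 = \frac{5 \cdot 6}{2}$).
Initial step: $1 = \frac{1 \cdot 2}{2}$, so the statement is true for $n = 1$.
Inductive step: assume that the formula holds for some integer $k \geq 1$.
Finish the proof: show that if the formula holds for some integer $k$, then it holds for $k + 1$ as well.
$1 + 2 + \cdots + k + (k+1) = \frac{k(k+1)}{2} + (k+1)$ (why? because we assumed that $k$ was an integer for which $1 + 2 + \cdots + k = \frac{k(k+1)}{2}$),
so $1 + 2 + \cdots + k + (k+1) = (k+1)\left(\frac{k}{2} + 1\right) = \frac{(k+1)(k+2)}{2}$ (factor out a $k+1$ term),
which is what we needed to show. So the proof would be done.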
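A quick numerical check can make the example concrete. The sketch below assumes the formula in question is the familiar $1 + 2 + \cdots + n = \frac{n(n+1)}{2}$; the function names are made up for illustration. Checking finitely many cases is only a sanity check and not a proof, which is exactly why the induction argument is needed.

```python
def sum_first(n):
    """Add 1 + 2 + ... + n term by term."""
    return sum(range(1, n + 1))


def closed_form(n):
    """The claimed closed form n(n+1)/2 (always an integer)."""
    return n * (n + 1) // 2


def inductive_step_holds(k):
    """Check the algebra of the inductive step for one value of k:
    k(k+1)/2 + (k+1) should equal (k+1)(k+2)/2."""
    return closed_form(k) + (k + 1) == closed_form(k + 1)


# Anchor: the statement for n = 1.
print(sum_first(1) == closed_form(1))

# Spot-check the formula and the inductive step for the first 100 cases.
print(all(sum_first(n) == closed_form(n) for n in range(1, 101)))
print(all(inductive_step_holds(k) for k in range(1, 101)))
```

All three lines should print True, but no finite run of this kind covers every positive integer.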
- Why does induction “prove” anything? Mathematical induction is equivalent to the so-called “least positive integer” principle in mathematics (also known as the well-ordering principle).
- What is the least positive integer principle? It says this: “any non-empty set of positive integers has a smallest element”. That statement is taken as an axiom; that is, it isn’t something that can be proved.
Notice that this statement is false if we change some conditions. For example, it is NOT true that, say, any set of positive real numbers (or even positive rational numbers) has a smallest element. For example, the set of all numbers between 0 and 1 (exclusive; 0 is not included) does NOT have a least element (not according to the “usual” ordering induced by the real number line; it is an easy exercise to see that the rationals can be ordered so as to have a least element). Why? Let $x$ be a candidate to be the least element. Then $x$ is between 0 and 1. But then $x/2$ is greater than zero but is less than $x$; hence $x$ could not have been the least element. Neither could any other number.
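The halving argument can be tried out with exact rational arithmetic. This is a small sketch using Python's standard `fractions` module; the function name `smaller_candidate` is made up for illustration. Whatever candidate $x$ strictly between 0 and 1 is proposed as the least element, the function hands back a smaller number that is still in the set.

```python
from fractions import Fraction


def smaller_candidate(x):
    """Given x with 0 < x < 1, return a number that is still
    strictly between 0 and 1 but strictly smaller than x."""
    if not (0 < x < 1):
        raise ValueError("x must lie strictly between 0 and 1")
    return x / 2


# Start from some proposed "least element" and keep beating it.
x = Fraction(1, 1000)
for _ in range(5):
    y = smaller_candidate(x)
    print(y, "is in the set and is smaller than", x)
    x = y
```

The loop never gets stuck: every candidate is beaten by its own half, so no candidate can be the least element.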
Note that the set of negative integers has no least element; hence we need the condition that the integers are positive.
Notice also that there are sets of positive integers with no greatest element. For example, suppose $x$ were the largest element in the set of all positive even integers. But then $x + 2$ is also even and is bigger than $x$. Hence it is impossible to have a largest one.
- What does this principle have to do with induction? This is what: an induction proof is nothing more than a least integer argument in disguise. Let's return to our previous example for a demonstration; that is, our proof that $1 + 2 + \cdots + n = \frac{n(n+1)}{2}$.
We start by labeling our statements: $1 = \frac{1 \cdot 2}{2}$ is statement P(1),
$1 + 2 = \frac{2 \cdot 3}{2}$ is statement P(2), … $1 + 2 + 3 + 4 + 5 = \frac{5 \cdot 6}{2}$ is statement P(5) and so on.
Suppose, for contradiction, that the statement is false for some positive integer. The set of positive integers for which the statement is false is then non-empty, so it has a least element by the least element principle for positive integers.
We assume that the first integer for which the statement is false is $k + 1$. We can always do this, because we proved that the statement is true for $n = 1$, so the first possible false statement is $n = 2$ or some larger integer, and these integers can always be written in the form $k + 1$ with $k$ a positive integer.
That is why the anchor statement (the beginning) is so important.
We now can assume that the statement is true for $n = k$, since $k + 1$ is the first time the statement fails.
But we showed “if statement P($k$) is true then P($k+1$) is also true” (this is where we did the algebra to add up $1 + 2 + \cdots + k + (k+1)$). This contradicts the assumption that statement P($k+1$) is false.
Hence the statement cannot be false for ANY positive integer $n$.
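The “first place where it fails” idea can be mimicked in code. The sketch below is only an illustration under stated assumptions: the names `least_counterexample`, `true_claim` and `false_claim` are made up, the true claim is taken to be the sum formula $1 + 2 + \cdots + n = \frac{n(n+1)}{2}$, and the false claim $1 + 2 + \cdots + n = n^2$ is invented for contrast. A bounded search like this is not a proof; it just shows that a failing statement has a least failing index, and that the statement just before it holds.

```python
def least_counterexample(statement, limit):
    """Return the smallest n in 1..limit for which statement(n) is False,
    or None if the statement holds for every n in that range."""
    for n in range(1, limit + 1):
        if not statement(n):
            return n
    return None


def true_claim(n):
    # P(n): 1 + 2 + ... + n equals n(n+1)/2.
    return sum(range(1, n + 1)) == n * (n + 1) // 2


def false_claim(n):
    # A deliberately wrong claim: 1 + 2 + ... + n equals n^2.
    return sum(range(1, n + 1)) == n * n


# No counterexample to the true claim turns up in the searched range.
print(least_counterexample(true_claim, 100))

# The wrong claim passes its anchor (n = 1) but has a first failure.
first_failure = least_counterexample(false_claim, 100)
print(first_failure)

# The claim holds just before the first failure, so for the wrong
# claim the step from k to k + 1 must be what breaks.
print(false_claim(first_failure - 1))
```

For the wrong claim the anchor holds, so the first failure has the form $k + 1$; a valid inductive step would rule this out, which is exactly the contradiction used in the argument above.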
- Weak versus strong induction. As you can see, the least positive integer argument gives us that the statement is true for all of the statements P(1) through P($k$), so in fact there is no difference (when inducting on the set of positive integers) between weak induction (which assumes the induction hypothesis for some integer $k$) and strong induction (which assumes the induction hypothesis for $1$ through $k$).
- Other index sets: any index set that one inducts on must satisfy the “least element principle”: every non-empty subset of it has a least element. Also, if the index set contains an ordinal w that has no immediate predecessor, then one must “reanchor” the induction hypothesis at w prior to proceeding.