This term I shall be giving Cambridge’s course Analysis I, a standard first course in analysis, covering convergence, infinite sums, continuity, differentiation and integration. This post is aimed at people attending that course. I plan to write a few posts as I go along, in which I will attempt to provide further explanations of the new concepts that will be covered, as well as giving advice about how to solve routine problems in the area. (This advice will be heavily influenced by my experience in attempting to teach a computer, about which I have reported elsewhere on this blog.)
I cannot promise to follow the amazing example of Vicky Neale, my predecessor on this course, who posted after every single lecture. However, her posts are still available online, so in some ways you are better off than the people who took Analysis I last year, since you will have her posts as well as mine. (I am making the assumption here that my posts will not contribute negatively to your understanding — I hope that proves to be correct.) Having said that, I probably won’t cover exactly the same material in each lecture as she did, so the correspondence between my lectures and her posts won’t be as good as the correspondence between her lectures and her posts. Nevertheless, I strongly recommend you look at her posts and see whether you find them helpful.
You will find this course much easier to understand if you are comfortable with basic logic. In particular, you should be clear about what “implies” means and should not be afraid of the quantifiers $\forall$ and $\exists$. You may find a series of posts I wrote a couple of years ago helpful, and in particular the ones where I wrote about logic (NB, as with Vicky Neale’s posts above, they appear in reverse order). I also have a few old posts that are directly relevant to the Analysis I course (since they are old posts you may have to click on “older entries” a couple of times to reach them), but they are detailed discussions of Tripos questions rather than accompaniments to lectures. You may find them useful in the summer, and you may even be curious to have a quick look at them straight away, but for now your job is to learn mathematics rather than trying to get good at one particular style of exam, so I would not recommend devoting much time to them yet.
What do I need to know before taking the course?
For the rest of this post, I want to describe briefly the prerequisites for this course. One of the messages I want to get across is that in a sense the entire course is built on one axiom, namely the least upper bound axiom for the real numbers. I don’t really mean that, but it would be correct to say that it is built on one new axiom, together with other properties of the real numbers that you are so familiar with that you hardly give them a second’s thought.
If I want to say that more precisely, then I will say that the course is built on the following assumption: there is, up to isomorphism, exactly one complete ordered field. If the phrase “complete ordered field” is unfamiliar to you, it doesn’t matter, though I will try to explain what it means in a moment. Roughly speaking, this assumption is saying that there is exactly one mathematical structure that has all the arithmetical and order properties that you would expect of the real numbers, and also satisfies the least upper bound axiom. And that structure is the one we call the real numbers.
And now let me make that more precise.
What is a field?
A field is a set $F$ with two binary operations $+$ and $\times$ that behave in the same nice ways that addition and multiplication behave in the real numbers. That is, they have the following properties.
(i) $+$ is commutative and associative and has an identity element. Every element of $F$ has an inverse under $+$.
(ii) $\times$ is commutative and associative and has an identity element. Every element of $F$ other than the identity of $+$ has an inverse under $\times$.
(iii) $\times$ is distributive over $+$. That is, for any three elements $x, y, z$ of $F$ we have $x \times (y + z) = x \times y + x \times z$.
If we define an algebraic structure with some notions of addition and multiplication, then to say that it is a field is to say that all the usual rules we use to do algebraic manipulations are valid. It can be amusing and instructive to prove facts such as that $(-x)(-y) = xy$ assuming nothing more than the field axioms, but in this course I shall take these slightly less elementary facts as read as well. But I assure you that they do follow from the field axioms.
Some examples of fields that you have already met are $\mathbb{Q}$, $\mathbb{R}$, $\mathbb{C}$ and $\mathbb{F}_p$. (That last one is the field that consists of integers mod $p$ for a prime $p$, with addition and multiplication mod $p$. The only axiom that is not easy to verify is the existence of multiplicative inverses for non-zero elements of the field, which follows from the fact that if $a$ and $b$ are coprime then there are integers $h$ and $k$ such that $ha + kb = 1$.)
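If you would like to see that last fact in action, here is a minimal Python sketch of how multiplicative inverses mod a prime can be computed. It uses the extended Euclidean algorithm, which is one standard way of finding the integers in question; the function names `extended_gcd` and `inverse_mod` are purely illustrative choices.

```python
def extended_gcd(a, b):
    """Return (g, h, k) with g = gcd(a, b) and h*a + k*b == g."""
    old_r, r = a, b
    old_h, h = 1, 0
    old_k, k = 0, 1
    while r != 0:
        q = old_r // r
        # Each remainder stays an integer combination of a and b.
        old_r, r = r, old_r - q * r
        old_h, h = h, old_h - q * h
        old_k, k = k, old_k - q * k
    return old_r, old_h, old_k


def inverse_mod(a, p):
    """Return the multiplicative inverse of a modulo p, assuming gcd(a, p) = 1."""
    g, h, _ = extended_gcd(a % p, p)
    if g != 1:
        raise ValueError("a and p are not coprime, so there is no inverse")
    # h*a + k*p = 1, so h*a is congruent to 1 mod p.
    return h % p


print(inverse_mod(3, 7))  # 5, since 3 * 5 = 15 = 2 * 7 + 1
```

The point is only that the coefficient of $a$ in the identity is already the inverse of $a$ once you reduce it mod $p$.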
What is an ordered field?
This question splits into two. First we need to know what an ordering is, and then we need to know how the ordering relates to the algebraic operations. Let me take these two in turn.
A totally ordered set is a set $X$ together with a relation $<$ that has the following properties.
- $<$ is transitive: that is, if $x < y$ and $y < z$, then $x < z$.
- $<$ satisfies the law of trichotomy: that is, for any $x, y \in X$ exactly one of the statements $x < y$, $x = y$, $y < x$ holds.
Note that the trichotomy law implies that $<$ is antisymmetric: that is, if $x < y$ then it cannot also be the case that $y < x$.
In the above situation, we say that $<$ is a total ordering on $X$. Given a total ordering we can make some obvious further definitions. For instance, we can define $>$ by saying that $x > y$ if and only if $y < x$. (Note that $>$ is also a total ordering on $X$.) Also, we can define $\leq$ by saying that $x \leq y$ if and only if either $x < y$ or $x = y$, and similarly we can define $\geq$.
Here’s an example of a totally ordered set that is not just a subset of the real numbers. We take $X$ to be the set of all polynomials with real coefficients, and if $P$ and $Q$ are two polynomials, we say that $P < Q$ if there exists a real number $x_0$ such that $P(x) < Q(x)$ for every $x \geq x_0$. (That is, $P < Q$ if $Q$ is “eventually bigger than $P$”.) It is easy to check that this relation is transitive, and an instructive exercise to prove that the trichotomy law holds. (It is also not too hard, so I think it is better not to give the proof here.)
How should we define an ordered field? A first guess might be to say that it is a field with a total ordering on it. But a moment’s thought shows that that is a ridiculous definition, since we could define a “stupid” total ordering that had nothing to do with any natural ordering we might want to put on the field. For example, we could define an ordering on the rationals as follows: given two rational numbers $p/q$ and $r/s$, written in their lowest terms with $q$ and $s$ positive, say that $p/q \prec r/s$ if either $q < s$, or $q = s$ and $p < r$. That is certainly a total ordering on the rationals, but it is a rather strange one. For example, with this ordering we have $1/2 \prec 1/3$ and also $1000 \prec 1/2$.
What has gone wrong? The answer is that it is not interesting to have two structures on a set (in this case, the algebraic structure and the order structure) unless those structures interact. In fact, we have already seen this in the field axioms themselves: we have addition and multiplication, and it is absolutely crucial to have some kind of relationship between them. The relation we have is the distributivity law. Without that, we would allow “stupid” examples of pairs of binary operations that had nothing to do with each other.
An ordered field is a field $F$ together with a total ordering $<$ that satisfies the following properties.
- For every $x, y, z \in F$, if $x < y$, then $x + z < y + z$.
- For every $x, y, z \in F$, if $x < y$ and $0 < z$, then $xz < yz$.
Basically what these properties are saying is that the usual rules we use when manipulating inequalities, such as adding the same thing to both sides, apply.
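To see concretely how a total ordering can fail to respect the algebra, here is a small Python sketch. It assumes, purely for illustration, one particular “stupid” ordering on the rationals: write each rational in lowest terms with positive denominator, compare denominators first, and then numerators. The name `strange_less` is hypothetical.

```python
from fractions import Fraction


def strange_less(x, y):
    """Compare lowest-terms fractions by denominator first, then by numerator.

    Fraction always stores its value in lowest terms with a positive
    denominator, so this relation is well defined and is a total ordering.
    """
    return (x.denominator, x.numerator) < (y.denominator, y.numerator)


x, y, z = Fraction(1, 2), Fraction(1, 3), Fraction(2, 3)

print(strange_less(x, y))          # True: 1/2 comes before 1/3
print(strange_less(x + z, y + z))  # False: 7/6 does not come before 1
```

So adding the same number to both sides can reverse this ordering, which is exactly what the first property of an ordered field forbids.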
In practice, we tend to use a rather larger set of rules. For example, if we know that $0 < x < y$, we will feel free to deduce that $x^2 < y^2$. And nobody will bat an eyelid if you have a real number $x$ and state without proof that $x^2 \geq 0$. Both these facts can be deduced fairly easily from the properties of ordered fields, and again it is quite a good exercise to do this if you haven’t already. However, in this course we shall take the following attitude. There are the axioms for an ordered field. There are also some simple deductions from these axioms that provide us with some further rules for manipulating equations and inequalities. All of these we will treat in the same way: we just use them without comment.
Abstract versus concrete
Before I get on to the most important axiom, and the one that very definitely will not be used without comment, I want to discuss a distinction that it is important to understand: the distinction between the abstract and the concrete approaches to mathematics. The abstract approach is to concentrate on the properties that mathematical structures have. We are given a bunch of properties and we see what we can deduce from them, and we do that quite independently of whether any object with those properties exists. Of course, we do like to check that the properties are consistent, which we do by finding an object that satisfies them, but once we have carried out that check we go back to concentrating on the properties themselves.
The concrete approach to mathematics is much more focused on the objects themselves. We take an object, such as the set of all prime numbers, and try to describe it, prove results about it, and so on.
The boundary between the two approaches is extremely fuzzy, because we often like to convert the concrete approach into a more abstract one. For example, consider the function $\sin$. This can be defined concretely as the function given by the formula $\sin x = \sum_{n=0}^\infty (-1)^n \frac{x^{2n+1}}{(2n+1)!}$. (That’s just a concise way of writing $\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \dots$.) And a similar definition can be given for $\cos$. But somewhere along the line we will want to prove basic facts such as that $\sin^2 x + \cos^2 x = 1$, or that $\sin(x+y) = \sin x \cos y + \cos x \sin y$, or that $\frac{d}{dx} \sin x = \cos x$. And once we’ve proved a few of those facts, we find that we no longer want to use the formula, because everything we need to know follows from those basic facts. And that is because with just a couple more facts of the above kind, we find that we have characterized the trigonometric functions: that is, we have written down properties that are satisfied by $\sin$ and $\cos$ and by no other pair of functions. When this kind of thing happens, our approach has shifted from the concrete (we are given the formulae and want to prove things about the resulting functions) to the abstract (we are given some properties and want to use them to deduce other properties).
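As a small illustration of the concrete end of this spectrum, here is a Python sketch that evaluates partial sums of the power series for sine and cosine and checks numerically that they behave as one of the familiar identities says they should. The function names and the choice of 20 terms are arbitrary assumptions, and a numerical check is of course not a proof.

```python
import math


def sin_series(x, terms=20):
    """Partial sum of x - x^3/3! + x^5/5! - ... with the given number of terms."""
    return sum((-1) ** n * x ** (2 * n + 1) / math.factorial(2 * n + 1)
               for n in range(terms))


def cos_series(x, terms=20):
    """Partial sum of 1 - x^2/2! + x^4/4! - ... with the given number of terms."""
    return sum((-1) ** n * x ** (2 * n) / math.factorial(2 * n)
               for n in range(terms))


for x in (0.0, 0.5, 1.0, 2.0):
    # Each printed value should be very close to 1.
    print(x, sin_series(x) ** 2 + cos_series(x) ** 2)
```

Working from the formula like this is the concrete approach; once identities of this kind have been proved, the formula itself is rarely needed again.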
Something very similar happens with the real numbers. Up to now (at least until taking Numbers and Sets), you will have been used to thinking of the real numbers as infinite decimals. In other words, the real number system is just out there, an object that you look at and prove things about. But at university level one takes the abstract approach. We start with a set of properties (the properties of ordered fields, together with the least upper bound axiom) and use those to deduce everything else. It’s important to understand that this is what is going on, or else you will be confused when your lecturers spend time proving things that appear to be completely obvious, such as that the sequence $1, 1/2, 1/3, \dots$ converges to 0. Isn’t that obvious? Well, yes it is if you think of a real number as one of those things with a decimal expansion. But it takes quite a lot of work to prove, using just the properties of a complete ordered field, that every real number has a decimal expansion, and rather than rely on all that work it is much easier to prove directly that $1/n$ converges to 0.
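To make “converges to 0” slightly more tangible for the sequence $1/n$, here is a minimal Python sketch that, given a positive rational $\epsilon$, produces an $N$ such that $1/n < \epsilon$ for every $n \geq N$. The name `witness_N` is hypothetical, and the sketch quietly assumes that an integer bigger than $1/\epsilon$ can always be found, which is precisely the sort of fact that has to be justified from the axioms.

```python
import math
from fractions import Fraction


def witness_N(eps):
    """Return a positive integer N with 1/n < eps for every n >= N.

    eps is assumed to be a positive rational (a Fraction or an int).
    """
    eps = Fraction(eps)
    if eps <= 0:
        raise ValueError("eps must be positive")
    # N is strictly bigger than 1/eps, so 1/N < eps, and 1/n <= 1/N for n >= N.
    return math.floor(1 / eps) + 1


print(witness_N(Fraction(1, 100)))  # 101
```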
The least upper bound axiom
What is a least upper bound?
Let $A$ be a set of real numbers. A real number $x$ is an upper bound for $A$ if $a \leq x$ for every $a \in A$. For example, if $A$ is the open interval $(0,1)$, then $2$ is an upper bound for $A$.
A real number $s$ is the least upper bound of $A$ if it has the following two properties.
- $s$ is an upper bound for $A$.
- If $t < s$, then $t$ is not an upper bound for $A$.
Another way of writing these two properties is as follows. I’ll use quantifiers.
- $\forall a \in A \ \ a \leq s$
- $\forall t < s \ \ \exists a \in A \ \ a > t$
In words, everything in $A$ is less than or equal to $s$, and for any $t < s$ there is some $a \in A$ that is bigger than $t$.
As an example, $1$ is the least upper bound of the open interval $(0,1)$. Why? Because if $x \in (0,1)$ then $x \leq 1$, and if $y < 1$ then we can find $x \in (0,1)$ such that $x > y$. (How do we do this? Well, if $y \leq 0$ then take $x = 1/2$ and if $y > 0$ then take $x = (1+y)/2$.)
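The recipe in that last parenthesis can be written out as a tiny Python function, which may help to make the role of the “witness” clear. This is only a sketch using exact rational arithmetic, and the name `witness_above` is an illustrative choice.

```python
from fractions import Fraction


def witness_above(y):
    """Given a rational y < 1, return some x in the open interval (0, 1) with x > y."""
    y = Fraction(y)
    if y >= 1:
        raise ValueError("no element of (0, 1) exceeds a number that is at least 1")
    if y <= 0:
        return Fraction(1, 2)
    # For 0 < y < 1 the midpoint of y and 1 lies strictly between them.
    return (1 + y) / 2


print(witness_above(Fraction(99, 100)))  # 199/200
```

Whatever candidate $y < 1$ you propose as a smaller upper bound, the function hands back an element of the interval that beats it, which is exactly what the second property demands.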
The least upper bound property is the following statement: every non-empty subset of the reals that has an upper bound has a least upper bound.
But since we are thinking abstractly, we will not think of this as a property (of the previously given real numbers) but more as an axiom. To do so we can state it as follows.
Let $F$ be an ordered field. We say that $F$ has the least upper bound property if every non-empty subset of $F$ that has an upper bound has a least upper bound.
For reasons that will become clear only after the course has started, we say that an ordered field with the least upper bound property is complete. There are then two very important theorems that we shall assume.
Theorem 1. There exists a complete ordered field.
Theorem 2. There is only one complete ordered field, in the sense that any two complete ordered fields are isomorphic.
I don’t propose to give proofs of either of these results, but let me at least give some indication, for those who are interested, of how they can be proved. The proofs are not required knowledge for the course, but it’s not a bad idea to have some inkling of how they go.
Why does there exist a complete ordered field?
One answer to this is that the reals are a complete ordered field! That is, if you take the good old infinite decimals that you are used to, and you say very carefully what it means to add or multiply two of them together, and you order them in the obvious way, then you can actually prove rigorously that you have a complete ordered field. It’s not very pretty (partly because of the fact that point nine recurring equals 1) but it can be done.
Here’s how one can prove the least upper bound property. For convenience let us take a non-empty set $A$ that consists of positive numbers only. Assuming that $A$ is bounded above, we would like to find a least upper bound. We can do this as follows. First, find the smallest integer that is an upper bound for $A$. (We know that there must be such an integer: just take any integer that is bigger than the upper bound we are given for $A$. If we are defining the reals as infinite decimals, then it is genuinely obvious that such an integer exists: you just chop off everything beyond the decimal point and add 1.) Call this integer $n_0$. Next, we find the smallest multiple of $1/10$ that is an upper bound for $A$. This will be one of the numbers $n_0 - 9/10, n_0 - 8/10, \dots, n_0 - 1/10, n_0$. Then you take the smallest multiple of $1/100$ that is an upper bound for $A$, and so on. This gives you a sequence that might be something like $4, 3.2, 3.15, 3.142, 3.1416, \dots$. If you look at an individual digit of the numbers in this sequence, such as the fifth after the decimal point, it will eventually stabilize, and if you take these stabilized digits as the digits of a certain number, then that number will be an upper bound for $A$ and no smaller number will be. (Both these statements need to be checked, but both are reasonably straightforward.)
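The digit-by-digit procedure just described can be tried out on a concrete set. The Python sketch below assumes, as an example of my choosing, that the set consists of the positive rationals whose square is less than 2, for which a rational $u$ is an upper bound exactly when $u > 0$ and $u^2 \geq 2$. The function names are hypothetical.

```python
from fractions import Fraction


def is_upper_bound(u):
    """Upper-bound test for the example set {positive rationals x with x^2 < 2}."""
    return u > 0 and u * u >= 2


def decimal_upper_bounds(is_ub, digits):
    """Smallest upper bounds that are multiples of 1, 1/10, ..., 1/10**digits.

    Assumes the set is non-empty, bounded above, and consists of positive numbers.
    """
    current = Fraction(0)
    while not is_ub(current):
        current += 1                      # smallest integer upper bound
    bounds = [current]
    for k in range(1, digits + 1):
        step = Fraction(1, 10 ** k)
        while is_ub(current - step):      # step down while still an upper bound
            current -= step
        bounds.append(current)
    return bounds


print([float(b) for b in decimal_upper_bounds(is_upper_bound, 4)])
# [2.0, 1.5, 1.42, 1.415, 1.4143]
```

You can watch the digits stabilize: after the first couple of steps the leading digits 1.41 never change again.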
A more elegant way to prove the existence of a complete ordered field is to use objects called Dedekind cuts. A Dedekind cut is a partition of the rational numbers into two non-empty subsets $L$ and $R$ such that every element of $L$ is less than every element of $R$, and such that $R$ does not have a minimal element.
To see why this might be a reasonably sensible definition, consider the sets $L$ and $R$, where $L$ consists of all rationals $x$ such that either $x < 0$ or $x^2 < 2$, and $R$ consists of all positive rationals $x$ such that $x^2 > 2$. This is the Dedekind cut that corresponds to our ordinary conception of the number $\sqrt{2}$.
The condition that $R$ should not have a minimal element is to make sure that we don’t have two different Dedekind cuts representing each rational number. (If the rational number is $q$, the partition we are ruling out is $\{x \in \mathbb{Q} : x < q\}$ and $\{x \in \mathbb{Q} : x \geq q\}$. We just allow the partition $\{x \in \mathbb{Q} : x \leq q\}$ and $\{x \in \mathbb{Q} : x > q\}$.)
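For a feel of what a Dedekind cut looks like in practice, here is a short Python sketch of the cut corresponding to $\sqrt{2}$, with the two halves represented as membership tests on rationals. The names `in_left` and `in_right` are illustrative, and checking a finite sample of rationals is only a sanity check, not a proof that we have a partition.

```python
from fractions import Fraction


def in_left(x):
    """Membership test for L: rationals that are negative or have square below 2."""
    return x < 0 or x * x < 2


def in_right(x):
    """Membership test for R: positive rationals with square above 2."""
    return x > 0 and x * x > 2


# A finite sample of rationals between -4 and 4.
sample = {Fraction(n, d) for d in range(1, 11) for n in range(-4 * d, 4 * d + 1)}
left = [x for x in sample if in_left(x)]
right = [x for x in sample if in_right(x)]

print(len(left) + len(right) == len(sample))  # every sampled rational is in L or R
print(max(left) < min(right))                 # everything in L is below everything in R
```

Notice that nothing in the code mentions $\sqrt{2}$ itself: the cut is described entirely in terms of rational numbers.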
If $(L_1, R_1)$ and $(L_2, R_2)$ are two Dedekind cuts, we can define their sum to be $(L_1 + L_2, R_1 + R_2)$, where $L_1 + L_2$ is defined to be the set of all numbers $x + y$ such that $x \in L_1$ and $y \in L_2$, and similarly for $R_1 + R_2$. (Strictly speaking, this can leave a single rational in neither set, as happens when you add the cuts for $\sqrt{2}$ and $-\sqrt{2}$; that rational should then be put into the left-hand set.) It’s a bit harder to define products; you may like to try it. It’s not so hard to define a sensible total ordering on the set of all Dedekind cuts. And then there’s a lot of checking needed to prove that what results is a complete ordered field. (I may as well admit at this point that I’ve never bothered to check this for myself, or to read a proof in a book. I’m happy to know that it can be done, just as I’m happy to fly in an aeroplane without checking that the lift will be enough to keep me in the sky.)
Why is there essentially just one complete ordered field?
Here’s one answer. You just go back to your notes in Numbers and Sets and look at the proof that every real number has a decimal expansion. Obviously if you define real numbers to be things with decimal expansions, then this is saying nothing at all, but that’s not what Professor Leader did. He deduced the existence of decimal expansions from the properties of complete ordered fields. So effectively he proved the following result: every element of a complete ordered field has a decimal expansion. We can say slightly more: it has a decimal expansion that does not end with an infinite sequence of 9s. Oh, and two different elements have different decimal expansions. So now if you want an isomorphism between two complete ordered fields, you just match up an element of one with the element of the other that has the same decimal expansion.
Let me very briefly sketch a neater approach. You first match up 1 with 1. (That is, you match up the multiplicative identity with the multiplicative identity.) Then you match up 1+1 with 1+1, and so on, until you have “the positive integers” inside your two complete ordered fields matched together. Then you match up 0 with 0 and the additive inverses of the positive integers with the additive inverses of the positive integers. Then you match up the reciprocals of the positive integers (or rather, their multiplicative inverses) with the reciprocals of the positive integers, and finally all the rationals with all the rationals. What I’m saying here is that in any complete ordered field you can make sense in only one reasonable way of the fraction $p/q$ when $p$ and $q$ are integers with $q \neq 0$, and you send each $p/q$ in one complete ordered field to its counterpart in the other.
Now let’s take any element $x$ of a complete ordered field. We can associate with $x$ the set $A_x$ of all “rationals” less than $x$ and map that set over to the other complete ordered field, using our correspondence between rationals. That gives us a set $B_x$ in the other complete ordered field. The least upper bound of $B_x$ is then the element that corresponds to $x$.
As ever, there is work needed if you want to turn the above idea into a complete proof: if the map you’ve defined is $\phi$, then you need to check things like that $\phi(x + y) = \phi(x) + \phi(y)$ or that if a set $A$ has least upper bound $s$, then $\phi(A)$ has least upper bound $\phi(s)$. But all that can be done.
If you found what I’ve just written a bit intimidating, let me remind you that all you need to take away from it is that everything in this course will be deduced from the familiar algebraic and order properties of the reals, together with the least upper bound property. Since the algebraic and order properties should be very familiar to you, that means that the main things you need to learn are the definition of a least upper bound and the statement of the least upper bound property. The details matter, so a vague idea is not enough, but even so it’s not very much to learn.