This post is about a very simple idea that can dramatically improve the readability of just about anything, though I shall restrict my discussion to the question of how to write clearly about mathematics. The idea is more or less there in the title: present examples before you discuss general concepts. Before I go any further, I want to make very clear what the point is here. It is not the extremely obvious point that it is good to illustrate what you are saying with examples. Rather, it is to do with where those examples should appear in the exposition. So the emphasis is on the word “first” rather than on the word “examples”.
If this too seems pretty obvious, I invite you to consider how common it is to do the opposite. Open a textbook about some general concept in mathematics — Banach algebras, say — and the chances are very high that it will start with a formal definition of Banach algebras and only then give you a few examples. I myself became consciously aware of the principle as a result of editing the Princeton Companion to Mathematics: over and over again I found that I could make an article clearer by putting the authors’ well-chosen examples earlier in their discussion.
Why should it be better to do it that way round? Well, if a general definition is at all complex, then you will have quite a lot to hold in your head. This can be difficult, but it is much easier if the various aspects of the definition can be related to an example with which you are familiar. Then the words of the definition cease to be free-floating, so to speak, and instead become labels that you can attach to bits of your mental picture of the example.
By now the alert reader will have noticed that I have not practised what I have preached. So let’s forget all about the discussion so far and start again, this time doing things properly.
My favourite pedagogical principle: examples first!
Which of the following two explanations do you find clearer and easier to read? They are intended to introduce the concept of a field to a reader who knows what a binary operation is and knows basic definitions such as those of commutativity and identity elements.
Explanation 1. A field is a set $X$ together with two binary operations, for which one conventionally uses the notation of addition and multiplication, that has the following properties. Both operations are commutative and associative and have identity elements. Every element of $X$ has an inverse under the operation $+$, and every element other than 0 (the name given to the identity of the operation $+$) has an inverse under $\times$ as well. Finally, we have the rule known as the distributive law: $x(y+z)=xy+xz$ for every $x$, $y$ and $z$ in $X$.
If we interpret $+$ and $\times$ to be the usual operations of addition and multiplication, then we readily see that the familiar number systems $\mathbb{Q}$, $\mathbb{R}$ and $\mathbb{C}$ are fields. (If one goes right back to first principles, then these statements cease to be obvious, but we shall take facts such as the commutativity of multiplication of complex numbers as already established.) By contrast, the number systems $\mathbb{N}$ and $\mathbb{Z}$ are not fields, since there is no additive identity in $\mathbb{N}$ and not every element of $\mathbb{Z}$ has a multiplicative inverse. Less obvious examples of fields are number fields (subfields of $\mathbb{C}$ that contain $\mathbb{Q}$) such as the field of all complex numbers of the form $a+b\sqrt{2}$ where $a$ and $b$ are rational, which is denoted $\mathbb{Q}(\sqrt{2})$. (All the field properties are very easily verified, with the exception of the existence of multiplicative inverses: but even that is a simple exercise.) Another important source of examples is the collection of finite fields, of which the simplest cases are obtained by taking a prime $p$ and the set of all integers modulo $p$. (Here again the only field axiom that is not almost trivial to verify is the existence of multiplicative inverses; for that one needs Euclid’s algorithm.)
Explanation 2. The five main number systems, $\mathbb{N}$, $\mathbb{Z}$, $\mathbb{Q}$, $\mathbb{R}$ and $\mathbb{C}$, though different from each other, have many features in common. For example, if $X$ is one of these number systems and $x$ and $y$ are numbers in $X$, then one can add or multiply $x$ and $y$ together. One may also be able to subtract $y$ from $x$ or divide $x$ by $y$, but this is not always possible, at least if one wants to stay in the same number system. For example, if we are confined to $\mathbb{N}$, then we cannot subtract 5 from 3, and if we are confined to $\mathbb{Z}$ then we cannot divide 5 by 3.
It is noticeable that there are far fewer problems of this kind in $\mathbb{Q}$, $\mathbb{R}$ and $\mathbb{C}$ than there are in $\mathbb{N}$ and $\mathbb{Z}$. In these larger number systems subtraction is always possible (as it is in $\mathbb{Z}$), and so is division, provided only that one does not try to divide by 0.
Returning to the properties that these number systems share, we notice that in all five of them addition and multiplication are commutative and associative, and they obey the distributive law: $x(y+z)=xy+xz$ for every $x$, $y$ and $z$ in the number system in question.
A field is a mathematical structure that has the basic properties of the larger number systems $\mathbb{Q}$, $\mathbb{R}$ and $\mathbb{C}$. In other words, it is a set on which two binary operations, which we think of as addition and multiplication and therefore denote using conventional notation for these operations, are defined. These operations must both be commutative and associative, and they must obey the distributive law. Also, both operations must have identities, every element must have an inverse under the additive operation, and every element other than 0 (which is defined more formally as the additive identity) must have an inverse under the multiplicative operation as well. Once we have these inverses, we can easily define subtraction and division: $x-y$ is the sum of $x$ and the additive inverse of $y$, and $x/y$ is the product of $x$ and the multiplicative inverse of $y$.
Thus, a field is basically an algebraic structure that “behaves like $\mathbb{Q}$, $\mathbb{R}$ or $\mathbb{C}$,” in the sense that it has two binary operations that obey the algebraic rules that one observes in those number systems.
The concept of a field is important because besides $\mathbb{Q}$, $\mathbb{R}$ and $\mathbb{C}$ there are some less obvious examples that play a central role in number theory. Most notable are number fields and finite fields. The former are fields such as $\mathbb{Q}(\sqrt{2})$, which consists of all complex numbers of the form $a+b\sqrt{2}$ where $a$ and $b$ are rational. (In general a number field is a subfield of $\mathbb{C}$ that contains $\mathbb{Q}$.) In a number field it tends to be very easy to verify all the field properties, with the exception of the existence of multiplicative inverses: but even that is usually a simple exercise. The simplest examples of finite fields are obtained by taking the set of all integers modulo a prime $p$. Here again the only field axiom that is not almost trivial to verify is the existence of multiplicative inverses; for that one needs Euclid’s algorithm. End of explanation 2.
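As an aside for readers who would like to see the one non-trivial axiom in action, here is a minimal sketch of how Euclid’s algorithm can produce multiplicative inverses in the integers modulo a prime. It is only an illustration: the function names `extended_gcd` and `inverse_mod` are choices made for this sketch and are not part of either explanation.

```python
def extended_gcd(a, b):
    """Return (g, s, t) such that g = gcd(a, b) and s*a + t*b == g."""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        # Each step is one division step of Euclid's algorithm, with the
        # coefficients s and t updated alongside the remainders.
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    return old_r, old_s, old_t


def inverse_mod(a, n):
    """Return the multiplicative inverse of a modulo n, if there is one."""
    g, s, _ = extended_gcd(a % n, n)
    if g != 1:
        # No inverse exists unless a and n are coprime.
        raise ValueError(f"{a} has no multiplicative inverse modulo {n}")
    return s % n
```

For instance, `inverse_mod(3, 7)` gives 5, since 3 times 5 is 15, which leaves remainder 1 on division by 7. By contrast, `inverse_mod(2, 6)` raises an error, which reflects the fact that the integers modulo 6 do not form a field.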
I hope very much that you found the second explanation vastly preferable, or that if you didn’t, then at least you found that it had some quality of speaking directly to the reader that the first explanation lacked.
What is the main difference between the two explanations? The content is more or less the same. But there is an important difference in the way that content is organized: in the first explanation the abstract definition of a field is given first and is then followed by some examples, whereas the second starts with the examples (or at least some of them) and uses them as a springboard for a more general discussion.
Why should it be an advantage to put the examples first? Well, try to imagine the reaction of a reader who does not know what a field is. At the beginning of the first explanation she [I decided on the sex of the reader by tossing a coin, by the way] is presented with a list that is not related to her previous mathematical experience. Therefore, it is extremely forgettable. Probably she will go on to read about the examples and then look back at the definition to check that it really does apply to them, a clear sign that the order is pedagogically unnatural. By contrast, if she reads the second explanation then the field axioms are describing something, namely her mental picture of a few fields that she already knows. So she has no need to commit anything to memory or to look back on parts of the text that she has not fully understood.
The comparison may seem unfair because the second explanation was longer, and spent a bit longer discussing the number systems. But that was such a natural thing to do when one started the discussion with the number systems that I think of it as almost a consequence of the policy of putting the examples first.
Back to the top level of this post.
And so, which one of those two explanations of the pedagogical principle did you prefer? I hope very much that you preferred the second. It was certainly much easier for me to explain why I think it is better to put examples first when I had an example to use to illustrate what I was talking about. (I’m referring here to the theory that the examples give you a mental picture of the concept that allows you to treat the abstract definition as a set of labels attached to concepts you already know rather than as a set of meaningless words with relationships that you just have to learn off by heart.)
When this principle occurred to me, I realized that I had sometimes put it into practice, but I had never been fully conscious of what I was doing. Now that I am, I always either put examples first, or make a conscious decision not to (perhaps because I judge that the reader can cope without). If you are not already conscious of what you are doing in this way, then try thinking about it for a while: if this post does not persuade you, then your own experience surely will — both of your own writing and of other people’s.