By now you will have seen several definitions in lectures. Many of them will be written in the form
Definition. A blah is …
That is, the definition is displayed and the word being defined is in italics (or underlined if somebody is writing by hand). Sometimes, one doesn’t bother with the display, and simply says, during a discussion, “We define a blah to be …”
What is likely to have been emphasized less is that there are several different kinds of definition. In this post I’d like to enumerate some of them and give examples. It’s very much worth being aware, each time you meet a definition, of what kind it is.
Before I start, I should make clear that there is some overlap between some of the categories of definition below.
1. Mere abbreviations.
Some definitions are little more than convenient abbreviations. For instance, it is annoying to keep writing so instead we write . And having decided that the ratio of the circumference of a circle to its diameter is important (or not, as the case may be), we prefer to write $\pi$ rather than something like “half the circumference of a unit circle”.
A slightly different example (because what is being defined is an adjective) is saying “even” instead of “divisible by 2” and saying “odd” instead of “not divisible by 2”.
I don’t find it all that easy to think of examples of “mere” abbreviations, because almost all definitions do something more. “Equivalence relation” perhaps counts, since “$\sim$ is an equivalence relation” is short(ish) for “$\sim$ is reflexive, symmetric and transitive”.
However, that example shows that even mere abbreviations aren’t completely “mere”, since the fact that one bothers to come up with the abbreviation is a signal that the concept is worth naming. (This is similar to real-life examples such as UK, CIA, DPMMS, “bike”, “mobile”, etc.)
2. Definitions that replace entire sentences by single words or short phrases.
A positive integer $n$ not equal to 1 is prime if its only factors are 1 and $n$. Why do I think of this as a slightly different kind of definition from the “mere” abbreviations?
Consider the following two sentences.
Although I had to put “divisible by 2” after “number”, basically all I did was replace “even” by “divisible by 2”. That is the sense in which I am calling “even” a “mere” abbreviation of “divisible by 2”.
What if I also wanted to do without the word “primes”? I would have to write something much more convoluted like this.
Here are a few more definitions of a similar kind.
If I want to replace the word being defined by what it means, I don’t just stick some slightly longer phrase where the word was: I end up rewriting the whole sentence. For instance, if I want to explain what it means to say that every right coset of a normal subgroup is equal to a left coset, I have to say something like, “Let $H$ be a subgroup of $G$ and suppose that $ghg^{-1}\in H$ whenever $g\in G$ and $h\in H$. Then every left coset of $H$ is equal to a right coset of $H$.”
3. Strange new definitions of concepts you thought were already defined.
A few concepts that you may have seen “defined” in this sense are positive integers, integers, rational numbers, real numbers, complex numbers (not to mention how to add and multiply all these kinds of numbers), ordered pairs, functions, relations, binary operations, and sequences.
Let me list how those are defined. (Apologies in advance if I get any of them wrong.)
The definitions so far are of little value until one defines algebraic operations. I won’t give all those, but here’s one.
It is customary to present definitions of this kind as though they were getting to the essence of the concept being defined. Based on examples like “is less than” or “is a factor of” or “is congruent mod 7 to”, you might have thought that a relation on a set $A$ was a potential relationship between pairs of elements of $A$ that holds for some pairs and not for others, but actually, you are told, it’s just a subset of $A\times A$.
When you are presented with one of these “but actually” definitions, you should go through the following process.
(i) Understand how your intuitive understanding of the concept being defined relates to the formal definition you are presented with.
(ii) Continue to use the intuitive understanding, turning to the formal definition if you are ever in danger of getting muddled, or if you want to make general statements about the concept in question.
(iii) Think about what properties of the intuitive concept the formal definition is trying to capture.
Just to clarify what I mean by (ii), suppose you were asked a simple question like, “How many relations are there on a set $A$ of size $n$?” If you think of a relation as a way of relating elements of the set, then this could seem a difficult question. If you think of it as a subset of $A\times A$, then you will see instantly (I hope) that the answer is $2^{n^2}$, since $A\times A$ has $n^2$ elements.
Let’s try to do (i), again using relations as our example. How does the subset-of-$A\times A$ definition correspond to the relating-things definition? Well, if I have some way of relating things, it will be expressed as a sentence with gaps into which you insert two elements $x$ and $y$, both of which can range over all of $A$. For example, if $A$ is the set of all positive integers, we might have “is less than” or “is a factor of” as a relation on $A$. Once I have two elements $x$ and $y$ I can use the relation to form sentences such as “$x$ is less than $y$” and “$x$ is a factor of $y$”.
To convert something like that into a subset of $A\times A$ I just take the set of all ordered pairs $(x,y)$ such that $x$ is related to $y$ in the way stated.
In the other direction, if I’m given a subset $R$ of $A\times A$, I can define a relating-things relation $\sim$ by setting $x\sim y$ if and only if $(x,y)\in R$. So every method of relating pairs of elements of $A$ gives rise to a subset of $A\times A$ and vice versa.
The exercise I have just carried out for relations tends not to be hard to carry out for other concepts. To give one other example, a real sequence $(a_1,a_2,a_3,\dots)$ gives us the function $f:\mathbb{N}\to\mathbb{R}$ defined by $f(n)=a_n$, and a function $f:\mathbb{N}\to\mathbb{R}$ gives us the real sequence $(a_1,a_2,a_3,\dots)$ defined by $a_n=f(n)$.
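The two-way correspondence can be tried out on a small finite set. The following is only a sketch: the function names, the four-element set and the choice of “is a factor of” as the test relation are illustrative assumptions, not anything fixed by the definitions.

```python
from itertools import product

def relation_to_subset(A, related):
    """Set of ordered pairs (x, y) in A x A such that x is related to y."""
    return {(x, y) for x, y in product(A, repeat=2) if related(x, y)}

def subset_to_relation(R):
    """Turn a set of ordered pairs back into a two-argument test."""
    return lambda x, y: (x, y) in R

def count_relations(A):
    """Count the subsets of A x A by brute force (only sensible for tiny A)."""
    pairs = list(product(A, repeat=2))
    # Each subset corresponds to one in/out choice per pair.
    return sum(1 for _ in product([False, True], repeat=len(pairs)))

A = {1, 2, 3, 4}                      # illustrative choice of set
is_factor_of = lambda x, y: y % x == 0
R = relation_to_subset(A, is_factor_of)
related_again = subset_to_relation(R)

print(sorted(R))
print(count_relations({1, 2}))        # 16, which is 2 ** (2 ** 2)
```

Going from `is_factor_of` to `R` and back gives a test that agrees with the original on every pair from the set, which is all that “and vice versa” is claiming.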
What do I mean by (iii)? This is easier to say for some examples than it is for others. A particularly easy case is ordered pairs. What is the point of defining the ordered pair $(x,y)$ as the set $\{\{x\},\{x,y\}\}$? It’s that, officially at least, mathematicians don’t like having too many primitive concepts (that is, concepts that can’t be defined in terms of lower-level concepts), so they try to build everything up from sets.
So far so good, but what makes us choose that funny set to count as “the ordered pair” $(x,y)$? Well, what is the main thing we care about when we deal with ordered pairs? It’s that $(x,y)=(z,w)$ if and only if $x=z$ and $y=w$. It’s a simple exercise to show that the sets $\{\{x\},\{x,y\}\}$ and $\{\{z\},\{z,w\}\}$ are equal if and only if $x=z$ and $y=w$, so this set-theoretic construction gives us a way of defining a set-theoretic object that has the key property we want of ordered pairs.
One consequence of the fact that it’s really the properties we are interested in, rather than the objects themselves, is that we can “define” the same concept in more than one way. For example, I could define the ordered pair $(x,y)$ to be the set instead. That would again have the required property.
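The key property can be checked by brute force on a small universe. This is a rough sketch rather than a proof: the set $\{\{x\},\{x,y\}\}$ is modelled with Python frozensets, and the alternative encoding $\{\{y\},\{x,y\}\}$ is simply one illustrative choice made here (an assumption, not necessarily the alternative the text has in mind). The plain unordered set $\{x,y\}$ is included to show an encoding that fails.

```python
from itertools import product

def kuratowski(x, y):
    """The encoding {{x}, {x, y}}, built from nested frozensets."""
    return frozenset({frozenset({x}), frozenset({x, y})})

def reversed_kuratowski(x, y):
    """An alternative encoding {{y}, {x, y}} (illustrative assumption)."""
    return frozenset({frozenset({y}), frozenset({x, y})})

def naive(x, y):
    """The unordered set {x, y}, which forgets the order."""
    return frozenset({x, y})

def has_pair_property(encode, universe):
    """True if encode(x, y) == encode(z, w) exactly when x == z and y == w."""
    return all(
        (encode(x, y) == encode(z, w)) == (x == z and y == w)
        for x, y, z, w in product(universe, repeat=4)
    )

universe = range(4)
print(has_pair_property(kuratowski, universe))           # True
print(has_pair_property(reversed_kuratowski, universe))  # True
print(has_pair_property(naive, universe))                # False
```

The two successful encodings produce different sets for the same $x$ and $y$, yet both have the one property we actually use.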
Real numbers can be “defined” in several ways. I mentioned the Cauchy-sequences definition above, but another well-known one is the notion of a Dedekind cut. We define a real number to be a partition of the rational numbers into two sets $A$ and $B$ such that every element of $A$ is less than every element of $B$.
How does this correspond to what you might think of as a real number $x$? Well, given your number $x$, you can define $A$ to be the set of all rationals less than $x$ and $B$ to be the set of all rationals greater than or equal to $x$. In the other direction, given two sets $A$ and $B$ with that property, you can calculate the decimal expansion of a real number by using the following procedure. Start with the biggest integer that belongs to $A$. Let’s say it is 3. Now take the biggest multiple of 0.1 that belongs to $A$. Let’s say that is $3.1$. Then take the biggest multiple of $0.01$ that belongs to $A$. Let’s say that is $3.14$. Continuing in this way, we build up the decimal expansion of a number that is at least as big as every number in $A$.
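The digit-by-digit procedure is easy to sketch with exact rational arithmetic. The example below assumes a particular cut, the one that ought to correspond to $\sqrt 2$, whose lower set consists of the rationals $q$ with $q<0$ or $q^2<2$; the function names are made up for illustration.

```python
from fractions import Fraction

def in_lower_set(q):
    """Lower set of the cut for sqrt(2): rationals q with q < 0 or q^2 < 2."""
    return q < 0 or q * q < 2

def decimal_approximations(in_lower, steps):
    """Biggest multiple of 10^-k in the lower set, for k = 0, 1, ..., steps."""
    # Biggest integer in the lower set: walk up or down from 0.
    # (Assumes both halves of the partition are non-empty.)
    n = Fraction(0)
    while in_lower(n + 1):
        n += 1
    while not in_lower(n):
        n -= 1
    approximations = [n]
    for k in range(1, steps + 1):
        step = Fraction(1, 10 ** k)
        # At most nine increments happen here, because n + 10 * step
        # was already found not to be in the lower set.
        while in_lower(n + step):
            n += step
        approximations.append(n)
    return approximations

for q in decimal_approximations(in_lower_set, 3):
    print(float(q))   # 1.0, then 1.4, then 1.41, then 1.414
```

Nothing in the code mentions $\sqrt 2$ as a number: it only ever asks whether a given rational lies in the lower set, which is exactly the information a Dedekind cut provides.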
What properties do we want real numbers to have? The answer is that we want them to have the kinds of arithmetic properties we expect (things like $x(y+z)=xy+xz$) and to have the property that every increasing sequence that is bounded above converges to a limit. If you don’t know what that means, it doesn’t matter too much here. What matters here is that it is an axiom on which is built the theory of real analysis, which you will be doing next term. There are certain properties that turn out to imply all the other statements we want to make about real numbers, and Dedekind cuts are a way of showing that if we’ve got the rationals and we’ve got some set theory, then we don’t have to introduce any new objects. (The rationals themselves are defined in terms of the integers, which are defined in terms of the natural numbers, which are defined in terms of sets.)
4. Calculation definitions.
I commented above that one can define ordered pairs or real numbers, or many other mathematical concepts, in several different ways. By that I meant really different: an equivalence class of Cauchy sequences of rational numbers is not the same thing as a Dedekind cut, but either will serve as a construction-definition of a real number.
There is another kind of non-uniqueness that frequently occurs when we want to define a number or function. For example, $\pi$ can be defined as $4\left(1-\frac13+\frac15-\frac17+\dots\right)$, or it can be defined as the area of a unit circle (which itself can be defined using an integral). It is far from obvious that these two definitions result in the same number, but a bit of theory shows that they do.
An example of a function that can be defined in more than one way is $e^x$. Here are four ways of defining it.
(1) Let $e$ be the number $1+\frac{1}{1!}+\frac{1}{2!}+\frac{1}{3!}+\dots$
For every positive integer $n$ define $e^n$ to be $e\cdot e^{n-1}$, with $e^0$ defined to be 1.
For every pair of positive integers $p,q$ define $e^{p/q}$ to be the $q$th root of $e^p$.
Finally, given a real number $x$, let $(x_n)$ be a sequence of rational numbers converging to $x$ and define $e^x$ to be the limit of the numbers $e^{x_n}$.
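(2) Define $e^x$ to be $\sum_{n=0}^\infty \frac{x^n}{n!}$.
(3) Define $e^x$ to be $f(x)$, where $f$ is the unique solution of the differential equation $f'(x)=f(x)$ such that $f(0)=1$.
(4) Define $e^x$ to be the limit as $n\to\infty$ of $\left(1+\frac{x}{n}\right)^n$.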
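A quick numerical experiment makes it plausible that these definitions agree, though of course it proves nothing. The sketch below approximates Definitions (1), (2) and (4) in floating point and compares them with the library function `math.exp`; the function names, the number of series terms and the value of $n$ are arbitrary choices made for illustration, and Definition (1) is only evaluated at a positive rational $p/q$.

```python
import math

def exp_series(x, terms=30):
    """Definition (2): partial sum of x^n / n! over n = 0, ..., terms - 1."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        term *= x / (n + 1)   # turns x^n / n! into x^(n+1) / (n+1)!
    return total

def exp_limit(x, n=10**6):
    """Definition (4): the value of (1 + x/n)^n for one large n."""
    return (1 + x / n) ** n

def exp_rational(p, q, terms=30):
    """Definition (1) at the rational p/q: the q-th root of e^p."""
    e = exp_series(1.0, terms)  # e itself, as the series 1 + 1/1! + 1/2! + ...
    return (e ** p) ** (1.0 / q)

x = 1.5
print(exp_rational(3, 2), exp_series(x), exp_limit(x), math.exp(x))
```

The series converges very fast, whereas $(1+x/n)^n$ is still off in about the sixth significant figure when $n$ is a million, which is one practical reason to care which calculation you pick even when the object being calculated is the same.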
Now it might seem that $e^x$ is a pre-existing function, and that these definitions are just ways of calculating $e^x$. But that isn’t really the point of these definitions. The point is that it is incredibly useful to us to have a continuous function $f$ with the property that $f(x+y)=f(x)f(y)$ for every $x$ and $y$. It is an exercise to prove that if such a function exists, then it is determined by a single parameter (such as its value at 1, or the value of its derivative at 0). The above definitions are four routes to proving that it exists. You will probably be given Definition (2) as the “official” definition of $e^x$ (though I myself prefer to use Definition (4)). Whichever definition you choose, one of the first things you do is prove that $e^{x+y}$ is always equal to $e^xe^y$. That pins your function down to one of the form $a^x$ for some positive real number $a$. To get the right function, you need to impose one more condition, which can be done in many ways: perhaps the simplest is to insist that $f'(0)=1$.
In general, when you are presented with a calculation-definition, for example of some new function, I strongly recommend that you pay close attention to the basic properties that your lecturer goes on to prove. Very often these determine the function uniquely and are what you use in practice when you are proving further things about the function.
With the exponential function, it is profitable to think of $f(x+y)=f(x)f(y)$ and $f'(0)=1$ as “axioms for $e^x$”. As an indication of how that is a useful point of view, let’s imagine that we have been given the power-series definition of $e^x$ and are now faced with the task of proving that the derivative of $e^x$ is $e^x$. We can of course differentiate term by term, but then we need to know that that’s allowed. (It is, but it takes a little bit of work to prove it.) Another way of proceeding is first to show that $e^{x+y}=e^xe^y$ and that the derivative at 0 is 1. Then the derivative at $x$, if it exists, is $\lim_{h\to 0}\frac{e^{x+h}-e^x}{h}$. This fraction equals $e^x\cdot\frac{e^h-1}{h}$, so we see straight away that the derivative will be $e^x$ times the derivative at 0.
That second proof probably ends up involving a similar amount of work to the first, but it has a big advantage, which is that it shows that the properties “differentiates to itself” and “turns addition into multiplication” are closely related. They aren’t just two properties that a function given by a funny formula happens to have.
Calculation-definitions are different from the construction-definitions discussed in the previous section, since what is calculated is the same thing for each definition. For instance, although definitions (1)-(4) above are different, they all define the same object: a certain function from $\mathbb{R}$ to $\mathbb{R}$. By contrast, as already mentioned, different ways of defining ordered pairs or real numbers give you distinct mathematical objects (that nevertheless have the same important properties).
As with construction-definitions, calculation-definitions are sometimes helpful for more than just the basic properties of what is being defined. For example, if I want to prove that $e$ is irrational, then I will certainly avail myself of the power-series definition. Note that there is only one continuous function $f$ such that $f(x+y)=f(x)f(y)$ for every $x$ and $y$ and $f'(0)=1$. So once we have established continuity and those properties for a new definition, we know that it gives us the same function as the other ones. In other words, we don’t have to prove that, say, $\lim_{n\to\infty}\left(1+\frac{x}{n}\right)^n=\sum_{n=0}^\infty\frac{x^n}{n!}$ by means of some laborious calculation.
I haven’t posted for a while now, so I’m going to post this, even though I think that there may be entire classes of definitions that I have not mentioned. However, the main thing I want to say about definitions — that some definitions don’t look like definitions at all — is so important that I am going to devote a separate post (the next one) to it.
To summarize, some definitions are mere abbreviations. The main purpose of some definitions is to pick out certain properties (e.g., saying that a triangle is equilateral if its three sides have the same length). Some definitions are constructions of mathematical objects that may look a little bizarre but are designed to have properties that enable them to model our pre-existing intuitive concepts. And some are ways of specifying numbers or functions that are again of interest mainly for their properties.